Re: Realtime Search

2008-12-24 Thread Marvin Humphrey
On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
> As I understood this discussion though, it was an attempt to remove
> the in memory 'skip to' index, to avoid the reading of this during
> index open/reopen.
No. That idea was entertained briefly and quickly discarded. There se

Blob storage

2008-12-24 Thread Babak Farhang
Hi everyone, I've been working on a library called Skwish to complement indexes like Lucene, for blob storage and retrieval. This is nothing more than a structured implementation of storing all the files in one file and managing their offsets in another. The idea is to provide a fast, concurrent
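To make the "all the files in one file, offsets in another" layout concrete, here is a minimal sketch. It is not Skwish's actual API: the class name `BlobStoreSketch`, its methods, and the choice of storing one 8-byte end offset per blob are all assumptions made for illustration, and the coarse `synchronized` locking is only a placeholder for whatever concurrency scheme the library really uses.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: blob bytes live in one file, their end offsets in another. */
final class BlobStoreSketch implements AutoCloseable {
    private final RandomAccessFile data;    // concatenated blob bytes
    private final RandomAccessFile offsets; // one 8-byte end offset per blob

    BlobStoreSketch(Path dataFile, Path offsetFile) throws IOException {
        data = new RandomAccessFile(dataFile.toFile(), "rw");
        offsets = new RandomAccessFile(offsetFile.toFile(), "rw");
    }

    /** Appends a blob and returns its id (its 0-based slot in the offset file). */
    synchronized long append(byte[] blob) throws IOException {
        long id = offsets.length() / Long.BYTES;
        data.seek(data.length());
        data.write(blob);
        offsets.seek(offsets.length());
        offsets.writeLong(data.length()); // end offset of the blob just written
        return id;
    }

    /** Reads blob {@code id} by looking up its start and end offsets. */
    synchronized byte[] read(long id) throws IOException {
        long count = offsets.length() / Long.BYTES;
        if (id < 0 || id >= count) {
            throw new IndexOutOfBoundsException("no blob with id " + id);
        }
        long start = 0;
        if (id == 0) {
            offsets.seek(0);
        } else {
            // The previous blob's end offset is this blob's start offset.
            offsets.seek((id - 1) * Long.BYTES);
            start = offsets.readLong();
        }
        long end = offsets.readLong();
        byte[] out = new byte[(int) (end - start)];
        data.seek(start);
        data.readFully(out);
        return out;
    }

    @Override
    public void close() throws IOException {
        data.close();
        offsets.close();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blobstore");
        try (BlobStoreSketch store = new BlobStoreSketch(dir.resolve("data"), dir.resolve("offsets"))) {
            store.append("first".getBytes(StandardCharsets.UTF_8));
            long id = store.append("second".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(store.read(id), StandardCharsets.UTF_8));
        }
    }
}
```

Because every offset entry has the same width, finding blob N costs one seek into the offset file plus one seek into the data file, regardless of how many blobs are stored.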

Re: Realtime Search

2008-12-24 Thread Marvin Humphrey
On Tue, Dec 23, 2008 at 11:02:56PM -0600, robert engels wrote:
> Seems doubtful you will be able to do this without increasing the
> index size dramatically. Since it will need to be stored
> "unpacked" (in order to have random access), yet the terms are
> variable length - leading to using a

Re: Realtime Search

2008-12-24 Thread robert engels
On Dec 24, 2008, at 12:23 PM, Jason Rutherglen wrote: > Also, what are the requirements? Must a document be visible to search within 10ms of being added? 0-5ms. Otherwise it's not realtime, it's batch indexing. The realtime system can support small batches by encoding them into RAMDir

Re: Realtime Search

2008-12-24 Thread Jason Rutherglen
> Also, what are the requirements? Must a document be visible to search within 10ms of being added? 0-5ms. Otherwise it's not realtime, it's batch indexing. The realtime system can support small batches by encoding them into RAMDirectories if they are of sufficient size. > Or must it be visibl

Re: Realtime Search

2008-12-24 Thread robert engels
As I pointed out in another email, I understand the benefits of compression (compressed disks vs. uncompressed, etc.). PFOR is definitely a winner! As I understood this discussion though, it was an attempt to remove the in-memory 'skip to' index, to avoid the reading of this during index

Re: Realtime Search

2008-12-24 Thread Doug Cutting
Jason Rutherglen wrote: 2) Implement realtime search by incrementally creating and merging readers in memory. The system would use MemoryIndex or InstantiatedIndex to quickly (more quickly than RAMDirectory) create indexes from added documents. As a baseline, how fast is it to simply use RAM

Re: Realtime Search

2008-12-24 Thread Paul Elschot
On Wednesday 24 December 2008 17:51:04, robert engels wrote:
> Thinking about this some more, you could use fixed length pages for
> the term index, with a page header containing a count of entries, and
> use key compression (to avoid the constant entry size).
>
> The problem with this is that yo

Re: Realtime Search

2008-12-24 Thread robert engels
Thinking about this some more, you could use fixed length pages for the term index, with a page header containing a count of entries, and use key compression (to avoid the constant entry size). The problem with this is that you still have to decode the entries (slowing the processing - sinc
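As a rough illustration of the fixed-length page idea, the sketch below packs sorted terms into one page with a count in the header and prefix-compresses each key against the previous one. Nothing here comes from Lucene: the class name `TermPageSketch`, the 4096-byte page size, the 2-byte count header, and the one-byte shared/suffix length fields are all assumptions chosen to keep the example small. It also makes the decoding cost visible, since each entry can only be rebuilt from the entry before it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: one fixed-length term page with prefix-compressed keys. */
final class TermPageSketch {
    static final int PAGE_SIZE = 4096; // assumed page size

    /**
     * Packs terms, in order, into a single page until one does not fit.
     * Layout: [count:2][shared:1][suffixLen:1][suffix bytes]...
     */
    static byte[] encode(List<String> sortedTerms) {
        ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE);
        page.putShort((short) 0); // placeholder for the entry count
        byte[] prev = new byte[0];
        int count = 0;
        for (String term : sortedTerms) {
            byte[] cur = term.getBytes(StandardCharsets.UTF_8);
            int max = Math.min(Math.min(prev.length, cur.length), 255);
            int shared = 0;
            while (shared < max && prev[shared] == cur[shared]) {
                shared++;
            }
            int suffixLen = cur.length - shared;
            // Stop when the page is full or the suffix exceeds the one-byte length field.
            if (suffixLen > 255 || page.remaining() < 2 + suffixLen) {
                break;
            }
            page.put((byte) shared).put((byte) suffixLen).put(cur, shared, suffixLen);
            prev = cur;
            count++;
        }
        page.putShort(0, (short) count); // fill in the header
        return page.array();
    }

    /** Rebuilds every term in the page; each one depends on the previous term. */
    static List<String> decode(byte[] pageBytes) {
        ByteBuffer page = ByteBuffer.wrap(pageBytes);
        int count = page.getShort() & 0xFFFF;
        List<String> terms = new ArrayList<>(count);
        byte[] prev = new byte[0];
        for (int i = 0; i < count; i++) {
            int shared = page.get() & 0xFF;
            int suffixLen = page.get() & 0xFF;
            byte[] cur = new byte[shared + suffixLen];
            System.arraycopy(prev, 0, cur, 0, shared);
            page.get(cur, shared, suffixLen);
            terms.add(new String(cur, StandardCharsets.UTF_8));
            prev = cur;
        }
        return terms;
    }

    public static void main(String[] args) {
        byte[] page = encode(Arrays.asList("apple", "applet", "apply", "banana"));
        System.out.println(decode(page));
    }
}
```

The page stays a constant size no matter how long the terms are, but finding a term inside a page means walking the entries from the start, which is the trade-off the message goes on to raise.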

Re: ANNOUNCE: Welcome Ryan McKinley as Contrib/Documentation Committer

2008-12-24 Thread Michael Busch
Welcome Ryan! Thanks, Uwe, Merry Christmas to you and everyone else!
-Michael
On 12/24/08 2:10 AM, Uwe Schindler wrote:
Welcome Ryan! I am looking forward to further improving the spatial contrib using TrieRangeQuery after the Christmas holidays. By the way: Merry Christmas to everyone on this lis

RE: ANNOUNCE: Welcome Ryan McKinley as Contrib/Documentation Committer

2008-12-24 Thread Uwe Schindler
Welcome Ryan! I am looking forward to further improving the spatial contrib using TrieRangeQuery after the Christmas holidays. By the way: Merry Christmas to everyone on this list! Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Orig

ANNOUNCE: Welcome Ryan McKinley as Contrib/Documentation Committer

2008-12-24 Thread Chris Hostetter
I'm happy to announce that in recognition of his efforts in moving forward with creating a spatial searching contrib (and his ongoing experience as both a Solr committer and PMC member), the PMC has voted to make Ryan McKinley a Lucene-Java Contrib and Documentation committer. Congrats Ryan,