If your index can fit in the IO cache, you should be using a completely different implementation...

You should be writing a sequential transaction log for add/update/delete operations and storing the entire index in memory (RAMDirectory), with periodic background flushes of the log.
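For what it's worth, a minimal sketch of that pattern might look like the code below. TransactionLogIndex, InMemoryIndex, and the 60-second flush interval are hypothetical placeholders (not KS or Lucene API); the point is only the structure: append to the log first, apply to the in-memory index, and flush a snapshot in the background.

  import java.io.*;
  import java.nio.charset.StandardCharsets;
  import java.util.concurrent.*;

  // Sketch of "sequential transaction log + in-memory index with periodic
  // background flushes". InMemoryIndex stands in for whatever holds the live
  // index (e.g. something RAMDirectory-backed); only the logging and flush
  // mechanics are shown here.
  public class TransactionLogIndex implements Closeable {
      public interface InMemoryIndex {
          void apply(String op);                      // apply one logged operation
          void flushTo(File dir) throws IOException;  // persist a snapshot
      }

      private final InMemoryIndex index;
      private final Writer log;          // sequential, append-only
      private final File snapshotDir;
      private final ScheduledExecutorService flusher =
              Executors.newSingleThreadScheduledExecutor();

      public TransactionLogIndex(InMemoryIndex index, File logFile, File snapshotDir)
              throws IOException {
          this.index = index;
          this.snapshotDir = snapshotDir;
          // Open in append mode: every add/update/delete is written sequentially.
          this.log = new OutputStreamWriter(
                  new FileOutputStream(logFile, true), StandardCharsets.UTF_8);
          // Periodic background flush of the in-memory index to stable storage.
          flusher.scheduleAtFixedRate(this::flush, 60, 60, TimeUnit.SECONDS);
      }

      // Log first (durability), then apply to the in-memory index.
      public synchronized void execute(String op) throws IOException {
          log.write(op);
          log.write('\n');
          log.flush();
          index.apply(op);
      }

      private synchronized void flush() {
          try {
              index.flushTo(snapshotDir);
          } catch (IOException e) {
              e.printStackTrace();
          }
      }

      @Override
      public synchronized void close() throws IOException {
          flusher.shutdown();
          flush();
          log.close();
      }
  }

On restart you replay the log entries written since the last snapshot to rebuild the in-memory index.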

If you are running multiple processes (in KS), who is invoking them (inetd or similar)? If not, and other users are on the system, you can't control what will happen with the IO cache...

If you want performance, use a server-based implementation.

If you don't care about performance, then performance is not an issue, so use the simplest approach (which is probably the current implementation).

Trying to make the current implementation "better" (and more complex) to accommodate a poor design is just a waste of time and resources.


On Jan 9, 2009, at 3:30 PM, Marvin Humphrey wrote:

On Fri, Jan 09, 2009 at 08:11:31PM +0100, Karl Wettin wrote:

SSD is pretty close to RAM when it comes to seeking. Wouldn't that
mean that a bitset stored on an SSD would be more or less as fast as a
bitset in RAM?

Provided that your index can fit in the system i/o cache and stay there, you get the speed of RAM regardless of the underlying permanent storage type.
There's no reason to wait on SSDs before implementing such a feature.

One thing we've contemplated in Lucy/KS is a FilterWriter, which would write out cached bitsets at index time. Adding that on would look something like this:

  public class MyArchitecture extends Architecture {
    public ArrayList<SegDataWriter> segDataWriters(InvIndex invindex,
                                                   Segment segment) {
      ArrayList<SegDataWriter> writers
          = super.segDataWriters(invindex, segment);
      writers.add(new FilterWriter(invindex, segment));
      return writers;
    }
  }
  public class MySchema extends Schema {
    public Architecture architecture() { return new MyArchitecture(); }
    public MySchema() {
      TextField textFieldSpec = new TextField(new PolyAnalyzer("en"));
      specField("title", textFieldSpec);
      specField("content", textFieldSpec);
    }
  }

  IndexWriter writer = new IndexWriter(new MySchema().open("/path/to/index"));

This isn't quite the same thing, because I believe you're talking about adaptively caching filters on the fly at search time. However, I expect this to work quite well when a finite set of filters is known in advance, e.g. for
faceting categories.
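On the Lucene side, the search-time variant can already be approximated by wrapping per-category filters in CachingWrapperFilter and handing them out lazily. A rough sketch follows; the "category" field name and the CategoryFilterCache class are assumptions for illustration, not existing API.

  import java.util.concurrent.ConcurrentHashMap;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.CachingWrapperFilter;
  import org.apache.lucene.search.Filter;
  import org.apache.lucene.search.QueryWrapperFilter;
  import org.apache.lucene.search.TermQuery;

  // Hands out one cached filter per category value, created lazily the first
  // time that category is requested. CachingWrapperFilter caches the computed
  // bitset per reader, so repeated searches reuse it.
  public class CategoryFilterCache {
      private final ConcurrentHashMap<String, Filter> cache =
              new ConcurrentHashMap<String, Filter>();

      public Filter filterFor(String category) {
          Filter f = cache.get(category);
          if (f == null) {
              f = new CachingWrapperFilter(
                      new QueryWrapperFilter(
                              new TermQuery(new Term("category", category))));
              Filter prev = cache.putIfAbsent(category, f);
              if (prev != null) f = prev;
          }
          return f;
      }
  }

  // usage: searcher.search(query, cache.filterFor("books"), 10);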

Marvin Humphrey




