If your index can fit in the IO cache, you should be using a completely different implementation...

You should be writing a sequential transaction log for add/update/delete operations and storing the entire index in memory (RAMDirectory), with periodic background flushes of the log.
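For what it's worth, a minimal sketch of that pattern might look like the code below. TransactionLogIndex, InMemoryIndex, and the 60-second flush interval are hypothetical placeholders (not KS or Lucene API); the point is only the structure: append to the log first, apply to the in-memory index, and flush a snapshot in the background.

  import java.io.*;
  import java.nio.charset.StandardCharsets;
  import java.util.concurrent.*;

  // Sketch of "sequential transaction log + in-memory index with periodic
  // background flushes". InMemoryIndex stands in for whatever holds the live
  // index (e.g. something RAMDirectory-backed); only the logging and flush
  // mechanics are shown here.
  public class TransactionLogIndex implements Closeable {
      public interface InMemoryIndex {
          void apply(String op);                      // apply one logged operation
          void flushTo(File dir) throws IOException;  // persist a snapshot
      }

      private final InMemoryIndex index;
      private final Writer log;          // sequential, append-only
      private final File snapshotDir;
      private final ScheduledExecutorService flusher =
              Executors.newSingleThreadScheduledExecutor();

      public TransactionLogIndex(InMemoryIndex index, File logFile, File snapshotDir)
              throws IOException {
          this.index = index;
          this.snapshotDir = snapshotDir;
          // Open in append mode: every add/update/delete is written sequentially.
          this.log = new OutputStreamWriter(
                  new FileOutputStream(logFile, true), StandardCharsets.UTF_8);
          // Periodic background flush of the in-memory index to stable storage.
          flusher.scheduleAtFixedRate(this::flush, 60, 60, TimeUnit.SECONDS);
      }

      // Log first (durability), then apply to the in-memory index.
      public synchronized void execute(String op) throws IOException {
          log.write(op);
          log.write('\n');
          log.flush();
          index.apply(op);
      }

      private synchronized void flush() {
          try {
              index.flushTo(snapshotDir);
          } catch (IOException e) {
              e.printStackTrace();
          }
      }

      @Override
      public synchronized void close() throws IOException {
          flusher.shutdown();
          flush();
          log.close();
      }
  }

On restart you replay the log entries written since the last snapshot to rebuild the in-memory index.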

If you are running multiple processes (in KS), who is invoking them (inetd or similar)? If not, and other users are on the system, you can't control what will happen with the IO cache...

If you want performance, use a server-based implementation.

If you don't care about performance, then performance is not an issue, so use the simplest approach (which is probably the current implementation).

Trying to make the current implementation "better" (and more complex) to accommodate a poor design is just a waste of time and resources.


On Jan 9, 2009, at 3:30 PM, Marvin Humphrey wrote:

On Fri, Jan 09, 2009 at 08:11:31PM +0100, Karl Wettin wrote:

SSD is pretty close to RAM when it comes to seeking. Wouldn't that
mean that a bitset stored on an SSD would be more or less as fast as a
bitset in RAM?

Provided that your index can fit in the system i/o cache and stay there, you get the speed of RAM regardless of the underlying permanent storage type.
There's no reason to wait on SSDs before implementing such a feature.

One thing we've contemplated in Lucy/KS is a FilterWriter, which would write out cached bitsets at index time. Adding that on would look something like this:

  public class MyArchitecture extends Architecture {
    public ArrayList<SegDataWriter> segDataWriters(InvIndex invindex,
                                                   Segment segment) {
      ArrayList<SegDataWriter> writers
          = super.segDataWriters(invindex, segment);
      writers.add(new FilterWriter(invindex, segment));
      return writers;
    }
  }
  public class MySchema extends Schema {
    public Architecture architecture() { return new MyArchitecture(); }
    public MySchema() {
      TextField textFieldSpec = new TextField(new PolyAnalyzer("en"));
      specField("title", textFieldSpec);
      specField("content", textFieldSpec);
    }
  }

  IndexWriter writer = new IndexWriter(new MySchema().open("/path/to/index"));

This isn't quite the same thing, because I believe you're talking about adaptively caching filters on the fly at search time. However, I expect this to work quite well when a finite set of filters is known in advance, e.g. for
faceting categories.
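On the Lucene side, the search-time variant can already be approximated by wrapping per-category filters in CachingWrapperFilter and handing them out lazily. A rough sketch follows; the "category" field name and the CategoryFilterCache class are assumptions for illustration, not existing API.

  import java.util.concurrent.ConcurrentHashMap;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.CachingWrapperFilter;
  import org.apache.lucene.search.Filter;
  import org.apache.lucene.search.QueryWrapperFilter;
  import org.apache.lucene.search.TermQuery;

  // Hands out one cached filter per category value, created lazily the first
  // time that category is requested. CachingWrapperFilter caches the computed
  // bitset per reader, so repeated searches reuse it.
  public class CategoryFilterCache {
      private final ConcurrentHashMap<String, Filter> cache =
              new ConcurrentHashMap<String, Filter>();

      public Filter filterFor(String category) {
          Filter f = cache.get(category);
          if (f == null) {
              f = new CachingWrapperFilter(
                      new QueryWrapperFilter(
                              new TermQuery(new Term("category", category))));
              Filter prev = cache.putIfAbsent(category, f);
              if (prev != null) f = prev;
          }
          return f;
      }
  }

  // usage: searcher.search(query, cache.filterFor("books"), 10);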

Marvin Humphrey




