Re: ZipFile directory implementation

tsuraan Mon, 09 Mar 2009 07:54:41 -0700

> Sounds interesting.  Can you tell us a bit more about the use case for it?
> Is it basically you are in a situation where you can't unzip the index?


Indices compress pretty nicely: 30% to 50% in my experience.  So, if your
indices are read-only anyhow (mine aren't live; we do batch jobs to modify
them, so they're mostly read-only), they might as well be stored compressed
to save on disk usage.  Sometimes on-disk compression of files (in general)
can help throughput, since the drive IO tends to be a bottleneck rather than
the CPU load; I don't know whether that's true of zipped lucene indices
though.
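
To check the ratio on a particular index, one rough approach is to deflate
an index file in memory and compare sizes.  The sketch below assumes plain
DEFLATE at the default level via java.util.zip; the class and method names
(CompressionRatio, deflatedSize, savings) are made up for illustration and
are not part of ZipDirectory or Lucene.  It reads the whole file into
memory, so it only suits files that fit in the heap.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.Deflater;

public class CompressionRatio {
    // Returns the number of bytes the input occupies after DEFLATE compression.
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end(); // release native zlib resources
        return out.size();
    }

    // Fraction of space saved: 0.0 means no savings, 0.5 means half the size.
    public static double savings(byte[] input) {
        if (input.length == 0) {
            return 0.0;
        }
        return 1.0 - (double) deflatedSize(input) / input.length;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CompressionRatio <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("original bytes:   " + data.length);
        System.out.println("compressed bytes: " + deflatedSize(data));
        System.out.println("space saved:      " + savings(data));
    }
}
```

Running it against each file in an index directory gives a per-file picture,
which may differ a lot between file types.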

> Also, have you looked at how it performs?

No, I'm not sure how to do this; what are good benchmarks of store
performance?  Write speed tends to be a significant thing to test, but my
ZipDirectory doesn't support writing.  What other operations tend to be
commonly done in searching?  I could create an IndexReader and call document
and getTermFreqVectors for each doc in my reader.  Is that a useful test, or
is there some established body of useful measures on a store?
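
Below the Lucene level, one baseline that needs no index at all is the raw
cost of reading every entry out of the archive, since a read-only
zip-backed store pays that decompression cost on each read.  This is only a
sketch using java.util.zip; ZipReadTiming and readAllEntries are
hypothetical names, and a single timed pass like this ignores JIT warm-up
and the OS page cache, so treat the numbers as rough.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipReadTiming {
    // Reads every file entry in the archive to the end and returns the
    // total number of uncompressed bytes read.
    public static long readAllEntries(File zip) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = zf.getInputStream(entry)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ZipReadTiming <archive.zip>");
            return;
        }
        long start = System.nanoTime();
        long bytes = readAllEntries(new File(args[0]));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes read in " + elapsedMs + " ms");
    }
}
```

Timing the same loop over the unzipped files would give the comparison
point for whether compression helps or hurts read throughput.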
