Re: ZipFile directory implementation

Michael McCandless Mon, 09 Mar 2009 10:57:46 -0700


tsuraan wrote:

>> Sounds interesting. Can you tell us a bit more about the use case for it?
>> Is it basically you are in a situation where you can't unzip the index?
>
> Indices compress pretty nicely: 30% to 50% in my experience. So, if your
> indices are read-only anyhow (mine aren't live; we do batch jobs to modify
> them, so they're mostly read-only), they might as well be stored compressed
> to save on disk usage. Sometimes on-disk compression of files (in general)
> can help throughput, since the drive IO tends to be a bottleneck rather than
> the CPU load; I don't know whether that's true of zipped Lucene indices
> though.
>
>> Also, have you looked at how it performs?
>
> No, I'm not sure how to do this; what are good benchmarks of store
> performance? Write speed tends to be a significant thing to test, but my
> ZipDirectory doesn't support writing. What other operations tend to be
> commonly done in searching? I could create an IndexReader and call document
> and getTermFreqVectors for each doc in my reader. Is that a useful test, or
> is there some established body of useful measures on a store?


You could use contrib/benchmark.

I think query performance, for simple term queries, AND, OR, phrase, etc., would be interesting.

It sounds like the model is: you use a normal Lucene directory to create the index, then you zip it up, at which point you can use ZipDirectory to search it.
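
To make that model concrete, here is a minimal sketch of the zip-then-read half using only java.util.zip. It is not tsuraan's ZipDirectory (that code isn't in this thread). The class name ZippedIndexSketch and its methods are hypothetical, and it assumes the finished index is a flat directory of files and that the archive is written outside that directory.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not part of Lucene): packs a finished index directory and reads it back. */
public class ZippedIndexSketch {

    /** Step 1: after the index is built normally, copy each regular file into one archive. */
    public static void zipIndex(Path indexDir, Path zipPath) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(indexDir)) {
            files = listing.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipPath);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // Entry name is the bare file name, mirroring a flat directory layout.
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    /** Step 2a: the read-only counterpart of listing a directory's files. */
    public static List<String> listFiles(Path zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            List<String> names = new ArrayList<>();
            for (ZipEntry entry : Collections.list(zip.entries())) {
                names.add(entry.getName());
            }
            return names;
        }
    }

    /** Step 2b: read one stored file fully into memory (requires Java 9+ for readAllBytes). */
    public static byte[] readFile(Path zipPath, String name) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(name);
            if (entry == null) {
                throw new FileNotFoundException(name);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return in.readAllBytes();
            }
        }
    }
}
```

Note that ZipFile hands back entries only as forward-reading streams, so a real Directory implementation needs some strategy for seeking within a file. Buffering whole entries in memory, as above, is just the simplest option for a sketch.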

I think this would make a great contribution -- any chance you could package it up and attach a patch to a new Jira issue?


Mike
