On 11/27/2012 2:25 PM, Markus Jelsma wrote:
Hi, please check this issue:
https://issues.apache.org/jira/browse/LUCENE-4226

But it is enabled because of:
https://issues.apache.org/jira/browse/LUCENE-4509

Since it's suddenly the default, you would have to completely wipe the index and 
reindex the data, at least I had to, because of numerous codec exceptions. It 
significantly reduced the size of the very large indexes we have.

I noticed the exceptions when I tried to restart after updating the .war. I stopped Solr, completely wiped out my data directories, and ran a DIH full-import on all shards after starting back up. The almost 32 percent drop in index size caught me off guard.

I had seen the compressed stored fields issue come across the dev and commits lists, but I didn't connect the dots in my brain.

I would imagine that if Solr has to actually hit the disk, this will be faster, but if the data is already in the OS disk cache, it would be slower. I'm curious whether the document cache stores the compressed or uncompressed version. If it's the uncompressed version, the document cache would get rid of any penalty.
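To put rough numbers on that tradeoff, here is a minimal sketch that compresses a block of synthetic "stored fields" and times the decompression. It is only an illustration under loud assumptions: zlib is a stand-in and not the codec Lucene uses, the documents are made up and far more repetitive than real data, and make_documents and measure are hypothetical helper names, not anything from Solr or Lucene.

```python
import time
import zlib


def make_documents(count):
    # Synthetic, highly repetitive "stored fields" (an assumption for the
    # sketch); real documents will compress quite differently.
    body = "lorem ipsum dolor sit amet " * 20
    return [
        ("id=%d;title=Sample document %d;body=%s" % (i, i, body)).encode("utf-8")
        for i in range(count)
    ]


def measure(documents):
    # Concatenate the documents into one block, compress it with zlib
    # (a stand-in codec), then time how long decompression takes.
    raw = b"".join(documents)
    packed = zlib.compress(raw)

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_seconds = time.perf_counter() - start

    # Sanity check: the round trip must be lossless.
    assert unpacked == raw

    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(packed),
        "saved_fraction": 1 - len(packed) / len(raw),
        "decompress_seconds": decompress_seconds,
    }


stats = measure(make_documents(1000))
print(stats)
```

The smaller block means fewer bytes to read when the data has to come from disk, while decompress_seconds is the extra CPU cost paid on every read, including reads already served from the OS disk cache. Which effect wins depends on the hardware and the data, so this says nothing definite about Solr itself.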

Are there any config knobs for turning compression on/off, or changing the compression algorithm? Are those knobs available to Solr? I'm not doing anything on the scale of the Hathi Trust, but would I ever have any reasonable need to change things?

Thanks,
Shawn
