Adrien Grand created LUCENE-5914:
------------------------------------

             Summary: More options for stored fields compression
                 Key: LUCENE-5914
                 URL: https://issues.apache.org/jira/browse/LUCENE-5914
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand
            Assignee: Adrien Grand
             Fix For: 4.11


Since we added codec-level compression in Lucene 4.1, I think I have heard from 
about as many users complaining that compression was too aggressive as users 
complaining that it was too light.

I think this is because our users are doing very different things with Lucene. 
For example, if you have a small index that fits in the filesystem cache (or 
comes close to fitting), then you might never pay for actual disk seeks. In 
such a case, the fact that the current stored fields format needs to 
over-decompress data can noticeably slow down search on cheap queries.

On the other hand, it is more and more common to use Lucene for things like log 
analytics, and in that case you have huge amounts of data for which you don't 
care much about stored fields performance. However, it is very frustrating to 
notice that the data you store takes several times less space when you gzip it 
than it does in your index, even though Lucene claims to compress stored fields.

For that reason, I think it would be nice to have some options that would allow 
users to trade speed for compression in the default codec.
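
As a rough illustration of the trade-off described above, here is a minimal 
sketch that uses the JDK's java.util.zip.Deflater rather than Lucene's actual 
stored fields format. It compresses the same documents either one by one or 
grouped into larger blocks, at a configurable level. The class name, the helper 
methods, the block sizes and the log-like sample documents are all hypothetical 
and only meant to show the shape of the problem.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical sketch: NOT Lucene's stored fields format, only an illustration
// of how block size and compression level trade speed for space.
public class BlockCompressionSketch {

    // Compress data at the given Deflater level and return the compressed size in bytes.
    static int compressedSize(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Build made-up, log-like documents that share a lot of content with each other.
    static byte[][] makeDocs(int count) {
        byte[][] docs = new byte[count][];
        for (int i = 0; i < count; i++) {
            String line = "2014-09-01T12:00:" + (i % 60)
                + " INFO request id=" + i + " status=200 path=/search?q=lucene\n";
            docs[i] = line.getBytes(StandardCharsets.UTF_8);
        }
        return docs;
    }

    // Total compressed size when documents are grouped into blocks of
    // docsPerBlock documents and each block is compressed independently.
    static int totalSize(byte[][] docs, int docsPerBlock, int level) {
        int total = 0;
        for (int start = 0; start < docs.length; start += docsPerBlock) {
            int end = Math.min(docs.length, start + docsPerBlock);
            ByteArrayOutputStream block = new ByteArrayOutputStream();
            for (int i = start; i < end; i++) {
                block.write(docs[i], 0, docs[i].length);
            }
            total += compressedSize(block.toByteArray(), level);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[][] docs = makeDocs(1024);
        int[] blockSizes = {1, 16, 256};
        int[] levels = {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION};
        for (int docsPerBlock : blockSizes) {
            for (int level : levels) {
                System.out.println("docsPerBlock=" + docsPerBlock
                    + " level=" + level
                    + " bytes=" + totalSize(docs, docsPerBlock, level));
            }
        }
    }
}
```

In this sketch, larger blocks let the compressor exploit redundancy across 
documents, which is what log-like data needs. The cost is that reading a 
single document back means decompressing the whole block that contains it, 
which is the over-decompression that hurts small, cache-resident indices.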



--
This message was sent by Atlassian JIRA
(v6.2#6252)

