Otis,

that's exactly what I have in mind. In the first step, compression should be optional and apply to binary fields only. It should be off by default, so the user has to enable it explicitly. I would also check the size of the byte array passed in: even if compression is enabled, it doesn't make sense to compress a value that is too small. Because the compressed format carries some fixed overhead, the result could end up larger than the original.
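
To make the size check concrete, here is a minimal sketch using java.util.zip. It is not taken from Drew's patch: the class name, the one-byte flag in front of the stored value, and the 64-byte threshold are all assumptions made for illustration, and a sensible threshold would have to come out of the tests.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Lucene or of the patch discussed here.
final class OptionalFieldCompression {
    // Assumed threshold; the thread does not name a value.
    static final int MIN_COMPRESS_SIZE = 64;

    // Assumed one-byte marker stored in front of the value.
    static final byte RAW = 0;
    static final byte DEFLATED = 1;

    // Compresses only if enabled, the value is large enough, and it actually shrinks.
    static byte[] encode(byte[] value, boolean compressionEnabled) {
        if (compressionEnabled && value.length >= MIN_COMPRESS_SIZE) {
            byte[] deflated = deflate(value);
            if (deflated.length < value.length) {
                return withFlag(DEFLATED, deflated);
            }
        }
        return withFlag(RAW, value);
    }

    // Reverses encode() by looking at the marker byte.
    static byte[] decode(byte[] stored) {
        byte[] payload = Arrays.copyOfRange(stored, 1, stored.length);
        return stored[0] == DEFLATED ? inflate(payload) : payload;
    }

    private static byte[] withFlag(byte flag, byte[] payload) {
        byte[] out = new byte[payload.length + 1];
        out[0] = flag;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    private static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

With this layout a three-byte value is stored raw even when compression is switched on, and a value that does not shrink is stored raw as well, so the stored size never exceeds the original by more than the marker byte.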

Once the implementation is ready, we can run several tests to see how compression affects overall performance.

Bernhard


Otis Gospodnetic wrote:

Bernhard,

Sounds good to me.
I would, however, also be interested in the performance impact of
text-field compression.  While adapting Drew's patch, it may be nice to
make the compression mechanism pluggable.
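
One possible shape for a pluggable mechanism, purely as a sketch: the interface and class names below are hypothetical and do not exist in Lucene or in Drew's patch. Only the java.util.zip stream classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical extension point: the field writer would only see this interface.
interface FieldCompressor {
    byte[] compress(byte[] raw);
    byte[] decompress(byte[] stored);
}

// Pass-through implementation, matching the "off by default" behaviour.
final class NoCompression implements FieldCompressor {
    public byte[] compress(byte[] raw) { return raw; }
    public byte[] decompress(byte[] stored) { return stored; }
}

// Implementation backed by java.util.zip.
final class DeflateCompressor implements FieldCompressor {
    public byte[] compress(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the stream finishes the deflate output before it is read.
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public byte[] decompress(byte[] stored) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(stored))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Another algorithm could then be swapped in by supplying a different FieldCompressor, without touching the code that writes the fields.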

Otis

--- Bernhard Messer <[EMAIL PROTECTED]> wrote:



Hi developers,

a few months ago there was a very interesting discussion about field compression and the possibility of storing binary field values within a Lucene document. On this topic, Drew Farris came up with a patch that adds the necessary functionality. I ran all the necessary tests on his implementation and didn't find a single problem. Drew's original implementation could now be enhanced to compress the binary field data (maybe even the text fields, if they are stored only) before writing it to disk. I made some simple statistical measurements using the java.util.zip package for data compression. With it enabled, we could save about 40% of the space when compressing plain text files between 1 KB and 4 KB in size. If there is still some interest, we could first update the patch, because it is outdated due to several changes within the Fields class. After that, compression could be added to the updated version of the patch.
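
A measurement of this kind could look roughly like the sketch below. It is not the code used for the numbers above; the class and method names are made up, and the sample text in main is synthetic, so it compresses far better than real documents would.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper for comparing raw and deflated sizes.
final class CompressionSavings {
    // Size in bytes of the input after deflating it with default settings.
    static int deflatedSize(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    // Percentage of space saved; negative if the deflated form is larger.
    static double savingsPercent(byte[] raw) {
        if (raw.length == 0) {
            return 0.0;
        }
        return 100.0 * (raw.length - deflatedSize(raw)) / raw.length;
    }

    public static void main(String[] args) {
        // Synthetic 2 KB sample; substitute real 1 KB to 4 KB text files here.
        StringBuilder text = new StringBuilder();
        while (text.length() < 2048) {
            text.append("stored field values can be compressed before writing. ");
        }
        byte[] raw = text.toString().getBytes(StandardCharsets.UTF_8);
        System.out.printf("raw=%d deflated=%d saved=%.1f%%%n",
                raw.length, deflatedSize(raw), savingsPercent(raw));
    }
}
```

For very short inputs savingsPercent comes out negative, which is the overhead effect that argues for a minimum size before compressing.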


Sounds good to me. What do you think?

best regards
Bernhard




--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]











