Hi,

On Sun, Jun 23, 2013 at 9:08 PM, Savia Beson <eks...@googlemail.com> wrote:
> I think Mathias was talking about the case with many smallish fields that all 
> get read per document. The DV approach would mean seeking N times, while stored 
> fields only seek once? Or did you mean he should encode all his fields into a 
> single byte[]?
>
> Or did I get it all wrong about stored vs DV :)

No, this is correct. But in that particular case, I think the best
option depends on how the data is queried: if all features are always
used together, then it makes sense to encode them all in a single
BinaryDocValuesField. On the other hand, if queries are more likely to
require only a subset of the features, then encoding each feature in a
different field makes more sense.
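To make the single-field option concrete, here is a minimal sketch of
packing several fixed-width numeric features into one byte[] -- the kind
of value you would put in a single BinaryDocValuesField. The class and
method names are mine, not from any Lucene API; this only shows the
encoding side, independent of Lucene:

```java
import java.nio.ByteBuffer;

public class FeaturePacking {
    // Pack several small numeric features into one byte[], as you would
    // before storing them in a single BinaryDocValuesField. Fixed-width
    // floats keep the layout trivial to decode.
    static byte[] pack(float[] features) {
        ByteBuffer buf = ByteBuffer.allocate(features.length * Float.BYTES);
        for (float f : features) {
            buf.putFloat(f);
        }
        return buf.array();
    }

    // Decode a single feature by index without materializing the others.
    static float unpack(byte[] packed, int index) {
        return ByteBuffer.wrap(packed).getFloat(index * Float.BYTES);
    }

    public static void main(String[] args) {
        byte[] packed = pack(new float[] {1.5f, -2.0f, 3.25f});
        System.out.println(unpack(packed, 2)); // prints 3.25
    }
}
```

With variable-width features you would prefix each value with its
length, at the cost of losing constant-time access to a given feature.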

> What helped a lot in a similar case was to make our own codec and reduce the 
> chunk size to something smallish, depending on your average document size… 
> there is a sweet spot somewhere between compression and speed.
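The trade-off described above can be demonstrated without Lucene at
all: a stored-fields format compresses documents in blocks, so smaller
chunks mean less data to decompress per document lookup but a worse
overall compression ratio. A rough sketch with plain java.util.zip (the
class name and sizes are mine, for illustration only):

```java
import java.util.zip.Deflater;

public class ChunkTradeoff {
    // Compress `data` in independent chunks of `chunkSize` bytes and
    // return the total compressed size. Each chunk is a separate DEFLATE
    // stream, mimicking a stored-fields block: a document lookup only
    // needs to decompress its own chunk, but each stream resets the
    // dictionary and pays its own header overhead.
    static int compressedSize(byte[] data, int chunkSize) {
        int total = 0;
        byte[] out = new byte[chunkSize + 64];
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            Deflater d = new Deflater();
            d.setInput(data, off, len);
            d.finish();
            while (!d.finished()) {
                total += d.deflate(out);
            }
            d.end();
        }
        return total;
    }

    public static void main(String[] args) {
        // Repetitive "documents": larger chunks compress better overall,
        // while smaller chunks make per-document decompression cheaper.
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i % 50);
        System.out.println("1 KB chunks:  " + compressedSize(data, 1024));
        System.out.println("16 KB chunks: " + compressedSize(data, 16384));
    }
}
```

The sweet spot mentioned above is wherever this curve crosses your
latency budget for decompressing one chunk per retrieved document.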

This would indeed make decompression faster on an index that fits in
the file-system cache, but as Uwe said, stored fields should only be
used to display search results. So spending 100µs decompressing data
per document is not a big deal, since you are only going to load 20 or
50 documents (the size of a page of results). It is more important to
help the file-system cache prevent actual random accesses from
happening, as they can easily take 10ms each on magnetic storage.

--
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
