Hi,

On Sun, Jun 23, 2013 at 9:08 PM, Savia Beson <eks...@googlemail.com> wrote:
> I think Mathias was talking about the case with many smallish fields that
> all get read per document. The DV approach would mean seeking N times,
> while stored fields would seek only once? Or did you mean he should
> encode all his fields into a single byte[]?
>
> Or did I get it all wrong about stored vs DV :)

No, this is correct. But in that particular case, I think the best option
depends on how the data is queried: if all features are always used
together, then it makes sense to encode them all in a single
BinaryDocValuesField. On the other hand, if queries are more likely to
require only a subset of the features, encoding each feature in a separate
field makes more sense.
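To make the first option concrete, here is a minimal sketch of packing
features into a single BinaryDocValuesField (not from the original thread;
the class name, the "features" field name and the three-float layout are
just assumptions to adapt):

    import java.nio.ByteBuffer;

    import org.apache.lucene.document.BinaryDocValuesField;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.util.BytesRef;

    public final class PackedFeatures {

      // Packs three float features into one binary doc-values entry so
      // that reading them at search time costs a single seek, not three.
      public static Document buildDocument(float f1, float f2, float f3) {
        ByteBuffer buf = ByteBuffer.allocate(3 * 4); // 3 floats * 4 bytes
        buf.putFloat(f1).putFloat(f2).putFloat(f3);
        Document doc = new Document();
        doc.add(new BinaryDocValuesField("features",
            new BytesRef(buf.array())));
        return doc;
      }
    }

At search time, a single BinaryDocValues lookup per document then yields
all the features at once, which is where the one-seek advantage comes from.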
> What helped a lot in a similar case was to make our own codec and reduce
> the chunk size to something smallish, depending on your average document
> size… there is a sweet spot somewhere between compression and speed.

This would indeed make decompression faster on an index that fits in the
file-system cache, but as Uwe said, stored fields should only be used to
display search results. So requiring 100µs to decompress data per document
is not a big deal, since you are only going to load 20 or 50 documents
(the size of a page of results). It is more important to help the
file-system cache prevent actual random accesses from happening, as they
can easily take 10ms on magnetic storage.

--
Adrien
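P.S. For readers who want to try the small-chunk idea, a codec along these
lines should work against the 4.x APIs (an untested sketch; the codec
name, the 4 KB chunk size and the Lucene42 delegate are placeholders):

    import org.apache.lucene.codecs.Codec;
    import org.apache.lucene.codecs.FilterCodec;
    import org.apache.lucene.codecs.StoredFieldsFormat;
    import org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat;
    import org.apache.lucene.codecs.compressing.CompressionMode;

    // Delegates everything to the default codec but compresses stored
    // fields in 4 KB chunks instead of the default 16 KB, so that less
    // data has to be decompressed to load a single document.
    public final class SmallChunkCodec extends FilterCodec {

      private final StoredFieldsFormat storedFields =
          new CompressingStoredFieldsFormat("SmallChunkStoredFields",
              CompressionMode.FAST, 4 * 1024);

      public SmallChunkCodec() {
        super("SmallChunkCodec", Codec.forName("Lucene42"));
      }

      @Override
      public StoredFieldsFormat storedFieldsFormat() {
        return storedFields;
      }
    }

You would set it with IndexWriterConfig.setCodec(new SmallChunkCodec())
and register the class in META-INF/services/org.apache.lucene.codecs.Codec
so that the segments can be read back.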