Re: CompressingTermVectors; per-field decompress?

2015-04-02 Thread Robert Muir
Vectors are totally per-document. It's hard to do anything smarter with them. Basically, by this I mean, IMO vectors aren't going to get better until the semantics around them improve. From the original file formats, I get the impression they were modelled after stored fields a lot, and I think

Re: CompressingTermVectors; per-field decompress?

2015-04-02 Thread Robert Muir
On Thu, Apr 2, 2015 at 4:02 PM, david.w.smi...@gmail.com wrote: They are fundamentally per-document, yes, like stored fields. But I don't see how this fundamental constraint prevents the term vector format from returning a light "Fields" instance which loads

CompressingTermVectors; per-field decompress?

2015-04-02 Thread david.w.smi...@gmail.com
I was looking at a JIRA issue someone posted pertaining to optimizing highlighting for when there are term vectors (SOLR-5855). I dug into the details a bit and learned something unexpected: CompressingTermVectorsReader.get(docId) fully loads all term vectors for the document. The client/user

Re: CompressingTermVectors; per-field decompress?

2015-04-02 Thread david.w.smi...@gmail.com
Thanks for your input, Rob… On Thu, Apr 2, 2015 at 3:21 PM, Robert Muir rcm...@gmail.com wrote: Vectors are totally per-document. It's hard to do anything smarter with them. Basically, by this I mean, IMO vectors aren't going to get better until the semantics around them improve. From the