I started this thread back in November.  Recall that I'm indexing XML and
storing the XPath as a payload in each token.  I am not encoding or mapping
the XPath, just storing the text directly via String.getBytes().  We're not
using this to query in any way, just to add context to our search results.
I'm now ready to bounce around some more ideas about encoding XPath
or strings in general.

Back in the day Grant said:

 
> From what I understand from Michael Busch, you can store the path at  
> each token, but this doesn't seem efficient to me.  I would think you  
> may want to come up with some more efficient encoding.  I am cc'ing  
> Michael on this thread to see if he is able to add any light to the  
> subject (he may not be able to b/c of employer reasons).   If he  
> can't, then we can brainstorm a bit more on how to do it most  
> efficiently.
> 

The word "encoding" in Grant's response brings to mind Huffman coding
(http://en.wikipedia.org/wiki/Huffman_coding).  This would not solve the
query-on-payload problem that Yonik pointed out, because the encoding would
be document-centric, but it could reduce the total number of bytes that I
need to store.
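
To make the idea concrete, here is a rough sketch of character-level Huffman
coding applied to a single XPath string.  It is only an illustration, not
something I have wired into Lucene or Solr: the class name XPathHuffman, the
sample path, and the choice to build the code table from one string are all
assumptions.  The bits are kept as a String of '0'/'1' characters for
readability; a real payload would pack them into bytes, and the code table
itself would also have to be stored somewhere, which eats into the savings.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: character-level Huffman coding for XPath strings.
final class XPathHuffman {

    // Tree node: leaves carry a symbol, internal nodes carry two children.
    static final class Node implements Comparable<Node> {
        final char symbol;
        final int weight;
        final Node left;
        final Node right;

        Node(char symbol, int weight, Node left, Node right) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }

        @Override
        public int compareTo(Node other) { return Integer.compare(weight, other.weight); }
    }

    // Build a symbol -> bit-string table from character frequencies in the sample.
    public static Map<Character, String> buildCodes(String sample) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : sample.toCharArray()) freq.merge(c, 1, Integer::sum);

        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        // Repeatedly merge the two lightest subtrees until one tree remains.
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node('\0', a.weight + b.weight, a, b));
        }
        Map<Character, String> codes = new HashMap<>();
        if (!queue.isEmpty()) assign(queue.poll(), "", codes);
        return codes;
    }

    private static void assign(Node n, String prefix, Map<Character, String> codes) {
        if (n.isLeaf()) {
            // A one-symbol alphabet still needs a non-empty code.
            codes.put(n.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(n.left, prefix + "0", codes);
        assign(n.right, prefix + "1", codes);
    }

    // Encode text as a string of '0'/'1'; every character must be in the table.
    public static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String code = codes.get(c);
            if (code == null) throw new IllegalArgumentException("No code for '" + c + "'");
            bits.append(code);
        }
        return bits.toString();
    }

    // Decode by matching prefixes; this works because Huffman codes are prefix-free.
    public static String decode(String bits, Map<Character, String> codes) {
        Map<String, Character> reverse = new HashMap<>();
        for (Map.Entry<Character, String> e : codes.entrySet()) reverse.put(e.getValue(), e.getKey());

        StringBuilder out = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (char b : bits.toCharArray()) {
            current.append(b);
            Character symbol = reverse.get(current.toString());
            if (symbol != null) {
                out.append(symbol);
                current.setLength(0);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xpath = "/book/chapter[2]/section[1]/para[5]"; // made-up sample path
        Map<Character, String> codes = buildCodes(xpath);
        String bits = encode(xpath, codes);
        int rawBits = xpath.getBytes(StandardCharsets.UTF_8).length * 8;
        System.out.println(rawBits + " raw bits vs " + bits.length() + " encoded bits");
    }
}
```

Building the table per document is what makes the encoding document-centric.
A table built over whole path steps (element names) rather than single
characters might compress better, but I have not tried either.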

Any ideas?

Tricia
-- 
View this message in context: 
http://www.nabble.com/Payloads-in-Solr-tp13812560p16599300.html
Sent from the Solr - User mailing list archive at Nabble.com.
