Hi Yonik,

On Sep 2, 2011, at 7:47 PM, Yonik Seeley wrote:

> On Fri, Sep 2, 2011 at 10:26 PM, Mattmann, Chris A (388J)
> <chris.a.mattm...@jpl.nasa.gov> wrote:
>> I'm left with childrenshospitallosangeles as a single token resultant from 
>> the chain.
>> So, when I go to sort the titles in Solr, I use sort=title_sort asc, and I 
>> am getting all kinds of weird results when doing
>> a query.
> 
> Hmmm, a random guess would be that perhaps your analysis chain is
> actually producing more than one token per document.   The lucene
> FieldCache takes the highest for each document (just a non-intended
> side-effect of how the FieldCache entry is populated by enumerating
> terms).
> 
> Try adding fsv=true to your request.  It's an undocumented feature
> used in distributed search (it stands for field sort values) used to
> collate results from different shards.  It should add "sort_values" to
> your response to tell you the sort values for each document.

First off, thanks for the reply. I appreciate it.

I tried the fsv=true parameter and it's great. It revealed what's really
going on here:

 "sort_values":[
  "title_sort",[null,
        null,
        null,
        null,
....

I've got one of those null values for each returned document. Now I guess
I have to find out what's wrong with my CombiningFilter.

Basically, all it does is use a static method that calls incrementToken() on
the input stream and then TermAttribute.term() for each token. It appends
those terms to a StringBuffer (concatenating them) and then returns a new
KeywordTokenizer wrapping a StringReader initialized with the merged
string. Yes, I know this probably isn't the most efficient way, and I'm
open to suggestions.

I think in spelling this out, though, I might have uncovered my problem.
The constructor for my CombiningFilter calls
super(mergeStreamTokens(in))
where mergeStreamTokens is a static method, so I think I might have already
consumed the input TokenStream by the time it gets called for the sort. It
probably works on analysis.jsp because the stream isn't re-consumed there?
Not sure, something wiggy is going on.
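
To illustrate the suspicion above, here is a minimal plain-Java model. It
is not Lucene code: OneShotTokens, LazyCombiner and CombiningDemo are
hypothetical stand-ins invented for this sketch, and it does not confirm
that this is what actually produces the null sort values. It only shows
that merging a one-shot stream eagerly leaves nothing for a later pass,
while deferring the merge until a token is requested (and re-arming it on
reset) lets each pass see the tokens.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream: it hands out each token once
// and returns null when exhausted, until reset() is called.
class OneShotTokens {
    private final List<String> tokens;
    private int pos = 0;

    OneShotTokens(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    String next() {
        return pos < tokens.size() ? tokens.get(pos++) : null;
    }

    void reset() {
        pos = 0;
    }
}

// Hypothetical lazy combiner: nothing is consumed in the constructor.
// The merge happens on the first next() call after construction or reset().
class LazyCombiner {
    private final OneShotTokens in;
    private boolean done = false;

    LazyCombiner(OneShotTokens in) {
        this.in = in;
    }

    // Returns the single merged token, then null until reset() is called.
    String next() {
        if (done) {
            return null;
        }
        done = true;
        String merged = CombiningDemo.merge(in);
        return merged.isEmpty() ? null : merged;
    }

    void reset() {
        in.reset();
        done = false;
    }
}

class CombiningDemo {
    // Drains the stream and concatenates every token it yields.
    static String merge(OneShotTokens in) {
        StringBuilder sb = new StringBuilder();
        for (String t = in.next(); t != null; t = in.next()) {
            sb.append(t);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Eager merging: the first pass drains the stream, so a second
        // pass over the same stream has nothing left to merge.
        OneShotTokens source =
            new OneShotTokens("childrens", "hospital", "los", "angeles");
        System.out.println("first pass:  '" + merge(source) + "'");
        System.out.println("second pass: '" + merge(source) + "'");

        // Lazy merging with reset: each pass gets the merged token.
        LazyCombiner lazy = new LazyCombiner(
            new OneShotTokens("childrens", "hospital", "los", "angeles"));
        System.out.println("lazy pass 1: " + lazy.next());
        lazy.reset();
        System.out.println("lazy pass 2: " + lazy.next());
    }
}
```

In a real filter, the equivalent change would presumably be to move the
merging out of the constructor and into incrementToken(), and to clear the
"done" flag in reset(), but that is an assumption about the fix and has
not been tested against Solr.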

I'll keep poking, thanks again.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
