Difficulties with Highlighting

2010-07-27 Thread Nathaniel Grove
I'm a relative beginner at Solr, indexing and searching Unicode Tibetan 
texts. I am trying to use the highlighter, but it just returns empty 
elements, such as:


   <lst name="highlighting">
   <lst name="kt-d-0103-text-v4p262a"/>
   </lst>

What am I doing wrong?

The query that generated that is:

http://www.thlib.org:8080/thdl-solr/thdl-texts/select?indent=on&version=2.2&q=%E0%BD%91%E0%BD%84%E0%BD%B4%E0%BD%A3%E0%BC%8B%E0%BD%98%E0%BD%81%E0%BD%93%E0%BC%8B+AND+type%3Atext&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&hl=true&hl.fl=pg_bo&hl.snippets=50
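
For anyone reproducing the request, here is a minimal Python sketch that assembles the same parameters and leaves the separators and percent-encoding to the standard library. It is not part of the original post: `build_select_url` is a hypothetical helper, and the base URL and parameter values are simply copied from the query quoted here.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Base URL taken from the post; build_select_url is a hypothetical helper.
BASE = "http://www.thlib.org:8080/thdl-solr/thdl-texts/select"


def build_select_url(query, highlight_field, snippets=50):
    """Assemble a Solr select URL with highlighting switched on."""
    params = [
        ("indent", "on"),
        ("version", "2.2"),
        ("q", query),
        ("start", 0),
        ("rows", 10),
        ("fl", "*,score"),
        ("qt", "standard"),
        ("wt", "standard"),
        ("hl", "true"),               # enable highlighting
        ("hl.fl", highlight_field),   # field(s) to highlight
        ("hl.snippets", snippets),    # max snippets per field
    ]
    # urlencode percent-encodes the UTF-8 bytes of the Tibetan term.
    return BASE + "?" + urlencode(params)


# The Tibetan search term from the post, written as code points.
TERM = "\u0f51\u0f44\u0f74\u0f63\u0f0b\u0f58\u0f41\u0f53\u0f0b"
url = build_select_url(TERM + " AND type:text", "pg_bo")

# Round-trip the query string to confirm what the server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded["hl.fl"], decoded["hl.snippets"])
```

Decoding the generated URL again is a quick way to confirm that hl.fl and hl.snippets arrive as separate parameters.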

The hit is in the multivalued field named pg_bo, in the doc with that 
id. I've looked at the various highlighting parameters (not that I 
fully understand them) and tried fiddling with them, but nothing helped. 
I did notice that if you change hl.fl to *, you get the type field 
highlighted:


<lst name="highlighting">
   <lst name="kt-d-0103-text-v4p262a">
  <arr name="type">
   <str><em>text</em></str>
   </arr>
   </lst>
</lst>
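
The two highlighting responses in this message can be compared mechanically. The following Python sketch (an illustration, not something from the original post) lists which fields produced snippets for each document; `highlighted_fields` is a hypothetical helper, and the sample XML is reconstructed from the fragments quoted in this message.

```python
import xml.etree.ElementTree as ET

# Highlighting sections reconstructed from the two responses in the post.
EMPTY = (
    '<lst name="highlighting">'
    '<lst name="kt-d-0103-text-v4p262a"/>'
    "</lst>"
)
WITH_TYPE = (
    '<lst name="highlighting">'
    '<lst name="kt-d-0103-text-v4p262a">'
    '<arr name="type"><str><em>text</em></str></arr>'
    "</lst>"
    "</lst>"
)


def highlighted_fields(xml_text):
    """Map each document id to the names of fields that have snippets."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() also visits the root, so this works on a full response
    # or on the highlighting section alone.
    for section in root.iter("lst"):
        if section.get("name") != "highlighting":
            continue
        for doc in section.findall("lst"):
            result[doc.get("name")] = [
                arr.get("name") for arr in doc.findall("arr")
            ]
    return result


print(highlighted_fields(EMPTY))
print(highlighted_fields(WITH_TYPE))
```

An empty list for a document means the highlighter produced no snippets for any requested field, which is the situation with hl.fl=pg_bo.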

But that's not much help. We are using a custom Tibetan tokenizer for 
the Unicode Tibetan text fields. Would this have something to do with it?


Any suggestions would be appreciated!

Thanks for your help,

Than Grove

--
Nathaniel Grove
Research Associate & Technical Director
Tibetan & Himalayan Library
University of Virginia
http://www.thlib.org



Re: Difficulties with Highlighting

2010-07-27 Thread Erik Hatcher

Than -

Looks like maybe your text_bo field type isn't analyzing the way you'd
like? That's just a hunch, though. I pasted the value of that field
returned in the link you provided into your analysis.jsp page and it
chunked tokens by whitespace. I could be experiencing a copy/paste/i18n
issue, though.
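
To make "chunked by whitespace" concrete: Tibetan separates syllables with the tsheg mark (U+0F0B) rather than with spaces, so whitespace splitting keeps a whole tsheg-joined run as one token. The Python sketch below only illustrates that difference. It is not the THL tokenizer and does not diagnose the highlighting problem; `TEXT` is a made-up field value built from the search term in the query, and both function names are hypothetical.

```python
import re

# Search term from the query: two syllables, each followed by a tsheg.
TERM = "\u0f51\u0f44\u0f74\u0f63\u0f0b\u0f58\u0f41\u0f53\u0f0b"

# Hypothetical field value: the term twice, separated by a space.
TEXT = TERM + " " + TERM


def whitespace_tokens(text):
    """Split on whitespace only, as a whitespace tokenizer would."""
    return text.split()


def syllable_tokens(text):
    """Split on tsheg (U+0F0B), shad (U+0F0D), and whitespace."""
    return [t for t in re.split(r"[\u0f0b\u0f0d\s]+", text) if t]


print(len(whitespace_tokens(TEXT)))  # tokens when splitting on spaces
print(len(syllable_tokens(TEXT)))    # tokens when splitting on tsheg too
```

Pasting the same value into analysis.jsp and comparing its token list against output like this would show which behaviour the text_bo field type actually has.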


Also, it looks like you're on Solr 1.3, so it's likely well worth
upgrading to 1.4.1 (I don't know if that directly affects this
highlighting issue; it's just a general recommendation).


Erik
