Re: Using Lucene's payload in Solr

Grant Ingersoll Thu, 13 Aug 2009 09:13:51 -0700


On Aug 13, 2009, at 11:58 AM, Bill Au wrote:

Thanks for the tip on BFTQ. I have been using a nightly build before that
was committed. I have upgraded to the latest nightly build and will use that
instead of BTQ.
I got DelimitedPayloadTokenFilter to work and see that the terms and payload
of the field are correct, but the delimiter and payload are stored, so they
appear in the response also.  Here is an example:

XML for indexing:
<field name="title">Solr|2.0 In|2.0 Action|2.0</field>


XML response:
<doc>
<str name="title">Solr|2.0 In|2.0 Action|2.0</str>
</doc>



Correct.
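
To make the behaviour above concrete, here is a minimal Python sketch of the split that the delimited-payload analysis performs on the indexed side. It only mimics what is described in this thread (whitespace-separated tokens, a `|` delimiter, float payloads); `split_payloads` is a hypothetical helper, not part of Solr or Lucene. The stored value is never touched by this step, which is why the response still shows the raw text.

```python
def split_payloads(text, delimiter="|"):
    """Split whitespace-separated tokens into (term, payload) pairs.

    Illustration only: assumes float payloads; a token without a
    delimiter (or with nothing after it) gets a payload of None.
    """
    pairs = []
    for token in text.split():
        term, sep, raw = token.partition(delimiter)
        pairs.append((term, float(raw) if sep and raw else None))
    return pairs


stored = "Solr|2.0 In|2.0 Action|2.0"
print(split_payloads(stored))  # the terms and payloads, as indexed
print(stored)                  # the stored value keeps the delimiters
```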

I want to set payload on a field that has a variable number of words. So I
guess I can use a copy field with a PatternTokenizerFactory to filter out
the delimiter and payload.
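
The pattern such a field would need can be tried out before touching the schema. The sketch below is plain Python, not Solr configuration; the regular expression and the name `strip_payloads` are assumptions for illustration, and it presumes that `|` never occurs in the real text.

```python
import re

# A delimiter followed by the payload, up to the next whitespace.
_PAYLOAD = re.compile(r"\|\S*")


def strip_payloads(text):
    """Remove every '|payload' suffix, leaving only the words."""
    return _PAYLOAD.sub("", text)


print(strip_payloads("Solr|2.0 In|2.0 Action|2.0"))  # Solr In Action
```

As far as I know, analysis changes only the indexed terms, so it is worth checking whether the copy's stored value still carries the delimiters.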

I am thinking maybe I can do this instead when indexing:

XML for indexing:
<field name="title" payload="2.0">Solr In Action</field>


Hmmm, interesting, what's your motivation vs. boosting the field?

This will simplify indexing as I don't have to repeat the payload for each
word in the field. I do have to write a payload-aware update handler. It
looks like I can use Lucene's NumericPayloadTokenFilter in my custom update
handler to

Any thoughts/comments/suggestions?
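
One way to get the same convenience without a custom update handler might be to expand the payload on the client before posting. This is a minimal Python sketch under that assumption; `with_payload` and `field_xml` are hypothetical helper names, and the output is simply the delimited form shown earlier in this thread.

```python
from xml.sax.saxutils import escape, quoteattr


def with_payload(text, payload, delimiter="|"):
    """Append the same payload to every whitespace-separated word."""
    return " ".join(f"{word}{delimiter}{payload}" for word in text.split())


def field_xml(name, text, payload):
    """Build a <field> element in the delimited form used for indexing."""
    body = escape(with_payload(text, payload))
    return f"<field name={quoteattr(name)}>{body}</field>"


print(field_xml("title", "Solr In Action", 2.0))
# <field name="title">Solr|2.0 In|2.0 Action|2.0</field>
```

The trade-off is that the stored value still contains the delimiters, as in the example above.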

Bill
On Wed, Aug 12, 2009 at 7:13 AM, Grant Ingersoll <gsing...@apache.org> wrote:
On Aug 11, 2009, at 5:30 PM, Bill Au wrote:

It looks like things have changed a bit since this subject was last brought
up here. I see that there is support in Solr/Lucene for indexing payload
data (DelimitedPayloadTokenFilterFactory and DelimitedPayloadTokenFilter).
Overriding the Similarity class is straightforward. So the last piece of
the puzzle is to use a BoostingTermQuery when searching. I think all I need
to do is to subclass Solr's LuceneQParserPlugin, which uses SolrQueryParser
under the cover: write my own query parser plugin that uses a custom query
parser, with the only difference being in the getFieldQuery() method, where
a BoostingTermQuery is used instead of a TermQuery.
The BTQ is now deprecated in favor of the BoostingFunctionTermQuery, which
gives some more flexibility in terms of how the spans in a single document
are scored.
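
To illustrate what that flexibility could look like, here is a conceptual Python sketch; it is not Lucene code. It assumes each matching occurrence of a term in a document contributes one payload value and that a pluggable function reduces those values to a single factor. The function names and the way the factor is multiplied into the base score are assumptions for illustration only.

```python
def average(payloads):
    """Arithmetic mean of a non-empty list of payload values."""
    return sum(payloads) / len(payloads)


# Hypothetical registry of ways to combine the payloads of one document.
PAYLOAD_FUNCTIONS = {"average": average, "max": max, "min": min}


def payload_score(base_score, payloads, function="average"):
    """Scale a term's base score by a factor derived from its payloads."""
    if not payloads:
        return base_score  # no payloads: leave the score unchanged
    return base_score * PAYLOAD_FUNCTIONS[function](payloads)


occurrences = [2.0, 1.0, 4.0]
for name in PAYLOAD_FUNCTIONS:
    print(name, payload_score(1.5, occurrences, name))
```

With a single fixed combination, every document is scored the same way; passing the function in is what lets the caller choose between, say, the strongest occurrence and the typical one.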
Am I on the right track?
Yes.

Has anyone done something like this already?
I intend to, but haven't started.
Since Solr already has indexing support for payload, I was hoping that query
support is already in the works if not available already. If not, I am
willing to contribute but will probably need some guidance since my
knowledge of the Solr query parser is weak.
https://issues.apache.org/jira/browse/SOLR-1337


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
