Re: Best way to gather span/token positions from query?

Grant Ingersoll Thu, 30 Apr 2009 15:10:49 -0700

I've been thinking about how to add spans to Solr, but haven't actually codified it yet. I see no reason why a query parser can't support some syntax and the "dump spans" method approach can't be co-opted to write out the spans to the response. Seems like it would need to be an additional part of the QueryComponent, plus some addition to the query parsers. We can more easily add it to the Dismax parser, but if we add it to the Lucene one, then we should make that change in Lucene.


-Grant

On Apr 29, 2009, at 7:06 PM, Sean O'Connor wrote:

Hello,
I'm trying to find a decent approach for getting token positions out of (or is that into?) solr query results. Is the best approach to extend a QueryComponent and/or HighlightComponent? I'm new to solr, and still on fairly shaky ground, so any pointers or suggestions are quite welcome.
  As a little BACKGROUND:
I am trying to migrate a custom lucene-only content analysis project to solr. The 'old' system programmatically runs a few thousand predefined queries against a corpus, and then analyzes the results. The lucene score is good, but the actual position of the hits is also quite important.
My previous system did a simple query parsing to create SpanQuerys, and then used a modified dumpSpans() to get the token position from the spans. Now I am trying to find how to use solr's goodness (and MemoryIndex approach?) to get the span positions in a more logical manner. I think the answer is in the highlighter, but I'm getting a little twisted around, and could use a pointer.
I am using a recent Solr nightly snapshot, grails, Aduna Aperture, and Intellij (if any of that matters).
Thanks,

Sean
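
For readers unfamiliar with what a "dump spans" pass produces, here is a minimal, library-independent sketch in Python. It is not Lucene or Solr code and does not reproduce the poster's modified dumpSpans(); the names tokenize, near_spans and dump_spans are hypothetical, the tokenizer is a plain whitespace split standing in for a real analyzer, and the slop handling only approximates an ordered span-near match. It just shows the kind of output under discussion: one (document, start position, end position) entry per match, with the end position exclusive.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def near_spans(tokens: List[str], first: str, second: str,
               slop: int) -> List[Tuple[int, int]]:
    """Find ordered matches of `first` followed by `second`.

    At most `slop` other tokens may sit between the two terms.
    Returns (start, end) token positions, with `end` exclusive.
    """
    spans = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        # Look ahead at most slop + 1 positions for the second term.
        for j in range(i + 1, min(i + slop + 2, len(tokens))):
            if tokens[j] == second:
                spans.append((i, j + 1))
                break
    return spans


def dump_spans(docs: List[str], first: str, second: str,
               slop: int) -> List[Tuple[int, int, int]]:
    """Collect (doc_id, start, end) for every match across the corpus."""
    out = []
    for doc_id, text in enumerate(docs):
        for start, end in near_spans(tokenize(text), first, second, slop):
            out.append((doc_id, start, end))
    return out


if __name__ == "__main__":
    corpus = [
        "the quick brown fox",
        "quick red fox and quick fox",
        "no match here",
    ]
    for doc_id, start, end in dump_spans(corpus, "quick", "fox", slop=1):
        print(f"doc={doc_id} start={start} end={end}")
```

A response writer would serialize a list like this next to each hit, so the position of a match can be analyzed alongside its score.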


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
