Hi,
I am working on integrating OpenNLP with Solr. I have successfully applied
the patch (LUCENE-2899-x.patch) to the latest Solr source code (branch_4x).
I have defined an OpenNLP analyzer and indexed data with it. The analyzer
declaration in schema.xml is:
<fieldType name="nlp_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Sequence of tokenizers and filters applied at index time -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Sequence of tokenizers and filters applied at query time -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.OpenNLPFilterFactory" posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.OpenNLPFilterFactory" nerTaggerModels="opennlp/en-ner-person.bin"/>
    <filter class="solr.OpenNLPFilterFactory" nerTaggerModels="opennlp/en-ner-location.bin"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
The field declared with this type is:
<field name="Detail_Person" type="nlp_type" indexed="true" stored="true"
omitNorms="true" omitPositions="true"/>
The problem: when I search on the field Detail_Person, the results are
not consistent.
When I search for Detail_Person:brett, it returns one document.
But when I run the same query again, it returns zero documents.
Here are the logs:
97139 [http-bio-8080-exec-9] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
97139 [http-bio-8080-exec-9] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
97139 [http-bio-8080-exec-9] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
97154 [http-bio-8080-exec-9] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={fl=score,*&indent=true&q=Detail_Person:rashi&wt=json} hits=1 status=0 QTime=15
97154 [http-bio-8080-exec-9] DEBUG org.apache.solr.servlet.SolrDispatchFilter – Closing out SolrRequest: {{params(fl=score,*&indent=true&q=Detail_Person:rashi&wt=json),defaults(df=text&echoParams=explicit&rows=10)}}
134874 [http-bio-8080-exec-3] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
134890 [http-bio-8080-exec-3] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
134890 [http-bio-8080-exec-3] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
134906 [http-bio-8080-exec-3] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={fl=score,*&indent=true&q=Detail_Person:brett&wt=json} hits=2 status=0 QTime=32
134906 [http-bio-8080-exec-3] DEBUG org.apache.solr.servlet.SolrDispatchFilter – Closing out SolrRequest: {{params(fl=score,*&indent=true&q=Detail_Person:brett&wt=json),defaults(df=text&echoParams=explicit&rows=10)}}
147136 [http-bio-8080-exec-3] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={fl=score,*&indent=true&q=Detail_Person:john&wt=json} hits=0 status=0 QTime=0
147136 [http-bio-8080-exec-3] DEBUG org.apache.solr.servlet.SolrDispatchFilter – Closing out SolrRequest: {{params(fl=score,*&indent=true&q=Detail_Person:john&wt=json),defaults(df=text&echoParams=explicit&rows=10)}}
302164 [http-bio-8080-exec-10] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
302164 [http-bio-8080-exec-10] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
302164 [http-bio-8080-exec-10] INFO org.apache.solr.analysis.OpenNLPFilterFactory – OpenNLPFilterFactory create
302164 [http-bio-8080-exec-10] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={fl=score,*&indent=true&q=Detail_Person:john&wt=json} hits=1 status=0 QTime=15
302164 [http-bio-8080-exec-10] DEBUG org.apache.solr.servlet.SolrDispatchFilter – Closing out SolrRequest: {{params(fl=score,*&indent=true&q=Detail_Person:john&wt=json),defaults(df=text&echoParams=explicit&rows=10)}}
Searching is not stable on the OpenNLP field: the same query sometimes
returns documents and sometimes does not, even though the documents are
in the index. Searches on non-OpenNLP fields work properly; their results
are stable and correct.
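In case it helps with reproducing this, here is a minimal sketch of a script that repeats one query and compares the hit counts. The function names are hypothetical helpers, not part of Solr, and the base URL (http://localhost:8080/solr/collection1) is an assumption pieced together from the port, webapp and core names in the logs above.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(base_url, field, term):
    """Build a /select URL with the same parameters as the logged queries."""
    params = urllib.parse.urlencode(
        {"q": f"{field}:{term}", "fl": "score,*", "wt": "json", "indent": "true"}
    )
    return f"{base_url}/select?{params}"


def fetch_num_found(url):
    """Run the query and return numFound from Solr's JSON response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["response"]["numFound"]


def hit_counts(url, repeats=5, fetch=fetch_num_found):
    """Fire the same query several times and collect the hit counts."""
    return [fetch(url) for _ in range(repeats)]


def is_stable(counts):
    """True when every run reported the same number of hits."""
    return len(set(counts)) <= 1
```

Calling hit_counts(build_select_url("http://localhost:8080/solr/collection1", "Detail_Person", "brett")) against a running instance and passing the result to is_stable should show whether the counts differ between runs. The fetch parameter is only there so the comparison logic can be tried without a server.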
Please help me make the Solr results consistent.
Thanks in advance.