Can we have multiple Spellcheck Components under /select handler

2020-12-30 Thread rashi gandhi
Hi All

I am trying to configure multiple spellcheck components. I defined two
searchComponents in my solrconfig.xml, let's say spellcheck and
spellcheck1, and added both of them to the /select request handler with
the default required attributes:

  <str>elevator</str>
  <str>spellcheck</str>
  <str>spellcheck1</str>

However, I cannot see spellcheck1 in the response (when I set
/select?spellcheck1=true).

Can't we configure multiple spellcheck components with different names in
Solr?
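
For reference, a minimal sketch of the request side, assuming the usual SpellCheckComponent parameters: the component is switched on with spellcheck=true rather than with the component's own name, and a particular spellchecker is selected with spellcheck.dictionary. The dictionary names below and the helper function are hypothetical; they stand in for whatever spellchecker entries are defined in solrconfig.xml.

```python
from urllib.parse import urlencode


def spellcheck_params(query, dictionaries):
    """Build a query string that enables spellcheck for the given dictionaries.

    `dictionaries` is a list of spellchecker names (hypothetical here); each
    one is sent as its own spellcheck.dictionary parameter.
    """
    params = [("q", query), ("spellcheck", "true")]
    params += [("spellcheck.dictionary", name) for name in dictionaries]
    return urlencode(params)


# Hypothetical dictionary names, for illustration only.
print("/select?" + spellcheck_params("priviladge", ["default", "wordbreak"]))
```

Whether this covers the use case depends on why two separately named components were wanted in the first place.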


handling stopwords for special scenarios

2020-04-09 Thread rashi gandhi
Hi All,

We are using the stopword filter factory at both index and search time to
omit stopwords.

However, in one particular case, we receive "here" as a search query, and
"here" is one of the words in the title/name representing our client.
We return zero results because "here" is an English stopword and is
removed at both index and query time.

One solution could be to remove "here" from the list of stopwords, but
that does not look feasible.

Is there any way to handle cases like this, where stopwords are meant to
be actual search terms?

Any leads would be appreciated.
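
To illustrate the problem and one possible direction, here is a small sketch. It is not Solr code: the stopword set is a made-up excerpt and the whitespace tokenizer is a simplification. The idea, stated as an assumption, is that when stop filtering leaves nothing, the query could be sent to a second field that keeps stopwords.

```python
# Hypothetical excerpt of a stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "here", "to", "of"}


def analyze(text, remove_stopwords=True):
    """Lowercase, split on whitespace and optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def search_terms(query):
    """Fall back to the unfiltered tokens when filtering leaves nothing."""
    filtered = analyze(query)
    return filtered if filtered else analyze(query, remove_stopwords=False)


print(analyze("here"))       # the stop filter removes the whole query
print(search_terms("here"))  # the fallback keeps the term
```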


Control Solr spellcheck functionality to provide suggestions for correct word

2019-04-05 Thread rashi gandhi
Hi,

I am working on the Solr spellcheck feature, and I am using an index-based
spellcheck dictionary as the source for spellcheck suggestions.
I observed that the collated results returned by the spellcheck component
provide suggestions for misspelled words, but also provide suggestions for
correctly spelled words in the query.

For example,
misspelled query - root priviladge to user

collated results (the suggestions include the same) -
root privilege to user, room privilege to user, root privilege to users,
rest privilege to user, root privilege to used

It corrected the word 'priviladge', which was misspelled, but it also
provided suggestions for 'root' and 'user', which were already correct.

Is there a way to tell Solr not to provide suggestions for correct
words when using the spellcheck feature?

Please provide pointers.
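
Until the right server-side setting is found, the collations could be filtered on the client. The sketch below assumes the caller already knows which query terms are correct (for example because they occur in the index); the function name and the known_terms set are hypothetical.

```python
def keep_minimal_collations(query, collations, known_terms):
    """Keep only collations that leave the known-correct terms untouched."""
    original = query.split()
    kept = []
    for collation in collations:
        tokens = collation.split()
        if len(tokens) != len(original):
            continue  # token count changed, positions cannot be compared
        if all(o == t for o, t in zip(original, tokens) if o in known_terms):
            kept.append(collation)
    return kept


collations = [
    "root privilege to user",
    "room privilege to user",
    "root privilege to users",
    "rest privilege to user",
    "root privilege to used",
]
print(keep_minimal_collations("root priviladge to user", collations,
                              known_terms={"root", "to", "user"}))
```

With the example from this post, only "root privilege to user" survives the filter.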


Indexing using SolrOutputFormat class

2016-07-18 Thread rashi gandhi
Hi All,

I am using the Solr 5.0.0 API for indexing data in our application, and the
requirement is to index the data in batches using the solr-mapreduce API.

In our application, we may receive data from any type of input source, for
example files, streams, or relational and non-relational databases, in a
particular format. I need to index this data into Solr using the
SolrOutputFormat class.

From my analysis so far, SolrOutputFormat works with the
EmbeddedSolrServer and requires the path to the config files for indexing,
without needing a host and port to create the SolrClient.

I checked the documentation online but couldn't find any proper examples
that use the SolrOutputFormat class.

Does anybody have an implementation or a document that describes, for
example, what exactly needs to be passed as input to the SolrOutputFormat
configuration?


Any pointers would be helpful.


How to list all collections in solr-4.7.2

2015-12-03 Thread rashi gandhi
Hi all,

I have set up two solr-4.7.2 server instances on two different machines,
with 3 ZooKeeper servers, in SolrCloud mode.

Now I want to retrieve the list of all the collections that I have created
in SolrCloud mode.

I tried the LIST command of the Collections API, but it's not working with
solr-4.7.2.
Error: unknown command LIST

Please suggest a command that I can use.
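
A minimal sketch of one workaround, assuming the cluster state is read straight from ZooKeeper: in Solr 4.x SolrCloud the /clusterstate.json znode holds a JSON object whose top-level keys are the collection names. How the JSON is fetched (for example with a ZooKeeper client) is left out, and the sample document below is invented for illustration.

```python
import json


def collection_names(clusterstate_json):
    """Return the sorted collection names found in a clusterstate.json document."""
    state = json.loads(clusterstate_json)
    return sorted(state.keys())


# Invented sample; a real clusterstate.json carries shard and replica details.
sample = '{"collection1": {"shards": {}}, "products": {"shards": {}}}'
print(collection_names(sample))
```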

Thanks.


Fwd: Issue with SOLR Distributed Search

2014-12-17 Thread rashi gandhi
Hi,

This is regarding an issue that we are facing with SOLR distributed search.
In our application, we manage multiple shards on the SOLR server to
handle the load. But there is a problem with the order of the results that
we return to the client during search.

For example: currently there are two shards across which the data is
randomly distributed.
When I search for something, I observe that the results from one shard
appear first, followed by the results from the other shard.

Moreover, we order results by applying two levels of sorting
(also configurable per user):
1. Score
2. Modified Time

I investigated the above scenario and found that documents coming from one
shard will not necessarily have the same score as documents coming from the
other shard, even if they are identical.
I also went through the SOLR documentation and related links, and found
that distributed search in SOLR currently has a limitation:
inverse document frequency (IDF) calculations cannot be distributed, so
TF/IDF computations are per shard.

This issue is particularly visible when there is a significant difference
between the number of documents indexed in each shard (for example, the
first shard has 15000 docs and the second shard has 5000).
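
The per-shard effect can be sketched with the classic Lucene TF/IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)). This assumes the default (classic) similarity; the document frequency of 30 is a made-up figure, and only the shard sizes come from the example above.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Classic Lucene IDF: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Same term, same (hypothetical) document frequency, different shard sizes.
idf_shard_one = classic_idf(doc_freq=30, num_docs=15000)
idf_shard_two = classic_idf(doc_freq=30, num_docs=5000)

# What a single, unsharded index would compute for the same term.
idf_global = classic_idf(doc_freq=60, num_docs=20000)

print(idf_shard_one, idf_shard_two, idf_global)
```

Because each shard plugs in its own numDocs and docFreq, the same term gets a different weight on each shard, and neither matches the global value.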

Please review and let me know whether our findings for the above scenario
are correct.

Also, as per our investigation, there is ongoing work in the SOLR
community to support distributed/global IDF. But I wanted to know whether
there is any solution available right now to manage/control the scores of
documents during distributed search, so that the results are more
relevant.

Thanks
Rashi


Fwd: Change in the Score of Similar Documents

2014-11-25 Thread rashi gandhi
Hi,



I have created two shards on the SOLR server and indexed 6 documents
(all docs have exactly the same data: Welcome to SOLR).

Let's say the ids are 1 to 6 and they are indexed as follows:

Shard_one: ids 2, 4, 6 are present in this shard.

Shard_two: ids 1, 3, 5 are present in this shard.

When I search for "SOLR", all documents are returned (as expected), but in
the order 2, 4, 6, 1, 3, 5, with the docs with ids (2, 4, 6) having a
slightly higher score than the docs with ids (1, 3, 5).

I am not able to figure out why the score differs between docs from
two different shards at query time, even though the data in all docs
is the same.

Is this because of indexing across multiple shards?
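
The ordering itself is what a score-ordered merge produces once one shard reports slightly higher scores. The sketch below only illustrates that merge step; the score values are invented, and why the shards score identical documents differently is not shown here.

```python
def merge_by_score(*shard_results):
    """Merge per-shard (doc_id, score) lists into one list, highest score first."""
    combined = [hit for hits in shard_results for hit in hits]
    # sorted() is stable, so hits with equal scores keep their shard order.
    return sorted(combined, key=lambda hit: hit[1], reverse=True)


# Invented scores: shard one scores each identical doc slightly higher.
shard_one = [(2, 0.61), (4, 0.61), (6, 0.61)]
shard_two = [(1, 0.59), (3, 0.59), (5, 0.59)]

merged = merge_by_score(shard_one, shard_two)
print([doc_id for doc_id, _ in merged])
```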



Please provide me some pointers to move ahead.



Thanks,

Rashi


SOLR Performance Benchmarking

2014-06-08 Thread rashi gandhi
Hi,

I am using SolrMeter for performance benchmarking. I am able to
successfully test my Solr setup at up to 1000 search queries per minute.
But when I exceed this limit, say 1500 search queries per minute, I
get Server Refused Connection errors from SOLR.
Currently, I have only one Solr server running on a 64-bit machine with
4 GB of RAM for testing.

Please provide some pointers for optimizing SOLR so that it can
handle a large number of requests (especially more than 1000 requests per
minute).
Is there any change that I can make in solrconfig.xml, or elsewhere,
to support this?


Thanks in Advance







Fwd: Question to send

2014-05-23 Thread rashi gandhi
Hi,

I have one running Solr core with some data indexed on the Solr server.

This core is designed to provide OpenNLP functionality for indexing and
searching.

So I have kept the following binary models at this location:
\apache-tomcat-7.0.53\solr\collection1\conf\opennlp

· en-sent.bin
· en-token.bin
· en-pos-maxent.bin
· en-ner-person.bin
· en-ner-location.bin

My problem is: when I unload the running core and try to delete the conf
directory from it, I am not allowed to delete the directory; the prompt
says that en-sent.bin and en-token.bin are in use.

If I have unloaded the core, why are the files it was using still locked?

Is this a known issue with the OpenNLP binaries?

How can I release the connection between the unloaded core and the conf
directory (especially the binary models)?



Please provide me some pointers on this.

Thanks in Advance


Fwd: help on edismax_dynamic fields

2014-02-21 Thread rashi gandhi
Hello,



I am using the edismax parser in my project.

I just wanted to confirm whether we can use dynamic fields with edismax or
not.

When I use a specific dynamic field in the qf or pf parameter, it
works.

But when I use dynamic fields with *, like this:



<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
    <str name="defType">edismax</str>
    <str name="qf">
      *_nlp_new_sv^0.8
      *_nlp_copy_sv^0.2
    </str>
  </lst>
</requestHandler>

it does not work.

Is it possible to use dynamic fields with *, as shown above, with
edismax?
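
If qf does not expand the * patterns (which the behaviour described above suggests), one workaround is to expand them on the client and send explicit field names. This is only a sketch: the field names are invented, and how the list of existing fields is obtained is left as an assumption.

```python
from fnmatch import fnmatchcase


def expand_qf(patterns, field_names):
    """Expand {glob pattern: boost} into an explicit qf string."""
    clauses = []
    for pattern, boost in patterns.items():
        for name in sorted(field_names):
            if fnmatchcase(name, pattern):
                clauses.append(f"{name}^{boost}")
    return " ".join(clauses)


# Invented field names that would match the dynamic field patterns.
fields = ["title_nlp_new_sv", "body_nlp_new_sv", "title_nlp_copy_sv", "id"]
print(expand_qf({"*_nlp_new_sv": 0.8, "*_nlp_copy_sv": 0.2}, fields))
```

The resulting string can be passed as the qf parameter of the request instead of the wildcard form.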

Please provide me some pointers on this.



Thanks in advance.


Re: Need help for integrating solr-4.5.1 with UIMA

2014-02-07 Thread rashi gandhi
(AccessLogValve.java:953)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:58)
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory.getInstance(UIMAUpdateRequestProcessorFactory.java:61)
    ... 22 more
Caused by: java.lang.NullPointerException
    at org.apache.uima.util.XMLInputSource.init(XMLInputSource.java:118)
    at org.apache.lucene.analysis.uima.ae.BasicAEProvider.getInputSource(BasicAEProvider.java:84)
    at org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:50)
    ... 23 more



I think Solr is not able to load the descriptor file that I am defining in
the analysisEngine tag.

Please provide me some help on this.

Thanks in Advance


Rashi



Fwd: Need help for integrating solr-4.5.1 with UIMA

2014-02-03 Thread rashi gandhi
Hi,



I'm trying to integrate Solr 4.5.1 with UIMA, following the steps in
solr-4.5.1\contrib\uima\readme.txt.

I edited solrconfig.xml as described in readme.txt, and I have registered
the required keys.

But each time I index data, Solr returns this error:



Feb 3, 2014 2:04:32 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(405)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.uima.annotator.calais.OpenCalaisAnnotator.process(OpenCalaisAnnotator.java:206)
    at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.init(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:173)
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:79)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at

Re: need help on OpenNLP with Solr

2014-01-15 Thread rashi gandhi
Thanks, Lance, for clarifying.

One more question: is it possible to integrate boosting with
payloads in LUCENE-2899 with Solr?

Thanks in Advance



Re: need help on OpenNLP with Solr

2014-01-06 Thread rashi gandhi
Hi,

I also wanted to know:
Is it possible to integrate WordNet with this analyzer?
I want to use WordNet for synonym expansion along with the OpenNLP filters.
What changes are required in Solr's schema.xml and solrconfig.xml?

Thanks in Advance



need help on OpenNLP with Solr

2014-01-06 Thread rashi gandhi
Hi,



I have applied the OpenNLP patch (LUCENE-2899 patch) to SOLR-4.5.1 for NLP
searching, and it is working fine.

I have also designed an analyzer for this:

<fieldType name="nlp_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory"
               sentenceModel="opennlp/en-test-sent.bin"
               tokenizerModel="opennlp/en-test-tokenizer.bin"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.OpenNLPFilterFactory"
            posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-person.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-location.bin"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory"
               sentenceModel="opennlp/en-test-sent.bin"
               tokenizerModel="opennlp/en-test-tokenizer.bin"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.OpenNLPFilterFactory"
            posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-person.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-location.bin"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
</fieldType>


I can see that the posTaggerModel is tagging the phrases and adding the
payloads (but I am not able to analyze them).

My question is:
Can I search a phrase giving a higher boost to NOUN terms than to VERB
terms?
For example, if I am searching for "sitting on blanket", I want to give
the highest boost to the NOUN terms first, then the VERB terms, as tagged
by OpenNLP.
How can I use payloads for boosting?
What changes are required in schema.xml?
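
As a stopgap that does not rely on payloads at all, the boost could be applied at query time with the ordinary term^boost syntax. The sketch below assumes the query has already been POS-tagged on the client with Penn Treebank style tags (NN*, VB*); the function name and the boost values are hypothetical.

```python
def boosted_query(tagged_tokens, noun_boost=2.0, verb_boost=1.5):
    """Build a query string that boosts nouns above verbs."""
    parts = []
    for token, tag in tagged_tokens:
        if tag.startswith("NN"):
            parts.append(f"{token}^{noun_boost}")
        elif tag.startswith("VB"):
            parts.append(f"{token}^{verb_boost}")
        else:
            parts.append(token)  # other parts of speech keep the default weight
    return " ".join(parts)


# Tags for the example query, supplied by hand for illustration.
print(boosted_query([("sitting", "VBG"), ("on", "IN"), ("blanket", "NN")]))
```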

Please provide me some pointers to move ahead

Thanks in advance


Need Help for Location searching

2013-12-31 Thread rashi gandhi
Hi,


I want to design an analyzer that can support a location containment
relationship, for example Europe-France-Paris.


My requirement is: when a user searches for a country, the results
must include the documents containing that country, as well as the
documents containing the states and cities that fall under that country.

But documents with the country name must have higher relevancy.

The same applies when a user searches for a state or city.

It must obey the containment relationship up to 4 levels, i.e.
Continent-Country-State-City.


I have already designed an analyzer using the synonym filter factory for
this, and it's working as expected.

But I wanted to know whether there is any other way, apart from the
synonym filter factory, to achieve the same.

Does SOLR provide any tokenizers or filters for this?
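
One alternative worth checking is solr.PathHierarchyTokenizerFactory, which, as far as I know, emits every ancestor prefix of a delimited path. The sketch below only mimics that tokenization in plain Python to show the idea; storing each document's location as a path such as Europe/France/Paris is an assumption, and the relevancy requirement is not covered.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of a delimited path, shortest first."""
    parts = path.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


# A document located in Paris is also indexed under France and Europe.
print(path_hierarchy_tokens("Europe/France/Paris"))
```

A search for the token Europe/France would then match documents stored under any state or city below it.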

Please provide me some pointers to move ahead.


Thanks in Advance


SOLR: Searching on OpenNLP fields is unstable

2013-09-25 Thread rashi gandhi
Hi,

I am working on OpenNLP integration with SOLR. I have successfully applied
the patch (LUCENE-2899-x.patch) to the latest SOLR source code (branch_4x).

I have designed an OpenNLP analyzer and indexed data with it. The analyzer
declaration in schema.xml is as follows:



<fieldType name="nlp_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Sequence of tokenizers and filters applied at index time -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Sequence of tokenizers and filters applied at query time -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.OpenNLPFilterFactory"
            posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-person.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-location.bin"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>



And the field declared for this analyzer:

<field name="Detail_Person" type="nlp_type" indexed="true" stored="true"
       omitNorms="true" omitPositions="true"/>



The problem is this: when I search over the field Detail_Person, the
results are not consistent.

When I search Detail_Person:brett, it returns one document.

But when I fire the same query again, it returns zero documents.

Searching is not stable on the OpenNLP field: sometimes it returns
documents and sometimes not, even though the documents are there.

If I search on non-OpenNLP fields, it works properly; the results are
stable and correct.

Please help me make the Solr results consistent.

Thanks in Advance.


OpenNLP Analyzers not working properly

2013-09-23 Thread rashi gandhi
Hi,



I am working on OpenNLP with SOLR. I have successfully applied the patch
LUCENE-2899-x.patch to the latest SOLR code (branch_4x).

I designed some analyzers based on the OpenNLP filters and tokenizers and
indexed some documents on those fields.

Searching on the OpenNLP fields is not consistent. I am not able to search
properly on these OpenNLP-based fields defined in the Solr schema.xml.

Also, how can I use payloads for boosting documents?


Please help me with this.


Not able to deploy SOLR after applying OpenNLP patch

2013-09-12 Thread rashi gandhi
Hi,



My question is related to OpenNLP integration with SOLR.

I have successfully applied the OpenNLP LUCENE-2899-x.patch to the latest
Solr branch, checked out from here:

http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x

I am also able to compile the source code, generate all the related
binaries, and create the war file.

But I am facing issues while deploying SOLR.

Here is the error:

Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType text_opennlp: Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.OpenNLPTokenizerFactory'
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:467)
    ... 15 more
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.OpenNLPTokenizerFactory'
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:362)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    ... 16 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.OpenNLPTokenizerFactory'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:543)
    at org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:342)
    at org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:335)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    ... 20 more
Caused by: java.lang.ClassNotFoundException: solr.OpenNLPTokenizerFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
    ... 24 more
4446 [coreLoadExecutor-3-thread-1] ERROR org.apache.solr.core.CoreContainer  – null:org.apache.solr.common.SolrException: Unable to create core: collection1
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:931)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:563)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:244)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:236)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

Please help me on this.



Waiting for your reply.
Thanks in advance.