Need Help on Solr Client connection Pooling

2018-08-30 Thread Gembali Satish kumar
Hi Team,

I need some help with client connection object pooling.
I am using the SolrJ API to connect to Solr.

The snippet below is what I use to create the client object:

SolrClient client = new HttpSolrClient.Builder(
        SolrUtil.getSolrURL(tsConfigUtil.getClusterAdvertisedAddress(),
                aInCollectionName)).build();

After my search job is done, I close the client:
client.close();

But the UI sends many search requests, and I think creating a client
object on every request is costly. Is there any way to pool SolrClient
objects? If so, kindly share a reference.

Thanks and Regards,
Satish
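For what it's worth, HttpSolrClient is thread-safe and intended to be created once and shared across requests, so a single cached instance usually removes the need for a pool. Below is a minimal sketch of a lazily initialized shared holder; SharedClient and its factory wiring are illustrative helpers, not SolrJ API, and in practice the Supplier would wrap the HttpSolrClient.Builder call shown above:

```java
import java.util.function.Supplier;

/**
 * Lazily creates and caches one shared client instance instead of
 * building a new SolrClient per request. SharedClient is a hypothetical
 * helper; the Supplier would wrap the HttpSolrClient.Builder call.
 */
final class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Double-checked locking: the factory runs at most once. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    instance = local = factory.get();
                }
            }
        }
        return local;
    }
}
```

With this pattern the client is closed once on application shutdown rather than after every request.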


spell check - preserve case in suggestions

2012-02-06 Thread Satish Kumar
Hi,

Say that the field "name" has the following terms:

Giants
Manning
New York


When someone searches for "gants" or "Gants", I need the suggestion to be
returned as "Giants" (capital G, the same case as in the content that was
indexed). Using a lowercase filter in both the index and query analyzers I
get the suggestion "giants", but all the letters are lowercase. Is it
possible to preserve the case in suggestions, yet still get suggestions for
an input term in upper, lower, or mixed case?


Thanks,
Satish


Baseline vs. Incremental Indexing

2011-04-29 Thread Satish Kumar
Hi,

Currently we index new/updated records every 30 minutes (I am referring to
this as an incremental/partial index) -- i.e., records are added to an
existing index. Are there any benefits in creating a new index (i.e.,
deleting the existing index and recreating it) from a performance point of
view every day or every week?

In another search system I worked with, incremental updates are written to
small files. When the server is restarted, each update in the small files
needs to be applied. When there are several small files to apply, the
restart process can take a few minutes to complete, so the recommendation
was to run a baseline process every night. I'm wondering if this is the
case with Solr as well.


Thanks,
Satish


Re: Baseline vs. Incremental Indexing

2011-04-29 Thread Satish Kumar
thanks Markus and Otis!

This link was helpful:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations

On Fri, Apr 29, 2011 at 3:12 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 The only thing you'd periodically do is optimize the existing index.

  Hi Satish,
 
  I can't think of any benefits you'd reap by complete/full reindexing
  into a new index. Incremental indexing will be faster.
 
 
  Otis
  
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene ecosystem search :: http://search-lucene.com/
 
 
 
  - Original Message 
 
   From: Satish Kumar satish.kumar.just.d...@gmail.com
   To: solr-user@lucene.apache.org
   Sent: Fri, April 29, 2011 2:58:25 PM
   Subject: Baseline vs. Incremental Indexing
  
   Hi,
  
    Currently we index new/updated records every 30 minutes (I am
    referring to this as incremental/partial index) -- i.e., records will
    be added to an existing index. Are there any benefits in creating a
    new index (i.e., delete the existing index and create it) from a
    performance point of view every day or a week?
   
    In other search system I worked with, incremental updates are
    generated in a small file. When the server is restarted, each update
    in the small files need to be applied. When there are several small
    files to be applied, the restart process could take a few minutes to
    complete so the recommendation was to run baseline process every
    night. I'm wondering if this is the case with Solr as well?
  
  
   Thanks,
   Satish



Re: mm=0?

2010-09-13 Thread Satish Kumar
Hi Erik,

I completely agree with you that showing a random document for a user's
query would be a very poor experience. I have raised this in our product
review meetings before. I was told that, because of a contractual
agreement, some sponsored content needs to be returned even if there is no
match. The sponsored content drives the ads displayed on the page, so this
is mostly about showing some ad when there is no matching sponsored result
for the user's query.

Note that other content in addition to the sponsored content is displayed
on the page, so the user is not seeing just one random result when there
is no good match.

It looks like I have to do a second search to get a random result when
there are no results. In that case I will use RandomSortField to generate
a random result (so that a different ad is displayed from the set of
sponsored ads) for each no-result case.

Thanks for the comments!


Satish
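The two-query fallback described above can be sketched as plain application-side Java. The only Solr-specific piece is the sort clause for RandomSortField, which sorts on a dynamic random_* field whose suffix seeds the ordering (the random_* naming follows the stock example schema; the rest is illustrative):

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Primary search first; a random-sorted fallback only when it is empty. */
final class FallbackSearch {
    /** Returns the primary results, or runs the fallback query when empty. */
    static <T> List<T> searchWithFallback(List<T> primary, Supplier<List<T>> fallback) {
        return primary.isEmpty() ? fallback.get() : primary;
    }

    /**
     * Sort clause for Solr's RandomSortField: a dynamic random_* field
     * where the suffix seeds the ordering, so each request shuffles anew.
     */
    static String randomSortClause() {
        return "random_" + ThreadLocalRandom.current().nextInt(1_000_000) + " asc";
    }
}
```

The fallback query would be q=*:* with rows=1 and the random sort clause, restricted to the sponsored-content set.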



On Sun, Sep 12, 2010 at 10:25 AM, Erick Erickson erickerick...@gmail.com wrote:

 Could you explain the use case a bit? Because the very
 first response I would have is "why in the world did
 product management make this a requirement?" and to try
 to get the requirement changed...

 As a user, I'm having a hard time imagining being well
 served by getting a document in response to a search that
 had no relation to my search; it was just a random doc
 selected from the corpus.

 All that said, I don't think a single query would do the trick.
 You could include a very special document with a field
 that no other document had, with very special text in it. Say
 a field named "bogusmatch", filled with the text "bogustext";
 then at least the second query would match one and only
 one document and would take minimal time. Or you could
 tack on to each and every query OR bogusmatch:bogustext^0.001
 (which would really be inexpensive) and filter it out if there
 was more than one response. By boosting it really low, it should
 always appear at the end of the list, which wouldn't be a bad thing.

 DisMax might help you here...

 But do ask if it is really a requirement, or just something nobody's
 objected to before bothering, IMO...

 Best
 Erick

 On Sat, Sep 11, 2010 at 1:10 PM, Satish Kumar 
 satish.kumar.just.d...@gmail.com wrote:

  Hi,
 
  We have a requirement to show at least one result every time -- i.e.,
  even if the user-entered term is not found in any of the documents. I
  was hoping setting mm to 0 would return results in all cases, but it
  does not.
 
  For example, if the user-entered term "alpha" is *not* in any of the
  documents in the index, any document in the index can be returned. If
  the term "alpha" is in the document set, only documents having the term
  "alpha" must be returned.
 
  My idea so far is to perform a search using the user-entered term. If
  there are any results, return them. If there are no results, perform
  another search without the query term -- this means doing two searches.
  Any suggestions on implementing this requirement using only one search?
 
 
  Thanks,
  Satish
 



mm=0?

2010-09-11 Thread Satish Kumar
Hi,

We have a requirement to show at least one result every time -- i.e., even
if the user-entered term is not found in any of the documents. I was hoping
setting mm to 0 would return results in all cases, but it does not.

For example, if the user enters the term "alpha" and it is *not* in any of
the documents in the index, any document in the index can be returned. If
the term "alpha" is in the document set, only documents having the term
"alpha" must be returned.

My idea so far is to perform a search using the user-entered term. If there
are any results, return them. If there are no results, perform another
search without the query term -- this means doing two searches. Any
suggestions on implementing this requirement using only one search?


Thanks,
Satish


facets - id and display value

2010-08-19 Thread Satish Kumar
Hi,

Is it possible to associate properties with a facet? For example, facet on
categoryId (1, 2, 3, etc.) and get properties like display name, image, etc.?


Thanks,
Satish


Re: anti-words - exact match

2010-08-09 Thread Satish Kumar
Thanks Jon.

My initial thought was exactly like yours. My preference was to implement
this requirement entirely at the Solr level so that different applications
won't have to duplicate the logic. However, I am not sure how to
shingle-ize the input query and use it in a filter query with a NOT
operator at the Solr layer. The other option, as you suggested, is to
shingle-ize the input query in the application layer -- this is doable,
but it means adding logic to the application layer.

For now I am settling on the below solution:

- each anti-word (which can be multiple words) will be stored as a
separate token. The input record will contain the different anti-words
separated by commas; solr.PatternTokenizerFactory will be used to split on
the comma and create tokens

- the list of anti-words is kept in memory in the application layer, and
anti-words are extracted from the user-entered query (e.g. if the user
enters 'I have swollen foot' and 'swollen foot' is an anti-word, 'swollen
foot' is extracted)

- a filter query with a NOT operator on the anti-word field is sent to Solr
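The application-layer steps above can be sketched in plain Java. The normalization rules, the phrase matching, and the "antiwords" field name below are all assumptions for illustration; whatever normalization is used here must match what the index-time analyzer does:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Extracts anti-word phrases from a query and builds the negated fq. */
final class AntiWordFilter {
    /** Normalizes case, punctuation, and whitespace (same as index time). */
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("\\p{Punct}", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    /** Returns the anti-words that occur as whole phrases in the query. */
    static List<String> matches(String userQuery, List<String> antiWords) {
        String q = " " + normalize(userQuery) + " ";
        List<String> hits = new ArrayList<>();
        for (String w : antiWords) {
            String n = normalize(w);
            if (!n.isEmpty() && q.contains(" " + n + " ")) {
                hits.add(n);
            }
        }
        return hits;
    }

    /** Builds a negated filter query over the (assumed) antiwords field. */
    static String toFilterQuery(List<String> hits) {
        if (hits.isEmpty()) return "";
        StringBuilder fq = new StringBuilder("-antiwords:(");
        for (int i = 0; i < hits.size(); i++) {
            if (i > 0) fq.append(" OR ");
            fq.append('"').append(hits.get(i)).append('"');
        }
        return fq.append(')').toString();
    }
}
```

The whole-phrase check ("swollen foot" matches inside "I have swollen foot" but not "My foot is swollen") mirrors the exact-match requirement in NOTE 2 of the original question.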


Thanks much!

Satish

This is tricky. You could try doing something with the ShingleFilter (
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory)
 at _query time_ to turn the users query:

 "i have a swollen foot" into:
 "i", "i have", "i have a", "i have a swollen", "have", "have a",
 "have a swollen"... etc.

 I _think_ you can get the ShingleFilter factory to do that.

 But now you only want to exclude if one of those shingles matches the
 ENTIRE anti-word. So maybe index as non-tokenized, so each of those
 shingles will somehow only match on the complete thing.  You'd want to
 normalize spacing and punctuation.

 But then you need to turn that into a _negated_ element of your query.
 Perhaps by using an fq with a NOT/- in it? And a query which 'matches'
 (causing 'not' behavior) if _any_ of the shingles match.

 I have no idea if it's actually possible to put these things together in
 that way. A non-tokenized field? Which still has its queries shingle-ized
 at query time? And then works as a negated query, matching for negation if
 any of the shingles match? Not really sure how to put that together in your
 solrconfig.xml and/or application logic if needed. You could try.


Yup -- I didn't know how to shingle-ize the input query and use that as
input in a filter query.


 Another option would be doing the query-time 'shingling' in your app, and
 then it's a somewhat more normal Solr query: fq=-"shingle one" -"shingle
 two" -"shingle three" etc. Or put them in separate fq's depending on how
 you want to use your filter cache. Still searching on a non-tokenized
 field, and still normalizing on whitespace and punctuation at both index
 time and query time (using the same normalization logic, but in your
 application logic this time). I think that might work.

 So I'm not really sure, but maybe that gives you some ideas.

 Jonathan




 Satish Kumar wrote:

 Hi,

 We have a requirement to NOT display search results if the user query
 contains terms that are in our anti-words field. For example, if the user
 query is "I have swollen foot" and some records in our index have "swollen
 foot" in the anti-words field, we don't want to display those records.
 How do I go about implementing this?

 NOTE 1: the anti-words field can contain multiple values. Each value can
 be one or multiple words (e.g. "swollen foot", "headache", etc.)

 NOTE 2: the match must be exact. If the anti-words field contains
 "swollen foot" and the user query is "I have swollen foot", the record
 must be excluded. If the user query is "My foot is swollen", the record
 should not be excluded.

 Any pointers are greatly appreciated!


 Thanks,
 Satish






randomness - percent share

2010-08-09 Thread Satish Kumar
Hi,

We have some identical records in our data set (e.g., "What is swine flu?"
written by two different authors). When a user searches for "What is swine
flu?", we want the result by author1 to appear as the first result for x%
of the queries and the result by author2 for y% of the queries (where x and
y should be configurable). I am wondering if I can use a percentShare value
(25, 40, 60, etc.) stored per record as an element in controlling the
score, yet still generate randomness -- if record1's share is 75% and
record2's share is 25%, then on average record1 should appear first 75
times and record2 25 times in 100 search queries; if not exactly 75 and 25,
something in that range would be fine too.

Any ideas on implementing this feature?


Thanks much!

Satish
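In the absence of a built-in Solr mechanism for this, one option is to do the weighted pick in the application layer after retrieval: fetch the group of identical records along with their stored percentShare values, then choose which one to surface first. A sketch of that pick (pure application logic, not a Solr feature):

```java
import java.util.List;
import java.util.Random;

/** Picks which of several identical records to surface first. */
final class WeightedPick {
    /**
     * Returns an index with probability proportional to its weight,
     * e.g. weights (75, 25) pick index 0 about 75% of the time.
     */
    static int pick(List<Integer> weights, Random rnd) {
        int total = 0;
        for (int w : weights) total += w;
        int r = rnd.nextInt(total);
        for (int i = 0; i < weights.size(); i++) {
            r -= weights.get(i);
            if (r < 0) return i;
        }
        throw new IllegalStateException("weights must sum to > 0");
    }
}
```

Over many queries the observed split converges to the configured shares, which matches the "something in that range" tolerance in the question.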


anti-words - exact match

2010-08-05 Thread Satish Kumar
Hi,

We have a requirement to NOT display search results if user query contains
terms that are in our anti-words field. For example, if user query is I
have swollen foot and if some records in our index have swollen foot in
anti-words field, we don't want to display those records. How do I go about
implementing this?

NOTE 1: anti-words field can contain multiple values. Each value can be a
one or multiple words (e.g. swollen foot, headache, etc. )

NOTE 2: the match must be exact. If anti-words field contains swollen foot
and if user query is I have swollen foot, record must be excluded. If user
query is My foot is swollen, the record should not be excluded.

Any pointers is greatly appreciated!


Thanks,
Satish


Re: grouping in fq

2010-05-13 Thread Satish Kumar
 (+category:xyz +price:[100 TO *]) -category:xyz

this one doesn't seem to work (I'm not using a price field but a text
field -- the price field here is just for the example).

Below are some other variations I tried:

(+category:xyz +price:[100 TO *]) -category:xyz -- zero results
(+category:xyz +price:[100 TO *]) (-category:xyz) -- returns only results
with category xyz and price >= 100
(+category:xyz +price:[100 TO *]) (*:* -category:xyz) -- returns results
with category xyz and price >= 100 AND results where category != xyz
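For reference, the variation that behaves as intended is the last one: a pure-negative clause needs the *:* match-all to subtract from. A tiny helper that assembles that filter query (field names are illustrative, and the [100 TO *] range means price >= 100 rather than strictly greater):

```java
/** Builds an fq meaning: category != value, OR category == value with price in range. */
final class ConditionalFilter {
    static String fq(String field, String value, String rangeField, int min) {
        // Clause 1 keeps matching docs in the category that satisfy the range;
        // clause 2 keeps everything outside the category (*:* anchors the negation).
        return "(+" + field + ":" + value + " +" + rangeField + ":[" + min + " TO *])"
             + " (*:* -" + field + ":" + value + ")";
    }
}
```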


On Wed, May 12, 2010 at 2:54 PM, Lance Norskog goks...@gmail.com wrote:

 Because leading negative clauses don't work. The (*:* AND x) syntax
 means "select everything, AND also select x".

 You could also do
 (+category:xyz +price:[100 TO *]) -category:xyz

 On Tue, May 11, 2010 at 12:36 PM, Satish Kumar
 satish.kumar.just.d...@gmail.com wrote:
  thanks Ahmet.
 
  (+category:xyz +price:[100 TO *]) (+*:* -category:xyz)
  why do we have to use (+*:* -category:xyz) instead of just -category:xyz?
 
 
 
  On Tue, May 11, 2010 at 3:08 PM, Ahmet Arslan iori...@yahoo.com wrote:
 
   How do I implement a requirement like if category is xyz,
   the price should
   be greater than 100 for inclusion in the result set.
  
   In other words, the result set should contain:
   - all matching documents with category value not xyz
    - all matching documents with category value xyz and price > 100
   
    I was thinking something like fq=(-category:xyz OR
    (category:xyz AND price > 100))
  
   this doesn't seem to work. Any suggestions will be greatly
   appreciated.
 
  Something like this should work:
  (+category:xyz +price:[100 TO *]) (+*:* -category:xyz)
 
  and your price field must be one of the trie based fields.
 
 
 
 
 



 --
 Lance Norskog
 goks...@gmail.com



Re: Unbuffered entity enclosing request can not be repeated.

2010-05-11 Thread Satish Kumar
I upload only 50 documents per call. We have about 200K documents to
index, and we index every night. Any suggestions on how to handle this? (I
could catch this exception and retry.)
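Since the HttpClient error means the POST body cannot be replayed automatically, catching the exception and re-sending the same batch from the application is a reasonable approach. A generic retry helper along those lines (the attempt count and backoff values are arbitrary; in practice the Callable would wrap the add/process call for one 50-document batch):

```java
import java.util.concurrent.Callable;

/** Retries a batch upload that failed with a transient I/O error. */
final class Retry {
    /**
     * Runs the task, retrying up to maxAttempts times with linear backoff.
     * The task must be safe to repeat (re-adding the same documents is
     * effectively idempotent because Solr overwrites by unique key).
     */
    static <T> T withRetry(Callable<T> task, int maxAttempts, long backoffMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

Given that the server logs show read timeouts, raising the client's socket timeout for indexing calls may also reduce how often the retry is needed.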

On Mon, May 10, 2010 at 8:33 PM, Lance Norskog goks...@gmail.com wrote:

 Yes, these occasionally happen with long indexing jobs. You might try
 limiting the number of  documents per upload call.

 On Sun, May 9, 2010 at 9:16 PM, Satish Kumar
 satish.kumar.just.d...@gmail.com wrote:
  Found these errors in Tomcat's log file:
 
  May 9, 2010 10:57:24 PM org.apache.solr.common.SolrException log
  SEVERE: java.lang.RuntimeException: [was class
  java.net.SocketTimeoutException] Read timed out
 at
 
 com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
 at
  com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
 at
 
 com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
 at
  com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
 at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:279)
 at
  org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)
 at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
 at
 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
 
 at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 
 at
 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 
 at
 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 
 at
 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 
 
 
 
  May 9, 2010 10:57:24 PM org.apache.solr.core.SolrCore execute
 
 
   INFO: [] webapp=/solr path=/update params={wt=javabin&version=1}
 status=500
  QTime=25938
 
  May 9, 2010 10:57:24 PM org.apache.solr.common.SolrException log
 
 
  SEVERE: java.lang.RuntimeException: [was class
  java.net.SocketTimeoutException] Read timed out
 
 at
 
 com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
 
 at
  com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
 
 at
 
 com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
 
 at
  com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
 
 at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:279)
 
 
 at
  org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)
 
 at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
 
 
 at
 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
 
 at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 
 
 
 
  May 9, 2010 10:57:33 PM
 org.apache.solr.update.processor.LogUpdateProcessor
  finish
 
  INFO: {} 0 2
 
 
  May 9, 2010 10:57:33 PM org.apache.solr.common.SolrException log
 
 
  SEVERE: org.apache.solr.common.SolrException: Invalid chunk header
 
 
 at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
 
 
 at
 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
 
 at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 
 at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 
 at
 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 
  :
 
 
  On Mon, May 10, 2010 at 12:10 AM, Satish Kumar 
  satish.kumar.just.d...@gmail.com wrote:
 
  Hi,
 
  I am getting the following error when I run the index process once in a
  while. I'm using Solr 1.4. Any suggestions on how to resolve this error?
 
  Caused by: org.apache.solr.client.solrj.SolrServerException:
  org.apache.commons.httpclient.ProtocolException: Unbuffered entity
 enclosing
  request can not be repeated.
  at
 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469

grouping in fq

2010-05-11 Thread Satish Kumar
Hi,

How do I implement a requirement like: if the category is xyz, the price
should be greater than 100 for inclusion in the result set.

In other words, the result set should contain:
- all matching documents with category value not xyz
- all matching documents with category value xyz and price > 100

I was thinking something like fq=(-category:xyz OR (category:xyz AND
price > 100))

this doesn't seem to work. Any suggestions will be greatly appreciated.


thanks,
Satish


Re: grouping in fq

2010-05-11 Thread Satish Kumar
thanks Ahmet.

(+category:xyz +price:[100 TO *]) (+*:* -category:xyz)
why do we have to use (+*:* -category:xyz) instead of just -category:xyz?



On Tue, May 11, 2010 at 3:08 PM, Ahmet Arslan iori...@yahoo.com wrote:

  How do I implement a requirement like if category is xyz,
  the price should
  be greater than 100 for inclusion in the result set.
 
  In other words, the result set should contain:
  - all matching documents with category value not xyz
  - all matching documents with category value xyz and price > 100
 
  I was thinking something like fq=(-category:xyz OR
  (category:xyz AND price > 100))
 
  this doesn't seem to work. Any suggestions will be greatly
  appreciated.

 Something like this should work:
 (+category:xyz +price:[100 TO *]) (+*:* -category:xyz)

 and your price field must be one of the trie based fields.






query parser for boost query text

2010-05-11 Thread Satish Kumar
Hi,

Special characters in the text used for boost queries are not removed. For
example, bq=field1:(what is xyz?)^10 gets parsed into the query
field1:xyz?^10 ("what" and "is" are stop words). The question mark didn't
get removed -- field1 uses the standard tokenizer and standard filter, so I
expect it to be removed. When I test it using the analysis page against
field1, the question mark does get removed.

any suggestions?

thanks,
satish


Unbuffered entity enclosing request can not be repeated.

2010-05-09 Thread Satish Kumar
Hi,

I am getting the following error when I run the index process once in a
while. I'm using Solr 1.4. Any suggestions on how to resolve this error?

Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing
request can not be repeated.
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)

... 3 more
Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered
entity enclosing request can not be repeated.
at
org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
at
org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
at
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:416)
... 7 more



Thanks,
Satish


Re: Unbuffered entity enclosing request can not be repeated.

2010-05-09 Thread Satish Kumar
Found these errors in Tomcat's log file:

May 9, 2010 10:57:24 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: [was class
java.net.SocketTimeoutException] Read timed out
at
com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
at
com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
at
com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
at
com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:279)
at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)

at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)


at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)

at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)

at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)




May 9, 2010 10:57:24 PM org.apache.solr.core.SolrCore execute


INFO: [] webapp=/solr path=/update params={wt=javabin&version=1} status=500
QTime=25938

May 9, 2010 10:57:24 PM org.apache.solr.common.SolrException log


SEVERE: java.lang.RuntimeException: [was class
java.net.SocketTimeoutException] Read timed out

at
com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)

at
com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)

at
com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)

at
com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)

at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:279)


at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)

at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)


at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)

at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)


at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)




May 9, 2010 10:57:33 PM org.apache.solr.update.processor.LogUpdateProcessor
finish

INFO: {} 0 2


May 9, 2010 10:57:33 PM org.apache.solr.common.SolrException log


SEVERE: org.apache.solr.common.SolrException: Invalid chunk header


at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)


at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)

at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)


at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)

at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)

:


On Mon, May 10, 2010 at 12:10 AM, Satish Kumar 
satish.kumar.just.d...@gmail.com wrote:

 Hi,

 I am getting the following error when I run the index process once in a
 while. I'm using Solr 1.4. Any suggestions on how to resolve this error?

 Caused by: org.apache.solr.client.solrj.SolrServerException:
 org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing
 request can not be repeated.
 at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
 at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)

 ... 3 more
 Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered
 entity enclosing request can not be repeated.
 at
 org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
 at
 org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114

Re: Score cutoff

2010-05-03 Thread Satish Kumar
Hi,

Can someone give clues on how to implement this feature? This is a very
important requirement for us, so any help is greatly appreciated.


thanks!

On Tue, Apr 27, 2010 at 5:54 PM, Satish Kumar 
satish.kumar.just.d...@gmail.com wrote:

 Hi,

 For some of our queries, the top xx (five or so) results are of very high
 quality and the results after xx are very poor. The difference in score
 between the high-quality and poor-quality results is large -- for example,
 3.5 for high quality and 0.8 for poor quality. We want to exclude results
 with a score less than 60% or so of the first result's. Is there a filter
 that does this? If not, can someone please give some hints on how to
 implement this (we want it to happen as part of Solr relevance ranking so
 that the facet counts, etc., will be correct).


 Thanks,
 Satish



Score cutoff

2010-04-27 Thread Satish Kumar
Hi,

For some of our queries, the top xx (five or so) results are of very high
quality and the results after xx are very poor. The difference in score
between the high-quality and poor-quality results is large -- for example,
3.5 for high quality and 0.8 for poor quality. We want to exclude results
with a score less than 60% or so of the first result's. Is there a filter
that does this? If not, can someone please give some hints on how to
implement this (we want it to happen as part of Solr relevance ranking so
that the facet counts, etc., will be correct).


Thanks,
Satish
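Solr has no built-in relative-score cutoff, so short of writing a custom search component this is often approximated client-side: fetch the scored results and drop everything below a fraction of the top score. Note this post-filtering happens after faceting, so facet counts would not reflect the cutoff; keeping them correct would require doing it inside ranking. An application-side sketch of the cutoff itself (names illustrative):

```java
import java.util.ArrayList;
import java.util.List;

/** Client-side relative-score cutoff over an already-ranked result list. */
final class ScoreCutoff {
    /**
     * Keeps scores that are at least `ratio` of the top score.
     * Assumes scores are sorted in descending order, as Solr returns them.
     */
    static List<Double> cutoff(List<Double> scoresDesc, double ratio) {
        List<Double> kept = new ArrayList<>();
        if (scoresDesc.isEmpty()) return kept;
        double threshold = scoresDesc.get(0) * ratio;
        for (double s : scoresDesc) {
            if (s >= threshold) kept.add(s);
        }
        return kept;
    }
}
```

With the example scores from the question (3.5 high quality, 0.8 poor) and ratio 0.6, the threshold is 2.1 and the poor results fall away.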