Re: Query Elevation exception on shard queries

2013-05-08 Thread varun srivastava
OK, found the solution: like SpellCheckComponent, the Elevate Component also
requires the shards.qt param. But I still don't know why both of these components
don't work in the absence of shards.qt. Can anyone explain?

Thanks
Varun
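
For readers wondering what this looks like in practice, here is a minimal Python sketch of a distributed request that sets shards.qt. As far as I understand it, shards.qt names the request handler that the per-shard sub-requests are sent to; without it they go to the shards' default handler, which may not have the elevation (or spellcheck) component configured. That reading is an assumption and is not confirmed anywhere in this thread. The port 8983, the /elevate handler name, and the helper function are all hypothetical.

```python
from urllib.parse import urlencode


def build_elevated_shard_query(base_url, handler, query, shards):
    """Build a distributed-query URL whose shard sub-requests are
    routed to the same handler via shards.qt.

    base_url, handler and shards are illustrative values, not taken
    from the thread (the original log elides the port).
    """
    params = [
        ("q", query),
        ("shards", ",".join(shards)),
        # Ask Solr to send the per-shard requests to this handler too.
        ("shards.qt", handler),
    ]
    return f"{base_url.rstrip('/')}{handler}?{urlencode(params)}"


url = build_elevated_shard_query(
    "http://localhost:8983/solr/core1",  # hypothetical host and port
    "/elevate",                          # hypothetical handler name
    "civil war",
    ["localhost:8983/solr/core1", "localhost:8983/solr/core2"],
)
print(url)
```

The same two parameters can of course be added by hand to an existing query string; the helper only makes the URL encoding explicit.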


On Mon, May 6, 2013 at 1:14 PM, varun srivastava varunmail...@gmail.com wrote:

 Thanks Ravi. So then it is a bug.


 On Mon, May 6, 2013 at 12:04 PM, Ravi Solr ravis...@gmail.com wrote:

 Varun,
  Since our cores were totally disjoint, i.e. they pertain to two
 different applications which may or may not have results for a given query,
 we moved the elevation outside of Solr into our Java code. As long as both
 cores had some results to return for a given query, elevation would work.

 Thanks,

 Ravi


 On Sat, May 4, 2013 at 1:54 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi Ravi,
   I am getting the same problem. Did you find any solution?
 
  Thanks
  Varun
 
 

Re: Elevate Problem with Distributed query

2013-05-08 Thread varun srivastava
OK, found the solution: like SpellCheckComponent, the Elevate Component also
requires the shards.qt param. But I still don't know why both of these components
don't work in the absence of shards.qt. Can anyone explain?

Thanks


On Sat, May 4, 2013 at 1:08 PM, varun srivastava varunmail...@gmail.com wrote:


 I am getting the following exception when the sort field name is _elevate_.


 java.lang.IndexOutOfBoundsException: Index: 1, Size: 0\n\tat
 java.util.ArrayList.RangeCheck(ArrayList.java:547)\n\tat
 java.util.ArrayList.get(ArrayList.java:322)\n\tat

 org.apache.solr.common.util.NamedList.getVal(NamedList.java:136)\n\tat
 org.apache.solr.handler.component.ShardFieldSortedHitQueue$ShardComparator.sortVal(ShardDoc.java:217)\n\tat


 org.apache.solr.handler.component.ShardFieldSortedHitQueue$2.compare(ShardDoc.java:255)\n\tat

 org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:159)\n\tat
 org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:101)\n\tat

 org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)\n\tat
 org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:863)\n\tat


 org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)\n\tat


 org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)\n\tat


 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:309)\n\tat

 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)\n\tat

 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)\n\tat
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)\n\tat

 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)\n\tat
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)\n\tat

 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)\n\tat
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)\n\tat

 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)\n\tat
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)\n\tat

 org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)\n\tat
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)\n\tat

 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)\n\tat
 org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1001)\n\tat

 org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)\n\tat
 org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)\n\tat

 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)\n\tat
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)\n\tat

 java.lang.Thread.run(Thread.java:662)




Re: Query Elevation exception on shard queries

2013-05-06 Thread varun srivastava
Thanks Ravi. So then it is a bug.


On Mon, May 6, 2013 at 12:04 PM, Ravi Solr ravis...@gmail.com wrote:

 Varun,
  Since our cores were totally disjoint, i.e. they pertain to two
 different applications which may or may not have results for a given query,
 we moved the elevation outside of Solr into our Java code. As long as both
 cores had some results to return for a given query, elevation would work.

 Thanks,

 Ravi
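
Ravi's Java code is not shown in the thread, so the following is only a rough Python sketch of what "moving the elevation outside of Solr" could look like: take the already merged result list and pin a configured set of document IDs to the top. The function name, the "id" field, and the sample data are hypothetical.

```python
def apply_elevation(docs, elevated_ids, id_field="id"):
    """Return docs with the elevated documents moved to the front.

    docs          -- merged result list (dicts) in relevance order
    elevated_ids  -- IDs to pin, in the order they should appear
    Elevated IDs that are missing from docs are ignored, so a shard
    that returns nothing for the query cannot cause an error here.
    """
    by_id = {doc[id_field]: doc for doc in docs}
    # dict.fromkeys drops repeated IDs while keeping their order.
    pinned = [by_id[i] for i in dict.fromkeys(elevated_ids) if i in by_id]
    pinned_ids = {doc[id_field] for doc in pinned}
    rest = [doc for doc in docs if doc[id_field] not in pinned_ids]
    return pinned + rest


merged = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
reordered = apply_elevation(merged, ["c", "x"])
print([doc["id"] for doc in reordered])
```

Doing the reordering in the application sidesteps the shard merge step entirely, at the cost of keeping the elevation list outside elevate.xml.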


 On Sat, May 4, 2013 at 1:54 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi Ravi,
   I am getting the same problem. Did you find any solution?
 
  Thanks
  Varun
 
 

Re: Is there a way to remove caches in SOLR?

2013-05-06 Thread varun srivastava
Set the cache size to 0.
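
A small sketch of that suggestion, assuming the caches are the usual filterCache, queryResultCache and documentCache entries in solrconfig.xml. It sets size and autowarmCount to 0 on each of them. Whether every cache implementation accepts a size of 0 has not been verified here, and any firstSearcher/newSearcher warming queries configured as listeners would need to be removed separately. The sample config and the function name are illustrative only.

```python
import xml.etree.ElementTree as ET

# Cache element names assumed to be present in solrconfig.xml.
CACHE_TAGS = ("filterCache", "queryResultCache", "documentCache")


def zero_caches(solrconfig_xml):
    """Return the config text with size and autowarmCount set to 0
    on every known cache element. Note that ElementTree drops XML
    comments, so review the result before replacing a real file."""
    root = ET.fromstring(solrconfig_xml)
    for tag in CACHE_TAGS:
        for cache in root.iter(tag):
            cache.set("size", "0")
            cache.set("autowarmCount", "0")
    return ET.tostring(root, encoding="unicode")


sample = """<config><query>
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query></config>"""

result = zero_caches(sample)
print(result)
```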


On Mon, May 6, 2013 at 4:38 PM, bbarani bbar...@gmail.com wrote:

 I am trying to create performance metrics for SOLR. I don't want the
 searcher
 to warm up when I issue a query since I am trying to collect metrics for
 cold search. Is there a way to disable warming?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Is-there-a-way-to-remove-caches-in-SOLR-tp4061216.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to get solr synonyms in result set.

2013-05-05 Thread varun srivastava
Hi Suneel,
 After discovering that only query-time synonyms work with Solr, I found a
good article on the pros and cons of query-time and index-time synonyms. It may
help you:
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/

Regards
Varun
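
On the original question of getting the synonym terms back alongside the results: one option that needs no Solr changes is to read the same synonyms .txt file in the application and look up the query terms there. The sketch below assumes the common synonyms file layout (comma-separated equivalents, "a => b" explicit mappings, "#" comments) and case-insensitive matching; it only handles single-word lookups, so multi-word synonyms are not matched. The function names and sample entries are hypothetical.

```python
from collections import defaultdict


def load_synonyms(text):
    """Parse synonyms-file text into {term: set of synonyms}."""
    table = defaultdict(set)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: every left-hand term maps to the right side.
            left, right = line.split("=>", 1)
            sources = [t.strip().lower() for t in left.split(",") if t.strip()]
            targets = [t.strip().lower() for t in right.split(",") if t.strip()]
            for source in sources:
                table[source].update(targets)
        else:
            # Equivalent group: every term maps to all the others.
            group = [t.strip().lower() for t in line.split(",") if t.strip()]
            for term in group:
                table[term].update(t for t in group if t != term)
    return dict(table)


def synonyms_for(query, table):
    """Return {query token: sorted synonyms} for tokens found in table."""
    return {tok: sorted(table[tok]) for tok in query.lower().split() if tok in table}


sample = "# furniture\ncouch, sofa, divan\ntv => television\n"
table = load_synonyms(sample)
print(synonyms_for("cheap TV couch", table))
```

The application can then attach the returned mapping to whatever it renders next to the Solr result set.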


On Sun, May 5, 2013 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote:

 Sure, you can specify a separate synonyms list at query time. Just define
 separate index-time and query-time analysis chains, one with the synonym
 filter factory and one without.

 Be aware that index-time and query-time synonyms have some different
 characteristics, especially around multi-word synonyms. See:

 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

 Best
 Erick

 On Sun, May 5, 2013 at 12:23 AM, varun srivastava
 varunmail...@gmail.com wrote:
  Hi,
   The synonyms list is used at index time, so I don't think you can pass a
  list at query time and make it work.
 
 
  On Fri, May 3, 2013 at 11:53 PM, Suneel Pandey pandey.sun...@gmail.com
 wrote:
 
  Hi,
 
  I want to get the list of specific Solr synonym terms in the result set at
  query time, based on filter criteria.
  I have implemented the synonyms in a .txt file.
 
  Thanks
 
 
 
 
 
 
 
 
  -
  Regards,
 
  Suneel Pandey
  Sr. Software Developer
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/How-to-get-solr-synonyms-in-result-set-tp4060796.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: Query Elevation exception on shard queries

2013-05-04 Thread varun srivastava
Hi Ravi,
 I am getting the same problem. Did you find any solution?

Thanks
Varun


On Fri, Mar 29, 2013 at 11:48 AM, Ravi Solr ravis...@gmail.com wrote:

 Hello,
   We have a Solr 3.6.2 multicore setup, where each core is a complete
 index for one application. In our site search we use a sharded query to query
 two cores at a time. The issue is, if one core has docs but the other core
 doesn't for an elevated query, Solr throws a 500 error. I would really
 appreciate it if somebody can point me in the right direction on how to
 avoid this error. The following is my query:


 [#|2013-03-29T13:44:55.609-0400|INFO|sun-appserver2.1|org.apache.solr.core.SolrCore|_ThreadID=22;_ThreadName=httpSSLWorkerThread-9001-0;|[core1]
 webapp=/solr path=/select/

 params={q=civil+war&start=0&rows=10&shards=localhost:/solr/core1,localhost:/solr/core2&hl=true&hl.fragsize=0&hl.snippets=5&hl.simple.pre=strong&hl.simple.post=/strong&hl.fl=body&fl=*&facet=true&facet.field=type&facet.mincount=1&facet.method=enum&fq=pubdate:[2005-01-01T00:00:00Z+TO+NOW/DAY%2B1DAY]&facet.query={!ex%3Ddt+key%3DPast+24+Hours}pubdate:[NOW/DAY-1DAY+TO+NOW/DAY%2B1DAY]&facet.query={!ex%3Ddt+key%3DPast+7+Days}pubdate:[NOW/DAY-7DAYS+TO+NOW/DAY%2B1DAY]&facet.query={!ex%3Ddt+key%3DPast+60+Days}pubdate:[NOW/DAY-60DAYS+TO+NOW/DAY%2B1DAY]&facet.query={!ex%3Ddt+key%3DPast+12+Months}pubdate:[NOW/DAY-1YEAR+TO+NOW/DAY%2B1DAY]&facet.query={!ex%3Ddt+key%3DAll+Since+2005}pubdate:[*+TO+NOW/DAY%2B1DAY]}
 status=500 QTime=15 |#]


 As you can see, the 2 cores are core1 and core2. core1 has data for the
 query 'civil war', however core2 doesn't have any data. We have
 'civil war' in elevate.xml, which causes Solr to throw a SolrException as
 follows. However, if I remove the elevate entry for this query, everything
 works well.

 *type* Status report

 *message* Index: 1, Size: 0

 java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
     at java.util.ArrayList.get(ArrayList.java:322)
     at org.apache.solr.common.util.NamedList.getVal(NamedList.java:137)
     at org.apache.solr.handler.component.ShardFieldSortedHitQueue$ShardComparator.sortVal(ShardDoc.java:221)
     at org.apache.solr.handler.component.ShardFieldSortedHitQueue$2.compare(ShardDoc.java:260)
     at org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:160)
     at org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:101)
     at org.apache.lucene.util.PriorityQueue.upHeap(PriorityQueue.java:223)
     at org.apache.lucene.util.PriorityQueue.add(PriorityQueue.java:132)
     at org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:148)
     at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:786)
     at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:587)
     at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:566)
     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:283)
     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:313)
     at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:287)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:218)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
     at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)
     at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:98)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:222)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
     at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
     at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:166)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
     at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
     at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
     at

Elevate Problem with Distributed query

2013-05-04 Thread varun srivastava
Hi,
 Is the Query Elevation feature supposed to work with distributed queries? I have
2 shards, but when I run a distributed query I get the following exception. I
am using Solr 4.0.0.


In the following bug, Yonik refers to the problem in his comment:
https://issues.apache.org/jira/browse/SOLR-2949?focusedCommentId=13232736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13232736

But it seems the bug is fixed in 4.0, so why am I getting the following exception
with the _elevate_ field name?

java.lang.IndexOutOfBoundsException: Index: 1, Size: 0\n\tat
java.util.ArrayList.RangeCheck(ArrayList.java:547)\n\tat
java.util.ArrayList.get(ArrayList.java:322)\n\tat
org.apache.solr.common.util.NamedList.getVal(NamedList.java:136)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue$ShardComparator.sortVal(ShardDoc.java:217)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue$2.compare(ShardDoc.java:255)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:159)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:101)\n\tat
org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)\n\tat
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:863)\n\tat
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)\n\tat
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:309)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)\n\tat
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)\n\tat
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)\n\tat
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)\n\tat
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)\n\tat
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)\n\tat
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)\n\tat
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)\n\tat
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)\n\tat
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)\n\tat
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1001)\n\tat
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)\n\tat
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)\n\tat
java.lang.Thread.run(Thread.java:662)



Thanks

Varun


Re: Elevate Problem with Distributed query

2013-05-04 Thread varun srivastava
I am getting the following exception when the sort field name is _elevate_.


java.lang.IndexOutOfBoundsException: Index: 1, Size: 0\n\tat
java.util.ArrayList.RangeCheck(ArrayList.java:547)\n\tat
java.util.ArrayList.get(ArrayList.java:322)\n\tat

org.apache.solr.common.util.NamedList.getVal(NamedList.java:136)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue$ShardComparator.sortVal(ShardDoc.java:217)\n\tat

org.apache.solr.handler.component.ShardFieldSortedHitQueue$2.compare(ShardDoc.java:255)\n\tat

org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:159)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:101)\n\tat

org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)\n\tat
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:863)\n\tat

org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)\n\tat

org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)\n\tat

org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:309)\n\tat

org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)\n\tat

org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)\n\tat

org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)\n\tat
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)\n\tat

org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)\n\tat
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)\n\tat

org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)\n\tat
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)\n\tat

org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)\n\tat
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)\n\tat

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)\n\tat
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1001)\n\tat

org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)\n\tat
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)\n\tat

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)\n\tat

java.lang.Thread.run(Thread.java:662)




Re: Query Elevation exception on shard queries

2013-05-04 Thread varun srivastava
Hi,
 I am getting the same exception as Ravi when using a shard query containing
elevated terms. I am using the Solr 4.0.0 stable version, which is supposed to
contain the fix for this.

https://issues.apache.org/jira/browse/SOLR-2949


java.lang.IndexOutOfBoundsException: Index: 1, Size: 0\n\tat
java.util.ArrayList.RangeCheck(ArrayList.java:547)\n\tat
java.util.ArrayList.get(ArrayList.java:322)\n\tat

org.apache.solr.common.util.NamedList.getVal(NamedList.java:136)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue$ShardComparator.sortVal(ShardDoc.java:217)\n\tat

org.apache.solr.handler.component.ShardFieldSortedHitQueue$2.compare(ShardDoc.java:255)\n\tat

org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:159)\n\tat
org.apache.solr.handler.component.ShardFieldSortedHitQueue.lessThan(ShardDoc.java:101)\n\tat

org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)\n\tat
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:863)\n\tat

org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)\n\tat

org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)\n\tat

org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:309)\n\tat

org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)\n\tat

org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)\n\tat

org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)\n\tat
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)\n\tat

org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)\n\tat
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)\n\tat

org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)\n\tat
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)\n\tat

org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)\n\tat
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)\n\tat

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)\n\tat
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1001)\n\tat

org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)\n\tat
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)\n\tat

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)\n\tat

java.lang.Thread.run(Thread.java:662)






On Sat, May 4, 2013 at 10:54 AM, varun srivastava varunmail...@gmail.comwrote:

 Hi Ravi,
  I am getting the same problem. Did you find a solution?

 Thanks
 Varun


 On Fri, Mar 29, 2013 at 11:48 AM, Ravi Solr ravis...@gmail.com wrote:

 Hello,
   We have a Solr 3.6.2 multicore setup, where each core is a complete
 index for one application. In our site search we use a sharded query to
 query two cores at a time. The issue is that if one core has docs for an
 elevated query but the other core doesn't, Solr throws a 500 error. I would
 really appreciate it if somebody could point me in the right direction on
 how to avoid this error. The following is my query:


 [#|2013-03-29T13:44:55.609-0400|INFO|sun-appserver2.1|org.apache.solr.core.SolrCore|_ThreadID=22;_ThreadName=httpSSLWorkerThread-9001-0;|[core1]
 webapp=/solr path=/select/

 params={q=civil+warstart=0rows=10shards=localhost:/solr/core1,localhost:/solr/core2hl=truehl.fragsize=0hl.snippets=5hl.simple.pre=stronghl.simple.post=/stronghl.fl=bodyfl=*facet=truefacet.field=typefacet.mincount=1facet.method=enumfq=pubdate:[2005-01-01T00:00:00Z+TO+NOW/DAY%2B1DAY]facet.query={!ex%3Ddt+key%3DPast+24+Hours}pubdate:[NOW/DAY-1DAY+TO+NOW/DAY%2B1DAY]facet.query={!ex%3Ddt+key%3DPast+7+Days}pubdate:[NOW/DAY-7DAYS+TO+NOW/DAY%2B1DAY]facet.query={!ex%3Ddt+key%3DPast+60+Days}pubdate:[NOW/DAY-60DAYS+TO+NOW/DAY%2B1DAY]facet.query={!ex%3Ddt+key%3DPast+12+Months}pubdate:[NOW/DAY-1YEAR+TO+NOW/DAY%2B1DAY]facet.query={!ex%3Ddt+key%3DAll+Since+2005}pubdate:[*+TO+NOW/DAY%2B1DAY]}
 status=500 QTime=15 |#]


 As you can see, the 2 cores are core1 and core2. core1 has data for the
 query 'civil war', but core2 doesn't have any. We have 'civil war' in
 elevate.xml, which causes Solr to throw a SolrException as follows.
 However, if I remove the elevate entry for this query, everything works
 well.

 *type* Status report

 *message*Index: 1, Size: 0 java.lang.IndexOutOfBoundsException: Index: 1,
 Size: 0 at java.util.ArrayList.RangeCheck(ArrayList.java:547

Re: How to get solr synonyms in result set.

2013-05-04 Thread varun srivastava
Hi,
 The synonyms list is used at index time, so I don't think you can pass a
list at query time and make it work.


On Fri, May 3, 2013 at 11:53 PM, Suneel Pandey pandey.sun...@gmail.comwrote:

 Hi,

 I want to get specific solr synonyms terms list during query time in result
 set based on filter criteria.
 I have implemented synonyms in .txt file.

 Thanks








 -
 Regards,

 Suneel Pandey
 Sr. Software Developer
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/How-to-get-solr-synonyms-in-result-set-tp4060796.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrCloud - Sorting Problem

2013-03-09 Thread varun srivastava
Hi Deniz,
 Your mail about distributed queries is really helpful. Can you or someone
else improve the following wiki page? Right now we don't have any document
explaining distributed search in Solr, which is now the backbone of SolrCloud.

http://wiki.apache.org/solr/WritingDistributedSearchComponents

Thanks
Varun

On Sun, Dec 2, 2012 at 10:49 PM, deniz denizdurmu...@gmail.com wrote:

 I think I have figured out this... at least some kinda..

 After putting logs here there in the code, especially in SolrCore,
 HttpShardHandler, SearchHandler classes, it seems like sorting is done
 after
 all of the shards finish responding and then before we see the results
 the
 result set is sorted... I am not sure if this is correct or not totally, it
 is what i see from the logs, in the request headers..

 so for a shard or distributed search the header looks like this:

 status=0,QTime=4,params={df=text,fl=*,position,shard.url=blablabla

 and just before i see the results on my browser the header becomes this:

 status=0,QTime=178,params={fl=*,position,sort=myfield desc

 and basically, because the position field was filled before actual sorting
 on the page, the positions are incorrect...

 is this right? i mean sorting is really done after everything finishes and
 we are about to get results?
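
 The behaviour deniz describes can be pictured with a small sketch. This is
 not Solr's code: `merge_shard_results` and the per-shard `position` field
 are hypothetical names, and the only assumption is that each shard returns
 its own sorted hits and the coordinating node merges them afterwards. A
 position stamped on each shard before the merge no longer matches the final
 order.

```python
import heapq


def merge_shard_results(shard_results, sort_field, rows):
    """Merge per-shard hit lists (each already sorted descending) into one list."""
    merged = heapq.merge(*shard_results, key=lambda d: d[sort_field], reverse=True)
    return list(merged)[:rows]


# Each shard sorts locally and stamps a hypothetical per-shard position.
shard1 = [{"id": "a", "myfield": 9, "position": 1},
          {"id": "b", "myfield": 4, "position": 2}]
shard2 = [{"id": "c", "myfield": 7, "position": 1},
          {"id": "d", "myfield": 5, "position": 2}]

final = merge_shard_results([shard1, shard2], "myfield", rows=4)
final_ids = [d["id"] for d in final]
stale_positions = [d["position"] for d in final]  # positions assigned before the merge
```

 In this toy run the merged order interleaves both shards, while the
 positions carried over from the shards repeat, which is the kind of
 mismatch reported above.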



 -
 Zeki ama calismiyor... Calissa yapar...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SolrCloud-Sorting-Problem-tp4023382p4023889.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrCloud - Sorting Problem

2013-03-09 Thread varun srivastava
Also, if anyone who understands distributed search could update the
following wiki page, it would be really helpful for all of us.

http://wiki.apache.org/solr/DistributedSearchDesign

Thanks
Varun





Re: dropping fields from input data

2013-03-05 Thread varun srivastava
Thanks, Hoss. Is this available in 4.0?

On Tue, Mar 5, 2013 at 5:14 PM, Chris Hostetter hossman_luc...@fucit.orgwrote:


 : <dynamicField name="stamp_*" type="string" indexed="false"
 : stored="false" multiValued="true"/>

 Take a look at IgnoreFieldUpdateProcessorFactory...


 https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html

 https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html

 Using that (instead of setting indexed=false stored=false in schema.xml)
 has the advantage that you can use it to throw away fields early in the
 update processor pipeline, before any distributed logic happens in
 SolrCloud.


 -Hoss



Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
How can I set up SolrCloud in master-slave mode? Can you point me to any
sample config or tutorial that describes the steps?

As you know from my previous mails, I don't need active Solr replicas;
I just need a mechanism to copy a given SolrCloud index to a new SolrCloud
instance (the classic master-slave setup).

Eric/Mark,
  We have 10 virtual data centres. It is set up like this because we do
rolling updates: while the first DC is being indexed, the other 9 serve
traffic. Indexing one DC takes 2 hours. With a single shard we used to index
one DC and then quickly replicate the index to the other DCs through a
master-slave setup. With SolrCloud we obviously can't index each DC
sequentially, as that would take 2*10 hours. So we need a way of indexing
one DC and then quickly propagating the index binaries to the others. What
would you recommend for SolrCloud?

Thanks
Varun

On Thu, Feb 28, 2013 at 6:12 AM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 26, 2013, at 6:49 PM, varun srivastava varunmail...@gmail.com
 wrote:

  So does it means while doing document add the state of cluster is
 fetched
  from zookeeper and then depending upon hash of docid the target shard is
  decided ?

 We keep the zookeeper info cached locally. We only updated it when
 ZooKeeper tells us it has changed.

 
  Assume we have 3 shards ( with no replicas) in which 1 went down while
  indexing , so will all the documents will be routed to remaining 2 shards
  or only 2/3 rd of the documents will be indexed ? If answer is remaining
 2
  shards will get all the documents , then if later 3rd shard comes up
 online
  then will solr cloud will do rebalancing ?

 All of the updates that hash to the third shard will fail. That is why we
 have replicas - if you have a replica, it will take over as the leader.

 
  Is anywhere in zookeeper we store the range of docids stored in each
 shard,
  or any other information about actual docs ?

 The range of hashes are stored for each shard in zk.
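
 Mark's two answers above (hash ranges per shard, and updates failing when
 the owning shard is down) can be sketched as follows. This is only an
 illustration under stated assumptions: `build_ranges`, `route` and `index`
 are hypothetical helpers, CRC32 is a stand-in for whatever hash Solr
 actually uses, and the equal split of the hash space is an assumption.

```python
import zlib


def build_ranges(num_shards, space=2**32):
    """Split the hash space into contiguous, roughly equal ranges, one per shard."""
    step = space // num_shards
    return [(i * step, space - 1 if i == num_shards - 1 else (i + 1) * step - 1)
            for i in range(num_shards)]


def route(doc_id, ranges):
    """Return the index of the shard whose range contains the hash of doc_id."""
    h = zlib.crc32(doc_id.encode("utf-8"))  # stand-in hash, not Solr's own
    for shard, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return shard
    raise ValueError("hash outside all ranges")


def index(doc_ids, ranges, live_shards):
    """Documents whose target shard is down fail; they are not rerouted."""
    ok, failed = [], []
    for doc_id in doc_ids:
        (ok if route(doc_id, ranges) in live_shards else failed).append(doc_id)
    return ok, failed


ranges = build_ranges(3)
docs = [f"doc{i}" for i in range(300)]
ok, failed = index(docs, ranges, live_shards={0, 1})  # third shard is down
```

 The point of the sketch is that the target shard is a pure function of the
 document id and the stored ranges, so a missing shard means failed updates
 rather than redistribution.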

  We have 2 datacentres (dc1 and
  dc2) which need to be indexed with exactly same data and we update index
  only once a day. Both dc1 and dc2 have exact same solrcloud config and
  machines.
 
  Can we populate dc2 by just copying all the index binaries from
  solr-cores/core0/data of dc1, to the machines in dc2 ( to avoid indexing
  same documents on dc2). I guess solr replication API doesn't work in
  solrcloud, hence loooking for work around.
 
  Thanks
  Varun
 
  On Tue, Feb 26, 2013 at 3:34 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
  ZooKeeper
  /
  /clusterstate.json - info about the layout and state of the cluster -
  collections, shards, urls, etc
  /collections - config to use for the collection, shard leader voting zk
  nodes
  /configs - sets of config files
  /live_nodes - ephemeral nodes, one per Solr node
  /overseer - work queue for update clusterstate.json, creating new
  collections, etc
  /overseer_elect - overseer voting zk nodes
 
  - Mark
 
  On Feb 26, 2013, at 6:18 PM, varun srivastava varunmail...@gmail.com
  wrote:
 
  Hi Mark,
  One more question
 
  While doing solr doc update/add what information is required from
  zookeeper
  ? Can you tell what all information is stored in zookeeper other than
 the
  startup configs.
 
  Thanks
  Varun
 
  On Tue, Feb 26, 2013 at 3:09 PM, Mark Miller markrmil...@gmail.com
  wrote:
 
 
  On Feb 26, 2013, at 5:25 PM, varun srivastava varunmail...@gmail.com
 
  wrote:
 
  Hi All,
  I have some questions regarding role of zookeeper in solrcloud
 runtime,
  while processing the queries .
 
  1) Is zookeeper cluster referred by solr shards for processing every
  request, or its only used to copy config on startup time ?
 
  No, it's not used per request. Solr talks to ZooKeeper on SolrCore
  startup
  - to get configs and set itself up. Then it only talks to ZooKeeper
  when a
  cluster state change happens - in that case, ZooKeeper pings Solr and
  Solr
  will get an update view of the cluster. That view is cached and used
 for
  requests. In a stable state, Solr is not talking to ZooKeeper other
 than
  the heartbeat they keep to know a node is up.
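
  A rough sketch of the caching pattern Mark describes, assuming nothing
  about Solr's real classes: `CachedClusterState` and its `fetch` callback
  are hypothetical, and `on_change` stands in for the notification sent when
  the cluster state changes.

```python
class CachedClusterState:
    """Keep a local copy of the cluster view; refresh only when notified."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._state = fetch()  # read once at startup
        self.reads = 1

    def on_change(self):
        # Called when the coordinator signals that the state has changed.
        self._state = self._fetch()
        self.reads += 1

    def get(self):
        # Per-request lookups use the cached copy only.
        return self._state


versions = iter([{"live_nodes": 3}, {"live_nodes": 2}])
cache = CachedClusterState(lambda: next(versions))

for _ in range(1000):      # many requests, no extra reads
    cache.get()
reads_before = cache.reads

cache.on_change()          # one node went away
reads_after = cache.reads
current = cache.get()
```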
 
  2) How loadbalancing is done between replicas ? Is traffic stat
 shared
  through zookeeper ?
 
  Basic round robin. Traffic stats are not currently in Zk.
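
  "Basic round robin" can be pictured like this. It is a generic sketch,
  not Solr's load balancer; `RoundRobin` and the replica names are
  hypothetical.

```python
import itertools


class RoundRobin:
    """Hand out replicas in a fixed rotating order, ignoring traffic stats."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(list(replicas))

    def next_replica(self):
        return next(self._cycle)


lb = RoundRobin(["replica1", "replica2", "replica3"])
picked = [lb.next_replica() for _ in range(5)]
```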
 
  3) If for any reason zookeeper cluster goes offline for sometime,
 does
  solr
  cloud will not be able to server any traffic ?
 
  It will stop allowing updates, but continue serving searches.
 
  - Mark
 
 
 
  Thanks
  Varun
 
 
 
 




Re: Solr cloud deployment on tomcat in prod

2013-02-28 Thread varun srivastava
Great .. I will do it and send you all for review.

Thanks
Varun

On Thu, Feb 28, 2013 at 4:50 AM, Erick Erickson erickerick...@gmail.comwrote:

 Anyone can edit the Wiki, contributions welcome!

 Best
 Erick


 On Mon, Feb 25, 2013 at 5:50 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi,
   Is there any official documentation around deployment of solr cloud in
  production on tomcat ?
 
  I am looking for anything as detailed as following one .. It will be good
  if someone can take the following tutorial and get it on official
 solrcloud
  wiki after reviewing each step.
 
 
 
 http://www.myjeeva.com/2012/10/solrcloud-cluster-single-collection-deployment/
 
 
  Thanks
  Varun
 



Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
Any thoughts on this?

We have 10 virtual data centres. It is set up like this because we do
rolling updates: while the first DC is being indexed, the other 9 serve
traffic. Indexing one DC takes 2 hours. With a single shard we used to index
one DC and then quickly replicate the index to the other DCs through a
master-slave setup. With SolrCloud we obviously can't index each DC
sequentially, as that would take 2*10 hours. So we need a way of indexing
one DC and then quickly propagating the index binaries to the others. What
would you recommend for SolrCloud?

Thanks
Varun


Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
"You can replicate from a SolrCloud node still. Just hit its replication
handler and pass in the master url to replicate to"

How will this work? Let's say s1dc1 is the master of s1dc2 and s2dc1 is the
master of s2dc2. After triggering replication the index binaries will get
copied, but how will the appropriate entries be made in ZooKeeper? ZooKeeper
needs to know which doc id range resides in which shard.


Thanks
Varun

On Thu, Feb 28, 2013 at 4:27 PM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 28, 2013, at 6:20 PM, varun srivastava varunmail...@gmail.com
 wrote:

  So we need way of indexing 1 dc
  and then somehow quickly propagate the index binary to others.

 You can replicate from a SolrCloud node still. Just hit it's replication
 handler and pass in the master url to replicate to. It doesn't have any
 guarantees in terms of data loss, eg it's not part of SolrCloud per say,
 but it's a fast way to move an index.
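
 For reference, "hitting the replication handler" usually means an HTTP
 request with `command=fetchindex` and a `masterUrl` parameter. The sketch
 below only builds such a URL and does not send it. The host names come from
 the question above; the port, the core name and the exact form of the
 master URL are assumptions and should be checked against the replication
 documentation for the Solr version in use.

```python
from urllib.parse import urlencode


def fetchindex_url(slave_core_url, master_replication_url):
    """Build the URL that asks a core to pull the index from a given master."""
    query = urlencode({
        "command": "fetchindex",
        "masterUrl": master_replication_url,  # form depends on the Solr version
    })
    return slave_core_url.rstrip("/") + "/replication?" + query


# Hypothetical port and core name; s1dc1 is the master of s1dc2.
url = fetchindex_url(
    "http://s1dc2:8080/solr/core0",
    "http://s1dc1:8080/solr/core0/replication",
)
```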

 - Mark




Re: org.apache.solr.cloud.ZkCLI timeout

2013-02-27 Thread varun srivastava
Hi Markus,
 Do you mean keeping the file in the solr-cores/lib directory, or inside
collection1 (if my SolrCloud collection is named collection1)?
If I keep it inside solr-cores/lib, will I get the file by calling
SolrResourceLoader.openConfig(...)?

 Mark,
  How can I tweak the ZooKeeper limits?

Thanks
Varun

On Wed, Feb 27, 2013 at 1:08 PM, Markus Jelsma
markus.jel...@openindex.iowrote:

 That's very big indeed, why not store it locally in the core's lib dir? It
 should work IIRC.

 -Original message-
  From:Mark Miller markrmil...@gmail.com
  Sent: Wed 27-Feb-2013 22:07
  To: solr-user@lucene.apache.org
  Subject: Re: org.apache.solr.cloud.ZkCLI timeout
 
  Did you adjust ZooKeeper so that it will accept files greater than 1MB
 per node?
 
  That's more config files than I've ever tried to deal with...
 
  - Mark
 
  On Feb 27, 2013, at 4:02 PM, varun srivastava varunmail...@gmail.com
 wrote:
 
   Hi,
   I am using org.apache.solr.cloud.ZkCLI to push some a 56MB config into
   zookeeper for one of my solr component, but ZKCLI is always timing
 out. I
   can see that ZKCLI is not able to push any config file greater than
 2MB of
   size even though zookeeper server is on same machine. Can we somehow
   increase the timeout for ZKCLI or any other way of pushing large config
   files ?
  
   Thanks
   Varun
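
 The 1MB figure Mark mentions corresponds to ZooKeeper's `jute.maxbuffer`
 setting, whose default is just under 1 MB per znode and which, as far as I
 know, has to be raised consistently on servers and clients if it is changed
 at all. Before pushing a config directory it can help to see which files
 would exceed that default. The helper below is a hypothetical pre-flight
 check, not part of ZkCLI, and the file names in the demo are made up.

```python
import os
import tempfile

ZK_DEFAULT_MAX_BYTES = 0xFFFFF  # default jute.maxbuffer, just under 1 MB


def oversized_files(conf_dir, limit=ZK_DEFAULT_MAX_BYTES):
    """Return (relative path, size) for every file larger than the limit."""
    too_big = []
    for root, _dirs, files in os.walk(conf_dir):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size > limit:
                too_big.append((os.path.relpath(path, conf_dir), size))
    return sorted(too_big)


# Demo with a throwaway directory: one small file, one 2 MB file.
with tempfile.TemporaryDirectory() as conf_dir:
    with open(os.path.join(conf_dir, "schema.xml"), "wb") as f:
        f.write(b"x" * 1024)
    with open(os.path.join(conf_dir, "big-dictionary.txt"), "wb") as f:
        f.write(b"x" * (2 * 1024 * 1024))
    report = oversized_files(conf_dir)
```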
 
 



Re: org.apache.solr.cloud.ZkCLI timeout

2013-02-27 Thread varun srivastava
solr-cores is my solr/home





Re: zk Config URL?

2013-02-26 Thread varun srivastava
I agree with Darren here: setting up SolrCloud is far too complicated, all
the more so if you are using Tomcat. Is there a ticket to simplify SolrCloud
installation? I would love to contribute my suggestions to it.

Thanks
Varun

On Mon, Feb 25, 2013 at 7:24 PM, darren dar...@ontrenet.com wrote:

 Ok. But its way too complicated than it should be. It should work smarter.


 Sent from my Verizon Wireless 4G LTE Smartphone

  Original message 
 From: Anirudha Jadhav aniru...@nyu.edu
 Date:
 To: solr-user@lucene.apache.org
 Subject: Re: zk Config URL?

 Solr cloud reads solr cfg files from zookeeper.

 You need to push the cfg to ZooKeeper and link the collection to the cfg.
 This is exactly what mark suggested earlier in the thread. This is also
 explained in solr cloud wiki.

 On Monday, February 25, 2013, Darren Govoni wrote:

  Hi Mark,
 
 I download latest zk, and run it.
 
 In my glassfish server, I set these system wide properties:
 
  numShards = 1
  zkHost = 10.x.x.x:2181
  jetty.port = 8080 (port of my domain)
  bootstrap_config = true
 
  I copy all the solr 4.1 dist/*.jar into my glassfish domain lib/ext
  directory. Then I deploy solr 4.1 war.
  It throws this exception always.
 
  [#|2013-02-25T13:31:32.304+|INFO|glassfish3.1.2|javax.enterprise.system.container.web.com.sun.enterprise.web|_ThreadID=10;_ThreadName=Thread-2;|WEB0171: Created virtual server [__asadmin]|#]

  [#|2013-02-25T13:31:32.768+|INFO|glassfish3.1.2|javax.enterprise.system.container.web.com.sun.enterprise.web|_ThreadID=10;_ThreadName=Thread-2;|WEB0172: Virtual server [server] loaded default web module []|#]

  [#|2013-02-25T13:31:34.222+|WARNING|glassfish3.1.2|javax.enterprise.system.tools.deployment.org.glassfish.deployment.common|_ThreadID=10;_ThreadName=Thread-2;|DPL8007: Unsupported deployment descriptors element schemaLocation value http://www.bea.com/ns/weblogic/90 http://www.bea.com/ns/weblogic/90/weblogic-web-app.xsd|#]

  [#|2013-02-25T13:31:34.223+|SEVERE|glassfish3.1.2|javax.enterprise.system.tools.deployment.org.glassfish.deployment.common|_ThreadID=10;_ThreadName=Thread-2;|DPL8006: get/add descriptor failure : filter-dispatched-requests-enabled TO false|#]

  [#|2013-02-25T13:31:34.831+|SEVERE|glassfish3.1.2|javax.enterprise.system.container.web.com.sun.enterprise.web|_ThreadID=10;_ThreadName=Thread-2;|WebModule[/solr1]PWC1270: Exception starting filter SolrRequestFilter
  java.lang.NoClassDefFoundError: javax/servlet/Filter
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
  at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
  at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
  at sun.misc.Launcher$ExtClassLoader.findClass(Launcher.java:229)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:295)
  at com.sun.enterprise.v3.server.APIClassLoaderServiceImpl$APIClassLoader.loadClass(APIClassLoaderServiceImpl.java:206)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:295)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:295)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
  at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1456)
  at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1359)
  at org.apache.catalina.core.ApplicationFilterConfig.loadFilterClass(ApplicationFilterConfig.java:280)
  at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:250)
  at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:120)
  at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4685)
  at org.apache.catalina.core.StandardContext.start(StandardContext.java:5377)
  at com.sun.enterprise.web.WebModule.start(WebModule.java:498)
  at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:917)
  at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:901)
  at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:733)
  at 

Re: Role of zookeeper at runtime

2013-02-26 Thread varun srivastava
So does it mean that when a document is added, the cluster state is fetched
from ZooKeeper and the target shard is then decided based on the hash of the
doc id?

Assume we have 3 shards (with no replicas) and one of them goes down while
indexing. Will all the documents be routed to the remaining 2 shards, or
will only 2/3 of the documents be indexed? If the remaining 2 shards get all
the documents, will SolrCloud rebalance when the 3rd shard comes back
online?

Do we store the range of doc ids held by each shard, or any other
information about the actual docs, anywhere in ZooKeeper? We have 2
datacentres (dc1 and dc2) which need to be indexed with exactly the same
data, and we update the index only once a day. Both dc1 and dc2 have exactly
the same SolrCloud config and machines.

Can we populate dc2 by just copying all the index binaries from
solr-cores/core0/data of dc1 to the machines in dc2 (to avoid indexing the
same documents on dc2)? I guess the Solr replication API doesn't work in
SolrCloud, hence I am looking for a workaround.

Thanks
Varun

On Tue, Feb 26, 2013 at 3:34 PM, Mark Miller markrmil...@gmail.com wrote:

 ZooKeeper
 /
  /clusterstate.json - info about the layout and state of the cluster -
 collections, shards, urls, etc
  /collections - config to use for the collection, shard leader voting zk
 nodes
  /configs - sets of config files
  /live_nodes - ephemeral nodes, one per Solr node
  /overseer - work queue for update clusterstate.json, creating new
 collections, etc
  /overseer_elect - overseer voting zk nodes

 - Mark

 On Feb 26, 2013, at 6:18 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi Mark,
  One more question
 
  While doing solr doc update/add what information is required from
 zookeeper
  ? Can you tell what all information is stored in zookeeper other than the
  startup configs.
 
  Thanks
  Varun
 
  On Tue, Feb 26, 2013 at 3:09 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
 
  On Feb 26, 2013, at 5:25 PM, varun srivastava varunmail...@gmail.com
  wrote:
 
  Hi All,
  I have some questions regarding role of zookeeper in solrcloud runtime,
  while processing the queries .
 
  1) Is zookeeper cluster referred by solr shards for processing every
  request, or its only used to copy config on startup time ?
 
  No, it's not used per request. Solr talks to ZooKeeper on SolrCore
 startup
  - to get configs and set itself up. Then it only talks to ZooKeeper
 when a
  cluster state change happens - in that case, ZooKeeper pings Solr and
 Solr
  will get an update view of the cluster. That view is cached and used for
  requests. In a stable state, Solr is not talking to ZooKeeper other than
  the heartbeat they keep to know a node is up.
 
  2) How loadbalancing is done between replicas ? Is traffic stat shared
  through zookeeper ?
 
  Basic round robin. Traffic stats are not currently in Zk.
 
  3) If for any reason zookeeper cluster goes offline for sometime, does
  solr
  cloud will not be able to server any traffic ?
 
  It will stop allowing updates, but continue serving searches.
 
  - Mark
 
 
 
  Thanks
  Varun
 
 




Re: zk Config URL?

2013-02-26 Thread varun srivastava
Is there a page like the following for SolrCloud?
http://wiki.apache.org/solr/SolrTomcat

Can we set -zkHost and -zkTimeout in
tomcat/webapps/solr/META_INF/context.xml?


Thanks
Varun

On Tue, Feb 26, 2013 at 3:04 PM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 26, 2013, at 4:35 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Do we have any ticket to simplify the
  solr cloud installation ? I would love to include my suggestions in it.

 Please, throw some thoughts out on the list or start a new JIRA issue.

 - Mark




Re: zk Config URL?

2013-02-26 Thread varun srivastava
I don't like setting parameters as system properties, but I am happy if I
can set these fields inside solr.xml. So you mean the following config will
work?

 <cores adminPath="/admin/cores" defaultCoreName="core0"
 zkClientTimeout="2" hostPort="tomcat port" hostContext="solr"
 zkHost="zookeeper hosts">


Thanks
Varun

On Tue, Feb 26, 2013 at 4:09 PM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 26, 2013, at 7:01 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Is there a page like the following for SolrCloud?
  http://wiki.apache.org/solr/SolrTomcat
 

 Not that I know of. The main hitch with tomcat is that the hostPort in
 solr.xml is setup to be set by the jetty.port system property. So you
 either need to pass that property to Tomcat or rename it in solr.xml to
 something that makes sense in a Tomcat world. Otherwise, things are about
 the same as with Jetty.

  Can we set -zkHost and -zkTimeout in
  tomcat/webapps/solr/META-INF/context.xml?

 No, they are set in solr.xml - special syntax is used to allow them to be
 set as system properties though.

 - Mark
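
 As a rough illustration of the "special syntax" mentioned above: the stock
 solr.xml of this era uses placeholders of the form ${name:default}, for
 example hostPort="${jetty.port:}". The sketch below shows how such a
 placeholder could be resolved against a set of system properties. The
 resolve() helper is hypothetical and is not Solr's implementation; it only
 mimics the idea.

```python
import re

# Matches ${name} or ${name:default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value, props):
    """Replace ${name:default} placeholders using props (hypothetical helper)."""

    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)

    return _PLACEHOLDER.sub(repl, value)


# With -Djetty.port=8080 passed to the JVM, the property wins.
port = resolve("${jetty.port:}", {"jetty.port": "8080"})
# With no property set, the default after the colon is used.
timeout = resolve("${zkClientTimeout:15000}", {})
# A literal value contains no placeholder and is returned unchanged.
zk_host = resolve("zk1:2181,zk2:2181", {})
```

 Writing a literal value into solr.xml, as in the last call, is what avoids
 the need for a system property at all.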

 
 
  Thanks
  Varun
 
  On Tue, Feb 26, 2013 at 3:04 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
 
  On Feb 26, 2013, at 4:35 PM, varun srivastava varunmail...@gmail.com
  wrote:
 
  Do we have any ticket to simplify the
  solr cloud installation ? I would love to include my suggestions in it.
 
  Please, throw some thoughts out on the list or start a new JIRA issue.
 
  - Mark
 
 




Re: zk Config URL?

2013-02-26 Thread varun srivastava
Hi Mark,
 Specifying zkHost in solr.xml is not working. It seems only the system
property -DzkHost works. Can you confirm the param name is zkHost in
solr.xml?

Thanks
Varun

On Tue, Feb 26, 2013 at 4:24 PM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 26, 2013, at 7:15 PM, varun srivastava varunmail...@gmail.com
 wrote:

  I don't like setting parameters as system properties,

 They are nice for the example, and often if you are using shell scripts or
 something to manage your cluster when you are screwing around, but yeah,
 many people will be happy to just put the info in the xml file.

  but I am happy if I
  can set up these fields inside solr.xml. So you mean the following config
  will work?
 
  <cores adminPath="/admin/cores" defaultCoreName="core0"
  zkClientTimeout="2" hostPort="tomcat port" hostContext="solr"
  zkHost="zookeeper hosts">

 Yes.

 The only sys prop you would have to set is numShards, unless you removed
 the default collection and used the CoreAdmin or Collections API to create
 the first collection.

 - Mark




Re: zk Config URL?

2013-02-26 Thread varun srivastava
Hi Mark,
 How do I provide a plugin directory to a Solr collection? I have my plugins
in the solr_home/lib directory, but the collection creation command still fails
because it cannot find the plugin classes:
(/solr/admin/collections?action=CREATE&name=europe-collection&numShards=2&replicationFactor=1)

Thanks
Varun

On Tue, Feb 26, 2013 at 5:26 PM, varun srivastava varunmail...@gmail.com wrote:

 Hi Mark,
  Specifying zkHost in solr.xml is not working. It seems only the system
 property -DzkHost works. Can you confirm the param name is zkHost in
 solr.xml?

 Thanks
 Varun


 On Tue, Feb 26, 2013 at 4:24 PM, Mark Miller markrmil...@gmail.com wrote:


 On Feb 26, 2013, at 7:15 PM, varun srivastava varunmail...@gmail.com
 wrote:

  I don't like setting parameters as system properties,

 They are nice for the example, and often if you are using shell scripts
 or something to manage your cluster when you are screwing around, but yeah,
 many people will be happy to just put the info in the xml file.

  but I am happy if I
  can set up these fields inside solr.xml. So you mean the following config
  will work?
 
  <cores adminPath="/admin/cores" defaultCoreName="core0"
  zkClientTimeout="2" hostPort="tomcat port" hostContext="solr"
  zkHost="zookeeper hosts">

 Yes.

 The only sys prop you would have to set is numShards, unless you removed
 the default collection and used the CoreAdmin or Collections API to create
 the first collection.

 - Mark





Re: SloppyPhraseScorer behavior change

2013-01-11 Thread varun srivastava
Moreover just checked .. autoGeneratePhraseQueries=true is set for both
3.4 and 4.0 in my schema.

Thanks
Varun

On Fri, Jan 11, 2013 at 1:04 PM, varun srivastava varunmail...@gmail.com wrote:

 Hi Jack,
  Is this a new change in Solr 4.0? It seems the autoGeneratePhraseQueries
 option has been present since Solr 3.1. I just wanted to confirm this is the
 difference causing the change in behavior between 3.4 and 4.0.


 Thanks
 Varun


 On Mon, Dec 24, 2012 at 3:00 PM, Jack Krupansky
 j...@basetechnology.com wrote:

 Thanks. Sloppy phrase requires that the query terms be in a phrase, but
 you don't have any quotes in your query.

 Depending on your schema field type, you may be running into a change in
 how auto-generated phrase queries are handled. It used to be that
 apple0ipad would always be treated as the quoted phrase "apple 0 ipad", but
 now that is only true if your field type has autoGeneratePhraseQueries=true
 set. Now, if you don't have that option set, the term gets treated as
 (apple OR 0 OR ipad), which is a lot looser than the exact phrase.

 Look at the new example schema for the text_en_splitting field type as
 an example.


 -- Jack Krupansky
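
 The difference described above can be sketched with a toy query builder. This
 is not Solr's query parser; split_term() and build_query() are hypothetical
 helpers, and the letter/digit split only approximates what a word-delimiter
 style analyzer would do to apple0ipad.

```python
import re


def split_term(term):
    """Split a term on letter/digit boundaries (rough analyzer stand-in)."""
    return re.findall(r"[A-Za-z]+|\d+", term)


def build_query(term, auto_generate_phrase_queries):
    """Return a query string for one whitespace-delimited term (hypothetical)."""
    parts = split_term(term)
    if len(parts) == 1:
        return parts[0]
    if auto_generate_phrase_queries:
        # Sub-tokens must appear adjacent and in order.
        return '"' + " ".join(parts) + '"'
    # Any sub-token may match, in any position.
    return "(" + " OR ".join(parts) + ")"


strict = build_query("apple0ipad", auto_generate_phrase_queries=True)
loose = build_query("apple0ipad", auto_generate_phrase_queries=False)
```

 Here strict is the quoted phrase and loose is the OR group, which is why the
 second form matches far more documents regardless of term order.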

 -Original Message- From: varun srivastava
 Sent: Monday, December 24, 2012 5:49 PM
 To: solr-user@lucene.apache.org
 Subject: Re: SloppyPhraseScorer behavior change


 Hi Jack,
 My query was simply /solr/select?query=ipad apple apple0ipad
 and the doc contained "apple ipad".

 If you look at the patch attached to bug 3215, you will find the following
 comment. I want to confirm whether the behaviour I am observing is in sync
 with what the patch developer intended or it's just a regression bug. In
 Solr 3.4 phrase order is honored, whereas in Solr 4.0 phrase order is not
 honored, i.e. "apple ipad" and "ipad apple" are both treated the same.



 

 /**
 +   * Score a candidate doc for all slop-valid position-combinations (matches)
 +   * encountered while traversing/hopping the PhrasePositions.
 +   * <br> The score contribution of a match depends on the distance:
 +   * <br> - highest score for distance=0 (exact match).
 +   * <br> - score gets lower as distance gets higher.
 +   * <br>Example: for query "a b"~2, a document "x a b a y" can be scored twice:
 +   * once for "a b" (distance=0), and once for "b a" (distance=2).
 +   * <br>Possibly not all valid combinations are encountered, because for efficiency
 +   * we always propagate the least PhrasePosition. This allows to base on
 +   * PriorityQueue and move forward faster.
 +   * As result, for example, document "a b c b a"
 +   * would score differently for queries "a b c"~4 and "c b a"~4, although
 +   * they really are equivalent.
 +   * Similarly, for doc "a b c b a f g", query "c b"~2
 +   * would get same score as "g f"~2, although "c b"~2 could be matched twice.
 +   * We may want to fix this in the future (currently not, for performance reasons).
 +   */

 



 On Mon, Dec 24, 2012 at 1:21 PM, Jack Krupansky j...@basetechnology.com
 wrote:

  Could you post the full query URL, so we can see exactly what your query
 was? Or, post the output of debug=query, which will show us what Lucene
 query was generated.

 -- Jack Krupansky

 -Original Message- From: varun srivastava
 Sent: Monday, December 24, 2012 1:53 PM
 To: solr-user@lucene.apache.org
 Subject: SloppyPhraseScorer behavior change


 Hi,
  Due to the following bug fix
 https://issues.apache.org/jira/browse/LUCENE-3215
 I am observing a change in behavior of SloppyPhraseScorer. I just wanted to
 confirm my understanding with you all.

 If there is a document "a b c d e", then in Solr 3.4 only the query "a b"
 will match the document, but from Solr 3.5 onwards (the bug is fixed in 3.5),
 both queries "a b" and "b a" will match. Is that right?


 Thanks
 Varun






Re: SloppyPhraseScorer behavior change

2012-12-24 Thread varun srivastava
Hi Jack,
 My query was simply /solr/select?query=ipad apple apple0ipad
and the doc contained "apple ipad".

If you look at the patch attached to bug 3215, you will find the following
comment. I want to confirm whether the behaviour I am observing is in sync
with what the patch developer intended or it's just a regression bug. In
Solr 3.4 phrase order is honored, whereas in Solr 4.0 phrase order is not
honored, i.e. "apple ipad" and "ipad apple" are both treated the same.





 /**
+   * Score a candidate doc for all slop-valid position-combinations (matches)
+   * encountered while traversing/hopping the PhrasePositions.
+   * <br> The score contribution of a match depends on the distance:
+   * <br> - highest score for distance=0 (exact match).
+   * <br> - score gets lower as distance gets higher.
+   * <br>Example: for query "a b"~2, a document "x a b a y" can be scored twice:
+   * once for "a b" (distance=0), and once for "b a" (distance=2).
+   * <br>Possibly not all valid combinations are encountered, because for efficiency
+   * we always propagate the least PhrasePosition. This allows to base on
+   * PriorityQueue and move forward faster.
+   * As result, for example, document "a b c b a"
+   * would score differently for queries "a b c"~4 and "c b a"~4, although
+   * they really are equivalent.
+   * Similarly, for doc "a b c b a f g", query "c b"~2
+   * would get same score as "g f"~2, although "c b"~2 could be matched twice.
+   * We may want to fix this in the future (currently not, for performance reasons).
+   */
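
To illustrate the "score gets lower as distance gets higher" part of the
comment, here is a small sketch that sums a distance-based weight over the
matches found for one document. The 1 / (distance + 1) weighting is my
understanding of what Lucene's default similarity uses for sloppy phrases, so
treat it as an assumption; the function names are hypothetical, and the match
distances are supplied by hand rather than computed from positions.

```python
def sloppy_freq(distance):
    """Weight of a single match; exact matches (distance 0) weigh 1.0."""
    return 1.0 / (distance + 1)


def phrase_freq(match_distances, slop):
    """Sum the weights of all matches whose distance is within the slop."""
    return sum(sloppy_freq(d) for d in match_distances if d <= slop)


# Example from the comment: query "a b"~2 against document "x a b a y"
# matches once as "a b" (distance 0) and once as "b a" (distance 2).
with_slop = phrase_freq([0, 2], slop=2)
# With no slop, only the in-order exact match counts.
exact_only = phrase_freq([0, 2], slop=0)
```

With slop the reversed match adds a smaller contribution on top of the exact
one; without slop it is ignored entirely.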





On Mon, Dec 24, 2012 at 1:21 PM, Jack Krupansky j...@basetechnology.com wrote:

 Could you post the full query URL, so we can see exactly what your query
 was? Or, post the output of debug=query, which will show us what Lucene
 query was generated.

 -- Jack Krupansky

 -Original Message- From: varun srivastava
 Sent: Monday, December 24, 2012 1:53 PM
 To: solr-user@lucene.apache.org
 Subject: SloppyPhraseScorer behavior change


 Hi,
  Due to the following bug fix
 https://issues.apache.org/jira/browse/LUCENE-3215
 I am observing a change in behavior of SloppyPhraseScorer. I just wanted to
 confirm my understanding with you all.

 If there is a document "a b c d e", then in Solr 3.4 only the query "a b"
 will match the document, but from Solr 3.5 onwards (the bug is fixed in 3.5),
 both queries "a b" and "b a" will match. Is that right?


 Thanks
 Varun



Re: solr 4.0 missing SolrPluginUtils addOrReplaceResults

2012-10-24 Thread varun srivastava
Hi Solr-Users,
 Does anyone have a workaround for SolrPluginUtils.addOrReplaceResults in Solr
4.0? It should be easy to migrate the code from the 3.6 branch to
4.0 SolrPluginUtils. Is there any specific reason why this method was
dropped in 4.0?

Thanks
Varun

On Tue, Oct 23, 2012 at 11:14 AM, varun srivastava
varunmail...@gmail.com wrote:

 Hi,
  What is the replacement for SolrPluginUtils.addOrReplaceResults in solr
 4.0 ?

 Thanks
 Varun



Re: anyone has solrcloud perfromance numbers ?

2012-10-02 Thread varun srivastava
Thanks Otis

On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

 I don't have the URL handy, but guys at LinkedIn have a benchmark tool for
 Solr, ElasticSearch, and Sensei. Check the list archives for URL and my
 signature below for a tool that can show metrics for any of those systems,
 which you'll probably want to observe during testing.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com wrote:

  Hi,
   Does anyone have preliminary SolrCloud performance numbers? Or does
  someone have a performance comparison (throughput and latency) between
  Solr 3.6 and SolrCloud (a huge monolithic index vs. sharded)?
 
  Thanks
  Varun
 



Re: anyone has solrcloud perfromance numbers ?

2012-10-02 Thread varun srivastava
Otis, I am looking for performance benchmark numbers rather than performance
monitoring tools. SPM looks like a monitoring tool. Moreover, it compares
Solr with ElasticSearch etc., whereas I want a comparison between Solr 3.6 and
SolrCloud.

Thanks
Varun

On Tue, Oct 2, 2012 at 9:15 PM, varun srivastava varunmail...@gmail.com wrote:

 Thanks Otis


 On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

 I don't have the URL handy, but guys at LinkedIn have a benchmark tool for
 Solr, ElasticSearch, and Sensei. Check the list archives for URL and my
 signature below for a tool that can show metrics for any of those systems,
 which you'll probably want to observe during testing.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi,
   Does anyone have preliminary SolrCloud performance numbers? Or does
  someone have a performance comparison (throughput and latency) between
  Solr 3.6 and SolrCloud (a huge monolithic index vs. sharded)?
 
  Thanks
  Varun
 





Re: Zookeeper setup for solr cloud

2012-10-01 Thread varun srivastava
Hi,
  Rephrasing my question: let me know if anyone sees a problem with the
following SolrCloud deployment.

1) Have 200 SolrCloud nodes (serv1, serv2, .. serv200), with each machine
running both ZooKeeper and Solr.
2) The ZooKeeper config contains the list of all servers:

server.1=serv1:2888:3888
server.2=serv2:2888:3888

...
server.200=serv200:2888:3888


3) Each Solr config only points to the localhost ZooKeeper:

 -DzkHost=localhost:9983
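
One way to reason about a 200-member ensemble is the quorum arithmetic: a
ZooKeeper ensemble needs a strict majority of its members to acknowledge each
write and to stay available. The helpers below are a small sketch of that
arithmetic only; the function names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many members can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# A small dedicated ensemble versus one ZooKeeper per Solr box.
small = (quorum_size(3), tolerated_failures(3))
large = (quorum_size(200), tolerated_failures(200))
```

With 3 members every write waits on 2 acknowledgements; with 200 members it
waits on 101, which is the "slow to update" concern raised for this layout.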


Thanks
Varun



On Sun, Sep 30, 2012 at 4:51 PM, Lance Norskog goks...@gmail.com wrote:

 You can find Solr information with this:
 http://find.searchhub.org/?q=zookeeper+cluster

 http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrCloud


 - Original Message -
 | From: varun srivastava varunmail...@gmail.com
 | To: solr-user@lucene.apache.org
 | Sent: Saturday, September 29, 2012 9:38:16 PM
 | Subject: Zookeeper setup for solr cloud
 |
 | Hi,
 |  I would like to get a recommendation on ZooKeeper ensemble architecture. I
 | am thinking of the following options; please let me know if I am correct
 | about the pros and cons of each option. Also please feel free to add
 | differentiating points I am missing.
 |
 | 1) Have separate boxes for the ZooKeeper ensemble, with all the SolrCloud
 | instances accessing it at runtime.
 |   Pros: Small set of ZooKeeper instances to maintain. Syncing
 | between ZooKeeper boxes may be fast and reliable.
 |
 | 2) Let each Solr box also run a ZooKeeper instance, with each Solr instance
 | accessing the localhost ZooKeeper.
 |    Pros: Solr will not incur over-the-wire cost at runtime, hence should be
 | fast. More fault tolerant, as Solr is not going over the wire to access
 | ZooKeeper.
 |    Con: Lots of ZooKeeper instances, hence updates may be slow.
 |
 |
 | Thanks
 | Varun
 |



Re: Solr Caching - how to tune, how much to increase, and any tips on using Solr with JDK7 and G1 GC?

2012-09-29 Thread varun srivastava
Hi Erick,
 You mentioned that for 4.0 the memory pattern is much different than in 3.X.
Can you elaborate on whether it's worse or better? Does 4.0 tend to use more
memory for a similar index size compared to 3.X?

Thanks
Varun

On Sat, Sep 29, 2012 at 1:58 PM, Erick Erickson erickerick...@gmail.com wrote:

 Well, I haven't had experience with JDK7, so I'll skip that part...

 But about caches. First, as far as memory is concerned, be
 sure to read Uwe's blog about MMapDirectory here:
 http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

 As to the caches.

 Be a little careful here. Getting high hit rates on _all_ your caches
 is a waste.

 filterCache. This is the exception: you want as high a hit ratio as you can
 get for this one. It's where the results of all the fq= clauses go and is a
 major factor in speeding up QPS.

 queryResultCache. Hmmm, given the lack of updates to your index, this one
 may actually get more hits than I'd expect. But it's a very cheap cache
 memory-wise. Think of it as a map where the key is the query and the value is
 an array of queryResultWindowSize longs (document IDs). It's really intended
 mostly for paging. It's also often the case that the chances of the exact
 same query (except for start and rows) being issued are actually relatively
 small. As always, YMMV. I usually see hit rates on this cache < 10%.
 Evictions merely mean it's been around a long time; bumping the size of this
 cache probably won't affect the hit rate unless your app somehow submits just
 a few queries.


 documentCache. Again, this often doesn't have a great hit ratio. Its main
 use, as I understand it, is to keep various parts of a query component chain
 from having to re-access the disk. Each element in a query component is
 completely separate from the others, so if two or more components want
 values from the doc, having them cached is useful. The usual recommendation
 is (#docs returned to user) * (expected simultaneous queries), where
 # docs returned to user is really the rows value.
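
 The sizing rules of thumb above can be written down as a quick calculation.
 This is only a back-of-the-envelope sketch: the function names are made up,
 the 8 bytes per ID follows the "longs" wording above, and the estimate
 ignores the cache keys and per-entry object overhead.

```python
def document_cache_size(rows, simultaneous_queries):
    """Suggested documentCache entries: rows * expected concurrent queries."""
    return rows * simultaneous_queries


def query_result_cache_value_bytes(entries, window_size, bytes_per_id=8):
    """Rough size of the cached ID arrays only (keys and overhead excluded)."""
    return entries * window_size * bytes_per_id


# Example: rows=10 with about 50 queries in flight at once.
doc_cache_entries = document_cache_size(10, 50)
# Example: 512 cached queries with queryResultWindowSize=20.
qrc_bytes = query_result_cache_value_bytes(512, 20)
```

 Even with generous settings the queryResultCache values stay small, which
 matches the remark that it is a cheap cache memory-wise.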

 One of the consequences of having huge amounts of memory allocated to
 the JVM can be really long garbage collections. They happen less frequently
 but have more work to do when they happen.

 Oh, and when you start using 4.0, the memory patterns are much different...

 Finally, here's a great post on solr memory tuning, too bad the image links
 are broken...
 http://searchhub.org/dev/2011/03/27/garbage-collection-bootcamp-1-0/

 Best
 Erick

 On Sat, Sep 29, 2012 at 3:08 PM, Aaron Daubman daub...@gmail.com wrote:
  Greetings,
 
  I've recently moved to running some of our Solr (3.6.1) instances
  using JDK 7u7 with the G1 GC (playing with max pauses in the 20 to
  100ms range). By and large, it has been working well (or, perhaps I
  should say that without requiring much tuning it works much better in
  general than my haphazard attempts to tune CMS).
 
  I have two instances in particular, one with a heap size of 14G and
  one with a heap size of 60G. I'm attempting to squeeze out additional
  performance by increasing Solr's cache sizes (I am still seeing the
  hit ratio go up as I increase max size and decrease the number of
  evictions), and am guessing this is the cause of some recent
  situations where the 14G instance especially eventually (12-24 hrs
  later under 100s of queries per minute) makes it to 80%-90% of the
  heap and then spirals into major GC with long-pause territory.
 
  I am wondering:
  1) if anybody has experience tuning the G1 GC, especially for use with
  Solr (what are decent max-pause times to use?)
  2) how to better tune Solr's cache sizes - e.g. how to even tell the
  actual amount of memory used by each cache (not # entries as the stats
  show, but # bits)
  3) if there are any guidelines on when increasing a cache's size (even
  if it does continue to increase the hit ratio) runs into the law of
  diminishing returns or even starts to hurt - e.g. if the document
  cache has a current maxSize of 65536 and has seen 4409275 evictions,
  and currently has a hit ratio of 0.74, should the max be increased
  further? If so, how much ram needs to be added to the heap, and how
  much larger should its max size be made?
 
  I should mention that these solr instances are read-only (so cache is
  probably more valuable than in other scenarios - we only invalidate
  the searcher every 20-24hrs or so) and are also backed with indexes
  (6G and 70G for the 14G and 60G heap sizes) on IODrives, so I'm not as
  concerned about leaving RAM for linux to cache the index files (I'd
  much rather actually cache the post-transformed values).
 
  Thanks as always,
   Aaron



Zookeeper setup for solr cloud

2012-09-29 Thread varun srivastava
Hi,
 I would like to get a recommendation on ZooKeeper ensemble architecture. I
am thinking of the following options; please let me know if I am correct about
the pros and cons of each option. Also please feel free to add differentiating
points I am missing.

1) Have separate boxes for the ZooKeeper ensemble, with all the SolrCloud
instances accessing it at runtime.
  Pros: Small set of ZooKeeper instances to maintain. Syncing
between ZooKeeper boxes may be fast and reliable.

2) Let each Solr box also run a ZooKeeper instance, with each Solr instance
accessing the localhost ZooKeeper.
   Pros: Solr will not incur over-the-wire cost at runtime, hence should be
fast. More fault tolerant, as Solr is not going over the wire to access
ZooKeeper.
   Con: Lots of ZooKeeper instances, hence updates may be slow.


Thanks
Varun