Re: Facet search on a docvalue field in a multi shard collection
Hi Erick

Thanks for your input. I have retrieved and built the branch http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_5

Doing the same setup as in my previous post (two-shard collection, fieldA of docValues type, index a single document and do a facet search on fieldA), I now get the below exception. The cause (which is not visible from the stack trace) is the same as before:

Cannot use facet.mincount=0 on field fieldA which is not indexed

What could be my next steps from here?

620710 [qtp1728933440-15] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Exception during facet.field: fieldA.
    at org.apache.solr.request.SimpleFacets$2.call(SimpleFacets.java:569)
    at org.apache.solr.request.SimpleFacets$2.call(SimpleFacets.java:554)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at org.apache.solr.request.SimpleFacets$1.execute(SimpleFacets.java:508)
    at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:579)
    at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:265)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:724)

On 22-09-2013 17:09, Erick Erickson wrote:

Right, I think you're running into a bug I remember going by. I can't find it now; JIRA seems to be not responding. As I remember, if a shard doesn't have a doc on it, you get an error. Although why facet.limit should figure in here is a mystery to me; maybe a coincidence? Significant work has been done around not requiring values for DocValues fields and such. Can you give it a try on 4.x or the soon-to-be-released 4.5?

Best,
Erick

On Sun, Sep 22, 2013 at 6:26 AM, Trym R. Møller t...@sigmat.dk wrote:

Hi

I have a problem doing facet search on a doc values field in a multi-shard collection. Any ideas what I may be doing wrong? My Solr schema specifies fieldA as a docValues type and I have created a two-shard collection using Solr 4.4.0. When I do a facet search on fieldA with a large facet.limit, the query fails with the below exception. A "large" facet.limit seems to be when (10 + (facet.limit * 1.5)) * number of shards exceeds the rows matching my query. The exception does not occur when I run with a single-shard collection. It can easily be reproduced by indexing a single row
Re: Facet search on a docvalue field in a multi shard collection
Hi

I have created https://issues.apache.org/jira/browse/SOLR-5260 as proposed by Erick. I hope anyone working with doc values can lead me in a direction of how to solve the bug.

Best regards
Trym

On 23-09-2013 16:01, Erick Erickson wrote:

I haven't dived into the code, but it sure looks like a JIRA to me. Can you open one?

Best,
Erick

On Mon, Sep 23, 2013 at 1:48 AM, Trym R. Møller t...@sigmat.dk wrote:

Hi Erick

Thanks for your input. I have retrieved and built the branch http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_5

Doing the same setup as in my previous post (two-shard collection, fieldA of docValues type, index a single document and do a facet search on fieldA), I now get the below exception. The cause (which is not visible from the stack trace) is the same as before:

Cannot use facet.mincount=0 on field fieldA which is not indexed

What could be my next steps from here?

[...]
Facet search on a docvalue field in a multi shard collection
Hi

I have a problem doing facet search on a doc values field in a multi-shard collection. Any ideas what I may be doing wrong?

My Solr schema specifies fieldA as a docValues type and I have created a two-shard collection using Solr 4.4.0. When I do a facet search on fieldA with a large facet.limit, the query fails with the below exception. A "large" facet.limit seems to be when (10 + (facet.limit * 1.5)) * number of shards exceeds the rows matching my query. The exception does not occur when I run with a single-shard collection. It can easily be reproduced by indexing a single row and querying it, as the default facet.limit is 100.

The facet query received by Solr looks as follows:

576793 [qtp170860084-18] INFO org.apache.solr.core.SolrCore - [trym_shard2_replica1] webapp=/solr path=/select params={facet=true&start=0&q=*:*&distrib=true&collection=trym&facet.field=fieldA&wt=javabin&version=2&rows=0} status=500 QTime=20

One of the internal queries sent by Solr to its shards looks like:

576783 [qtp170860084-19] INFO org.apache.solr.core.SolrCore - [trym_shard1_replica1] webapp=/solr path=/select params={facet=true&distrib=false&collection=trym&wt=javabin&version=2&rows=0&NOW=1379855011787&shard.url=192.168.56.1:8501/solr/trym_shard1_replica1/&df=text&fl=id,score&f.fieldA.facet.limit=160&start=0&q=*:*&facet.field=fieldA&isShard=true&fsv=true} hits=1 status=500 QTime=2

576784 [qtp170860084-17] ERROR org.apache.solr.servlet.SolrDispatchFilter - null:java.lang.IllegalStateException: Cannot use facet.mincount=0 on a field which is not indexed
    at org.apache.solr.request.NumericFacets.getCounts(NumericFacets.java:257)
    at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:423)
    at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:530)
    at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:259)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:724)

My schema.xml contains the following lines (among others :-)):

<dynamicField name="*A" type="dlong" indexed="false" stored="true"
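The (10 + facet.limit * 1.5) over-request described above matches the f.fieldA.facet.limit=160 in the internal shard request log. A minimal sketch of that arithmetic, assuming the constants quoted in the mail (the real logic lives in Solr's distributed faceting code and may differ between versions):

```java
// Sketch of the per-shard facet.limit over-request described above.
// Assumption: the constants 10 and 1.5 taken from the mail; this is not
// the authoritative Solr implementation.
public class ShardFacetLimit {
    static int perShardLimit(int facetLimit) {
        // each shard is asked for extra terms so that the merged top-N
        // across shards is likely to be correct
        return 10 + (int) (facetLimit * 1.5);
    }

    public static void main(String[] args) {
        // the default facet.limit=100 yields the f.fieldA.facet.limit=160
        // seen in the internal shard request above
        System.out.println(perShardLimit(100));
    }
}
```

With the default facet.limit of 100 this gives 160 per shard, which is why even a one-document index trips the code path.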
Re: SolrZKClient changed interface
Can anyone verify that the JIRA has been created sensibly? Thanks in advance. https://issues.apache.org/jira/browse/SOLR-4066

Best regards
Trym

On 10-11-2012 00:54, Mark Miller wrote:

Please file a JIRA issue for this change. - Mark

On Nov 9, 2012, at 8:41 AM, Trym R. Møller t...@sigmat.dk wrote:

Hi

The constructor of SolrZkClient has changed, I assume to ensure cleanup of resources. The strategy is as follows:

connManager = new ConnectionManager(...);
try {
    ...
} catch (Throwable e) {
    connManager.close();
    throw new RuntimeException();
}
try {
    connManager.waitForConnected(clientConnectTimeout);
} catch (Throwable e) {
    connManager.close();
    throw new RuntimeException();
}

This results in a different exception (RuntimeException) thrown from the constructor than before (informative exceptions such as UnknownHostException and TimeoutException). Can this be changed so we keep the old, informative exceptions, e.g. as follows (requiring the constructor to declare them), or at least include them as the cause in the RuntimeException?

boolean closeBecauseOfException = true;
try {
    ...
    connManager.waitForConnected(clientConnectTimeout);
    closeBecauseOfException = false;
} finally {
    if (closeBecauseOfException) {
        connManager.close();
    }
}

Any comments appreciated.

Best regards
Trym

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_0/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java
SolrZKClient changed interface
Hi

The constructor of SolrZkClient has changed, I assume to ensure cleanup of resources. The strategy is as follows:

connManager = new ConnectionManager(...);
try {
    ...
} catch (Throwable e) {
    connManager.close();
    throw new RuntimeException();
}
try {
    connManager.waitForConnected(clientConnectTimeout);
} catch (Throwable e) {
    connManager.close();
    throw new RuntimeException();
}

This results in a different exception (RuntimeException) thrown from the constructor than before (informative exceptions such as UnknownHostException and TimeoutException). Can this be changed so we keep the old, informative exceptions, e.g. as follows (requiring the constructor to declare them), or at least include them as the cause in the RuntimeException?

boolean closeBecauseOfException = true;
try {
    ...
    connManager.waitForConnected(clientConnectTimeout);
    closeBecauseOfException = false;
} finally {
    if (closeBecauseOfException) {
        connManager.close();
    }
}

Any comments appreciated.

Best regards
Trym

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_0/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java
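The try/finally alternative proposed above can be sketched end-to-end with a stand-in connection manager. FakeConnectionManager is illustrative only (not the real SolrZkClient/ConnectionManager API); the point is that close() runs exactly once on failure while the caller still sees the informative checked exception:

```java
import java.util.concurrent.TimeoutException;

// Illustrative sketch of the close-on-failure pattern proposed above.
// FakeConnectionManager is a hypothetical stand-in, not the real
// SolrZkClient/ConnectionManager from Solr.
public class CloseOnFailureSketch {
    static class FakeConnectionManager {
        boolean closed = false;
        void waitForConnected(long timeoutMs) throws TimeoutException {
            // always fails, to exercise the cleanup path
            throw new TimeoutException("could not connect within " + timeoutMs + " ms");
        }
        void close() { closed = true; }
    }

    static void connect(FakeConnectionManager connManager, long timeoutMs)
            throws TimeoutException {
        boolean closeBecauseOfException = true;
        try {
            connManager.waitForConnected(timeoutMs);
            closeBecauseOfException = false;
        } finally {
            // clean up only when construction failed; the original
            // TimeoutException still propagates unchanged to the caller
            if (closeBecauseOfException) {
                connManager.close();
            }
        }
    }
}
```

This keeps the exception type intact without duplicating the close() call across multiple catch blocks.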
Tuning DirectUpdateHandler2.addDoc
Hi

I have been profiling SolrCloud when indexing into a sharded, non-replicated collection, because indexing slows down when the index files (*.fdt) grow to a couple of GB (the largest is about 3.5GB).

When profiling for a couple of minutes I see that most time is spent in the DirectUpdateHandler2.addDoc method (called about 8,000 times). Its time is spent in UpdateLog.lookupVersion, VersionInfo.getVersionFromIndex and SolrIndexSearcher.lookupId (called about 6,000 times), which in turn spends its time in AtomicReader.termDocsEnum, called about 530,000 times and taking about 770,000 ms.

Is it true that the reason AtomicReader.termDocsEnum is called 530,000/6,000 =~ 90 times per SolrIndexSearcher.lookupId call is that I have on average 90 term files? Can I do anything to lower this number of term files? I'm running several cores on my SolrCloud instance.

Is there any way I can lower the time spent in each AtomicReader.termDocsEnum call (this seems to be much faster when I don't have so many documents in my collection/shard)?

Thanks as always.

Best regards
Trym
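For reference, the per-lookup ratio quoted above is plain arithmetic on the profiling numbers in the mail, nothing Solr-specific:

```java
// Arithmetic on the profiling numbers quoted above: ~530,000
// AtomicReader.termDocsEnum calls across ~6,000 SolrIndexSearcher.lookupId
// calls suggests roughly 88-90 term dictionaries (one per segment)
// consulted per id lookup.
public class LookupAverage {
    static int callsPerLookup(int termDocsEnumCalls, int lookupIdCalls) {
        return termDocsEnumCalls / lookupIdCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsPerLookup(530_000, 6_000));
    }
}
```

If that reading is right, the number is driven by the segment count per core, which is what merge policy settings would influence.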
Understanding autoSoftCommit
Hi

On my Windows workstation I have tried to index a document into a SolrCloud instance with the following special configuration:

<autoCommit>
  <maxTime>1200000</maxTime>
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime>
</autoSoftCommit>
...
<updateLog>
  <str name="dir">${solr.data.dir:}</str>
</updateLog>

That is, commit every 20 minutes and soft commit every 10 minutes. Right after indexing I can find the document using /get (and not using /search), and after 10 minutes I can find it using /search as well. If I stop Solr using Ctrl+C or kill -9 (from my cygwin console) before the 10 minutes have passed and start Solr again, then I can find the document using both /get and /search.

Are there any scenarios where I will lose an indexed document before either commit or soft commit is triggered? And does the transaction log have anything to do with this...

Thanks in advance.

Best regards
Trym
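Since maxTime is specified in milliseconds, the two intervals described above ("commit every 20 minutes and soft commit every 10 minutes") translate as follows — simple unit arithmetic, nothing Solr-specific:

```java
// autoCommit/autoSoftCommit maxTime values are milliseconds, so the
// 20-minute hard commit and 10-minute soft commit described above become:
public class CommitIntervals {
    static long minutesToMaxTime(int minutes) {
        return minutes * 60L * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(minutesToMaxTime(20)); // hard commit interval in ms
        System.out.println(minutesToMaxTime(10)); // soft commit interval in ms
    }
}
```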
Re: ramBufferSizeMB
Hi

Thanks a lot for your answer, Erick! I changed the value of the autoSoftCommit property and it had the expected effect. It can be noted that this is per core, so I get four getReader calls per autoSoftCommit interval when my Solr contains four cores.

Is it correct that a segment file is ready for merging after a commit has been done (e.g. using the autoCommit property), so I will see merges of 100 and more documents (and the index writer continues writing into a new segment file)? It looks like the segments are being merged into 6MB files, then when there are enough of those into 60MB files, and these again into 3.5GB files.

Best regards
Trym

On 19-09-2012 14:49, Erick Erickson wrote:

I _think_ the getReader calls are being triggered by the autoSoftCommit being at one second. If so, this is probably OK. But bumping that up would nail whether that's the case...

About ramBufferSizeMB: this has nothing to do with the size of the segments! It's just how much memory is consumed before the RAM buffer is flushed to the _currently open_ segment. So until a hard commit happens, the currently open segment will continue to grow as successive RAM buffers are flushed.

bq: I expected that my Lucene index segment files would be a bit bigger than 1KB

Is this a typo? The 512 is specifying MB..

Best,
Erick

On Wed, Sep 19, 2012 at 6:01 AM, Trym R. Møller t...@sigmat.dk wrote:

Hi

Using SolrCloud I have added the following to solrconfig.xml (actually the node in ZooKeeper):

<ramBufferSizeMB>512</ramBufferSizeMB>

After that I expected that my Lucene index segment files would be a bit bigger than 1KB, as I'm indexing very small documents. Enabling the infoStream I see a lot of "flush at getReader" (one segment of the infoStream file pasted below).

1. Where can I look for why documents are flushed so frequently?
2. Does it have anything to do with getReader, and can I do anything so Solr doesn't need to get a new reader so often?

Any comments are most welcome.

Best regards
Trym

Furthermore I have specified:

<autoCommit>
  <maxTime>18</maxTime>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

IW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush at getReader
DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: pool-12-thread-1 startFullFlush
DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: anyChanges? numDocsInRam=7 deletes=false hasTickets:false pendingChangesInFullFlush: false
DWFC 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: addFlushableState DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_kc, aborting=false, numDocsInRAM=7, deleteQueue=DWDQ: [ generation: 1 ]]
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush postings as segment _kc numDocs=7
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has 0 deleted docs
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has no vectors; norms; no docValues; prox; freqs
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushedFiles=[_kc_Lucene40_0.frq, _kc.fnm, _kc_Lucene40_0.tim, _kc_nrm.cfs, _kc.fdx, _kc.fdt, _kc_Lucene40_0.prx, _kc_nrm.cfe, _kc_Lucene40_0.tip]
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed codec=Lucene40
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed: segment=_kc ramUsed=0,095 MB newFlushedSize(includes docstores)=0,003 MB docs/MB=2.283,058
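As a cross-check, the flush figures in the infoStream output above are internally consistent: the log is printed with a Danish locale, so docs/MB=2.283,058 means 2,283.058, and dividing the 7 flushed documents by that rate gives back the reported ~0,003 MB segment size:

```java
// Cross-checking the infoStream numbers quoted above: segment size in MB
// = docs / (docs per MB). Values are taken from the log lines; the log
// uses a Danish locale, so "2.283,058" reads as 2283.058.
public class FlushRatio {
    static double segmentSizeMb(int docs, double docsPerMb) {
        return docs / docsPerMb;
    }

    public static void main(String[] args) {
        // 7 docs at 2283.058 docs/MB -> roughly 0.003 MB, as logged
        System.out.println(segmentSizeMb(7, 2283.058));
    }
}
```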
ramBufferSizeMB
Hi

Using SolrCloud I have added the following to solrconfig.xml (actually the node in ZooKeeper):

<ramBufferSizeMB>512</ramBufferSizeMB>

After that I expected that my Lucene index segment files would be a bit bigger than 1KB, as I'm indexing very small documents. Enabling the infoStream I see a lot of "flush at getReader" (one segment of the infoStream file pasted below).

1. Where can I look for why documents are flushed so frequently?
2. Does it have anything to do with getReader, and can I do anything so Solr doesn't need to get a new reader so often?

Any comments are most welcome.

Best regards
Trym

Furthermore I have specified:

<autoCommit>
  <maxTime>18</maxTime>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

IW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush at getReader
DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: pool-12-thread-1 startFullFlush
DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: anyChanges? numDocsInRam=7 deletes=false hasTickets:false pendingChangesInFullFlush: false
DWFC 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: addFlushableState DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_kc, aborting=false, numDocsInRAM=7, deleteQueue=DWDQ: [ generation: 1 ]]
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush postings as segment _kc numDocs=7
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has 0 deleted docs
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has no vectors; norms; no docValues; prox; freqs
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushedFiles=[_kc_Lucene40_0.frq, _kc.fnm, _kc_Lucene40_0.tim, _kc_nrm.cfs, _kc.fdx, _kc.fdt, _kc_Lucene40_0.prx, _kc_nrm.cfe, _kc_Lucene40_0.tip]
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed codec=Lucene40
DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed: segment=_kc ramUsed=0,095 MB newFlushedSize(includes docstores)=0,003 MB docs/MB=2.283,058
Solr zk client stopping sending data
Hi

Running a SolrCloud cluster, after a while a Solr loses its connection to its ZooKeeper cluster, as seen in the ZooKeeper log below. The Solr reconnects to another ZooKeeper in the ZK cluster, and the only thing seen in the Solr log (running warning level) is that a newly programmatically created collection stops being recovered.

Any ideas about how I can get information about what causes the Solr ZK client to stop sending data?

Thanks in advance.

Best regards
Trym

ZK 1 log:
2012-08-01 00:16:12,444 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x138d6e86e790003, likely client has closed socket
2012-08-01 00:16:12,444 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /solr-ip:42379 which had sessionid 0x138d6e86e790003

ZK 2 log:
2012-08-01 00:16:13,464 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /solr-ip:42820
2012-08-01 00:16:15,111 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x138d6e86e790003 at /solr-ip:42820
2012-08-01 00:16:15,112 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established session 0x138d6e86e790003 with negotiated timeout 15000 for client /solr-ip:42820

Solr log:
Aug 1, 2012 12:16:17 AM org.apache.solr.cloud.RecoveryStrategy close
Re: shard connection timeout
Hi Jason

We are running with -XX:MaxGCPauseMillis=5000 as well, which might also help you.

Best regards
Trym

On 11-07-2012 07:55, Jason wrote:

Actually we got this error when the remote server is executing garbage collection, and that time is over about 1 minute. The Solr server is sometimes frozen during GC, and a connection refused error occurred. Our GC options are -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+AggressiveOpts. Waiting for a response is better than a connection refusal for us. Is there any solution?

--
View this message in context: http://lucene.472066.n3.nabble.com/shard-connection-timeout-tp3994301p3994320.html
Sent from the Solr - User mailing list archive at Nabble.com.
RecoveryStrategy overseer session expired
Hi

Running SolrCloud with a Solr losing its ZooKeeper connection while having a replica, I see the below log message repeatedly and the shard never recovers. The Solr has successfully reconnected to ZooKeeper, and ZooKeeper is running fine.

I know that the cause is the loss of the ZooKeeper connection and I will work on that, but I can guarantee that one of my ZooKeepers will go down at some point (e.g. taken down by a system admin), so I need the recovery to work. I can see the code has changed recently just in this area. Does anyone have a hint of what I may do to get more information about this?

Thanks in advance for any comments.

Best regards
Trym

SEVERE: Error while trying to recover.
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /overseer/queue/qn-
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:643)
    at org.apache.solr.cloud.DistributedQueue.offer(DistributedQueue.java:236)
    at org.apache.solr.cloud.ZkController.publish(ZkController.java:745)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:288)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:210)
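One knob worth checking in this situation is the ZooKeeper session timeout Solr negotiates (the ZK 2 log in the earlier thread shows a negotiated timeout of 15000 ms, the usual default). In 4.x-era setups it is configured via zkClientTimeout in solr.xml; a sketch, where the zkClientTimeout attribute name is real Solr configuration but the surrounding values and property fallbacks are illustrative assumptions, not taken from this cluster:

```xml
<!-- sketch of a 4.x-era solr.xml; a longer zkClientTimeout lets the
     session survive longer GC pauses or network hiccups before expiring -->
<solr persistent="true">
  <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}"
         zkClientTimeout="${zkClientTimeout:15000}">
  </cores>
</solr>
```

A longer timeout only delays expiry; it does not remove the need for recovery to cope with an expired session.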
LeaderElection bugfix
Hi In SolrCloud, when a Solr loses its ZooKeeper connection, e.g. because of a session timeout, the LeaderElector ZooKeeper Watchers handling its replica slices are notified with two events: a Disconnected and a SyncConnected event. Currently the org.apache.solr.cloud.LeaderElector#checkIfIamLeader code does two bad things when this happens: 1. On the Disconnected event it fails with a session timeout when talking to ZooKeeper (it has no ZooKeeper connection at this point). 2. On the SyncConnected event it adds a new watcher to the ZooKeeper leader-election leader node. As documented in the ZooKeeper programming guide, watchers are not removed when a ZooKeeper connection is lost (even though the watchers are notified twice), so for each ZooKeeper connection loss the number of watchers is doubled. Note that there are two watchers per replica slice (the overseer and collection/slice/election). A fix for this could be, in org.apache.solr.cloud.LeaderElector#checkIfIamLeader: ... new Watcher() { @Override public void process(WatchedEvent event) { log.debug(seq + " watcher received event: " + event); // Reconnect should not add new watchers, as the old watchers are still registered! if (EventType.None.equals(event.getType())) { log.debug("Skipping event: " + event); return; } The behaviour can be verified using the below test in org.apache.solr.cloud.LeaderElectionIntegrationTest. Can someone confirm this and add it to svn? Thanks in advance. Best regards Trym @Test public void testReplicaZookeeperConnectionLoss() throws Exception { // who is the leader? 
String leader = getLeader(); Set<Integer> shard1Ports = shardPorts.get("shard1"); int leaderPort = getLeaderPort(leader); assertTrue(shard1Ports.toString(), shard1Ports.contains(leaderPort)); // time out a replica a couple of times System.setProperty("zkClientTimeout", "500"); int replicaPort = 7001; if (leaderPort == 7001) { replicaPort = 7000; } assertNotSame(containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper(), containerMap.get(leaderPort).getZkController().getZkClient().getSolrZooKeeper()); containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000); Thread.sleep(10 * 1000); containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000); Thread.sleep(10 * 1000); containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000); Thread.sleep(10 * 1000); // kill the leader if (VERBOSE) System.out.println("Killing " + leaderPort); shard1Ports.remove(leaderPort); containerMap.get(leaderPort).shutdown(); // poll until the leader change is visible for (int j = 0; j < 90; j++) { String currentLeader = getLeader(); if (!leader.equals(currentLeader)) { break; } Thread.sleep(500); } leader = getLeader(); int newLeaderPort = getLeaderPort(leader); int retry = 0; while (leaderPort == newLeaderPort) { if (retry++ == 20) { break; } Thread.sleep(1000); } if (leaderPort == newLeaderPort) { fail("We didn't find a new leader! " + leaderPort + " was shutdown, but it's still showing as the leader"); } assertTrue("Could not find leader " + newLeaderPort + " in " + shard1Ports, shard1Ports.contains(newLeaderPort)); }
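To make the watcher-doubling argument above concrete, here is a minimal self-contained sketch (plain Java; the Watcher/EventType types below are hypothetical stand-ins, not the real ZooKeeper classes): guarding on the connection-state event keeps the watcher count constant across connection losses, whereas re-registering on every notification would double it each time (1 -> 2 -> 4 -> ...).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the guard proposed for LeaderElector#checkIfIamLeader: skip the
// EventType.None (connection-state) notifications so that reconnects do not
// re-register, and thereby double, the election watchers.
public class WatcherGuardSketch {
    enum EventType { None, NodeDeleted }

    interface Watcher { void process(EventType type); }

    static int survivingWatchers(int connectionLossEvents) {
        List<Watcher> watchers = new ArrayList<>();
        Watcher guarded = new Watcher() {
            @Override public void process(EventType type) {
                if (type == EventType.None) {
                    return; // reconnect: the old watcher is still registered
                }
                watchers.add(this); // a real node event: re-register the watch
            }
        };
        watchers.add(guarded);
        // Each connection loss notifies every registered watcher with None.
        for (int i = 0; i < connectionLossEvents; i++) {
            for (Watcher w : new ArrayList<>(watchers)) w.process(EventType.None);
        }
        return watchers.size();
    }

    public static void main(String[] args) {
        // Without the guard the count would double per loss; with it, it stays 1.
        System.out.println(survivingWatchers(2)); // prints 1
    }
}
```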
Re: LeaderElection bugfix
Hi Sami Thanks for your rapid reply. Regarding 1) This seems to be time-dependent, but it is seen both on my local Windows machine running the unit test and on a Linux server running Solr. Regarding 2) The test does not show that the number of watchers is increasing, but this can be observed either by dumping the memory from the JVM or by looking at the debug statements (if debug is enabled). I don't know how to make assert statements about the number of watchers in ZooKeeper, so the test is not that informative by itself; it rather confirms that the fix doesn't break anything. Best regards Trym On 27-06-2012 10:06, Sami Siren wrote: On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk wrote: Hi Hi, The behaviour of this can be verified using the below test in the org.apache.solr.cloud.LeaderElectionIntegrationTest Can you reproduce the failure in your test every time or just rarely? I added the test method to LeaderElectionIntegrationTest and ran it a few times but I can't get it to fail. -- Sami Siren
Re: LeaderElection bugfix
Hi Sami Regarding 2) A simple way to inspect the number of watchers is to add an error log statement to the process method of the watcher: public void process(WatchedEvent event) { log.error(seq + " watcher received event: " + event); and see that the number of log lines doubles for each call to containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000); Best regards Trym On 27-06-2012 10:14, Trym R. Møller wrote: Hi Sami Thanks for your rapid reply. Regarding 1) This seems to be time-dependent, but it is seen both on my local Windows machine running the unit test and on a Linux server running Solr. Regarding 2) The test does not show that the number of watchers is increasing, but this can be observed either by dumping the memory from the JVM or by looking at the debug statements (if debug is enabled). I don't know how to make assert statements about the number of watchers in ZooKeeper, so the test is not that informative by itself; it rather confirms that the fix doesn't break anything. Best regards Trym On 27-06-2012 10:06, Sami Siren wrote: On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk wrote: Hi Hi, The behaviour of this can be verified using the below test in the org.apache.solr.cloud.LeaderElectionIntegrationTest Can you reproduce the failure in your test every time or just rarely? I added the test method to LeaderElectionIntegrationTest and ran it a few times but I can't get it to fail. -- Sami Siren
LeaderElection
Hi While looking at the behaviour when Solr loses its ZooKeeper connection, I'm trying to reproduce how a replica slice becomes leader. I have made the below unit test in the LeaderElectionTest class, which fails. I don't know if this simulates how Solr uses the LeaderElection class, but please comment on the scenario. Thanks in advance. Best regards Trym @Test public void testMemoryElection() throws Exception { LeaderElector first = new LeaderElector(zkClient); ZkNodeProps props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP, "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "1"); ElectionContext firstContext = new ShardLeaderElectionContextBase(first, "slice1", "collection2", "dummynode1", props, zkStateReader); first.setup(firstContext); first.joinElection(firstContext); Thread.sleep(1000); assertEquals("original leader was not registered", "http://127.0.0.1/solr/1/", getLeaderUrl("collection2", "slice1")); SolrZkClient zkClient2 = new SolrZkClient(server.getZkAddress(), TIMEOUT); LeaderElector second = new LeaderElector(zkClient2); props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP, "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "2"); ElectionContext context = new ShardLeaderElectionContextBase(second, "slice1", "collection2", "dummynode1", props, zkStateReader); second.setup(context); second.joinElection(context); Thread.sleep(1000); assertEquals("original leader should have stayed leader", "http://127.0.0.1/solr/1/", getLeaderUrl(zkClient2, "collection2", "slice1")); server.expire(zkClient.getSolrZooKeeper().getSessionId()); assertEquals("new leader was not registered", "http://127.0.0.1/solr/2/", getLeaderUrl(zkClient2, "collection2", "slice1")); }
Search request on Solr Cloud
Hi I would like to execute the following query on Solr trunk (cloud): http://localhost:8983/solr/select?collection=myCollection&q=*%3A*&start=0&rows=10&wt=xml (concretely http://localhost:8983/solr/x/select?collection=edr_sms_2011_05&q=*%3A*&start=0&rows=10&wt=xml) but it fails with an HTTP 404 error. 1. Looking into SolrDispatchFilter#doFilter it seems like the query needs either a shard name, a collection name between the solr and /select elements in the path, or a defaultCoreName. Is this correct? 2. Looking into solr.xml it seems like I can specify the defaultCoreName in the cores tag. Is this correct? 3. I create my cores dynamically and information about these is stored in ZooKeeper. Is it possible to store the defaultCoreName in ZooKeeper as well, and where should I look for information about how to do this? Thanks for any comments on this. Best regards Trym
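For what it's worth, the first query with the '&' separators restored (they were stripped in the mail) can be rebuilt like this; host, port, and collection name are the ones from the post, so the actual curl call only works against a running Solr and is left commented out:

```shell
# Rebuild the query URL with explicit '&' separators (lost in the mail).
BASE="http://localhost:8983/solr/select"
PARAMS="collection=myCollection&q=*%3A*&start=0&rows=10&wt=xml"
URL="${BASE}?${PARAMS}"
echo "${URL}"
# curl -s "${URL}"   # requires a running Solr instance
```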
Programmatic create core
Hi On Solr trunk I am trying to create a new core using the following code: CoreAdminRequest.Create req = new CoreAdminRequest.Create(); req.setCoreName(shardName); req.setInstanceDir(instanceDir); req.setDataDir(dataDir); req.setCollection(collectionName); req.setShardId(shardId); req.setConfigName(configName); CommonsHttpSolrServer httpSolrServer = new CommonsHttpSolrServer("http://localhost:8983/solr/"); req.process(httpSolrServer); But it fails on the server with an exception (see the stacktrace below). It seems as if it cannot find the config name of the collection, but I thought I specified it with req.setConfigName(configName)? Looking into the code, it seems like the parameter name ZooKeeper is looking for is configName, while the parameter name solrj (CoreAdminRequest) sets is config, and I am uncertain whether it is converted and whether they represent the same thing. Furthermore, when looking into svn I see that a bug must have sneaked into /lucene/dev/trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java between Revision 1294466 and Revision 1296692. In the last condition: int retry = 1; for (; retry < 6; retry++) { ... } if (retry == 10) { Any comments are appreciated. 
Best regards Trym 06-05-2012 17:15:53 org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Error executing default implementation of CREATE at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:401) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:141) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: org.apache.solr.common.cloud.ZooKeeperException: 
Could not find configName for collection myCollectionName at org.apache.solr.cloud.ZkController.getConfName(ZkController.java:955) at org.apache.solr.cloud.ZkController.createCollectionZkNode(ZkController.java:873) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:690) at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:396) ... 21 more
Re: Programmatic create core
Thanks Mark, you are my hero! I had missed specifying the configuration of the collection, and the only way I found to specify it was as follows: CoreAdminRequest.Create req = new CoreAdminRequest.Create() { @Override public SolrParams getParams() { return ((ModifiableSolrParams) super.getParams()).set("collection.configName", myCollectionConfigurationName); } }; Are there other ways to specify it, or should CoreAdminRequest be enhanced with this property? Best regards Trym On 07-05-2012 17:33, Mark Miller wrote: Thanks - I'll fix that retry count issue right now. When you go to the admin UI and look at the zookeeper nodes, what is listed under config? I'll add the config names found to that error message. - Mark On Mon, May 7, 2012 at 2:12 AM, Trym R. Møller t...@sigmat.dk wrote: Hi On Solr trunk I am trying to create a new core using the following code: CoreAdminRequest.Create req = new CoreAdminRequest.Create(); req.setCoreName(shardName); req.setInstanceDir(instanceDir); req.setDataDir(dataDir); req.setCollection(collectionName); req.setShardId(shardId); req.setConfigName(configName); CommonsHttpSolrServer httpSolrServer = new CommonsHttpSolrServer("http://localhost:8983/solr/"); req.process(httpSolrServer); But it fails on the server with an exception (see the stacktrace below). It seems as if it cannot find the config name of the collection, but I thought I specified it with req.setConfigName(configName)? Looking into the code, it seems like the parameter name ZooKeeper is looking for is configName, while the parameter name solrj (CoreAdminRequest) sets is config, and I am uncertain whether it is converted and whether they represent the same thing. Furthermore, when looking into svn I see that a bug must have sneaked into /lucene/dev/trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java between Revision 1294466 and Revision 1296692. In the last condition: int retry = 1; for (; retry < 6; retry++) { ... 
} if (retry == 10) { Any comments are appreciated. Best regards Trym 06-05-2012 17:15:53 org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Error executing default implementation of CREATE ... Caused by: org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection myCollectionName ...
Understanding RecoveryStrategy
Hi Using Solr trunk with the replica feature, I see the below exception repeatedly in the Solr log. I have been looking into the code of RecoveryStrategy#commitOnLeader and read the code as follows: 1. it sends a commit request (with COMMIT_END_POINT=true) to the Solr instance containing the leader of the slice 2. it sends a plain commit request to the Solr instance containing the leader of the slice. The first results in a commit on the shards in the single leader Solr instance, and the second results in a commit on the shards in the single leader Solr plus on all other Solrs having slices or replicas belonging to the collection. I would expect that the first request is the relevant one (and enough to do a recovery of the specific replica). Am I reading the second request wrong, or is it a bug? The code I'm referring to is: UpdateRequest ureq = new UpdateRequest(); ureq.setParams(new ModifiableSolrParams()); ureq.getParams().set(DistributedUpdateProcessor.COMMIT_END_POINT, true); ureq.getParams().set(RecoveryStrategy.class.getName(), baseUrl); 1. ureq.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, true).process(server); 2. server.commit(); Thanks in advance for any input. Best regards Trym R. 
Møller Apr 21, 2012 10:14:11 AM org.apache.solr.common.SolrException log SEVERE: Error while trying to recover:org.apache.solr.client.solrj.SolrServerException: http://myIP:8983/solr/myShardId at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:493) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:264) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:103) at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:180) at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:156) at org.apache.solr.cloud.RecoveryStrategy.commitOnLeader(RecoveryStrategy.java:170) at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:120) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:341) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:206) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78) at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106) at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:440) ... 8 more
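The reading in 1./2. above can be modelled with a toy sketch (plain Java stand-ins, not the real SolrJ/Solr classes): a commit marked COMMIT_END_POINT=true stays local to the receiving instance, while a plain commit fans out to the whole collection, so issuing both does strictly more work than recovering one replica needs.

```java
// Toy model of the two commits read out of RecoveryStrategy#commitOnLeader
// in the post above. These are hypothetical stand-ins, not Solr classes:
// commitsTriggered() just counts how many cores would execute the commit.
public class CommitFanoutSketch {
    static int commitsTriggered(boolean commitEndPoint, int coresInCollection) {
        // COMMIT_END_POINT=true: the commit stays on the receiving instance;
        // otherwise it is forwarded to every core in the collection.
        return commitEndPoint ? 1 : coresInCollection;
    }

    public static void main(String[] args) {
        int cores = 56; // e.g. the 28 leaders + 28 replicas mentioned elsewhere in the thread
        int first = commitsTriggered(true, cores);   // request 1: leader-local commit
        int second = commitsTriggered(false, cores); // request 2: collection-wide commit
        // Request 1 already covers the leader the replica recovers from,
        // which is why the poster suspects request 2 is redundant.
        System.out.println(first + " + " + second + " commits for one recovery");
    }
}
```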
Re: Solr hanging
Hi Chris Hostetter Does that mean that the last two questions I have posted haven't reached the mailing list? Best regards Trym On 25-04-2012 19:58, Chris Hostetter wrote: : Subject: Solr hanging : References: 31fdac6b-c4d9-4383-865d-2faca0f09...@geekychris.com : can4yxvff-mqoawbyow2rsf_v4tc8vpgb+z8auv-z3zp94vv...@mail.gmail.com : In-Reply-To: : can4yxvff-mqoawbyow2rsf_v4tc8vpgb+z8auv-z3zp94vv...@mail.gmail.com https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is hidden in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. -Hoss
Recovery - too many updates received since start
Hi I experience that a Solr loses its connection to ZooKeeper and re-establishes it. After Solr has reconnected to ZooKeeper it begins to recover. The connection was gone for approximately 10 seconds, and meanwhile the leader slice has received some documents (maybe about 1000 documents). Solr fails to do a peer sync, with the log message: Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync WARNING: PeerSync: core=mycollection_slice21_shard1 url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start - startingUpdates no longer overlaps with our currentUpdates Looking into PeerSync and UpdateLog I can see that 100 updates is the maximum a shard is allowed to be behind. Is it correct that this is not configurable, and what are the reasons for choosing 100? I suspect that one must weigh the work needed to replicate the full index against the performance loss/resource usage of enlarging the UpdateLog? Any comments regarding this are greatly appreciated. Best regards Trym
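As a sanity check on the numbers above (a deliberate simplification of PeerSync's real overlap test; the 100-update window is the value read from the code in this post):

```java
// Simplified model of the PeerSync decision described above: a replica can
// only peer-sync if the updates it missed still fit inside the leader's
// update log window (100 on trunk at the time, per the post); otherwise a
// full index replication is required.
public class PeerSyncWindow {
    static boolean peerSyncPossible(int updatesMissed, int updateLogWindow) {
        return updatesMissed <= updateLogWindow;
    }

    public static void main(String[] args) {
        // ~1000 docs arrived during the 10 s outage, but the window is 100:
        System.out.println(peerSyncPossible(1000, 100)); // prints false -> full replication
    }
}
```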
Recover - Read timed out
Hi I experience that a Solr loses its connection to ZooKeeper and re-establishes it. After Solr has reconnected to ZooKeeper it begins to recover its replicas. The connection was gone for approximately 10 seconds, and meanwhile the leader slice has received some documents (maybe about 1000 documents). Solr fails to update using peer sync and afterwards fails to do a full replication, with the log message below. The Solr from which the documents are replicated doesn't log anything while the replication is in progress. The full replication continues to fail with the read timeout for about 10 hours and then Solr gives up. 1. How can I get more information about why the read timeout happens? 2. It seems like the Solr it replicates from leaks an HTTP connection (and a thread) each time, reaching about 18,000 threads in 8 hours. Any comments are welcome. Best regards Trym Apr 21, 2012 10:14:11 AM org.apache.solr.common.SolrException log SEVERE: Error while trying to recover:org.apache.solr.client.solrj.SolrServerException: http://solr-ip:8983/solr/mycollection_slice21_shard2 at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:493) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:264) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:103) at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:180) at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:156) at org.apache.solr.cloud.RecoveryStrategy.commitOnLeader(RecoveryStrategy.java:170) at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:120) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:341) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:206) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at 
java.net.SocketInputStream.read(SocketInputStream.java:129) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78) at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106) at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:440) ... 8 more
Re: Solr Hanging
Hi I have succeeded in reproducing the scenario with two Solr instances running. They cover a single collection with two slices and two replicas, i.e. two cores in each Solr instance. I have changed the number of threads that Jetty is allowed to use as follows: <New class="org.mortbay.thread.QueuedThreadPool"> <Set name="minThreads">3</Set> <Set name="maxThreads">3</Set> <Set name="lowThreads">0</Set> </New> And when indexing a single document this works fine, but when concurrently indexing 10 documents, Solr frequently hangs. I know that Jetty by default is allowed to use 10,000 threads, but in my other setup all these 10,000 allowed threads are used on a single Solr instance (I have 7 Solr instances) after some days, and the hanging scenario occurs. I'm not sure that just adjusting the allowed number of threads is the best solution and would like to get some input on what to expect and whether there are other things I can adjust. My setup is, as written before, 7 Solr instances handling a single collection with 28 leaders and 28 replicas distributed fairly over the Solrs (8 cores on each Solr). Thanks for any input. Best regards Trym On 19-04-2012 14:36, Yonik Seeley wrote: On Thu, Apr 19, 2012 at 4:25 AM, Trym R. Møller t...@sigmat.dk wrote: Hi I am using Solr trunk and have 7 Solr instances running with 28 leaders and 28 replicas for a single collection. 
After indexing a while (a couple of days) the solrs start hanging and doing a thread dump on the jvm I see blocked threads like the following: Thread 2369: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1987 (Compiled frame) - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399 (Compiled frame) - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean) @bci=27, line=350 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish() @bci=4, line=299 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.finish() @bci=1, line=817 (Compiled frame) ... - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582 (Interpreted frame) I read the stack trace as my indexing client has indexed a document and this Solr is now waiting for the replica? to respond before returning an answer to the client. Correct. What's the full stack trace like on both a leader and replica? We need to know what the replica is blocking on. What version of trunk are you using? -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
Solr Hanging
Hi I am using Solr trunk and have 7 Solr instances running with 28 leaders and 28 replicas for a single collection. After indexing for a while (a couple of days) the Solrs start hanging, and doing a thread dump on the JVM I see blocked threads like the following: Thread 2369: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1987 (Compiled frame) - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399 (Compiled frame) - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean) @bci=27, line=350 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish() @bci=4, line=299 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.finish() @bci=1, line=817 (Compiled frame) ... - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582 (Interpreted frame) I read the stack trace as: my indexing client has indexed a document and this Solr is now waiting for the replica(?) to respond before returning an answer to the client. The other Solrs have similar blocked threads. Any ideas of how I can get closer to the problem? Am I reading the stack trace correctly? Is there any further information that is relevant for commenting on this problem? Thanks for any comments. Best regards Trym
Re: Solr Hanging
Thanks for your answer. I am running an (older) revision of Solr from around 29-02-2012. I suspect that the thread I have included is on the leader of the shard? The Solr instance I have the dump from contains more than one leader, so I don't know which shard (slice) the thread is working on. How can I find the Solr instance containing the replica (I guess ZooKeeper can't help me)? And when I have found the Solr instance containing the replica, how do I know which thread is handling the update request (all my Solr instances contain 8 cores)? If this is not possible, I might be able to restart with a setup where each Solr instance only contains a single core (a leader or a replica). Best regards Trym On 19-04-2012 14:36, Yonik Seeley wrote: On Thu, Apr 19, 2012 at 4:25 AM, Trym R. Møller t...@sigmat.dk wrote: Hi I am using Solr trunk and have 7 Solr instances running with 28 leaders and 28 replicas for a single collection. After indexing a while (a couple of days) the solrs start hanging and doing a thread dump on the jvm I see blocked threads like the following: Thread 2369: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1987 (Compiled frame) - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399 (Compiled frame) - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean) @bci=27, line=350 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish() @bci=4, line=299 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.finish() @bci=1, line=817 (Compiled frame) 
... - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582 (Interpreted frame) I read the stack trace as my indexing client has indexed a document and this Solr is now waiting for the replica? to respond before returning an answer to the client. Correct. What's the full stack trace like on both a leader and replica? We need to know what the replica is blocking on. What version of trunk are you using? -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
Solr hanging
Hi I am using Solr trunk and have 7 Solr instances running with 28 leaders and 28 replicas for a single collection. After indexing for a while (a couple of days) the Solrs start hanging, and doing a thread dump on the JVM I see blocked threads like the following: Thread 2369: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1987 (Compiled frame) - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399 (Compiled frame) - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean) @bci=27, line=350 (Compiled frame) - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish() @bci=4, line=299 (Compiled frame) - org.apache.solr.update.processor.DistributedUpdateProcessor.finish() @bci=1, line=817 (Compiled frame) ... - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582 (Interpreted frame) I read the stack trace as: my indexing client has indexed a document and this Solr is now waiting for the replica(?) to respond before returning an answer to the client. The other Solrs have similar blocked threads. Any ideas of how I can get closer to the problem? Am I reading the stack trace correctly? Is there any further information that is relevant for commenting on this problem? Thanks for any comments. Best regards Trym