[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272877#comment-14272877 ]
Alexander S. commented on SOLR-6875:
------------------------------------

Now we have 4 shards, each with 2 replicas (8 nodes in total), and the following picture:
{noformat}
Shard 1:
  Replica 1: 14 486 089
  Replica 2: 14 496 445
Shard 2:
  Replica 1: 14 496 609
  Replica 2: 14 496 609
Shard 3:
  Replica 1: 14 492 812
  Replica 2: 14 492 812
Shard 4:
  Replica 1: 14 488 755
  Replica 2: 14 488 755
{noformat}
How could this be? We didn't see anything like this before upgrading from 4.8.1 to 4.10.2. We also enabled checkIntegrityAtMerge; could that be the reason?

> No data integrity between replicas
> ----------------------------------
>
>                 Key: SOLR-6875
>                 URL: https://issues.apache.org/jira/browse/SOLR-6875
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.10.2
>         Environment: One replica is @ Linux solr1.devops.wegohealth.com 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> Another replica is @ Linux solr2.devops.wegohealth.com 3.16.0-23-generic #30-Ubuntu SMP Thu Oct 16 13:17:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> Solr is running with the following options:
> * -Xms12G
> * -Xmx16G
> * -XX:+UseConcMarkSweepGC
> * -XX:+UseLargePages
> * -XX:+CMSParallelRemarkEnabled
> * -XX:+ParallelRefProcEnabled
> * -XX:+UseLargePages
> * -XX:+AggressiveOpts
> * -XX:CMSInitiatingOccupancyFraction=75
>            Reporter: Alexander S.
>
> Setup: SolrCloud with 2 shards, each with 2 replicas, 4 nodes in total.
> Indexing is stopped; one replica of a shard (Solr1) shows 45 574 039 docs, and another (Solr1.1) shows 45 574 038 docs.
> Solr1 is the leader; these errors appeared in its logs:
> {code}
> ERROR - 2014-12-20 09:54:38.783; org.apache.solr.update.StreamingSolrServers$1; error
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:196)
>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>         at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
>         at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
>         at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
>         at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
>         at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
>         at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
>         at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
>         at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
>         at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
>         at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
>         at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
>         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
>         at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
>         at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
>         at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> WARN  - 2014-12-20 09:54:38.787; org.apache.solr.update.processor.DistributedUpdateProcessor; Error sending update
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:196)
>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>         at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
>         at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
>         at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
>         at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
>         at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
>         at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
>         at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
>         at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
>         at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
>         at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
>         at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
>         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
>         at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
>         at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
>         at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
>         at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> WARN  - 2014-12-20 09:54:38.813; org.apache.solr.cloud.ZkController; Leader is publishing core=crm-prod coreNodeName =10.128.209.232:8081_solr_crm-prod state=down on behalf of un-reachable replica http://10.128.209.232:8081/solr/crm-prod/; forcePublishState? false
> ERROR - 2014-12-20 09:54:38.818; org.apache.solr.update.processor.DistributedUpdateProcessor; Setting up to try to start recovery on replica http://10.128.209.232:8081/solr/crm-prod/ after: java.net.SocketException: Connection reset
> {code}
> On Solr1.1:
> {code}
> WARN  - 2014-12-20 09:54:38.854; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=crm-prod coreNodeName=10.128.209.232:8081_solr_crm-prod
> {code}
> Index optimization was running at that time.
> It was not a system crash: the server is up and was running smoothly with plenty of available resources (lots of CPU, free RAM, and a very fast SSD RAID). So whatever happened, Solr should recover properly, as MySQL does, for example.
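
The per-replica counts reported in the comment can be compared mechanically. The following is a minimal Python sketch, assuming the counts are copied by hand from the figures in the comment. The helper names `diverging_shards` and `count_url` are hypothetical and not part of Solr. `count_url` only builds a count-only query using Solr's `distrib=false` parameter, which keeps a query on the core that receives it; it does not send anything.

```python
from urllib.parse import urlencode

# Per-replica document counts, copied by hand from the comment.
COUNTS = {
    "Shard 1": [14_486_089, 14_496_445],
    "Shard 2": [14_496_609, 14_496_609],
    "Shard 3": [14_492_812, 14_492_812],
    "Shard 4": [14_488_755, 14_488_755],
}


def diverging_shards(counts):
    """Return {shard: max - min} for shards whose replicas disagree."""
    return {
        shard: max(replicas) - min(replicas)
        for shard, replicas in counts.items()
        if len(set(replicas)) > 1
    }


def count_url(core_url):
    """Build (but do not send) a count-only query limited to one core."""
    params = {"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    for shard, diff in diverging_shards(COUNTS).items():
        print(f"{shard}: replicas differ by {diff} docs")
    # Replica URL taken from the log excerpt in the issue description.
    print(count_url("http://10.128.209.232:8081/solr/crm-prod/"))
```

With the figures from the comment, only Shard 1 is flagged, with a difference of 10 356 documents between its two replicas.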