[jira] [Commented] (SOLR-9915) PeerSync alreadyInSync check is not backwards compatible
[ https://issues.apache.org/jira/browse/SOLR-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795803#comment-15795803 ]

ASF subversion and git services commented on SOLR-9915:
-------------------------------------------------------

Commit 122fa6cf64a56dd5ab5aff84f7f5c9a1305bde4e in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=122fa6c ]

SOLR-9915: PeerSync alreadyInSync check is not backwards compatible and results in full replication during a rolling restart

> PeerSync alreadyInSync check is not backwards compatible
> --------------------------------------------------------
>
>                 Key: SOLR-9915
>                 URL: https://issues.apache.org/jira/browse/SOLR-9915
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: replication (java)
>    Affects Versions: 6.3
>            Reporter: Tim Owen
>            Assignee: Noble Paul
>            Priority: Blocker
>         Attachments: SOLR-9915.patch
>
>
> The fingerprint check added to PeerSync in SOLR-9446 works fine when all
> servers are running 6.3 but this means it's hard to do a rolling upgrade from
> e.g. 6.2.1 to 6.3 because the 6.3 server sends a request to a 6.2.1 server to
> get a fingerprint and then gets a NPE because the older server doesn't return
> the expected field in its response.
> This leads to the PeerSync completely failing, and results in a full index
> replication from scratch, copying all index files over the network. We
> noticed this happening when we tried to do a rolling upgrade on one of our
> 6.2.1 clusters to 6.3. Unfortunately this amount of replication was hammering
> our disks and network, so we had to do a full shutdown, upgrade all to 6.3
> and restart, which was not ideal for a production cluster.
> The attached patch should behave more gracefully in this situation, as it
> will typically return false for alreadyInSync() and then carry on doing the
> normal re-sync based on versions.
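The failure mode described in the issue (a missing response field causing an NPE) and the graceful fallback (returning false from alreadyInSync()) can be illustrated with a minimal Java sketch. This is not the SOLR-9915 patch and does not use Solr's real classes: the class name FingerprintCheckSketch, the key name FINGERPRINT_KEY, and the plain Map standing in for a peer's response are all hypothetical, chosen only to show the null-safe pattern.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class FingerprintCheckSketch {

    // Hypothetical key under which a peer reports its index fingerprint.
    public static final String FINGERPRINT_KEY = "fingerprint";

    /**
     * Returns true only when the peer reported a fingerprint equal to ours.
     * A missing response or missing entry (for example, a peer that predates
     * the fingerprint check) yields false instead of throwing, so the caller
     * can carry on with the normal version-based re-sync.
     */
    public static boolean alreadyInSync(Map<String, Object> peerResponse, Object ourFingerprint) {
        if (peerResponse == null) {
            return false;
        }
        Object theirs = peerResponse.get(FINGERPRINT_KEY);
        if (theirs == null) {
            return false; // older peer: no fingerprint in its response
        }
        return theirs.equals(ourFingerprint);
    }

    public static void main(String[] args) {
        // Response shaped like an older peer's: versions only, no fingerprint.
        Map<String, Object> oldPeer = new HashMap<>();
        oldPeer.put("versions", Arrays.asList(3L, 2L, 1L));

        // Response shaped like a newer peer's: includes a fingerprint.
        Map<String, Object> newPeer = new HashMap<>();
        newPeer.put(FINGERPRINT_KEY, "abc");

        System.out.println(alreadyInSync(oldPeer, "abc")); // prints false
        System.out.println(alreadyInSync(newPeer, "abc")); // prints true
    }
}
```

The design choice here is to treat "no fingerprint" as "not known to be in sync" rather than as an error, which keeps a mixed-version cluster on the cheaper version-based sync path instead of escalating to full replication.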
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9915) PeerSync alreadyInSync check is not backwards compatible
[ https://issues.apache.org/jira/browse/SOLR-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795802#comment-15795802 ]

ASF subversion and git services commented on SOLR-9915:
-------------------------------------------------------

Commit 1b9564a5dccb2938586f2f82f963bd1534b002cd in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1b9564a ]

SOLR-9915: PeerSync alreadyInSync check is not backwards compatible and results in full replication during a rolling restart
[jira] [Commented] (SOLR-9915) PeerSync alreadyInSync check is not backwards compatible
[ https://issues.apache.org/jira/browse/SOLR-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795691#comment-15795691 ]

ASF subversion and git services commented on SOLR-9915:
-------------------------------------------------------

Commit 5b1f6b2ba48f8afc6c822c097d0500eb2ed66815 in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5b1f6b2 ]

SOLR-9915: PeerSync alreadyInSync check is not backwards compatible and results in full replication during a rolling restart