[ 
https://issues.apache.org/jira/browse/SOLR-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11661:
--------------------------------
    Description: 
While testing SOLR-11458, [~ab] ran into an interesting failure which resulted 
in different document counts between leader and replica. The test is 
MoveReplicaHDFSTest on jira/solr-11458-2 branch.

The failure is rare but reproducible on beasting:
{code}
reproduce with: ant test  -Dtestcase=MoveReplicaHDFSTest -Dtests.method=testNormalFailedMove -Dtests.seed=161856CB543CD71C -Dtests.slow=true -Dtests.locale=ar-SA -Dtests.timezone=US/Michigan -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] FAILURE 14.2s | MoveReplicaHDFSTest.testNormalFailedMove <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: expected:<100> but was:<56>
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([161856CB543CD71C:31134983787E4905]:0)
   [junit4]    >        at org.apache.solr.cloud.MoveReplicaTest.testFailedMove(MoveReplicaTest.java:305)
   [junit4]    >        at org.apache.solr.cloud.MoveReplicaHDFSTest.testNormalFailedMove(MoveReplicaHDFSTest.java:69)
{code}

The root problem is that when the old replica is not live during deletion of a 
collection, the corresponding HDFS data of that replica is not removed. 
Therefore, when a new collection with the same name as the deleted collection 
is created, the new replicas reuse the old HDFS data. This leads to many 
problems in leader election and recovery.

  was:
While testing SOLR-11458, [~ab] ran into an interesting failure which resulted 
in different document counts between leader and replica. The test is 
MoveReplicaHDFSTest on jira/solr-11458-2 branch.

The failure is rare but reproducible on beasting:
{code}
reproduce with: ant test  -Dtestcase=MoveReplicaHDFSTest -Dtests.method=testNormalFailedMove -Dtests.seed=161856CB543CD71C -Dtests.slow=true -Dtests.locale=ar-SA -Dtests.timezone=US/Michigan -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] FAILURE 14.2s | MoveReplicaHDFSTest.testNormalFailedMove <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: expected:<100> but was:<56>
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([161856CB543CD71C:31134983787E4905]:0)
   [junit4]    >        at org.apache.solr.cloud.MoveReplicaTest.testFailedMove(MoveReplicaTest.java:305)
   [junit4]    >        at org.apache.solr.cloud.MoveReplicaHDFSTest.testNormalFailedMove(MoveReplicaHDFSTest.java:69)
{code}


> New HDFS collection reuses old HDFS data from deleted HDFS collection with 
> same name causes inconsistent view of documents
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11661
>                 URL: https://issues.apache.org/jira/browse/SOLR-11661
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Priority: Major
>             Fix For: master (8.0), 7.3
>
>         Attachments: 11458-2-MoveReplicaHDFSTest-log.txt, SOLR-11661.patch, 
> SOLR-11661.patch
>
>
> While testing SOLR-11458, [~ab] ran into an interesting failure which 
> resulted in different document counts between leader and replica. The test is 
> MoveReplicaHDFSTest on jira/solr-11458-2 branch.
> The failure is rare but reproducible on beasting:
> {code}
> reproduce with: ant test  -Dtestcase=MoveReplicaHDFSTest -Dtests.method=testNormalFailedMove -Dtests.seed=161856CB543CD71C -Dtests.slow=true -Dtests.locale=ar-SA -Dtests.timezone=US/Michigan -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
>    [junit4] FAILURE 14.2s | MoveReplicaHDFSTest.testNormalFailedMove <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: expected:<100> but was:<56>
>    [junit4]    >      at __randomizedtesting.SeedInfo.seed([161856CB543CD71C:31134983787E4905]:0)
>    [junit4]    >      at org.apache.solr.cloud.MoveReplicaTest.testFailedMove(MoveReplicaTest.java:305)
>    [junit4]    >      at org.apache.solr.cloud.MoveReplicaHDFSTest.testNormalFailedMove(MoveReplicaHDFSTest.java:69)
> {code}
> The root problem is that when the old replica is not live during deletion of 
> a collection, the corresponding HDFS data of that replica is not removed. 
> Therefore, when a new collection with the same name as the deleted collection 
> is created, the new replicas reuse the old HDFS data. This leads to many 
> problems in leader election and recovery.
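
To make the failure mode described in the issue easier to picture, here is a 
minimal toy model in Java. It is only a sketch under stated assumptions: 
StaleHdfsDataSketch, createReplica, index and deleteCollection are 
hypothetical names, not Solr APIs, a plain map stands in for shared HDFS 
storage, and the doc counts are made up. It does not reproduce the test 
failure above and says nothing about how the attached patch fixes it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these names are Solr APIs. */
final class StaleHdfsDataSketch {
    // Stands in for shared HDFS storage: data directory path -> stored doc count.
    private final Map<String, Integer> sharedStorage = new HashMap<>();

    /** Creates a replica and returns the doc count it starts with (0 unless old data exists). */
    int createReplica(String collection, String replica) {
        return sharedStorage.computeIfAbsent(collection + "/" + replica, path -> 0);
    }

    /** Adds docs to the replica's data directory. */
    void index(String collection, String replica, int docs) {
        sharedStorage.merge(collection + "/" + replica, docs, Integer::sum);
    }

    /** Deletes a collection; only live replicas clean up their own data directory. */
    void deleteCollection(String collection, Set<String> replicas, Set<String> liveReplicas) {
        for (String replica : replicas) {
            if (liveReplicas.contains(replica)) {
                sharedStorage.remove(collection + "/" + replica);
            }
            // A replica that is not live never runs its cleanup, so its directory survives.
        }
    }

    public static void main(String[] args) {
        StaleHdfsDataSketch cluster = new StaleHdfsDataSketch();
        cluster.createReplica("coll", "replica1");
        cluster.createReplica("coll", "replica2");
        cluster.index("coll", "replica1", 5);
        cluster.index("coll", "replica2", 5);

        // replica2 is not live while the collection is deleted.
        cluster.deleteCollection("coll", Set.of("replica1", "replica2"), Set.of("replica1"));

        // A new collection with the same name picks up replica2's leftover data.
        int first = cluster.createReplica("coll", "replica1");
        int second = cluster.createReplica("coll", "replica2");
        System.out.println("replica1 starts with " + first + " docs, replica2 with " + second);
    }
}
```

In this model the two replicas of the recreated collection start out with 
different document counts before anything is indexed, which is the kind of 
inconsistent view between leader and replica that the issue title describes.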



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
