[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-13 Thread Patrick Mackinlay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064609#comment-13064609
 ] 

Patrick Mackinlay commented on CASSANDRA-2870:
--

In the default configuration of 0.7.6-2 (and other versions), LOCAL_QUORUM reads 
don't work. This is not a minor bug and should be fixed in the next release.
By default configuration I mean the tarball distributed on the Cassandra 
website.
The fact that it is not a regression just shows that this functionality was 
never properly tested.


 dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return 
 spurious UnavailableException
 -

 Key: CASSANDRA-2870
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2870
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.8, 0.8.2

 Attachments: 2870.txt


 When Read Repair is off, we want to avoid doing requests to more nodes than 
 necessary to satisfy the ConsistencyLevel.  ReadCallback does this here:
 {code}
 this.endpoints = repair || resolver instanceof RowRepairResolver
                ? endpoints
                : endpoints.subList(0, Math.min(endpoints.size(), blockfor)); // min so as to not throw exception until assureSufficient is called
 {code}
 You can see that it is assuming that the endpoints list is sorted in order 
 of preferred-ness for the read.
 Then the LOCAL_QUORUM code in DatacenterReadCallback checks to see if we have 
 enough nodes to do the read:
 {code}
 int localEndpoints = 0;
 for (InetAddress endpoint : endpoints)
 {
     if (localdc.equals(snitch.getDatacenter(endpoint)))
         localEndpoints++;
 }
 if (localEndpoints < blockfor)
     throw new UnavailableException();
 {code}
 So if repair is off (so we truncate our endpoints list) AND dynamic snitch 
 has decided that nodes in another DC are to be preferred over local ones, 
 we'll throw UE even if all the replicas are healthy.
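
The interaction described above can be reproduced outside Cassandra with a small, self-contained sketch. Everything here is a stand-in chosen for illustration: the class name, the string addresses, and the `DC_OF` map (which plays the role of `snitch.getDatacenter`) are assumptions, and only the truncation and counting logic follow the two quoted snippets.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LocalQuorumTruncationDemo
{
    // Hypothetical stand-in for snitch.getDatacenter(endpoint).
    static final Map<String, String> DC_OF = new HashMap<String, String>();
    static
    {
        DC_OF.put("10.0.1.1", "DC1");
        DC_OF.put("10.0.1.2", "DC1");
        DC_OF.put("10.0.1.3", "DC1");
        DC_OF.put("10.0.2.1", "DC2");
    }

    // Same truncation as the ReadCallback snippet when read repair is off.
    static List<String> truncate(List<String> endpoints, int blockfor)
    {
        return endpoints.subList(0, Math.min(endpoints.size(), blockfor));
    }

    // Same counting loop as the DatacenterReadCallback snippet.
    static int countLocal(List<String> endpoints, String localdc)
    {
        int localEndpoints = 0;
        for (String endpoint : endpoints)
        {
            if (localdc.equals(DC_OF.get(endpoint)))
                localEndpoints++;
        }
        return localEndpoints;
    }

    public static void main(String[] args)
    {
        int blockfor = 2; // assumed: LOCAL_QUORUM with 3 replicas in the local DC
        // Assumed ordering: the dynamic snitch has ranked a remote node first.
        List<String> sorted = Arrays.asList("10.0.2.1", "10.0.1.1", "10.0.1.2", "10.0.1.3");

        int before = countLocal(sorted, "DC1");
        int after = countLocal(truncate(sorted, blockfor), "DC1");

        System.out.println("local replicas before truncation: " + before);
        System.out.println("local replicas after truncation:  " + after);
        System.out.println("would throw UnavailableException: " + (after < blockfor));
    }
}
```

With this ordering, three healthy local replicas exist, but only one survives the truncation, so the `localEndpoints < blockfor` check fails.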

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064621#comment-13064621
 ] 

Jonathan Ellis commented on CASSANDRA-2870:
---

It will be fixed in 0.7.8; 0.7.7 entered the release process before this was 
reported.





[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-08 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061846#comment-13061846
 ] 

Sylvain Lebresne commented on CASSANDRA-2870:
-

+1





[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062053#comment-13062053
 ] 

Hudson commented on CASSANDRA-2870:
---

Integrated in Cassandra-0.7 #526 (See 
[https://builds.apache.org/job/Cassandra-0.7/526/])
fix possibility of spurious UnavailableException for LOCAL_QUORUM reads with 
dynamic snitch and read repair disabled
patch by jbellis; reviewed by slebresne for CASSANDRA-2870

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1144380
Files : 
* /cassandra/branches/cassandra-0.7/CHANGES.txt
* /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadCallback.java
* /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/DatacenterReadCallback.java
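
The commit message does not show the patch itself. As a rough illustration only, and not the committed change, one way to avoid the spurious failure is to pick replicas from the local datacenter before truncating to `blockfor`. The class, the `preferLocal` method, and the address-prefix datacenter lookup below are all hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LocalFirstTruncationSketch
{
    // Hypothetical datacenter lookup: addresses in 10.0.1.x belong to DC1.
    static String datacenterOf(String endpoint)
    {
        return endpoint.startsWith("10.0.1.") ? "DC1" : "DC2";
    }

    // Keep at most blockfor endpoints, taking only local ones and
    // preserving the snitch's relative ordering among them.
    static List<String> preferLocal(List<String> endpoints, String localdc, int blockfor)
    {
        List<String> preferred = new ArrayList<String>();
        for (String endpoint : endpoints)
        {
            if (preferred.size() == blockfor)
                break;
            if (localdc.equals(datacenterOf(endpoint)))
                preferred.add(endpoint);
        }
        return preferred;
    }

    public static void main(String[] args)
    {
        // Assumed ordering: a remote node is ranked first by the snitch.
        List<String> sorted = Arrays.asList("10.0.2.1", "10.0.1.1", "10.0.1.2", "10.0.1.3");
        List<String> chosen = preferLocal(sorted, "DC1", 2);
        System.out.println("endpoints chosen for the read: " + chosen);
    }
}
```

If fewer than `blockfor` local replicas are in the list, the result is shorter than `blockfor`, so a later sufficiency check would still report a genuine shortage.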






[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-08 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062256#comment-13062256
 ] 

Jeremy Hanna commented on CASSANDRA-2870:
-

This also appears to affect 0.7.6, and it occurs even when read repair is not 
off.  I didn't set read repair on my CFs (it defaults to 100%), and a simple 
rowcount Pig script using read consistency LOCAL_QUORUM fails with UE.  If 
that's the case, I would think the priority should be higher and the fix should 
go in 0.7.7.  Any thoughts?





[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062308#comment-13062308
 ] 

Jonathan Ellis commented on CASSANDRA-2870:
---

This has been present since LOCAL_QUORUM was introduced, so it's not a new 
regression.  And a reasonable workaround exists (disable dynamic snitch).  So 
no, I don't think we should hold up 0.7.7 for this.





[jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException

2011-07-08 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062324#comment-13062324
 ] 

Jeremy Hanna commented on CASSANDRA-2870:
-

Okay, it just seemed like a higher-priority issue with the scope expanded.  
We'll probably just disable dynamic snitch until the fix is in a release, then.
