[jira] [Commented] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds

2011-06-29 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057016#comment-13057016
 ] 

Yuki Morishita commented on CASSANDRA-2840:
---

Duplicate of CASSANDRA-2599.

 cassandra-cli describe keyspace shows confusing memtable thresholds
 ---

 Key: CASSANDRA-2840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2840
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.5
 Environment: linux rackspace instance 4cpu/4G
Reporter: Lanny Ripple
Priority: Minor

 The 'describe keyspace' output seems to be mixing up the labeling for minutes 
 and MB for Memtable thresholds output.
 Example from our ring:
Memtable thresholds: 0.2859375/61/1440 (millions of ops/minutes/MB)
 We use minutes=1440 and MB=61.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 6:31 AM:
---

This does happen already (I've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
job may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...
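A minimal sketch of what such a SplitEndpointIterator might look like. This is an assumption, not the committed code: the constructor signature and the endpoint-ordering logic are invented here purely to illustrate the "localhost first, then the other replicas" preference being discussed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: iterate a split's replica endpoints, yielding the
// local node first (if it is a replica) and the remaining replicas after it.
public class SplitEndpointIterator implements Iterator<String>
{
    private final Iterator<String> ordered;

    public SplitEndpointIterator(List<String> endpoints, String localhost)
    {
        List<String> sorted = new ArrayList<String>();
        if (endpoints.contains(localhost))
            sorted.add(localhost);            // the localhost preference from CFRR
        for (String e : endpoints)
            if (!e.equals(localhost))
                sorted.add(e);                // fall back to the other replicas
        this.ordered = sorted.iterator();
    }

    public boolean hasNext() { return ordered.hasNext(); }
    public String next()     { return ordered.next(); }
    public void remove()     { throw new UnsupportedOperationException(); }
}
```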

  was (Author: michaelsembwever):
This does happen already (i've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than  a fallback to another 
TT. For example a c* node may die in the middle of a TT...
  
 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.





[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 6:32 AM:
---

This does happen already (I've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
job may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

  was (Author: michaelsembwever):
This does happen already (i've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
job may just as likely got to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...
  




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 7:19 AM:
---

This does happen already (I've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
task may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

  was (Author: michaelsembwever):
This does happen already (i've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
job may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...
  




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 7:27 AM:
---

This does happen already (I've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
task may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

A bug I can see in the already-accepted patch: in CassandraServer.java:763, 
when endpointValid is false and restrictToSameDC is true, we end up restricting 
to a random DC. I can fix this so that restrictToSameDC is disabled in such 
situations.
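A minimal sketch of that guard. The class and method names here are invented for illustration, not the actual CassandraServer code: the idea is simply that the same-DC restriction is meaningless when the requested endpoint is not a valid replica, so it should be dropped.

```java
// Illustrative sketch: when the requested endpoint is not a valid replica we
// fall back to an arbitrary replica, and "same DC" would then mean the DC of
// that random node - so disable the restriction in that case.
public class EndpointSelection
{
    public static boolean effectiveRestrictToSameDC(boolean endpointValid, boolean restrictToSameDC)
    {
        // only honour the same-DC restriction when the endpoint itself is valid
        return endpointValid && restrictToSameDC;
    }
}
```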

  was (Author: michaelsembwever):
This does happen already (i've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
task may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...
  




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 7:49 AM:
---

 - This does happen already (I've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

 - There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity. 

 - One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

 - A bug I can see in the already-accepted patch: in CassandraServer.java:763, 
when endpointValid is false and restrictToSameDC is true, we end up restricting 
to a random DC. I could fix this so that restrictToSameDC is disabled in such 
situations, but this actually invalidates the previous point: we can't restrict 
to the DC anymore and we can only sortByProximity to a random node... I think 
this supports Jonathan's point that it's overall a poor approach. I'm leaning 
more and more toward my original approach of using just 
client.getDatacenter(..) and not worrying about proximity within the datacenter.

 - Another bug: contrary to my patch, the committed code
bq. committed with a change to use the dynamic snitch if the passed endpoint is 
valid.
 can call {{DynamicEndpointSnitch.sortByProximity(..)}} with an address that is 
not localhost, and this breaks the assertion in that method. 

  was (Author: michaelsembwever):
 - This does happen already (i've seen it while testing initial patches 
that were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

 - There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity. 

 - One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

 - A bug i can see in the patch that did get accepted already is in 
CassandraServer.java:763 when endpointValid is false and restrictToSameDC is 
true we end up restricting to a random DC. I can fix this so restrictToSameDC 
is disabled in such situations. This actually invalidates the previous point: 
we can't restrict to DC anymore and we can only sortByProximity to a random 
node... I think this supports Jonathan's point that it's overall a poor 
approach. I'm more and more in preference of my original approach using just 
client.getDatacenter(..) and not worrying about proximity within the datacenter.

 - Another bug is that, contray to my patch, the code committed
bq. committed with a change to use the dynamic snitch id the passed endpoint is 
valid.
 can call {{DynamicEndpointSnitch.sortByProximity(..)}} with an address that is 
not localhost and this breaks the assertion in the method. 
  

[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 7:48 AM:
---

 - This does happen already (I've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

 - There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity. 

 - One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

 - A bug I can see in the already-accepted patch: in CassandraServer.java:763, 
when endpointValid is false and restrictToSameDC is true, we end up restricting 
to a random DC. I can fix this so that restrictToSameDC is disabled in such 
situations. This actually invalidates the previous point: we can't restrict to 
the DC anymore and we can only sortByProximity to a random node... I think this 
supports Jonathan's point that it's overall a poor approach. I'm leaning more 
and more toward my original approach of using just client.getDatacenter(..) 
and not worrying about proximity within the datacenter.

 - Another bug: contrary to my patch, the committed code
bq. committed with a change to use the dynamic snitch if the passed endpoint is 
valid.
 can call {{DynamicEndpointSnitch.sortByProximity(..)}} with an address that is 
not localhost, and this breaks the assertion in that method. 

  was (Author: michaelsembwever):
This does happen already (i've seen it while testing initial patches that 
were no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
I bet too that a fallback to a replica is faster than a fallback to another TT.

On a side note, there is no guarantee that any given TT will have its split 
accessible via a local c* node - this is only a preference in CFRR. A failed 
task may just as likely go to a random c* node. At least now we can actually 
properly limit to the one DC and sort by proximity. 

One thing we're not doing here is applying this same DC limit and sort by 
proximity in the case when there isn't a localhost preference. See 
CFRR.initialize(..)
It would make sense to rewrite CFRR.getLocations(..) to
{noformat}
private Iterator<String> getLocations(final Configuration conf) throws IOException
{
    return new SplitEndpointIterator(conf);
}
{noformat}
and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator...

A bug i can see in the patch that did get accepted already is in 
CassandraServer.java:763 when endpointValid is false and restrictToSameDC is 
true we end up restricting to a random DC. I can fix this so restrictToSameDC 
is disabled in such situations.
  




[jira] [Commented] (CASSANDRA-2475) Prepared statements

2011-06-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057082#comment-13057082
 ] 

Michal Augustýn commented on CASSANDRA-2475:


It would be great to have an overload like this, to eliminate one 
client-server roundtrip:
{noformat}
CqlResult execute_cql_query(1:binary query, 2:list<binary> parameters, 3:Compression compression);
{noformat}
In many applications there are only a few queries (hundreds at most?), so I 
think the _handle_ could be cached server-side (we could limit the cache size 
via configuration).
And do you/we plan to support named parameters?
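A bounded server-side cache of prepared statements, as suggested, could be sketched like this. The class name, the use of the query string as the handle, and the eviction policy are all assumptions for illustration, not an actual Cassandra API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: map a statement handle (the query string here) to its
// prepared form, evicting the least-recently-used entry past a fixed size,
// so the server-side cache stays bounded as the configuration suggestion asks.
public class PreparedStatementCache<V> extends LinkedHashMap<String, V>
{
    private final int maxEntries;

    public PreparedStatementCache(int maxEntries)
    {
        super(16, 0.75f, true);  // access-order gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest)
    {
        // evict the least-recently-used entry once the limit is exceeded
        return size() > maxEntries;
    }
}
```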

 Prepared statements
 ---

 Key: CASSANDRA-2475
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0








[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057189#comment-13057189
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

What if the coordinator happens to be one of the replicas for that key? Having 
the coordinator store the hint would mean it wasn't replicated at the 
replication_factor. The same is true for a coordinator that's not a replica for 
a key, but has to store a hint for multiple nodes (i.e. when multiple replicas 
are down).

I don't like this; I was under the impression that HintedHandoff helps to 
retain the replication factor even in the face of failed replicas.

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take the byte[] and send it to the 
 destination.
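The description's point - store the serialized mutation so replay needs no local read - can be sketched as follows. All names here are invented for illustration; this is not the actual HintedHandOffManager code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: instead of storing only a row key and re-reading the
// row at delivery time, store the serialized mutation bytes per target node,
// so delivery is a blind forward with no local lookup.
public class HintStore
{
    private final Map<String, List<byte[]>> hintsByTarget = new HashMap<String, List<byte[]>>();

    public void storeHint(String targetNode, byte[] serializedMutation)
    {
        List<byte[]> hints = hintsByTarget.get(targetNode);
        if (hints == null)
        {
            hints = new ArrayList<byte[]>();
            hintsByTarget.put(targetNode, hints);
        }
        hints.add(serializedMutation);  // bytes are ready to send as-is later
    }

    public List<byte[]> drainHints(String targetNode)
    {
        // hand back everything queued for the node that just came back
        List<byte[]> hints = hintsByTarget.remove(targetNode);
        return hints == null ? new ArrayList<byte[]>() : hints;
    }
}
```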





[jira] [Commented] (CASSANDRA-2383) log4j unable to load properties file from classpath

2011-06-29 Thread David Allsopp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057223#comment-13057223
 ] 

David Allsopp commented on CASSANDRA-2383:
--

Yes, that'll teach me to post code late at night :-(

 log4j unable to load properties file from classpath
 ---

 Key: CASSANDRA-2383
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2383
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.7.4
 Environment: OS : windows
 java : 1.6.0.23
Reporter: david lee
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.7


 When the Cassandra home folder is placed inside a folder that has space 
 characters in its name, log4j settings are not properly loaded and warning 
 messages are shown.





[jira] [Commented] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds

2011-06-29 Thread Lanny Ripple (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057227#comment-13057227
 ] 

Lanny Ripple commented on CASSANDRA-2840:
-

Thanks! I searched around before opening the ticket, but I guess I wasn't 
thorough enough.

Regards,
   -ljr








svn commit: r1141129 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/db/ColumnFamilyStore.java

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 15:15:29 2011
New Revision: 1141129

URL: http://svn.apache.org/viewvc?rev=1141129&view=rev
Log:
Fix scan wrongly throwing assertion errors
patch by slebresne; reviewed by jbellis for CASSANDRA-2653

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1141129&r1=1141128&r2=1141129&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jun 29 15:15:29 2011
@@ -26,6 +26,7 @@
  * fix race that could result in Hadoop writer failing to throw an
exception encountered after close() (CASSANDRA-2755)
  * fix CLI parsing of read_repair_chance (CASSANDRA-2837)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)
 
 
 0.7.6

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1141129&r1=1141128&r2=1141129&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 Wed Jun 29 15:15:29 2011
@@ -1486,6 +1486,16 @@ public class ColumnFamilyStore implement
         return rows;
     }
 
+    private NamesQueryFilter getExtraFilter(IndexClause clause)
+    {
+        SortedSet<ByteBuffer> columns = new TreeSet<ByteBuffer>(getComparator());
+        for (IndexExpression expr : clause.expressions)
+        {
+            columns.add(expr.column_name);
+        }
+        return new NamesQueryFilter(columns);
+    }
+
     public List<Row> scan(IndexClause clause, AbstractBounds range, IFilter dataFilter)
     {
         // Start with the most-restrictive indexed clause, then apply remaining clauses
@@ -1502,50 +1512,33 @@ public class ColumnFamilyStore implement
         // it needs to be expanded to include those too
         IFilter firstFilter = dataFilter;
         NamesQueryFilter extraFilter = null;
-        if (clause.expressions.size() > 1)
+        if (dataFilter instanceof SliceQueryFilter)
         {
-            if (dataFilter instanceof SliceQueryFilter)
+            // if we have a high chance of getting all the columns in a single index slice, do that.
+            // otherwise, we'll create an extraFilter (lazily) to fetch by name the columns referenced by the additional expressions.
+            if (getMaxRowSize() < DatabaseDescriptor.getColumnIndexSize())
             {
-                // if we have a high chance of getting all the columns in a single index slice, do that.
-                // otherwise, create an extraFilter to fetch by name the columns referenced by the additional expressions.
-                if (getMaxRowSize() < DatabaseDescriptor.getColumnIndexSize())
-                {
-                    logger.debug("Expanding slice filter to entire row to cover additional expressions");
-                    firstFilter = new SliceQueryFilter(ByteBufferUtil.EMPTY_BYTE_BUFFER,
-                                                       ByteBufferUtil.EMPTY_BYTE_BUFFER,
-                                                       ((SliceQueryFilter) dataFilter).reversed,
-                                                       Integer.MAX_VALUE);
-                }
-                else
-                {
-                    logger.debug("adding extraFilter to cover additional expressions");
-                    SortedSet<ByteBuffer> columns = new TreeSet<ByteBuffer>(getComparator());
-                    for (IndexExpression expr : clause.expressions)
-                    {
-                        if (expr == primary)
-                            continue;
-                        columns.add(expr.column_name);
-                    }
-                    extraFilter = new NamesQueryFilter(columns);
-                }
+                logger.debug("Expanding slice filter to entire row to cover additional expressions");
+                firstFilter = new SliceQueryFilter(ByteBufferUtil.EMPTY_BYTE_BUFFER,
+                                                   ByteBufferUtil.EMPTY_BYTE_BUFFER,
+                                                   ((SliceQueryFilter) dataFilter).reversed,
+                                                   Integer.MAX_VALUE);
             }
-            else
+        }
+        else
+        {
+            logger.debug("adding columns to firstFilter to cover additional expressions");
+            // just add in columns that are not part of the resultset
+            assert dataFilter instanceof NamesQueryFilter;
+            SortedSet<ByteBuffer> 

[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable

2011-06-29 Thread Terje Marthinussen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057295#comment-13057295
 ] 

Terje Marthinussen commented on CASSANDRA-2521:
---

I have not found any further major issues here, but I think there are still 
situations where files are deleted late; they do seem to go away eventually, though.

Not sure if we are missing something in terms of reference counting and GC 
deletes them eventually, or whether it is just a delayed free or delete for some 
reason, but it does not happen too often.

I will try to add some debug logging and see what I find.

I was looking at the code, though, and I am wondering about one segment. I have 
not had time to actually test this, but in submitUserDefined() there is a 
finally statement removing references for sstables, and I could not 
immediately see where references are acquired for all the sstables that 
need to be freed there.

I am sure it's just me missing something, but anyway...


 Move away from Phantom References for Compaction/Memtable
 -

 Key: CASSANDRA-2521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Sylvain Lebresne
 Fix For: 1.0

 Attachments: 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 
 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt


 http://wiki.apache.org/cassandra/MemtableSSTable
 Let's move to using reference counting instead of relying on GC to be called 
 in StorageService.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141134 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 15:26:29 2011
New Revision: 1141134

URL: http://svn.apache.org/viewvc?rev=1141134&view=rev
Log:
merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/CHANGES.txt
cassandra/branches/cassandra-0.8/contrib/   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1141134&r1=1141133&r2=1141134&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun 29 15:26:29 2011
@@ -85,6 +85,7 @@
  * Fix wrong purge of deleted cf during compaction (CASSANDRA-2786)
  * fix race that could result in Hadoop writer failing to throw an
exception encountered after close() (CASSANDRA-2755)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)
 
 
 0.8.0-final

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 

[jira] [Commented] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation

2011-06-29 Thread Jim Ancona (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057298#comment-13057298
 ] 

Jim Ancona commented on CASSANDRA-2773:
---

We have deployed 0.7.6 plus this patch to the affected cluster and tested it. The 
cluster restarted successfully, and the tests that caused the original failure 
now run successfully. In addition, functional tests of our applications show no 
regressions. I also reviewed the Cassandra system logs after the testing and 
saw no errors or obvious problems.


 Index manager cannot support deleting and inserting into a row in the same 
 mutation
 -

 Key: CASSANDRA-2773
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Boris Yen
Assignee: Jonathan Ellis
Priority: Critical
 Fix For: 0.8.2

 Attachments: 2773-v2.txt, 2773.txt, cassandra.log, 
 v1-0001-allow-deleting-a-rowand-updating-indexed-columns-init-.txt, 
 v1-0002-CASSANDRA-2773-Add-unit-tests-to-verfy-fix-cherry-pick.txt


 I use hector 0.8.0-1 and cassandra 0.8.
 1. Create a mutator using the hector api.
 2. Insert a few columns into the mutator for key "key1", cf "standard".
 3. Add a deletion to the mutator to delete the record of "key1", cf 
 "standard".
 4. Repeat 2 and 3.
 5. Execute the mutator.
 The result: the connection seems to be held by the server forever; it never 
 returns. When I tried to restart Cassandra I saw "unsupportedexception: 
 Index manager cannot support deleting and inserting into a row in the same 
 mutation", and Cassandra is dead forever unless I delete the commitlog. 
 I would expect to get an exception when I execute the mutator, not after I 
 restart Cassandra.





[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable

2011-06-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057305#comment-13057305
 ] 

Sylvain Lebresne commented on CASSANDRA-2521:
-

bq. Not sure if we are missing something in terms of reference counting and GC 
delete it eventually

If you don't use mmap, the GC shouldn't do anything. So if it is deleted 
eventually, it would indicate some place where the last decrement is delayed 
longer than it needs to be. It'd be interesting if you find more.

bq. in submitUserDefined() there is a finally statement removing References for 
sstables but I could not immediately see where there are References acquired

It's in the lookupSSTables (a private method used by submitUserDefined). I 
admit it's not super clean, but it felt like the simplest way to do this in a 
thread-safe manner without holding unneeded references for too long.
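
For readers following along, the acquire/release scheme being discussed can be sketched roughly as follows. This is a simplified illustration of the reference-counting idea, not the actual Cassandra code; the class and method names here are made up for the example:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: an sstable stays on disk while its reference count
// is positive; the thread that drops the count to zero deletes it.
class RefCountedSSTable
{
    // starts at 1: the "live in the data tracker" reference
    private final AtomicInteger references = new AtomicInteger(1);
    private volatile boolean deleted = false;

    // Returns false if the sstable has already been released, so callers
    // (e.g. a compaction picking candidates) know to skip it.
    boolean acquire()
    {
        while (true)
        {
            int n = references.get();
            if (n <= 0)
                return false;
            if (references.compareAndSet(n, n + 1))
                return true;
        }
    }

    // Every successful acquire() must be paired with exactly one release(),
    // typically in a finally block.
    void release()
    {
        if (references.decrementAndGet() == 0)
            deleted = true; // the real code would delete the on-disk files here
    }

    boolean isDeleted()
    {
        return deleted;
    }
}
```

The point in the exchange above is that the acquire may happen in one method (the lookup) while the matching release sits in another method's finally block, which is why the release can look unpaired when reading a single method in isolation.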

bq. Only thing missing beyond that is to get this into 0.8.

I really don't think this would be reasonable. This is not a trivial change by 
any means, nor does it fix a regression. Which is not to say it doesn't make 
life much easier. But I'm really uncomfortable pushing that to 0.8.

 Move away from Phantom References for Compaction/Memtable
 -

 Key: CASSANDRA-2521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Sylvain Lebresne
 Fix For: 1.0

 Attachments: 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 
 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt


 http://wiki.apache.org/cassandra/MemtableSSTable
 Let's move to using reference counting instead of relying on GC to be called 
 in StorageService.





svn commit: r1141138 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 15:48:50 2011
New Revision: 1141138

URL: http://svn.apache.org/viewvc?rev=1141138&view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567
+/cassandra/branches/cassandra-0.7:1026516-1140567,1141129
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1141138&r1=1141137&r2=1141138&view=diff
==============================================================================
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Wed Jun 29 15:48:50 2011
@@ -103,6 +103,7 @@
  * Fix wrong purge of deleted cf during compaction (CASSANDRA-2786)
  * fix race that could result in Hadoop writer failing to throw an
exception encountered after close() (CASSANDRA-2755)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)
 
 
 0.8.0-final

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 

[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057307#comment-13057307
 ] 

Hudson commented on CASSANDRA-2653:
---

Integrated in Cassandra-0.7 #517 (See 
[https://builds.apache.org/job/Cassandra-0.7/517/])
Fix scan wrongly throwing assertion errors
patch by slebresne; reviewed by jbellis for CASSANDRA-2653

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1141129
Files : 
* /cassandra/branches/cassandra-0.7/CHANGES.txt
* 
/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java


 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.2

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 
 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}





[jira] [Commented] (CASSANDRA-2816) Repair doesn't synchronize merkle tree creation properly

2011-06-29 Thread Terje Marthinussen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057339#comment-13057339
 ] 

Terje Marthinussen commented on CASSANDRA-2816:
---

This is what the heap looks like when GC starts slowing things down so much that 
even gossip gets delayed long enough for nodes to be considered down for some seconds.

  num     #instances         #bytes  class name
 ----------------------------------------------------------------------
    1:       9453188      453753024  java.nio.HeapByteBuffer
    2:      10081546      392167064  [B
    3:       7616875          24374  org.apache.cassandra.db.Column
    4:       9739914      233757936  java.util.concurrent.ConcurrentSkipListMap$Node
    5:       4131938       99166512  java.util.concurrent.ConcurrentSkipListMap$Index
    6:       1549230       49575360  org.apache.cassandra.db.DeletedColumn

I guess this really ends up being the mix of everything going on in total 
and all the reading and writing that may occur when repair runs (validation 
compactions, streaming, normal compactions and regular traffic, all at the same 
time, and maybe for many CFs at the same time).

However, I have suspected for some time that our young generation size was a bit 
on the small side, and after increasing it and giving the heap a few more GB to 
work with, things seem to be behaving quite a bit better.

I mentioned issues with this patch when testing for CASSANDRA-2521. That was a 
problem caused by me: I was playing around with git for the first time and 
managed to apply 2816 to a different branch than the one I used for testing 
:(

My apologies. 

Initial testing with that corrected looks a lot better for my small-scale test 
case, but I noticed one case where I deleted an sstable and restarted, and it did 
not get repaired (repair scanned but did nothing).

Not entirely sure what to make of that; I then deleted another 
sstable and repair started running.

I will test more over the next few days. 


 Repair doesn't synchronize merkle tree creation properly
 

 Key: CASSANDRA-2816
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2816
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: repair
 Fix For: 0.8.2

 Attachments: 0001-Schedule-merkle-tree-request-one-by-one.patch


 Being a little slow, I just realized after having opened CASSANDRA-2811 and 
 CASSANDRA-2815 that there is a more general problem with repair.
 When a repair is started, it will send a number of merkle tree to its 
 neighbor as well as himself and assume for correction that the building of 
 those trees will be started on every node roughly at the same time (if not, 
 we end up comparing data snapshot at different time and will thus mistakenly 
 repair a lot of useless data). This is bogus for many reasons:
 * Because validation compaction runs on the same executor that other 
 compaction, the start of the validation on the different node is subject to 
 other compactions. 0.8 mitigates this in a way by being multi-threaded (and 
 thus there is less change to be blocked a long time by a long running 
 compaction), but the compaction executor being bounded, its still a problem)
 * if you run a nodetool repair without arguments, it will repair every CFs. 
 As a consequence it will generate lots of merkle tree requests and all of 
 those requests will be issued at the same time. Because even in 0.8 the 
 compaction executor is bounded, some of those validations will end up being 
 queued behind the first ones. Even assuming that the different validation are 
 submitted in the same order on each node (which isn't guaranteed either), 
 there is no guarantee that on all nodes, the first validation will take the 
 same time, hence desynchronizing the queued ones.
 Overall, it is important for the precision of repair that for a given CF and 
 range (which is the unit at which trees are computed), we make sure that all 
 node will start the validation at the same time (or, since we can't do magic, 
 as close as possible).
 One (reasonably simple) proposition to fix this would be to have repair 
 schedule validation compactions across nodes one by one (i.e, one CF/range at 
 a time), waiting for all nodes to return their tree before submitting the 
 next request. Then on each node, we should make sure that the node will start 
 the validation compaction as soon as requested. For that, we probably want to 
 have a specific executor for validation compaction and:
 * either we fail the whole repair whenever one node is not able to execute 
 the validation compaction right away (because no thread are available right 
 away).
 * we simply tell the user that if he start too many repairs 

[jira] [Commented] (CASSANDRA-2838) Query indexed column with key filte

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057345#comment-13057345
 ] 

Jonathan Ellis commented on CASSANDRA-2838:
---

1600 is about map/reduce, this is about CQL

 Query indexed column with key filte
 ---

 Key: CASSANDRA-2838
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2838
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Danny Wang

 To be able to support a query like this,
  (KEY > foo AND KEY < bar AND name1 = value1)
 Currently I found this code
  
 // Start and finish keys, *and* column relations (KEY > foo AND KEY < bar and name1 = value1).
 if (select.isKeyRange() && (select.getKeyFinish() != null) && (select.getColumnRelations().size() > 0))
     throw new InvalidRequestException("You cannot combine key range and by-column clauses in a SELECT");
  
  in
  
  http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
  





[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057348#comment-13057348
 ] 

Jonathan Ellis commented on CASSANDRA-2045:
---

bq. I was under the impression that HintedHandoff helps to retain the 
replication factor even in the face of failed replicas.

Nope. If you require N replicas to be written, then you should use an 
appropriate consistency level.

In 0.6+, hints are stored on other live replicas whenever possible (i.e. you 
still end up with fewer total replicas written) unless, as you noted, no replicas are 
alive and you're writing at CL.ANY.

So my point is that after we move away from storing hints as pointers to row 
data, there's no reason for the "prefer other replicas" optimization, so we 
might as well just always store the hint on the coordinator.

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. We 
 wouldn't have to do any lookups, just take the byte[] and send it to the destination.





[jira] [Resolved] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds

2011-06-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2840.
---

Resolution: Duplicate

 cassandra-cli describe keyspace shows confusing memtable thresholds
 ---

 Key: CASSANDRA-2840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2840
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.5
 Environment: linux rackspace instance 4cpu/4G
Reporter: Lanny Ripple
Priority: Minor

 The 'describe keyspace' output seems to be mixing up the labeling for minutes 
 and MB for Memtable thresholds output.
 Example off our ring:
Memtable thresholds: 0.2859375/61/1440 (millions of ops/minutes/MB)
 We use minutes=1440 and MB=61.





[jira] [Commented] (CASSANDRA-2475) Prepared statements

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057352#comment-13057352
 ] 

Jonathan Ellis commented on CASSANDRA-2475:
---

bq. I suggest that these token streams could be small enough that they could be 
the returned value from the Prepare call and relieve the server side from the 
maintenance and accounting hassle of keeping track of them.

It's really no hassle.  We already encapsulate per-connection state in the 
ClientState object.  And parsing the tokens from the client on each request is 
going to generate a ton of GC churn...
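
The per-connection caching described here can be illustrated with a rough sketch. ClientStateSketch and its methods are hypothetical stand-ins for the idea, not Cassandra's real ClientState API; a String stands in for the parsed statement:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of per-connection prepared-statement caching:
// prepare once server-side, then execute by opaque id with no re-parsing.
class ClientStateSketch
{
    private final AtomicInteger nextId = new AtomicInteger();
    // id -> parsed statement; the real thing would cache a parse tree,
    // so executions don't re-tokenize the query text every request
    private final Map<Integer, String> prepared = new HashMap<Integer, String>();

    // prepare(): parse the CQL once, cache it, hand the client back an id
    int prepare(String cql)
    {
        int id = nextId.incrementAndGet();
        prepared.put(id, cql);
        return id;
    }

    // execute(): look the statement up by id; fails if the id is unknown,
    // e.g. because it was prepared on a different connection
    String execute(int id)
    {
        String stmt = prepared.get(id);
        if (stmt == null)
            throw new IllegalStateException("unknown prepared statement id " + id);
        return stmt;
    }
}
```

Because the cache lives in per-connection state, the handle is only valid for that connection's lifetime, which is exactly the trade-off debated in this thread.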

 Prepared statements
 ---

 Key: CASSANDRA-2475
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0








[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2653:


Attachment: 0001-Fix-scan-issue.patch

Actually, after having committed it, I realize there are a few issues with the 
previous patch. Two mostly:
# If the extraFilter query finds nothing (which it will do only in case of the 
race between writes and reads), getColumnFamily() will return null and the 
data.addAll() will NPE.
# For 0.8 and for counters, we must make really sure that this extra query 
won't add columns that were returned by the first query (which can happen in the 
current code), otherwise we'll overcount. I think this is actually a bug that 
predates the fix for this.

Anyway, attaching 0001-Fix-scan-issue, which fixes both of those issues. It also 
adds a slight optimization that avoids doing extra work if we know an extra 
query won't help.
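
The first issue (an NPE when the extra query returns nothing) amounts to a missing null guard before the merge, and the second to deduplicating against the first query's result. A toy illustration of both fixes, with hypothetical names and List<String> standing in for the real ColumnFamily types:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: merge the extra query's columns into the
// first query's result without NPEing and without double-counting.
class ScanMergeSketch
{
    static List<String> merge(List<String> data, List<String> extra)
    {
        List<String> merged = new ArrayList<String>(data);
        // the race between writes and reads can make the extra query
        // come back empty, which surfaces here as null
        if (extra != null)
        {
            for (String col : extra)
            {
                // skip columns the first query already returned; for
                // counter columns, re-adding them would overcount
                if (!merged.contains(col))
                    merged.add(col);
            }
        }
        return merged;
    }
}
```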

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.2

 Attachments: 0001-Fix-scan-issue.patch, 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 
 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2475) Prepared statements

2011-06-29 Thread Rick Shaw (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057375#comment-13057375
 ] 

Rick Shaw commented on CASSANDRA-2475:
--

I guess I was a bit too vague... My suggestion would be to return the 
pre-parsed token stream, not something you would re-parse every time it is 
re-submitted. It is the same item I think you suggest we cache on the server 
side. I think the interesting twist is that, using the suggested method, the 
pre-parsed item (or prepared statement) could be used days later in a 
different connection. It would be an immutable resource. If it is cached server 
side it is only good for the life of the connection. 

But it's just a suggestion. I understand the merits of retaining complete 
control of the format over time and the efficiencies of passing the handle 
back and forth. And I was not familiar with the ClientState at all. 

More troubling is the batch semantics... I hate the idea of disrupting the 
current syntax in CQL but I think the parameter substitution step will be very 
fragile if there is not a notion of lists of items that are tightly coupled 
with their respective handle's parameters in the batch. The thought of 
thousands of rows worth of entries in a batch and getting the parameters right 
for a giant array/list of parameters that fill into the pre-compiled tokens 
seems fraught with problems. How does the repeating nature get expressed? 
Currently it is very concrete and can be parsed into a mutation on the fly. But 
if it is pre-parsed what syntax represents the concept of repetition? Is the 
syntax different for the prepared statement vs. the simple (not prepared) 
statement as today?

The crafters of the JDBC driver specification seem to have been faced with 
the same problem. Their solution was to have a batch method as well as 
execute methods that take an array/list of prepared statements. Unsolved for 
us is how to recognize the notion of mutation start/mutation end using that 
approach. Maybe you just do a prepare call for BEGIN BATCH and APPLY 
BATCH and use them in the list sent via the batch method?

 Prepared statements
 ---

 Key: CASSANDRA-2475
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057392#comment-13057392
 ] 

Brandon Williams edited comment on CASSANDRA-2388 at 6/29/11 6:41 PM:
--

{quote}
This does happen already (i've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
{quote}

If the cassandra node where the TT resides isn't working, then throughput is 
reduced regardless.


bq. I bet too that a fallback to a replica is faster than a fallback to another 
TT.

I doubt that for any significant job.  Locality is important.  Move the job to 
the data, not the data to the job.

{quote}
There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity.
{quote}

This sounds like the thing we need to fix, then.  Ensuring that the TT assigned 
to the map has a local replica.

  was (Author: brandon.williams):
{quote}
This does happen already (i've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
{quote}

If the cassandra node where the TT resides isn't working, then throughput is 
reduced regardless.


bq. I bet too that a fallback to a replica is faster than a fallback to another 
TT.

I doubt that for any significant job.  Locality is important.

{quote}
There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity.
{quote}

This sounds like the thing we need to fix, then.  Ensuring that the TT assigned 
to the map has a local replica.
  
 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057392#comment-13057392
 ] 

Brandon Williams commented on CASSANDRA-2388:
-

{quote}
This does happen already (i've seen it while testing initial patches that were 
no good).
Problem is that the TT is blacklisted, reducing hadoop's throughput for all 
jobs running.
{quote}

If the cassandra node where the TT resides isn't working, then throughput is 
reduced regardless.


bq. I bet too that a fallback to a replica is faster than a fallback to another 
TT.

I doubt that for any significant job.  Locality is important.

{quote}
There is no guarantee that any given TT will have its split accessible via a 
local c* node - this is only a preference in CFRR. A failed task may just as 
likely go to a random c* node. At least now we can actually properly limit to 
the one DC and sort by proximity.
{quote}

This sounds like the thing we need to fix, then.  Ensuring that the TT assigned 
to the map has a local replica.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057398#comment-13057398
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

bq. It looks like we do a query per hint to look up its version on replay? I 
think we can avoid that (one of the benefits of the new approach is we should 
be able to just do seq reads of a hint row on replay). Why not just add version 
in as another subcolumn of the hint entry?

I don't quite follow this. The new schema for hints doesn't really allow 
sequential reads of the row. Here's what I currently have:

{noformat}

Old
-
Hints: {// cf
  dest ip: {  // key
key: {// super-column
  table-cf: null// column
}
  }
}

New
--
Hints: {// cf
  dest ip: {  // key
key: {// super-column
  table-cf: id// column
}
  }
}

HintedMutations: {  // cf
  dest ip: {  // key
id: { // super-column
  version: mutation // column
}
  }
}
{noformat}

The point was to retain backwards compatibility with the old Hints (so we don't 
have to expunge old ones on upgrade), but if we feel that we gain more by 
breaking this compatibility I'm open to it. As has been previously mentioned, 
losing hints during upgrade isn't the end of the world, as they're little more 
than an optimization.

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take the byte[] and send it to the 
 destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141194 [2/2] - in /cassandra/trunk: conf/ src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/cassandra/

2011-06-29 Thread brandonwilliams
Modified: cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java Wed Jun 
29 18:55:50 2011
@@ -184,7 +184,7 @@ public class BulkLoader
 StorageService.instance.initClient();
 
 Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
-hosts.remove(FBUtilities.getLocalAddress());
+hosts.remove(FBUtilities.getBroadcastAddress());
 if (hosts.isEmpty())
 throw new IllegalStateException("Cannot load any sstable, 
no live member found in the cluster");
 

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java Wed 
Jun 29 18:55:50 2011
@@ -61,6 +61,7 @@ public class FBUtilities
 public static final BigInteger TWO = new BigInteger(2);
 
 private static volatile InetAddress localInetAddress_;
+private static volatile InetAddress broadcastInetAddress_;
 
 private static final ThreadLocal<MessageDigest> localMD5Digest = new 
ThreadLocal<MessageDigest>()
 {
@@ -129,6 +130,15 @@ public class FBUtilities
 return localInetAddress_;
 }
 
+public static InetAddress getBroadcastAddress()
+{
+if (broadcastInetAddress_ == null)
+broadcastInetAddress_ = DatabaseDescriptor.getBroadcastAddress() 
== null
+? getLocalAddress()
+: DatabaseDescriptor.getBroadcastAddress();
+return broadcastInetAddress_;
+}
+
 /**
  * @param fractOrAbs A double that may represent a fraction or absolute 
value.
  * @param total If fractionOrAbs is a fraction, the total to take the 
fraction from

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java Wed Jun 
29 18:55:50 2011
@@ -80,7 +80,7 @@ public class Mx4jTool
 
 private static String getAddress()
 {
-return System.getProperty("mx4jaddress", 
FBUtilities.getLocalAddress().getHostAddress());
+return System.getProperty("mx4jaddress", 
FBUtilities.getBroadcastAddress().getHostAddress());
 }
 
 private static int getPort()

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java Wed Jun 29 
18:55:50 2011
@@ -102,7 +102,7 @@ public class NodeId implements Comparabl
 
 public static NodeId generate()
 {
-return new 
NodeId(ByteBuffer.wrap(UUIDGen.decompose(UUIDGen.makeType1UUIDFromHost(FBUtilities.getLocalAddress()))));
+return new 
NodeId(ByteBuffer.wrap(UUIDGen.decompose(UUIDGen.makeType1UUIDFromHost(FBUtilities.getBroadcastAddress()))));
 }
 
 /*

Modified: cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==
--- cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java (original)
+++ cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java Wed Jun 29 
18:55:50 2011
@@ -164,7 +164,7 @@ public class DefsTest extends CleanupHel
 public void saveAndRestore() throws IOException
 {
 // verify dump and reload.
-UUID first = 
UUIDGen.makeType1UUIDFromHost(FBUtilities.getLocalAddress());
+UUID first = 
UUIDGen.makeType1UUIDFromHost(FBUtilities.getBroadcastAddress());
 DefsTable.dumpToStorage(first);
List<KSMetaData> defs = new 

[Cassandra Wiki] Update of FileFormatDesignDoc by AlanLiang

2011-06-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FileFormatDesignDoc page has been changed by AlanLiang:
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=39&rev2=40

Comment:
add disk layout for chunk

  }
  }}}
  
+ == Disk Layout ==
+ 
+ === Chunk ===
+ 
+ || ''name1''  || ''bytes'' ||
+ || magic || 16 ||
+ || encoded_length || 4 ||
+ || encoded || variable ||
+ || hash || 4 ||
+ 
  == Roadmap ==
  
  Implementation has started on this design at 
https://github.com/stuhood/cassandra/tree/file-format


[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by AlanLiang

2011-06-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FileFormatDesignDoc page has been changed by AlanLiang:
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=40&rev2=41

  
  === Chunk ===
  
- || ''name1''  || ''bytes'' ||
+ || || ''bytes'' ||
  || magic || 16 ||
  || encoded_length || 4 ||
  || encoded || variable ||


[jira] [Created] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner

2011-06-29 Thread Sylvain Lebresne (JIRA)
Always use even distribution for merkle tree with RandomPartitionner


 Key: CASSANDRA-2841
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
 Fix For: 0.7.7, 0.8.2
 Attachments: 2841.patch

When creating the initial merkle tree, repair tries to be (too) smart and uses 
the key samples to guide the tree splitting. While this is a good idea for 
OPP, where there is a good chance the data distribution is uneven, you can't 
beat an even distribution for the RandomPartitioner. And a quick experiment 
even shows that the method used is significantly less efficient than an even 
distribution for the ranges of the merkle tree (that is, an even distribution 
gives a much better distribution of the number of keys by range of the tree).

Thus let's switch to an even distribution for the RandomPartitioner. That 
three-line change alone accounts for a significant improvement of repair's 
precision.
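As a hedged illustration (a standalone sketch, not Cassandra's MerkleTree code; all names here are hypothetical), splitting the RandomPartitioner token space [0, 2^127) into equal-width ranges is plain integer arithmetic:

```java
import java.math.BigInteger;
import java.util.*;

class EvenSplit {
    // RandomPartitioner tokens live in [0, 2^127)
    static final BigInteger MAX = BigInteger.valueOf(2).pow(127);

    // Return the (count + 1) boundaries of `count` equal-width token ranges.
    static List<BigInteger> evenBoundaries(int count) {
        List<BigInteger> bounds = new ArrayList<>();
        BigInteger width = MAX.divide(BigInteger.valueOf(count));
        for (int i = 0; i <= count; i++)
            bounds.add(i == count ? MAX : width.multiply(BigInteger.valueOf(i)));
        return bounds;
    }

    public static void main(String[] args) {
        List<BigInteger> b = evenBoundaries(4);
        System.out.println(b.size());   // 4 ranges need 5 boundaries
        // the middle boundary is exactly half the token space
        System.out.println(b.get(2).equals(MAX.divide(BigInteger.valueOf(2))));
    }
}
```

Because the key count per range no longer depends on key samples, the tree's ranges are uniformly sized regardless of how the sampled keys happen to cluster.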

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner

2011-06-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2841:


Attachment: 2841.patch

Patch is against 0.7.

 Always use even distribution for merkle tree with RandomPartitionner
 

 Key: CASSANDRA-2841
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: repair
 Fix For: 0.7.7, 0.8.2

 Attachments: 2841.patch


 When creating the initial merkle tree, repair tries to be (too) smart and uses 
 the key samples to guide the tree splitting. While this is a good idea for 
 OPP, where there is a good chance the data distribution is uneven, you can't 
 beat an even distribution for the RandomPartitioner. And a quick experiment 
 even shows that the method used is significantly less efficient than an even 
 distribution for the ranges of the merkle tree (that is, an even distribution 
 gives a much better distribution of the number of keys by range of the 
 tree).
 Thus let's switch to an even distribution for the RandomPartitioner. That 
 three-line change alone accounts for a significant improvement of repair's 
 precision.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057401#comment-13057401
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

bq. if we're storing the full mutation, why add the complexity of hint headers 
and forwarding? Can we just make the coordinator responsible for all hints 
instead?
bq. So my point is that after we move away from storing hints as pointers to 
row data, there's no reason for the prefer other replicas optimization so we 
might as well just always store it on the coordinator.

While I agree with this, it seems that changing this is non-trivial (lots of 
changes to StorageProxy by the looks of it) so I'm leaning towards not 
including it in this ticket. It seems like an isolated idea though, albeit one 
that depends on this issue. Can we open this as a dependent ticket?

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take the byte[] and send it to the 
 destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057409#comment-13057409
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

Another consideration: If we're moving away from the old hint storage layout, 
we can optimize for cases where the same RowMutation needs to be delivered to 
multiple endpoints (i.e. multiple replicas are down). This can be done by 
moving the destination IP down to the bottom level of the map so each 
RowMutation maps to multiple destinations.

Thoughts?
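A minimal sketch of that inversion (hypothetical in-memory types, not the actual Hints column family layout): store each hinted mutation once and track the set of pending destination endpoints, reclaiming the mutation only after every down replica has received it.

```java
import java.util.*;

class HintIndex {
    // One copy of each hinted mutation, shared by all down replicas
    private final Map<UUID, byte[]> mutations = new HashMap<>();
    // Which endpoints still need each mutation
    private final Map<UUID, Set<String>> destinations = new HashMap<>();

    void hint(UUID id, byte[] mutation, String endpoint) {
        mutations.putIfAbsent(id, mutation);
        destinations.computeIfAbsent(id, k -> new HashSet<>()).add(endpoint);
    }

    void delivered(UUID id, String endpoint) {
        Set<String> pending = destinations.get(id);
        if (pending == null) return;
        pending.remove(endpoint);
        if (pending.isEmpty()) {    // no replica needs it any more: reclaim
            destinations.remove(id);
            mutations.remove(id);
        }
    }

    int storedMutations() { return mutations.size(); }

    public static void main(String[] args) {
        HintIndex idx = new HintIndex();
        UUID id = UUID.randomUUID();
        idx.hint(id, new byte[]{1}, "10.0.0.1");
        idx.hint(id, new byte[]{1}, "10.0.0.2"); // second down replica, same mutation
        System.out.println(idx.storedMutations()); // stored once, not twice
        idx.delivered(id, "10.0.0.1");
        System.out.println(idx.storedMutations()); // still pending for 10.0.0.2
        idx.delivered(id, "10.0.0.2");
        System.out.println(idx.storedMutations()); // reclaimed
    }
}
```

The space saving comes from the first print being 1 rather than 2: the serialized mutation is not duplicated per down replica.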

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take the byte[] and send it to the 
 destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057413#comment-13057413
 ] 

Jonathan Ellis commented on CASSANDRA-2388:
---

bq. If the cassandra node where the TT resides isn't working, then throughput 
is reduced regardless.

Right: we _want_ it to be blacklisted in that scenario.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057415#comment-13057415
 ] 

Jonathan Ellis commented on CASSANDRA-2653:
---

+1

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.2

 Attachments: 0001-Fix-scan-issue.patch, 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 
 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2475) Prepared statements

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057417#comment-13057417
 ] 

Jonathan Ellis commented on CASSANDRA-2475:
---

My point was you still need to parse it from the socket bytestream.  Not 
re-parsing the raw CQL.

I really don't see what is so complex about apply(parsed_tokens_list, 
parameters) vs apply(saved_queries.get(parsed_tokens_id), parameters).
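The saved_queries approach can be sketched roughly as follows (class names and the whitespace-tokenizing/substitution step are simplified stand-ins, not Cassandra's actual ClientState or CQL parser): prepare() caches the pre-parsed form once and returns a handle, and execute() binds parameters against the cached tokens.

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

class StatementCache {
    // Per-connection cache of pre-parsed statements, keyed by the handle
    // the client sends back on execute
    private final Map<Integer, List<String>> parsed = new HashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    int prepare(String cql) {
        int id = nextId.incrementAndGet();
        parsed.put(id, Arrays.asList(cql.split("\\s+"))); // stand-in for real parsing
        return id;
    }

    // Substitute ?-markers from the cached token list; no re-parse of raw CQL
    String execute(int id, Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (String token : parsed.get(id))
            sb.append(params.getOrDefault(token, token)).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        StatementCache cache = new StatementCache();
        int h = cache.prepare("UPDATE users SET name = ?name WHERE key = ?key");
        System.out.println(cache.execute(h, Map.of("?name", "'rick'", "?key", "'k1'")));
    }
}
```

Since the cache lives with the connection state, the handle is only valid for the life of that connection, which is the trade-off discussed above.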

 Prepared statements
 ---

 Key: CASSANDRA-2475
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057418#comment-13057418
 ] 

Jonathan Ellis commented on CASSANDRA-2841:
---

+1

 Always use even distribution for merkle tree with RandomPartitionner
 

 Key: CASSANDRA-2841
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: repair
 Fix For: 0.7.7, 0.8.2

 Attachments: 2841.patch


 When creating the initial merkle tree, repair tries to be (too) smart and uses 
 the key samples to guide the tree splitting. While this is a good idea for 
 OPP, where there is a good chance the data distribution is uneven, you can't 
 beat an even distribution for the RandomPartitioner. And a quick experiment 
 even shows that the method used is significantly less efficient than an even 
 distribution for the ranges of the merkle tree (that is, an even distribution 
 gives a much better distribution of the number of keys by range of the 
 tree).
 Thus let's switch to an even distribution for the RandomPartitioner. That 
 three-line change alone accounts for a significant improvement of repair's 
 precision.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141213 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 19:36:15 2011
New Revision: 1141213

URL: http://svn.apache.org/viewvc?rev=1141213view=rev
Log:
Fix last issues with 2653

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1141213&r1=1141212&r2=1141213&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 Wed Jun 29 19:36:15 2011
@@ -1496,6 +1496,13 @@ public class ColumnFamilyStore implement
 return new NamesQueryFilter(columns);
 }
 
+private static boolean isIdentityFilter(SliceQueryFilter filter)
+{
+return filter.start.equals(ByteBufferUtil.EMPTY_BYTE_BUFFER)
+ filter.finish.equals(ByteBufferUtil.EMPTY_BYTE_BUFFER)
+ filter.count == Integer.MAX_VALUE;
+}
+
 public List<Row> scan(IndexClause clause, AbstractBounds range, IFilter 
dataFilter)
 {
 // Start with the most-restrictive indexed clause, then apply 
remaining clauses
@@ -1511,7 +1518,6 @@ public class ColumnFamilyStore implement
 // if the slicepredicate doesn't contain all the columns for which we 
have expressions to evaluate,
 // it needs to be expanded to include those too
 IFilter firstFilter = dataFilter;
-NamesQueryFilter extraFilter = null;
 if (dataFilter instanceof SliceQueryFilter)
 {
 // if we have a high chance of getting all the columns in a single 
index slice, do that.
@@ -1597,23 +1603,36 @@ public class ColumnFamilyStore implement
 if (data == null)
 data = ColumnFamily.create(metadata);
 logger.debug(fetched data row {}, data);
-if (dataFilter instanceof SliceQueryFilter)
+if (dataFilter instanceof SliceQueryFilter && 
!isIdentityFilter((SliceQueryFilter)dataFilter))
 {
 // we might have gotten the expression columns in with the 
main data slice, but
 // we can't know for sure until that slice is done.  So, 
we'll do the extra query
 // if we go through and any expression columns are not 
present.
+boolean needExtraFilter = false;
 for (IndexExpression expr : clause.expressions)
 {
 if (data.getColumn(expr.column_name) == null)
 {
 logger.debug(adding extraFilter to cover 
additional expressions);
 // Lazily creating extra filter
-if (extraFilter == null)
-extraFilter = getExtraFilter(clause);
-data.addAll(getColumnFamily(new QueryFilter(dk, 
path, extraFilter)));
+needExtraFilter = true;
 break;
 }
 }
+if (needExtraFilter)
+{
+NamesQueryFilter extraFilter = getExtraFilter(clause);
+for (IndexExpression expr : clause.expressions)
+{
+if (data.getColumn(expr.column_name) != null)
+extraFilter.columns.remove(expr.column_name);
+}
+assert !extraFilter.columns.isEmpty();
+ColumnFamily cf = getColumnFamily(new QueryFilter(dk, 
path, extraFilter));
+if (cf != null)
+data.addAll(cf);
+}
+
 }
 
 if (satisfies(data, clause, primary))
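A minimal standalone sketch of what the patch above does when narrowing the extra filter (simplified types; `columnsToRefetch` is an illustrative name, not Cassandra's API): only the expression columns missing from the main data slice are fetched again, and the extra query is skipped entirely when nothing is missing.

```java
import java.util.HashSet;
import java.util.Set;

public class ExtraFilterNarrowing {
    // Only the expression columns the main data slice did not return need
    // to be fetched with the extra NamesQueryFilter-style query.
    static Set<String> columnsToRefetch(Set<String> expressionColumns, Set<String> fetchedColumns) {
        Set<String> extra = new HashSet<>(expressionColumns);
        extra.removeAll(fetchedColumns);   // already-present columns drop out
        return extra;                      // empty means: skip the extra query
    }
}
```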




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057419#comment-13057419
 ] 

Jonathan Ellis commented on CASSANDRA-2045:
---

bq. losing hints during upgrade isn't the end of the world

Right.  I'm saying we should do this:

Hints: {// cf
  dest ip: {  // key
key: {// super-column
  table-cf: id// column
  mutation: mutation  // column
}
  }
}

So we denormalize but we gain not having to do secondary-lookup-per-mutation, 
which is our main motivation for the change.  (And single-destination-per-hint 
is by far the common case.)
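As an illustration of the layout above, a toy in-memory sketch (hypothetical names; plain Java maps stand in for a column family): the hint row is keyed by destination IP and carries the serialized mutation inline, so delivery is a single read of that destination's row with no secondary lookup per mutation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HintStore {
    // dest ip -> row key (the super-column in the real schema) -> column name -> value
    static final Map<String, Map<String, Map<String, byte[]>>> hints = new HashMap<>();

    static void storeHint(String destIp, String rowKey, byte[] serializedMutation) {
        hints.computeIfAbsent(destIp, d -> new HashMap<>())
             .computeIfAbsent(rowKey, k -> new HashMap<>())
             .put("mutation", serializedMutation);   // mutation stored inline (denormalized)
    }

    // On a node-up event: one read of the destination's row yields every
    // pending mutation, ready to send as-is.
    static List<byte[]> hintsFor(String destIp) {
        List<byte[]> pending = new ArrayList<>();
        for (Map<String, byte[]> columns : hints.getOrDefault(destIp, Map.of()).values())
            pending.add(columns.get("mutation"));
        return pending;
    }
}
```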

bq. Can we open this as a dependent ticket?

WFM.

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take byte[] and send to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057419#comment-13057419
 ] 

Jonathan Ellis edited comment on CASSANDRA-2045 at 6/29/11 7:36 PM:


bq. losing hints during upgrade isn't the end of the world

Right.  I'm saying we should do this:

{noformat}
Hints: {// cf
  dest ip: {  // key
key: {// super-column
  table-cf: id// column
  mutation: mutation  // column
}
  }
}
{noformat}

So we denormalize but we gain not having to do secondary-lookup-per-mutation, 
which is our main motivation for the change.  (And single-destination-per-hint 
is by far the common case.)

bq. Can we open this as a dependent ticket?

WFM.

  was (Author: jbellis):
bq. losing hints during upgrade isn't the end of the world

Right.  I'm saying we should do this:

Hints: {// cf
  dest ip: {  // key
key: {// super-column
  table-cf: id// column
  mutation: mutation  // column
}
  }
}

So we denormalize but we gain not having to do secondary-lookup-per-mutation, 
which is our main motivation for the change.  (And single-destination-per-hint 
is by far the common case.)

bq. Can we open this as a dependent ticket?

WFM.
  
 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take byte[] and send to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141214 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 19:39:55 2011
New Revision: 1141214

URL: http://svn.apache.org/viewvc?rev=1141214&view=rev
Log:
Merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/contrib/   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1131291

[jira] [Updated] (CASSANDRA-2677) Optimize streaming to be single-pass

2011-06-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2677:
--

Fix Version/s: (was: 0.8.2)
   1.0

Moving to 1.0 b/c of CASSANDRA-2818.

 Optimize streaming to be single-pass
 

 Key: CASSANDRA-2677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2677
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.0


 Streaming currently is a two-pass operation: one to write the Data component 
 do disk from the socket, then another to build the index and bloom filter 
 from it.  This means we do about 2x the i/o we would if we created the index 
 and BF during the original write.
 For node movement this was not considered to be a Big Deal because the stream 
 target is not a member of the ring, so we can be inefficient without hurting 
 live queries.  But optimizing node movement to not require un/rebootstrap 
 (CASSANDRA-1427) and bulk load (CASSANDRA-1278) mean we can stream to live 
 nodes too.
 The main obstacle here is that we don't know how many keys will be in the new 
 sstable ahead of time, which we need in order to size the bloom filter correctly. We 
 can solve this by including that information (or a close approximation) in 
 the stream setup -- the source node can calculate that without hitting disk 
 from the in-memory index summary.
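The sizing step the description relies on can be sketched with the standard bloom-filter formulas (illustrative names, not Cassandra's BloomFilter API): given an approximate key count from the source node's index summary and a target false-positive rate, the bit and hash counts follow directly.

```java
public class BloomFilterSizing {
    // Optimal bit count for n keys at false-positive rate p: m = -n*ln(p)/ln(2)^2
    static long optimalBits(long numKeys, double falsePositiveRate) {
        return (long) Math.ceil(-numKeys * Math.log(falsePositiveRate)
                                / (Math.log(2) * Math.log(2)));
    }

    // Optimal hash-function count: k = (m/n)*ln(2)
    static int optimalHashes(long bits, long numKeys) {
        return Math.max(1, (int) Math.round((double) bits / numKeys * Math.log(2)));
    }

    public static void main(String[] args) {
        long approxKeys = 1_000_000L;   // approximation sent in the stream setup
        long bits = optimalBits(approxKeys, 0.01);
        System.out.println(bits + " bits, " + optimalHashes(bits, approxKeys) + " hashes");
    }
}
```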

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141215 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 19:41:46 2011
New Revision: 1141215

URL: http://svn.apache.org/viewvc?rev=1141215&view=rev
Log:
Merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:41:46 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1141129
+/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:41:46 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:41:46 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:41:46 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1141129,1141213
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654

svn commit: r1141220 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 19:49:30 2011
New Revision: 1141220

URL: http://svn.apache.org/viewvc?rev=1141220&view=rev
Log:
merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/CHANGES.txt
cassandra/branches/cassandra-0.8/contrib/   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AntiEntropyService.java

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:49:30 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213
+/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213,1141217
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1141220&r1=1141219&r2=1141220&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun 29 19:49:30 2011
@@ -86,6 +86,8 @@
  * fix race that could result in Hadoop writer failing to throw an
exception encountered after close() (CASSANDRA-2755)
  * fix scan wrongly throwing assertion error (CASSANDRA-2653)
+ * Always use even distribution for merkle tree with RandomPartitionner
+   (CASSANDRA-2841)
 
 
 0.8.0-final

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:49:30 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213,1141217
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:49:30 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213,1141217
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:49:30 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213,1141217
 

svn commit: r1141221 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-06-29 Thread slebresne
Author: slebresne
Date: Wed Jun 29 19:50:42 2011
New Revision: 1141221

URL: http://svn.apache.org/viewvc?rev=1141221&view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:50:42 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213
+/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213,1141217
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1141221&r1=1141220&r2=1141221&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Wed Jun 29 19:50:42 2011
@@ -104,6 +104,8 @@
  * fix race that could result in Hadoop writer failing to throw an
exception encountered after close() (CASSANDRA-2755)
  * fix scan wrongly throwing assertion error (CASSANDRA-2653)
+ * Always use even distribution for merkle tree with RandomPartitionner
+   (CASSANDRA-2841)
 
 
 0.8.0-final

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:50:42 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213,1141217
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:50:42 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213,1141217
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java

buildbot failure in ASF Buildbot on cassandra-trunk

2011-06-29 Thread buildbot
The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1406

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1141221
Blamelist: slebresne

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057437#comment-13057437
 ] 

T Jake Luciani commented on CASSANDRA-2388:
---

I don't think we should require the TT to be running locally. The whole idea is 
to support access to Cassandra data from Hadoop even if it's just an import. 

This patch does spend a lot of time dealing with non-local data for that 
reason. 

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.
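A minimal sketch of the behavior being requested (hypothetical names; the real record reader deals with Thrift connections and split metadata): try each candidate location for a split in turn, failing only once every replica is unreachable.

```java
import java.util.List;

public class SplitReader {
    interface Connector { void connect(String host) throws Exception; }

    // Returns the first location that accepts a connection; only gives up
    // on the split when every candidate replica has failed.
    static String connectToAnyReplica(List<String> locations, Connector c) {
        Exception last = null;
        for (String host : locations) {
            try { c.connect(host); return host; }   // first reachable replica wins
            catch (Exception e) { last = e; }       // remember failure, try the next one
        }
        throw new RuntimeException("no replica reachable for split", last);
    }
}
```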

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2475) Prepared statements

2011-06-29 Thread Rick Shaw (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057440#comment-13057440
 ] 

Rick Shaw commented on CASSANDRA-2475:
--


{quote}
I really don't see what is so complex about apply(parsed_tokens_list, 
parameters) vs apply(saved_queries.get(parsed_tokens_id), parameters).
{quote}

Given that the design has such a stream I agree completely. Not complex at all. 
Hence my statement:

{quote}
Even simple statements would be parsed down to the stream of tokens; it would 
just be executed immediately and then tossed, as opposed to being cached and 
returned to the caller.
{quote}

I think we are in agreement on the need for such a precompiled item, and given 
that it needs to exist anyway, we might as well have only one ANTLR parser and 
use its product for both simple and prepared statements.
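The shared-parser idea can be sketched as follows (hypothetical names; token streams shown as plain string lists): prepared statements cache the parser's output under an id and re-apply it with new parameters, while a simple statement would use the same output once and discard it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class StatementCache {
    private final Map<Integer, List<String>> savedQueries = new HashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Prepared path: keep the parsed token stream, hand back an id.
    int prepare(List<String> parsedTokens) {
        int id = nextId.incrementAndGet();
        savedQueries.put(id, parsedTokens);
        return id;
    }

    // Later executions retrieve the cached token stream by id;
    // a simple statement would skip the cache and apply the tokens directly.
    List<String> get(int id) {
        return savedQueries.get(id);
    }
}
```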

 Prepared statements
 ---

 Key: CASSANDRA-2475
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057451#comment-13057451
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

bq. That's bad though, because then we can't access hints efficiently on a node 
up/down message (we actually did it that way in 0.6 and learned our lesson.)

Good point. I retract that idea. :-)

bq. So we denormalize but we gain not having to do 
secondary-lookup-per-mutation, which is our main motivation for the change. 
(And single-destination-per-hint is by far the common case.)

I'm a bit confused here. There could be many mutations for a single key; we'd 
need to store each of them. I do like the idea of being able to slice the 
mutations though. Perhaps we could form the key from a compound of the 
key-table-cf, so it would look something like this:
{noformat}
Hints: {// cf
  dest ip: {  // key
key-table-cf: {   // super-column
  version: mutation // column
}
  }
}
{noformat}

Or is it vital that the key is stored separately from the table and cf?

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take byte[] and send to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner

2011-06-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057458#comment-13057458
 ] 

Hudson commented on CASSANDRA-2841:
---

Integrated in Cassandra-0.7 #518 (See 
[https://builds.apache.org/job/Cassandra-0.7/518/])
Always use even distribution for merkle tree with RandomPartitionner
patch by slebresne; reviewed by jbellis for CASSANDRA-2841

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1141217
Files : 
* 
/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AntiEntropyService.java
* /cassandra/branches/cassandra-0.7/CHANGES.txt


 Always use even distribution for merkle tree with RandomPartitionner
 

 Key: CASSANDRA-2841
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: repair
 Fix For: 0.7.7, 0.8.2

 Attachments: 2841.patch


 When creating the initial merkle tree, repair tries to be (too) smart and use 
 the key samples to guide the tree splitting. While this is a good idea for 
 OPP, where there is a good chance the data distribution is uneven, you can't 
 beat an even distribution for the RandomPartitioner. And a quick experiment 
 even shows that the method used is significantly less efficient than an even 
 distribution for the ranges of the merkle tree (that is, an even distribution 
 gives a much better distribution of the number of keys by range of the 
 tree).
 Thus let's switch to an even distribution for the RandomPartitioner. That 
 three-line change alone amounts to a significant improvement of repair's 
 precision.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
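As a rough illustration of the change described above, a sketch (standalone code, not Cassandra's actual AntiEntropyService): with the RandomPartitioner the merkle-tree ranges can simply be equal-width slices of the token space, here assumed to be the 0..2^127 MD5 token space. The class and method names are illustrative.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: evenly split the RandomPartitioner token space [0, 2^127]
// into N equal-width merkle-tree ranges, instead of guiding the split
// by key samples.
public class EvenSplit {
    static final BigInteger MAX_TOKEN = BigInteger.ONE.shiftLeft(127);

    // Returns the N+1 boundary tokens of N equal-width ranges.
    static List<BigInteger> evenBoundaries(int n) {
        List<BigInteger> bounds = new ArrayList<>();
        BigInteger width = MAX_TOKEN.divide(BigInteger.valueOf(n));
        for (int i = 0; i <= n; i++)
            // use the exact maximum for the last boundary to avoid rounding loss
            bounds.add(i == n ? MAX_TOKEN : width.multiply(BigInteger.valueOf(i)));
        return bounds;
    }

    public static void main(String[] args) {
        for (BigInteger b : evenBoundaries(4))
            System.out.println(b);
    }
}
```

Every range covers the same fraction of the token space, so with the RandomPartitioner each range holds roughly the same number of keys, which is what improves repair's precision.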




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057461#comment-13057461
 ] 

Brandon Williams commented on CASSANDRA-2388:
-

{quote}
This is making the presumption that the hadoop cluster is only used with CFIF.
The TT could still be useful for other jobs submitted.
{quote}

I'm fine with that assumption.  If you want to run other jobs, use a different 
cluster.  Cassandra's JVM is wasting memory at that point.

{quote}
Furthermore a blacklisted TT doesn't automatically come back - it needs to be 
manually restarted. Isn't this creating more headache for operations?
{quote}

I don't think this is actually the case; see HADOOP-4305.


{quote}
I don't think we should require the TT to be running locally. The whole idea is 
to support access to Cassandra data from hadoop even if it's just an import.

This patch does spend a lot of time dealing with non-local data for that reason.
{quote}

I'm fine with dropping support for non-colocated TTs, or at least saying 
there's no DC-specific support.  Frankly, transferring the data across the 
network all the time is a very suboptimal thing to do and flies in the face of 
Hadoop's core principles.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057466#comment-13057466
 ] 

Jonathan Ellis edited comment on CASSANDRA-2045 at 6/29/11 8:58 PM:


oops, didn't look too closely to what I was pasting.

{noformat}
Hints: {// cf
  dest ip: {  // key
uuid: {   // super-column
  table: table// columns
  key: key
  mutation: mutation  
}
  }
}
{noformat}

(Mutations can contain multiple CFs so storing a single CF value wouldn't make 
sense.)
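The layout sketched above could be modeled with plain Java collections, roughly as follows (class and method names are illustrative, not Cassandra's API): one row per destination endpoint, one super-column per hint keyed by UUID, whose columns carry the table, row key and serialized mutation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Sketch of the proposed hint storage layout using nested maps:
// dest ip -> hint uuid -> {table, key, mutation}
public class HintLayout {
    static final Map<String, Map<UUID, Map<String, byte[]>>> hints = new HashMap<>();

    // Store a serialized mutation for a down endpoint; no secondary
    // lookup is needed later, just forward the bytes on delivery.
    static UUID addHint(String destIp, byte[] table, byte[] key, byte[] mutation) {
        UUID id = UUID.randomUUID();
        Map<String, byte[]> columns = new HashMap<>();
        columns.put("table", table);
        columns.put("key", key);
        columns.put("mutation", mutation);
        hints.computeIfAbsent(destIp, ip -> new HashMap<>()).put(id, columns);
        return id;
    }
}
```

Because the super-column key is a UUID rather than key-table-cf, a mutation spanning several column families still fits in a single hint.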

  was (Author: jbellis):
oops, didn't look too closely to what I was pasting.

{noformat}
Hints: {// cf
  dest ip: {  // key
uuid: {   // super-column
  table: table// columns
  key: key
  mutation: mutation  
}
  }
}
{noformat}

(Mutations can contain multiple CFs so storing a single CF value wouldn't make 
sense.)
  
 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally 
 instead. We wouldn't have to do any lookups, just take the byte[] and send it 
 to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057466#comment-13057466
 ] 

Jonathan Ellis commented on CASSANDRA-2045:
---

oops, didn't look too closely to what I was pasting.

{noformat}
Hints: {// cf
  dest ip: {  // key
uuid: {   // super-column
  table: table// columns
  key: key
  mutation: mutation  
}
  }
}
{noformat}

(Mutations can contain multiple CFs so storing a single CF value wouldn't make 
sense.)

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally 
 instead. We wouldn't have to do any lookups, just take the byte[] and send it 
 to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057467#comment-13057467
 ] 

Jonathan Ellis commented on CASSANDRA-2388:
---

bq. a blacklisted TT doesn't automatically come back

tlipcon says it comes back after 24h, fwiw.  In any case it's still the case 
that we DO want to blacklist it while it's down.  (Brisk could perhaps add a 
"clear my tasktracker on restart" operation as a further enhancement.)

bq. I'm fine with dropping support for non-colocated TTs

+1, it was a bad idea and I'm sorry I wrote it. :)

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057470#comment-13057470
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 9:18 PM:
---

bq. tlipcon says it comes back after 24h
Just to be clear about my concerns: 
this means a dead c* node will bring down a TT. In a hadoop cluster with 3 
nodes this means for 24hrs you've lost 33% of your throughput. (If less than 
10% of hadoop jobs used CFIF I could well imagine some pissed users.) (What if 
you have a temporary problem with flapping c* nodes and you end up with a 
handful of blacklisted TTs? etc etc etc)

All this when using a replica, any replica, could have kept things going 
smoothly, the only slowdown being that some of the data into CFIF had to go 
over the network instead...


  was (Author: michaelsembwever):
bq. tlipcon says it comes back after 24h
just to be clear about my concerns. 
this means a dead c* node will bring down a TT. In a hadoop cluster with 3 
nodes this means for 24hrs you're lost 33% throughput. (If less than 10% of 
hadoop jobs used CFIF i could well imagine some pissed customers). (What if you 
have a temporarily problem with flapping c* nodes and you end up with a handful 
of blacklisted TTs? etc etc etc).

All this when using a replica, any replica, could have kept things going 
smoothly, the only slowdown being some of the data into CFIF had to go over the 
network instead...

  
 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057470#comment-13057470
 ] 

Mck SembWever commented on CASSANDRA-2388:
--

bq. tlipcon says it comes back after 24h
Just to be clear about my concerns: 
this means a dead c* node will bring down a TT. In a hadoop cluster with 3 
nodes this means for 24hrs you've lost 33% of your throughput. (If less than 
10% of hadoop jobs used CFIF I could well imagine some pissed customers.) (What 
if you have a temporary problem with flapping c* nodes and you end up with a 
handful of blacklisted TTs? etc etc etc)

All this when using a replica, any replica, could have kept things going 
smoothly, the only slowdown being that some of the data into CFIF had to go 
over the network instead...


 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

2011-06-29 Thread Benjamin Coverston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: 1608-v8.txt

First the good:

1. Modified the code s.t. tombstone purge during minor compactions use the 
interval tree to prune the list of SSTables speeding up compactions by at least 
an order of magnitude where the number of SSTables in a column family exceeds 
~500.

2. Tested reads and writes. Write speeds (unsurprisingly) are not affected by 
this compaction strategy. Reads seem to keep up as well. The interval tree does 
a good job here making sure that bloom filters are queried only for those 
SSTables that fall into the queried range.

3. Three successive runs of stress inserting 10M keys resulted in ~3GB of data 
stored in leveldb. By comparison, the same run using the tiered (default) 
strategy resulted in ~8GB of data.

The Meh:

Compactions do back up when setting the flush size to 64MB and the leveled 
SSTable size to anywhere between 5-10MB. On the upside, if your load has peaks 
and quieter times this compaction strategy will trigger a periodic check to 
catch up if all event-scheduled compactions complete.

Interestingly this extra IO has an upside. For datasets that frequently 
overwrite old data that has already been flushed to disk there is the potential 
for substantial de-duplication of data. Further, during reads the number of 
rows that would need to be merged for a single row is bound by the number of 
levels + the number of un-leveled sstables.

 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Benjamin Coverston
 Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt, 
 1608-v3.txt, 1608-v4.txt, 1608-v5.txt, 1608-v7.txt, 1608-v8.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower priority maintenance on SSTable's not often accessed.
 I also propose we limit the size of each SSTable to a fixed size, which gives 
 us the ability to better utilize our bloom filters in a predictable manner. 
 At the moment after a certain size, the bloom filters become less reliable. 
 This would also allow us to group data most accessed. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Nicholas Telford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057534#comment-13057534
 ] 

Nicholas Telford commented on CASSANDRA-2045:
-

Ok, that makes sense. I've implemented this in my tree (albeit with an 
additional version column to store the serialization version). I won't post 
the patch yet as I need to go through it all and ensure it's correct and clean 
it up a little.

As an aside: while digging into RowMutationSerializer, I noticed that the 
version passed to deserialize() is ignored - is this intentional?

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally 
 instead. We wouldn't have to do any lookups, just take the byte[] and send it 
 to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057568#comment-13057568
 ] 

Jonathan Ellis commented on CASSANDRA-2388:
---

bq. this means a dead c* node will bring down a TT

Again: _this is what you want to happen_.  As long as the C* process on the 
same node is down, you want the TT to be blacklisted and the jobs to go 
elsewhere.

bq. In a hadoop cluster with 3 nodes this means for 24hrs you're lost 33% 
throughput

Right, but the real cause is that the C* process is dead, not that the TT is 
blacklisted.  Making the TT read from other nodes will only hurt your network, 
not fix the throughput problem, because i/o is the bottleneck.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.7.6, 0.8.0
Reporter: Eldon Stegall
Assignee: Jeremy Hanna
  Labels: hadoop, inputformat
 Fix For: 0.7.7, 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057569#comment-13057569
 ] 

Jonathan Ellis commented on CASSANDRA-2045:
---

bq. I noticed that the version passed to deserialize() is ignored - is this 
intentional

Just means RM serialization hasn't changed since we started versioning the 
protocol.

 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
Assignee: Nicholas Telford
 Fix For: 1.0

 Attachments: 
 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 
 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 
 0003-Fixed-some-coding-style-issues.patch, 
 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 
 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 
 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 
 CASSANDRA-2045-simplify-hinted-handoff-001.diff, 
 CASSANDRA-2045-simplify-hinted-handoff-002.diff


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally 
 instead. We wouldn't have to do any lookups, just take the byte[] and send it 
 to the destination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1141353 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/db/Table.java

2011-06-29 Thread jbellis
Author: jbellis
Date: Thu Jun 30 00:59:12 2011
New Revision: 1141353

URL: http://svn.apache.org/viewvc?rev=1141353&view=rev
Log:
allow deleting and inserting into an indexed row in the same mutation
patch by jbellis; reviewed by slebresne and tested by Jim Ancona for 
CASSANDRA-2773

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1141353&r1=1141352&r2=1141353&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Jun 30 00:59:12 2011
@@ -29,6 +29,8 @@
  * fix scan wrongly throwing assertion error (CASSANDRA-2653)
  * Always use even distribution for merkle tree with RandomPartitionner
(CASSANDRA-2841)
+ * allow deleting a row and updating indexed columns in it in the
+   same mutation (CASSANDRA-2773)
 
 
 0.7.6

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java?rev=1141353&r1=1141352&r2=1141353&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java 
(original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java 
Thu Jun 30 00:59:12 2011
@@ -429,7 +429,16 @@ public class Table
 ByteBuffer name = iter.next();
 IColumn newColumn = cf.getColumn(name); // null == row delete or it wouldn't be marked Mutated
 if (newColumn != null && cf.isMarkedForDelete())
-throw new UnsupportedOperationException("Index manager cannot support deleting and inserting into a row in the same mutation");
+{
+// row is marked for delete, but column was also updated.  if column is timestamped less than
+// the row tombstone, treat it as if it didn't exist.  Otherwise we don't care about row
+// tombstone for the purpose of the index update and we can proceed as usual.
+if (newColumn.timestamp() <= cf.getMarkedForDeleteAt())
+{
+// don't remove from the cf object; that can race w/ CommitLog write.  Leaving it is harmless.
+newColumn = null;
+}
+}
 IColumn oldColumn = oldIndexedColumns.getColumn(name);
 
 // deletions are irrelevant to the index unless we're changing 
state from live - deleted, i.e.,
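The rule the patch above introduces can be sketched in isolation (standalone code, not the actual Table.java): when a row deletion and a column update land in the same mutation, the column is treated as nonexistent for index purposes whenever its timestamp is less than or equal to the row tombstone's.

```java
// Sketch of the tombstone-vs-column comparison from the patch above.
// Names are illustrative, not Cassandra's API.
public class TombstoneRule {
    // True if the updated column should still be indexed despite the
    // row tombstone in the same mutation.
    static boolean columnSurvives(long columnTimestamp, long markedForDeleteAt) {
        // timestamp <= tombstone  =>  the row delete shadows the column
        return columnTimestamp > markedForDeleteAt;
    }

    public static void main(String[] args) {
        System.out.println(columnSurvives(1L, 2L)); // shadowed by the tombstone
        System.out.println(columnSurvives(4L, 3L)); // survives, newer than the tombstone
    }
}
```

This matches the two unit tests added in r1141354: insert-then-delete (timestamps 1, 2) leaves no indexed row, while delete-then-insert (timestamps 3, 4) leaves one.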




svn commit: r1141354 - /cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java

2011-06-29 Thread jbellis
Author: jbellis
Date: Thu Jun 30 01:02:33 2011
New Revision: 1141354

URL: http://svn.apache.org/viewvc?rev=1141354&view=rev
Log:
add additional tests for #2773
patch by Jim Ancona; reviewed by jbellis for CASSANDRA-2773

Modified:

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java

Modified: 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java?rev=1141354&r1=1141353&r2=1141354&view=diff
==
--- 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
 Thu Jun 30 01:02:33 2011
@@ -314,6 +314,24 @@ public class ColumnFamilyStoreTest exten
 rm.apply();
 rows = cfs.scan(clause, range, filter);
 assert rows.isEmpty() : StringUtils.join(rows, ",");
+
+// try insert followed by row delete in the same mutation
+rm = new RowMutation("Keyspace3", ByteBufferUtil.bytes("k1"));
+rm.add(new QueryPath("Indexed1", null, ByteBufferUtil.bytes("birthdate")), ByteBufferUtil.bytes(1L), 1);
+rm.delete(new QueryPath("Indexed1"), 2);
+rm.apply();
+rows = cfs.scan(clause, range, filter);
+assert rows.isEmpty() : StringUtils.join(rows, ",");
+
+// try row delete followed by insert in the same mutation
+rm = new RowMutation("Keyspace3", ByteBufferUtil.bytes("k1"));
+rm.delete(new QueryPath("Indexed1"), 3);
+rm.add(new QueryPath("Indexed1", null, ByteBufferUtil.bytes("birthdate")), ByteBufferUtil.bytes(1L), 4);
+rm.apply();
+rows = cfs.scan(clause, range, filter);
+assert rows.size() == 1 : StringUtils.join(rows, ",");
+key = new String(rows.get(0).key.key.array(), rows.get(0).key.key.position(), rows.get(0).key.key.remaining());
+assert "k1".equals(key);
 }
 
 @Test




[jira] [Resolved] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation

2011-06-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2773.
---

   Resolution: Fixed
Fix Version/s: 0.7.7

committed.  Thanks, Jim!

 Index manager cannot support deleting and inserting into a row in the same 
 mutation
 -

 Key: CASSANDRA-2773
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Boris Yen
Assignee: Jonathan Ellis
Priority: Critical
 Fix For: 0.7.7, 0.8.2

 Attachments: 2773-v2.txt, 2773.txt, cassandra.log, 
 v1-0001-allow-deleting-a-rowand-updating-indexed-columns-init-.txt, 
 v1-0002-CASSANDRA-2773-Add-unit-tests-to-verfy-fix-cherry-pick.txt


 I use hector 0.8.0-1 and cassandra 0.8.
 1. Create a mutator by using the hector api.
 2. Insert a few columns into the mutator for key "key1", cf "standard".
 3. Add a deletion to the mutator to delete the record of "key1", cf 
 "standard".
 4. Repeat 2 and 3.
 5. Execute the mutator.
 The result: the connection seems to be held by the server forever; it never 
 returns. When I tried to restart cassandra I saw "UnsupportedOperationException: 
 Index manager cannot support deleting and inserting into a row in the same 
 mutation", and cassandra is dead forever unless I delete the commitlog. 
 I would expect to get an exception when I execute the mutator, not after I 
 restart cassandra.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2819) Split rpc timeout for read and write ops

2011-06-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057575#comment-13057575
 ] 

Jonathan Ellis commented on CASSANDRA-2819:
---

- read_repair is a write and should use that timeout
- should distinguish between multirow (range/index) reads and single-row lookups
- REQUEST_RESPONSE should drop based on what kind of query the request being 
responded to was
- ExpiringMap should use an expiration of max(read, read multirow, write)
- which means the REQUEST_RESPONSE drop-messages block isn't entirely redundant 
wrt ExpiringMap, but if it's too difficult to look up what the message type was 
it's probably not a big deal to ignore it
- DD.getRpcTimeout should be removed and replaced with the appropriate op 
timeout.  If there are any internal operations that rely on rpctimeout (can't 
think of any that do) then we may want to add an internal timeout as well.  (Or 
it may be evidence of a bug and we should fix it.)
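A minimal sketch of the split the comments above call for (all names and timeout values are illustrative, not Cassandra's actual configuration): independent per-operation timeouts, with the response-tracking expiration sized to the largest of them so no pending reply expires early.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch: replace a single rpc_timeout with independent timeouts per
// operation kind, as the review comments suggest.
public class RpcTimeouts {
    enum Op { READ, RANGE_READ, WRITE }

    static final Map<Op, Long> TIMEOUT_MS = new EnumMap<>(Op.class);
    static {
        TIMEOUT_MS.put(Op.READ, 5_000L);        // single-row lookup
        TIMEOUT_MS.put(Op.RANGE_READ, 10_000L); // multirow (range/index) read
        TIMEOUT_MS.put(Op.WRITE, 2_000L);       // mutation (read_repair uses this too)
    }

    // ExpiringMap must keep callbacks at least as long as the slowest op.
    static long expiringMapTimeoutMs() {
        return TIMEOUT_MS.values().stream().mapToLong(Long::longValue).max().orElse(0L);
    }
}
```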

 Split rpc timeout for read and write ops
 

 Key: CASSANDRA-2819
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2819
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Stu Hood
Assignee: Melvin Wang
 Fix For: 1.0

 Attachments: rpc-rw-timeouts.patch


 Given the vastly different latency characteristics of reads and writes, it 
 makes sense for them to have independent rpc timeouts internally.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable

2011-06-29 Thread Terje Marthinussen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057614#comment-13057614
 ] 

Terje Marthinussen commented on CASSANDRA-2521:
---

In releaseReference(), if holdReferences for some reason drops below 0, maybe 
the code should do an assert or throw an exception (or anything else that 
gives a stack trace)?

This should help debugging some error scenarios with reference mismatches.
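That suggestion could look roughly like this (a standalone sketch, not the actual SSTableReader code; names are illustrative): decrement the count atomically and fail loudly, with a stack trace, the moment it goes negative.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of reference counting that surfaces release/acquire mismatches
// immediately instead of silently going negative.
public class RefCounted {
    private final AtomicInteger references = new AtomicInteger(0);

    void acquireReference() {
        references.incrementAndGet();
    }

    void releaseReference() {
        int count = references.decrementAndGet();
        if (count < 0)
            // throwing here yields a stack trace pointing at the extra release
            throw new IllegalStateException("reference count went negative: " + count);
    }
}
```

An extra releaseReference() then fails at the call site that caused the mismatch, which is exactly the debugging aid the comment asks for.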

 Move away from Phantom References for Compaction/Memtable
 -

 Key: CASSANDRA-2521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Sylvain Lebresne
 Fix For: 1.0

 Attachments: 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 
 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 
 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt


 http://wiki.apache.org/cassandra/MemtableSSTable
 Let's move to using reference counting instead of relying on GC to be called 
 in StorageService.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira