[jira] [Commented] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds
[ https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057016#comment-13057016 ] Yuki Morishita commented on CASSANDRA-2840: --- Duplicate of CASSANDRA-2599. cassandra-cli describe keyspace shows confusing memtable thresholds --- Key: CASSANDRA-2840 URL: https://issues.apache.org/jira/browse/CASSANDRA-2840 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.5 Environment: linux rackspace instance 4cpu/4G Reporter: Lanny Ripple Priority: Minor The 'describe keyspace' output mixes up the labels for minutes and MB in the Memtable thresholds output. Example from our ring: Memtable thresholds: 0.2859375/61/1440 (millions of ops/minutes/MB) We use minutes=1440 and MB=61. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
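The report boils down to a legend whose label order does not match the order of the printed values. A minimal illustrative sketch (all class and method names here are hypothetical, not the actual cassandra-cli code):

```java
// Sketch of the mislabeling described in CASSANDRA-2840: the values are
// printed as ops/MB/minutes, but the legend claims ops/minutes/MB.
public class MemtableThresholds {
    // Values as reported by the CLI in the ticket's example.
    static final double MILLIONS_OF_OPS = 0.2859375;
    static final int THROUGHPUT_MB = 61;
    static final int FLUSH_AFTER_MINUTES = 1440;

    // Buggy legend: label order does not follow the value order.
    static String buggy() {
        return String.format("%s/%d/%d (millions of ops/minutes/MB)",
                             MILLIONS_OF_OPS, THROUGHPUT_MB, FLUSH_AFTER_MINUTES);
    }

    // Corrected legend: labels follow the value order.
    static String fixed() {
        return String.format("%s/%d/%d (millions of ops/MB/minutes)",
                             MILLIONS_OF_OPS, THROUGHPUT_MB, FLUSH_AFTER_MINUTES);
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + buggy());
        System.out.println("fixed: " + fixed());
    }
}
```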
[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002 ] Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 6:31 AM: --- This does happen already (I've seen it while testing initial patches that were no good). The problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. I bet too that a fallback to a replica is faster than a fallback to another TT. On a side note, there is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed job may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity. One thing we're not doing here is applying this same DC limit and sort by proximity in the case when there isn't a localhost preference. See CFRR.initialize(..) It would make sense to rewrite CFRR.getLocations(..) to {noformat}private Iterator<String> getLocations(final Configuration conf) throws IOException { return new SplitEndpointIterator(conf); }{noformat} and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator... was (Author: michaelsembwever): This does happen already (I've seen it while testing initial patches that were no good). The problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. I bet too that a fallback to a replica is faster than a fallback to another TT. For example a c* node may die in the middle of a TT... ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
- Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057002#comment-13057002 ] Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 7:49 AM: --- - This does happen already (I've seen it while testing initial patches that were no good). The problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. I bet too that a fallback to a replica is faster than a fallback to another TT. - There is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed task may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity. - One thing we're not doing here is applying this same DC limit and sort by proximity in the case when there isn't a localhost preference. See CFRR.initialize(..) It would make sense to rewrite CFRR.getLocations(..) to {noformat}private Iterator<String> getLocations(final Configuration conf) throws IOException { return new SplitEndpointIterator(conf); }{noformat} and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator... - A bug I can see in the patch that did get accepted already is in CassandraServer.java:763: when endpointValid is false and restrictToSameDC is true we end up restricting to a random DC. I could fix this so restrictToSameDC is disabled in such situations, but this actually invalidates the previous point: we can't restrict to DC anymore and we can only sortByProximity to a random node... I think this supports Jonathan's point that it's overall a poor approach. I'm more and more in favour of my original approach using just client.getDatacenter(..) and not worrying about proximity within the datacenter. - Another bug is that, contrary to my patch, the code committed bq. committed with a change to use the dynamic snitch if the passed endpoint is valid.
can call {{DynamicEndpointSnitch.sortByProximity(..)}} with an address that is not localhost and this breaks the assertion in the method. was (Author: michaelsembwever): - This does happen already (I've seen it while testing initial patches that were no good). The problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. I bet too that a fallback to a replica is faster than a fallback to another TT. - There is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed task may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity. - One thing we're not doing here is applying this same DC limit and sort by proximity in the case when there isn't a localhost preference. See CFRR.initialize(..) It would make sense to rewrite CFRR.getLocations(..) to {noformat}private Iterator<String> getLocations(final Configuration conf) throws IOException { return new SplitEndpointIterator(conf); }{noformat} and then to move the finding-a-preference-to-localhost code into SplitEndpointIterator... - A bug I can see in the patch that did get accepted already is in CassandraServer.java:763: when endpointValid is false and restrictToSameDC is true we end up restricting to a random DC. I can fix this so restrictToSameDC is disabled in such situations. This actually invalidates the previous point: we can't restrict to DC anymore and we can only sortByProximity to a random node... I think this supports Jonathan's point that it's overall a poor approach. I'm more and more in favour of my original approach using just client.getDatacenter(..) and not worrying about proximity within the datacenter. - Another bug is that, contrary to my patch, the code committed bq. committed with a change to use the dynamic snitch if the passed endpoint is valid.
can call {{DynamicEndpointSnitch.sortByProximity(..)}} with an address that is not localhost and this breaks the assertion in the method. ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch,
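The {{SplitEndpointIterator}} proposed in the comments above was never shown in full, so here is a minimal sketch of the idea under stated assumptions: the class name comes from the comment, but the constructor signature and the endpoint source are hypothetical (the real proposal would pull endpoints from the Hadoop Configuration). The point is only that a task walks a split's replica endpoints, localhost first, so a dead node means falling back to another replica instead of failing the task.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the proposed SplitEndpointIterator: iterate a
// split's replica endpoints in preference order so a failed read falls
// back to another replica rather than blacklisting the TaskTracker.
public class SplitEndpointIterator implements Iterator<String> {
    private final Iterator<String> endpoints;

    // In the real proposal the endpoints would come from the job
    // Configuration; here they are passed in directly for clarity.
    public SplitEndpointIterator(List<String> replicaEndpoints) {
        // A localhost replica, if present, is tried first (CFRR's existing preference).
        this.endpoints = replicaEndpoints.stream()
                .sorted((a, b) -> Boolean.compare(!a.equals("127.0.0.1"),
                                                  !b.equals("127.0.0.1")))
                .iterator();
    }

    @Override public boolean hasNext() { return endpoints.hasNext(); }
    @Override public String next() { return endpoints.next(); }

    public static void main(String[] args) {
        Iterator<String> it = new SplitEndpointIterator(
                Arrays.asList("10.0.1.5", "127.0.0.1", "10.0.2.7"));
        while (it.hasNext())
            System.out.println(it.next());
    }
}
```

A reader using this would call next() again only when the current endpoint throws, which is exactly the fallback-to-replica behaviour the comment argues is cheaper than falling back to another TaskTracker.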
[jira] [Commented] (CASSANDRA-2475) Prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057082#comment-13057082 ] Michal Augustýn commented on CASSANDRA-2475: It would be great to have this overload, in order to eliminate one client-server roundtrip: {noformat}CqlResult execute_cql_query(1:binary query, 2:list<binary> parameters, 3:Compression compression);{noformat} In many applications there are just a few queries (max. hundreds?), so I think the _handle_ could be cached server-side (we could limit the cache size via configuration). And do you/we plan to support named parameters? Prepared statements --- Key: CASSANDRA-2475 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Labels: cql Fix For: 1.0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
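The size-limited server-side handle cache suggested in the comment can be sketched with an access-ordered LinkedHashMap. All names here are hypothetical and the "preparation" step is stubbed out; this is not Cassandra's implementation, only an illustration of bounding the cache via configuration with LRU eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded server-side prepared-statement cache:
// query text maps to its prepared form, with LRU eviction once the
// configured maximum number of entries is exceeded.
public class PreparedStatementCache {
    private final Map<String, String> cache;

    public PreparedStatementCache(final int maxEntries) {
        // accessOrder=true makes the LinkedHashMap iterate least-recently-used
        // first, so removeEldestEntry gives simple LRU eviction.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // "Preparing" is stubbed; a real server would parse the CQL once here.
    public String prepare(String query) {
        String prepared = cache.get(query);
        if (prepared == null) {
            prepared = "parsed:" + query;
            cache.put(query, prepared);
        }
        return prepared;
    }

    public int size() { return cache.size(); }

    public static void main(String[] args) {
        PreparedStatementCache c = new PreparedStatementCache(2);
        c.prepare("SELECT * FROM users WHERE id = ?");
        c.prepare("SELECT * FROM events WHERE day = ?");
        c.prepare("SELECT * FROM logs WHERE ts > ?"); // evicts the least recently used
        System.out.println(c.size()); // bounded at 2
    }
}
```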
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057189#comment-13057189 ] Nicholas Telford commented on CASSANDRA-2045: - What if the coordinator happens to be one of the replicas for that key? Having the coordinator store the hint would mean it wasn't replicated at the replication_factor. The same is true for a coordinator that's not a replica for a key, but has to store a hint for multiple nodes (i.e. when multiple replicas are down). I don't like this; I was under the impression that HintedHandoff helps to retain the replication factor even in the face of failed replicas. Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for a long period of time) we should store the data we want to forward to the node locally instead. We wouldn't have to do any lookups, just take the byte[] and send it to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
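The improvement the ticket describes reduces to this: while the target is down, keep the serialized mutation itself rather than a key, so delivery on recovery is a blind byte[] forward with no local read. A minimal sketch, with hypothetical types (this is not Cassandra's HintedHandOffManager):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch of hint storage as proposed in CASSANDRA-2045:
// store the serialized mutation bytes while the target node is down,
// then replay them verbatim on recovery with no local row lookup.
public class HintQueue {
    private final Queue<byte[]> hints = new ArrayDeque<>();

    // Called while the target node is down: keep the serialized mutation.
    public void storeHint(byte[] serializedMutation) {
        hints.add(serializedMutation);
    }

    // Called when the target comes back: no read, just forward bytes.
    public int deliver(Consumer<byte[]> send) {
        int delivered = 0;
        while (!hints.isEmpty()) {
            send.accept(hints.poll());
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        HintQueue q = new HintQueue();
        q.storeHint("row-mutation-1".getBytes(StandardCharsets.UTF_8));
        q.storeHint("row-mutation-2".getBytes(StandardCharsets.UTF_8));
        System.out.println(q.deliver(bytes -> {})); // prints 2
    }
}
```

Telford's objection above still applies to this sketch: if the hint lives only on the coordinator, the data is temporarily held at fewer than replication_factor copies until delivery completes.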
[jira] [Commented] (CASSANDRA-2383) log4j unable to load properties file from classpath
[ https://issues.apache.org/jira/browse/CASSANDRA-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057223#comment-13057223 ] David Allsopp commented on CASSANDRA-2383: -- Yes, that'll teach me to post code late at night :-( log4j unable to load properties file from classpath --- Key: CASSANDRA-2383 URL: https://issues.apache.org/jira/browse/CASSANDRA-2383 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.7.4 Environment: OS : windows java : 1.6.0.23 Reporter: david lee Assignee: T Jake Luciani Priority: Minor Fix For: 0.7.7 when cassandra home folder is placed inside a folder which has space characters in its name, log4j settings are not properly loaded and warning messages are shown. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds
[ https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057227#comment-13057227 ] Lanny Ripple commented on CASSANDRA-2840: - Thanks! I searched around before opening the ticket but guess I wasn't thorough enough. Regards, -ljr cassandra-cli describe keyspace shows confusing memtable thresholds --- Key: CASSANDRA-2840 URL: https://issues.apache.org/jira/browse/CASSANDRA-2840 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.5 Environment: linux rackspace instance 4cpu/4G Reporter: Lanny Ripple Priority: Minor The 'describe keyspace' output mixes up the labels for minutes and MB in the Memtable thresholds output. Example from our ring: Memtable thresholds: 0.2859375/61/1440 (millions of ops/minutes/MB) We use minutes=1440 and MB=61. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1141129 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/db/ColumnFamilyStore.java
Author: slebresne
Date: Wed Jun 29 15:15:29 2011
New Revision: 1141129

URL: http://svn.apache.org/viewvc?rev=1141129&view=rev
Log: Fix scan wrongly throwing assertion errors; patch by slebresne; reviewed by jbellis for CASSANDRA-2653

Modified:
    cassandra/branches/cassandra-0.7/CHANGES.txt
    cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1141129&r1=1141128&r2=1141129&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jun 29 15:15:29 2011
@@ -26,6 +26,7 @@
  * fix race that could result in Hadoop writer failing to throw an exception
    encountered after close() (CASSANDRA-2755)
  * fix CLI parsing of read_repair_chance (CASSANDRA-2837)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)

 0.7.6

Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1141129&r1=1141128&r2=1141129&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java Wed Jun 29 15:15:29 2011
@@ -1486,6 +1486,16 @@ public class ColumnFamilyStore implement
         return rows;
     }

+    private NamesQueryFilter getExtraFilter(IndexClause clause)
+    {
+        SortedSet<ByteBuffer> columns = new TreeSet<ByteBuffer>(getComparator());
+        for (IndexExpression expr : clause.expressions)
+        {
+            columns.add(expr.column_name);
+        }
+        return new NamesQueryFilter(columns);
+    }
+
     public List<Row> scan(IndexClause clause, AbstractBounds range, IFilter dataFilter)
     {
         // Start with the most-restrictive indexed clause, then apply remaining clauses
@@ -1502,50 +1512,33 @@ public class ColumnFamilyStore implement
         // it needs to be expanded to include those too
         IFilter firstFilter = dataFilter;
         NamesQueryFilter extraFilter = null;
-        if (clause.expressions.size() > 1)
+        if (dataFilter instanceof SliceQueryFilter)
         {
-            if (dataFilter instanceof SliceQueryFilter)
+            // if we have a high chance of getting all the columns in a single index slice, do that.
+            // otherwise, we'll create an extraFilter (lazily) to fetch by name the columns referenced by the additional expressions.
+            if (getMaxRowSize() < DatabaseDescriptor.getColumnIndexSize())
             {
-                // if we have a high chance of getting all the columns in a single index slice, do that.
-                // otherwise, create an extraFilter to fetch by name the columns referenced by the additional expressions.
-                if (getMaxRowSize() < DatabaseDescriptor.getColumnIndexSize())
-                {
-                    logger.debug("Expanding slice filter to entire row to cover additional expressions");
-                    firstFilter = new SliceQueryFilter(ByteBufferUtil.EMPTY_BYTE_BUFFER,
-                                                       ByteBufferUtil.EMPTY_BYTE_BUFFER,
-                                                       ((SliceQueryFilter) dataFilter).reversed,
-                                                       Integer.MAX_VALUE);
-                }
-                else
-                {
-                    logger.debug("adding extraFilter to cover additional expressions");
-                    SortedSet<ByteBuffer> columns = new TreeSet<ByteBuffer>(getComparator());
-                    for (IndexExpression expr : clause.expressions)
-                    {
-                        if (expr == primary)
-                            continue;
-                        columns.add(expr.column_name);
-                    }
-                    extraFilter = new NamesQueryFilter(columns);
-                }
+                logger.debug("Expanding slice filter to entire row to cover additional expressions");
+                firstFilter = new SliceQueryFilter(ByteBufferUtil.EMPTY_BYTE_BUFFER,
+                                                   ByteBufferUtil.EMPTY_BYTE_BUFFER,
+                                                   ((SliceQueryFilter) dataFilter).reversed,
+                                                   Integer.MAX_VALUE);
             }
-            else
+        }
+        else
+        {
+            logger.debug("adding columns to firstFilter to cover additional expressions");
+            // just add in columns that are not part of the resultset
+            assert dataFilter instanceof NamesQueryFilter;
+            SortedSet<ByteBuffer>
[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057295#comment-13057295 ] Terje Marthinussen commented on CASSANDRA-2521: --- I have not found any further major issues here, but I think there are still situations where files are deleted late, though they do seem to go away. Not sure if we are missing something in terms of reference counting (and GC deletes them eventually) or it is just a delayed free or delete for some reason, but it does not happen too often. I will try to add some debug logging and see what I find. I was looking at the code though and I am wondering about one segment. I have not had time to actually test this, but in submitUserDefined() there is a finally statement removing References for sstables, yet I could not immediately see where References are acquired for all the sstables that need to be freed there. I am sure it's just me missing something, but anyway... Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
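The reference-counting scheme this ticket moves toward can be sketched as follows. The class and method names are hypothetical (not the committed patch): readers acquire a reference before using an SSTable, and the file is deleted only when it has been marked compacted and the count drops to zero, instead of waiting for a GC-driven phantom reference.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of SSTable lifetime via reference counting.
// The table starts with one "live" reference held by the store;
// compaction releases it, and actual deletion waits for active readers.
public class RefCountedSSTable {
    private final AtomicInteger references = new AtomicInteger(1); // the "live" reference
    private volatile boolean deleted = false;

    // Readers call this before using the table; refuse once it is gone.
    public boolean acquire() {
        while (true) {
            int n = references.get();
            if (n <= 0) return false;
            if (references.compareAndSet(n, n + 1)) return true;
        }
    }

    public void release() {
        if (references.decrementAndGet() == 0)
            deleted = true; // a real implementation would delete the files here
    }

    // Compaction drops the "live" reference; deletion happens when readers finish.
    public void markCompacted() { release(); }

    public boolean isDeleted() { return deleted; }

    public static void main(String[] args) {
        RefCountedSSTable t = new RefCountedSSTable();
        t.acquire();                       // an active reader
        t.markCompacted();                 // compaction done; reader still running
        System.out.println(t.isDeleted()); // false: reader holds a reference
        t.release();                       // reader done
        System.out.println(t.isDeleted()); // true: count hit zero, file removed
    }
}
```

Under this sketch, the submitUserDefined() question above becomes: every release in a finally block must be paired with a successful acquire, or the count goes negative and files disappear early or late.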
svn commit: r1141134 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/
Author: slebresne
Date: Wed Jun 29 15:26:29 2011
New Revision: 1141134

URL: http://svn.apache.org/viewvc?rev=1141134&view=rev
Log: merge from 0.7

Modified:
    cassandra/branches/cassandra-0.8/   (props changed)
    cassandra/branches/cassandra-0.8/CHANGES.txt
    cassandra/branches/cassandra-0.8/contrib/   (props changed)
    cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java   (props changed)
    cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java   (props changed)
    cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java   (props changed)
    cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java   (props changed)
    cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java   (props changed)
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/branches/cassandra-0.8/
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1141134&r1=1141133&r2=1141134&view=diff
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun 29 15:26:29 2011
@@ -85,6 +85,7 @@
  * Fix wrong purge of deleted cf during compaction (CASSANDRA-2786)
  * fix race that could result in Hadoop writer failing to throw an exception
   encountered after close() (CASSANDRA-2755)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)
 0.8.0-final

Propchange: cassandra/branches/cassandra-0.8/contrib/
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:26:29 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
[jira] [Commented] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057298#comment-13057298 ] Jim Ancona commented on CASSANDRA-2773: --- We have deployed and tested 0.7.6 plus this patch on the affected cluster. The cluster restarted successfully, and the tests that caused the original failure ran successfully. In addition, functional tests of our applications show no regressions. I also reviewed the Cassandra system logs after the testing and saw no errors or obvious problems. Index manager cannot support deleting and inserting into a row in the same mutation - Key: CASSANDRA-2773 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.0 Reporter: Boris Yen Assignee: Jonathan Ellis Priority: Critical Fix For: 0.8.2 Attachments: 2773-v2.txt, 2773.txt, cassandra.log, v1-0001-allow-deleting-a-rowand-updating-indexed-columns-init-.txt, v1-0002-CASSANDRA-2773-Add-unit-tests-to-verfy-fix-cherry-pick.txt I use hector 0.8.0-1 and cassandra 0.8. 1. Create a mutator by using the hector api. 2. Insert a few columns into the mutator for key key1, cf standard. 3. Add a deletion to the mutator to delete the record of key1, cf standard. 4. Repeat 2 and 3. 5. Execute the mutator. The result: the connection seems to be held by the server forever, it never returns. When I tried to restart Cassandra I saw an unsupported exception: Index manager cannot support deleting and inserting into a row in the same mutation, and Cassandra is dead forever unless I delete the commitlog. I would expect to get an exception when I execute the mutator, not after I restart Cassandra. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057305#comment-13057305 ] Sylvain Lebresne commented on CASSANDRA-2521: - bq. Not sure if we are missing something in terms of reference counting and GC delete it eventually If you don't use mmap, the GC shouldn't do anything. So if it is deleted eventually, it would indicate some place where the last decrement is delayed longer than it needs to be somehow. It'd be interesting if you find more. bq. in submitUserDefined() there is a finally statement removing References for sstables but I could not immediately see where there are References acquired It's in lookupSSTables (a private method used by submitUserDefined). I admit it's not super clean, but it felt like the simplest way to do this in a thread-safe manner without holding unneeded references for too long. bq. Only thing missing beyond that is to get this into 0.8. I really don't think this would be reasonable. This is not a trivial change by any means, nor does it fix a regression. Which is not to say it wouldn't make life much easier. But I'm really uncomfortable pushing that to 0.8. Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
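For readers following this thread, the acquire/release scheme under discussion can be sketched roughly as follows. This is an illustrative Java sketch, not Cassandra's actual classes; SSTableRef and its methods are made-up names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of reference-counting an sstable: each reader takes a
// reference before using the file and drops it when done; the file is
// "deleted" only when the last reference (including the initial live one
// held until compaction retires the sstable) is released.
class SSTableRef {
    private final AtomicInteger references = new AtomicInteger(1); // 1 = the "live" reference
    private volatile boolean deleted = false;

    /** Try to take a reference; fails if the sstable is already gone. */
    boolean acquire() {
        while (true) {
            int n = references.get();
            if (n <= 0)
                return false; // already released for good
            if (references.compareAndSet(n, n + 1))
                return true;
        }
    }

    /** Drop a reference; the last release triggers deletion. */
    void release() {
        if (references.decrementAndGet() == 0)
            deleted = true; // real code would delete the file here
    }

    boolean isDeleted() { return deleted; }
}
```

Readers bracket their work with acquire()/release(); compaction drops the initial live reference via release() once the sstable is obsolete, so deletion happens exactly when the last user finishes, with no reliance on GC.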
svn commit: r1141138 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/
Author: slebresne
Date: Wed Jun 29 15:48:50 2011
New Revision: 1141138

URL: http://svn.apache.org/viewvc?rev=1141138&view=rev
Log: merge from 0.8

Modified:
    cassandra/trunk/   (props changed)
    cassandra/trunk/CHANGES.txt
    cassandra/trunk/contrib/   (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java   (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java   (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java   (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java   (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java   (props changed)
    cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/trunk/
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567
+/cassandra/branches/cassandra-0.7:1026516-1140567,1141129
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1141138&r1=1141137&r2=1141138&view=diff
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Wed Jun 29 15:48:50 2011
@@ -103,6 +103,7 @@
  * Fix wrong purge of deleted cf during compaction (CASSANDRA-2786)
  * fix race that could result in Hadoop writer failing to throw an exception
   encountered after close() (CASSANDRA-2755)
+ * fix scan wrongly throwing assertion error (CASSANDRA-2653)
 0.8.0-final

Propchange: cassandra/trunk/contrib/
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134
 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 15:48:50 2011
@@ -1,7 +1,7 @@
[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057307#comment-13057307 ] Hudson commented on CASSANDRA-2653: --- Integrated in Cassandra-0.7 #517 (See [https://builds.apache.org/job/Cassandra-0.7/517/]) Fix scan wrongly throwing assertion errors. Patch by slebresne; reviewed by jbellis for CASSANDRA-2653. slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1141129 Files : * /cassandra/branches/cassandra-0.7/CHANGES.txt * /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6, 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.7, 0.8.2 Attachments: 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt As reported by Tyler Hobbs as an addendum to CASSANDRA-2401: {noformat}
ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1'
	at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
	at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2816) Repair doesn't synchronize merkle tree creation properly
[ https://issues.apache.org/jira/browse/CASSANDRA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057339#comment-13057339 ] Terje Marthinussen commented on CASSANDRA-2816: --- This is what the heap looks like when GC starts slowing things down so much that even gossip gets delayed long enough for nodes to be seen as down for some seconds:
 num  #instances      #bytes  class name
----------------------------------------
   1:    9453188   453753024  java.nio.HeapByteBuffer
   2:   10081546   392167064  [B
   3:    7616875       24374  org.apache.cassandra.db.Column
   4:    9739914   233757936  java.util.concurrent.ConcurrentSkipListMap$Node
   5:    4131938    99166512  java.util.concurrent.ConcurrentSkipListMap$Index
   6:    1549230    49575360  org.apache.cassandra.db.DeletedColumn
I guess this really ends up being the mix of everything going on in total, and all the reading and writing that may occur when repair runs (validation compactions, streaming, normal compactions and regular traffic all at the same time, and maybe many CFs at the same time). However, I have suspected for some time that our young generation size was a bit on the small side, and after increasing it and giving the heap a few more GB to work with, it seems like things are behaving quite a bit better. I mentioned issues with this patch when testing for CASSANDRA-2521. That was a problem caused by me: I was playing around with git for the first time and managed to apply 2816 to a different branch than the one I used for testing :( My apologies. Initial testing with that corrected looks a lot better for my small-scale test case, but I noticed one time where I deleted an sstable and restarted, and it did not get repaired (repair scanned but did nothing). Not entirely sure what to make of that; I then tried deleting another sstable and repair started running. I will test more over the next days.
Repair doesn't synchronize merkle tree creation properly Key: CASSANDRA-2816 URL: https://issues.apache.org/jira/browse/CASSANDRA-2816 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.2 Attachments: 0001-Schedule-merkle-tree-request-one-by-one.patch Being a little slow, I just realized, after having opened CASSANDRA-2811 and CASSANDRA-2815, that there is a more general problem with repair. When a repair is started, it sends a number of merkle tree requests to its neighbors as well as to itself, and assumes, for correctness, that the building of those trees will be started on every node at roughly the same time (if not, we end up comparing data snapshots taken at different times and will thus mistakenly repair a lot of useless data). This is bogus for many reasons: * Because validation compaction runs on the same executor as other compactions, the start of the validation on the different nodes is subject to other compactions. 0.8 mitigates this in a way by being multi-threaded (and thus there is less chance of being blocked a long time by a long-running compaction), but the compaction executor being bounded, it's still a problem. * If you run a nodetool repair without arguments, it will repair every CF. As a consequence it will generate lots of merkle tree requests, and all of those requests will be issued at the same time. Because even in 0.8 the compaction executor is bounded, some of those validations will end up being queued behind the first ones. Even assuming that the different validations are submitted in the same order on each node (which isn't guaranteed either), there is no guarantee that on all nodes the first validation will take the same time, hence desynchronizing the queued ones. 
Overall, it is important for the precision of repair that for a given CF and range (which is the unit at which trees are computed), we make sure that all nodes will start the validation at the same time (or, since we can't do magic, as close as possible). One (reasonably simple) proposition to fix this would be to have repair schedule validation compactions across nodes one by one (i.e., one CF/range at a time), waiting for all nodes to return their tree before submitting the next request. Then on each node, we should make sure that the node will start the validation compaction as soon as requested. For that, we probably want to have a specific executor for validation compaction and: * either we fail the whole repair whenever one node is not able to execute the validation compaction right away (because no threads are available right away), * or we simply tell the user that if he starts too many repairs
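The one-CF/range-at-a-time scheduling proposed above can be sketched as follows. This is an illustrative Java sketch with hypothetical names (TreeRequest, SequentialRepairScheduler), not Cassandra's repair code: for each CF/range, fan the tree request out to all nodes and block until every tree is back before issuing the next request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of sequential validation scheduling: one merkle-tree request per
// (CF, range) at a time, so every node validates the same data at roughly
// the same moment.
class SequentialRepairScheduler {
    interface TreeRequest {
        String send(String node, String cfRange) throws Exception; // returns the node's tree
    }

    static List<String> repair(List<String> cfRanges, List<String> nodes, TreeRequest req)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
        List<String> trees = new ArrayList<>();
        try {
            for (String cfRange : cfRanges) {
                // fan out one request per node for this CF/range...
                List<Future<String>> pending = new ArrayList<>();
                for (String node : nodes)
                    pending.add(pool.submit(() -> req.send(node, cfRange)));
                // ...and block until every tree is back before the next CF/range
                for (Future<String> f : pending)
                    trees.add(f.get());
            }
        } finally {
            pool.shutdown();
        }
        return trees;
    }
}
```

The `f.get()` barrier between iterations is the whole point: no node ever has two outstanding validations queued behind each other, which is what desynchronizes the snapshots in the current behaviour.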
[jira] [Commented] (CASSANDRA-2838) Query indexed column with key filter
[ https://issues.apache.org/jira/browse/CASSANDRA-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057345#comment-13057345 ] Jonathan Ellis commented on CASSANDRA-2838: --- 1600 is about map/reduce, this is about CQL. Query indexed column with key filter --- Key: CASSANDRA-2838 URL: https://issues.apache.org/jira/browse/CASSANDRA-2838 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Danny Wang To be able to support a query like this: (KEY > foo AND KEY < bar AND name1 = value1). Currently I found this code {noformat}
// Start and finish keys, *and* column relations (KEY > foo AND KEY < bar and name1 = value1).
if (select.isKeyRange() && (select.getKeyFinish() != null) && (select.getColumnRelations().size() > 0))
    throw new InvalidRequestException("You cannot combine key range and by-column clauses in a SELECT");
{noformat} in http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057348#comment-13057348 ] Jonathan Ellis commented on CASSANDRA-2045: --- bq. I was under the impression that HintedHandoff helps to retain the replication factor even in the face of failed replicas. Nope. If you require N replicas to be written, then you should use an appropriate consistency level. In 0.6+ hints are stored to other live replicas whenever possible (i.e. you still have fewer total replicas written) unless, as you noted, no replicas are alive and you're writing at CL.ANY. So my point is that after we move away from storing hints as pointers to row data, there's no reason for the "prefer other replicas" optimization, so we might as well just always store the hint on the coordinator. Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for a long period of time) we should store the data we want to forward to the node locally instead. We wouldn't have to do any lookups, just take the byte[] and send it to the destination. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
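The simplification described in this ticket, storing the serialized mutation itself as the hint, can be sketched like this. Names here (HintQueue, store, deliver) are illustrative, not Cassandra's HintedHandoffManager API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Sketch: the hint IS the serialized mutation, so replaying hints for a
// recovered node is just "send these bytes", with no local row lookup.
class HintQueue {
    private final Map<String, Deque<byte[]>> hints = new HashMap<>();

    /** Record a hint for a down target: the bytes to forward, nothing more. */
    void store(String target, byte[] serializedMutation) {
        hints.computeIfAbsent(target, t -> new ArrayDeque<>()).add(serializedMutation);
    }

    /** Replay all hints for a node that just came back; no reads required. */
    int deliver(String target, Consumer<byte[]> send) {
        Deque<byte[]> queue = hints.remove(target);
        if (queue == null)
            return 0;
        queue.forEach(send); // just hand each byte[] to the network layer
        return queue.size();
    }
}
```

Contrast with the current behaviour the ticket describes: a hint stored as a pointer forces a local read per row at delivery time, which is exactly the read load this change removes.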
[jira] [Resolved] (CASSANDRA-2840) cassandra-cli describe keyspace shows confusing memtable thresholds
[ https://issues.apache.org/jira/browse/CASSANDRA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2840. --- Resolution: Duplicate cassandra-cli describe keyspace shows confusing memtable thresholds --- Key: CASSANDRA-2840 URL: https://issues.apache.org/jira/browse/CASSANDRA-2840 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.5 Environment: linux rackspace instance 4cpu/4G Reporter: Lanny Ripple Priority: Minor The 'describe keyspace' output seems to be mixing up the labeling for minutes and MB for Memtable thresholds output. Example off our ring: Memtable thresholds: 0.2859375/61/1440 (millions of ops/minutes/MB) We use minutes=1440 and MB=61. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2475) Prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057352#comment-13057352 ] Jonathan Ellis commented on CASSANDRA-2475: --- bq. I suggest that these token streams could be small enough that they could be the returned value from the Prepare call and relieve the server side from the maintenance and accounting hassle of keeping track of them. It's really no hassle. We already encapsulate per-connection state in the ClientState object. And parsing the tokens from the client on each request is going to generate a ton of GC churn... Prepared statements --- Key: CASSANDRA-2475 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Labels: cql Fix For: 1.0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
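The server-side approach described in this comment can be sketched as follows. ClientState here is a simplified stand-in for Cassandra's class: it stores the statement text where the real code would store a parsed AST, and the client holds only an opaque integer handle, so nothing crosses the wire or gets re-parsed per request.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-connection prepared-statement state: parse once at prepare
// time, cache server-side, execute later by handle.
class ClientState {
    private final Map<Integer, String> prepared = new HashMap<>(); // handle -> parsed statement
    private int nextHandle = 0;

    /** Parse once, cache in this connection's state, hand back an opaque id. */
    int prepare(String cql) {
        int handle = nextHandle++;
        prepared.put(handle, cql.trim()); // real code would cache the parsed form, not text
        return handle;
    }

    /** Execute by handle; no token stream from the client, no re-parsing. */
    String execute(int handle) {
        String statement = prepared.get(handle);
        if (statement == null)
            throw new IllegalStateException("unknown prepared statement: " + handle);
        return statement;
    }
}
```

Tying the cache to the connection is also what makes Rick's objection concrete: the handle is only good for the connection's lifetime, whereas a returned token stream would survive reconnects at the cost of re-transmission and GC churn on every execute.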
[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2653: Attachment: 0001-Fix-scan-issue.patch Actually, after having committed it, I realize there are a few issues with the previous patch. Two, mainly:
# If the extraFilter query finds nothing (which it will only do in the case of the race between writes and reads), getColumnFamily() will return null and the data.addAll() will NPE.
# For 0.8, and for counters, we must make really sure that this extra query won't add columns that were returned by the first query (which can happen in the current code), otherwise we'll overcount. I think this is actually a bug that predates the fix for this issue.
Anyway, attaching 0001-Fix-scan-issue.patch, which fixes both of those issues. It also adds a slight optimization that avoids doing extra work if we know an extra query won't help. index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6, 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.7, 0.8.2 Attachments: 0001-Fix-scan-issue.patch, 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt As reported by Tyler Hobbs as an addendum to CASSANDRA-2401: {noformat}
ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1'
	at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
	at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
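The two issues in Sylvain's update boil down to a guarded merge of the extra-filter results into the first query's results. A minimal illustration, with hypothetical names rather than the real ColumnFamilyStore code (plain maps stand in for column families):

```java
import java.util.Map;

// Sketch of merging an extraFilter result into the primary result:
// issue 1: the extra query may find nothing (null), so guard before merging;
// issue 2: never re-add a column the first query already returned, which
// would make counter columns overcount.
class ExtraFilterMerge {
    static Map<String, String> merge(Map<String, String> data, Map<String, String> extra) {
        if (extra == null)
            return data; // extra query raced with a write and found nothing: no NPE
        for (Map.Entry<String, String> e : extra.entrySet())
            data.putIfAbsent(e.getKey(), e.getValue()); // skip columns already present
        return data;
    }
}
```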
[jira] [Commented] (CASSANDRA-2475) Prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057375#comment-13057375 ] Rick Shaw commented on CASSANDRA-2475: -- I guess I was a bit too vague... My suggestion would be to return the pre-parsed token stream, not something you would re-parse every time it is re-submitted. It is the same item I think you suggest we cache on the server side. I think the interesting twist is that, using the suggested method, the pre-parsed item (or prepared statement) could be used days later in a different connection. It would be an immutable resource. If it is cached server side, it is only good for the connection's life. But it's just a suggestion. I understand the merits of retaining complete control of the format over time, and the efficiencies of passing the handle back and forth. And I was not familiar with the ClientState at all. More troubling is the batch semantics... I hate the idea of disrupting the current syntax in CQL, but I think the parameter substitution step will be very fragile if there is not a notion of lists of items that are tightly coupled with their respective handle's parameters in the batch. The thought of thousands of rows worth of entries in a batch, and getting the parameters right for a giant array/list of parameters that fill into the pre-compiled tokens, seems fraught with problems. How does the repeating nature get expressed? Currently it is very concrete and can be parsed into a mutation on the fly. But if it is pre-parsed, what syntax represents the concept of repetition? Is the syntax different for the prepared statement vs. the simple (not prepared) statement as today? The crafters of the JDBC driver specification seem to have been faced with the same problem. Their solution was to have a batch method as well as execute methods that take an array/list of prepared statements. 
Unsolved for us is how to recognize the notion of mutation start/Mutation end using that approach. Maybe you just do a prepare call for BEGIN BATCH and APPLY BATCH and use them in the list sent via the batch method? Prepared statements --- Key: CASSANDRA-2475 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Labels: cql Fix For: 1.0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057392#comment-13057392 ] Brandon Williams edited comment on CASSANDRA-2388 at 6/29/11 6:41 PM: -- {quote} This does happen already (i've seen it while testing initial patches that were no good). Problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. {quote} If the cassandra node where the TT resides isn't working, then throughput is reduced regardless. bq. I bet too that a fallback to a replica is faster than a fallback to another TT. I doubt that for any significant job. Locality is important. Move the job to the data, not the data to the job. {quote} There is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed task may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity. {quote} This sounds like the thing we need to fix, then: ensuring that the TT assigned to the map has a local replica. was (Author: brandon.williams): {quote} This does happen already (i've seen it while testing initial patches that were no good). Problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running. {quote} If the cassandra node where the TT resides isn't working, then throughput is reduced regardless. bq. I bet too that a fallback to a replica is faster than a fallback to another TT. I doubt that for any significant job. Locality is important. {quote} There is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed task may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity. {quote} This sounds like the thing we need to fix, then: ensuring that the TT assigned to the map has a local replica. 
ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057392#comment-13057392 ] Brandon Williams commented on CASSANDRA-2388:
---
{quote}
This does happen already (i've seen it while testing initial patches that were no good). Problem is that the TT is blacklisted, reducing hadoop's throughput for all jobs running.
{quote}
If the cassandra node where the TT resides isn't working, then throughput is reduced regardless.
bq. I bet too that a fallback to a replica is faster than a fallback to another TT.
I doubt that for any significant job. Locality is important.
{quote}
There is no guarantee that any given TT will have its split accessible via a local c* node - this is only a preference in CFRR. A failed task may just as likely go to a random c* node. At least now we can actually properly limit to the one DC and sort by proximity.
{quote}
This sounds like the thing we need to fix, then: ensuring that the TT assigned to the map has a local replica.
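The fallback being debated can be sketched as a loop over a split's replica endpoints: use the first one that answers instead of failing the whole task. This is an illustrative sketch only — the `Prober` interface and `firstLiveEndpoint` name are assumptions standing in for opening a Thrift connection in CFRR, not its actual API.

```java
import java.util.List;

// Sketch: try each replica endpoint of a split in turn, instead of failing
// the task when the first one is down. All names here are illustrative.
public class SplitReplicaFallback
{
    // Stand-in for "can we open a connection to this endpoint?"
    interface Prober { boolean connect(String endpoint); }

    static String firstLiveEndpoint(List<String> replicas, Prober prober)
    {
        for (String endpoint : replicas)
            if (prober.connect(endpoint)) // on failure, fall through to the next replica
                return endpoint;
        throw new RuntimeException("no replica for this split is reachable");
    }
}
```

A usage sketch: with replicas `["10.0.0.1", "10.0.0.2"]` and the first node down, the task would still run against the second.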
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057398#comment-13057398 ] Nicholas Telford commented on CASSANDRA-2045:
---
bq. It looks like we do a query per hint to look up its version on replay? I think we can avoid that (one of the benefits of the new approach is we should be able to just do seq reads of a hint row on replay). Why not just add version in as another subcolumn of the hint entry?
I don't quite follow this. The new schema for hints doesn't really allow sequential reads of the row. Here's what I currently have:
{noformat}
Old:
Hints: {              // cf
  dest ip: {          // key
    key: {            // super-column
      table-cf: null  // column
    }
  }
}

New:
Hints: {              // cf
  dest ip: {          // key
    key: {            // super-column
      table-cf: id    // column
    }
  }
}
HintedMutations: {        // cf
  dest ip: {              // key
    id: {                 // super-column
      version: mutation   // column
    }
  }
}
{noformat}
The point was to retain backwards compatibility with the old Hints (so we don't have to expunge old ones on upgrade), but if we feel that we gain more by breaking this compatibility I'm open to it. As has been previously mentioned, losing hints during upgrade isn't the end of the world as they're little more than an optimization.
Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for a long period of time) we should store the data we want to forward to the node locally instead. We wouldn't have to do any lookups, just take the byte[] and send it to the destination.
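For illustration, the two hint layouts under discussion can be modeled as nested maps: the old layout stores only a pointer back to the live row (forcing a read per hint on replay), while the proposed one stores the serialized mutation itself. This is a sketch of the idea only — types are simplified to Strings and byte arrays, and names like `replayNew` are hypothetical, not Cassandra's API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the hint layouts discussed in CASSANDRA-2045.
public class HintLayouts
{
    // old: destIp -> rowKey -> "table-cf" marker; the row data must be
    // looked up on the local node at replay time (extra read per hint)
    static Map<String, Map<String, String>> oldHints = new HashMap<>();

    // new: destIp -> hintId -> serialized RowMutation; replay needs no lookup
    static Map<String, Map<String, byte[]>> newHints = new HashMap<>();

    static byte[] replayNew(String destIp, String hintId)
    {
        return newHints.get(destIp).get(hintId); // just forward the bytes
    }
}
```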
svn commit: r1141194 [2/2] - in /cassandra/trunk: conf/ src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/cassandra/
Modified: cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/tools/BulkLoader.java Wed Jun 29 18:55:50 2011
@@ -184,7 +184,7 @@ public class BulkLoader
         StorageService.instance.initClient();
         Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
-        hosts.remove(FBUtilities.getLocalAddress());
+        hosts.remove(FBUtilities.getBroadcastAddress());
         if (hosts.isEmpty())
             throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java Wed Jun 29 18:55:50 2011
@@ -61,6 +61,7 @@ public class FBUtilities
     public static final BigInteger TWO = new BigInteger("2");
     private static volatile InetAddress localInetAddress_;
+    private static volatile InetAddress broadcastInetAddress_;
     private static final ThreadLocal<MessageDigest> localMD5Digest = new ThreadLocal<MessageDigest>()
     {
@@ -129,6 +130,15 @@ public class FBUtilities
         return localInetAddress_;
     }
+    public static InetAddress getBroadcastAddress()
+    {
+        if (broadcastInetAddress_ == null)
+            broadcastInetAddress_ = DatabaseDescriptor.getBroadcastAddress() == null
+                                  ? getLocalAddress()
+                                  : DatabaseDescriptor.getBroadcastAddress();
+        return broadcastInetAddress_;
+    }
+
     /**
      * @param fractOrAbs A double that may represent a fraction or absolute value.
      * @param total If fractionOrAbs is a fraction, the total to take the fraction from

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/Mx4jTool.java Wed Jun 29 18:55:50 2011
@@ -80,7 +80,7 @@ public class Mx4jTool
     private static String getAddress()
     {
-        return System.getProperty("mx4jaddress", FBUtilities.getLocalAddress().getHostAddress());
+        return System.getProperty("mx4jaddress", FBUtilities.getBroadcastAddress().getHostAddress());
     }
     private static int getPort()

Modified: cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/utils/NodeId.java Wed Jun 29 18:55:50 2011
@@ -102,7 +102,7 @@ public class NodeId implements Comparabl
     public static NodeId generate()
     {
-        return new NodeId(ByteBuffer.wrap(UUIDGen.decompose(UUIDGen.makeType1UUIDFromHost(FBUtilities.getLocalAddress()))));
+        return new NodeId(ByteBuffer.wrap(UUIDGen.decompose(UUIDGen.makeType1UUIDFromHost(FBUtilities.getBroadcastAddress()))));
     }
     /*

Modified: cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java?rev=1141194&r1=1141193&r2=1141194&view=diff
==============================================================================
--- cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java (original)
+++ cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java Wed Jun 29 18:55:50 2011
@@ -164,7 +164,7 @@ public class DefsTest extends CleanupHel
     public void saveAndRestore() throws IOException
     {
         // verify dump and reload.
-        UUID first = UUIDGen.makeType1UUIDFromHost(FBUtilities.getLocalAddress());
+        UUID first = UUIDGen.makeType1UUIDFromHost(FBUtilities.getBroadcastAddress());
         DefsTable.dumpToStorage(first);
         List<KSMetaData> defs = new
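The pattern this commit introduces in FBUtilities — a lazily cached broadcast address that falls back to the local (listen) address when none is configured, e.g. for a node behind NAT — can be sketched in isolation. The config lookup is stubbed with a plain field here; only the fallback logic mirrors the diff above, and the helper names are illustrative.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the broadcast-address fallback pattern from the commit above.
// configuredBroadcastAddress stands in for DatabaseDescriptor.getBroadcastAddress().
public class BroadcastAddress
{
    private static volatile InetAddress broadcastInetAddress;
    static InetAddress configuredBroadcastAddress;

    public static InetAddress getBroadcastAddress()
    {
        if (broadcastInetAddress == null)
            broadcastInetAddress = configuredBroadcastAddress == null
                                 ? localAddress()            // fallback: the node's own address
                                 : configuredBroadcastAddress;
        return broadcastInetAddress;              // cached after first call
    }

    static InetAddress localAddress()
    {
        try { return InetAddress.getLocalHost(); }
        catch (UnknownHostException e) { throw new RuntimeException(e); }
    }

    // helper for numeric IPs (no DNS lookup involved)
    static InetAddress addr(String numericIp)
    {
        try { return InetAddress.getByName(numericIp); }
        catch (UnknownHostException e) { throw new RuntimeException(e); }
    }
}
```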
[Cassandra Wiki] Update of FileFormatDesignDoc by AlanLiang
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The FileFormatDesignDoc page has been changed by AlanLiang: http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=39&rev2=40
Comment: add disk layout for chunk

  }
  }}}
+ == Disk Layout ==
+ 
+ === Chunk ===
+ 
+ || ''name1'' || ''bytes'' ||
+ || magic || 16 ||
+ || encoded_length || 4 ||
+ || encoded || variable ||
+ || hash || 4 ||
+ 
  == Roadmap ==
  Implementation has started on this design at https://github.com/stuhood/cassandra/tree/file-format
[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by AlanLiang
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The FileFormatDesignDoc page has been changed by AlanLiang: http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=40&rev2=41

  === Chunk ===
- || ''name1'' || ''bytes'' ||
+ ||  || ''bytes'' ||
  || magic || 16 ||
  || encoded_length || 4 ||
  || encoded || variable ||
[jira] [Created] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitioner
Always use even distribution for merkle tree with RandomPartitioner Key: CASSANDRA-2841 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Trivial Fix For: 0.7.7, 0.8.2 Attachments: 2841.patch When creating the initial merkle tree, repair tries to be (too) smart and uses the key samples to guide the tree splitting. While this is a good idea for OPP, where there is a good chance the data distribution is uneven, you can't beat an even distribution for the RandomPartitioner. A quick experiment even shows that the method used is significantly less efficient than an even distribution for the ranges of the merkle tree (that is, an even distribution gives a much better distribution of the number of keys per range of the tree). Thus let's switch to an even distribution for the RandomPartitioner. That 3-line change alone accounts for a significant improvement of repair's precision.
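The even split the ticket proposes is straightforward for RandomPartitioner, whose tokens live in [0, 2^127): divide the token space into 2^depth equal ranges. A minimal sketch under that assumption — names like `evenSplit` are illustrative, not the actual repair code:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: even token-range boundaries for a merkle tree of the given depth
// over RandomPartitioner's token space [0, 2^127).
public class EvenTreeSplit
{
    static final BigInteger MAX_TOKEN = BigInteger.valueOf(2).pow(127);

    // Returns the 2^depth + 1 boundaries of 2^depth equal token ranges.
    public static List<BigInteger> evenSplit(int depth)
    {
        BigInteger ranges = BigInteger.valueOf(2).pow(depth);
        BigInteger width = MAX_TOKEN.divide(ranges); // divides exactly: both are powers of two
        List<BigInteger> boundaries = new ArrayList<>();
        for (BigInteger i = BigInteger.ZERO; i.compareTo(ranges) <= 0; i = i.add(BigInteger.ONE))
            boundaries.add(i.multiply(width));
        return boundaries;
    }
}
```

Under random token assignment, equal-width ranges hold roughly equal key counts, which is the precision argument in the ticket.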
[jira] [Updated] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2841:
Attachment: 2841.patch
Patch is against 0.7.
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057401#comment-13057401 ] Nicholas Telford commented on CASSANDRA-2045:
---
bq. if we're storing the full mutation, why add the complexity of hint headers and forwarding? Can we just make the coordinator responsible for all hints instead?
bq. So my point is that after we move away from storing hints as pointers to row data, there's no reason for the prefer other replicas optimization so we might as well just always store it on the coordinator.
While I agree with this, it seems that changing this is non-trivial (lots of changes to StorageProxy by the looks of it), so I'm leaning towards not including it in this ticket. It seems like an isolated idea though, albeit one that depends on this issue. Can we open this as a dependent ticket?
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057409#comment-13057409 ] Nicholas Telford commented on CASSANDRA-2045:
---
Another consideration: If we're moving away from the old hint storage layout, we can optimize for cases where the same RowMutation needs to be delivered to multiple endpoints (i.e. multiple replicas are down). This can be done by moving the destination IP down to the bottom level of the map so each RowMutation maps to multiple destinations. Thoughts?
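The optimization described here — one stored mutation fanned out to several down endpoints — can be sketched as a serialized mutation paired with a shrinking set of destinations. Illustrative types and names only, not a proposed implementation:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Sketch: store the serialized RowMutation once, mapped to every endpoint
// still owed the hint, instead of duplicating it per down replica.
public class SharedHint
{
    final byte[] mutation;                             // serialized RowMutation, stored once
    final Set<String> destinations = new HashSet<>();  // endpoints still owed this hint

    SharedHint(byte[] mutation, Collection<String> downReplicas)
    {
        this.mutation = mutation;
        destinations.addAll(downReplicas);
    }

    // after successful delivery, drop the endpoint; the hint is done when empty
    boolean delivered(String endpoint)
    {
        destinations.remove(endpoint);
        return destinations.isEmpty();
    }
}
```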
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057413#comment-13057413 ] Jonathan Ellis commented on CASSANDRA-2388:
---
bq. If the cassandra node where the TT resides isn't working, then throughput is reduced regardless.
Right: we _want_ it to be blacklisted in that scenario.
[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested
[ https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057415#comment-13057415 ] Jonathan Ellis commented on CASSANDRA-2653:
---
+1
index scan errors out when zero columns are requested - Key: CASSANDRA-2653 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.6, 0.8.0 beta 2 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.7, 0.8.2 Attachments: 0001-Fix-scan-issue.patch, 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 2653_v2.patch, 2653_v3.patch, v1-0001-CASSANDRA-2653-reproduce-regression.txt
As reported by Tyler Hobbs as an addendum to CASSANDRA-2401:
{noformat}
ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
java.lang.AssertionError: No data found for SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] in DecoratedKey(81509516161424251288255223397843705139, 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', columnName='null') (original filter SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) from expression 'cf.626972746864617465 EQ 1'
    at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
    at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
[jira] [Commented] (CASSANDRA-2475) Prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057417#comment-13057417 ] Jonathan Ellis commented on CASSANDRA-2475:
---
My point was you still need to parse it from the socket bytestream. Not re-parsing the raw CQL. I really don't see what is so complex about apply(parsed_tokens_list, parameters) vs apply(saved_queries.get(parsed_tokens_id), parameters).
Prepared statements --- Key: CASSANDRA-2475 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Labels: cql Fix For: 1.0
[jira] [Commented] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057418#comment-13057418 ] Jonathan Ellis commented on CASSANDRA-2841:
---
+1
svn commit: r1141213 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
Author: slebresne
Date: Wed Jun 29 19:36:15 2011
New Revision: 1141213
URL: http://svn.apache.org/viewvc?rev=1141213&view=rev
Log: Fix last issues with 2653

Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1141213&r1=1141212&r2=1141213&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java Wed Jun 29 19:36:15 2011
@@ -1496,6 +1496,13 @@ public class ColumnFamilyStore implement
         return new NamesQueryFilter(columns);
     }
+    private static boolean isIdentityFilter(SliceQueryFilter filter)
+    {
+        return filter.start.equals(ByteBufferUtil.EMPTY_BYTE_BUFFER)
+            && filter.finish.equals(ByteBufferUtil.EMPTY_BYTE_BUFFER)
+            && filter.count == Integer.MAX_VALUE;
+    }
+
     public List<Row> scan(IndexClause clause, AbstractBounds range, IFilter dataFilter)
     {
         // Start with the most-restrictive indexed clause, then apply remaining clauses
@@ -1511,7 +1518,6 @@ public class ColumnFamilyStore implement
         // if the slicepredicate doesn't contain all the columns for which we have expressions to evaluate,
         // it needs to be expanded to include those too
         IFilter firstFilter = dataFilter;
-        NamesQueryFilter extraFilter = null;
         if (dataFilter instanceof SliceQueryFilter)
         {
             // if we have a high chance of getting all the columns in a single index slice, do that.
@@ -1597,23 +1603,36 @@ public class ColumnFamilyStore implement
             if (data == null)
                 data = ColumnFamily.create(metadata);
             logger.debug("fetched data row {}", data);
-            if (dataFilter instanceof SliceQueryFilter)
+            if (dataFilter instanceof SliceQueryFilter && !isIdentityFilter((SliceQueryFilter)dataFilter))
             {
                 // we might have gotten the expression columns in with the main data slice, but
                 // we can't know for sure until that slice is done. So, we'll do the extra query
                 // if we go through and any expression columns are not present.
+                boolean needExtraFilter = false;
                 for (IndexExpression expr : clause.expressions)
                 {
                     if (data.getColumn(expr.column_name) == null)
                     {
                         logger.debug("adding extraFilter to cover additional expressions");
                         // Lazily creating extra filter
-                        if (extraFilter == null)
-                            extraFilter = getExtraFilter(clause);
-                        data.addAll(getColumnFamily(new QueryFilter(dk, path, extraFilter)));
+                        needExtraFilter = true;
                         break;
                     }
                 }
+                if (needExtraFilter)
+                {
+                    NamesQueryFilter extraFilter = getExtraFilter(clause);
+                    for (IndexExpression expr : clause.expressions)
+                    {
+                        if (data.getColumn(expr.column_name) != null)
+                            extraFilter.columns.remove(expr.column_name);
+                    }
+                    assert !extraFilter.columns.isEmpty();
+                    ColumnFamily cf = getColumnFamily(new QueryFilter(dk, path, extraFilter));
+                    if (cf != null)
+                        data.addAll(cf);
+                }
+
             }
             if (satisfies(data, clause, primary))
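The core of the new extra-filter logic in this commit — query only the index-expression columns that the first data slice did not return — reduces to a set difference. A simplified sketch with columns modeled as strings, not the actual ColumnFamilyStore types:

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: given the columns the main slice returned and the columns the index
// expressions need, compute what one extra by-name query must fetch.
public class ExtraFilterSketch
{
    static Set<String> missingExpressionColumns(Set<String> fetched, Set<String> expressionColumns)
    {
        Set<String> missing = new TreeSet<>(expressionColumns);
        missing.removeAll(fetched); // only query what the slice did not cover
        return missing;
    }
}
```

If the result is empty, no extra query is needed at all — the case the old code handled, and which the commit preserves while also skipping identity filters.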
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057419#comment-13057419 ] Jonathan Ellis commented on CASSANDRA-2045:
---
bq. losing hints during upgrade isn't the end of the world
Right. I'm saying we should do this:
{noformat}
Hints: {                 // cf
  dest ip: {             // key
    key: {               // super-column
      table-cf: id       // column
      mutation: mutation // column
    }
  }
}
{noformat}
So we denormalize but we gain not having to do secondary-lookup-per-mutation, which is our main motivation for the change. (And single-destination-per-hint is by far the common case.)
bq. Can we open this as a dependent ticket?
WFM.
[jira] [Issue Comment Edited] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057419#comment-13057419 ] Jonathan Ellis edited comment on CASSANDRA-2045 at 6/29/11 7:36 PM:
---
bq. losing hints during upgrade isn't the end of the world
Right. I'm saying we should do this:
{noformat}
Hints: {                 // cf
  dest ip: {             // key
    key: {               // super-column
      table-cf: id       // column
      mutation: mutation // column
    }
  }
}
{noformat}
So we denormalize but we gain not having to do secondary-lookup-per-mutation, which is our main motivation for the change. (And single-destination-per-hint is by far the common case.)
bq. Can we open this as a dependent ticket?
WFM.

was (Author: jbellis):
bq. losing hints during upgrade isn't the end of the world
Right. I'm saying we should do this:
Hints: {                 // cf
  dest ip: {             // key
    key: {               // super-column
      table-cf: id       // column
      mutation: mutation // column
    }
  }
}
So we denormalize but we gain not having to do secondary-lookup-per-mutation, which is our main motivation for the change. (And single-destination-per-hint is by far the common case.)
bq. Can we open this as a dependent ticket?
WFM.
svn commit: r1141214 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/
Author: slebresne
Date: Wed Jun 29 19:39:55 2011
New Revision: 1141214
URL: http://svn.apache.org/viewvc?rev=1141214&view=rev
Log: Merge from 0.7

Modified:
  cassandra/branches/cassandra-0.8/ (props changed)
  cassandra/branches/cassandra-0.8/contrib/ (props changed)
  cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed)
  cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed)
  cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed)
  cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed)
  cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed)
  cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

Propchange: cassandra/branches/cassandra-0.8/
==============================================================================
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/contrib/
==============================================================================
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
==============================================================================
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
==============================================================================
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
==============================================================================
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun 29 19:39:55 2011
@@ -1,5 +1,5 @@
 /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
[jira] [Updated] (CASSANDRA-2677) Optimize streaming to be single-pass
[ https://issues.apache.org/jira/browse/CASSANDRA-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2677: -- Fix Version/s: (was: 0.8.2) 1.0 Moving to 1.0 b/c of CASSANDRA-2818. Optimize streaming to be single-pass Key: CASSANDRA-2677 URL: https://issues.apache.org/jira/browse/CASSANDRA-2677 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 1.0 Streaming currently is a two-pass operation: one to write the Data component to disk from the socket, then another to build the index and bloom filter from it. This means we do about 2x the i/o we would if we created the index and BF during the original write. For node movement this was not considered to be a Big Deal because the stream target is not a member of the ring, so we can be inefficient without hurting live queries. But optimizing node movement to not require un/rebootstrap (CASSANDRA-1427) and bulk load (CASSANDRA-1278) mean we can stream to live nodes too. The main obstacle here is we don't know how many keys will be in the new sstable ahead of time, which we need in order to size the bloom filter correctly. We can solve this by including that information (or a close approximation) in the stream setup -- the source node can calculate that without hitting disk from the in-memory index summary. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
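The sizing step the ticket describes can be sketched as follows. This is an illustrative calculation of the standard optimal bloom filter parameters from an estimated key count (the figure the source node would ship in the stream setup), not Cassandra's actual implementation; all names are hypothetical.

```java
public class BloomFilterSizing {
    /** Optimal bit count for n keys at false-positive rate p: m = -n * ln(p) / (ln 2)^2. */
    static long optimalBits(long estimatedKeys, double falsePositiveRate) {
        double ln2Squared = Math.log(2) * Math.log(2);
        return (long) Math.ceil(-estimatedKeys * Math.log(falsePositiveRate) / ln2Squared);
    }

    /** Optimal hash-function count for that bit count: k = (m/n) * ln 2. */
    static int optimalHashes(long bits, long estimatedKeys) {
        return Math.max(1, (int) Math.round((double) bits / estimatedKeys * Math.log(2)));
    }

    public static void main(String[] args) {
        // The estimated key count is what would arrive in the stream setup message.
        long estimatedKeys = 1_000_000L;
        long bits = optimalBits(estimatedKeys, 0.01);
        int hashes = optimalHashes(bits, estimatedKeys);
        System.out.println(bits + " bits, " + hashes + " hash functions");
    }
}
```

With the estimate available up front, the target can allocate the filter once and populate index and BF during the single streaming write.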
svn commit: r1141215 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/
Author: slebresne Date: Wed Jun 29 19:41:46 2011 New Revision: 1141215 URL: http://svn.apache.org/viewvc?rev=1141215view=rev Log: Merge from 0.8 Modified: cassandra/trunk/ (props changed) cassandra/trunk/contrib/ (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:41:46 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7:1026516-1140567,1141129 +/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213 /cassandra/branches/cassandra-0.7.0:1053690-1055654 -/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134 +/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214 /cassandra/branches/cassandra-0.8.0:1125021-1130369 /cassandra/branches/cassandra-0.8.1:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 Propchange: cassandra/trunk/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:41:46 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 -/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129 +/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134 +/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:41:46 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 -/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134 +/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369 /cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:41:46 2011 @@ -1,7 +1,7 @@ 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1141129 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1141129,1141213 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
svn commit: r1141220 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/
Author: slebresne Date: Wed Jun 29 19:49:30 2011 New Revision: 1141220 URL: http://svn.apache.org/viewvc?rev=1141220view=rev Log: merge from 0.7 Modified: cassandra/branches/cassandra-0.8/ (props changed) cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/contrib/ (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AntiEntropyService.java Propchange: cassandra/branches/cassandra-0.8/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:49:30 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213 +/cassandra/branches/cassandra-0.7:1026516-1140567,1140928,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0:1053690-1055654 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041 /cassandra/branches/cassandra-0.8.0:1125021-1130369 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1141220r1=1141219r2=1141220view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun 29 19:49:30 2011 @@ -86,6 +86,8 @@ * fix race that could result in Hadoop writer failing to throw an exception encountered after close() 
(CASSANDRA-2755) * fix scan wrongly throwing assertion error (CASSANDRA-2653) + * Always use even distribution for merkle tree with RandomPartitionner + (CASSANDRA-2841) 0.8.0-final Propchange: cassandra/branches/cassandra-0.8/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:49:30 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 -/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213 +/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1140928,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369 Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:49:30 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1140928,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369 Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:49:30 2011 
@@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1140567,1140928,1141129,1141213,1141217
svn commit: r1141221 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/
Author: slebresne Date: Wed Jun 29 19:50:42 2011 New Revision: 1141221 URL: http://svn.apache.org/viewvc?rev=1141221view=rev Log: merge from 0.8 Modified: cassandra/trunk/ (props changed) cassandra/trunk/CHANGES.txt cassandra/trunk/contrib/ (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:50:42 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213 +/cassandra/branches/cassandra-0.7:1026516-1140567,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0:1053690-1055654 -/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214 +/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220 /cassandra/branches/cassandra-0.8.0:1125021-1130369 /cassandra/branches/cassandra-0.8.1:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 Modified: cassandra/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1141221r1=1141220r2=1141221view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Wed Jun 29 19:50:42 2011 @@ -104,6 +104,8 @@ * fix race that could result in Hadoop writer failing to throw an exception encountered after close() (CASSANDRA-2755) * fix 
scan wrongly throwing assertion error (CASSANDRA-2653) + * Always use even distribution for merkle tree with RandomPartitionner + (CASSANDRA-2841) 0.8.0-final Propchange: cassandra/trunk/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:50:42 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 -/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213 +/cassandra/branches/cassandra-0.7/contrib:1026516-1140567,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 -/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214 +/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jun 29 19:50:42 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1140567,1141129,1141213,1141217 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 -/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214 
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1140755,1140760,1141134,1141214,1141220 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369 /cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
buildbot failure in ASF Buildbot on cassandra-trunk
The Buildbot has detected a new failure on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1406 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1141221 Blamelist: slebresne BUILD FAILED: failed compile sincerely, -The Buildbot
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057437#comment-13057437 ] T Jake Luciani commented on CASSANDRA-2388: --- I dont think we should require the TT to be running locally. The whole idea is to support access to Cassandra data from hadoop even if it's just an import. This patch does spend a lot of time dealing with non local data for that reason. ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
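The fix this ticket asks for amounts to iterating over a split's candidate endpoints instead of failing on the first. A minimal sketch under that reading, with a stand-in connect function rather than CFRR's real Thrift client setup (all names here are illustrative):

```java
import java.util.List;
import java.util.function.Function;

public class ReplicaFallback {
    /** Try each endpoint in order; return the result of the first successful connection. */
    static <T> T firstReachable(List<String> endpoints, Function<String, T> connect) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return connect.apply(endpoint);   // first live replica wins
            } catch (RuntimeException e) {
                last = e;                         // remember the failure, try the next replica
            }
        }
        throw new RuntimeException("all replicas failed for this split", last);
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("10.0.0.1", "10.0.0.2");   // illustrative endpoints
        String used = firstReachable(replicas, endpoint -> {
            if (endpoint.equals("10.0.0.1")) throw new RuntimeException("host down");
            return endpoint;                      // pretend the connection succeeded
        });
        System.out.println("connected to " + used);
    }
}
```

The same loop is where a proximity sort or DC filter of the endpoint list would slot in before iteration.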
[jira] [Commented] (CASSANDRA-2475) Prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057440#comment-13057440 ] Rick Shaw commented on CASSANDRA-2475: -- {quote} I really don't see what is so complex about apply(parsed_tokens_list, parameters) vs apply(saved_queries.get(parsed_tokens_id), parameters). {quote} Given that the design has such a stream I agree completely. Not complex at all. Hence my statement: {quote} Even simple statements would be parsed down to the stream of tokens; It would just be executed immediately and then tossed as opposed to cached and returning the to the caller. {quote} I think we are in agreement of the need for such a precompiled item, and given that it needs to exist anyway we might as well only have one ANTLR parser and use its product for both simple and prepared statements. Prepared statements --- Key: CASSANDRA-2475 URL: https://issues.apache.org/jira/browse/CASSANDRA-2475 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Labels: cql Fix For: 1.0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
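The single-parser design argued for above can be sketched like this; `ParsedStatement` and the trivial `parse` below are hypothetical stand-ins for the real ANTLR product, kept only to show the shared simple/prepared paths:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class StatementCache {
    static final class ParsedStatement {            // stand-in for the ANTLR token stream
        final String normalized;
        ParsedStatement(String normalized) { this.normalized = normalized; }
    }

    private final Map<Integer, ParsedStatement> saved = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    static ParsedStatement parse(String cql) {      // placeholder for the one shared parse step
        return new ParsedStatement(cql.trim().replaceAll("\\s+", " "));
    }

    /** Prepared path: parse once, cache the product, return its id to the caller. */
    public int prepare(String cql) {
        int id = nextId.incrementAndGet();
        saved.put(id, parse(cql));
        return id;
    }

    /** Simple path: same parse, executed immediately and then tossed. */
    public String execute(String cql) { return apply(parse(cql)); }

    /** Prepared execution: reuse the cached parse product, apply the parameters here. */
    public String executePrepared(int id) { return apply(saved.get(id)); }

    private String apply(ParsedStatement stmt) { return "executed: " + stmt.normalized; }
}
```

Both paths funnel through the same `apply(parsedTokens)` step, which is the point of the comment: one parser, one execution product.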
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057451#comment-13057451 ] Nicholas Telford commented on CASSANDRA-2045: - bq. That's bad though, because then we can't access hints efficiently on a node up/down message (we actually did it that way in 0.6 and learned our lesson.) Good point. I retract that idea. :-) bq. So we denormalize but we gain not having to do secondary-lookup-per-mutation, which is our main motivation for the change. (And single-destination-per-hint is by far the common case.) I'm a bit confused here. There could be many mutations for a single key, we'd need to store each of them. I do like the idea of being able to slide the mutations though. Perhaps we could form the key from a compound of the key-table-cf, so it would look something like this:
{noformat}
Hints: {                // cf
  dest ip: {            // key
    key-table-cf: {     // super-column
      version: mutation // column
    }
  }
}
{noformat}
Or is it vital that the key is stored separately from the table and cf? Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. 
To help reduce read load (if a node is offline for long period of time) we should store the data we want forward the node locally instead. We wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
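The lookup-free replay the description proposes can be sketched as a per-destination queue of already-serialized mutations; this is an illustrative data-structure sketch, not the ticket's actual patch, and the names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

public class HintQueue {
    // dest ip -> queue of already-serialized mutations (opaque byte[]), in arrival order
    private final Map<String, Queue<byte[]>> hintsByDest = new HashMap<>();

    /** Write path: store the serialized mutation under the destination itself. */
    public void storeHint(String destIp, byte[] serializedMutation) {
        hintsByDest.computeIfAbsent(destIp, k -> new ArrayDeque<>()).add(serializedMutation);
    }

    /** Replay when destIp comes back: drain and send the raw bytes, no local row lookup. */
    public int replay(String destIp, Consumer<byte[]> send) {
        Queue<byte[]> queue = hintsByDest.remove(destIp);
        if (queue == null) return 0;
        int sent = 0;
        for (byte[] mutation : queue) {
            send.accept(mutation);
            sent++;
        }
        return sent;
    }
}
```

The trade discussed in the comments is visible here: storage is denormalized per destination, but replay becomes a blind `byte[]` send with no secondary lookup per mutation.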
[jira] [Commented] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner
[ https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057458#comment-13057458 ] Hudson commented on CASSANDRA-2841: --- Integrated in Cassandra-0.7 #518 (See [https://builds.apache.org/job/Cassandra-0.7/518/]) Always use even distribution for merkle tree with RandomPartitionner patch by slebresne; reviewed by jbellis for CASSANDRA-2841 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1141217 Files : * /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AntiEntropyService.java * /cassandra/branches/cassandra-0.7/CHANGES.txt Always use even distribution for merkle tree with RandomPartitionner Key: CASSANDRA-2841 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Trivial Labels: repair Fix For: 0.7.7, 0.8.2 Attachments: 2841.patch When creating the initial merkle tree, repair tries to be (too) smart and use the key samples to guide the tree splitting. While this is a good idea for OPP where there is a good chance the data distribution is uneven, you can't beat an even distribution for the RandomPartitionner. And a quick experiment even shows that the method used is significantly less efficient than an even distribution for the ranges of the merkle tree (that is, an even distribution gives a much better distribution of the number of keys by range of the tree). Thus let's switch to an even distribution for RandomPartitionner. That 3-line change alone accounts for a significant improvement of repair's precision. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
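The even distribution the patch switches to is straightforward to sketch: split the token range into equal-width sub-ranges rather than guiding splits by key samples. The RandomPartitioner detail below (md5-derived tokens in [0, 2^127)) is per the partitioner's design; the method names are illustrative, not Cassandra's:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class EvenTreeSplit {
    /** Equal-width split points for the range (left, right]; no key samples involved. */
    static List<BigInteger> evenSplitPoints(BigInteger left, BigInteger right, int parts) {
        BigInteger width = right.subtract(left).divide(BigInteger.valueOf(parts));
        List<BigInteger> points = new ArrayList<>();
        for (int i = 1; i < parts; i++) {
            points.add(left.add(width.multiply(BigInteger.valueOf(i))));
        }
        return points;
    }

    public static void main(String[] args) {
        // RandomPartitioner tokens are md5-derived and fall in [0, 2^127),
        // so equal-width ranges hold roughly equal key counts.
        BigInteger max = BigInteger.valueOf(2).pow(127);
        for (BigInteger point : evenSplitPoints(BigInteger.ZERO, max, 4)) {
            System.out.println(point);
        }
    }
}
```

Because md5 spreads keys uniformly over the token space, equal-width tree ranges give near-equal key counts per leaf, which is exactly the precision gain the ticket reports.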
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057461#comment-13057461 ] Brandon Williams commented on CASSANDRA-2388: - {quote} This is making the presumption that the hadoop cluster is only used with CFIF. The TT could still be useful for other jobs submitted. {quote} I'm fine with that assumption. If you want to run other jobs, use a different cluster. Cassandra's JVM is eating wasteful memory at that point. {quote} Furthermore a blacklisted TT does't automatically come back - it needs to be manually restarted. Isn't this creating more headache for operations? {quote} I don't think this is actually the case, see HADOOP-4305 {quote} I dont think we should require the TT to be running locally. The whole idea is to support access to Cassandra data from hadoop even if it's just an import. This patch does spend a lot of time dealing with non local data for that reason. {quote} I'm fine with dropping support for non-colocated TTs, or at least saying there's no DC-specific support. Because frankly, that is a very suboptimal thing to do, transfer the data across the network all the time, and flies in the face of Hadoop's core principles. ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057466#comment-13057466 ] Jonathan Ellis edited comment on CASSANDRA-2045 at 6/29/11 8:58 PM: oops, didn't look too closely to what I was pasting. {noformat} Hints: {// cf dest ip: { // key uuid: { // super-column table: table// columns key: key mutation: mutation } } } {noformat} (Mutations can contain multiple CFs so storing a single CF value wouldn't make sense.) was (Author: jbellis): oops, didn't look too closely to what I was pasting. {noformat} Hints: {// cf dest ip: { // key uuid: { // super-column table: table// columns key: key mutation: mutation } } } (Mutations can contain multiple CFs so storing a single CF value wouldn't make sense.) Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for long period of time) we should store the data we want forward the node locally instead. We wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057466#comment-13057466 ] Jonathan Ellis commented on CASSANDRA-2045: --- oops, didn't look too closely to what I was pasting. {noformat} Hints: {// cf dest ip: { // key uuid: { // super-column table: table// columns key: key mutation: mutation } } } (Mutations can contain multiple CFs so storing a single CF value wouldn't make sense.) Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for long period of time) we should store the data we want forward the node locally instead. We wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057467#comment-13057467 ] Jonathan Ellis commented on CASSANDRA-2388: --- bq. a blacklisted TT does't automatically come back tlipcon says it comes back after 24h, fwiw. In any case it's still the case that we DO want to blacklist it while it's down. (Brisk could perhaps add a clear my tasktracker on restart operation as a further enhancement.) bq. I'm fine with dropping support for non-colocated TTs +1, it was a bad idea and I'm sorry I wrote it. :) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057470#comment-13057470 ] Mck SembWever edited comment on CASSANDRA-2388 at 6/29/11 9:18 PM: --- bq. tlipcon says it comes back after 24h Just to be clear about my concerns: this means a dead c* node will bring down a TT. In a hadoop cluster with 3 nodes this means for 24hrs you've lost 33% of your throughput. (If less than 10% of hadoop jobs used CFIF I could well imagine some pissed users.) (What if you have a temporary problem with flapping c* nodes and you end up with a handful of blacklisted TTs? etc etc etc). All this when using a replica, any replica, could have kept things going smoothly, the only slowdown being that some of the data into CFIF had to go over the network instead... was (Author: michaelsembwever): bq. tlipcon says it comes back after 24h just to be clear about my concerns. this means a dead c* node will bring down a TT. In a hadoop cluster with 3 nodes this means for 24hrs you're lost 33% throughput. (If less than 10% of hadoop jobs used CFIF i could well imagine some pissed customers). (What if you have a temporarily problem with flapping c* nodes and you end up with a handful of blacklisted TTs? etc etc etc). All this when using a replica, any replica, could have kept things going smoothly, the only slowdown being some of the data into CFIF had to go over the network instead... ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. 
- Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057470#comment-13057470 ] Mck SembWever commented on CASSANDRA-2388: -- bq. tlipcon says it comes back after 24h Just to be clear about my concerns: this means a dead c* node will bring down a TT. In a hadoop cluster with 3 nodes this means for 24hrs you've lost 33% of your throughput. (If less than 10% of hadoop jobs used CFIF I could well imagine some pissed customers.) (What if you have a temporary problem with flapping c* nodes and you end up with a handful of blacklisted TTs? etc etc etc). All this when using a replica, any replica, could have kept things going smoothly, the only slowdown being that some of the data into CFIF had to go over the network instead... ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Coverston updated CASSANDRA-1608: -- Attachment: 1608-v8.txt First the good:
1. Modified the code s.t. tombstone purge during minor compactions uses the interval tree to prune the list of SSTables, speeding up compactions by at least an order of magnitude where the number of SSTables in a column family exceeds ~500.
2. Tested reads and writes. Write speeds (unsurprisingly) are not affected by this compaction strategy. Reads seem to keep up as well. The interval tree does a good job here, making sure that bloom filters are queried only for those SSTables that fall into the queried range.
3. Three successive runs of stress inserting 10M keys resulted in ~3GB of data stored with the leveled strategy. By comparison, the same run using the tiered (default) strategy resulted in ~8GB of data.
The Meh: Compactions do back up when setting the flush size to 64MB and the leveled SSTable size to anywhere between 5-10MB. On the upside, if your load has peaks and quieter times, this compaction strategy will trigger a periodic check to catch up once all event-scheduled compactions complete. Interestingly, this extra IO has an upside: for datasets that frequently overwrite old data that has already been flushed to disk, there is the potential for substantial de-duplication of data. Further, during reads the number of rows that would need to be merged for a single row is bounded by the number of levels + the number of un-leveled sstables. 
Redesigned Compaction - Key: CASSANDRA-1608 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Benjamin Coverston Attachments: 0001-leveldb-style-compaction.patch, 1608-v2.txt, 1608-v3.txt, 1608-v4.txt, 1608-v5.txt, 1608-v7.txt, 1608-v8.txt After seeing the I/O issues in CASSANDRA-1470, I've been doing some more thinking on this subject that I wanted to lay out. I propose we redo the concept of how compaction works in Cassandra. At the moment, compaction is kicked off based on a write access pattern, not read access pattern. In most cases, you want the opposite. You want to be able to track how well each SSTable is performing in the system. If we were to keep statistics in-memory for each SSTable, prioritize them based on most accessed, and bloom filter hit/miss ratios, we could intelligently group sstables that are being read most often and schedule them for compaction. We could also schedule lower priority maintenance on SSTables not often accessed. I also propose we limit the size of each SSTable to a fixed size, which gives us the ability to better utilize our bloom filters in a predictable manner. At the moment, after a certain size, the bloom filters become less reliable. This would also allow us to group data most accessed. Currently the size of an SSTable can grow to a point where large portions of the data might not actually be accessed as often. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
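The interval-based pruning described in point 1 of the comment above can be sketched standalone (hypothetical names, not the actual compaction-strategy code): only SSTables whose [minKey, maxKey] range overlaps the range being compacted or queried need to be consulted at all. A real interval tree answers this in O(log n + k); the linear scan below just shows the overlap test.

```java
import java.util.*;

// Minimal sketch of range-overlap pruning: given many SSTables, keep only
// those whose [minKey, maxKey] interval overlaps the range of interest.
public class SSTableRangePruner {
    static final class SSTable {
        final String name; final long minKey, maxKey;
        SSTable(String name, long minKey, long maxKey) {
            this.name = name; this.minKey = minKey; this.maxKey = maxKey;
        }
    }

    // Two closed intervals [a1,a2] and [b1,b2] overlap iff a1 <= b2 && b1 <= a2.
    static List<SSTable> overlapping(List<SSTable> tables, long lo, long hi) {
        List<SSTable> out = new ArrayList<>();
        for (SSTable t : tables)
            if (t.minKey <= hi && lo <= t.maxKey)
                out.add(t);
        return out;
    }

    public static void main(String[] args) {
        List<SSTable> tables = Arrays.asList(
            new SSTable("a", 0, 99), new SSTable("b", 100, 199), new SSTable("c", 150, 300));
        // Only "b" and "c" can contain keys in [120, 160]; "a" is pruned outright.
        for (SSTable t : overlapping(tables, 120, 160))
            System.out.println(t.name);
    }
}
```

With hundreds of small, fixed-size SSTables per column family, pruning non-overlapping tables up front is what produces the order-of-magnitude speedup the comment reports.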
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057534#comment-13057534 ] Nicholas Telford commented on CASSANDRA-2045: - Ok, that makes sense. I've implemented this in my tree (albeit with an additional version column to store the serialization version). I won't post the patch yet as I need to go through it all and ensure it's correct and clean it up a little. As an aside: while digging into RowMutationSerializer, I noticed that the version passed to deserialize() is ignored - is this intentional? Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for long period of time) we should store the data we want forward the node locally instead. We wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057568#comment-13057568 ] Jonathan Ellis commented on CASSANDRA-2388: --- bq. this means a dead c* node will bring down a TT Again: _this is what you want to happen_. As long as the C* process on the same node is down, you want the TT to be blacklisted and the jobs to go elsewhere. bq. In a hadoop cluster with 3 nodes this means for 24hrs you're lost 33% throughput Right, but the real cause is because the C* process is dead, not b/c the TT is blacklisted. Making the TT read from other nodes will only hurt your network, not fix the throughput problem, b/c i/o is the bottleneck. ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica. - Key: CASSANDRA-2388 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 0.7.6, 0.8.0 Reporter: Eldon Stegall Assignee: Jeremy Hanna Labels: hadoop, inputformat Fix For: 0.7.7, 0.8.2 Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057569#comment-13057569 ] Jonathan Ellis commented on CASSANDRA-2045: --- bq. I noticed that the version passed to deserialize() is ignored - is this intentional Just means RM serialization hasn't changed since we started versioning the protocol. Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for long period of time) we should store the data we want forward the node locally instead. We wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
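The versioning point above comes down to this pattern, shown here as a hypothetical sketch (not Cassandra's actual RowMutationSerializer): deserialize() accepts a protocol version so the wire format can branch later, and until the format actually changes, the parameter is legitimately unused.

```java
import java.io.*;

// Sketch of version-aware serialization: the version parameter exists so the
// wire format can evolve; while only one format exists, it goes unused.
public class VersionedSerializerSketch {
    static final int VERSION_07 = 1, VERSION_08 = 2;

    static byte[] serialize(String payload, int version) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeUTF(payload); // same encoding for every version, so far
            return bos.toByteArray();
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    static String deserialize(byte[] bytes, int version) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            // 'version' is ignored today; a future format change would branch here:
            // if (version >= VERSION_NEXT) { ... read the new fields ... }
            return in.readUTF();
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public static void main(String[] args) {
        byte[] b = serialize("hint", VERSION_07);
        System.out.println(deserialize(b, VERSION_08)); // prints "hint"
    }
}
```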
svn commit: r1141353 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/db/Table.java
Author: jbellis Date: Thu Jun 30 00:59:12 2011 New Revision: 1141353 URL: http://svn.apache.org/viewvc?rev=1141353&view=rev Log: allow deleting and inserting into an indexed row in the same mutation patch by jbellis; reviewed by slebresne and tested by Jim Ancona for CASSANDRA-2773 Modified: cassandra/branches/cassandra-0.7/CHANGES.txt cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1141353&r1=1141352&r2=1141353&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Jun 30 00:59:12 2011
@@ -29,6 +29,8 @@
  * fix scan wrongly throwing assertion error (CASSANDRA-2653)
  * Always use even distribution for merkle tree with RandomPartitionner (CASSANDRA-2841)
+ * allow deleting a row and updating indexed columns in it in the
+   same mutation (CASSANDRA-2773)
 0.7.6

Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java?rev=1141353&r1=1141352&r2=1141353&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/Table.java Thu Jun 30 00:59:12 2011
@@ -429,7 +429,16 @@ public class Table
             ByteBuffer name = iter.next();
             IColumn newColumn = cf.getColumn(name); // null == row delete or it wouldn't be marked Mutated
             if (newColumn != null && cf.isMarkedForDelete())
-                throw new UnsupportedOperationException("Index manager cannot support deleting and inserting into a row in the same mutation");
+            {
+                // row is marked for delete, but column was also updated. if column is timestamped less than
+                // the row tombstone, treat it as if it didn't exist. Otherwise we don't care about the row
+                // tombstone for the purpose of the index update and we can proceed as usual.
+                if (newColumn.timestamp() <= cf.getMarkedForDeleteAt())
+                {
+                    // don't remove from the cf object; that can race w/ CommitLog write. Leaving it is harmless.
+                    newColumn = null;
+                }
+            }
             IColumn oldColumn = oldIndexedColumns.getColumn(name);
             // deletions are irrelevant to the index unless we're changing state from live -> deleted, i.e.,
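The rule in the patch above, that a column written in the same mutation that also deletes the row is treated as nonexistent for index purposes when its timestamp is not newer than the row tombstone, can be sketched standalone (hypothetical names, not the Table.java internals):

```java
// Standalone sketch of the shadowing rule from the patch: a column written in
// the same mutation that also deletes the row only survives (for index
// maintenance) if its timestamp is strictly newer than the row tombstone.
public class TombstoneShadowSketch {
    static final class Column {
        final long timestamp;
        Column(long timestamp) { this.timestamp = timestamp; }
    }

    // Returns the column as the index should see it: null when shadowed.
    static Column resolveForIndex(Column newColumn, boolean rowDeleted, long markedForDeleteAt) {
        if (newColumn != null && rowDeleted && newColumn.timestamp <= markedForDeleteAt)
            return null; // shadowed by the row tombstone
        return newColumn;
    }

    public static void main(String[] args) {
        // insert@1 then row delete@2: the column is shadowed
        System.out.println(resolveForIndex(new Column(1), true, 2) == null);  // true
        // row delete@3 then insert@4: the column survives
        System.out.println(resolveForIndex(new Column(4), true, 3) != null);  // true
    }
}
```

The two cases in main() mirror the two new unit tests committed in r1141354 for CASSANDRA-2773.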
svn commit: r1141354 - /cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
Author: jbellis Date: Thu Jun 30 01:02:33 2011 New Revision: 1141354 URL: http://svn.apache.org/viewvc?rev=1141354&view=rev Log: add additional tests for #2773 patch by Jim Ancona; reviewed by jbellis for CASSANDRA-2773 Modified: cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java

Modified: cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java?rev=1141354&r1=1141353&r2=1141354&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java (original)
+++ cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java Thu Jun 30 01:02:33 2011
@@ -314,6 +314,24 @@ public class ColumnFamilyStoreTest exten
         rm.apply();
         rows = cfs.scan(clause, range, filter);
         assert rows.isEmpty() : StringUtils.join(rows, ",");
+
+        // try insert followed by row delete in the same mutation
+        rm = new RowMutation("Keyspace3", ByteBufferUtil.bytes("k1"));
+        rm.add(new QueryPath("Indexed1", null, ByteBufferUtil.bytes("birthdate")), ByteBufferUtil.bytes(1L), 1);
+        rm.delete(new QueryPath("Indexed1"), 2);
+        rm.apply();
+        rows = cfs.scan(clause, range, filter);
+        assert rows.isEmpty() : StringUtils.join(rows, ",");
+
+        // try row delete followed by insert in the same mutation
+        rm = new RowMutation("Keyspace3", ByteBufferUtil.bytes("k1"));
+        rm.delete(new QueryPath("Indexed1"), 3);
+        rm.add(new QueryPath("Indexed1", null, ByteBufferUtil.bytes("birthdate")), ByteBufferUtil.bytes(1L), 4);
+        rm.apply();
+        rows = cfs.scan(clause, range, filter);
+        assert rows.size() == 1 : StringUtils.join(rows, ",");
+        key = new String(rows.get(0).key.key.array(), rows.get(0).key.key.position(), rows.get(0).key.key.remaining());
+        assert "k1".equals(key);
     }

     @Test
[jira] [Resolved] (CASSANDRA-2773) Index manager cannot support deleting and inserting into a row in the same mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2773. --- Resolution: Fixed Fix Version/s: 0.7.7 committed. Thanks, Jim! Index manager cannot support deleting and inserting into a row in the same mutation - Key: CASSANDRA-2773 URL: https://issues.apache.org/jira/browse/CASSANDRA-2773 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.0 Reporter: Boris Yen Assignee: Jonathan Ellis Priority: Critical Fix For: 0.7.7, 0.8.2 Attachments: 2773-v2.txt, 2773.txt, cassandra.log, v1-0001-allow-deleting-a-rowand-updating-indexed-columns-init-.txt, v1-0002-CASSANDRA-2773-Add-unit-tests-to-verfy-fix-cherry-pick.txt I use hector 0.8.0-1 and cassandra 0.8. 1. create mutator by using hector api, 2. Insert a few columns into the mutator for key key1, cf standard. 3. add a deletion to the mutator to delete the record of key1, cf standard. 4. repeat 2 and 3 5. execute the mutator. the result: the connection seems to be held by the sever forever, it never returns. when I tried to restart the cassandra I saw unsupportedexception : Index manager cannot support deleting and inserting into a row in the same mutation. and the cassandra is dead forever, unless I delete the commitlog. I would expect to get an exception when I execute the mutator, not after I restart the cassandra. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2819) Split rpc timeout for read and write ops
[ https://issues.apache.org/jira/browse/CASSANDRA-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057575#comment-13057575 ] Jonathan Ellis commented on CASSANDRA-2819: ---
- read_repair is a write and should use that timeout
- should distinguish b/t multirow (range/index) reads, and single-row lookups
- REQUEST_RESPONSE should drop based on what kind of query the request being responded to was
- ExpiringMap should use expiration of max(read, read multirow, write), which means the REQUEST_RESPONSE drop-messages block isn't entirely redundant wrt ExpiringMap; but if it's too difficult to look up what the message type was, it's probably not a big deal to ignore
- DD.getRpcTimeout should be removed and replaced w/ the appropriate op timeout. If there are any internal operations that rely on rpctimeout (can't think of any that do) then we may want to add an internal timeout as well. (Or it may be evidence of a bug and we should fix it.)
Split rpc timeout for read and write ops Key: CASSANDRA-2819 URL: https://issues.apache.org/jira/browse/CASSANDRA-2819 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Stu Hood Assignee: Melvin Wang Fix For: 1.0 Attachments: rpc-rw-timeouts.patch Given the vastly different latency characteristics of reads and writes, it makes sense for them to have independent rpc timeouts internally. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
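The per-operation timeouts proposed in the comment above could look roughly like the following sketch (illustrative names, not the actual DatabaseDescriptor or MessagingService API): each verb class gets its own timeout, read repair routes to the write timeout, and the response map expires entries after the max of all timeouts since a pending response may belong to any op type.

```java
// Sketch of per-verb rpc timeouts: single-row reads, multi-row (range/index)
// reads and writes each get their own timeout; the response/ExpiringMap side
// uses the max of all of them, since a response may belong to any op type.
public class RpcTimeoutsSketch {
    enum Verb { READ, RANGE_READ, WRITE, READ_REPAIR }

    final long readTimeoutMs, rangeReadTimeoutMs, writeTimeoutMs;

    RpcTimeoutsSketch(long readMs, long rangeReadMs, long writeMs) {
        this.readTimeoutMs = readMs;
        this.rangeReadTimeoutMs = rangeReadMs;
        this.writeTimeoutMs = writeMs;
    }

    long timeoutFor(Verb verb) {
        switch (verb) {
            case READ:        return readTimeoutMs;
            case RANGE_READ:  return rangeReadTimeoutMs;
            case READ_REPAIR: return writeTimeoutMs; // read repair is a write
            default:          return writeTimeoutMs;
        }
    }

    // The ExpiringMap holding callbacks must outlive the slowest op type.
    long responseMapExpirationMs() {
        return Math.max(readTimeoutMs, Math.max(rangeReadTimeoutMs, writeTimeoutMs));
    }

    public static void main(String[] args) {
        RpcTimeoutsSketch t = new RpcTimeoutsSketch(2000, 10000, 3000);
        System.out.println(t.timeoutFor(Verb.READ_REPAIR));  // 3000
        System.out.println(t.responseMapExpirationMs());     // 10000
    }
}
```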
[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057614#comment-13057614 ] Terje Marthinussen commented on CASSANDRA-2521: --- In releaseReference(), if the reference count for some reason gets less than 0, maybe the code should do an assert or throw an exception (or anything else that gives a stack trace)? Should help debug some error scenarios with reference mismatches. Move away from Phantom References for Compaction/Memtable - Key: CASSANDRA-2521 URL: https://issues.apache.org/jira/browse/CASSANDRA-2521 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Fix For: 1.0 Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt http://wiki.apache.org/cassandra/MemtableSSTable Let's move to using reference counting instead of relying on GC to be called in StorageService. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
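The suggestion above, to fail loudly with a stack trace the moment the count goes negative rather than silently underflowing, can be sketched like this (a hypothetical resource class, not the actual SSTableReader code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of reference counting for an SSTable-like resource. releaseReference()
// throws (rather than silently going negative) so a mismatched release is
// caught at the call site with a stack trace, as suggested in the comment.
public class RefCountedSketch {
    private final AtomicInteger references = new AtomicInteger(1); // creator holds one ref
    private volatile boolean cleaned = false;

    public void acquireReference() {
        if (references.incrementAndGet() <= 1)
            throw new IllegalStateException("acquired a reference to an already-released resource");
    }

    public void releaseReference() {
        int refs = references.decrementAndGet();
        if (refs < 0)
            throw new IllegalStateException("released more references than were acquired");
        if (refs == 0)
            cleaned = true; // here: delete files, unmap buffers, etc.
    }

    public boolean isCleaned() { return cleaned; }

    public static void main(String[] args) {
        RefCountedSketch r = new RefCountedSketch();
        r.acquireReference();
        r.releaseReference();
        System.out.println(r.isCleaned()); // false: one reference still held
        r.releaseReference();
        System.out.println(r.isCleaned()); // true: last reference released
    }
}
```

Throwing on underflow trades a silent bookkeeping error for an immediate, locatable failure, which is exactly the debugging aid the comment asks for.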