[jira] [Commented] (CASSANDRA-3829) make seeds *only* be seeds, not special in gossip

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206744#comment-13206744
 ] 

Peter Schuller commented on CASSANDRA-3829:
---

I'm not sure what's making it sound like I want a free lunch :)

Let me start with what I hope are the less controversial bits.

# If you apply the normal bootstrapping process when inserting a node into the 
cluster, and it happens to be a seed according to its own configuration, it 
will just jump into the cluster w/o streaming data.
# You currently have to do rolling restarts to change the seed list.

In order to make clusters easier to operate, and to make it more difficult to 
shoot yourself in the foot, I propose that behavior (1) be removed. I think it 
makes more sense to require a special setting (such as a system property) when 
performing the very unusual (in production) task of setting up a new cluster 
from scratch. For single-node cases, we could support a mode where a node is 
alone and never tries to bootstrap, if we are concerned with maintaining simple 
./bin/cassandra -f style running of lone nodes.

Fixing (2) so that the seed list is reloadable makes sense if seeds stay 
relevant beyond start-up, and would in particular be even more important if we 
cannot agree on (1). Asking users to do rolling restarts in order to perform 
maintenance on a seed is IMO clearly not a good thing, even if we were to 
disagree about eliminating the behavior in (1).

Ok - so far what I've said in this comment doesn't change the notion of seeds 
as something which is continually used throughout the life-time of a node.

Now, if we make seeds be dynamic during runtime (minor changes in the code are 
needed to support this cleanly, but it's not a big deal) everyone is of 
course free to do whatever they want in terms of seed sources. I described an 
example zookeeper/serversets based case in the original filing of this ticket. 
For someone with infrastructure in place for multiple clusters and where these 
things aren't manually maintained, it's not really an issue once we reach the 
point of never having to do rolling restarts.

But I would really like to go further and make the seed concept simpler for 
*everyone*. I am not proposing to *remove* seeds; only to make them *seeds 
only*, in the sense of initially seeding a node with information about its 
cluster when it starts up for the first time (*not* as a list of special 
nodes that are always gossiped to). Even if we make seeds reloadable, and 
provide an out-of-the-box implementation that e.g. loads from a property file, 
it still means operators (or their tools) have to actively be aware of seeds 
and of the fact that special action is required during some tasks, if an 
affected node happens to be a seed.

I believe that for operational simplicity, it would be better if seeds only 
entered the operator's (or tool's) consciousness on initial bootstrap, where 
they are *fundamentally* required no matter what (for obvious reasons there 
must be *some* source, as you point out, pointing a node to the appropriate 
cluster).

As discussed, this *would* be a slight regression in terms of partitioning, in 
the sense that if a node goes down for a while and comes back up, and all nodes 
it knows about have either changed IP addresses or are down, then yes, you 
would introduce a partition. But look at it this way: in my opinion this can 
easily be considered operator error. If you point clients to a set of nodes 
in a cluster and make hugely significant topology changes while a node being 
used by clients is down, that's a mistake. It's worth noting, however, that it's 
only slightly easier to make that mistake than the mistake that is possible 
*already, right now*: in the exact same scenario, you are already in trouble 
if all of the nodes listed in your seed list are among those either down or 
having changed IP address. Now, granted, if you change the IP address of a 
seed you will deploy that change; but what if the node that went down just 
booted up (never got deployed to)? You still have, in practice, a partitioned 
cluster.

So in short, I believe that for practical use-cases, removing the significance 
of seeds in all but initial seeding has minimal negative consequences, while 
the positive consequences in terms of operational simplicity are very much 
significant.

That said, if I am truly the only person who thinks this would be an important 
improvement, then we can at least make seeds dynamic and provide a simple 
out-of-the-box way of using that feature (property-file based seeds, probably). 
If so, I'll submit the necessary patches in a separate ticket, and I will also 
try to make time to empirically test how propagation time in the cluster is 
affected by cluster size (because of CASSANDRA-3830).
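A property-file based implementation of dynamic seeds could look roughly like 
the following sketch. Everything here (the class name, the {{seeds}} key, the 
reload-on-read behavior) is an illustrative assumption, not an existing 
Cassandra API:

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch of a property-file based seed source of the kind
// proposed above. Names and file format are illustrative assumptions.
public class PropertyFileSeedSource {
    private final String path;

    public PropertyFileSeedSource(String path) { this.path = path; }

    // Re-read the file on every call, so the seed list can change
    // without restarting the node.
    public List<String> getSeeds() throws IOException {
        Properties props = new Properties();
        try (FileReader in = new FileReader(path)) {
            props.load(in);
        }
        return Arrays.asList(props.getProperty("seeds", "").split("\\s*,\\s*"));
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("seeds", ".properties");
        try (FileWriter w = new FileWriter(f)) {
            w.write("seeds=10.0.0.1, 10.0.0.2\n");
        }
        System.out.println(new PropertyFileSeedSource(f.getPath()).getSeeds());
    }
}
```

Because the file is re-read on each call, an operator could edit it in place 
and have the change picked up without a rolling restart.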




[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206745#comment-13206745
 ] 

Peter Schuller commented on CASSANDRA-3830:
---

I had written a response here, but I assume I must have failed to submit it and 
lost track of the browser tab or something.

What you describe is not the behavior of the Gossiper. It picks a random node 
to gossip to. Then, unless the node *happened* to also be a seed node, it picks 
a random *seed node* to gossip to *as well*.

The "less than number of seeds" condition you're mentioning is presumably due 
to the comment in the code before the gossip to seed:

{code}
/* Gossip to a seed if we did not do so above, or we have seen less nodes
   than there are seeds.  This prevents partitions where each group of nodes
   is only gossiping to a subset of the seeds.

   The most straightforward check would be to check that all the seeds have been
   verified either as live or unreachable.  To avoid that computation each round,
   we reason that:

   either all the live nodes are seeds, in which case non-seeds that come online
   will introduce themselves to a member of the ring by definition,

   or there is at least one non-seed node in the list, in which case eventually
   someone will gossip to it, and then do a gossip to a random seed from the
   gossipedToSeed check.

   See CASSANDRA-150 for more exposition. */
if (!gossipedToSeed || liveEndpoints.size() < seeds.size())
    doGossipToSeed(prod);
{code}

If you look carefully though, you'll see that the number of live endpoints is 
*only* relevant in the sense that it forces *always* gossiping to a seed, even 
if we already did. In the normal case (almost all cases), we have more live 
endpoints than seeds, and we'll still gossip to a seed because of 
{{!gossipedToSeed}}.
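As a reference point, the per-round behavior described here (gossip to a random 
live node, then additionally to a random seed) can be modeled in a few lines. 
This is a standalone illustrative sketch with assumed names, not the actual 
Gossiper code:

```java
import java.util.*;

// Simplified, standalone model of one gossip round as described above.
// Illustrative only -- not the real org.apache.cassandra.gms.Gossiper.
public class GossipRound {
    static final Random rnd = new Random();

    /** Returns the set of endpoints gossiped to in one round. */
    public static Set<String> runRound(List<String> liveEndpoints, Set<String> seeds) {
        Set<String> gossipedTo = new LinkedHashSet<>();
        if (liveEndpoints.isEmpty() || seeds.isEmpty())
            return gossipedTo;

        // 1. Gossip to one random live node.
        String target = liveEndpoints.get(rnd.nextInt(liveEndpoints.size()));
        gossipedTo.add(target);
        boolean gossipedToSeed = seeds.contains(target);

        // 2. Also gossip to a random seed, unless the first target happened to
        //    be a seed -- and even then, do it anyway if we have seen fewer
        //    live nodes than there are seeds (the partition-avoidance check).
        if (!gossipedToSeed || liveEndpoints.size() < seeds.size()) {
            List<String> seedList = new ArrayList<>(seeds);
            gossipedTo.add(seedList.get(rnd.nextInt(seedList.size())));
        }
        return gossipedTo;
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("a", "b", "c");
        Set<String> seeds = new HashSet<>(Collections.singletonList("s1"));
        // A non-seed target always forces an additional gossip to a seed.
        System.out.println(runRound(live, seeds));
    }
}
```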



 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window
 (currently 1000) of heartbeat intervals for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which the
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at the end about the no-seeds case
 not being continuously tested). Normally, due to gossip to seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into the down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 

Git Push Summary

2012-02-13 Thread slebresne
Updated Tags:  refs/tags/cassandra-0.8.10 [created] c45a17cd0


Git Push Summary

2012-02-13 Thread slebresne
Updated Tags:  refs/tags/0.8.10-tentative [deleted] 038b8f212


svn commit: r1243474 - in /cassandra/site: publish/download/index.html src/settings.py

2012-02-13 Thread slebresne
Author: slebresne
Date: Mon Feb 13 10:38:58 2012
New Revision: 1243474

URL: http://svn.apache.org/viewvc?rev=1243474&view=rev
Log:
Update website for 0.8.10 release

Modified:
cassandra/site/publish/download/index.html
cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1243474&r1=1243473&r2=1243474&view=diff
==
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Mon Feb 13 10:38:58 2012
@@ -103,16 +103,16 @@
   p
   Previous stable branches of Cassandra continue to see periodic maintenance
   for some time after a new major release is made. The lastest release on the
-  0.8 branch is 0.8.9 (released on
-  2011-12-14).
+  0.8 branch is 0.8.10 (released on
+  2012-02-13).
   /p
 
   ul
 li
-a class=filename 
href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.9/apache-cassandra-0.8.9-bin.tar.gz;apache-cassandra-0.8.9-bin.tar.gz/a
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-bin.tar.gz.asc;PGP/a]
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-bin.tar.gz.md5;MD5/a]
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-bin.tar.gz.sha1;SHA1/a]
+a class=filename 
href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.10/apache-cassandra-0.8.10-bin.tar.gz;apache-cassandra-0.8.10-bin.tar.gz/a
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-bin.tar.gz.asc;PGP/a]
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-bin.tar.gz.md5;MD5/a]
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-bin.tar.gz.sha1;SHA1/a]
 /li
   /ul
   
@@ -157,10 +157,10 @@
 /li
   
 li
-a class=filename 
href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.9/apache-cassandra-0.8.9-src.tar.gz;apache-cassandra-0.8.9-src.tar.gz/a
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-src.tar.gz.asc;PGP/a]
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-src.tar.gz.md5;MD5/a]
-[a 
href=http://www.apache.org/dist/cassandra/0.8.9/apache-cassandra-0.8.9-src.tar.gz.sha1;SHA1/a]
+a class=filename 
href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.10/apache-cassandra-0.8.10-src.tar.gz;apache-cassandra-0.8.10-src.tar.gz/a
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-src.tar.gz.asc;PGP/a]
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-src.tar.gz.md5;MD5/a]
+[a 
href=http://www.apache.org/dist/cassandra/0.8.10/apache-cassandra-0.8.10-src.tar.gz.sha1;SHA1/a]
 /li
   
   

Modified: cassandra/site/src/settings.py
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1243474&r1=1243473&r2=1243474&view=diff
==
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Mon Feb 13 10:38:58 2012
@@ -92,8 +92,8 @@ SITE_POST_PROCESSORS = {
 }
 
 class CassandraDef(object):
-oldstable_version = '0.8.9'
-oldstable_release_date = '2011-12-14'
+oldstable_version = '0.8.10'
+oldstable_release_date = '2012-02-13'
 oldstable_exists = True
 veryoldstable_version = '0.7.10'
 veryoldstable_release_date = '2011-10-31'




[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

2012-02-13 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206806#comment-13206806
 ] 

Sylvain Lebresne commented on CASSANDRA-3872:
-

I do not pretend this reduces the number of lines of code, but I do think it 
makes it easier to avoid subtle mistakes.

Currently, there is a mismatch between how Column (the class) and the two 
IColumnContainer classes (CF and SC) handle getLocalDeletionTime() for 
non-deleted instances. The former uses MAX_VALUE, the latter use MIN_VALUE. The 
lack of consistency alone is annoying, but as long as SC lives it is made much 
worse by the fact that SC is both an IColumn and an IColumnContainer.

The attached patch tries to make things more consistent. The localDeletionTime 
is here for the purpose of tombstone garbage collection, so it seems to me that 
it is cleaner to use it for that purpose and that purpose only. In other words, 
with this patch, {{(getLocalDeletionTime() < gcBefore)}} tells you without 
ambiguity whether you're dealing with a gcable tombstone or not.

Now there is the fact that live but empty containers are not returned to the 
user. I believe that was one of the reasons for using MIN_VALUE for live 
containers. But imho this is a hack, and it is much clearer in removeDeleted 
to read:
{noformat}
if (cf.getColumnCount() == 0 && (!cf.isMarkedForDelete() || cf.getLocalDeletionTime() < gcBefore))
{noformat}
which directly translates to: if the cf is empty and is either a gcable 
tombstone or a live cf, we can skip it. That beats having to check the code of 
ColumnFamily to understand why it skips live empty CFs *and* having to 
remember, each time you use CF.localDeletionTime, that it may be MIN_VALUE for 
non-deleted CFs and to assess whether that matters or not.
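The convention argued for here can be sketched in miniature. This is a toy 
model with assumed names, not the real ColumnFamily class:

```java
// Toy model of the convention the patch argues for: localDeletionTime is
// Integer.MAX_VALUE unless the container is actually deleted, so a single
// comparison identifies a gcable tombstone. Illustrative names only.
public class ContainerSketch {
    private int localDeletionTime = Integer.MAX_VALUE; // live by default
    private int columnCount = 0;

    public void delete(int localDeletionTime) { this.localDeletionTime = localDeletionTime; }
    public void addColumns(int n) { columnCount += n; }

    public boolean isMarkedForDelete() { return localDeletionTime != Integer.MAX_VALUE; }

    // The removeDeleted condition quoted above: skip the container if it is
    // empty and is either live or a gcable tombstone.
    public boolean canSkip(int gcBefore) {
        return columnCount == 0 && (!isMarkedForDelete() || localDeletionTime < gcBefore);
    }

    public static void main(String[] args) {
        ContainerSketch tomb = new ContainerSketch();
        tomb.delete(50);
        // A tombstone deleted before gcBefore is gcable and can be skipped.
        System.out.println("skippable at gcBefore=100: " + tomb.canSkip(100));
    }
}
```

Note that with MAX_VALUE as the live default, a live empty container passes the 
check via {{!isMarkedForDelete()}}, not via the deletion-time comparison, which 
is exactly the disambiguation being argued for.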

 Sub-columns removal is broken in 1.1
 

 Key: CASSANDRA-3872
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.1.0

 Attachments: 3872.patch


 CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in 
 QueryFilter.isRelevant, we've switched to checking getLocalDeletionTime() 
 only (without looking at isMarkedForDelete). But for column containers (in 
 this case SuperColumn), the default local deletion time when not deleted is 
 Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will 
 be considered not relevant (while it should be).
 This is caught by two unit tests (RemoveSuperColumnTest and 
 RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3555) Bootstrapping to handle more failure

2012-02-13 Thread Richard Low (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Low updated CASSANDRA-3555:
---

Attachment: 3555-bootstrap-with-down-node-test.txt
3555-bootstrap-with-down-node.txt

 Bootstrapping to handle more failure
 

 Key: CASSANDRA-3555
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3555
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.5
Reporter: Vijay
Assignee: Vijay
 Fix For: 1.2

 Attachments: 3555-bootstrap-with-down-node-test.txt, 
 3555-bootstrap-with-down-node.txt


 We might want to handle failures in bootstrapping:
 1) When none of the seeds are available to communicate with, throw an 
 exception.
 2) When any one of the nodes it is bootstrapping from fails, try the next in 
 the list (and if the list is exhausted, throw an exception).
 3) Clean all the existing files in the data dir before starting, just in case 
 we retry.
 4) Currently, when one node is down in the cluster, the bootstrap will fail, 
 because the bootstrapping node doesn't understand which one is actually down.
 Also print the nodetool ring output in the logs so we can troubleshoot later 
 if it fails.
 Currently, if any of the above happens, the node skips the bootstrap or hangs.





[jira] [Issue Comment Edited] (CASSANDRA-3555) Bootstrapping to handle more failure

2012-02-13 Thread Richard Low (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206810#comment-13206810
 ] 

Richard Low edited comment on CASSANDRA-3555 at 2/13/12 11:21 AM:
--

3555-bootstrap-with-down-node.txt contains a fix for 4, to stop a bootstrapping 
node choosing an unavailable node.  3555-bootstrap-with-down-node-test.txt is 
required to make BootStrapperTest pass with the patch.  Patches against trunk; 
same fix also works on 1.0 branch.

  was (Author: richardlow):
A fix for 4, to stop a bootstrapping node choosing an unavailable node.
  
 Bootstrapping to handle more failure
 

 Key: CASSANDRA-3555
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3555
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0.5
Reporter: Vijay
Assignee: Vijay
 Fix For: 1.2

 Attachments: 3555-bootstrap-with-down-node-test.txt, 
 3555-bootstrap-with-down-node.txt


 We might want to handle failures in bootstrapping:
 1) When none of the seeds are available to communicate with, throw an 
 exception.
 2) When any one of the nodes it is bootstrapping from fails, try the next in 
 the list (and if the list is exhausted, throw an exception).
 3) Clean all the existing files in the data dir before starting, just in case 
 we retry.
 4) Currently, when one node is down in the cluster, the bootstrap will fail, 
 because the bootstrapping node doesn't understand which one is actually down.
 Also print the nodetool ring output in the logs so we can troubleshoot later 
 if it fails.
 Currently, if any of the above happens, the node skips the bootstrap or hangs.





git commit: Have secondary indexes inherit compression and compaction properties from parent CF

2012-02-13 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.1 5f5e00bc9 - 6a6bf3cf1


Have secondary indexes inherit compression and compaction properties from 
parent CF

patch by slebresne; reviewed by xedin for CASSANDRA-3877


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6a6bf3cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6a6bf3cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6a6bf3cf

Branch: refs/heads/cassandra-1.1
Commit: 6a6bf3cf1aac6099c38c50b8d9d46f4ddea5a323
Parents: 5f5e00b
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu Feb 9 16:11:57 2012 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Feb 13 12:22:40 2012 +0100

--
 CHANGES.txt|5 +
 .../org/apache/cassandra/config/CFMetaData.java|   15 ---
 .../cassandra/db/index/SecondaryIndexManager.java  |6 ++
 3 files changed, 23 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6a6bf3cf/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5d9eaf9..e115a2a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -71,6 +71,11 @@
  * CQL support for altering key_validation_class in ALTER TABLE 
(CASSANDRA-3781)
  * turn compression on by default (CASSANDRA-3871)
  * make hexToBytes refuse invalid input (CASSANDRA-2851)
+ * Make secondary indexes CF inherit compression and compaction from their
+   parent CF (CASSANDRA-3877)
+Merged from 1.0:
+ * Only snapshot CF being compacted for snapshot_before_compaction
+   (CASSANDRA-3803)
 
 
 1.0.8

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6a6bf3cf/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java 
b/src/java/org/apache/cassandra/config/CFMetaData.java
index defa6cf..06b1adf 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -279,9 +279,18 @@ public final class CFMetaData
  .keyValidator(info.getValidator())
  .readRepairChance(0.0)
  .dclocalReadRepairChance(0.0)
- .gcGraceSeconds(parent.gcGraceSeconds)
- 
.minCompactionThreshold(parent.minCompactionThreshold)
- 
.maxCompactionThreshold(parent.maxCompactionThreshold);
+ .reloadSecondaryIndexMetadata(parent);
+}
+
+public CFMetaData reloadSecondaryIndexMetadata(CFMetaData parent)
+{
+gcGraceSeconds(parent.gcGraceSeconds);
+minCompactionThreshold(parent.minCompactionThreshold);
+maxCompactionThreshold(parent.maxCompactionThreshold);
+compactionStrategyClass(parent.compactionStrategyClass);
+compactionStrategyOptions(parent.compactionStrategyOptions);
+compressionParameters(parent.compressionParameters);;
+return this;
 }
 
 // Create a new CFMD by changing just the cfName

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6a6bf3cf/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
--
diff --git a/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java 
b/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
index 3758e9b..aa16db2 100644
--- a/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
+++ b/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
@@ -93,6 +93,12 @@ public class SecondaryIndexManager
 for (ColumnDefinition cdef : 
baseCfs.metadata.getColumn_metadata().values())
if (cdef.getIndexType() != null && 
!indexedColumnNames.contains(cdef.name))
 addIndexedColumn(cdef);
+
+for (ColumnFamilyStore cfs : getIndexesBackedByCfs())
+{
+cfs.metadata.reloadSecondaryIndexMetadata(baseCfs.metadata);
+cfs.reload();
+}
 }
 
 



[1/2] git commit: Merge branch 'cassandra-1.1' into trunk

2012-02-13 Thread slebresne
Updated Branches:
  refs/heads/trunk 30ee8337e - ddc771dc5


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ddc771dc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ddc771dc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ddc771dc

Branch: refs/heads/trunk
Commit: ddc771dc5f1af98dde42d494f91cc398929c8515
Parents: 30ee833 6a6bf3c
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon Feb 13 12:27:00 2012 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Feb 13 12:27:00 2012 +0100

--
 CHANGES.txt|5 +
 .../org/apache/cassandra/config/CFMetaData.java|   15 ---
 .../cassandra/db/index/SecondaryIndexManager.java  |6 ++
 3 files changed, 23 insertions(+), 3 deletions(-)
--




[2/2] git commit: Have secondary indexes inherit compression and compaction properties from parent CF

2012-02-13 Thread slebresne
Have secondary indexes inherit compression and compaction properties from 
parent CF

patch by slebresne; reviewed by xedin for CASSANDRA-3877


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6a6bf3cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6a6bf3cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6a6bf3cf

Branch: refs/heads/trunk
Commit: 6a6bf3cf1aac6099c38c50b8d9d46f4ddea5a323
Parents: 5f5e00b
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu Feb 9 16:11:57 2012 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Feb 13 12:22:40 2012 +0100

--
 CHANGES.txt|5 +
 .../org/apache/cassandra/config/CFMetaData.java|   15 ---
 .../cassandra/db/index/SecondaryIndexManager.java  |6 ++
 3 files changed, 23 insertions(+), 3 deletions(-)
--




[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-02-13 Thread Dave Brosius (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206830#comment-13206830
 ] 

Dave Brosius commented on CASSANDRA-3772:
-

With 10,000 inserts I'm seeing the same ratios, which I'm having a hard time 
explaining, since again the hash function itself takes about the same time.

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
 Fix For: 1.2

 Attachments: try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.





[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206857#comment-13206857
 ] 

Brandon Williams commented on CASSANDRA-3830:
-

bq. What you describe is not the behavior of the Gossiper. It picks a random 
node to gossip to. Then, unless the node happened to also be a seed node, it 
picks a random seed node to gossip to as well.

Right.

bq. The less than number of seeds you're mentioning

What I meant to say is that this is the only special case for seeds; gossiping 
to at least one seed every round is the normal case, as you said.

 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window
 (currently 1000) of heartbeat intervals for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at end about the case w/o seeds
 not being continuously tested). Normally, due to gossip to seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into the down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 both cases, the mean interval is around 1500 milliseconds.
 I don't feel I have a good grasp of whether this is incidental or
 guaranteed, and it would be good to at least empirically test
 propagation time w/o seeds at different cluster sizes; it's supposed
 to be unaffected by cluster size ({{RING_DELAY}} is static for this
 reason, is my understanding). It would be nice to see this be the case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Sylvain Lebresne (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-3862:


Attachment: 3862_v3.patch

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday
 Attachments: 3862-v2.patch, 3862.patch, 3862_v3.patch, 
 include_memtables_in_rowcache_read.patch


 While performing stress tests to find any race problems for CASSANDRA-2864 I 
 guess I (re-)found one for the standard on-heap row cache.
 During my stress test I have lots of threads running, some of them only 
 reading, others writing and re-reading the value.
 This seems to happen:
 - Reader tries to read row A for the first time, doing a getTopLevelColumns
 - Row A, which is not in the cache yet, is updated by Writer. The row is not 
 eagerly read during write (because we want fast writes), so the writer cannot 
 perform a cache update
 - Reader puts the row in the cache, which is now missing the update
 I already asked about this some time ago on the mailing list but unfortunately 
 didn't dig further after I got no answer, since I assumed that I had just missed 
 something. In a way I still do, but I haven't found any locking mechanism that 
 makes sure this cannot happen.
 The problem can be reproduced with every run of my stress test. When I 
 restart the server the expected column is there. It's just missing from the 
 cache.
 To test, I have created a patch that merges memtables with the row cache. With 
 the patch the problem is gone.
 I can also reproduce in 0.8. I haven't checked 1.1 but I haven't found any 
 relevant change there either, so I assume the same applies there.
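 The interleaving in the steps above can be made concrete with a small sketch (hypothetical Python, not Cassandra's code): the reader snapshots the row before the write lands and then caches the snapshot, so the cached copy misses the update; merging the memtable into the row before caching, as the attached patch does conceptually, closes the window.
 {code:title=rowcache_race.py|borderStyle=solid}
# Hypothetical models of the cache and memtable; names are illustrative only.
cache = {}
memtable = {}

def write(key, col, value):
    # writes go to the memtable only; no eager cache update (writes stay cheap)
    memtable.setdefault(key, {})[col] = value

def cache_row_racy(key, row_read_earlier):
    # the reader caches whatever it read, even if a write landed in between
    cache[key] = row_read_earlier

def cache_row_merged(key, row_read_earlier):
    # merge current memtable contents into the row before caching
    merged = dict(row_read_earlier)
    merged.update(memtable.get(key, {}))
    cache[key] = merged

# Simulate the race: reader snapshots the row, writer updates, reader caches.
disk_row = {"c1": 1}
snapshot = dict(disk_row)       # reader's getTopLevelColumns-style read
write("A", "c2", 2)             # concurrent write, invisible to the snapshot
cache_row_racy("A", snapshot)
stale = "c2" not in cache["A"]  # the cache misses the update

cache_row_merged("A", snapshot)
fixed = cache["A"]["c2"] == 2   # the merged read sees the update
 {code}
 The simulation forces the interleaving deterministically rather than relying on thread timing, which is why the stress test needed many runs to hit it.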

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206870#comment-13206870
 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
-

Attaching v3. This mostly fixes a bug in the previous version where sentinels 
were not handled correctly in cacheRow(). I've also switched back to 
getRawCachedRow.
I'm not fully sure what you proposed to split exactly, but v3 does split 
cacheRow() in the hope of increasing clarity. 

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday
 Attachments: 3862-v2.patch, 3862.patch, 3862_v3.patch, 
 include_memtables_in_rowcache_read.patch


 While performing stress tests to find any race problems for CASSANDRA-2864 I 
 guess I (re-)found one for the standard on-heap row cache.
 During my stress test I have lots of threads running, some of them only 
 reading, others writing and re-reading the value.
 This seems to happen:
 - Reader tries to read row A for the first time, doing a getTopLevelColumns
 - Row A, which is not in the cache yet, is updated by Writer. The row is not 
 eagerly read during write (because we want fast writes), so the writer cannot 
 perform a cache update
 - Reader puts the row in the cache, which is now missing the update
 I already asked about this some time ago on the mailing list but unfortunately 
 didn't dig further after I got no answer, since I assumed that I had just missed 
 something. In a way I still do, but I haven't found any locking mechanism that 
 makes sure this cannot happen.
 The problem can be reproduced with every run of my stress test. When I 
 restart the server the expected column is there. It's just missing from the 
 cache.
 To test, I have created a patch that merges memtables with the row cache. With 
 the patch the problem is gone.
 I can also reproduce in 0.8. I haven't checked 1.1 but I haven't found any 
 relevant change there either, so I assume the same applies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3843) Unnecessary ReadRepair request during RangeScan

2012-02-13 Thread Jeremy Hanna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206881#comment-13206881
 ] 

Jeremy Hanna commented on CASSANDRA-3843:
-

We'll be upgrading to 1.0.8 as soon as we can, but this seems like a 
significant issue for anyone doing range scans - does it make sense to backport 
to 0.8.x?

 Unnecessary  ReadRepair request during RangeScan
 

 Key: CASSANDRA-3843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3843
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Philip Andronov
Assignee: Jonathan Ellis
 Fix For: 1.0.8

 Attachments: 3843-v2.txt, 3843.txt


 During reads with Quorum level and a replication factor greater than 2, 
 Cassandra sends at least one ReadRepair, even if there is no need to do that. 
 Given that read requests wait until the ReadRepair finishes, this slows 
 down requests a lot, up to the Timeout :(
 It seems that the problem was introduced by CASSANDRA-2494; 
 unfortunately I don't have enough knowledge of Cassandra internals to fix the 
 problem without breaking the CASSANDRA-2494 functionality, so my report comes 
 without a patch.
 Code explanations:
 {code:title=RangeSliceResponseResolver.java|borderStyle=solid}
class RangeSliceResponseResolver {
    // ...
    private class Reducer extends MergeIterator.Reducer<Pair<Row, InetAddress>, Row>
    {
        // ...
        protected Row getReduced()
        {
            ColumnFamily resolved = versions.size() > 1
                                  ? RowRepairResolver.resolveSuperset(versions)
                                  : versions.get(0);
            if (versions.size() < sources.size())
            {
                for (InetAddress source : sources)
                {
                    if (!versionSources.contains(source))
                    {
                        // [PA] Here we are adding a null ColumnFamily.
                        // Later it will be compared with the desired
                        // version and will give us a fake difference which
                        // forces Cassandra to send a ReadRepair to the given source
                        versions.add(null);
                        versionSources.add(source);
                    }
                }
            }
            // ...
            if (resolved != null)
                repairResults.addAll(RowRepairResolver.scheduleRepairs(resolved, table, key,
                                                                       versions, versionSources));
            // ...
        }
    }
}
 {code}
 {code:title=RowRepairResolver.java|borderStyle=solid}
public class RowRepairResolver extends AbstractRowResolver {
    // ...
    public static List<IAsyncResult> scheduleRepairs(ColumnFamily resolved, String table,
                                                     DecoratedKey<?> key,
                                                     List<ColumnFamily> versions,
                                                     List<InetAddress> endpoints)
    {
        List<IAsyncResult> results = new ArrayList<IAsyncResult>(versions.size());
        for (int i = 0; i < versions.size(); i++)
        {
            // On some iterations we have to compare null and resolved, which are
            // obviously not equal, so it will fire a ReadRepair; however it is not
            // needed here
            ColumnFamily diffCf = ColumnFamily.diff(versions.get(i), resolved);
            if (diffCf == null)
                continue;
            // ...
 {code}
 Imagine the following situation:
 NodeA has X.1 // row X with the version 1
 NodeB has X.2 
 NodeC has X.? // Unknown version, but because write was with Quorum it is 1 
 or 2
 During a Quorum read from nodes A and B, Cassandra creates version 12 and 
 sends ReadRepairs, so the nodes now have the following content:
 NodeA has X.12
 NodeB has X.12
 which is correct; however, Cassandra will also fire a ReadRepair to NodeC. There 
 is no need to do that: the next consistent read has a chance to be served by 
 nodes {A, B} (no ReadRepair) or by the pair {?, C}, and in that case a ReadRepair 
 will be fired and bring NodeC to a consistent state.
 Right now we are reading from the Index a lot, and starting from some point in 
 time we get TimeOutExceptions because the cluster is overloaded by the 
 ReadRepair requests *even* if all nodes have the same data :(
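 The effect flagged by the [PA] comments above can be sketched in a few lines (hypothetical Python, not the actual resolver): once a non-responding source is padded with a null version, diff(null, resolved) is always non-empty, so a repair is scheduled for it regardless of whether it is actually stale.
 {code:title=unneeded_repair_sketch.py|borderStyle=solid}
def diff(version, resolved):
    # return the columns `version` is missing relative to `resolved`,
    # or None when there is no difference (mirrors ColumnFamily.diff's contract)
    if version == resolved:
        return None
    if version is None:
        return dict(resolved)  # everything looks missing for a null version
    missing = {k: v for k, v in resolved.items() if version.get(k) != v}
    return missing or None

def schedule_repairs(resolved, versions, sources):
    repairs = []
    for version, source in zip(versions, sources):
        d = diff(version, resolved)
        if d is not None:
            repairs.append((source, d))  # would send a ReadRepair mutation
    return repairs

resolved = {"c1": "v2"}
# Quorum read answered by A and B; C never responded and was padded with None
versions = [{"c1": "v2"}, {"c1": "v2"}, None]
repairs = schedule_repairs(resolved, versions, ["A", "B", "C"])
# C gets a repair even though Cassandra has no evidence it is stale
 {code}
 In the sketch, both responders match the resolved version, yet the padded source still produces a non-null diff and therefore a repair, which matches the behavior reported above.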

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3883) CFIF WideRowIterator only returns batch size columns

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206888#comment-13206888
 ] 

Brandon Williams commented on CASSANDRA-3883:
-

My original description here is incorrect; I can't repro the 198 count (not 
sure what happened there) but now the wide row test counts 1033 'word1' items. 
 As far as I can tell, WordCountSetup actually inserts a total of 2002 'word1' 
matches, one in each of text1 and text2, and a thousand in each of text3 and 
text4.  I'm not sure what is causing the count discrepancy, but in any case 
1033 is far above the batch size of 99, and the 4th word count test using a 
secondary index is counting 197 items, so I think something may be 
fundamentally wrong with word count.

That said, I've been adding wide row support to pig and testing with that, and 
the problem of not being able to completely paginate wide rows is a definite 
problem.

 CFIF WideRowIterator only returns batch size columns
 

 Key: CASSANDRA-3883
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3883
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Brandon Williams
 Fix For: 1.1.0


 Most evident with the word count, where there are 1250 'word1' items in two 
 rows (1000 in one, 250 in another) and it counts 198 with the batch size set 
 to 99.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3894) StorageService.getBootstrapToken() should check all endpoints/tokens for collissions

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3894:
--

Reviewer: thepaul

 StorageService.getBootstrapToken() should check all endpoints/tokens for 
 collissions
 

 Key: CASSANDRA-3894
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3894
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor

 It currently checks all endpoints that are either bootstrapping or part of 
 the endpoint map. That covers leaving nodes, but doesn't cover moving nodes 
 in the sense that the token a node is moving to is not checked.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3900) StorageService.handleStateNormal() should not have to deal with removal of an endpoint from moving

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3900:
--

Reviewer: thepaul

 StorageService.handleStateNormal() should not have to deal with removal of an 
 endpoint from moving
 --

 Key: CASSANDRA-3900
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3900
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor

 Removing an endpoint from the moving endpoints should be internal to 
 TokenMetadata and implied by declaring that an endpoint turned normal. Need 
 to consider whether simply making this change is introducing a bug in some 
 subtle edge case where {{handleStateNormal()}} otherwise ignores the request 
 but we should still remove from moving.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206935#comment-13206935
 ] 

Jonathan Ellis commented on CASSANDRA-3417:
---

I'm getting 3 of 5 hunks failing to apply to 1.0, did you switch branches?

 InvocationTargetException ConcurrentModificationException at startup
 

 Key: CASSANDRA-3417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Joaquin Casares
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.0.8

 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, 
 CASSANDRA-3417-tokenmap-v2.txt, CASSANDRA-3417-tokenmap-v3.txt, 
 CASSANDRA-3417-tokenmap.txt


 I was starting up the new DataStax AMI where the seed starts first and 34 
 nodes would latch on together. So far things have been working decently for 
 launching, but right now I just got this during startup.
 {CODE}
 ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log 
  INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server 
 VM/1.6.0_26
  INFO 09:24:38,456 Heap size: 1936719872/1937768448
  INFO 09:24:38,457 Classpath: 
 /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar
  INFO 09:24:39,891 JNA mlockall successful
  INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml
  INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, 
 indexAccessMode is mmap
  INFO 09:24:40,069 Global memtable threshold is enabled at 616MB
  INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d.
  INFO 09:24:40,475 Creating new commitlog segment 
 /raid0/cassandra/commitlog/CommitLog-1319793880475.log
  INFO 09:24:40,486 Couldn't detect any schema definitions in local storage.
  INFO 09:24:40,486 Found table data in data directories. Consider using the 
 CLI to define your schema.
  INFO 09:24:40,497 No commitlog files found; skipping replay
  INFO 09:24:40,501 Cassandra version: 1.0.0
  INFO 09:24:40,502 Thrift API version: 19.18.0
  INFO 09:24:40,502 Loading persisted ring state
  INFO 09:24:40,506 Starting up server gossip
  INFO 09:24:40,529 Enqueuing flush of 
 Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops)
  INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 
 serialized/live bytes, 4 ops)
  INFO 09:24:40,600 Completed flushing 
 /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes)
  INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east 
 ec2zone=1d
  INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000
  INFO 09:24:40,628 Joining: waiting for ring and schema information
  INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead.
  INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead.
  INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead.
  INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead.
  INFO 09:24:43,394 InetAddress /10.194.22.191 is now dead.
  INFO 09:24:43,395 InetAddress /10.34.74.58 is now dead.
  INFO 09:24:43,395 Node /10.34.33.16 is now part of the cluster
  INFO 09:24:43,396 InetAddress /10.34.33.16 is now UP
  INFO 09:24:43,397 Enqueuing flush of Memtable-LocationInfo@1629818866(20/25 
 serialized/live bytes, 1 ops)
  INFO 09:24:43,398 Writing Memtable-LocationInfo@1629818866(20/25 
 serialized/live bytes, 1 ops)
  INFO 09:24:43,417 Completed flushing 
 /raid0/cassandra/data/system/LocationInfo-h-2-Data.db (74 bytes)
  INFO 

[jira] [Commented] (CASSANDRA-3829) make seeds *only* be seeds, not special in gossip

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206944#comment-13206944
 ] 

Jonathan Ellis commented on CASSANDRA-3829:
---

bq. I propose that the behavior of (1) be removed

Okay, I'm with you so far.  But as you note, this impacts the usability of 
single-node clusters, which is where virtually *everybody* starts.  So, I'll 
need to see a solution that doesn't make life more confusing for that 
overwhelming majority.   I get that you don't like the current tradeoffs but I 
haven't seen a better proposal yet.  (I'll go ahead and pre-emptively -1 special 
environment variables...)

bq. Fixing (2) so that the seed list is reloadable

I still haven't seen a case when this, or special-casing seeds to prevent 
gossip partitions, causes real problems.  Whereas I was around when we added 
the gossip-partition-prevention code, so I *do* know the problems that 
*prevents*.

 make seeds *only* be seeds, not special in gossip 
 --

 Key: CASSANDRA-3829
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3829
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor

 First, a little bit of framing on how seeds work:
 The concept of seed hosts makes fundamental sense; you need to
 seed a new node with some information required in order to join a
 cluster. Seed hosts is the information Cassandra uses for this
 purpose.
 But seed hosts play a role even after the initial start-up of a new
 node in a ring. Specifically, seed hosts continue to be gossiped to
 separately by the Gossiper throughout the life of a node and the
 cluster.
 Generally, operators must be careful to ensure that all nodes in a
 cluster are appropriately configured to refer to an overlapping set of
 seed hosts. Strictly speaking this should not be necessary (see
 further down though), but is the general recommendation. An
 unfortunate side-effect of this is that whenever you are doing ring
 management, such as replacing nodes, removing nodes, etc, you have to
 keep in mind which nodes are seeds.
 For example, if you bring a new node into the cluster, doing
 everything right with token assignment and auto_bootstrap=true, it
 will just enter the cluster without bootstrap - causing inconsistent
 reads. This is dangerous.
 And worse - changing the notion of which nodes are seeds across a
 cluster requires a *rolling restart*. It can be argued that it should
 actually be okay for nodes other than the one being fiddled with to
 incorrectly treat the fiddled-with node as a seed node, but this fact
 is highly opaque to most users that are not intimately familiar with
 Cassandra internals.
 This adds additional complexity to operations, as it introduces a
 reason why you cannot view the ring as completely homogeneous, despite
 the fundamental idea of Cassandra that all nodes should be equal.
 Now, fast forward a bit to what we are doing over here to avoid this
 problem: We have a zookeeper based systems for keeping track of hosts
 in a cluster, which is used by our Cassandra client to discover nodes
 to talk to. This works well.
 In order to avoid the need to manually keep track of seeds, we wanted
 to make seeds be automatically discoverable in order to eliminate as
 an operational concern. We have implemented a seed provider that does
 this for us, based on the data we keep in zookeeper.
 We could see essentially three ways of plugging this in:
 * (1) We could simply rely on not needing overlapping seeds and grab whatever 
 we have when a node starts.
 * (2) We could do something like continually treat all other nodes as seeds 
 by dynamically changing the seed list (involves some other changes, like 
 having the Gossiper update its notion of seeds).
 * (3) We could completely eliminate the use of seeds *except* for the very 
 specific purpose of initial start-up of an unbootstrapped node, and keep 
 using a static (for the duration of the node's uptime) seed list.
 (3) was attractive because it felt like this was the original intent
 of seeds; that they be used for *seeding*, and not be constantly
 required during cluster operation once nodes are already joined.
 Now before I make the suggestion, let me explain how we are currently
 (though not yet in production) handling seeds and start-up.
 First, we have the following relevant cases to consider during a normal 
 start-up:
 * (a) we are starting up a cluster for the very first time
 * (b) we are starting up a new clean node in order to join it to a 
 pre-existing cluster
 * (c) we are starting up a pre-existing already joined node in a pre-existing 
 cluster
 First, we proceeded on the assumption that we wanted to remove the use
 of 

[3/3] git commit: clean up redundant state lookups patch by Dave Brosius; reviewed by jbellis for CASSANDRA-3891

2012-02-13 Thread jbellis
clean up redundant state lookups
patch by Dave Brosius; reviewed by jbellis for CASSANDRA-3891


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9a842c7b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9a842c7b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9a842c7b

Branch: refs/heads/cassandra-1.1
Commit: 9a842c7b317e6f1e6e156ccb531e34bb769c979f
Parents: 6a6bf3c
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 09:57:34 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 09:57:34 2012 -0600

--
 src/java/org/apache/cassandra/db/SystemTable.java  |4 +-
 .../apache/cassandra/thrift/CassandraServer.java   |  121 ---
 2 files changed, 69 insertions(+), 56 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9a842c7b/src/java/org/apache/cassandra/db/SystemTable.java
--
diff --git a/src/java/org/apache/cassandra/db/SystemTable.java 
b/src/java/org/apache/cassandra/db/SystemTable.java
index b0b5c60..3ced0a2 100644
--- a/src/java/org/apache/cassandra/db/SystemTable.java
+++ b/src/java/org/apache/cassandra/db/SystemTable.java
@@ -277,12 +277,12 @@ public class SystemTable
         SortedSet<ByteBuffer> cols = new TreeSet<ByteBuffer>(BytesType.instance);
         cols.add(CLUSTERNAME);
         QueryFilter filter = QueryFilter.getNamesFilter(decorate(LOCATION_KEY), new QueryPath(STATUS_CF), cols);
-        ColumnFamily cf = table.getColumnFamilyStore(STATUS_CF).getColumnFamily(filter);
+        ColumnFamilyStore cfs = table.getColumnFamilyStore(STATUS_CF);
+        ColumnFamily cf = cfs.getColumnFamily(filter);
 
         if (cf == null)
         {
             // this is a brand new node
-            ColumnFamilyStore cfs = table.getColumnFamilyStore(STATUS_CF);
             if (!cfs.getSSTables().isEmpty())
                 throw new ConfigurationException("Found system table files, but they couldn't be loaded!");
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9a842c7b/src/java/org/apache/cassandra/thrift/CassandraServer.java
--
diff --git a/src/java/org/apache/cassandra/thrift/CassandraServer.java 
b/src/java/org/apache/cassandra/thrift/CassandraServer.java
index f30a130..4e141b4 100644
--- a/src/java/org/apache/cassandra/thrift/CassandraServer.java
+++ b/src/java/org/apache/cassandra/thrift/CassandraServer.java
@@ -92,21 +92,16 @@ public class CassandraServer implements Cassandra.Iface
     public ClientState state()
     {
         SocketAddress remoteSocket = SocketSessionManagementService.remoteSocket.get();
-        ClientState retval = null;
-        if (null != remoteSocket)
-        {
-            retval = SocketSessionManagementService.instance.get(remoteSocket);
-            if (null == retval)
-            {
-                retval = new ClientState();
-                SocketSessionManagementService.instance.put(remoteSocket, retval);
-            }
-        }
-        else
+        if (remoteSocket == null)
+            return clientState.get();
+
+        ClientState cState = SocketSessionManagementService.instance.get(remoteSocket);
+        if (cState == null)
         {
-            retval = clientState.get();
+            cState = new ClientState();
+            SocketSessionManagementService.instance.put(remoteSocket, cState);
         }
-        return retval;
+        return cState;
     }
 
     protected Map<DecoratedKey, ColumnFamily> readColumnFamily(List<ReadCommand> commands, ConsistencyLevel consistency_level)
@@ -318,8 +313,9 @@ public class CassandraServer implements Cassandra.Iface
     {
         logger.debug("get_slice");
 
-        state().hasColumnFamilyAccess(column_parent.column_family, Permission.READ);
-        return multigetSliceInternal(state().getKeyspace(), Collections.singletonList(key), column_parent, predicate, consistency_level).get(key);
+        ClientState cState = state();
+        cState.hasColumnFamilyAccess(column_parent.column_family, Permission.READ);
+        return multigetSliceInternal(cState.getKeyspace(), Collections.singletonList(key), column_parent, predicate, consistency_level).get(key);
     }
 
     public Map<ByteBuffer, List<ColumnOrSuperColumn>> multiget_slice(List<ByteBuffer> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level)
@@ -327,8 +323,9 @@ public class CassandraServer implements Cassandra.Iface
     {
         logger.debug("multiget_slice");
 
-        state().hasColumnFamilyAccess(column_parent.column_family, Permission.READ);
-        return multigetSliceInternal(state().getKeyspace(), keys, column_parent, predicate, 

[2/3] git commit: clean up redundant state lookups patch by Dave Brosius; reviewed by jbellis for CASSANDRA-3891

2012-02-13 Thread jbellis
clean up redundant state lookups
patch by Dave Brosius; reviewed by jbellis for CASSANDRA-3891


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9a842c7b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9a842c7b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9a842c7b

Branch: refs/heads/trunk
Commit: 9a842c7b317e6f1e6e156ccb531e34bb769c979f
Parents: 6a6bf3c
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 09:57:34 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 09:57:34 2012 -0600

--
 src/java/org/apache/cassandra/db/SystemTable.java  |4 +-
 .../apache/cassandra/thrift/CassandraServer.java   |  121 ---
 2 files changed, 69 insertions(+), 56 deletions(-)
--



[1/3] git commit: Merge branch 'cassandra-1.1' into trunk

2012-02-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.1 6a6bf3cf1 - 9a842c7b3
  refs/heads/trunk ddc771dc5 - 232da8248


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/232da824
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/232da824
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/232da824

Branch: refs/heads/trunk
Commit: 232da8248072991dc521d1fe579a55679ea6c735
Parents: ddc771d 9a842c7
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 09:57:53 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 09:57:53 2012 -0600

--
 src/java/org/apache/cassandra/db/SystemTable.java  |4 +-
 .../apache/cassandra/thrift/CassandraServer.java   |  121 ---
 2 files changed, 69 insertions(+), 56 deletions(-)
--




[jira] [Updated] (CASSANDRA-3883) CFIF WideRowIterator only returns batch size columns

2012-02-13 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-3883:


Attachment: 3883-v1.txt

v1 isn't perfect but it's a start: if the batch starts on a wide row, we reuse 
the token and iterate until we're done.  Unfortunately, if we don't start on 
one, I'm not sure there's a way to detect that we're in a wide row without 
making an extra RPC against the last row seen every time.
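The per-key paging loop described above can be sketched as follows. This is an illustrative stand-alone model, not the actual CFIF/WideRowIterator code: `slice` stands in for a slice query against one row, and the loop resumes each batch just after the last column seen until a short batch signals the row is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

public class WideRowPagingSketch
{
    // Stand-in for a slice query against one row: returns up to batchSize
    // columns whose "name" is >= from (names modeled as ints here).
    static List<Integer> slice(List<Integer> row, int from, int batchSize)
    {
        List<Integer> out = new ArrayList<>();
        for (int c : row)
            if (c >= from && out.size() < batchSize)
                out.add(c);
        return out;
    }

    // Counts every column of a wide row by repeatedly slicing from just
    // after the last column seen, rather than stopping at one batch.
    static int countAllColumns(List<Integer> row, int batchSize)
    {
        int count = 0, from = Integer.MIN_VALUE;
        while (true)
        {
            List<Integer> batch = slice(row, from, batchSize);
            count += batch.size();
            if (batch.size() < batchSize)
                return count;                       // short batch: row exhausted
            from = batch.get(batch.size() - 1) + 1; // resume after last column
        }
    }

    public static void main(String[] args)
    {
        List<Integer> row = new ArrayList<>();
        for (int i = 0; i < 1250; i++)
            row.add(i);
        // With a batch size of 99, iterating to completion yields all 1250
        // columns instead of stopping after one or two batches.
        System.out.println(countAllColumns(row, 99)); // 1250
    }
}
```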

 CFIF WideRowIterator only returns batch size columns
 

 Key: CASSANDRA-3883
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3883
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Brandon Williams
 Fix For: 1.1.0

 Attachments: 3883-v1.txt


 Most evident with the word count, where there are 1250 'word1' items in two 
 rows (1000 in one, 250 in another) and it counts 198 with the batch size set 
 to 99.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3859) Add Progress Reporting to Cassandra OutputFormats

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206968#comment-13206968
 ] 

Brandon Williams commented on CASSANDRA-3859:
-

Samarth, how is this patch working for you?

 Add Progress Reporting to Cassandra OutputFormats
 -

 Key: CASSANDRA-3859
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3859
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop, Tools
Affects Versions: 1.1.0
Reporter: Samarth Gahire
Assignee: Brandon Williams
Priority: Minor
  Labels: bulkloader, hadoop, mapreduce, sstableloader
 Fix For: 1.1.0

 Attachments: 0001-add-progress-reporting-to-BOF.txt, 
 0002-Add-progress-to-CFOF.txt

   Original Estimate: 48h
  Remaining Estimate: 48h

 When we are using BulkOutputFormat to load data into Cassandra, we should 
 report progress to the Hadoop job from within the SSTable loader: if 
 streaming for a particular task takes a long time and no progress is 
 reported to the job, it may kill the task with a timeout 
 exception. 
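The requested behavior can be sketched like this. The `Progressable` interface below is a stand-in for Hadoop's real `org.apache.hadoop.util.Progressable` (used here so the sketch is self-contained), and the chunked streaming loop is illustrative, not the actual sstableloader code:

```java
public class ProgressReportingSketch
{
    // Stand-in for Hadoop's Progressable; a real OutputFormat would be handed
    // the framework's own object instead of this illustrative interface.
    interface Progressable { void progress(); }

    // Streams `chunks` pieces of data, pinging the reporter once per chunk so
    // the framework sees liveness and does not kill the task on its timeout.
    static int streamWithHeartbeats(int chunks, Progressable reporter)
    {
        int sent = 0;
        for (int i = 0; i < chunks; i++)
        {
            // ... stream one chunk of sstable data to the cluster ...
            reporter.progress(); // heartbeat: resets the task timeout clock
            sent++;
        }
        return sent;
    }

    public static void main(String[] args)
    {
        int[] beats = new int[1];
        int sent = streamWithHeartbeats(5, () -> beats[0]++);
        System.out.println(sent + " chunks, " + beats[0] + " heartbeats");
    }
}
```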





[jira] [Commented] (CASSANDRA-3867) Disablethrift and Enablethrift can leave behind zombie connections on THSHA server

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206971#comment-13206971
 ] 

Brandon Williams commented on CASSANDRA-3867:
-

+1

 Disablethrift and Enablethrift can leave behind zombie connections on THSHA 
 server
 ---

 Key: CASSANDRA-3867
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3867
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Vijay
Assignee: Vijay
 Fix For: 1.0.8

 Attachments: 0001-CASSANDRA-3867.patch


 While doing nodetool disablethrift, we disable the selector threads and close 
 them, but the connections are still active.
 Enablethrift then creates new selector threads because we create a new 
 ThriftServer(), which leaves the old connections as zombies.
 I think the right fix is to call server.interrupt() and then close the 
 connections when they are done selecting.
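The shutdown order suggested above can be sketched generically (this is NOT Cassandra's ThriftServer code; the thread stands in for a selector thread and strings stand in for connections): interrupt the selector, wait for its loop to exit, and only then close the connections it was serving, so a later enable does not orphan them.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectorShutdownSketch
{
    // Returns the connections that were closed after the selector loop exited.
    static List<String> shutdown(List<String> connections)
    {
        List<String> closed = new ArrayList<>();
        Thread selector = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted())
            {
                // ... select() and dispatch requests here ...
            }
            // The loop has exited: now it is safe to close every connection,
            // so none are left as zombies when a new server is created.
            closed.addAll(connections);
        });
        selector.start();
        selector.interrupt(); // disablethrift: stop selecting
        try
        {
            selector.join();  // wait until the selector has finished closing
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return closed;
    }

    public static void main(String[] args)
    {
        System.out.println(shutdown(List.of("conn-1", "conn-2"))); // both closed
    }
}
```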





[jira] [Issue Comment Edited] (CASSANDRA-3740) While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.

2012-02-13 Thread Brandon Williams (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206969#comment-13206969
 ] 

Brandon Williams edited comment on CASSANDRA-3740 at 2/13/12 4:36 PM:
--

Samarth/Erik,

How does this patch look?

  was (Author: brandon.williams):
Samarth/Eric,

How does this patch look?
  
 While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.
 --

 Key: CASSANDRA-3740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Samarth Gahire
Assignee: Brandon Williams
  Labels: cassandra, hadoop, mapreduce
 Fix For: 1.1.0

 Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt, 
 0004-update-BOF-for-new-dir-layout.txt, 0005-BWR-uses-any-if.txt


 I am trying to use BulkOutputFormat to stream data from the map phase of a 
 Hadoop job. I have set the Cassandra-related configuration using ConfigHelper, 
 and from looking at the Cassandra code it seems Cassandra has taken care that 
 it should not look for the cassandra.yaml file.
 But still, when I run the job I get the following error:
 {
 12/01/13 11:30:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
 the arguments. Applications should implement Tool for the same.
 12/01/13 11:30:04 INFO input.FileInputFormat: Total input paths to process : 1
 12/01/13 11:30:04 INFO mapred.JobClient: Running job: job_201201130910_0015
 12/01/13 11:30:05 INFO mapred.JobClient:  map 0% reduce 0%
 12/01/13 11:30:23 INFO mapred.JobClient: Task Id : 
 attempt_201201130910_0015_m_00_0, Status : FAILED
 java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
 attempt_201201130910_0015_m_00_0: Cannot locate cassandra.yaml
 attempt_201201130910_0015_m_00_0: Fatal configuration error; unable to 
 start server.
 }
 Also, how can I make this cassandra.yaml file available to the Hadoop 
 MapReduce job?





[jira] [Commented] (CASSANDRA-3740) While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206969#comment-13206969
 ] 

Brandon Williams commented on CASSANDRA-3740:
-

Samarth/Eric,

How does this patch look?

 While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.
 --

 Key: CASSANDRA-3740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Samarth Gahire
Assignee: Brandon Williams
  Labels: cassandra, hadoop, mapreduce
 Fix For: 1.1.0

 Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt, 
 0004-update-BOF-for-new-dir-layout.txt, 0005-BWR-uses-any-if.txt






[jira] [Updated] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3772:
--

   Reviewer: yukim
Component/s: Core
   Assignee: Dave Brosius

Yuki, can you take a look?

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Dave Brosius
 Fix For: 1.2

 Attachments: try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.
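The cost being targeted can be seen in how a RandomPartitioner-style token is derived: every key pays for a full cryptographic MD5 digest, which a Murmur3-based partitioner would replace with a much cheaper non-cryptographic hash. A minimal sketch, with illustrative names rather than Cassandra's actual classes:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5TokenSketch
{
    // RandomPartitioner-style token: the absolute value of the key's
    // 128-bit MD5 digest, interpreted as a BigInteger.
    static BigInteger md5Token(byte[] key)
    {
        try
        {
            // Each token computation pays for a cryptographic digest,
            // even though only a good output distribution is needed.
            return new BigInteger(MessageDigest.getInstance("MD5").digest(key)).abs();
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e); // MD5 is guaranteed on every JVM
        }
    }

    public static void main(String[] args)
    {
        BigInteger token = md5Token("row-key".getBytes(StandardCharsets.UTF_8));
        System.out.println(token.bitLength() <= 128); // tokens fit in 128 bits
    }
}
```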





[Cassandra Wiki] Update of HadoopSupport by jeremyhanna

2012-02-13 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The HadoopSupport page has been changed by jeremyhanna:
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=46&rev2=47

Comment:
removed redundant hadoop property.

    <value>20</value>
  </property>
  <property>
-   <name>mapred.max.tracker.failures</name>
-   <value>20</value>
- </property>
- <property>
    <name>mapred.map.max.attempts</name>
    <value>20</value>
  </property>


[jira] [Commented] (CASSANDRA-3740) While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.

2012-02-13 Thread Samarth Gahire (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206986#comment-13206986
 ] 

Samarth Gahire commented on CASSANDRA-3740:
---

The first 4 patches are working fine.
About the patch related to CASSANDRA-3839, Erik can explain it properly.

 While using BulkOutputFormat unnecessarily look for the cassandra.yaml file.
 --

 Key: CASSANDRA-3740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Samarth Gahire
Assignee: Brandon Williams
  Labels: cassandra, hadoop, mapreduce
 Fix For: 1.1.0

 Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt, 
 0004-update-BOF-for-new-dir-layout.txt, 0005-BWR-uses-any-if.txt






[jira] [Commented] (CASSANDRA-3712) Can't cleanup after I moved a token.

2012-02-13 Thread Yuki Morishita (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207021#comment-13207021
 ] 

Yuki Morishita commented on CASSANDRA-3712:
---

I ran my unit test enough times and saw no errors.
+1

 Can't cleanup after I moved a token.
 

 Key: CASSANDRA-3712
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3712
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.6.0_26
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
 Ubuntu 10.04.2 LTS 64-Bit
 RAM: 2GB / 1GB free
 Data partition: 80% free on the most used server.
Reporter: Herve Nicol
Assignee: Yuki Morishita
 Fix For: 1.0.8

 Attachments: 0001-Add-flush-and-cleanup-race-test.patch, 
 0002-Acquire-lock-when-updating-index.patch, 3712-v3.txt


 Before cleanup failed, I had moved one node's token.
 My cluster had 10GB of data on 2 nodes. The data distribution was bad: tokens 
 were 165[...] and 155[...].
 I moved 155 to 075[...], then adjusted to 076[...]. The moves were processed 
 correctly, with no exceptions.
 But then, when I wanted to clean up, it failed and keeps failing, on both 
 nodes.
 Other maintenance procedures like repair, compact or scrub work.
 All the data is in the URLs CF.
 nodetool cleanup fails:
 $ ./nodetool --host cnode1 cleanup
 Error occured during cleanup
 java.util.concurrent.ExecutionException: java.lang.AssertionError
  at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
  at java.util.concurrent.FutureTask.get(FutureTask.java:83)
  at 
 org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203)
  at 
 org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237)
  at 
 org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:958)
  at 
 org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1604)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
  at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
  at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
  at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
  at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
  at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
  at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
  at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
  at 
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
  at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
  at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
  at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
  at sun.rmi.transport.Transport$1.run(Transport.java:159)
  at java.security.AccessController.doPrivileged(Native Method)
  at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
  at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
  at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
  at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.AssertionError
  at org.apache.cassandra.db.Memtable.put(Memtable.java:136)
  at 
 org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:780)
  at 
 org.apache.cassandra.db.index.keys.KeysIndex.deleteColumn(KeysIndex.java:82)
  at 
 

[jira] [Updated] (CASSANDRA-2975) Upgrade MurmurHash to version 3

2012-02-13 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2975:
-

Attachment: 0001-CASSANDRA-2975.patch

Attached is the refactor, which includes fixes per the suggestions. I added a 
factory to make adding newer hashes easier but left the legacy hash alone; it 
would be fairly trivial and cleaner if we want to refactor a little more. Let 
me know, thanks! Tests passed, and the long test shows significant 
improvement. Thanks Brian!

 Upgrade MurmurHash to version 3
 ---

 Key: CASSANDRA-2975
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Brian Lindauer
Assignee: Vijay
Priority: Trivial
  Labels: lhf
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-2975.patch, 
 0001-Convert-BloomFilter-to-use-MurmurHash-v3-instead-of-.patch, 
 0002-Backwards-compatibility-with-files-using-Murmur2-blo.patch, 
 Murmur3Benchmark.java


 MurmurHash version 3 was finalized on June 3. It provides an enormous speedup 
 and increased robustness over version 2, which is implemented in Cassandra. 
 Information here:
 http://code.google.com/p/smhasher/
 The reference implementation is here:
 http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136r=136
 I have already done the work to port the (public domain) reference 
 implementation to Java in the MurmurHash class and updated the BloomFilter 
 class to use the new implementation:
 https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
 Apart from the faster hash time, the new version only requires one call to 
 hash() rather than 2, since it returns 128 bits of hash instead of 64.
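The "one call instead of two" point can be illustrated with the standard double-hashing scheme for bloom filters: split the single 128-bit result into two 64-bit halves h1 and h2, then derive every probe position as h1 + i*h2. This sketch is illustrative only; Cassandra's actual BloomFilter code differs in detail.

```java
public class BloomProbesSketch
{
    // Derives k bit positions in [0, numBits) from one 128-bit hash,
    // supplied as its two 64-bit halves (h1, h2).
    static long[] probeIndexes(long h1, long h2, int k, long numBits)
    {
        long[] idx = new long[k];
        for (int i = 0; i < k; i++)
            idx[i] = Math.floorMod(h1 + i * h2, numBits); // always non-negative
        return idx;
    }

    public static void main(String[] args)
    {
        // Pretend (h1, h2) are the halves of one MurmurHash3 128-bit digest;
        // the constants here are arbitrary example values.
        long[] probes = probeIndexes(0x9E3779B97F4A7C15L, 0xC2B2AE3D27D4EB4FL, 5, 1 << 20);
        for (long p : probes)
            System.out.println(p);
    }
}
```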





[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3862:
--

Attachment: 3862-cleanup.txt

Attached cleanup patch that applies on top of v3.  Most of the changes are 
adding docstrings/comments and cleaning up typos.

A minor change to the code was to make cacheRow take just cfId and filter, 
removing the redundant filter.key as a parameter.

I also renamed cacheRow to getThroughCache.  Still not 100% happy with that, 
but my goal is to make the distinction from readAndCache more obvious.

Finally, I've modified the logic in invalidateCachedRow according to the 
reasoning in this comment:

{noformat}
    // This method is used to (1) drop obsolete entries from a copying cache after the row in question was updated
    // and to (2) make sure we're not wasting cache space on rows that don't exist anymore post-compaction.
    // Sentinels complicate this because it means we've caught a read thread in the process of loading
    // the cache, and we don't know (in case 2) if it will do so with rows from before the compaction or after,
    // so we need to loop until the load completes.
{noformat}

(I also negated the loop condition, which looked like an oversight.)

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday
 Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 
 3862_v3.patch, include_memtables_in_rowcache_read.patch


 While performing stress tests to find any race problems for CASSANDRA-2864, I 
 guess I (re-)found one for the standard on-heap row cache.
 During my stress test I have lots of threads running, some of them only 
 reading, others writing and re-reading the value.
 This seems to happen:
 - Reader tries to read row A for the first time doing a getTopLevelColumns
 - Row A, which is not in the cache yet, is updated by Writer. The row is not 
 eagerly read during write (because we want fast writes), so the writer cannot 
 perform a cache update
 - Reader puts the row in the cache, which is now missing the update
 I already asked about this some time ago on the mailing list but unfortunately 
 didn't dig further after I got no answer, since I assumed that I had just 
 missed something. In a way I still do, but I haven't found any locking 
 mechanism that makes sure this cannot happen.
 The problem can be reproduced with every run of my stress test. When I 
 restart the server the expected column is there. It's just missing from the 
 cache.
 To test, I have created a patch that merges memtables with the row cache. With 
 the patch the problem is gone.
 I can also reproduce this in 0.8. I haven't checked 1.1, but I haven't found 
 any relevant change there either, so I assume the same applies there.
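The interleaving described above can be made deterministic in a minimal sketch (this is not Cassandra code; the two maps stand in for the memtable/storage layer and the row cache): the reader snapshots the row before the writer's update, then populates the cache with that stale snapshot, so the cache misses the write until invalidation or restart.

```java
import java.util.concurrent.ConcurrentHashMap;

public class RowCacheRaceSketch
{
    static final ConcurrentHashMap<String, String> storage = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    // Replays the race once, with the writer's update forced between the
    // reader's storage read and its cache population.
    static String raceOnce()
    {
        storage.put("A", "v1");
        String snapshot = storage.get("A"); // (1) reader's getTopLevelColumns-style read
        storage.put("A", "v2");             // (2) writer updates storage; cache untouched
        cache.put("A", snapshot);           // (3) reader caches the now-stale row
        return cache.get("A");
    }

    public static void main(String[] args)
    {
        System.out.println(raceOnce());       // v1: the cache missed the update
        System.out.println(storage.get("A")); // v2: storage holds the new value
    }
}
```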





[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3862:
--

Attachment: (was: 3862-cleanup.txt)

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday
 Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 
 3862_v3.patch, include_memtables_in_rowcache_read.patch






[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3862:
--

Attachment: 3862-cleanup.txt

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday
 Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 
 3862_v3.patch, include_memtables_in_rowcache_read.patch






[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3862:
--

 Reviewer: jbellis
Affects Version/s: (was: 1.0.7)
   0.6
Fix Version/s: 1.1.0
 Assignee: Sylvain Lebresne

 RowCache misses Updates
 ---

 Key: CASSANDRA-3862
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.6
Reporter: Daniel Doubleday
Assignee: Sylvain Lebresne
 Fix For: 1.1.0

 Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 
 3862_v3.patch, include_memtables_in_rowcache_read.patch


 While performing stress tests to find any race problems for CASSANDRA-2864 I 
 guess I (re-)found one for the standard on-heap row cache.
 During my stress test I have lots of threads running, some of them only 
 reading, others writing and re-reading the value.
 This seems to happen:
 - Reader tries to read row A for the first time doing a getTopLevelColumns
 - Row A, which is not yet in the cache, is updated by the Writer. The row is 
 not eagerly read during the write (because we want fast writes), so the 
 writer cannot perform a cache update.
 - Reader puts the row in the cache, which is now missing the update.
 I already asked about this some time ago on the mailing list but 
 unfortunately didn't dig further after I got no answer, since I assumed I 
 had just missed something. In a way I still do, but I haven't found any 
 locking mechanism that makes sure this cannot happen.
 The problem can be reproduced with every run of my stress test. When I 
 restart the server the expected column is there. It's just missing from the 
 cache.
 To test I have created a patch that merges memtables with the row cache. With 
 the patch the problem is gone.
 I can also reproduce this in 0.8. I haven't checked 1.1, but I haven't found 
 any relevant change there either, so I assume the same applies there.
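 The race described above can be modeled in a few lines. The following is a
 hypothetical Python sketch (the `Store` class and `racy_read` helper are
 illustrative, not Cassandra code): the reader snapshots the row before a
 concurrent write, then populates the cache, so the cached copy misses the
 update; merging memtables into the read path, as the attached patch does,
 restores the authoritative view.

```python
class Store:
    def __init__(self):
        self.sstable = {"A": {"col": 1}}   # on-disk data
        self.memtable = {}                 # recent writes
        self.row_cache = {}

    def write(self, key, cols):
        # fast write path: no read-before-write, so no cache update
        self.memtable.setdefault(key, {}).update(cols)

    def read_uncached(self, key):
        # authoritative read: sstable merged with memtable
        row = dict(self.sstable.get(key, {}))
        row.update(self.memtable.get(key, {}))
        return row

def racy_read(store, key):
    # 1. reader collects the row from disk (cache miss)
    snapshot = dict(store.sstable.get(key, {}))
    # 2. writer updates concurrently
    store.write(key, {"col": 2})
    # 3. reader caches its stale snapshot -> update is lost from the cache
    store.row_cache[key] = snapshot
    return store.row_cache[key]

s = Store()
assert racy_read(s, "A") == {"col": 1}       # cache misses the update
assert s.read_uncached("A") == {"col": 2}    # disk + memtable has it
```

 After a restart the cache is repopulated from the authoritative merge, which
 matches the observation that the missing column reappears.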

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup

2012-02-13 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3417:
--

Attachment: CASSANDRA-3417-tokenmap-1.0-v1.txt

Attaching {{CASSANDRA\-3417\-tokenmap\-1.0\-v1.txt}} which is for 1.0.

Apologies for the confusion; I only ever triggered and tested this on 1.1/trunk 
since that's what I was testing, despite this bug originally being against 1.0.

I haven't done real testing with this patch for 1.0. Right now I can't use the 
cluster I was testing with to easily go to 1.0 to test either. But, the fix 
seems correct to me regardless of branch given that the iteration is clearly 
over a map that is getting modified. The biggest risk is a typo or similar 
mistake which is more easily spotted by review anyway.


 InvocationTargetException ConcurrentModificationException at startup
 

 Key: CASSANDRA-3417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Joaquin Casares
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.0.8

 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, 
 CASSANDRA-3417-tokenmap-1.0-v1.txt, CASSANDRA-3417-tokenmap-v2.txt, 
 CASSANDRA-3417-tokenmap-v3.txt, CASSANDRA-3417-tokenmap.txt


 I was starting up the new DataStax AMI where the seed starts first and 34 
 nodes would latch on together. So far things have been working decently for 
 launching, but right now I just got this during startup.
 {CODE}
 ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log 
  INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server 
 VM/1.6.0_26
  INFO 09:24:38,456 Heap size: 1936719872/1937768448
  INFO 09:24:38,457 Classpath: 
 /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar
  INFO 09:24:39,891 JNA mlockall successful
  INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml
  INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, 
 indexAccessMode is mmap
  INFO 09:24:40,069 Global memtable threshold is enabled at 616MB
  INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d.
  INFO 09:24:40,475 Creating new commitlog segment 
 /raid0/cassandra/commitlog/CommitLog-1319793880475.log
  INFO 09:24:40,486 Couldn't detect any schema definitions in local storage.
  INFO 09:24:40,486 Found table data in data directories. Consider using the 
 CLI to define your schema.
  INFO 09:24:40,497 No commitlog files found; skipping replay
  INFO 09:24:40,501 Cassandra version: 1.0.0
  INFO 09:24:40,502 Thrift API version: 19.18.0
  INFO 09:24:40,502 Loading persisted ring state
  INFO 09:24:40,506 Starting up server gossip
  INFO 09:24:40,529 Enqueuing flush of 
 Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops)
  INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 
 serialized/live bytes, 4 ops)
  INFO 09:24:40,600 Completed flushing 
 /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes)
  INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east 
 ec2zone=1d
  INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000
  INFO 09:24:40,628 Joining: waiting for ring and schema information
  INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead.
  INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead.
  INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead.
  INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead.
  

[jira] [Updated] (CASSANDRA-2975) Upgrade MurmurHash to version 3

2012-02-13 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2975:
-

Attachment: 0001-CASSANDRA-2975.patch

updating the patch because old one missed the new files created.

 Upgrade MurmurHash to version 3
 ---

 Key: CASSANDRA-2975
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Brian Lindauer
Assignee: Vijay
Priority: Trivial
  Labels: lhf
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-2975.patch, 
 0001-Convert-BloomFilter-to-use-MurmurHash-v3-instead-of-.patch, 
 0002-Backwards-compatibility-with-files-using-Murmur2-blo.patch, 
 Murmur3Benchmark.java


 MurmurHash version 3 was finalized on June 3. It provides an enormous speedup 
 and increased robustness over version 2, which is implemented in Cassandra. 
 Information here:
 http://code.google.com/p/smhasher/
 The reference implementation is here:
 http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
 I have already done the work to port the (public domain) reference 
 implementation to Java in the MurmurHash class and updated the BloomFilter 
 class to use the new implementation:
 https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
 Apart from the faster hash time, the new version only requires one call to 
 hash() rather than 2, since it returns 128 bits of hash instead of 64.
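 As a rough illustration of why one 128-bit result can replace two hash()
 calls, here is a hedged Python sketch (MD5 stands in for MurmurHash3, and
 `bloom_indexes` is a hypothetical helper, not Cassandra's API): the two
 64-bit halves of the digest are combined via double hashing to derive all k
 bloom-filter indexes from a single hash invocation.

```python
import hashlib

def bloom_indexes(key: bytes, k: int, m: int):
    """Derive k bloom-filter bit indexes (mod m) from one 128-bit hash."""
    digest = hashlib.md5(key).digest()        # stand-in for a 128-bit MurmurHash3
    h1 = int.from_bytes(digest[:8], "little")
    h2 = int.from_bytes(digest[8:], "little")
    # double hashing: g_i(x) = h1 + i*h2 yields k independent-enough hashes
    return [(h1 + i * h2) % m for i in range(k)]

idx = bloom_indexes(b"row-key", k=5, m=1 << 20)
assert len(idx) == 5 and all(0 <= i < (1 << 20) for i in idx)
```

 The design point is that the per-key cost drops to one hash computation
 regardless of k, rather than scaling with the number of filter probes.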

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2975) Upgrade MurmurHash to version 3

2012-02-13 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-2975:
-

Attachment: (was: 0001-CASSANDRA-2975.patch)

 Upgrade MurmurHash to version 3
 ---

 Key: CASSANDRA-2975
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Brian Lindauer
Assignee: Vijay
Priority: Trivial
  Labels: lhf
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-2975.patch, 
 0001-Convert-BloomFilter-to-use-MurmurHash-v3-instead-of-.patch, 
 0002-Backwards-compatibility-with-files-using-Murmur2-blo.patch, 
 Murmur3Benchmark.java


 MurmurHash version 3 was finalized on June 3. It provides an enormous speedup 
 and increased robustness over version 2, which is implemented in Cassandra. 
 Information here:
 http://code.google.com/p/smhasher/
 The reference implementation is here:
 http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
 I have already done the work to port the (public domain) reference 
 implementation to Java in the MurmurHash class and updated the BloomFilter 
 class to use the new implementation:
 https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
 Apart from the faster hash time, the new version only requires one call to 
 hash() rather than 2, since it returns 128 bits of hash instead of 64.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3843) Unnecessary ReadRepair request during RangeScan

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207046#comment-13207046
 ] 

Jonathan Ellis commented on CASSANDRA-3843:
---

It's a relatively small patch, but StorageProxy and its callbacks can be 
fragile...  I almost didn't commit it to 1.0 either.  Tell you what though, 
I'll post a backported patch here and if you want you can run with it. :)

 Unnecessary  ReadRepair request during RangeScan
 

 Key: CASSANDRA-3843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3843
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Philip Andronov
Assignee: Jonathan Ellis
 Fix For: 1.0.8

 Attachments: 3843-v2.txt, 3843.txt


 During reads at Quorum level with a replication factor greater than 2, 
 Cassandra sends at least one ReadRepair even when there is no need to. 
 Since read requests wait until the ReadRepair finishes, this slows down 
 requests a lot, up to a Timeout :(
 It seems the problem was introduced by CASSANDRA-2494; unfortunately I do 
 not have enough knowledge of Cassandra internals to fix it without breaking 
 the CASSANDRA-2494 functionality, so my report comes without a patch.
 Code explanations:
 {code:title=RangeSliceResponseResolver.java|borderStyle=solid}
 class RangeSliceResponseResolver {
     // ...
     private class Reducer extends MergeIterator.Reducer<Pair<Row, InetAddress>, Row>
     {
         // ...
         protected Row getReduced()
         {
             ColumnFamily resolved = versions.size() > 1
                                     ? RowRepairResolver.resolveSuperset(versions)
                                     : versions.get(0);
             if (versions.size() < sources.size())
             {
                 for (InetAddress source : sources)
                 {
                     if (!versionSources.contains(source))
                     {
                         // [PA] Here we are adding a null ColumnFamily.
                         // Later it will be compared with the resolved
                         // version and will give us a fake difference, which
                         // forces Cassandra to send a ReadRepair to the given source
                         versions.add(null);
                         versionSources.add(source);
                     }
                 }
             }
             // ...
             if (resolved != null)
                 repairResults.addAll(RowRepairResolver.scheduleRepairs(resolved, table, key,
                                                                        versions, versionSources));
             // ...
         }
     }
 }
 {code}
 {code:title=RowRepairResolver.java|borderStyle=solid}
 public class RowRepairResolver extends AbstractRowResolver {
     // ...
     public static List<IAsyncResult> scheduleRepairs(ColumnFamily resolved, String table,
                                                      DecoratedKey<?> key,
                                                      List<ColumnFamily> versions,
                                                      List<InetAddress> endpoints)
     {
         List<IAsyncResult> results = new ArrayList<IAsyncResult>(versions.size());
         for (int i = 0; i < versions.size(); i++)
         {
             // On some iteration we have to compare null and resolved, which are
             // obviously not equal, so it will fire a ReadRepair; however it is
             // not needed here
             ColumnFamily diffCf = ColumnFamily.diff(versions.get(i), resolved);
             if (diffCf == null)
                 continue;
             // ...
 {code}
 Imagine the following situation:
 NodeA has X.1 // row X with version 1
 NodeB has X.2
 NodeC has X.? // unknown version, but because the write was at Quorum it is 1 or 2
 During the Quorum read from nodes A and B, Cassandra creates version 12 and 
 sends ReadRepair, so the nodes now have the following content:
 NodeA has X.12
 NodeB has X.12
 which is correct. However, Cassandra will also fire a ReadRepair to NodeC. 
 There is no need to do that: the next consistent read will either be served 
 by nodes {A, B} (no ReadRepair) or by the pair {?, C}, in which case a 
 ReadRepair will be fired and will bring NodeC to a consistent state.
 Right now we are reading from the index a lot, and starting from some point 
 in time we get TimeoutException because the cluster is overloaded by 
 ReadRepair requests *even* when all nodes have the same data :(
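 The effect of the null padding can be shown with a small hypothetical Python
 model (the `diff` function below is a simplification, not `ColumnFamily.diff`):
 two matching versions produce no repair, while the padded null always yields a
 spurious difference against the resolved superset.

```python
def diff(version, resolved):
    """Return the columns 'resolved' has that 'version' lacks (None if none)."""
    if version == resolved:
        return None
    missing = {k: v for k, v in resolved.items()
               if version is None or version.get(k) != v}
    return missing or None

resolved = {"col": "X.12"}
versions = [{"col": "X.12"}, {"col": "X.12"}]  # replies from NodeA and NodeB

# Padding a null for the node that was never queried (NodeC, per the [PA] comment):
versions.append(None)

# Only the padded null shows a "difference", triggering the needless repair.
repairs = [i for i, v in enumerate(versions) if diff(v, resolved) is not None]
assert repairs == [2]
```

 This matches the report: every range read that pads a null schedules a
 repair message even when all queried replicas already agree.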

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3843) Unnecessary ReadRepair request during RangeScan

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207050#comment-13207050
 ] 

Jonathan Ellis commented on CASSANDRA-3843:
---

Looks to me like the 1.0 code changes from v2 apply cleanly to 0.8.  (CHANGES 
diff does not apply but can be ignored.)

 Unnecessary  ReadRepair request during RangeScan
 

 Key: CASSANDRA-3843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3843
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Philip Andronov
Assignee: Jonathan Ellis
 Fix For: 1.0.8

 Attachments: 3843-v2.txt, 3843.txt


 During reads at Quorum level with a replication factor greater than 2, 
 Cassandra sends at least one ReadRepair even when there is no need to. 
 Since read requests wait until the ReadRepair finishes, this slows down 
 requests a lot, up to a Timeout :(
 It seems the problem was introduced by CASSANDRA-2494; unfortunately I do 
 not have enough knowledge of Cassandra internals to fix it without breaking 
 the CASSANDRA-2494 functionality, so my report comes without a patch.
 Code explanations:
 {code:title=RangeSliceResponseResolver.java|borderStyle=solid}
 class RangeSliceResponseResolver {
     // ...
     private class Reducer extends MergeIterator.Reducer<Pair<Row, InetAddress>, Row>
     {
         // ...
         protected Row getReduced()
         {
             ColumnFamily resolved = versions.size() > 1
                                     ? RowRepairResolver.resolveSuperset(versions)
                                     : versions.get(0);
             if (versions.size() < sources.size())
             {
                 for (InetAddress source : sources)
                 {
                     if (!versionSources.contains(source))
                     {
                         // [PA] Here we are adding a null ColumnFamily.
                         // Later it will be compared with the resolved
                         // version and will give us a fake difference, which
                         // forces Cassandra to send a ReadRepair to the given source
                         versions.add(null);
                         versionSources.add(source);
                     }
                 }
             }
             // ...
             if (resolved != null)
                 repairResults.addAll(RowRepairResolver.scheduleRepairs(resolved, table, key,
                                                                        versions, versionSources));
             // ...
         }
     }
 }
 {code}
 {code:title=RowRepairResolver.java|borderStyle=solid}
 public class RowRepairResolver extends AbstractRowResolver {
     // ...
     public static List<IAsyncResult> scheduleRepairs(ColumnFamily resolved, String table,
                                                      DecoratedKey<?> key,
                                                      List<ColumnFamily> versions,
                                                      List<InetAddress> endpoints)
     {
         List<IAsyncResult> results = new ArrayList<IAsyncResult>(versions.size());
         for (int i = 0; i < versions.size(); i++)
         {
             // On some iteration we have to compare null and resolved, which are
             // obviously not equal, so it will fire a ReadRepair; however it is
             // not needed here
             ColumnFamily diffCf = ColumnFamily.diff(versions.get(i), resolved);
             if (diffCf == null)
                 continue;
             // ...
 {code}
 Imagine the following situation:
 NodeA has X.1 // row X with version 1
 NodeB has X.2
 NodeC has X.? // unknown version, but because the write was at Quorum it is 1 or 2
 During the Quorum read from nodes A and B, Cassandra creates version 12 and 
 sends ReadRepair, so the nodes now have the following content:
 NodeA has X.12
 NodeB has X.12
 which is correct. However, Cassandra will also fire a ReadRepair to NodeC. 
 There is no need to do that: the next consistent read will either be served 
 by nodes {A, B} (no ReadRepair) or by the pair {?, C}, in which case a 
 ReadRepair will be fired and will bring NodeC to a consistent state.
 Right now we are reading from the index a lot, and starting from some point 
 in time we get TimeoutException because the cluster is overloaded by 
 ReadRepair requests *even* when all nodes have the same data :(

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3883) CFIF WideRowIterator only returns batch size columns

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207055#comment-13207055
 ] 

Jonathan Ellis commented on CASSANDRA-3883:
---

bq. Unfortunately if we don't start on one, I'm not sure if there's a way to 
detect that we're in a wide row without making an extra rpc against the last 
row seen every time.

If we can easily address this w/ some extra logic in get_paged_slice then 
great, otherwise doing one extra rpc call out of (split size * rows per split) 
doesn't seem like a big deal to me.

 CFIF WideRowIterator only returns batch size columns
 

 Key: CASSANDRA-3883
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3883
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Brandon Williams
 Fix For: 1.1.0

 Attachments: 3883-v1.txt


 Most evident with the word count, where there are 1250 'word1' items in two 
 rows (1000 in one, 250 in another) and it counts 198 with the batch size set 
 to 99.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3883) CFIF WideRowIterator only returns batch size columns

2012-02-13 Thread Jonathan Ellis (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207055#comment-13207055
 ] 

Jonathan Ellis edited comment on CASSANDRA-3883 at 2/13/12 6:41 PM:


bq. Unfortunately if we don't start on one, I'm not sure if there's a way to 
detect that we're in a wide row without making an extra rpc against the last 
row seen every time.

If we can easily address this w/ some extra logic in get_paged_slice then 
great, otherwise doing one extra rpc call out of (split size * pages per row in 
split) doesn't seem like a big deal to me.

  was (Author: jbellis):
bq. Unfortunately if we don't start on one, I'm not sure if there's a way 
to detect that we're in a wide row without making an extra rpc against the last 
row seen every time.

If we can easily address this w/ some extra logic in get_paged_slice then 
great, otherwise doing one extra rpc call out of (split size * rows per split) 
doesn't seem like a big deal to me.
  
 CFIF WideRowIterator only returns batch size columns
 

 Key: CASSANDRA-3883
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3883
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Brandon Williams
 Fix For: 1.1.0

 Attachments: 3883-v1.txt


 Most evident with the word count, where there are 1250 'word1' items in two 
 rows (1000 in one, 250 in another) and it counts 198 with the batch size set 
 to 99.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3883) CFIF WideRowIterator only returns batch size columns

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3883:
--

Reviewer: tjake
Assignee: Brandon Williams

 CFIF WideRowIterator only returns batch size columns
 

 Key: CASSANDRA-3883
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3883
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.1.0
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.1.0

 Attachments: 3883-v1.txt


 Most evident with the word count, where there are 1250 'word1' items in two 
 rows (1000 in one, 250 in another) and it counts 198 with the batch size set 
 to 99.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207060#comment-13207060
 ] 

Peter Schuller commented on CASSANDRA-3830:
---

{quote}
What I meant to say is this is the only special-case for seeds; gossiping to at 
least one seed every round is the normal case, as you said.
{quote}

Ah. So what I mean by gossip being a special case, is the fact that we have the 
gossip-to-seed logic at all. Part of the core aspects of gossip is the 
propagation delay and whether and to what extent it is affected by things like 
cluster size. My concern is that all production clusters that follow the 
recommendation w.r.t. seeds are potentially working well only because we are 
gossiping to seeds. It's trivial to see that if we have a bunch of N servers 
all gossiping to a small set of 2-4 servers, propagation delay is not going 
to be a major problem as long as at least one of those is up.

Anyways, I'll try to get to graphing average propagation delay as a function of 
cluster size (along with p99:s or something) and see if there seems to be a 
correlation or not.

 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window (of
 1000 currently) intervals of heartbeat for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at the end about the case w/o
 seeds not being continuously tested). Normally, due to gossip to seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into the down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 both cases, the mean interval is around 1500 milliseconds.
 I don't feel I have a good grasp of whether this is incidental or
 guaranteed, and it would be good to at least empirically test
 propagation time w/o seeds at different cluster sizes; it's supposed
 to be unaffected by cluster size ({{RING_DELAY}} is static for this
 reason, is my understanding). It would be nice to see that this is the case.
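 The scoring described above reduces to a few lines. A minimal Python sketch,
 assuming a simplified model (no scaling factor, and none of the edge cases
 the real FailureDetector handles); the class name is hypothetical:

```python
from collections import deque

class SimpleFailureDetector:
    def __init__(self, window=1000):
        self.intervals = deque(maxlen=window)  # sliding window of heartbeat gaps
        self.last_heartbeat = None

    def report(self, now):
        # record the interval since the previous heartbeat
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        # score = time since last heartbeat / mean interval in the window
        mean = sum(self.intervals) / len(self.intervals)
        return (now - self.last_heartbeat) / mean

fd = SimpleFailureDetector()
for t in range(0, 10_000, 1000):   # heartbeats every 1000 ms
    fd.report(t)
assert fd.phi(10_000) == 1.0       # exactly one mean interval of silence
assert fd.phi(20_000) > 8          # long silence -> high phi -> convicted
```

 This makes the concern concrete: anything that shifts the mean interval in
 the window (such as losing the gossip-to-seed fast path) shifts every phi
 score, without the node itself being any less healthy.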

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207061#comment-13207061
 ] 

Peter Schuller commented on CASSANDRA-3830:
---

To clarify, the relation to failure detector isn't the absolute propagation 
delay - I am concerned with a sudden *change* in propagation delay (either 
average or outliers).

 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window (of
 1000 currently) intervals of heartbeat for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at the end about the case w/o
 seeds not being continuously tested). Normally, due to gossip to seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into the down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 both cases, the mean interval is around 1500 milliseconds.
 I don't feel I have a good grasp of whether this is incidental or
 guaranteed, and it would be good to at least empirically test
 propagation time w/o seeds at different cluster sizes; it's supposed
 to be unaffected by cluster size ({{RING_DELAY}} is static for this
 reason, is my understanding). It would be nice to see that this is the case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3829) make seeds *only* be seeds, not special in gossip

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207084#comment-13207084
 ] 

Peter Schuller commented on CASSANDRA-3829:
---

{quote}
Okay, I'm with you so far. But as you note, this impacts the usability of 
single-node clusters, which is where virtually everybody starts. So, I'll need 
to see a solution that doesn't make life more confusing for that overwhelming 
majority. I get that you don't like the current tradeoffs, but I haven't seen a 
better proposal yet. (I'll go ahead and pre-emptively -1 special environment 
variables...)
{quote}

I haven't been able to come up with a solution that avoids the initial setup 
requiring special actions. While I am personally fine with this (any software 
that doesn't require them would make me wonder: what if this wasn't an initial 
setup?), I understand that 99% of users would probably not be fond of this 
behavior and it would just turn people off of Cassandra.

So, what about an opt-in setting which explicitly says the inverse - this *is* 
a production cluster that is not being set up? The recommendation could be that 
everyone uses this setting after a cluster is in production, but things keep 
working if they don't (subject to the risks associated with re-bootstrapping 
someone on the seed list, a problem we already have).

This could be either a {{cassandra.yaml}} option or, if that is deemed too 
visible/confusing, a not-so-prominently-documented environment variable. 
However, if a documented {{cassandra.yaml}} option in the default config is not 
acceptable, I think I'd still prefer a {{cassandra.yaml}} setting that isn't in 
the default configuration over an environment variable.

(This is another case where it doesn't really matter *to me*. We can easily 
just patch in the env variable and run with it on our end, it's not like that 
patch will be a maintenance problem for us. I really just want to try to make 
this safer for all users.)

{quote}
I still haven't seen a case when this, or special-casing seeds to prevent 
gossip partitions, causes real problems. Whereas I was around when we added the 
gossip-partition-prevention code, so I do know the problems that prevents.
{quote}

Jumping into clusters/rolling restarts:

So I can give anecdotal stories about seeing people, multiple times, being 
unaware and/or confused about a node jumping into a cluster without 
bootstrapping and not realizing what's going on, or tell you that a long time 
ago before I knew enough about gossip I was feeling the pains of rolling 
restarts whenever maintenance was done on clusters.

But in this case it seems better to just have it flow from actual facts because 
it's not really that subjective. Consider the combination of:

* Restarts are in fact required to change seeds.
* A restart can easily be very very slow due to index sampling (until the 
samples-on-disk patch is in), row cache pre-load, commit log replay (not if you 
drained properly though), etc.
* A restart can also be problematic if it e.g. causes page cache eviction and 
thus necessitates rate limiting rolling restarts.
* Completing a rolling restart in a safe manner can be blocked by pre-existing 
nodes being down in the cluster (e.g., RF=3 QUORUM, one node already down - 
can't restart its neighbors).
* In addition, all forms of restart carry with them some risk, even if we were 
to only consider the risk involved in terms of adding additional windows of 
potential double failures.

Having to do a full rolling restart on a production cluster, particularly if 
the cluster has a lot of data (meaning slower restarts, more sensitivity to 
page caches, etc.), is a *huge* operation to do just because you needed to e.g. 
replace a broken disk and re-bootstrap a node that just happened to be a 
seed. And clearly, the probability that *some* other node in the cluster is 
currently down for whatever reason in a large cluster is non-trivial, and would 
prevent completing a rolling restart.

Of course one might again argue that there is no real need to be that strict 
about maintaining the seed list, but again the circumstances under which this 
is safe are very opaque to people not intimately familiar with the code - and 
not being strict about it rather takes away the protection against partitions 
it was supposed to give you from the start.

So, while I realize changing the role of seeds is more controversial, I have a 
hard time seeing how allowing the seed list to be reloadable is not an obvious 
improvement. Pushing a .yaml configuration file vs. a *complete rolling restart 
of the entire cluster* - that's a huge difference in impact, effort and risk 
for most production clusters.


 make seeds *only* be seeds, not special in gossip 
 --

 Key: 

[jira] [Created] (CASSANDRA-3903) Intermittent unexpected errors: possibly race condition around CQL parser?

2012-02-13 Thread paul cannon (Created) (JIRA)
Intermittent unexpected errors: possibly race condition around CQL parser?
--

 Key: CASSANDRA-3903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
 Environment: Mac OS X 10.7 with Sun/Oracle Java 1.6.0_29
Debian GNU/Linux 6.0.3 (squeeze) with Sun/Oracle Java 1.6.0_26

several recent commits on cassandra-1.1 branch. at least:

0183dc0b36e684082832de43a21b3dc0a9716d48, 
3eefbac133c838db46faa6a91ba1f114192557ae, 
9a842c7b317e6f1e6e156ccb531e34bb769c979f

Running cassandra under ccm with one node
Reporter: paul cannon


When running multiple simultaneous instances of the test_cql.py piece of the 
python-cql test suite, I can reliably reproduce intermittent and unpredictable 
errors in the tests.

The failures often occur at the point of keyspace creation during test setup, 
with a CQL statement of the form:

{code}
CREATE KEYSPACE 'asnvzpot' WITH strategy_class = SimpleStrategy
AND strategy_options:replication_factor = 1

{code}

An InvalidRequestException is returned to the cql driver, which re-raises it as 
a cql.ProgrammingError. The message:

{code}
ProgrammingError: Bad Request: line 2:24 no viable alternative at input 
'asnvzpot'
{code}

In a few cases, Cassandra threw an ArrayIndexOutOfBoundsException and this 
traceback, closing the thrift connection:

{code}
ERROR [Thrift:244] 2012-02-10 15:51:46,815 CustomTThreadPoolServer.java (line 
205) Error occurred during processing of message.
java.lang.ArrayIndexOutOfBoundsException: 7
at 
org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:1520)
at 
org.apache.cassandra.thrift.ThriftValidation.validateCfDef(ThriftValidation.java:634)
at 
org.apache.cassandra.cql.QueryProcessor.processStatement(QueryProcessor.java:744)
at 
org.apache.cassandra.cql.QueryProcessor.process(QueryProcessor.java:898)
at 
org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1245)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3458)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3446)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
{code}

Sometimes I see an ArrayIndexOutOfBoundsException with no traceback:

{code}
ERROR [Thrift:858] 2012-02-13 12:04:01,537 CustomTThreadPoolServer.java (line 
205) Error occurred during processing of message.
java.lang.ArrayIndexOutOfBoundsException
{code}

Sometimes I get this:

{code}
ERROR [MigrationStage:1] 2012-02-13 12:04:46,077 AbstractCassandraDaemon.java 
(line 134) Fatal exception in thread Thread[MigrationStage:1,5,main]
java.lang.IllegalArgumentException: value already present: 1558
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at 
com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
at org.apache.cassandra.config.Schema.load(Schema.java:392)
at 
org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:284)
at 
org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:209)
at 
org.apache.cassandra.db.migration.AddColumnFamily.applyImpl(AddColumnFamily.java:49)
at org.apache.cassandra.db.migration.Migration.apply(Migration.java:66)
at 
org.apache.cassandra.cql.QueryProcessor$1.call(QueryProcessor.java:334)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{code}

Again, around 99% of the instances of this {{CREATE KEYSPACE}} statement work 
fine, so it's a little hard to git bisect out, but I guess I'll see what I can 
do.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207110#comment-13207110
 ] 

Brandon Williams commented on CASSANDRA-3830:
-

CASSANDRA-617 may be of interest then (though this is when gossip was old and 
busted; udp and whatnot)

bq. It's trivial to see that if we have a bunch of N servers all gossiping to a 
small set of 2-4 servers, propagation delay is not going to be a major problem 
as long as at least one of those are up

Right, gossiping to a seed every round actually becomes a bit of an 
optimization in this regard, but isn't strictly necessary.

 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window (of
 1000 intervals, currently) of heartbeats for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
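
As a rough sketch of the scoring just described (hypothetical names; the real logic lives in FailureDetector.interpret(), and the exact scaling constant here is an assumption, a placeholder for the historical factor):

```python
def phi(now_ms, last_heartbeat_ms, intervals_ms, scale=0.434):
    # Score = time since the last heartbeat, over the *mean* interval in
    # the sliding window, times a scaling factor applied before the phi
    # conviction threshold check (0.434 is an assumed placeholder).
    mean = sum(intervals_ms) / float(len(intervals_ms))
    return scale * (now_ms - last_heartbeat_ms) / mean

# 1500 ms of silence against a 1500 ms mean interval scores well below
# the conviction threshold; the host is only convicted when the silence
# grows to many multiples of the mean.
score = phi(3000, 1500, [1000, 2000])
```

The host is marked down once the score exceeds the configured phi conviction threshold.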
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which the
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at end about the case w/o seeds
 not being continuously tested). Normally, due to gossip-to-seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 both cases, the mean interval is around 1500 milliseconds.
 I don't feel I have a good grasp of whether this is incidental or
 guaranteed, and it would be good to at least empirically test
 propagation time w/o seeds at different cluster sizes; it's supposed
 to be unaffected by cluster size ({{RING_DELAY}} is static for this
 reason, is my understanding). Would be nice to see this be the case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3903) Intermittent unexpected errors: possibly race condition around CQL parser?

2012-02-13 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207115#comment-13207115
 ] 

paul cannon commented on CASSANDRA-3903:


I should mention that I adjusted the python-cql tests to be able to run cleanly 
in parallel, in the parallel-tests branch.

 Intermittent unexpected errors: possibly race condition around CQL parser?
 --

 Key: CASSANDRA-3903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
 Environment: Mac OS X 10.7 with Sun/Oracle Java 1.6.0_29
 Debian GNU/Linux 6.0.3 (squeeze) with Sun/Oracle Java 1.6.0_26
 several recent commits on cassandra-1.1 branch. at least:
 0183dc0b36e684082832de43a21b3dc0a9716d48, 
 3eefbac133c838db46faa6a91ba1f114192557ae, 
 9a842c7b317e6f1e6e156ccb531e34bb769c979f
 Running cassandra under ccm with one node
Reporter: paul cannon

 When running multiple simultaneous instances of the test_cql.py piece of the 
 python-cql test suite, I can reliably reproduce intermittent and 
 unpredictable errors in the tests.
 The failures often occur at the point of keyspace creation during test setup, 
 with a CQL statement of the form:
 {code}
 CREATE KEYSPACE 'asnvzpot' WITH strategy_class = SimpleStrategy
 AND strategy_options:replication_factor = 1
 
 {code}
 An InvalidRequestException is returned to the cql driver, which re-raises it 
 as a cql.ProgrammingError. The message:
 {code}
 ProgrammingError: Bad Request: line 2:24 no viable alternative at input 
 'asnvzpot'
 {code}
 In a few cases, Cassandra threw an ArrayIndexOutOfBoundsException and this 
 traceback, closing the thrift connection:
 {code}
 ERROR [Thrift:244] 2012-02-10 15:51:46,815 CustomTThreadPoolServer.java (line 
 205) Error occurred during processing of message.
 java.lang.ArrayIndexOutOfBoundsException: 7
 at 
 org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:1520)
 at 
 org.apache.cassandra.thrift.ThriftValidation.validateCfDef(ThriftValidation.java:634)
 at 
 org.apache.cassandra.cql.QueryProcessor.processStatement(QueryProcessor.java:744)
 at 
 org.apache.cassandra.cql.QueryProcessor.process(QueryProcessor.java:898)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1245)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3458)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3446)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:680)
 {code}
 Sometimes I see an ArrayIndexOutOfBoundsException with no traceback:
 {code}
 ERROR [Thrift:858] 2012-02-13 12:04:01,537 CustomTThreadPoolServer.java (line 
 205) Error occurred during processing of message.
 java.lang.ArrayIndexOutOfBoundsException
 {code}
 Sometimes I get this:
 {code}
 ERROR [MigrationStage:1] 2012-02-13 12:04:46,077 AbstractCassandraDaemon.java 
 (line 134) Fatal exception in thread Thread[MigrationStage:1,5,main]
 java.lang.IllegalArgumentException: value already present: 1558
 at 
 com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
 at 
 com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
 at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
 at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
 at org.apache.cassandra.config.Schema.load(Schema.java:392)
 at 
 org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:284)
 at 
 org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:209)
 at 
 org.apache.cassandra.db.migration.AddColumnFamily.applyImpl(AddColumnFamily.java:49)
 at 
 org.apache.cassandra.db.migration.Migration.apply(Migration.java:66)
 at 
 org.apache.cassandra.cql.QueryProcessor$1.call(QueryProcessor.java:334)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-02-13 Thread Yuki Morishita (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207132#comment-13207132
 ] 

Yuki Morishita commented on CASSANDRA-3772:
---

Dave,

Patch needs rebase, but looking at the patch, I noticed the following:

{code}
private static byte[] hashMurmur3(ByteBuffer... data)
{
HashFunction hashFunction = murmur3HF.get();
Hasher hasher = hashFunction.newHasher();
// snip
}
{code}

Isn't that slow if you instantiate every time? I looked up guava source code 
but I saw no way to reset, so I guess the above is the only thing you could 
do...

I also note that CASSANDRA-2975 will implement MurmurHash3, so I think it is 
better not to introduce an external library. What do you think?

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Dave Brosius
 Fix For: 1.2

 Attachments: try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




git commit: Fix misplaced 'new' keyword

2012-02-13 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.0 cb0efd09c -> 651ca528d


Fix misplaced 'new' keyword


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/651ca528
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/651ca528
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/651ca528

Branch: refs/heads/cassandra-1.0
Commit: 651ca528d24f088581055cfbd4c70115e04899ea
Parents: cb0efd0
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 13:41:03 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 13:41:03 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/651ca528/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 63758ab..b9977a5 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -491,7 +491,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o == null)
 return (ByteBuffer)o;
 if (o instanceof java.lang.String)
-return new ByteBuffer.wrap(DataByteArray((String)o).get());
+return ByteBuffer.wrap(new DataByteArray((String)o).get());
 if (o instanceof Integer)
 return IntegerType.instance.decompose((BigInteger)o);
 if (o instanceof Long)



[jira] [Updated] (CASSANDRA-3412) make nodetool ring ownership smarter

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3412:
--

Assignee: Vijay  (was: paul cannon)

Vijay, do you have time to take a stab at this?

 make nodetool ring ownership smarter
 

 Key: CASSANDRA-3412
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3412
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Assignee: Vijay
Priority: Minor

 just a thought.. the ownership info currently just look at the token and 
 calculate the % between nodes. It would be nice if it could do more, such as 
 discriminate nodes of each DC, replica set, etc. 
 ticket is open for suggestion...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




git commit: Integer corresponds to Int32Type

2012-02-13 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.0 651ca528d -> 4bd3f8d86


Integer corresponds to Int32Type


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4bd3f8d8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4bd3f8d8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4bd3f8d8

Branch: refs/heads/cassandra-1.0
Commit: 4bd3f8d86fcc29259dd0d508873125f88ce588e4
Parents: 651ca52
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 13:48:20 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 13:48:20 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4bd3f8d8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index b9977a5..76a291a 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -493,7 +493,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o instanceof java.lang.String)
 return ByteBuffer.wrap(new DataByteArray((String)o).get());
 if (o instanceof Integer)
-return IntegerType.instance.decompose((BigInteger)o);
+return Int32Type.instance.decompose((Integer)o);
 if (o instanceof Long)
 return LongType.instance.decompose((Long)o);
 if (o instanceof Float)



[jira] [Commented] (CASSANDRA-3412) make nodetool ring ownership smarter

2012-02-13 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207153#comment-13207153
 ] 

Vijay commented on CASSANDRA-3412:
--

Will do!

Jackson, it is fairly trivial to change the 'owns' calculation. The problem with 
this, last time I looked at it, was that if we show % in the ring and if we show 
% per DC, it will add up to more than 100% and hence will cause some confusion 
for starters (I wish it was color coded or something like that)... What do you 
think? Would it make sense to rename OWNS to OWNS-PER-DC or a better name 
and do the above?

 make nodetool ring ownership smarter
 

 Key: CASSANDRA-3412
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3412
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Assignee: Vijay
Priority: Minor

 just a thought.. the ownership info currently just look at the token and 
 calculate the % between nodes. It would be nice if it could do more, such as 
 discriminate nodes of each DC, replica set, etc. 
 ticket is open for suggestion...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3901) write endpoints are not treated correctly, breaking consistency guarantees

2012-02-13 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207156#comment-13207156
 ] 

paul cannon commented on CASSANDRA-3901:


I don't believe that the proposed fix for CASSANDRA-2434 covers these concerns 
at all.  I guess you could say that the *scope* of 2434 covers this, but I 
think it's separate enough to deserve its own ticket, as you've done.

 write endpoints are not treated correctly, breaking consistency guarantees
 --

 Key: CASSANDRA-3901
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3901
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Critical

 I had a nagging feeling this was the case ever since I started wanting 
 CASSANDRA-3833 and thinking about how to handle the association between nodes 
 in the read set and nodes in the write set.
 I may be wrong (please point me in the right direction if so), but I see no 
 code anywhere that tries to (1) apply consistency level to currently normal 
 endpoints only, and (2) connect a given read endpoint with a future write 
 endpoint such that they are tied together for consistency purposes (parts of 
 these concerns probably is covered by CASSANDRA-2434 but that ticket is more 
 general).
 To be more clear about the problem: Suppose we have a ring of nodes, with a 
 single node bootstrapping. Now, for a given row key suppose reads are served 
 by A, B and C while writes are to go to A, B, C and D. In other words, D is 
 the node bootstrapping. Suppose RF is 3 and A,B,C,D is ring order. There are 
 a few things required for correct behavior:
 * Writes acked by D must never be treated as sufficient to satisfy 
 consistency level since until it is part of the read set it does not count 
 towards CL on reads.
 * Writes acked by B must *not* be treated as sufficient to satisfy 
 consistency level *unless* the same write is *also* acked by D, because once 
 D enters the ring, B will no longer be counting towards CL on reads. The only 
 alternative is to make the read succeed and disallow D from entering the ring.
 We don't seem to be handling this at all (and it becomes more complicated 
 with arbitrary transitions).
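
 A minimal sketch of the kind of accounting being argued for (hypothetical
 helpers, not how the write path is actually structured): treat pending
 endpoints as raising the number of required acks rather than as ordinary
 ack sources, so a write cannot satisfy CL on acks that stop counting once
 D joins the read set.
 {code}
def write_block_for(cl_required, pending):
    # Require one extra ack per pending (bootstrapping) endpoint, so the
    # write must also reach D in addition to cl_required natural replicas.
    return cl_required + len(pending)

def write_satisfied(acks, natural, pending, cl_required):
    # Only acks from natural or pending replicas count, and the total
    # must cover CL plus every pending endpoint.
    counted = [a for a in acks if a in natural or a in pending]
    return len(counted) >= write_block_for(cl_required, pending)

# RF=3 QUORUM (cl_required=2), D bootstrapping: acks from A and B alone
# must no longer be treated as sufficient.
write_satisfied({'A', 'B'}, {'A', 'B', 'C'}, {'D'}, 2)       # insufficient
write_satisfied({'A', 'B', 'D'}, {'A', 'B', 'C'}, {'D'}, 2)  # sufficient
 {code}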

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207159#comment-13207159
 ] 

Jonathan Ellis commented on CASSANDRA-3772:
---

bq. I looked up guava source code but I saw no way to reset, so I guess the 
above is the only thing you could do

It looks like you're right: 
http://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/hash/MessageDigestHashFunction.java

So using the standalone MH3 library is probably the way to go.

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Dave Brosius
 Fix For: 1.2

 Attachments: try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3412) make nodetool ring ownership smarter

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207169#comment-13207169
 ] 

Peter Schuller commented on CASSANDRA-3412:
---

Our internal tool (external, in python, based on describe_ring) simply uses 
{{describe_ring}} and looks at each range and their responsible nodes and just 
adds them all up. The ownership we report for a node is the total amount of 
ringspace (regardless of primary/secondary/dc/etc concerns) that the node has, 
compared to the overall total.

It ends up giving you the real number while completely blackboxing why we got 
there - whether it be due to rack awareness (CASSANDRA-3810) or DC:s.

FWIW, here is the code for that. It's not self-contained and won't run, but 
it's an FYI. The topology_xref is just post-processing the describe_ring 
results to yield the map of range -> nodes_responsible.

{code}
def cmd_effective_ownership(opts, args):
    """
    Print effective ownership of nodes in a cluster.

    Effective ownership means the actual amount of the ring for which
    it has data, whether or not it is because it is the primary or
    secondary (etc) owner of the ring segment. This is essentially the
    ownership you would want nodetool ring to print but doesn't.
    """
    if not args and not opts.all:
        return

    node_ranges, range_nodes = topology_xref(describe_ring(*((opts,) +
        split_hostport(seed(opts, 'localhost') if opts.all else args[0]))))

    if opts.all:
        args = node_ranges.keys()

    # acrobatics to handle wrap-around
    max_token = 0
    min_token = 2**127
    for r in range_nodes.keys():
        if r[0] < min_token:
            min_token = r[0]
        if r[1] > max_token:
            max_token = r[1]

    def ownership(start_token, end_token):
        start_token, end_token = int(start_token), int(end_token)
        if end_token < start_token:
            # wrap-around
            return end_token + (2**127 - start_token)
        else:
            return end_token - start_token

    toprint = []  # list of (owned, ranges), later to be sorted
    for node in (hostnames.normalize_hostname(arg) for arg in args):
        if not node in node_ranges:
            raise cmdline.UserError('node %s not in ring' % (node,))
        ranges = node_ranges[node]
        owned = reduce(lambda a, b: a + b, [ownership(r[0], r[1]) for r in
            ranges], 0)
        toprint.append((owned, node, ranges))

    toprint = sorted(toprint, reverse=True)
    for owned, node, ranges in toprint:
        print '%s %f%%' % (node, float(owned) / 2**127 * 100.0)
        if opts.print_ranges:
            for r in
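The wrap-around arithmetic above can be restated as a self-contained sketch (the {{2**127}} ring size is RandomPartitioner's token space; {{effective_ownership_pct}} is a hypothetical helper added for illustration, not part of the tool):

```python
RING_SIZE = 2**127  # RandomPartitioner token space

def ownership(start_token, end_token):
    """Size of the ring segment (start_token, end_token], handling wrap-around."""
    start_token, end_token = int(start_token), int(end_token)
    if end_token < start_token:
        # segment wraps past the top of the ring
        return end_token + (RING_SIZE - start_token)
    return end_token - start_token

def effective_ownership_pct(ranges):
    """Percentage of the ring covered by a node's ranges, primary or replica."""
    owned = sum(ownership(s, e) for s, e in ranges)
    return float(owned) / RING_SIZE * 100.0
```

A node holding both halves of the ring would report 100%, regardless of whether it holds those ranges as primary or secondary replica.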

[jira] [Commented] (CASSANDRA-3901) write endpoints are not treated correctly, breaking consistency guarantees

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207175#comment-13207175
 ] 

Peter Schuller commented on CASSANDRA-3901:
---

You're probably right. I didn't re-read 2434 again (it's long and it takes 
careful reading to follow the discussion), and mostly wanted to give a nod 
towards it in case it covered this.

 write endpoints are not treated correctly, breaking consistency guarantees
 --

 Key: CASSANDRA-3901
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3901
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Critical

 I had a nagging feeling this was the case ever since I started wanting 
 CASSANDRA-3833 and thinking about how to handle the association between nodes 
 in the read set and nodes in the write set.
 I may be wrong (please point me in the right direction if so), but I see no 
 code anywhere that tries to (1) apply consistency level to currently normal 
 endpoints only, and (2) connect a given read endpoint with a future write 
 endpoint such that they are tied together for consistency purposes (parts of 
 these concerns are probably covered by CASSANDRA-2434 but that ticket is more 
 general).
 To be more clear about the problem: Suppose we have a ring of nodes, with a 
 single node bootstrapping. Now, for a given row key suppose reads are served 
 by A, B and C while writes are to go to A, B, C and D. In other words, D is 
 the node bootstrapping. Suppose RF is 3 and A,B,C,D is ring order. There are 
 a few things required for correct behavior:
 * Writes acked by D must never be treated as sufficient to satisfy 
 consistency level since until it is part of the read set it does not count 
 towards CL on reads.
 * Writes acked by B must *not* be treated as sufficient to satisfy 
 consistency level *unless* the same write is *also* acked by D, because once 
 D enters the ring, B will no longer be counting towards CL on reads. The only 
 alternative is to make the read succeed and disallow D from entering the ring.
 We don't seem to be handling this at all (and it becomes more complicated 
 with arbitrary transitions).
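
The two bullets above amount to a predicate over the set of acks. A minimal sketch (illustrative only; the natural/pending split, the endpoint names, and {{block_for}} are assumptions for the example, not Cassandra's actual code):

```python
def write_satisfies_cl(acks, natural, pending, block_for):
    """
    A write is only safely acked when block_for acks come from natural
    (currently-reading) endpoints AND every pending (bootstrapping)
    endpoint has also acked: pending nodes do not yet count toward CL
    on reads, and a natural node's ack stops counting once the pending
    node that replaces it joins the ring.
    """
    natural_acks = len(set(acks) & set(natural))
    pending_acked = set(pending) <= set(acks)
    return natural_acks >= block_for and pending_acked

# With A, B, C natural and D bootstrapping (RF=3, quorum => block_for=2):
# acks from A and B alone are not enough, because D has not acked.
```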

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3830) gossip-to-seeds is not obviously independent of failure detection algorithm

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207176#comment-13207176
 ] 

Peter Schuller commented on CASSANDRA-3830:
---

Correct, and the concern is that when the optimization is removed (e.g., by 
seeds being down), that might affect the failure detector if the average 
heartbeat interval ends up being affected.

 gossip-to-seeds is not obviously independent of failure detection algorithm 
 

 Key: CASSANDRA-3830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3830
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Peter Schuller
Priority: Minor

 The failure detector, ignoring all the theory, boils down to an
 extremely simple algorithm. The FD keeps track of a sliding window (of
 1000 currently) intervals of heartbeat for a given host. Meaning, we
 have a track record of the last 1000 times we saw an updated heartbeat
 for a host.
 At any given moment, a host has a score which is simply the time since
 the last heartbeat, over the *mean* interval in the sliding
 window. For historical reasons a simple scaling factor is applied to
 this prior to checking the phi conviction threshold.
 (CASSANDRA-2597 has details, but thanks to Paul's work there it's now
 trivial to understand what it does based on gut feeling)
 So in effect, a host is considered down if we haven't heard from it in
 some time which is significantly longer than the average time we
 expect to hear from it.
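 Ignoring the theory, the detector described above can be sketched in a
 few lines (the scaling factor and the conviction threshold of 8 are
 illustrative assumptions here, not the exact constants in Cassandra's
 FailureDetector):

```python
from collections import deque

PHI_SCALE = 1.0 / 0.434  # assumed scaling factor applied before the threshold check

class WindowedFailureDetector:
    """Tracks a sliding window of heartbeat arrival intervals for one host."""
    def __init__(self, window=1000, phi_threshold=8.0):
        self.intervals = deque(maxlen=window)  # last `window` arrival gaps
        self.last = None
        self.phi_threshold = phi_threshold

    def heartbeat(self, now):
        """Record an updated heartbeat seen at time `now`."""
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def is_alive(self, now):
        """Down iff time-since-last-heartbeat is large relative to the mean gap."""
        if not self.intervals:
            return True  # no history yet
        mean = sum(self.intervals) / float(len(self.intervals))
        phi = PHI_SCALE * (now - self.last) / mean
        return phi < self.phi_threshold
```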
 This seems reasonable, but it does assume that under normal conditions
 the average time between heartbeats does not change for reasons other
 than those that would be plausible reasons to think a node is
 unhealthy.
 This assumption *could* be violated by the gossip-to-seed
 feature. There is an argument to avoid gossip-to-seed for other
 reasons (see CASSANDRA-3829), but this is a concrete case in which the
 gossip-to-seed could cause a negative side-effect of the general kind
 mentioned in CASSANDRA-3829 (see notes at end about the case w/o seeds
 not being continuously tested). Normally, due to gossip to seed,
 everyone essentially sees the latest information within very few
 heartbeats (assuming only 2-3 seeds). But should all seeds be down,
 suddenly we flip a switch and start relying on generalized propagation
 in the gossip system, rather than the seed special case.
 The potential problem I foresee here is that if the average propagation
 time suddenly spikes when all seeds become unavailable, it could cause
 bogus flapping of nodes into down state.
 In order to test this, I deployed a ~180 node cluster with a version
 that logs heartbeat information on each interpret(), similar to:
  INFO [GossipTasks:1] 2012-02-01 23:29:58,746 FailureDetector.java (line 187) 
 ep /XXX.XXX.XXX.XXX is at phi 0.0019521638443084342, last interval 7.0, mean 
 is 1557.27778
 It turns out that, at least at 180 nodes, with 4 seed nodes, whether
 or not seeds are running *does not* seem to matter significantly. In
 both cases, the mean interval is around 1500 milliseconds.
 I don't feel I have a good grasp of whether this is incidental or
 guaranteed, and it would be good to at least empirically test
 propagation time w/o seeds at different cluster sizes; it's supposed
 to be unaffected by cluster size ({{RING_DELAY}} is static for this
 reason, is my understanding). It would be nice to see that this is the case.





git commit: CASSANDRA-3867 patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

2012-02-13 Thread vijay
Updated Branches:
  refs/heads/trunk 232da8248 - c49a1497e


CASSANDRA-3867
patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c49a1497
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c49a1497
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c49a1497

Branch: refs/heads/trunk
Commit: c49a1497eafc5ab5c16b03b3f97842c5ab1e64c8
Parents: 232da82
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Mon Feb 13 12:37:22 2012 -0800
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Mon Feb 13 12:37:22 2012 -0800

--
 .../apache/cassandra/thrift/CustomTHsHaServer.java |8 
 1 files changed, 8 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c49a1497/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
--
diff --git a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java 
b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
index 4921678..9bfb4f7 100644
--- a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
+++ b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
@@ -177,6 +177,14 @@ public class CustomTHsHaServer extends TNonblockingServer
 {
 select();
 }
+try
+{
+selector.close(); // CASSANDRA-3867
+}
+catch (IOException e)
+{
+// ignore this exception.
+}
 } 
 catch (Throwable t)
 {



git commit: CASSANDRA-3867 patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

2012-02-13 Thread vijay
Updated Branches:
  refs/heads/cassandra-1.0 4bd3f8d86 - 2a5547981


CASSANDRA-3867
patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2a554798
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2a554798
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2a554798

Branch: refs/heads/cassandra-1.0
Commit: 2a5547981dad7e59be2c26aeb52f5d49d2195b9c
Parents: 4bd3f8d
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Mon Feb 13 12:42:29 2012 -0800
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Mon Feb 13 12:42:29 2012 -0800

--
 .../apache/cassandra/thrift/CustomTHsHaServer.java |8 
 1 files changed, 8 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2a554798/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
--
diff --git a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java 
b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
index 4921678..9bfb4f7 100644
--- a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
+++ b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
@@ -177,6 +177,14 @@ public class CustomTHsHaServer extends TNonblockingServer
 {
 select();
 }
+try
+{
+selector.close(); // CASSANDRA-3867
+}
+catch (IOException e)
+{
+// ignore this exception.
+}
 } 
 catch (Throwable t)
 {



[jira] [Reopened] (CASSANDRA-3886) Pig can't store some types after loading them

2012-02-13 Thread Brandon Williams (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-3886:
-


We actually do need the catch-all:

{noformat}
return ByteBuffer.wrap(((DataByteArray) o).get());
{noformat}

to cast all the pig-native types like CharArray; these are all guaranteed 
to be castable to DataByteArray.

 Pig can't store some types after loading them
 -

 Key: CASSANDRA-3886
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3886
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.8.7
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.0.8

 Attachments: 3886.txt


 In CASSANDRA-2810, we removed the decompose methods in putNext instead 
 relying on objToBB, however it cannot sufficiently handle all types.  For 
 instance, if longs are loaded and then an attempt to store them is made, this 
 causes a cast exception: java.io.IOException: java.io.IOException: 
 java.lang.ClassCastException: java.lang.Long cannot be cast to 
 org.apache.pig.data.DataByteArray Output must be (key, {(column,value)...}) 
 for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for 
 SuperColumnFamily





[jira] [Commented] (CASSANDRA-3569) Failure detector downs should not break streams

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207189#comment-13207189
 ] 

Peter Schuller commented on CASSANDRA-3569:
---

For the record, while CASSANDRA-2433 did make the changes originally claimed in 
my initial post here, it's CASSANDRA-3216 which is causing non-AES streams to 
get killed as well, but only on the sender side (if the receiver goes down 
according to the sender).

It also generates an NPE:

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
{code}

It should be harmless, but not very pretty.




 Failure detector downs should not break streams
 ---

 Key: CASSANDRA-3569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3569
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller

 CASSANDRA-2433 introduced this behavior just so that repairs don't sit 
 there waiting forever. In my opinion the correct fix to that problem is to 
 use TCP keep-alive. Unfortunately the TCP keep-alive period is insanely high 
 by default on a modern Linux, so just doing that is not entirely good either.
 But using the failure detector seems nonsensical to me. We have a 
 communication method, the TCP transport, that we know is used for 
 long-running processes that we don't want to be incorrectly killed for no 
 good reason, and we are using a failure detector tuned to detecting when not 
 to send real-time-sensitive requests to nodes in order to actively kill a 
 working connection.
 So, rather than add complexity with protocol based ping/pongs and such, I 
 propose that we simply just use TCP keep alive for streaming connections and 
 instruct operators of production clusters to tweak 
 net.ipv4.tcp_keepalive_{probes,intvl} as appropriate (or whatever equivalent 
 on their OS).
 I can submit the patch. Awaiting opinions.
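
At the socket level, the proposal boils down to something like this sketch (Python for brevity; the per-socket TCP_KEEP* overrides are Linux-specific and the values shown are illustrative, standing in for the net.ipv4.tcp_keepalive_{time,intvl,probes} sysctls):

```python
import socket

def open_stream_connection(host, port):
    """Open a streaming connection with TCP keep-alive enabled, so a dead
    peer is eventually detected by the kernel rather than by gossip."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the (insanely high) system defaults, where supported:
    if hasattr(socket, 'TCP_KEEPIDLE'):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before probing
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before reset
    return sock
```

With these illustrative values a hung peer would be declared dead after roughly 60 + 5*10 seconds, instead of the two-hour Linux default.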





[jira] [Commented] (CASSANDRA-3886) Pig can't store some types after loading them

2012-02-13 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207192#comment-13207192
 ] 

Pavel Yaskevich commented on CASSANDRA-3886:


+1

 Pig can't store some types after loading them
 -

 Key: CASSANDRA-3886
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3886
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.8.7
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.0.8

 Attachments: 3886.txt


 In CASSANDRA-2810, we removed the decompose methods in putNext instead 
 relying on objToBB, however it cannot sufficiently handle all types.  For 
 instance, if longs are loaded and then an attempt to store them is made, this 
 causes a cast exception: java.io.IOException: java.io.IOException: 
 java.lang.ClassCastException: java.lang.Long cannot be cast to 
 org.apache.pig.data.DataByteArray Output must be (key, {(column,value)...}) 
 for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for 
 SuperColumnFamily





[jira] [Resolved] (CASSANDRA-3886) Pig can't store some types after loading them

2012-02-13 Thread Brandon Williams (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-3886.
-

Resolution: Fixed

Committed.

 Pig can't store some types after loading them
 -

 Key: CASSANDRA-3886
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3886
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.8.7
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.0.8

 Attachments: 3886.txt


 In CASSANDRA-2810, we removed the decompose methods in putNext instead 
 relying on objToBB, however it cannot sufficiently handle all types.  For 
 instance, if longs are loaded and then an attempt to store them is made, this 
 causes a cast exception: java.io.IOException: java.io.IOException: 
 java.lang.ClassCastException: java.lang.Long cannot be cast to 
 org.apache.pig.data.DataByteArray Output must be (key, {(column,value)...}) 
 for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for 
 SuperColumnFamily





[jira] [Created] (CASSANDRA-3904) do not generate NPE on aborted stream-out sessions

2012-02-13 Thread Peter Schuller (Created) (JIRA)
do not generate NPE on aborted stream-out sessions
--

 Key: CASSANDRA-3904
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3904
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.1.0


https://issues.apache.org/jira/browse/CASSANDRA-3569?focusedCommentId=13207189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13207189

Attaching patch to make this a friendlier log entry.





git commit: Add catch-all cast back to CassandraStorage. Patch by brandonwilliams reviewed by xedin for CASSANDRA-3886

2012-02-13 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.0 2a5547981 - 104791412


Add catch-all cast back to CassandraStorage.
Patch by brandonwilliams reviewed by xedin for CASSANDRA-3886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/10479141
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/10479141
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/10479141

Branch: refs/heads/cassandra-1.0
Commit: 10479141285c885fcd77571a9b2397d684ecf826
Parents: 2a55479
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 14:45:48 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 14:50:52 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/10479141/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 76a291a..975d5ba 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -502,7 +502,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 return DoubleType.instance.decompose((Double)o);
 if (o instanceof UUID)
 return ByteBuffer.wrap(UUIDGen.decompose((UUID) o));
-return null;
+return ByteBuffer.wrap(((DataByteArray) o).get());
 }
 
 public void putNext(Tuple t) throws ExecException, IOException



[jira] [Updated] (CASSANDRA-3904) do not generate NPE on aborted stream-out sessions

2012-02-13 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3904:
--

Attachment: CASSANDRA-3904-1.1.txt

Attaching patch against 1.1. It replaces the NPE with a friendlier message, and 
also augments the original stream-out session message to clarify that streams 
may still be going in the background.

 do not generate NPE on aborted stream-out sessions
 --

 Key: CASSANDRA-3904
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3904
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.1.0

 Attachments: CASSANDRA-3904-1.1.txt


 https://issues.apache.org/jira/browse/CASSANDRA-3569?focusedCommentId=13207189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13207189
 Attaching patch to make this a friendlier log entry.





[jira] [Commented] (CASSANDRA-3569) Failure detector downs should not break streams

2012-02-13 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207224#comment-13207224
 ] 

Peter Schuller commented on CASSANDRA-3569:
---

NPE followed up in CASSANDRA-3904.

 Failure detector downs should not break streams
 ---

 Key: CASSANDRA-3569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3569
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller

 CASSANDRA-2433 introduced this behavior just so that repairs don't sit 
 there waiting forever. In my opinion the correct fix to that problem is to 
 use TCP keep-alive. Unfortunately the TCP keep-alive period is insanely high 
 by default on a modern Linux, so just doing that is not entirely good either.
 But using the failure detector seems nonsensical to me. We have a 
 communication method, the TCP transport, that we know is used for 
 long-running processes that we don't want to be incorrectly killed for no 
 good reason, and we are using a failure detector tuned to detecting when not 
 to send real-time-sensitive requests to nodes in order to actively kill a 
 working connection.
 So, rather than add complexity with protocol based ping/pongs and such, I 
 propose that we simply just use TCP keep alive for streaming connections and 
 instruct operators of production clusters to tweak 
 net.ipv4.tcp_keepalive_{probes,intvl} as appropriate (or whatever equivalent 
 on their OS).
 I can submit the patch. Awaiting opinions.





[jira] [Updated] (CASSANDRA-3904) do not generate NPE on aborted stream-out sessions

2012-02-13 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3904:
--

Reviewer: yukim

 do not generate NPE on aborted stream-out sessions
 --

 Key: CASSANDRA-3904
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3904
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.1.0

 Attachments: CASSANDRA-3904-1.1.txt


 https://issues.apache.org/jira/browse/CASSANDRA-3569?focusedCommentId=13207189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13207189
 Attaching patch to make this a friendlier log entry.





[jira] [Updated] (CASSANDRA-3371) Cassandra inferred schema and actual data don't match

2012-02-13 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-3371:


Attachment: smoke_test.txt
3371-v6.txt

v6 is rebased and contains minor cleanups; smoke_test contains a file to be 
replayed by the cli and a pig script to exercise loading/storing every 
cassandra type.

 Cassandra inferred schema and actual data don't match
 -

 Key: CASSANDRA-3371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3371
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.8.7
Reporter: Pete Warden
Assignee: Brandon Williams
 Attachments: 0001-Rework-pig-schema.txt, 
 0002-Output-support-to-match-input.txt, 3371-v2.txt, 3371-v3.txt, 
 3371-v4.txt, 3371-v5-rebased.txt, 3371-v5.txt, 3371-v6.txt, pig.diff, 
 smoke_test.txt


 It's looking like there may be a mismatch between the schema that's being 
 reported by the latest CassandraStorage.java, and the data that's actually 
 returned. Here's an example:
 rows = LOAD 'cassandra://Frap/PhotoVotes' USING CassandraStorage();
 DESCRIBE rows;
 rows: {key: chararray,columns: {(name: chararray,value: 
 bytearray,photo_owner: chararray,value_photo_owner: bytearray,pid: 
 chararray,value_pid: bytearray,matched_string: 
 chararray,value_matched_string: bytearray,src_big: chararray,value_src_big: 
 bytearray,time: chararray,value_time: bytearray,vote_type: 
 chararray,value_vote_type: bytearray,voter: chararray,value_voter: 
 bytearray)}}
 DUMP rows;
 (691831038_1317937188.48955,{(photo_owner,1596090180),(pid,6855155124568798560),(matched_string,),(src_big,),(time,Thu
  Oct 06 14:39:48 -0700 2011),(vote_type,album_dislike),(voter,691831038)})
 getSchema() is reporting the columns as an inner bag of tuples, each of which 
 contains 16 values. In fact, getNext() seems to return an inner bag 
 containing 7 tuples, each of which contains two values. 
 It appears that things got out of sync with this change:
 http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java?r1=1177083&r2=1177082&pathrev=1177083
 See more discussion at:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/pig-cassandra-problem-quot-Incompatible-field-schema-quot-error-tc6882703.html





[3/5] git commit: Merge branch '3886' into cassandra-1.0

2012-02-13 Thread jbellis
Merge branch '3886' into cassandra-1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cb0efd09
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cb0efd09
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cb0efd09

Branch: refs/heads/cassandra-1.1
Commit: cb0efd09cf077799f4934d900089f87b4db06d9e
Parents: c3dc789 742648c
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Feb 10 12:01:18 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Feb 10 12:01:18 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)
--




[4/5] git commit: Pig's objToBB should handle all types. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-3886

2012-02-13 Thread jbellis
Pig's objToBB should handle all types.
Patch by brandonwilliams, reviewed by xedin for CASSANDRA-3886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/742648c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/742648c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/742648c8

Branch: refs/heads/cassandra-1.1
Commit: 742648c821bb5922018423ff5f360233017a08ba
Parents: 22b8a97
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Feb 10 10:07:53 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Feb 10 12:00:07 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/742648c8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index b1af1b5..63758ab 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -491,8 +491,18 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o == null)
 return (ByteBuffer)o;
 if (o instanceof java.lang.String)
-o = new DataByteArray((String)o);
-return ByteBuffer.wrap(((DataByteArray) o).get());
+return ByteBuffer.wrap(new DataByteArray((String)o).get());
+if (o instanceof Integer)
+return IntegerType.instance.decompose((BigInteger)o);
+if (o instanceof Long)
+return LongType.instance.decompose((Long)o);
+if (o instanceof Float)
+return FloatType.instance.decompose((Float)o);
+if (o instanceof Double)
+return DoubleType.instance.decompose((Double)o);
+if (o instanceof UUID)
+return ByteBuffer.wrap(UUIDGen.decompose((UUID) o));
+return null;
 }
 
 public void putNext(Tuple t) throws ExecException, IOException



[1/5] git commit: merge from 1.0

2012-02-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.1 9a842c7b3 - c5986871c


merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5986871
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5986871
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5986871

Branch: refs/heads/cassandra-1.1
Commit: c5986871c007f8c552ff624d1fcf064ce6a45c92
Parents: 9a842c7 b55ab4f
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:41:30 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:41:30 2012 -0600

--
 CHANGES.txt|3 --
 .../cassandra/hadoop/pig/CassandraStorage.java |   14 ++-
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |6 ++--
 5 files changed, 37 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5986871/CHANGES.txt
--
diff --cc CHANGES.txt
index e115a2a,0875da5..359e699
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,83 -1,3 +1,80 @@@
 +1.1-dev
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * defragment rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delivery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created 
(CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair 
(CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Make stress.java insert operation to use microseconds (CASSANDRA-3725)
 + * Allows (internally) doing a range query with a limit of columns instead of
 +   rows (CASSANDRA-3742)
 + * Allow rangeSlice queries to be start/end inclusive/exclusive 
(CASSANDRA-3749)
 + * Fix BulkLoader to support new SSTable layout and add stream
 +   throttling to prevent an NPE when there is no yaml config (CASSANDRA-3752)
 + * Allow concurrent schema 

[5/5] git commit: avoid including non-queried nodes in rangeslice read repair patch by jbellis; reviewed by Vijay for CASSANDRA-3843

2012-02-13 Thread jbellis
avoid including non-queried nodes in rangeslice read repair
patch by jbellis; reviewed by Vijay for CASSANDRA-3843


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c3dc7894
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c3dc7894
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c3dc7894

Branch: refs/heads/cassandra-1.1
Commit: c3dc7894159ad413f9c8fa0cc0024c6ed0984831
Parents: 22b8a97
Author: Jonathan Ellis jbel...@apache.org
Authored: Wed Feb 8 22:28:47 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Feb 9 15:33:31 2012 -0600

--
 CHANGES.txt|7 +++
 .../service/RangeSliceResponseResolver.java|   10 +++---
 .../org/apache/cassandra/service/StorageProxy.java |6 --
 3 files changed, 14 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3dc7894/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cca24a9..0875da5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,9 +1,8 @@
-1.0.9
+1.0.8
+ * avoid including non-queried nodes in rangeslice read repair
+   (CASSANDRA-3843)
  * Only snapshot CF being compacted for snapshot_before_compaction 
(CASSANDRA-3803)
-
-
-1.0.8
  * Log active compactions in StatusLogger (CASSANDRA-3703)
  * Compute more accurate compaction score per level (CASSANDRA-3790)
  * Return InvalidRequest when using a keyspace that doesn't exist

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3dc7894/src/java/org/apache/cassandra/service/RangeSliceResponseResolver.java
--
diff --git 
a/src/java/org/apache/cassandra/service/RangeSliceResponseResolver.java 
b/src/java/org/apache/cassandra/service/RangeSliceResponseResolver.java
index 3be61d1..a870d5c 100644
--- a/src/java/org/apache/cassandra/service/RangeSliceResponseResolver.java
+++ b/src/java/org/apache/cassandra/service/RangeSliceResponseResolver.java
@@ -56,16 +56,20 @@ public class RangeSliceResponseResolver implements IResponseResolver<Iterable<Row>>
 };
 
 private final String table;
-private final List<InetAddress> sources;
+private List<InetAddress> sources;
 protected final Collection<Message> responses = new LinkedBlockingQueue<Message>();
 public final List<IAsyncResult> repairResults = new ArrayList<IAsyncResult>();
 
-public RangeSliceResponseResolver(String table, List<InetAddress> sources)
+public RangeSliceResponseResolver(String table)
 {
-this.sources = sources;
 this.table = table;
 }
 
+public void setSources(List<InetAddress> endpoints)
+{
+this.sources = endpoints;
+}
+
 public List<Row> getData() throws IOException
 {
 Message response = responses.iterator().next();

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3dc7894/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index 0672b3f..27db551 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -814,9 +814,10 @@ public class StorageProxy implements StorageProxyMBean
 RangeSliceCommand c2 = new 
RangeSliceCommand(command.keyspace, command.column_family, 
command.super_column, command.predicate, range, command.max_keys);
 
 // collect replies and resolve according to consistency 
level
-RangeSliceResponseResolver resolver = new RangeSliceResponseResolver(command.keyspace, liveEndpoints);
+RangeSliceResponseResolver resolver = new RangeSliceResponseResolver(command.keyspace);
 ReadCallback<Iterable<Row>> handler = getReadCallback(resolver, command, consistency_level, liveEndpoints);
 handler.assureSufficientLiveNodes();
+resolver.setSources(handler.endpoints);
 for (InetAddress endpoint : handler.endpoints)
 {
 MessagingService.instance().sendRR(c2, endpoint, 
handler);
@@ -1071,7 +1072,7 @@ public class StorageProxy implements StorageProxyMBean
 
DatabaseDescriptor.getEndpointSnitch().sortByProximity(FBUtilities.getBroadcastAddress(),
 liveEndpoints);
 
 // collect replies and resolve according to consistency level
-RangeSliceResponseResolver resolver = new RangeSliceResponseResolver(keyspace, liveEndpoints);
+RangeSliceResponseResolver resolver = new RangeSliceResponseResolver(keyspace);
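The CASSANDRA-3843 change above replaces constructor injection of the full `liveEndpoints` list with a `setSources` call made only after the read callback has trimmed the endpoint list, so read repair targets only the replicas that were actually queried. A minimal Java sketch of that two-step wiring (class and method names invented for illustration, not Cassandra's API):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the CASSANDRA-3843 shape: the resolver learns its sources
// only after endpoint selection, so it never repairs non-queried nodes.
class SliceResolver {
    private List<String> sources; // set after endpoint selection

    public void setSources(List<String> endpoints) {
        this.sources = endpoints;
    }

    public List<String> repairTargets() {
        return sources; // only the nodes that were sent the request
    }
}

public class ReadRepairSketch {
    public static void main(String[] args) {
        List<String> live = Arrays.asList("a", "b", "c");
        // a read callback would pick a CL-sized subset of live endpoints
        List<String> queried = live.subList(0, 2);

        SliceResolver resolver = new SliceResolver();
        resolver.setSources(queried); // not the full live list
        System.out.println(resolver.repairTargets()); // prints [a, b]
    }
}
```

The point of the ordering is that `setSources` runs after the callback has chosen its endpoints, so the resolver can never see a replica that was not queried.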

[2/5] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread jbellis
fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b55ab4f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b55ab4f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b55ab4f3

Branch: refs/heads/cassandra-1.1
Commit: b55ab4f3b23b9f3f056ffcc526d2b06989e024fb
Parents: cb0efd0
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:31:43 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |4 +-
 3 files changed, 24 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index 2ae0a98..b6a99b2 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -88,7 +88,7 @@ public class NetworkTopologyStrategy extends 
AbstractReplicationStrategy
 
 // collect endpoints in this DC
 TokenMetadata dcTokens = new TokenMetadata();
-for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
 {
 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
 dcTokens.updateNormalToken(tokenEntry.getKey(), 
tokenEntry.getValue());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index ebb094b..0942a5d 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -408,11 +408,6 @@ public class TokenMetadata
 }
 }
 
-public Set<Map.Entry<Token,InetAddress>> entrySet()
-{
-return tokenToEndpointMap.entrySet();
-}
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -713,9 +708,28 @@ public class TokenMetadata
 }
 
 /**
- * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+ * @return a token to endpoint map to consider for read operations on the cluster.
+ */
+public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+{
+lock.readLock().lock();
+try
+{
+Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+map.putAll(tokenToEndpointMap);
+return map;
+}
+finally
+{
+lock.readLock().unlock();
+}
+}
+
+/**
+ * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+ * in the cluster.
  */
-public Map<Token, InetAddress> getTokenToEndpointMap()
+public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
 {
 lock.readLock().lock();
 try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 1f7a18d..f82fe32 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -854,7 +854,7 @@ public class StorageService implements 
IEndpointStateChangeSubscriber, StorageSe
 
 public Map<Token, String> getTokenToEndpointMap()
 {
-Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
 Map<Token, String> mapString = new HashMap<Token, String>(mapInetAddress.size());
 for (Map.Entry<Token, InetAddress> entry : mapInetAddress.entrySet())
 {
@@ -2074,7 +2074,7 @@ public 
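The core of the CASSANDRA-3417 patch: instead of exposing the internal `tokenToEndpointMap` through `entrySet()` (which lets callers iterate it outside the lock), take the read lock and hand back a private copy. A self-contained sketch of that copy-under-read-lock pattern, with invented names (this is not Cassandra's `TokenMetadata`):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the copy-under-read-lock pattern: never hand out a live
// view of a lock-guarded map; return a snapshot taken under the lock.
public class TokenDirectory {
    private final Map<String, String> tokenToEndpoint = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public void update(String token, String endpoint) {
        lock.writeLock().lock();
        try {
            tokenToEndpoint.put(token, endpoint);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Callers iterate a private copy, so a concurrent write can no longer
    // cause a ConcurrentModificationException mid-iteration.
    public Map<String, String> snapshotForReading() {
        lock.readLock().lock();
        try {
            return new HashMap<>(tokenToEndpoint);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        TokenDirectory dir = new TokenDirectory();
        dir.update("token1", "10.0.0.1");
        Map<String, String> snap = dir.snapshotForReading();
        dir.update("token2", "10.0.0.2"); // later write; snapshot unaffected
        System.out.println(snap.size()); // prints 1
    }
}
```

The copy costs an allocation per read, which is why the patch distinguishes a cheap map "for reading" from the heavier normal-and-bootstrapping variant rather than snapshotting on every internal access.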

[3/3] git commit: Pig's objToBB should handle all types. Patch by brandonwilliams, reviewed by xedin for CASSANDRA-3886

2012-02-13 Thread jbellis
Pig's objToBB should handle all types.
Patch by brandonwilliams, reviewed by xedin for CASSANDRA-3886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bcad0688
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bcad0688
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bcad0688

Branch: refs/heads/trunk
Commit: bcad06883dc599c77393bc4eb2807be9da3d294a
Parents: c49a149
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Feb 10 10:07:53 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:43:03 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bcad0688/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 9c6dd30..ebd118c 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -561,8 +561,18 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o == null)
 return (ByteBuffer)o;
 if (o instanceof java.lang.String)
-o = new DataByteArray((String)o);
-return ByteBuffer.wrap(((DataByteArray) o).get());
+return ByteBuffer.wrap(new DataByteArray((String)o).get());
+if (o instanceof Integer)
+return IntegerType.instance.decompose(BigInteger.valueOf((Integer)o));
+if (o instanceof Long)
+return LongType.instance.decompose((Long)o);
+if (o instanceof Float)
+return FloatType.instance.decompose((Float)o);
+if (o instanceof Double)
+return DoubleType.instance.decompose((Double)o);
+if (o instanceof UUID)
+return ByteBuffer.wrap(UUIDGen.decompose((UUID) o));
+return null;
 }
 
 public void putNext(Tuple t) throws ExecException, IOException
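The CASSANDRA-3886 patch turns `objToBB` into an instanceof dispatch: one branch per supported Pig type, each delegating to the matching serializer. The same dispatch shape can be sketched with plain `java.nio` standing in for Cassandra's type classes (a hypothetical stand-in, not the committed code):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of the objToBB dispatch idea: branch on the runtime type and
// serialize each supported type to a ByteBuffer; null for anything else.
public class ObjToBB {
    static ByteBuffer toByteBuffer(Object o) {
        if (o == null)
            return null;
        if (o instanceof String)
            return ByteBuffer.wrap(((String) o).getBytes(StandardCharsets.UTF_8));
        if (o instanceof Integer) {
            ByteBuffer b = ByteBuffer.allocate(4).putInt((Integer) o);
            b.flip();
            return b;
        }
        if (o instanceof Long) {
            ByteBuffer b = ByteBuffer.allocate(8).putLong((Long) o);
            b.flip();
            return b;
        }
        if (o instanceof Double) {
            ByteBuffer b = ByteBuffer.allocate(8).putDouble((Double) o);
            b.flip();
            return b;
        }
        return null; // unsupported type, mirroring the patch's fall-through
    }

    public static void main(String[] args) {
        System.out.println(toByteBuffer("abc").remaining()); // 3 bytes
        System.out.println(toByteBuffer(42).remaining());    // 4 bytes
        System.out.println(toByteBuffer(42L).remaining());   // 8 bytes
    }
}
```

Note the ordering matters only for readability here; the branches are mutually exclusive, and the final `return null` is the catch-all for types with no serializer.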



[jira] [Created] (CASSANDRA-3905) fix typo in nodetool help for repair

2012-02-13 Thread Peter Schuller (Created) (JIRA)
fix typo in nodetool help for repair


 Key: CASSANDRA-3905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Trivial


It says to use {{-rp}} instead of {{-pr}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[2/3] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread jbellis
fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/79050449
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/79050449
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/79050449

Branch: refs/heads/trunk
Commit: 79050449e7e953a301e275a755a2b5f3a5b0d06a
Parents: bcad068
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:44:29 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |6 ++--
 3 files changed, 25 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/79050449/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index ffbabd6..382e224 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -90,7 +90,7 @@ public class NetworkTopologyStrategy extends 
AbstractReplicationStrategy
 // collect endpoints in this DC; add in bulk to token meta data 
for computational complexity
 // reasons (CASSANDRA-3831).
 Set<Pair<Token, InetAddress>> dcTokensToUpdate = new HashSet<Pair<Token, InetAddress>>();
-for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
 {
 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
 dcTokensToUpdate.add(Pair.create(tokenEntry.getKey(), 
tokenEntry.getValue()));

http://git-wip-us.apache.org/repos/asf/cassandra/blob/79050449/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index b02daae..4d89f92 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -436,11 +436,6 @@ public class TokenMetadata
 }
 }
 
-public Set<Map.Entry<Token,InetAddress>> entrySet()
-{
-return tokenToEndpointMap.entrySet();
-}
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -741,9 +736,28 @@ public class TokenMetadata
 }
 
 /**
- * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+ * @return a token to endpoint map to consider for read operations on the cluster.
+ */
+public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+{
+lock.readLock().lock();
+try
+{
+Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+map.putAll(tokenToEndpointMap);
+return map;
+}
+finally
+{
+lock.readLock().unlock();
+}
+}
+
+/**
+ * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+ * in the cluster.
  */
-public Map<Token, InetAddress> getTokenToEndpointMap()
+public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
 {
 lock.readLock().lock();
 try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/79050449/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index c1681b9..9bcd54d 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -908,7 +908,7 @@ public class StorageService implements 
IEndpointStateChangeSubscriber, StorageSe
 
 public Map<String, String> getTokenToEndpointMap()
 {
-Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
 // in order to preserve tokens in ascending order, we 

[1/3] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.0 104791412 -> 4ab6fad94
  refs/heads/trunk c49a1497e -> 79050449e


fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4ab6fad9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4ab6fad9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4ab6fad9

Branch: refs/heads/cassandra-1.0
Commit: 4ab6fad945cada90497a8cf523a4c868932834c2
Parents: 1047914
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:44:50 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |4 +-
 3 files changed, 24 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index 2ae0a98..b6a99b2 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -88,7 +88,7 @@ public class NetworkTopologyStrategy extends 
AbstractReplicationStrategy
 
 // collect endpoints in this DC
 TokenMetadata dcTokens = new TokenMetadata();
-for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
 {
 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
 dcTokens.updateNormalToken(tokenEntry.getKey(), 
tokenEntry.getValue());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index ebb094b..0942a5d 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -408,11 +408,6 @@ public class TokenMetadata
 }
 }
 
-public Set<Map.Entry<Token,InetAddress>> entrySet()
-{
-return tokenToEndpointMap.entrySet();
-}
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -713,9 +708,28 @@ public class TokenMetadata
 }
 
 /**
- * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+ * @return a token to endpoint map to consider for read operations on the cluster.
+ */
+public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+{
+lock.readLock().lock();
+try
+{
+Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+map.putAll(tokenToEndpointMap);
+return map;
+}
+finally
+{
+lock.readLock().unlock();
+}
+}
+
+/**
+ * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+ * in the cluster.
  */
-public Map<Token, InetAddress> getTokenToEndpointMap()
+public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
 {
 lock.readLock().lock();
 try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 1f7a18d..f82fe32 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -854,7 +854,7 @@ public class StorageService implements 
IEndpointStateChangeSubscriber, StorageSe
 
 public Map<Token, String> getTokenToEndpointMap()
 {
-Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
 Map<Token, String> mapString = new HashMap<Token, String>(mapInetAddress.size());
   

[jira] [Updated] (CASSANDRA-3905) fix typo in nodetool help for repair

2012-02-13 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3905:
--

Attachment: CASSANDRA-3905.txt

 fix typo in nodetool help for repair
 

 Key: CASSANDRA-3905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Trivial
 Fix For: 1.1.0

 Attachments: CASSANDRA-3905.txt


 It says to use {{-rp}} instead of {{-pr}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3905) fix typo in nodetool help for repair

2012-02-13 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3905:
--

Fix Version/s: 1.1.0

 fix typo in nodetool help for repair
 

 Key: CASSANDRA-3905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Trivial
 Fix For: 1.1.0

 Attachments: CASSANDRA-3905.txt


 It says to use {{-rp}} instead of {{-pr}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3905) fix typo in nodetool help for repair

2012-02-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207261#comment-13207261
 ] 

Jonathan Ellis commented on CASSANDRA-3905:
---

+1

 fix typo in nodetool help for repair
 

 Key: CASSANDRA-3905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Trivial
 Fix For: 1.1.0

 Attachments: CASSANDRA-3905.txt


 It says to use {{-rp}} instead of {{-pr}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[3/8] git commit: fix race between cleanup and flush on secondary index CFSes patch by yukim and jbellis for CASSANDRA-3712

2012-02-13 Thread jbellis
fix race between cleanup and flush on secondary index CFSes
patch by yukim and jbellis for CASSANDRA-3712


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9ca84786
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9ca84786
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9ca84786

Branch: refs/heads/cassandra-1.1
Commit: 9ca84786b5be14b0a881268e3649b697f7f893b9
Parents: 4ab6fad
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 16:30:34 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 16:30:34 2012 -0600

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   24 ++-
 3 files changed, 18 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0875da5..500b9fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.0.8
+ * fix race between cleanup and flush on secondary index CFSes (CASSANDRA-3712)
  * avoid including non-queried nodes in rangeslice read repair
(CASSANDRA-3843)
  * Only snapshot CF being compacted for snapshot_before_compaction 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/Table.java
--
diff --git a/src/java/org/apache/cassandra/db/Table.java 
b/src/java/org/apache/cassandra/db/Table.java
index 0168f0c..f954fbc 100644
--- a/src/java/org/apache/cassandra/db/Table.java
+++ b/src/java/org/apache/cassandra/db/Table.java
@@ -71,7 +71,7 @@ public class Table
  *
  * (Enabling fairness in the RRWL is observed to decrease throughput, so 
we leave it off.)
  */
-static final ReentrantReadWriteLock switchLock = new 
ReentrantReadWriteLock();
+public static final ReentrantReadWriteLock switchLock = new 
ReentrantReadWriteLock();
 
 // It is possible to call Table.open without a running daemon, so it makes 
sense to ensure
 // proper directories here as well as in CassandraDaemon.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index caaf6d2..97e5067 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -729,14 +729,13 @@ public class CompactionManager implements 
CompactionManagerMBean
 }
 else
 {
-  
 cfs.invalidateCachedRow(row.getKey());
-
+
 if (!indexedColumns.isEmpty() || isCommutative)
 {
 if (indexedColumnsInRow != null)
 indexedColumnsInRow.clear();
-
+
 while (row.hasNext())
 {
 IColumn column = row.next();
@@ -746,13 +745,24 @@ public class CompactionManager implements 
CompactionManagerMBean
 {
 if (indexedColumnsInRow == null)
 indexedColumnsInRow = new ArrayList<IColumn>();
-
+
 indexedColumnsInRow.add(column);
 }
 }
-
+
 if (indexedColumnsInRow != null && !indexedColumnsInRow.isEmpty())
-
cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+{
+// acquire memtable lock here because 
secondary index deletion may cause a race. See CASSANDRA-3712
+Table.switchLock.readLock().lock();
+try
+{
+
cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+}
+finally
+{
+
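The CASSANDRA-3712 fix makes `Table.switchLock` public and takes its read lock around the secondary-index deletes, so a memtable switch (which acquires the write lock) can never interleave with a deletion batch. A stripped-down sketch of that guard, with invented names (not Cassandra's actual classes):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the switchLock guard: mutators share the read lock, the
// memtable switch takes the exclusive write lock, so a switch never
// runs in the middle of an index-deletion batch.
public class SwitchLockSketch {
    static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
    static final List<String> index = new ArrayList<>();

    static void deleteFromIndexes(List<String> keys) {
        switchLock.readLock().lock(); // many cleaners may hold this at once
        try {
            index.removeAll(keys);
        } finally {
            switchLock.readLock().unlock();
        }
    }

    static void switchMemtable(Runnable flush) {
        switchLock.writeLock().lock(); // exclusive: waits for all readers
        try {
            flush.run();
        } finally {
            switchLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        index.add("k1");
        index.add("k2");
        deleteFromIndexes(Arrays.asList("k1"));
        switchMemtable(() -> System.out.println("flushed " + index));
    }
}
```

A read lock (rather than full mutual exclusion) is the right tool because many cleanup threads may delete concurrently; only the switch itself needs to exclude them all.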

[2/8] git commit: fix race between cleanup and flush on secondary index CFSes patch by yukim and jbellis for CASSANDRA-3712

2012-02-13 Thread jbellis
fix race between cleanup and flush on secondary index CFSes
patch by yukim and jbellis for CASSANDRA-3712


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9ca84786
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9ca84786
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9ca84786

Branch: refs/heads/cassandra-1.0
Commit: 9ca84786b5be14b0a881268e3649b697f7f893b9
Parents: 4ab6fad
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 16:30:34 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 16:30:34 2012 -0600

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   24 ++-
 3 files changed, 18 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0875da5..500b9fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.0.8
+ * fix race between cleanup and flush on secondary index CFSes (CASSANDRA-3712)
  * avoid including non-queried nodes in rangeslice read repair
(CASSANDRA-3843)
  * Only snapshot CF being compacted for snapshot_before_compaction 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/Table.java
--
diff --git a/src/java/org/apache/cassandra/db/Table.java 
b/src/java/org/apache/cassandra/db/Table.java
index 0168f0c..f954fbc 100644
--- a/src/java/org/apache/cassandra/db/Table.java
+++ b/src/java/org/apache/cassandra/db/Table.java
@@ -71,7 +71,7 @@ public class Table
  *
  * (Enabling fairness in the RRWL is observed to decrease throughput, so we leave it off.)
  */
-static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
+public static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
 
 // It is possible to call Table.open without a running daemon, so it makes 
sense to ensure
 // proper directories here as well as in CassandraDaemon.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index caaf6d2..97e5067 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -729,14 +729,13 @@ public class CompactionManager implements CompactionManagerMBean
 }
 else
 {
-  
 cfs.invalidateCachedRow(row.getKey());
-
+
 if (!indexedColumns.isEmpty() || isCommutative)
 {
 if (indexedColumnsInRow != null)
 indexedColumnsInRow.clear();
-
+
 while (row.hasNext())
 {
 IColumn column = row.next();
@@ -746,13 +745,24 @@ public class CompactionManager implements CompactionManagerMBean
 {
 if (indexedColumnsInRow == null)
 indexedColumnsInRow = new ArrayList<IColumn>();
-
+
 indexedColumnsInRow.add(column);
 }
 }
-
+
 if (indexedColumnsInRow != null && !indexedColumnsInRow.isEmpty())
-cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+{
+// acquire memtable lock here because secondary index deletion may cause a race. See CASSANDRA-3712
+Table.switchLock.readLock().lock();
+try
+{
+cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+}
+finally
+{
+Table.switchLock.readLock().unlock();
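The CASSANDRA-3712 patch coordinates secondary-index cleanup with memtable flushes through `Table.switchLock`, a `ReentrantReadWriteLock`: index deletions take the read side, so a flush (which takes the write side) can never interleave with an in-flight deletion. A minimal, self-contained sketch of that locking pattern follows; `SwitchLockSketch`, its `index` list, and the method bodies are illustrative stand-ins, not Cassandra's real signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the read/write-lock pattern applied in CASSANDRA-3712.
// Many cleanup threads may hold the read side concurrently, but a
// memtable switch (flush) takes the write side and excludes them all.
public class SwitchLockSketch
{
    public static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
    public static final List<String> index = new ArrayList<String>();

    public static void deleteFromIndexes(String key)
    {
        switchLock.readLock().lock(); // cleanup: shared access
        try
        {
            index.remove(key);
        }
        finally
        {
            switchLock.readLock().unlock();
        }
    }

    public static void switchMemtable()
    {
        switchLock.writeLock().lock(); // flush: exclusive access
        try
        {
            index.clear();
        }
        finally
        {
            switchLock.writeLock().unlock();
        }
    }

    public static void main(String[] args)
    {
        index.add("k1");
        deleteFromIndexes("k1");
        switchMemtable();
        System.out.println(index); // prints []
    }
}
```

Note that the unlock lives in a `finally` block, exactly as in the patch, so an exception during deletion cannot leave the flush path blocked forever.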

[5/8] git commit: Add catch-all cast back to CassandraStorage. Patch by brandonwilliams reviewed by xedin for CASSANDRA-3886

2012-02-13 Thread jbellis
Add catch-all cast back to CassandraStorage.
Patch by brandonwilliams reviewed by xedin for CASSANDRA-3886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/10479141
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/10479141
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/10479141

Branch: refs/heads/cassandra-1.1
Commit: 10479141285c885fcd77571a9b2397d684ecf826
Parents: 2a55479
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 14:45:48 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 14:50:52 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/10479141/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 76a291a..975d5ba 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -502,7 +502,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 return DoubleType.instance.decompose((Double)o);
 if (o instanceof UUID)
 return ByteBuffer.wrap(UUIDGen.decompose((UUID) o));
-return null;
+return ByteBuffer.wrap(((DataByteArray) o).get());
 }
 
 public void putNext(Tuple t) throws ExecException, IOException
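The CASSANDRA-3886 change restores a catch-all at the end of `objToBB`'s type dispatch, so an unrecognized object is serialized instead of silently becoming `null`. A hedged, stdlib-only sketch of that shape follows; the real catch-all casts to Pig's `DataByteArray`, which this sketch substitutes with a `toString()` fallback, and everything besides the `objToBB` name is illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of the dispatch-plus-catch-all shape of CassandraStorage.objToBB
// after CASSANDRA-3886: known types get a typed conversion, and anything
// else falls through to a generic conversion rather than returning null.
public class ObjToBBSketch
{
    public static ByteBuffer objToBB(Object o)
    {
        if (o == null)
            return null;
        if (o instanceof String)
            return ByteBuffer.wrap(((String) o).getBytes(StandardCharsets.UTF_8));
        if (o instanceof Long)
            return ByteBuffer.allocate(8).putLong(0, (Long) o);
        if (o instanceof Integer)
            return ByteBuffer.allocate(4).putInt(0, (Integer) o);
        // catch-all: never drop an unrecognized type on the floor
        return ByteBuffer.wrap(o.toString().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args)
    {
        System.out.println(objToBB(42).getInt(0)); // prints 42
    }
}
```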



[4/8] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread jbellis
fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4ab6fad9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4ab6fad9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4ab6fad9

Branch: refs/heads/cassandra-1.1
Commit: 4ab6fad945cada90497a8cf523a4c868932834c2
Parents: 1047914
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:44:50 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |4 +-
 3 files changed, 24 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index 2ae0a98..b6a99b2 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -88,7 +88,7 @@ public class NetworkTopologyStrategy extends AbstractReplicationStrategy
 
 // collect endpoints in this DC
 TokenMetadata dcTokens = new TokenMetadata();
-for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
 {
 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
 dcTokens.updateNormalToken(tokenEntry.getKey(), tokenEntry.getValue());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index ebb094b..0942a5d 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -408,11 +408,6 @@ public class TokenMetadata
 }
 }
 
-public Set<Map.Entry<Token,InetAddress>> entrySet()
-{
-return tokenToEndpointMap.entrySet();
-}
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -713,9 +708,28 @@ public class TokenMetadata
 }
 
 /**
- * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+ * @return a token to endpoint map to consider for read operations on the cluster.
+ */
+public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+{
+lock.readLock().lock();
+try
+{
+Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+map.putAll(tokenToEndpointMap);
+return map;
+}
+finally
+{
+lock.readLock().unlock();
+}
+}
+
+/**
+ * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+ * in the cluster.
  */
-public Map<Token, InetAddress> getTokenToEndpointMap()
+public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
 {
 lock.readLock().lock();
 try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 1f7a18d..f82fe32 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -854,7 +854,7 @@ public class StorageService implements 
IEndpointStateChangeSubscriber, StorageSe
 
 public MapToken, String getTokenToEndpointMap()
 {
-Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
 Map<Token, String> mapString = new HashMap<Token, String>(mapInetAddress.size());
 for (Map.Entry<Token, InetAddress> entry : mapInetAddress.entrySet())
 {
@@ -2074,7 +2074,7 @@ public 
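The CASSANDRA-3417 change removes `entrySet()`, which handed callers the live backing map to iterate without holding the lock, in favor of copying under the read lock and letting callers iterate the stable copy freely. A minimal sketch of that defensive-copy idea follows; `TokenMapSketch` and `snapshotForReading` are illustrative names, with plain strings standing in for `Token`/`InetAddress`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the CASSANDRA-3417 fix: never expose a lock-guarded map's
// live view; copy it while holding the read lock and return the copy.
public class TokenMapSketch
{
    private final Map<String, String> tokenToEndpoint = new HashMap<String, String>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public void put(String token, String endpoint)
    {
        lock.writeLock().lock();
        try
        {
            tokenToEndpoint.put(token, endpoint);
        }
        finally
        {
            lock.writeLock().unlock();
        }
    }

    // Copy under the read lock; callers may iterate the result
    // without any synchronization, and later writes don't affect it.
    public Map<String, String> snapshotForReading()
    {
        lock.readLock().lock();
        try
        {
            return new HashMap<String, String>(tokenToEndpoint);
        }
        finally
        {
            lock.readLock().unlock();
        }
    }
}
```

The cost is one copy per call, which the patch accepts in exchange for eliminating unsynchronized iteration over shared state.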

[1/8] git commit: merge from 1.0

2012-02-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.0 4ab6fad94 -> 9ca84786b
  refs/heads/cassandra-1.1 c5986871c -> c98edc3e8


merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c98edc3e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c98edc3e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c98edc3e

Branch: refs/heads/cassandra-1.1
Commit: c98edc3e81c8c1e19370802ab6c82a7e5ff00f42
Parents: c598687 9ca8478
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 16:31:41 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 16:31:41 2012 -0600

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   26 +-
 .../cassandra/hadoop/pig/CassandraStorage.java |6 ++--
 .../apache/cassandra/thrift/CustomTHsHaServer.java |8 
 5 files changed, 30 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c98edc3e/CHANGES.txt
--
diff --cc CHANGES.txt
index 359e699,500b9fb..d39c9dd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,81 -1,5 +1,82 @@@
 +1.1-dev
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * defragment rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delevery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created 
(CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair 
(CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Make stress.java insert operation to use microseconds (CASSANDRA-3725)
 + * Allows (internally) doing a range query with a limit of columns instead of
 +   rows (CASSANDRA-3742)
 + * Allow rangeSlice queries to be start/end inclusive/exclusive 
(CASSANDRA-3749)
 + * Fix BulkLoader to support new SSTable layout and add stream
 +   throttling to prevent an NPE when there is no yaml config 

[8/8] git commit: Fix misplaced 'new' keyword

2012-02-13 Thread jbellis
Fix misplaced 'new' keyword


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/651ca528
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/651ca528
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/651ca528

Branch: refs/heads/cassandra-1.1
Commit: 651ca528d24f088581055cfbd4c70115e04899ea
Parents: cb0efd0
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 13:41:03 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 13:41:03 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/651ca528/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 63758ab..b9977a5 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -491,7 +491,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o == null)
 return (ByteBuffer)o;
 if (o instanceof java.lang.String)
-return new ByteBuffer.wrap(DataByteArray((String)o).get());
+return ByteBuffer.wrap(new DataByteArray((String)o).get());
 if (o instanceof Integer)
 return IntegerType.instance.decompose((BigInteger)o);
 if (o instanceof Long)



[6/8] git commit: CASSANDRA-3867 patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

2012-02-13 Thread jbellis
CASSANDRA-3867
patch by Vijay; reviewed by Brandon Williams for CASSANDRA-3867

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2a554798
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2a554798
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2a554798

Branch: refs/heads/cassandra-1.1
Commit: 2a5547981dad7e59be2c26aeb52f5d49d2195b9c
Parents: 4bd3f8d
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Mon Feb 13 12:42:29 2012 -0800
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Mon Feb 13 12:42:29 2012 -0800

--
 .../apache/cassandra/thrift/CustomTHsHaServer.java |8 
 1 files changed, 8 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2a554798/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
--
diff --git a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java 
b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
index 4921678..9bfb4f7 100644
--- a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
+++ b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
@@ -177,6 +177,14 @@ public class CustomTHsHaServer extends TNonblockingServer
 {
 select();
 }
+try
+{
+selector.close(); // CASSANDRA-3867
+}
+catch (IOException e)
+{
+// ignore this exception.
+}
 } 
 catch (Throwable t)
 {
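The CASSANDRA-3867 change closes the `Selector` once the select loop exits, swallowing any `IOException` from `close()`, so the underlying file descriptor is not leaked on server shutdown. A simplified, self-contained sketch of that shutdown shape follows; the `runAndClose` name and loop body are illustrative (the real `CustomTHsHaServer` loop blocks in `select()` and dispatches channel work).

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Sketch of the CASSANDRA-3867 fix: whatever ends the select loop,
// the Selector is closed in cleanup so its descriptor is released,
// and a failure to close is deliberately ignored.
public class SelectorLoopSketch
{
    public static boolean runAndClose(int iterations)
    {
        Selector selector = null;
        try
        {
            selector = Selector.open();
            for (int i = 0; i < iterations; i++)
                selector.selectNow(); // stand-in for the blocking select()
        }
        catch (IOException e)
        {
            // select failure; fall through to cleanup
        }
        finally
        {
            if (selector != null)
            {
                try
                {
                    selector.close(); // the fix: release the descriptor
                }
                catch (IOException e)
                {
                    // ignore, as the patch does
                }
            }
        }
        return selector != null && !selector.isOpen();
    }

    public static void main(String[] args)
    {
        System.out.println(runAndClose(3)); // prints true
    }
}
```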



[7/8] git commit: Integer corresponds to Int32Type

2012-02-13 Thread jbellis
Integer corresponds to Int32Type


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4bd3f8d8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4bd3f8d8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4bd3f8d8

Branch: refs/heads/cassandra-1.1
Commit: 4bd3f8d86fcc29259dd0d508873125f88ce588e4
Parents: 651ca52
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 13:48:20 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 13:48:20 2012 -0600

--
 .../cassandra/hadoop/pig/CassandraStorage.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4bd3f8d8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git 
a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index b9977a5..76a291a 100644
--- a/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -493,7 +493,7 @@ public class CassandraStorage extends LoadFunc implements 
StoreFuncInterface, Lo
 if (o instanceof java.lang.String)
 return ByteBuffer.wrap(new DataByteArray((String)o).get());
 if (o instanceof Integer)
-return IntegerType.instance.decompose((BigInteger)o);
+return Int32Type.instance.decompose((Integer)o);
 if (o instanceof Long)
 return LongType.instance.decompose((Long)o);
 if (o instanceof Float)



[2/12] git commit: fix race between cleanup and flush on secondary index CFSes patch by yukim and jbellis for CASSANDRA-3712

2012-02-13 Thread brandonwilliams
fix race between cleanup and flush on secondary index CFSes
patch by yukim and jbellis for CASSANDRA-3712


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9ca84786
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9ca84786
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9ca84786

Branch: refs/heads/trunk
Commit: 9ca84786b5be14b0a881268e3649b697f7f893b9
Parents: 4ab6fad
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 16:30:34 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 16:30:34 2012 -0600

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   24 ++-
 3 files changed, 18 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0875da5..500b9fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.0.8
+ * fix race between cleanup and flush on secondary index CFSes (CASSANDRA-3712)
  * avoid including non-queried nodes in rangeslice read repair
(CASSANDRA-3843)
  * Only snapshot CF being compacted for snapshot_before_compaction 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/Table.java
--
diff --git a/src/java/org/apache/cassandra/db/Table.java 
b/src/java/org/apache/cassandra/db/Table.java
index 0168f0c..f954fbc 100644
--- a/src/java/org/apache/cassandra/db/Table.java
+++ b/src/java/org/apache/cassandra/db/Table.java
@@ -71,7 +71,7 @@ public class Table
  *
  * (Enabling fairness in the RRWL is observed to decrease throughput, so we leave it off.)
  */
-static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
+public static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
 
 // It is possible to call Table.open without a running daemon, so it makes 
sense to ensure
 // proper directories here as well as in CassandraDaemon.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ca84786/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index caaf6d2..97e5067 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -729,14 +729,13 @@ public class CompactionManager implements CompactionManagerMBean
 }
 else
 {
-  
 cfs.invalidateCachedRow(row.getKey());
-
+
 if (!indexedColumns.isEmpty() || isCommutative)
 {
 if (indexedColumnsInRow != null)
 indexedColumnsInRow.clear();
-
+
 while (row.hasNext())
 {
 IColumn column = row.next();
@@ -746,13 +745,24 @@ public class CompactionManager implements CompactionManagerMBean
 {
 if (indexedColumnsInRow == null)
 indexedColumnsInRow = new ArrayList<IColumn>();
-
+
 indexedColumnsInRow.add(column);
 }
 }
-
+
 if (indexedColumnsInRow != null && !indexedColumnsInRow.isEmpty())
-cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+{
+// acquire memtable lock here because secondary index deletion may cause a race. See CASSANDRA-3712
+Table.switchLock.readLock().lock();
+try
+{
+cfs.indexManager.deleteFromIndexes(row.getKey(), indexedColumnsInRow);
+}
+finally
+{
+Table.switchLock.readLock().unlock();
+  

[3/12] git commit: Merge from 1.1

2012-02-13 Thread brandonwilliams
Merge from 1.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4d55a36a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4d55a36a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4d55a36a

Branch: refs/heads/trunk
Commit: 4d55a36aa2b94329507f931a3dffbc4c3547bdf0
Parents: 7905044 c98edc3
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Feb 13 16:29:36 2012 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Feb 13 16:29:36 2012 -0600

--
 CHANGES.txt|4 +--
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   26 +-
 .../cassandra/hadoop/pig/CassandraStorage.java |6 ++--
 4 files changed, 22 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4d55a36a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--



[5/12] git commit: merge from 1.0

2012-02-13 Thread brandonwilliams
merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5986871
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5986871
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5986871

Branch: refs/heads/trunk
Commit: c5986871c007f8c552ff624d1fcf064ce6a45c92
Parents: 9a842c7 b55ab4f
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:41:30 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:41:30 2012 -0600

--
 CHANGES.txt|3 --
 .../cassandra/hadoop/pig/CassandraStorage.java |   14 ++-
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |6 ++--
 5 files changed, 37 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5986871/CHANGES.txt
--
diff --cc CHANGES.txt
index e115a2a,0875da5..359e699
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,83 -1,3 +1,80 @@@
 +1.1-dev
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * defragment rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delevery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created 
(CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair 
(CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Make stress.java insert operation to use microseconds (CASSANDRA-3725)
 + * Allows (internally) doing a range query with a limit of columns instead of
 +   rows (CASSANDRA-3742)
 + * Allow rangeSlice queries to be start/end inclusive/exclusive 
(CASSANDRA-3749)
 + * Fix BulkLoader to support new SSTable layout and add stream
 +   throttling to prevent an NPE when there is no yaml config (CASSANDRA-3752)
 + * Allow concurrent schema migrations (CASSANDRA-1391, 3832)
 + * Add SnapshotCommand to trigger snapshot on 

[4/12] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread brandonwilliams
fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4ab6fad9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4ab6fad9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4ab6fad9

Branch: refs/heads/trunk
Commit: 4ab6fad945cada90497a8cf523a4c868932834c2
Parents: 1047914
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:44:50 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |4 +-
 3 files changed, 24 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index 2ae0a98..b6a99b2 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -88,7 +88,7 @@ public class NetworkTopologyStrategy extends AbstractReplicationStrategy
 
 // collect endpoints in this DC
 TokenMetadata dcTokens = new TokenMetadata();
-for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
 {
 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
 dcTokens.updateNormalToken(tokenEntry.getKey(), tokenEntry.getValue());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index ebb094b..0942a5d 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -408,11 +408,6 @@ public class TokenMetadata
 }
 }
 
-public Set<Map.Entry<Token,InetAddress>> entrySet()
-{
-return tokenToEndpointMap.entrySet();
-}
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -713,9 +708,28 @@ public class TokenMetadata
 }
 
 /**
- * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+ * @return a token to endpoint map to consider for read operations on the cluster.
+ */
+public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+{
+lock.readLock().lock();
+try
+{
+Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+map.putAll(tokenToEndpointMap);
+return map;
+}
+finally
+{
+lock.readLock().unlock();
+}
+}
+
+/**
+ * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+ * in the cluster.
  */
-public Map<Token, InetAddress> getTokenToEndpointMap()
+public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
 {
 lock.readLock().lock();
 try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ab6fad9/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 1f7a18d..f82fe32 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -854,7 +854,7 @@ public class StorageService implements 
IEndpointStateChangeSubscriber, StorageSe
 
 public MapToken, String getTokenToEndpointMap()
 {
-Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
 Map<Token, String> mapString = new HashMap<Token, String>(mapInetAddress.size());
 for (Map.Entry<Token, InetAddress> entry : mapInetAddress.entrySet())
 {
@@ -2074,7 +2074,7 @@ public class 

[1/12] git commit: merge from 1.0

2012-02-13 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 79050449e -> 4d55a36aa


merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c98edc3e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c98edc3e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c98edc3e

Branch: refs/heads/trunk
Commit: c98edc3e81c8c1e19370802ab6c82a7e5ff00f42
Parents: c598687 9ca8478
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 16:31:41 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 16:31:41 2012 -0600

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Table.java|2 +-
 .../cassandra/db/compaction/CompactionManager.java |   26 +-
 .../cassandra/hadoop/pig/CassandraStorage.java |6 ++--
 .../apache/cassandra/thrift/CustomTHsHaServer.java |8 
 5 files changed, 30 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c98edc3e/CHANGES.txt
--
diff --cc CHANGES.txt
index 359e699,500b9fb..d39c9dd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,81 -1,5 +1,82 @@@
 +1.1-dev
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * defragment rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that SELECT first ... * isn't really SELECT * 
(CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delivery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created 
(CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair 
(CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Make stress.java insert operation to use microseconds (CASSANDRA-3725)
 + * Allows (internally) doing a range query with a limit of columns instead of
 +   rows (CASSANDRA-3742)
 + * Allow rangeSlice queries to be start/end inclusive/exclusive 
(CASSANDRA-3749)
 + * Fix BulkLoader to support new SSTable layout and add stream
 +   throttling to prevent an NPE when there is no yaml config (CASSANDRA-3752)
 + * Allow concurrent schema migrations (CASSANDRA-1391, 

[6/12] git commit: fix unsynchronized use of TokenMetadata.entrySet patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417

2012-02-13 Thread brandonwilliams
fix unsynchronized use of TokenMetadata.entrySet
patch by Peter Schuller; reviewed by jbellis for CASSANDRA-3417


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b55ab4f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b55ab4f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b55ab4f3

Branch: refs/heads/trunk
Commit: b55ab4f3b23b9f3f056ffcc526d2b06989e024fb
Parents: cb0efd0
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon Feb 13 15:31:43 2012 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Feb 13 15:31:43 2012 -0600

--
 .../cassandra/locator/NetworkTopologyStrategy.java |2 +-
 .../apache/cassandra/locator/TokenMetadata.java|   28 +++
 .../apache/cassandra/service/StorageService.java   |4 +-
 3 files changed, 24 insertions(+), 10 deletions(-)
--
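The fix above replaces direct exposure of the internal map's `entrySet()` with a defensive copy taken under the read lock, so callers can iterate without racing concurrent writers. A minimal sketch of that pattern, using illustrative names (`Registry`, `snapshot`) rather than Cassandra's own classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the copy-under-read-lock pattern applied in CASSANDRA-3417:
// never hand out a live view (entrySet, keySet) of a lock-guarded map;
// hand out a stable copy made while holding the read lock.
public class Registry
{
    private final Map<String, String> internal = new HashMap<String, String>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void put(String key, String value)
    {
        lock.writeLock().lock();
        try
        {
            internal.put(key, value);
        }
        finally
        {
            lock.writeLock().unlock();
        }
    }

    // Stable copy: callers may iterate it freely; later writers mutate
    // 'internal', never the returned map, so iteration cannot throw
    // ConcurrentModificationException.
    public Map<String, String> snapshot()
    {
        lock.readLock().lock();
        try
        {
            return new HashMap<String, String>(internal);
        }
        finally
        {
            lock.readLock().unlock();
        }
    }
}
```

The cost is one map copy per read, which the patch accepts in exchange for removing the unsynchronized iteration; callers that only need a point-in-time view (as `NetworkTopologyStrategy` does here) are unaffected by writes that land after the copy.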


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java 
b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
index 2ae0a98..b6a99b2 100644
--- a/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
+++ b/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
@@ -88,7 +88,7 @@ public class NetworkTopologyStrategy extends AbstractReplicationStrategy
 
             // collect endpoints in this DC
             TokenMetadata dcTokens = new TokenMetadata();
-            for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.entrySet())
+            for (Entry<Token, InetAddress> tokenEntry : tokenMetadata.getTokenToEndpointMapForReading().entrySet())
             {
                 if (snitch.getDatacenter(tokenEntry.getValue()).equals(dcName))
                     dcTokens.updateNormalToken(tokenEntry.getKey(), tokenEntry.getValue());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/locator/TokenMetadata.java
--
diff --git a/src/java/org/apache/cassandra/locator/TokenMetadata.java 
b/src/java/org/apache/cassandra/locator/TokenMetadata.java
index ebb094b..0942a5d 100644
--- a/src/java/org/apache/cassandra/locator/TokenMetadata.java
+++ b/src/java/org/apache/cassandra/locator/TokenMetadata.java
@@ -408,11 +408,6 @@ public class TokenMetadata
 }
 }
 
-    public Set<Map.Entry<Token,InetAddress>> entrySet()
-    {
-        return tokenToEndpointMap.entrySet();
-    }
-
 public InetAddress getEndpoint(Token token)
 {
 lock.readLock().lock();
@@ -713,9 +708,28 @@ public class TokenMetadata
 }
 
 /**
-     * Return the Token to Endpoint map for all the node in the cluster, including bootstrapping ones.
+     * @return a token to endpoint map to consider for read operations on the cluster.
+     */
+    public Map<Token, InetAddress> getTokenToEndpointMapForReading()
+    {
+        lock.readLock().lock();
+        try
+        {
+            Map<Token, InetAddress> map = new HashMap<Token, InetAddress>(tokenToEndpointMap.size());
+            map.putAll(tokenToEndpointMap);
+            return map;
+        }
+        finally
+        {
+            lock.readLock().unlock();
+        }
+    }
+
+    /**
+     * @return a (stable copy, won't be modified) Token to Endpoint map for all the normal and bootstrapping nodes
+     * in the cluster.
      */
-    public Map<Token, InetAddress> getTokenToEndpointMap()
+    public Map<Token, InetAddress> getNormalAndBootstrappingTokenToEndpointMap()
     {
         lock.readLock().lock();
         try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b55ab4f3/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 1f7a18d..f82fe32 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -854,7 +854,7 @@ public class StorageService implements IEndpointStateChangeSubscriber, StorageSe
 
     public Map<Token, String> getTokenToEndpointMap()
     {
-        Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getTokenToEndpointMap();
+        Map<Token, InetAddress> mapInetAddress = tokenMetadata_.getNormalAndBootstrappingTokenToEndpointMap();
         Map<Token, String> mapString = new HashMap<Token, String>(mapInetAddress.size());
         for (Map.Entry<Token, InetAddress> entry : mapInetAddress.entrySet())
        {
@@ -2074,7 +2074,7 @@ public class 
