[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-12-11 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242415#comment-14242415
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim], Sure, I can do that. Can you please explain the branching part a bit? 
Do you mean that I should get the latest version from trunk, create a branch 
locally, apply all the changes, push that branch to a different repository 
such as my own GitHub repository, and share the link with you? 

One other opinion: if you look at all the operations I am changing, the 
changes are independent and the implementations are different. In other words, 
the changes for compact and decommission are two totally different things and 
can be committed separately, just as the original repair task was implemented 
and committed separately earlier. For that reason alone, dealing with them 
independently would be ideal. Moreover, if there are changes in all the 
implementations, verifying all of them together and making changes after the 
review process will be more difficult than dealing with them one by one. So if 
you can have a look at the submitted patches one by one at your convenience, I 
think that will be easier for both of us. Feel free to be opinionated and let 
me know; I will do it accordingly. Thanks

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
 cassandra-trunk-decommission-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8449) Allow zero-copy reads again

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242439#comment-14242439
 ] 

Benedict commented on CASSANDRA-8449:
-

bq. Isn't the existing use of OpOrder technically arbitrarily long due to GC 
for instance

Any delay caused by GC to the termination of an OpOrder.Group is instantaneous 
from the point of view of the waiter, since the waiter is also delayed by GC.

Either way, GC is not arbitrarily long in the sense I was referring to. Mostly 
I'm thinking about network consumers that haven't died but are, perhaps, in the 
process of doing so (a GC death spiral), or where the network socket has frozen 
due to some other problem; i.e. where the problem is isolated from the rest of 
the host's functionality, but, by being guarded by an OpOrder, could 
conceivably infect the whole host's functionality. In reality we can probably 
guard against most of the risk, but I would still be reluctant to use this 
scheme with that risk even minimally present, without the ramifications being 
constrained as they are here.
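
For contrast, a minimal sketch of the constrained-ramifications idea: per-resource 
reference counting, so a stuck reader pins only the one resource it holds instead 
of stalling reclamation globally. The names here are illustrative assumptions, not 
Cassandra's actual Ref/OpOrder machinery:

{code}
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountedSegment
{
    private final AtomicInteger refs = new AtomicInteger(1); // creator's reference

    /** Try to take a reference; fails if the segment was already released. */
    boolean ref()
    {
        while (true)
        {
            int n = refs.get();
            if (n == 0)
                return false; // already reclaimed; caller must look elsewhere
            if (refs.compareAndSet(n, n + 1))
                return true;
        }
    }

    void unref()
    {
        if (refs.decrementAndGet() == 0)
            reclaim(); // safe: a zero count can never be resurrected by ref()
    }

    private void reclaim()
    {
        // unmap/delete the underlying resource here
    }
}
{code}

A reader that never calls unref() leaks exactly one segment; it cannot block the 
reclamation of anything else, which is the containment property described above.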


 Allow zero-copy reads again
 ---

 Key: CASSANDRA-8449
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8449
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 3.0


 We disabled zero-copy reads in CASSANDRA-3179 due to in flight reads 
 accessing a ByteBuffer when the data was unmapped by compaction.  Currently 
 this code path is only used for uncompressed reads.
 The actual bytes are in fact copied to the client output buffers for both 
 netty and thrift before being sent over the wire, so the only issue really is 
 the time it takes to process the read internally.  
 This patch adds a slow network read test and changes the tidy() method to 
 actually delete a sstable once the readTimeout has elapsed giving plenty of 
 time to serialize the read.
 Removing this copy causes significantly less GC on the read path and improves 
 the tail latencies:
http://cstar.datastax.com/graph?stats=c0c8ce16-7fea-11e4-959d-42010af0688f&metric=gc_count&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=109.34&ymin=0&ymax=5.5
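
As a rough illustration of the tidy() change described above, deletion can simply 
be postponed until the read timeout has elapsed, so any in-flight read finishes 
before the file disappears. The helper below is a hypothetical sketch, not the 
actual tidy() implementation:

{code}
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class DeferredDeleter
{
    private static final ScheduledExecutorService EXEC =
            Executors.newSingleThreadScheduledExecutor();

    /** Delete the sstable only after readTimeoutMillis, letting in-flight reads drain. */
    static void deleteAfterReadTimeout(File sstable, long readTimeoutMillis)
    {
        EXEC.schedule(() -> {
            if (!sstable.delete())
                sstable.deleteOnExit(); // best-effort fallback
        }, readTimeoutMillis, TimeUnit.MILLISECONDS);
    }
}
{code}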



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8376) Add support for multiple configuration files (or conf.d)

2014-12-11 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242445#comment-14242445
 ] 

Marcus Olsson commented on CASSANDRA-8376:
--

I think option 2 could be a nice feature: using cassandra.yaml as a default 
config and then possibly having some files with cluster- and node-specific 
settings separated in conf.d that override some or all of cassandra.yaml. This 
would also remove the need to special-case the node-specific configuration when 
changing things related to the whole cluster. It could also make it easier to 
migrate between versions in case cassandra.yaml has changed and, e.g., added a 
new setting (assuming that cassandra.yaml is the default before upgrading).

 Add support for multiple configuration files (or conf.d)
 

 Key: CASSANDRA-8376
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8376
 Project: Cassandra
  Issue Type: New Feature
Reporter: Omri Bahumi

 I'm using Chef to generate cassandra.yaml.
 Part of this file is the seed_provider, which is based on the Chef 
 inventory.
 Changes to this file (due to Chef inventory change, when adding/removing 
 Cassandra nodes) cause a restart, which is not desirable.
 The Chef way of handling this is to split the config file into two config 
 files, one containing only the seed_provider and the other containing the 
 rest of the config.
 Only the latter will cause a restart to Cassandra.
 This is achievable by either:
 1. Specifying multiple config files to Cassandra
 2. Specifying a conf.d directory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242462#comment-14242462
 ] 

Benedict commented on CASSANDRA-8447:
-

[~yangzhe1991]: I don't think your problem is related, since it looks to me 
like you're running 2.1. If so, could you file another ticket and upload a 
heap dump from one of your smaller nodes, its config yaml, and a full system 
log from startup until the problem was encountered? I'll see if I can help 
pinpoint the problem.


 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"

[jira] [Created] (CASSANDRA-8458) Avoid streaming from tmplink files

2014-12-11 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-8458:
--

 Summary: Avoid streaming from tmplink files
 Key: CASSANDRA-8458
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8458
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


Looks like we include tmplink sstables in streams in 2.1+, and when we do, 
sometimes we get this error message on the receiving side: 
{{java.io.IOException: Corrupt input data, block did not start with 2 byte 
signature ('ZV') followed by type byte, 2-byte length)}}. I've only seen this 
happen when a tmplink sstable is included in the stream.

We cannot just exclude the tmplink files when starting the stream; we need to 
include the original file, which we might otherwise miss, since we check 
whether the requested stream range intersects the sstable range.
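
As a sketch of the selection rule implied here (hypothetical types, not the 
actual streaming code): never hand a tmplink file to the stream, but when its 
range intersects the requested range, substitute the original sstable it links 
to:

{code}
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

interface Range { boolean intersects(Range other); }

interface SSTable
{
    Range range();
    boolean isTmpLink();  // assumed: true for tmplink hard links
    SSTable original();   // assumed: the sstable the tmplink points at
}

final class StreamSelection
{
    static Set<SSTable> selectForStreaming(Collection<SSTable> candidates, Range requested)
    {
        Set<SSTable> result = new HashSet<>();
        for (SSTable sstable : candidates)
        {
            if (!requested.intersects(sstable.range()))
                continue;
            // stream the stable original, never the in-flight tmplink copy
            result.add(sstable.isTmpLink() ? sstable.original() : sstable);
        }
        return result;
    }
}
{code}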



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8376) Add support for multiple configuration files (or conf.d)

2014-12-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242470#comment-14242470
 ] 

Aleksey Yeschenko commented on CASSANDRA-8376:
--

We have pluggable configuration loaders now. Maybe that would help (by writing 
your own)?
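
For illustration, a custom loader could overlay conf.d files on top of 
cassandra.yaml with a shallow, key-by-key merge. This sketch uses SnakeYAML 
directly; the class, paths, and merge policy are assumptions for the example, 
not the actual loader API:

{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import org.yaml.snakeyaml.Yaml;

public class ConfDirectoryLoader
{
    /** Load the base yaml, then overlay every conf.d/*.yaml in lexical order. */
    public Map<String, Object> loadConfig(File baseYaml, File confD) throws Exception
    {
        Map<String, Object> merged = new LinkedHashMap<>(loadYaml(baseYaml));
        File[] overrides = confD.listFiles((dir, name) -> name.endsWith(".yaml"));
        if (overrides != null)
        {
            Arrays.sort(overrides);         // deterministic override order
            for (File f : overrides)
                merged.putAll(loadYaml(f)); // later files win, key by key
        }
        return merged;
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> loadYaml(File f) throws Exception
    {
        try (InputStream in = new FileInputStream(f))
        {
            Map<String, Object> m = (Map<String, Object>) new Yaml().load(in);
            return m == null ? new LinkedHashMap<>() : m;
        }
    }
}
{code}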

 Add support for multiple configuration files (or conf.d)
 

 Key: CASSANDRA-8376
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8376
 Project: Cassandra
  Issue Type: New Feature
Reporter: Omri Bahumi

 I'm using Chef to generate cassandra.yaml.
 Part of this file is the seed_provider, which is based on the Chef 
 inventory.
 Changes to this file (due to Chef inventory change, when adding/removing 
 Cassandra nodes) cause a restart, which is not desirable.
 The Chef way of handling this is to split the config file into two config 
 files, one containing only the seed_provider and the other containing the 
 rest of the config.
 Only the latter will cause a restart to Cassandra.
 This is achievable by either:
 1. Specifying multiple config files to Cassandra
 2. Specifying a conf.d directory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jonathan lacefield updated CASSANDRA-8447:
--
Attachment: output.svg

Added a flame graph taken from the time the node started until its heap became 
full and the node became unresponsive. The flame graph was created using hprof 
and https://github.com/cykl/hprof2flamegraph. It's understood that CPU sampling 
with hprof can be flawed, as Brendan Gregg mentions here: 
http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html. We used 
hprof to leverage the same JVM version currently in use for Cassandra. We will 
provide another set of flame graphs today that show a healthy node as well as 
the node with the full heap, for comparison purposes. Please note the many 
epoll wait items on the right-hand side of the graph.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
 

[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-12-11 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242492#comment-14242492
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim], One question on the move operation. In StorageService.java there is 
a method {{private void move(Token newToken) throws IOException}} which needs 
to be changed to return a ListenableFuture; for that, we need to wrap the 
logic of the above-mentioned method in a Runnable, submit it, and return the 
ListenableFuture to the caller. Since it is not a Runnable implementation 
already, should I go ahead and implement a class like 
{{PendingRangeCalculatorService.java}}, with a private static class 
implementing Runnable and its corresponding run method? Or can I include the 
logic of the {{private void move(Token newToken) throws IOException}} method 
in one of the existing classes like {{PendingRangeCalculatorService.java}}? 
Please confirm. Thanks
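
A minimal sketch of the shape being discussed, with the body of the existing 
{{move(Token)}} wrapped in a Runnable and submitted so the caller gets a 
ListenableFuture back (illustrative wiring only, not the actual StorageService 
patch):

{code}
import java.util.concurrent.Executors;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;

final class MoveTask
{
    private static final ListeningExecutorService EXEC =
            MoreExecutors.listeningDecorator(Executors.newSingleThreadExecutor());

    /** Completes when the move logic finishes; fails if it throws. */
    static ListenableFuture<?> submitMove(Runnable moveLogic)
    {
        // moveLogic carries the body of the old private move(Token newToken)
        return EXEC.submit(moveLogic);
    }
}
{code}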

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
 cassandra-trunk-decommission-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Benedict (JIRA)
Benedict created CASSANDRA-8459:
---

 Summary: autocompaction on reads can prevent memtable space 
reclaimation
 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3


Memtable memory reclamation is dependent on reads always making progress, 
however on the collectTimeOrderedData critical path it is possible for the read 
to perform a _write_ inline, and for this write to block waiting for memtable 
space to be reclaimed. However, the reclamation is blocked waiting for this
read to complete.

There are a number of solutions to this, but the simplest is to make the 
defragmentation happen asynchronously, so the read terminates normally.
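
A minimal sketch of that fix (hypothetical names, not the attached patch): the 
read path hands the defragmenting write to a background executor instead of 
performing it inline, so the read always terminates even if the write must wait 
for memtable space:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncDefrag
{
    private static final ExecutorService DEFRAG = Executors.newSingleThreadExecutor();

    /** Called from the read path; must never block the read on memtable space. */
    static void scheduleDefragment(Runnable writeBack)
    {
        // the write may stall waiting for memtable reclamation; running it here
        // rather than inline lets the read (which reclamation waits on) complete
        DEFRAG.execute(writeBack);
    }
}
{code}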



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-8459:

Attachment: 8459.txt

Attaching simple fix.

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation is dependent on reads always making progress, 
 however on the collectTimeOrderedData critical path it is possible for the 
 read to perform a _write_ inline, and for this write to block waiting for 
 memtable space to be reclaimed. However, the reclamation is blocked waiting 
 for this read to complete.
 There are a number of solutions to this, but the simplest is to make the 
 defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242502#comment-14242502
 ] 

Benedict commented on CASSANDRA-8447:
-

[~yangzhe1991]: Your thread dump allowed me to trace the problem to 
CASSANDRA-8459.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"

[jira] [Comment Edited] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242502#comment-14242502
 ] 

Benedict edited comment on CASSANDRA-8447 at 12/11/14 1:20 PM:
---

[~yangzhe1991]: Your thread dump allowed me to trace the (your) problem to 
CASSANDRA-8459. This is a 2.1-specific issue, and not related to this ticket.


was (Author: benedict):
[~yangzhe1991]: Your thread dump allowed me to trace the problem to 
CASSANDRA-8459.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS=$JVM_OPTS 

[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Philo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242506#comment-14242506
 ] 

Philo Yang commented on CASSANDRA-8447:
---

[~benedict] got it, let's discuss this in CASSANDRA-8459

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS=$JVM_OPTS 

[jira] [Commented] (CASSANDRA-8418) Queries that require allow filtering are working without it

2014-12-11 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242509#comment-14242509
 ] 

Sylvain Lebresne commented on CASSANDRA-8418:
-

Actually, I do think the initial reasoning was correct. The query *should* 
require {{ALLOW FILTERING}} since the partition key is not provided: the index 
cell names start with the partition key (before the clustering columns), so 
without the partition key we do have to filter.
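
To make the layout concrete, a rough sketch of the entry ordering (illustrative 
types, not the actual 2i code): within the index partition for author='foo', 
entries sort on the base partition key before the clustering columns, so a 
predicate on time1 alone has no contiguous slice to read and must be filtered:

{code}
final class IndexEntryName implements Comparable<IndexEntryName>
{
    final int blogId; // base partition key: sorts first
    final int time1;  // clustering columns follow
    final int time2;

    IndexEntryName(int blogId, int time1, int time2)
    {
        this.blogId = blogId;
        this.time1 = time1;
        this.time2 = time2;
    }

    public int compareTo(IndexEntryName o)
    {
        if (blogId != o.blogId) return Integer.compare(blogId, o.blogId);
        if (time1 != o.time1) return Integer.compare(time1, o.time1);
        return Integer.compare(time2, o.time2);
    }
}
{code}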

 Queries that require allow filtering are working without it
 ---

 Key: CASSANDRA-8418
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8418
 Project: Cassandra
  Issue Type: Bug
Reporter: Philip Thompson
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0

 Attachments: CASSANDRA-8418.txt


 The trunk dtest {{cql_tests.py:TestCQL.composite_index_with_pk_test}} has 
 begun failing after the changes to CASSANDRA-7981. 
 With the schema {code}CREATE TABLE blogs (
 blog_id int,
 time1 int,
 time2 int,
 author text,
 content text,
 PRIMARY KEY (blog_id, time1, time2){code}
 and {code}CREATE INDEX ON blogs(author){code}, then the query
 {code}SELECT blog_id, content FROM blogs WHERE time1 > 0 AND 
 author='foo'{code} now requires ALLOW FILTERING, but did not before the 
 refactor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Philo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242511#comment-14242511
 ] 

Philo Yang commented on CASSANDRA-8459:
---

Hi [~benedict], there is no node in my cluster that is unresponsive to dump 
the heap from. But there are some hprof files dumped automatically by 
-XX:+HeapDumpOnOutOfMemoryError; are they helpful to you? If so I'll upload 
one of them.

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation is dependent on reads always making progress, 
 however on the collectTimeOrderedData critical path it is possible for the 
 read to perform a _write_ inline, and for this write to block waiting for 
 memtable space to be reclaimed. However, the reclamation is blocked waiting 
 for this read to complete.
 There are a number of solutions to this, but the simplest is to make the 
 defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242513#comment-14242513
 ] 

Benedict commented on CASSANDRA-8459:
-

No need, already sussed the problem and attached the fix

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation is dependent on reads always making progress, 
 however on the collectTimeOrderedData critical path it is possible for the 
 read to perform a _write_ inline, and for this write to block waiting for 
 memtable space to be reclaimed. However, the reclamation is blocked waiting 
 for this read to complete.
 There are a number of solutions to this, but the simplest is to make the 
 defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8456) Some valid index queries can be considered as invalid

2014-12-11 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242514#comment-14242514
 ] 

Sylvain Lebresne commented on CASSANDRA-8456:
-

I don't think that's correct, for the reason explained in CASSANDRA-8418: 
those queries do require filtering since the partition key is not provided.

 Some valid index queries can be considered as invalid
 -

 Key: CASSANDRA-8456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8456
 Project: Cassandra
  Issue Type: Bug
Reporter: Benjamin Lerer
Assignee: Benjamin Lerer

 Some secondary index queries are rejected or need ALLOW FILTERING but should 
 not be. It seems that in certain cases {{SelectStatement}} uses index 
 filtering for clustering column restrictions when it should be using 
 clustering column slices.
 The following unit tests can be used to reproduce the problem in 3.0
 {code}
 @Test
 public void testMultipleClusteringWithIndex() throws Throwable
 {
     createTable("CREATE TABLE %s (a int, b int, c int, d int, e int, PRIMARY KEY (a, b, c, d))");
     createIndex("CREATE INDEX ON %s (b)");
     createIndex("CREATE INDEX ON %s (e)");
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 0, 0, 0, 0);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 0, 1, 0, 1);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 0, 1, 1, 2);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 1, 0, 0, 0);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 1, 1, 0, 1);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 1, 1, 1, 2);
     execute("INSERT INTO %s (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)", 0, 2, 0, 0, 0);
     assertRows(execute("SELECT * FROM %s WHERE (b, c) = (?, ?)", 1, 1),
                row(0, 1, 1, 0, 1),
                row(0, 1, 1, 1, 2));
 }

 @Test
 public void testMultiplePartitionKeyAndMultiClusteringWithIndex() throws Throwable
 {
     createTable("CREATE TABLE %s (a int, b int, c int, d int, e int, f int, PRIMARY KEY ((a, b), c, d, e))");
     createIndex("CREATE INDEX ON %s (c)");
     createIndex("CREATE INDEX ON %s (f)");
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 0, 0, 0, 0);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 0, 1, 0, 1);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 0, 1, 1, 2);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 1, 0, 0, 3);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 1, 1, 0, 4);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 1, 1, 1, 5);
     execute("INSERT INTO %s (a, b, c, d, e, f) VALUES (?, ?, ?, ?, ?, ?)", 0, 0, 2, 0, 0, 6);
     assertRows(execute("SELECT * FROM %s WHERE a = ? AND (c) IN ((?), (?)) AND f = ?", 0, 1, 2, 5),
                row(0, 0, 1, 1, 1, 5));
     assertRows(execute("SELECT * FROM %s WHERE a = ? AND (c, d) IN ((?, ?)) AND f = ?", 0, 1, 1, 5),
                row(0, 0, 1, 1, 1, 5));
     assertRows(execute("SELECT * FROM %s WHERE a = ? AND (c) = (?) AND f = ?", 0, 1, 5),
                row(0, 0, 1, 1, 1, 5));
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Philo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242511#comment-14242511
 ] 

Philo Yang edited comment on CASSANDRA-8459 at 12/11/14 1:27 PM:
-

Hi [~benedict], there is no node in my cluster that is unresponsive now to 
dump the heap from. But there are some hprof files dumped automatically by 
-XX:+HeapDumpOnOutOfMemoryError; are they helpful to you? If so I'll upload 
one of them.


was (Author: yangzhe1991):
Hi [~benedict], there is no node in my cluster that is unresponsive to dump 
the heap from. But there are some hprof files dumped automatically by 
-XX:+HeapDumpOnOutOfMemoryError; are they helpful to you? If so I'll upload 
one of them.

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation is dependent on reads always making progress, 
 however on the collectTimeOrderedData critical path it is possible for the 
 read to perform a _write_ inline, and for this write to block waiting for 
 memtable space to be reclaimed. However, the reclamation is blocked waiting 
 for this read to complete.
 There are a number of solutions to this, but the simplest is to make the 
 defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Philo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242523#comment-14242523
 ] 

Philo Yang commented on CASSANDRA-8459:
---

Ok, Thanks!

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation is dependent on reads always making progress, 
 however on the collectTimeOrderedData critical path it is possible for the 
 read to perform a _write_ inline, and for this write to block waiting for 
 memtable space to be reclaimed. However, the reclamation is blocked waiting 
 for this read to complete.
 There are a number of solutions to this, but the simplest is to make the 
 defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service

2014-12-11 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242529#comment-14242529
 ] 

Sylvain Lebresne commented on CASSANDRA-4139:
-

To offer some kind of counter-point, I don't think this ticket would require 
lots of effort since we already have code to do the vint encoding/decoding. I 
might be missing something, but from what I can tell, it should be enough to 
pass the {{TypeSizes}} in {{IVersionedSerializer.serializedSize}} plus make 
sure both sides agree on whether vint is enabled or not, none of which is 
terribly involved (nor would add much complexity to the code). And since the 
investment is not that big, I do think it's not completely worthless to 
evaluate it. It will probably not help in all cases or even with the default 
configuration, but I suspect it's faster than generic compression and so it 
could be interesting when you want a middle ground between no compression at 
all and full message compression.

Anyway, I'm not trying to convince anyone to prioritize this in any way, just 
to say that unless someone beats me to it, I do intend to give this a shot at 
some point in the future (especially because some parts I made in 
CASSANDRA-8099 would benefit more from vint than the current format probably 
does).
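
For reference, a minimal sketch of the kind of vint encoding in question: 7 
data bits per byte with the high bit as a continuation flag, so small values 
cost one byte (the actual wire format used by the messaging service may 
differ):

{code}
import java.io.ByteArrayOutputStream;

final class Vint
{
    /** Encode an unsigned long, least-significant 7 bits first. */
    static byte[] encode(long v)
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream(9);
        while ((v & ~0x7FL) != 0)
        {
            out.write((int) ((v & 0x7F) | 0x80)); // continuation bit set
            v >>>= 7;
        }
        out.write((int) v); // final byte, continuation bit clear
        return out.toByteArray();
    }

    static long decode(byte[] in)
    {
        long v = 0;
        int shift = 0;
        for (byte b : in)
        {
            v |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                break;
            shift += 7;
        }
        return v;
    }
}
{code}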

 Add varint encoding to Messaging service
 

 Key: CASSANDRA-4139
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4139
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Vijay
Assignee: Ariel Weisberg
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-4139-v1.patch, 
 0001-CASSANDRA-4139-v2.patch, 0001-CASSANDRA-4139-v4.patch, 
 0002-add-bytes-written-metric.patch, 4139-Test.rtf, 
 ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-4139-v3.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8261) Clean up schema metadata classes

2014-12-11 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8261:
-
Attachment: 8261-isolate-serialization-code-v2.txt

 Clean up schema metadata classes
 

 Key: CASSANDRA-8261
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8261
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0

 Attachments: 8261-isolate-hadcoded-system-tables.txt, 
 8261-isolate-serialization-code-v2.txt, 8261-isolate-serialization-code.txt, 
 8261-isolate-thrift-code.txt


 While working on CASSANDRA-6717, I've made some general cleanup changes to 
 schema metadata classes that are separate from its core purpose. Also, being 
 distracted from it by other things, every time I come back to it I get a bit 
 of rebase hell.
 Thus I'm isolating those changes into a separate issue here, hoping to commit 
 them one by one, before I go back and finalize CASSANDRA-6717.
 The changes include:
 - moving all the toThrift/fromThrift conversion code to ThriftConversion, 
 where it belongs
 - moving the compiled system CFMetaData objects away from CFMetaData (to 
 SystemKeyspace and TracesKeyspace)
 - isolating legacy toSchema/fromSchema code into a separate class 
 (LegacySchemaTables - former DefsTables)
 - refactoring CFMetaData/KSMetaData fields to match CQL CREATE TABLE syntax, 
 and encapsulating more things in 
 CompactionOptions/CompressionOptions/ReplicationOptions classes
 - moving the definition classes to the new 'schema' package



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8261) Clean up schema metadata classes

2014-12-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242539#comment-14242539
 ] 

Aleksey Yeschenko commented on CASSANDRA-8261:
--

Attached a rebased v2 with the renames.

The TODO is there for CASSANDRA-6717 to resolve (all of these 8261 patches are 
extracts from the 6717 branch, actually).

Didn't touch javadoc, b/c many of those methods will be gone (all the ones that 
serialize schema the old way and some others).

This is the last 8261 patch. The rest of the work will be completed in 6717.

 Clean up schema metadata classes
 

 Key: CASSANDRA-8261
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8261
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 3.0

 Attachments: 8261-isolate-hadcoded-system-tables.txt, 
 8261-isolate-serialization-code-v2.txt, 8261-isolate-serialization-code.txt, 
 8261-isolate-thrift-code.txt


 While working on CASSANDRA-6717, I've made some general cleanup changes to 
 schema metadata classes that are separate from its core purpose. Also, being 
 distracted from it by other things, every time I come back to it I get a bit 
 of rebase hell.
 Thus I'm isolating those changes into a separate issue here, hoping to commit 
 them one by one, before I go back and finalize CASSANDRA-6717.
 The changes include:
 - moving all the toThrift/fromThrift conversion code to ThriftConversion, 
 where it belongs
 - moving the compiled system CFMetaData objects away from CFMetaData (to 
 SystemKeyspace and TracesKeyspace)
 - isolating legacy toSchema/fromSchema code into a separate class 
 (LegacySchemaTables - former DefsTables)
 - refactoring CFMetaData/KSMetaData fields to match CQL CREATE TABLE syntax, 
 and encapsulating more things in 
 CompactionOptions/CompressionOptions/ReplicationOptions classes
 - moving the definition classes to the new 'schema' package



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Fix error message on read repair timeouts

2014-12-11 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 578430952 -> 451c514a3


Fix error message on read repair timeouts

patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko for
CASSANDRA-7947


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/451c514a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/451c514a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/451c514a

Branch: refs/heads/cassandra-2.0
Commit: 451c514a3a02f4e889f040176453beefbcd75843
Parents: 5784309
Author: Sam Tunnicliffe s...@beobal.com
Authored: Thu Dec 11 15:17:29 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:17:29 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 385af01..cd302fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.12:
+ * Fix error message on read repair timeouts (CASSANDRA-7947)
  * Default DTCS base_time_seconds changed to 60 (CASSANDRA-8417)
  * Refuse Paxos operation with more than one pending endpoint (CASSANDRA-8346)
  * Throw correct exception when trying to bind a keyspace or table

http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index f877aee..1e1a2a3 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -1368,6 +1368,17 @@ public class StorageProxy implements StorageProxyMBean
                         {
                             throw new AssertionError(e); // full data requested from each node here, no digests should be sent
                         }
+                        catch (ReadTimeoutException e)
+                        {
+                            if (Tracing.isTracing())
+                                Tracing.trace("Timed out waiting on digest mismatch repair requests");
+                            else
+                                logger.debug("Timed out waiting on digest mismatch repair requests");
+                            // the caught exception here will have CL.ALL from the repair command,
+                            // not whatever CL the initial command was at (CASSANDRA-7947)
+                            int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
+                            throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
+                        }
 
                         RowDataResolver resolver = (RowDataResolver)handler.resolver;
                         try
@@ -1378,7 +1389,10 @@ public class StorageProxy implements StorageProxyMBean
                         }
                         catch (TimeoutException e)
                         {
-                            Tracing.trace("Timed out on digest mismatch retries");
+                            if (Tracing.isTracing())
+                                Tracing.trace("Timed out waiting on digest mismatch repair acknowledgements");
+                            else
+                                logger.debug("Timed out waiting on digest mismatch repair acknowledgements");
                             int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
                             throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
                         }



[jira] [Created] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS

2014-12-11 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-8460:
--

 Summary: Make it possible to move non-compacting sstables to 
slow/big storage in DTCS
 Key: CASSANDRA-8460
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson


It would be nice if we could configure DTCS to have a set of extra data 
directories where we move the sstables once they are older than 
max_sstable_age_days. 

This would enable users to have a quick, small SSD for hot, new data, and big 
spinning disks for data that is rarely read and never compacted.
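
To make the proposal concrete, here is a minimal sketch of the routing decision it implies; ColdStorageRouter, hotDir, coldDir and maxTimestampMillis are assumed names for illustration, not the DTCS API.

{code}
import java.io.File;
import java.util.concurrent.TimeUnit;

// Sketch only: route an sstable by the age of its newest cell. Young data
// stays on the fast (SSD) directory; anything older than max_sstable_age_days
// goes to the slow/big directory, where it is rarely read and never compacted.
final class ColdStorageRouter
{
    private final long maxSSTableAgeDays;
    private final File hotDir;
    private final File coldDir;

    ColdStorageRouter(long maxSSTableAgeDays, File hotDir, File coldDir)
    {
        this.maxSSTableAgeDays = maxSSTableAgeDays;
        this.hotDir = hotDir;
        this.coldDir = coldDir;
    }

    // e.g. directoryFor(sstableMaxTimestampMillis, System.currentTimeMillis())
    File directoryFor(long maxTimestampMillis, long nowMillis)
    {
        long ageDays = TimeUnit.MILLISECONDS.toDays(nowMillis - maxTimestampMillis);
        return ageDays > maxSSTableAgeDays ? coldDir : hotDir;
    }
}
{code}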



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] cassandra git commit: Fix error message on read repair timeouts

2014-12-11 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 27c67ad85 -> 745ddd1c2


Fix error message on read repair timeouts

patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko for
CASSANDRA-7947


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/451c514a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/451c514a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/451c514a

Branch: refs/heads/cassandra-2.1
Commit: 451c514a3a02f4e889f040176453beefbcd75843
Parents: 5784309
Author: Sam Tunnicliffe s...@beobal.com
Authored: Thu Dec 11 15:17:29 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:17:29 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 385af01..cd302fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.12:
+ * Fix error message on read repair timeouts (CASSANDRA-7947)
  * Default DTCS base_time_seconds changed to 60 (CASSANDRA-8417)
  * Refuse Paxos operation with more than one pending endpoint (CASSANDRA-8346)
  * Throw correct exception when trying to bind a keyspace or table

http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java
index f877aee..1e1a2a3 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -1368,6 +1368,17 @@ public class StorageProxy implements StorageProxyMBean
                 {
                     throw new AssertionError(e); // full data requested from each node here, no digests should be sent
                 }
+                catch (ReadTimeoutException e)
+                {
+                    if (Tracing.isTracing())
+                        Tracing.trace("Timed out waiting on digest mismatch repair requests");
+                    else
+                        logger.debug("Timed out waiting on digest mismatch repair requests");
+                    // the caught exception here will have CL.ALL from the repair command,
+                    // not whatever CL the initial command was at (CASSANDRA-7947)
+                    int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
+                    throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
+                }
 
                 RowDataResolver resolver = (RowDataResolver)handler.resolver;
                 try
@@ -1378,7 +1389,10 @@ public class StorageProxy implements StorageProxyMBean
                 }
                 catch (TimeoutException e)
                 {
-                    Tracing.trace("Timed out on digest mismatch retries");
+                    if (Tracing.isTracing())
+                        Tracing.trace("Timed out waiting on digest mismatch repair acknowledgements");
+                    else
+                        logger.debug("Timed out waiting on digest mismatch repair acknowledgements");
                     int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
                     throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
                 }



[2/2] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-12-11 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/745ddd1c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/745ddd1c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/745ddd1c

Branch: refs/heads/cassandra-2.1
Commit: 745ddd1c2c2156a43097934941b7d160cc6a981c
Parents: 27c67ad 451c514
Author: Aleksey Yeschenko alek...@apache.org
Authored: Thu Dec 11 15:20:26 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:20:26 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/745ddd1c/CHANGES.txt
--
diff --cc CHANGES.txt
index 25e0f47,cd302fb..71a6642
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,24 -1,5 +1,25 @@@
 -2.0.12:
 +2.1.3
 + * Remove tmplink files for offline compactions (CASSANDRA-8321)
 + * Reduce maxHintsInProgress (CASSANDRA-8415)
 + * BTree updates may call provided update function twice (CASSANDRA-8018)
 + * Release sstable references after anticompaction (CASSANDRA-8386)
 + * Handle abort() in SSTableRewriter properly (CASSANDRA-8320)
 + * Fix high size calculations for prepared statements (CASSANDRA-8231)
 + * Centralize shared executors (CASSANDRA-8055)
 + * Fix filtering for CONTAINS (KEY) relations on frozen collection
 +   clustering columns when the query is restricted to a single
 +   partition (CASSANDRA-8203)
 + * Do more aggressive entire-sstable TTL expiry checks (CASSANDRA-8243)
 + * Add more log info if readMeter is null (CASSANDRA-8238)
 + * add check of the system wall clock time at startup (CASSANDRA-8305)
 + * Support for frozen collections (CASSANDRA-7859)
 + * Fix overflow on histogram computation (CASSANDRA-8028)
 + * Have paxos reuse the timestamp generation of normal queries 
(CASSANDRA-7801)
 + * Fix incremental repair not remove parent session on remote (CASSANDRA-8291)
 + * Improve JBOD disk utilization (CASSANDRA-7386)
 + * Log failed host when preparing incremental repair (CASSANDRA-8228)
 +Merged from 2.0:
+  * Fix error message on read repair timeouts (CASSANDRA-7947)
   * Default DTCS base_time_seconds changed to 60 (CASSANDRA-8417)
   * Refuse Paxos operation with more than one pending endpoint (CASSANDRA-8346)
   * Throw correct exception when trying to bind a keyspace or table

http://git-wip-us.apache.org/repos/asf/cassandra/blob/745ddd1c/src/java/org/apache/cassandra/service/StorageProxy.java
--



[1/3] cassandra git commit: Fix error message on read repair timeouts

2014-12-11 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 6ce8b3fcb -> 857de5540


Fix error message on read repair timeouts

patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko for
CASSANDRA-7947


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/451c514a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/451c514a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/451c514a

Branch: refs/heads/trunk
Commit: 451c514a3a02f4e889f040176453beefbcd75843
Parents: 5784309
Author: Sam Tunnicliffe s...@beobal.com
Authored: Thu Dec 11 15:17:29 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:17:29 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 385af01..cd302fb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.12:
+ * Fix error message on read repair timeouts (CASSANDRA-7947)
  * Default DTCS base_time_seconds changed to 60 (CASSANDRA-8417)
  * Refuse Paxos operation with more than one pending endpoint (CASSANDRA-8346)
  * Throw correct exception when trying to bind a keyspace or table

http://git-wip-us.apache.org/repos/asf/cassandra/blob/451c514a/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java
index f877aee..1e1a2a3 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -1368,6 +1368,17 @@ public class StorageProxy implements StorageProxyMBean
                 {
                     throw new AssertionError(e); // full data requested from each node here, no digests should be sent
                 }
+                catch (ReadTimeoutException e)
+                {
+                    if (Tracing.isTracing())
+                        Tracing.trace("Timed out waiting on digest mismatch repair requests");
+                    else
+                        logger.debug("Timed out waiting on digest mismatch repair requests");
+                    // the caught exception here will have CL.ALL from the repair command,
+                    // not whatever CL the initial command was at (CASSANDRA-7947)
+                    int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
+                    throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
+                }
 
                 RowDataResolver resolver = (RowDataResolver)handler.resolver;
                 try
@@ -1378,7 +1389,10 @@ public class StorageProxy implements StorageProxyMBean
                 }
                 catch (TimeoutException e)
                 {
-                    Tracing.trace("Timed out on digest mismatch retries");
+                    if (Tracing.isTracing())
+                        Tracing.trace("Timed out waiting on digest mismatch repair acknowledgements");
+                    else
+                        logger.debug("Timed out waiting on digest mismatch repair acknowledgements");
                     int blockFor = consistencyLevel.blockFor(Keyspace.open(command.getKeyspace()));
                     throw new ReadTimeoutException(consistencyLevel, blockFor-1, blockFor, true);
                 }



[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2014-12-11 Thread aleksey
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/857de554
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/857de554
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/857de554

Branch: refs/heads/trunk
Commit: 857de554066ecbe507ebab13b9cfe6a2749c403f
Parents: 6ce8b3f 745ddd1
Author: Aleksey Yeschenko alek...@apache.org
Authored: Thu Dec 11 15:20:56 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:20:56 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/857de554/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/857de554/src/java/org/apache/cassandra/service/StorageProxy.java
--



[2/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-12-11 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/745ddd1c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/745ddd1c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/745ddd1c

Branch: refs/heads/trunk
Commit: 745ddd1c2c2156a43097934941b7d160cc6a981c
Parents: 27c67ad 451c514
Author: Aleksey Yeschenko alek...@apache.org
Authored: Thu Dec 11 15:20:26 2014 +0100
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Dec 11 15:20:26 2014 +0100

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/service/StorageProxy.java  | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/745ddd1c/CHANGES.txt
--
diff --cc CHANGES.txt
index 25e0f47,cd302fb..71a6642
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,24 -1,5 +1,25 @@@
 -2.0.12:
 +2.1.3
 + * Remove tmplink files for offline compactions (CASSANDRA-8321)
 + * Reduce maxHintsInProgress (CASSANDRA-8415)
 + * BTree updates may call provided update function twice (CASSANDRA-8018)
 + * Release sstable references after anticompaction (CASSANDRA-8386)
 + * Handle abort() in SSTableRewriter properly (CASSANDRA-8320)
 + * Fix high size calculations for prepared statements (CASSANDRA-8231)
 + * Centralize shared executors (CASSANDRA-8055)
 + * Fix filtering for CONTAINS (KEY) relations on frozen collection
 +   clustering columns when the query is restricted to a single
 +   partition (CASSANDRA-8203)
 + * Do more aggressive entire-sstable TTL expiry checks (CASSANDRA-8243)
 + * Add more log info if readMeter is null (CASSANDRA-8238)
 + * add check of the system wall clock time at startup (CASSANDRA-8305)
 + * Support for frozen collections (CASSANDRA-7859)
 + * Fix overflow on histogram computation (CASSANDRA-8028)
 + * Have paxos reuse the timestamp generation of normal queries 
(CASSANDRA-7801)
 + * Fix incremental repair not remove parent session on remote (CASSANDRA-8291)
 + * Improve JBOD disk utilization (CASSANDRA-7386)
 + * Log failed host when preparing incremental repair (CASSANDRA-8228)
 +Merged from 2.0:
+  * Fix error message on read repair timeouts (CASSANDRA-7947)
   * Default DTCS base_time_seconds changed to 60 (CASSANDRA-8417)
   * Refuse Paxos operation with more than one pending endpoint (CASSANDRA-8346)
   * Throw correct exception when trying to bind a keyspace or table

http://git-wip-us.apache.org/repos/asf/cassandra/blob/745ddd1c/src/java/org/apache/cassandra/service/StorageProxy.java
--



[jira] [Updated] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jonathan lacefield updated CASSANDRA-8447:
--
Attachment: output.1.svg
output.2.svg

output.1.svg represents the unhealthy node, with compaction
output.2.svg represents the healthy node, no compaction

These flame graphs were created to compare the healthy and unhealthy nodes. They were captured after clearing out all CL replays (by stopping DSE, starting DSE, flushing the nodes, stopping DSE, and restarting DSE) and finally validating through system.log that CL replay was not occurring.

The flame graphs were created on the same test execution. 

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress - /usr/bin/cassandra-stress write n=19 -rate threads=<different threads tested> -schema replication\(factor=3\) keyspace=Keyspace1 -node <all nodes listed>
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

[jira] [Updated] (CASSANDRA-8419) NPE in SelectStatement

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8419:
---
Labels: qa-resolved  (was: )

 NPE in SelectStatement
 --

 Key: CASSANDRA-8419
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8419
 Project: Cassandra
  Issue Type: Bug
Reporter: Philip Thompson
Assignee: Benjamin Lerer
  Labels: qa-resolved
 Fix For: 3.0

 Attachments: CASSANDRA-8419.txt


 The dtest {{cql_tests.py:TestCQL.empty_in_test}} is failing on trunk with a NullPointerException. The stack trace is:
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-03 16:24:16,274 ErrorMessage.java:243 - Unexpected exception during request
 java.lang.NullPointerException: null
 at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
 at com.google.common.collect.Lists$TransformingSequentialList.<init>(Lists.java:525) ~[guava-16.0.jar:na]
 at com.google.common.collect.Lists.transform(Lists.java:508) ~[guava-16.0.jar:na]
 at org.apache.cassandra.db.composites.Composites.toByteBuffers(Composites.java:45) ~[main/:na]
 at org.apache.cassandra.cql3.restrictions.SingleColumnPrimaryKeyRestrictions.values(SingleColumnPrimaryKeyRestrictions.java:257) ~[main/:na]
 at org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:362) ~[main/:na]
 at org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:296) ~[main/:na]
 at org.apache.cassandra.cql3.statements.SelectStatement.getPageableCommand(SelectStatement.java:205) ~[main/:na]
 at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:165) ~[main/:na]
 at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:72) ~[main/:na]
 at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:239) ~[main/:na]
 at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:261) ~[main/:na]
 at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:118) ~[main/:na]
 at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [main/:na]
 at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [main/:na]
 at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_67]
 at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na]
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
 {code}
 The error occurred while executing {{SELECT v FROM test_compact WHERE k1 IN 
 ()}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8321) SStablesplit behavior changed

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8321:
---
Labels: qa-resolved  (was: )

 SStablesplit behavior changed
 -

 Key: CASSANDRA-8321
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8321
 Project: Cassandra
  Issue Type: Bug
Reporter: Philip Thompson
Assignee: Marcus Eriksson
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.1.3

 Attachments: 0001-ccm-fix-file-finding.patch, 
 0001-remove-tmplink-for-offline-compactions.patch


 The dtest sstablesplit_test.py has begun failing due to an incorrect number 
 of sstables being created after running sstablesplit.
 http://cassci.datastax.com/job/cassandra-2.1_dtest/559/changes#detail1
 is the run where the failure began.
 In 2.1.x, the test expects 7 sstables to be created after split, but instead 
 12 are being created. All of the data is there, and the sstables add up to 
 the expected size, so this simply may be a change in default behavior. The 
 test runs sstablesplit without the --size argument, and the default has not 
 changed, so it is unexpected that the behavior would change in a minor point 
 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-12-11 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242677#comment-14242677
 ] 

Yuki Morishita commented on CASSANDRA-7124:
---

bq. Did you mean to say that, get the latest version from trunk, create a 
branch locally, apply all the changes and then push that to a different 
repository such as to my own Github repository and share the link with you? 

Yes. Trunk code will likely change often, so keeping your work in a branch is preferable.

At this point, I think it will be easier to work and review if we create sub-tasks. Compaction-related tasks (scrub, upgradesstables, compact, etc.) and other tasks like move, decommission, etc. need different code, so why don't we focus on the compaction-related tasks first?

For the latter question above, I prefer keeping classes small and focused on their own responsibility. You can just go ahead and implement whatever you think is good.

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
 cassandra-trunk-decommission-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8337) mmap underflow during validation compaction

2014-12-11 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242680#comment-14242680
 ] 

Philip Thompson commented on CASSANDRA-8337:


[~sterligovak], any chance you can attach one of the corrupt sstables? That would help with reproducing this and #8061, which you also ran into. Thank you.

 mmap underflow during validation compaction
 ---

 Key: CASSANDRA-8337
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8337
 Project: Cassandra
  Issue Type: Bug
Reporter: Alexander Sterligov
Assignee: Joshua McKenzie
 Fix For: 2.1.3

 Attachments: 8337_v1.txt, thread_dump


 During full parallel repair I often get errors like the following
 {quote}
 [2014-11-19 01:02:39,355] Repair session 116beaf0-6f66-11e4-afbb-c1c082008cbe 
 for range (3074457345618263602,-9223372036854775808] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #116beaf0-6f66-11e4-afbb-c1c082008cbe on iss/target_state_history, 
 (3074457345618263602,-9223372036854775808]] Validation failed in 
 /95.108.242.19
 {quote}
 The node's log always shows the same exceptions:
 {quote}
 ERROR [ValidationExecutor:2] 2014-11-19 01:02:10,847 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting forcefully due to:
 org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: mmap segment underflow; remaining is 15 but 47 requested
 at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1518) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1385) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1315) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1706) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1694) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:276) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:320) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:917) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:557) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
 at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
 Caused by: java.io.IOException: mmap segment underflow; remaining is 15 but 47 requested
 at org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:135) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:348) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:327) ~[apache-cassandra-2.1.2.jar:2.1.2]
 at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1460) ~[apache-cassandra-2.1.2.jar:2.1.2]
 ... 13 common frames omitted
 {quote}
 Now I'm using the die disk_failure_policy to detect such conditions faster, but I get them even with the stop policy.
 Streams related to the host with such an exception hang; a thread dump is attached. Only a restart helps.
 After a retry I get errors from other nodes.
 scrub doesn't help and reports that the sstables are ok.
 Sequential repairs don't cause such exceptions.
 Load is about 1000 write rps and 50 read rps per node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8452) Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows

2014-12-11 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-8452:
---
Attachment: CASSANDRA-8452-v2.patch

It looks like there's already a patch in the works for 2.1 in CASSANDRA-6993; should I close this as a duplicate?

If not, +1 on calculating at startup and calling it posix. The attached v2 patch determines the OS at class initialization and renames {{isUnix}} to {{isPosix}}. It also replaces a few {{!FBUtilities.isUnix()}} calls with {{FBUtilities.isWindows()}} where the comments indicate the check is there to support Windows. Also, IMO isPosix already implies that the system is POSIX-compliant, so I just went with isPosix, but let me know if isPosixCompliant is strongly preferred and I'll rename it.
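
For illustration, a minimal sketch of the compute-once approach described above; the class name and the list of OS names are assumptions for the sketch, not the contents of the attached patch.

{code}
import java.util.Arrays;
import java.util.List;

// Sketch only: determine the platform once, at class initialization, instead
// of re-parsing os.name on every call.
public final class PlatformFlags
{
    private static final String OS = System.getProperty("os.name").toLowerCase();
    private static final List<String> POSIX_NAMES =
            Arrays.asList("linux", "mac", "aix", "sunos", "freebsd", "hp-ux");

    public static final boolean isWindows = OS.contains("windows");
    public static final boolean isPosix = computePosix();

    private static boolean computePosix()
    {
        for (String name : POSIX_NAMES)
            if (OS.contains(name))
                return true;
        return false;
    }

    private PlatformFlags()
    {
    }
}
{code}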

 Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows
 

 Key: CASSANDRA-8452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8452
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston
Priority: Minor
 Fix For: 2.1.3

 Attachments: CASSANDRA-8452-v2.patch, CASSANDRA-8452.patch


 The isUnix method leaves out a few unix systems, which, after the changes in 
 CASSANDRA-8136, causes some unexpected behavior during shutdown. It would 
 also be clearer if FBUtilities had an isWindows method for branching into 
 Windows specific logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242734#comment-14242734
 ] 

Jonathan Ellis commented on CASSANDRA-8459:
---

+1

Do we also need this in 2.0?

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation depends on reads always making progress; however, on the collectTimeOrderedData critical path it is possible for the read to perform a _write_ inline, and for this write to block waiting for memtable space to be reclaimed, while that reclamation is itself blocked waiting for the read to complete.
 There are a number of solutions to this, but the simplest is to make the defragmentation happen asynchronously, so the read terminates normally.
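
 For illustration, a minimal sketch of the asynchronous shape described above; AsyncDefrag and defragTask are assumed names, and the actual change is in the attached 8459.txt.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch only: instead of the read performing the defragmenting write inline
// (and potentially blocking on memtable space whose reclamation is waiting on
// this very read), hand the rewrite to a background executor so the read
// always terminates normally.
final class AsyncDefrag
{
    private static final ExecutorService DEFRAG = Executors.newSingleThreadExecutor();

    static void defragmentAsync(Runnable defragTask)
    {
        DEFRAG.execute(defragTask); // the read path no longer waits on this write
    }

    private AsyncDefrag()
    {
    }
}
{code}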



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242738#comment-14242738
 ] 

Benedict commented on CASSANDRA-8459:
-

It's probably not a *bad idea* for 2.0 as it stops a read touching the write 
path, but it isn't necessary for correctness.

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation depends on reads always making progress; however, on the collectTimeOrderedData critical path it is possible for the read to perform a _write_ inline, and for this write to block waiting for memtable space to be reclaimed, while that reclamation is itself blocked waiting for the read to complete.
 There are a number of solutions to this, but the simplest is to make the defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8459) autocompaction on reads can prevent memtable space reclaimation

2014-12-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242747#comment-14242747
 ] 

Jonathan Ellis commented on CASSANDRA-8459:
---

Let's leave 2.0 alone then.

 autocompaction on reads can prevent memtable space reclaimation
 -

 Key: CASSANDRA-8459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8459
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8459.txt


 Memtable memory reclamation depends on reads always making progress; however, on the collectTimeOrderedData critical path it is possible for the read to perform a _write_ inline, and for this write to block waiting for memtable space to be reclaimed, while that reclamation is itself blocked waiting for the read to complete.
 There are a number of solutions to this, but the simplest is to make the defragmentation happen asynchronously, so the read terminates normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242743#comment-14242743
 ] 

Jonathan Ellis commented on CASSANDRA-8447:
---

I don't think this is *the* problem, but I suggest patching with CASSANDRA-8164 just to eliminate those effects from the equation.

Have you tried bisecting with earlier C* releases? "Let's throw stress at this" isn't exactly an untested scenario.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress - /usr/bin/cassandra-stress write n=19 -rate threads=<different threads tested> -schema replication\(factor=3\) keyspace=Keyspace1 -node <all nodes listed>
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"

[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242779#comment-14242779
 ] 

Benedict commented on CASSANDRA-8447:
-

The problem is pretty simple: MeteredFlusher runs on StorageService.optionalTasks, and other events that run there can take a long time. In particular hint delivery scheduling, which is preceded by a blocking compaction of the hints table, during which no other optional task can make progress.

MeteredFlusher should have its own dedicated thread, as responding promptly is essential; under this workload, running every couple of seconds is pretty much necessary to avoid a rapid, catastrophic build-up of state in memtables.
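
For illustration, a minimal sketch of the dedicated-thread idea; the runnable below stands in for the real MeteredFlusher, and the one-second cadence is an assumption.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch only: give the flusher its own single-threaded scheduler so a
// long-running optional task (e.g. a blocking hints compaction) can never
// delay flush decisions.
public final class DedicatedFlusher
{
    public static void main(String[] args)
    {
        Runnable meteredFlushCheck = new Runnable()
        {
            public void run()
            {
                // stand-in for MeteredFlusher: measure memtable occupancy and
                // trigger flushes if needed
                System.out.println("checking memtable occupancy");
            }
        };
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
        // runs every second regardless of what StorageService.optionalTasks is doing
        flusher.scheduleWithFixedDelay(meteredFlushCheck, 1, 1, TimeUnit.SECONDS);
    }
}
{code}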

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress - /usr/bin/cassandra-stress write n=19 -rate threads=<different threads tested> -schema replication\(factor=3\) keyspace=Keyspace1 -node <all nodes listed>
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"

[jira] [Comment Edited] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242779#comment-14242779
 ] 

Benedict edited comment on CASSANDRA-8447 at 12/11/14 4:59 PM:
---

The problem is pretty simple: MeteredFlusher runs on StorageService.optionalTasks, and other events that run there can take a long time. In particular hint delivery scheduling, which is preceded by a blocking compaction of the hints table, during which no other optional task can make progress.

MeteredFlusher should have its own dedicated thread, as responding promptly is essential; under this workload, running every couple of seconds is pretty much necessary to avoid a rapid, catastrophic build-up of state in memtables.

(edit: in case there's any ambiguity, this isn't a hypothesis. the heap dump 
clearly shows optionalTasks blocked waiting on the result of a FutureTask 
executing a runnable defined in CompactionManager (as far as I can tell in 
submitUserDefined); the current live memtable is retaining 6M records at 6Gb of 
retained heap, so MeteredFlusher hasn't had its turn in a long time)


was (Author: benedict):
The problem is pretty simple: MeteredFlusher runs on StorageService.optionalTasks, and other events that run there can take a long time. In particular hint delivery scheduling, which is preceded by a blocking compaction of the hints table, during which no other optional task can make progress.

MeteredFlusher should have its own dedicated thread, as responding promptly is essential; under this workload, running every couple of seconds is pretty much necessary to avoid a rapid, catastrophic build-up of state in memtables.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress - /usr/bin/cassandra-stress write n=19 -rate threads=<different threads tested> -schema replication\(factor=3\) keyspace=Keyspace1 -node <all nodes listed>
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"

[jira] [Updated] (CASSANDRA-8452) Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8452:
---
Reviewer: Joshua McKenzie

 Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows
 

 Key: CASSANDRA-8452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8452
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston
Priority: Minor
 Fix For: 2.1.3

 Attachments: CASSANDRA-8452-v2.patch, CASSANDRA-8452.patch


 The isUnix method leaves out a few unix systems, which, after the changes in 
 CASSANDRA-8136, causes some unexpected behavior during shutdown. It would 
 also be clearer if FBUtilities had an isWindows method for branching into 
 Windows specific logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8458) Avoid streaming from tmplink files

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242832#comment-14242832
 ] 

Benedict commented on CASSANDRA-8458:
-

We could also try and figure out how/why this happens, as it should be able to 
stream safely.

Does it only happen if streaming a range that wraps zero (i.e. from +X, to -Y)?

 Avoid streaming from tmplink files
 --

 Key: CASSANDRA-8458
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8458
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 Looks like we include tmplink sstables in streams in 2.1+, and when we do, 
 sometimes we get this error message on the receiving side: 
 {{java.io.IOException: Corrupt input data, block did not start with 2 byte 
 signature ('ZV') followed by type byte, 2-byte length)}}. I've only seen this 
 happen when a tmplink sstable is included in the stream.
 We can not just exclude the tmplink files when starting the stream - we need 
 to include the original file, which we might miss since we check if the 
 requested stream range intersects the sstable range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8458) Avoid streaming from tmplink files

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242832#comment-14242832
 ] 

Benedict edited comment on CASSANDRA-8458 at 12/11/14 5:45 PM:
---

We could also try and figure out how/why this happens, as it should be able to 
stream safely.

Does it only happen if streaming a range that wraps zero (i.e. from +X, to -Y)? 

edit: To elaborate, I suspect the broken bit is that our dfile/ifile objects don't actually truncate the readable range - only our indexed DecoratedKey range is truncated. In sstable.getPositionsForRanges we just return the end of the file if the range goes past the range of the file; in this case we could stream partially written data. If so, we could fix it by simply making sstable.getPositionsForRanges() look up the start position of the last key in the file, and always ensuring we leave a key's overlap between the dropped sstables and the replacement.
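
For illustration, a minimal sketch of that clamping idea; all names here are assumed for the sketch, not taken from the codebase.

{code}
// Sketch only: when computing a stream section for a possibly still-growing
// (tmplink) sstable, never let the section run past the start of the last
// fully indexed key, so partially written trailing data is never streamed.
final class StreamSectionClamp
{
    static long sectionEnd(long requestedEnd, long fileLength, long lastKeyStartPosition)
    {
        long end = Math.min(requestedEnd, fileLength);
        return Math.min(end, lastKeyStartPosition);
    }

    private StreamSectionClamp()
    {
    }
}
{code}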


was (Author: benedict):
We could also try and figure out how/why this happens, as it should be able to 
stream safely.

Does it only happen if streaming a range that wraps zero (i.e. from +X, to -Y)?

 Avoid streaming from tmplink files
 --

 Key: CASSANDRA-8458
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8458
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 Looks like we include tmplink sstables in streams in 2.1+, and when we do, 
 sometimes we get this error message on the receiving side: 
 {{java.io.IOException: Corrupt input data, block did not start with 2 byte 
 signature ('ZV') followed by type byte, 2-byte length)}}. I've only seen this 
 happen when a tmplink sstable is included in the stream.
 We can not just exclude the tmplink files when starting the stream - we need 
 to include the original file, which we might miss since we check if the 
 requested stream range intersects the sstable range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process

2014-12-11 Thread Alexander Radzin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242835#comment-14242835
 ] 

Alexander Radzin commented on CASSANDRA-8390:
-

FYI: I have Windows Defender on my computer. I have just tried to turn it off 
and got the same result. 

 The process cannot access the file because it is being used by another process
 --

 Key: CASSANDRA-8390
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390
 Project: Cassandra
  Issue Type: Bug
Reporter: Ilya Komolkin
Assignee: Joshua McKenzie
 Fix For: 2.1.3


 21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main]
 org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process.
 at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1]
 at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1]
 at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1]
 at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1]
 at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71]
 at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71]
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71]
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
 Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process.
 at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71]
 at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71]
 at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71]
 at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71]
 at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71]
 at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71]
 at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1]
 ... 11 common frames omitted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] cassandra git commit: Support for user-defined aggregate functions

2014-12-11 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/trunk 857de5540 -> e2f35c767


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e2f35c76/src/java/org/apache/cassandra/cql3/statements/DropAggregateStatement.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/statements/DropAggregateStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/DropAggregateStatement.java
new file mode 100644
index 000..118f89d
--- /dev/null
+++ b/src/java/org/apache/cassandra/cql3/statements/DropAggregateStatement.java
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.cql3.statements;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.cassandra.auth.Permission;
+import org.apache.cassandra.cql3.CQL3Type;
+import org.apache.cassandra.cql3.functions.*;
+import org.apache.cassandra.db.marshal.AbstractType;
+import org.apache.cassandra.exceptions.InvalidRequestException;
+import org.apache.cassandra.exceptions.RequestValidationException;
+import org.apache.cassandra.exceptions.UnauthorizedException;
+import org.apache.cassandra.service.ClientState;
+import org.apache.cassandra.service.MigrationManager;
+import org.apache.cassandra.thrift.ThriftValidation;
+import org.apache.cassandra.transport.Event;
+
+/**
+ * A <code>DROP AGGREGATE</code> statement parsed from a CQL query.
+ */
+public final class DropAggregateStatement extends SchemaAlteringStatement
+{
+    private FunctionName functionName;
+    private final boolean ifExists;
+    private final List<CQL3Type.Raw> argRawTypes;
+    private final boolean argsPresent;
+
+    public DropAggregateStatement(FunctionName functionName,
+                                  List<CQL3Type.Raw> argRawTypes,
+                                  boolean argsPresent,
+                                  boolean ifExists)
+    {
+        this.functionName = functionName;
+        this.argRawTypes = argRawTypes;
+        this.argsPresent = argsPresent;
+        this.ifExists = ifExists;
+    }
+
+    public void prepareKeyspace(ClientState state) throws InvalidRequestException
+    {
+        if (!functionName.hasKeyspace() && state.getRawKeyspace() != null)
+            functionName = new FunctionName(state.getKeyspace(), functionName.name);
+
+        if (!functionName.hasKeyspace())
+            throw new InvalidRequestException("Functions must be fully qualified with a keyspace name if a keyspace is not set for the session");
+
+        ThriftValidation.validateKeyspaceNotSystem(functionName.keyspace);
+    }
+
+    public void checkAccess(ClientState state) throws UnauthorizedException, InvalidRequestException
+    {
+        // TODO CASSANDRA-7557 (function DDL permission)
+
+        state.hasKeyspaceAccess(functionName.keyspace, Permission.DROP);
+    }
+
+    public void validate(ClientState state) throws RequestValidationException
+    {
+    }
+
+    public Event.SchemaChange changeEvent()
+    {
+        return null;
+    }
+
+    public boolean announceMigration(boolean isLocalOnly) throws RequestValidationException
+    {
+        List<Function> olds = Functions.find(functionName);
+
+        if (!argsPresent && olds != null && olds.size() > 1)
+            throw new InvalidRequestException(String.format("'DROP AGGREGATE %s' matches multiple function definitions; " +
+                                                            "specify the argument types by issuing a statement like " +
+                                                            "'DROP AGGREGATE %s (type, type, ...)'. Hint: use cqlsh " +
+                                                            "'DESCRIBE AGGREGATE %s' command to find all overloads",
+                                                            functionName, functionName, functionName));
+
+        List<AbstractType<?>> argTypes = new ArrayList<>(argRawTypes.size());
+        for (CQL3Type.Raw rawType : argRawTypes)
+            argTypes.add(rawType.prepare(functionName.keyspace).getType());
+
+        Function old;
+        if (argsPresent)
+        {
+            old = Functions.find(functionName, argTypes);
+            if (old == null || !(old 

[2/2] cassandra git commit: Support for user-defined aggregate functions

2014-12-11 Thread tylerhobbs
Support for user-defined aggregate functions

Patch by Robert Stupp; reviewed by Tyler Hobbs for CASSANDRA-8053


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e2f35c76
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e2f35c76
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e2f35c76

Branch: refs/heads/trunk
Commit: e2f35c767e479da9761628578299b54872d7eea9
Parents: 857de55
Author: Robert Stupp sn...@snazy.de
Authored: Thu Dec 11 11:46:28 2014 -0600
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu Dec 11 11:46:28 2014 -0600

--
 CHANGES.txt |   1 +
 pylib/cqlshlib/cql3handling.py  |  28 +-
 src/java/org/apache/cassandra/auth/Auth.java|  12 +
 .../org/apache/cassandra/config/KSMetaData.java |   1 +
 src/java/org/apache/cassandra/cql3/Cql.g|  61 ++
 .../apache/cassandra/cql3/QueryProcessor.java   |  15 +
 .../cql3/functions/AbstractFunction.java|  10 +
 .../cassandra/cql3/functions/AggregateFcts.java |  64 +-
 .../cql3/functions/AggregateFunction.java   |  10 +-
 .../cassandra/cql3/functions/Function.java  |   4 +
 .../cassandra/cql3/functions/FunctionCall.java  |   2 +-
 .../cassandra/cql3/functions/Functions.java |  24 +-
 .../cql3/functions/JavaSourceUDFFactory.java|   6 +-
 .../cassandra/cql3/functions/UDAggregate.java   | 280 
 .../cassandra/cql3/functions/UDFunction.java| 193 ++
 .../cassandra/cql3/functions/UDHelper.java  | 123 
 .../selection/AbstractFunctionSelector.java |   4 +-
 .../selection/AggregateFunctionSelector.java|   6 +-
 .../cassandra/cql3/selection/FieldSelector.java |   2 +-
 .../cassandra/cql3/selection/Selection.java |   8 +-
 .../cassandra/cql3/selection/Selector.java  |   2 +-
 .../cql3/selection/SelectorFactories.java   |   2 +-
 .../statements/CreateAggregateStatement.java| 194 ++
 .../statements/CreateFunctionStatement.java |  11 +-
 .../cql3/statements/DropAggregateStatement.java | 136 
 .../cql3/statements/DropFunctionStatement.java  |  17 +-
 .../org/apache/cassandra/db/DefsTables.java |  89 ++-
 .../org/apache/cassandra/db/SystemKeyspace.java |  21 +-
 .../cassandra/service/IMigrationListener.java   |   3 +
 .../cassandra/service/MigrationManager.java |  45 +-
 .../org/apache/cassandra/transport/Server.java  |  12 +
 .../apache/cassandra/cql3/AggregationTest.java  | 640 ++-
 .../org/apache/cassandra/cql3/CQLTester.java|  14 +
 test/unit/org/apache/cassandra/cql3/UFTest.java |   8 -
 34 files changed, 1795 insertions(+), 253 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e2f35c76/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 34e740e..6ff61e7 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0
+ * Support for user-defined aggregation functions (CASSANDRA-8053)
  * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419)
  * Refactor SelectStatement, return IN results in natural order instead
of IN value list order (CASSANDRA-7981)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e2f35c76/pylib/cqlshlib/cql3handling.py
--
diff --git a/pylib/cqlshlib/cql3handling.py b/pylib/cqlshlib/cql3handling.py
index f8a3069..84af796 100644
--- a/pylib/cqlshlib/cql3handling.py
+++ b/pylib/cqlshlib/cql3handling.py
@@ -41,7 +41,7 @@ class Cql3ParsingRuleSet(CqlParsingRuleSet):
 'select', 'from', 'where', 'and', 'key', 'insert', 'update', 'with',
 'limit', 'using', 'use', 'set',
 'begin', 'apply', 'batch', 'truncate', 'delete', 'in', 'create',
-'function', 'keyspace', 'schema', 'columnfamily', 'table', 'index', 
'on', 'drop',
+'function', 'aggregate', 'keyspace', 'schema', 'columnfamily', 
'table', 'index', 'on', 'drop',
 'primary', 'into', 'values', 'timestamp', 'ttl', 'alter', 'add', 
'type',
 'compact', 'storage', 'order', 'by', 'asc', 'desc', 'clustering',
 'token', 'writetime', 'map', 'list', 'to', 'custom', 'if', 'not'
@@ -209,7 +209,10 @@ JUNK ::= /([ 
\t\r\f\v]+|(--|[/][/])[^\n\r]*([\n\r]|$)|[/][*].*?[*][/])/ ;
 <mapLiteral> ::= "{" <term> ":" <term> ( "," <term> ":" <term> )* "}"
                ;
 
-<functionName> ::= <identifier> ( "." <identifier> )?
+<userFunctionName> ::= <identifier> ( "." <identifier> )?
+                     ;
+
+<functionName> ::= <userFunctionName>
                  | "TOKEN"
                  ;
 
@@ -233,12 +236,14 @@ JUNK ::= /([ 
\t\r\f\v]+|(--|[/][/])[^\n\r]*([\n\r]|$)|[/][*].*?[*][/])/ ;
   | <createIndexStatement>
   | <createUserTypeStatement>
   | 

[jira] [Updated] (CASSANDRA-8053) Support for user defined aggregate functions

2014-12-11 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-8053:
---
Attachment: 8053-final.txt

 Support for user defined aggregate functions
 

 Key: CASSANDRA-8053
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8053
 Project: Cassandra
  Issue Type: New Feature
Reporter: Robert Stupp
Assignee: Robert Stupp
  Labels: cql, udf
 Fix For: 3.0

 Attachments: 8053-final.txt, 8053v1.txt, 8053v2.txt


 CASSANDRA-4914 introduces aggregate functions.
 This ticket is about to decide how we can support user defined aggregate 
 functions. UD aggregate functions should be supported for all UDF flavors 
 (class, java, jsr223).
 Things to consider:
 * Special implementations for each scripting language should be omitted
 * No exposure of internal APIs (e.g. {{AggregateFunction}} interface)
 * No need for users to deal with serializers / codecs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-12-11 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242851#comment-14242851
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim], Please find the branch and the commit reference for the compact task 
- 
https://github.com/rnamboodiri/cassandra/commit/3e1a49c511eb23d0e9b5bc854de1316d3be9fd86

Please review it and let me know. Since I have done the decommission part as 
well, I will commit that too and send you the link. From then on, I will focus 
on the compaction-related tasks (scrub, upgradesstables, etc.). Thanks
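
For context, a minimal sketch of how a nodetool-style client could consume 
these notifications; the StorageService MBean name and default JMX port are 
the real ones, while the "compaction" notification type tag and everything 
else are illustrative assumptions rather than the patch's actual wiring:

{code}
import javax.management.MBeanServerConnection;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactNotificationWatcher
{
    public static void main(String[] args) throws Exception
    {
        // 7199 is Cassandra's default JMX port
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
        try
        {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName ssName = new ObjectName("org.apache.cassandra.db:type=StorageService");
            mbs.addNotificationListener(ssName, new NotificationListener()
            {
                public void handleNotification(Notification n, Object handback)
                {
                    // hypothetical type tag; repair already uses "repair" for this
                    if ("compaction".equals(n.getType()))
                        System.out.println(n.getMessage()); // success/failure text
                }
            }, null, null);
            Thread.sleep(Long.MAX_VALUE); // keep listening while the operation runs
        }
        finally
        {
            jmxc.close();
        }
    }
}
{code}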

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
 cassandra-trunk-decommission-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242852#comment-14242852
 ] 

Jonathan Ellis commented on CASSANDRA-8447:
---

There is a patch on CASSANDRA-8285 that moves hint delivery off of 
optionalTasks to leave it free for MeteredFlusher.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the single 
 threaded test.  The single threaded test does not appear to show this 
 behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS=$JVM_OPTS 

[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2014-12-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242859#comment-14242859
 ] 

Benedict commented on CASSANDRA-8457:
-

FTR, I strongly doubt _context switching_ is actually as much of a problem as 
we think, although constraining it is never a bad thing. The big hit we have is 
_thread signalling_ costs, which is a different but related beast. Certainly 
the talking point that raised this was discussing system time spent serving 
context switches, which would definitely be referring to signalling, not the 
switching itself.

Now, we do use a BlockingQueue for OutboundTcpConnection which will incur these 
costs; however, I strongly suspect the impact will be much lower than predicted 
- especially as the testing done to flag this up was on small clusters with 
RF=1, where these threads would not be being exercised at all. The costs of 
going to the network itself are likely to exceed the context switching costs, 
and they naturally permit messages to accumulate in the queue, reducing the 
number of signals actually needed. 

There are also the negative performance implications we have found with small 
numbers of connections under NIO to consider, so this change could have 
significant downsides for the majority of deployed clusters (although if we get 
batching in the client driver we may see these penalties disappear).

To establish if there's likely a benefit to exploit, we could refactor this 
code comparatively minimally (compared to rewriting to NIO/Netty) to make use 
of the SharedExecutorPool, to establish whether such a positive effect is 
indeed to be had, as this would reduce the number of threads in flight to those 
actually serving work on the OTCs. This wouldn't affect the ITC, but I am 
dubious of their contribution. We should probably also test whether this is 
indeed a problem for clusters at scale performing in-memory CL1 reads.
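
As an illustration of why accumulation reduces signalling: a consumer that 
pays the potential park/unpark cost at most once per batch on {{take()}}, then 
drains whatever else has already queued for free. A minimal sketch; 
{{writeToNetwork}} is a hypothetical stand-in for OutboundTcpConnection's 
write path:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

public class BatchingConsumer
{
    public void consume(BlockingQueue<Object> queue) throws InterruptedException
    {
        List<Object> batch = new ArrayList<>();
        while (true)
        {
            batch.clear();
            // take() may block, paying a signalling cost once per batch...
            batch.add(queue.take());
            // ...while drainTo() grabs everything already queued with no signals
            queue.drainTo(batch, 127);
            for (Object message : batch)
                writeToNetwork(message);
        }
    }

    private void writeToNetwork(Object message)
    {
        // hypothetical: serialize and write the message to the socket
    }
}
{code}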


 nio MessagingService
 

 Key: CASSANDRA-8457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
  Labels: performance
 Fix For: 3.0


 Thread-per-peer (actually two each, incoming and outbound) is a big 
 contributor to context switching, especially for larger clusters.  Let's look 
 at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2014-12-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242880#comment-14242880
 ] 

Michaël Figuière commented on CASSANDRA-7937:
-

The StreamIDs, introduced in the native protocol to multiplex several pending 
requests on a single connection, could actually serve as a backpressure 
mechanism. Before protocol v2 we had just 128 IDs per connection, with drivers 
typically allowing just a few connections per node. This therefore already acts 
as a throttling mechanism on the client side. With protocol v3 we've increased 
this limit, but the driver still lets the user define a value for the max 
requests per host, which has the same effect. A simple way to handle 
backpressure could therefore be to introduce a Window (similar to a TCP Window) 
of the currently allowed concurrent requests for each client. Just like in TCP, 
the Window Size could be included in each response header to the client. This 
Window Size could then be adjusted using a magic formula yet to be defined, 
probably based on the load of each Stage of the Cassandra architecture, the 
state of compaction, etc...

I agree with [~jbellis]'s point: backpressure in a distributed system like 
Cassandra, with a coordinator forwarding traffic to replicas, is confusing. But 
in practice, most recent CQL Drivers now do Token Aware Balancing by default 
(since 2.0.2 in the Java Driver), which will send the queries to the replicas 
for any PreparedStatement (expected to be used under the high-pressure 
condition described here). So in this situation the backpressure information 
received by the client could be used properly, as it would just be understood 
by the client as a request to slow down for *this* particular replica; it could 
therefore pick another replica. Thus we end up with a system in which we avoid 
doing Load Shedding (which is a waste of time, bandwidth and workload) and 
that, I believe, could behave more smoothly when the cluster is overloaded.

Note that this StreamID Window could be considered either a mandatory limit or 
just a hint in the protocol specification. The driver could then adjust its 
strategy to use it or not depending on the settings or the type of request.
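
A minimal client-side sketch of such a window; the {{windowSize}} field stands 
in for a server-advertised value carried in each response header, and all 
names here are illustrative rather than driver API:

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Per-host request window: in-flight requests are capped by a server-advertised
// window size, in the spirit of a TCP receive window.
public class HostWindow
{
    private final AtomicInteger inFlight = new AtomicInteger(0);
    private volatile int windowSize = 128; // the pre-v3 StreamID budget as a default

    /** Try to reserve a StreamID; on false, the caller can pick another replica. */
    public boolean tryAcquire()
    {
        while (true)
        {
            int current = inFlight.get();
            if (current >= windowSize)
                return false;
            if (inFlight.compareAndSet(current, current + 1))
                return true;
        }
    }

    /** Called on each response; its header carries the server's current window. */
    public void onResponse(int advertisedWindowSize)
    {
        windowSize = advertisedWindowSize;
        inFlight.decrementAndGet();
    }
}
{code}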

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
  Labels: performance

 When writing huge amounts of data into a C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
 This is because analytic tools typically write data as fast as they can in 
 parallel, from many nodes, and they are not artificially rate-limited, so C* 
 is the bottleneck here. Also, increasing the number of nodes doesn't really 
 help, because in a collocated setup this also increases the number of 
 Hadoop/Spark nodes (writers), and although the possible write performance is 
 higher, the problem still remains.
 We observe the following behavior:
 1. data is ingested at an extremely fast pace into memtables and the flush 
 queue fills up
 2. the available memory limit for memtables is reached and writes are no 
 longer accepted
 3. the application gets hit by write timeouts, and retries repeatedly, in 
 vain 
 4. after several failed attempts to write, the job gets aborted 
 Desired behaviour:
 1. data is ingested at an extremely fast pace into memtables and the flush 
 queue fills up
 2. after exceeding some memtable fill threshold, C* applies adaptive rate 
 limiting to writes - the more the buffers are filled up, the fewer writes/s 
 are accepted; however, writes still occur within the write timeout.
 3. thanks to the slowed-down data ingestion, the flush can now finish before 
 all the memory gets used
 Of course the details of how rate limiting could be done are up for 
 discussion. It may also be worth considering putting such logic into the 
 driver, not the C* core, but then C* needs to expose at least the following 
 information to the driver, so we could calculate the desired maximum data 
 rate:
 1. current amount of memory available for writes before they would completely 
 block
 2. total amount of data queued to be flushed and flush progress (amount of 
 data to flush remaining for the memtable currently being flushed)
 3. average flush write speed
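
To make the desired behaviour above concrete, a minimal sketch of adaptive 
rate limiting driven by the memtable fill ratio, using Guava's RateLimiter; 
the constants, the linear ramp, and the fill-ratio source are illustrative 
assumptions, not a proposed implementation:

{code}
import com.google.common.util.concurrent.RateLimiter;

// The fuller the memtable buffers, the fewer writes/s we admit.
public class AdaptiveWriteLimiter
{
    private static final double MAX_RATE = 50000;  // writes/s when unpressured
    private static final double MIN_RATE = 500;    // floor so writes never fully stall
    private static final double THRESHOLD = 0.5;   // fill ratio where limiting kicks in

    private final RateLimiter limiter = RateLimiter.create(MAX_RATE);

    /** Call periodically with the current memtable fill ratio in [0, 1]. */
    public void adjust(double fillRatio)
    {
        if (fillRatio <= THRESHOLD)
        {
            limiter.setRate(MAX_RATE);
            return;
        }
        // ramp down linearly between THRESHOLD and 1.0
        double pressure = (fillRatio - THRESHOLD) / (1.0 - THRESHOLD);
        limiter.setRate(Math.max(MIN_RATE, MAX_RATE * (1.0 - pressure)));
    }

    /** Block each incoming write just long enough to respect the current rate. */
    public void admitWrite()
    {
        limiter.acquire();
    }
}
{code}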



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8418) Queries that require allow filtering are working without it

2014-12-11 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242878#comment-14242878
 ] 

Benjamin Lerer commented on CASSANDRA-8418:
---

What is happening is that non-matching clustering columns are filtered out of 
the index row in {{CompositesSearcher.getIndexedIterator}}, as this is more 
efficient than loading the row and then filtering. Today this filtering can 
only work if the clustering columns have been specified as a slice, and it 
causes the behavior to not always be consistent, especially if some indices 
exist on the clustering columns.

I agree that this behaviour is just a filtering optimization and that ALLOW 
FILTERING should be required.

 Queries that require allow filtering are working without it
 ---

 Key: CASSANDRA-8418
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8418
 Project: Cassandra
  Issue Type: Bug
Reporter: Philip Thompson
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0

 Attachments: CASSANDRA-8418.txt


 The trunk dtest {{cql_tests.py:TestCQL.composite_index_with_pk_test}} has 
 begun failing after the changes to CASSANDRA-7981. 
 With the schema {code}CREATE TABLE blogs (
 blog_id int,
 time1 int,
 time2 int,
 author text,
 content text,
 PRIMARY KEY (blog_id, time1, time2){code}
 and {code}CREATE INDEX ON blogs(author){code}, then the query
 {code}SELECT blog_id, content FROM blogs WHERE time1  0 AND 
 author='foo'{code} now requires ALLOW FILTERING, but did not before the 
 refactor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8370) cqlsh doesn't handle LIST statements correctly

2014-12-11 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242889#comment-14242889
 ] 

Tyler Hobbs commented on CASSANDRA-8370:


bq. v2 wfm, one tiny nit - is it worth checking the querystring first to avoid 
unnecessary parsing? I couldn't see a case where we'd want to parse here just 
to validate the query, but I could be missing something.

+1, it's better to avoid parsing if we can.

 cqlsh doesn't handle LIST statements correctly
 --

 Key: CASSANDRA-8370
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8370
 Project: Cassandra
  Issue Type: Bug
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.3

 Attachments: 8370.txt, 8370v2.patch


 {{LIST USERS}} and {{LIST PERMISSIONS}} statements are not handled correctly 
 by cqlsh in 2.1 (since CASSANDRA-6307).
 Running such a query results in errors along the lines of:
 {noformat}
 sam@easy:~/projects/cassandra$ bin/cqlsh --debug -u cassandra -p cassandra
 Using CQL driver: module 'cassandra' from 
 '/home/sam/projects/cassandra/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/__init__.py'
 Connected to Test Cluster at 127.0.0.1:9042.
 [cqlsh 5.0.1 | Cassandra 2.1.2-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cassandra@cqlsh list users;
 Traceback (most recent call last):
   File bin/cqlsh, line 879, in onecmd
 self.handle_statement(st, statementtext)
   File bin/cqlsh, line 920, in handle_statement
 return self.perform_statement(cqlruleset.cql_extract_orig(tokens, srcstr))
   File bin/cqlsh, line 953, in perform_statement
 result = self.perform_simple_statement(stmt)
   File bin/cqlsh, line 989, in perform_simple_statement
 self.print_result(rows, self.parse_for_table_meta(statement.query_string))
   File bin/cqlsh, line 970, in parse_for_table_meta
 return self.get_table_meta(ks, cf)
   File bin/cqlsh, line 732, in get_table_meta
 ksmeta = self.get_keyspace_meta(ksname)
   File bin/cqlsh, line 717, in get_keyspace_meta
 raise KeyspaceNotFound('Keyspace %r not found.' % ksname)
 KeyspaceNotFound: Keyspace None not found.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Chamila Dilshan Wijayarathna (JIRA)
Chamila Dilshan Wijayarathna created CASSANDRA-8461:
---

 Summary: java.lang.AssertionError when running select queries
 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna


I have a column family with the following schema.

CREATE TABLE corpus.trigram_category_ordered_frequency (
id bigint,
word1 varchar,
word2 varchar,
word3 varchar,
category varchar,
frequency int,
PRIMARY KEY(category,frequency,word1,word2,word3)
);

When I run 

 select word1,word2,word3 from corpus.trigram_category_ordered_frequency where 
category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;

I am getting an error saying

ErrorMessage code= [Server error] message=java.lang.AssertionError

But when I run 

select * from corpus.trigram_category_ordered_frequency where category IN 
('N','A','C','S','G') order by frequency DESC LIMIT 10;

it works without any error.

The system log for this error is as follows.


ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
Unexpected exception during request; channel = [id: 0xea57d8b6, 
/127.0.0.1:35624 = /127.0.0.1:9042]
java.lang.AssertionError: null
at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
 [apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
 [apache-cassandra-2.1.2.jar:2.1.2]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_72]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.2.jar:2.1.2]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2014-12-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242880#comment-14242880
 ] 

Michaël Figuière edited comment on CASSANDRA-7937 at 12/11/14 6:37 PM:
---

The StreamIDs, introduced in the native protocol to multiplex several pending 
requests on a single connection, could actually serve as a backpressure 
mechanism. Before protocol v2 we had just 128 IDs per connection, with drivers 
typically allowing just a few connections per node. This therefore already acts 
as a throttling mechanism on the client side. With protocol v3 we've increased 
this limit, but the driver still lets the user define a value for the max 
requests per host, which has the same effect. A simple way to handle 
backpressure could therefore be to introduce a Window (similar to a TCP Window) 
of the currently allowed concurrent requests for each client. Just like in TCP, 
the Window Size could be included in each response header to the client. This 
Window Size could then be adjusted using a magic formula yet to be defined, 
probably based on the load of each Stage of the Cassandra architecture, the 
state of compaction, etc...

I agree with [~jbellis]'s point: backpressure in a distributed system like 
Cassandra, with a coordinator forwarding traffic to replicas, is confusing. But 
in practice, most recent CQL Drivers now do Token Aware Balancing by default 
(since 2.0.2 in the Java Driver), which will send the request to the replicas 
for any PreparedStatement (expected to be used under the high-pressure 
condition described here). So in this situation the backpressure information 
received by the client could be used properly, as it would just be understood 
by the client as a request to slow down for *this* particular replica; it could 
therefore pick another replica. Thus we end up with a system in which we avoid 
doing Load Shedding (which is a waste of time, bandwidth and workload) and 
that, I believe, could behave more smoothly when the cluster is overloaded.

Note that this StreamID Window could be considered either a mandatory limit or 
just a hint in the protocol specification. The driver could then adjust its 
strategy to use it or not depending on the settings or the type of request.


was (Author: mfiguiere):
The StreamIDs, introduced in the native protocol to multiplex several pending 
requests on a single connection, could actually serve as a backpressure 
mechanism. Before protocol v2 we had just 128 IDs per connection with drivers 
typically allowing just a few connection per node. This therefore already acts 
as a throttling mechanism on the client side. With protocol v3 we've increased 
this limit but the driver still let the user define a value for the max 
requests per host that will have the same effect. A simple way the handle 
backpressure could therefore be to introduce a Window (similar as TCP Window) 
of the currently allowed concurrent requests for each client. Just like in TCP, 
the Window Size could be included in each response header to the client. This 
Window Size could then be adjusted using a magic formula to define, probably 
based on the load of each Stage of the Cassandra architecture, state of 
compaction, etc...

I agree with [~jbellis]'s point: backpressure in a distributed system like 
Cassandra, with a coordinator fowarding traffic to replicas, is confusing. But 
in practice, most recent CQL Drivers now do Token Aware Balancing by default 
(since 2.0.2 in the Java Driver), which will send the queries to the replicas 
any PreparedStatement (expected to be used under the high pressure condition 
described here). So in this situation the backpressure information received by 
the client could be used properly, as it would just be understood by the client 
as a request to slow down for *this* particular replica, it could therefore 
pick another replica. Thus we end up with a system in which we avoid doing Load 
Shedding (which is a waste of time, bandwidth and workload) and that, I 
believe, could behave more smoothly when the cluster is overloaded.

Note that this StreamID Window could be considered as a mandatory limit or 
just as a hint in the protocol specification. The driver could then adjust 
its strategy to use it or not depending on the settings or type of request.

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
  Labels: performance

 When writing huge amounts of data into C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
 This is 

[jira] [Updated] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8461:
---
Description: 
I have a column family with the following schema.

CREATE TABLE corpus.trigram_category_ordered_frequency (
id bigint,
word1 varchar,
word2 varchar,
word3 varchar,
category varchar,
frequency int,
PRIMARY KEY(category,frequency,word1,word2,word3)
);

When I run 

 select word1,word2,word3 from corpus.trigram_category_ordered_frequency where 
category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;

I am getting an error saying

ErrorMessage code= [Server error] message=java.lang.AssertionError

But when I run 

select * from corpus.trigram_category_ordered_frequency where category IN 
('N','A','C','S','G') order by frequency DESC LIMIT 10;

it works without any error.

The system log for this error is as follows.

{code}
ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
Unexpected exception during request; channel = [id: 0xea57d8b6, 
/127.0.0.1:35624 = /127.0.0.1:9042]
java.lang.AssertionError: null
at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
 ~[apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
 [apache-cassandra-2.1.2.jar:2.1.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
 [apache-cassandra-2.1.2.jar:2.1.2]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_72]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.2.jar:2.1.2]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}

  was:
I have a column family with following schema.

CREATE TABLE corpus.trigram_category_ordered_frequency (
id bigint,
word1 varchar,
word2 varchar,
word3 varchar,
category varchar,
frequency int,
PRIMARY KEY(category,frequency,word1,word2,word3)
);

When I run 

 select word1,word2,word3 from corpus.trigram_category_ordered_frequency where 
category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;

I am getting error saying

ErrorMessage code= [Server error] message=java.lang.AssertionError

But when I ran 

select * from corpus.trigram_category_ordered_frequency where category IN 
('N','A','C','S','G') order by frequency DESC LIMIT 10;

it works without any error.

system log for this error is as follows.


ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
Unexpected exception during request; channel = [id: 0xea57d8b6, 
/127.0.0.1:35624 = /127.0.0.1:9042]
java.lang.AssertionError: null
at 

[jira] [Assigned] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson reassigned CASSANDRA-8461:
--

Assignee: Philip Thompson

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna
Assignee: Philip Thompson

 I have a column family with following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting error saying
 ErrorMessage code= [Server error] message=java.lang.AssertionError
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 = /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242942#comment-14242942
 ] 

Philip Thompson commented on CASSANDRA-8461:


[~thobbs], I will verify that this was not fixed by other work.

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna
Assignee: Philip Thompson

 I have a column family with following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting error saying
 ErrorMessage code= [Server error] message=java.lang.AssertionError
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 = /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8462) Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes

2014-12-11 Thread Rick Branson (JIRA)
Rick Branson created CASSANDRA-8462:
---

 Summary: Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes
 Key: CASSANDRA-8462
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8462
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Rick Branson


Added a 2.1.2 node to a cluster running 2.0.11. Didn't make any schema changes. 
When I tried to reboot one of the 2.0 nodes, it failed to boot with this 
exception. Besides an obvious fix, any workarounds for this?

{code}
java.lang.IllegalArgumentException: No enum constant 
org.apache.cassandra.config.CFMetaData.Caching.{"keys":"ALL", 
"rows_per_partition":"NONE"}
at java.lang.Enum.valueOf(Enum.java:236)
at 
org.apache.cassandra.config.CFMetaData$Caching.valueOf(CFMetaData.java:286)
at 
org.apache.cassandra.config.CFMetaData.fromSchemaNoColumnsNoTriggers(CFMetaData.java:1713)
at 
org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1793)
at 
org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:307)
at 
org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:288)
at 
org.apache.cassandra.db.DefsTables.loadFromKeyspace(DefsTables.java:131)
at 
org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:529)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:270)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
{code}
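
For illustration only, a minimal sketch of the kind of tolerant parsing that 
would let a 2.0 node read the 2.1-style value; the enum constants mirror the 
2.0 {{CFMetaData.Caching}} options, and the JSON sniffing is an assumption, 
not the actual fix:

{code}
public final class CachingCompat
{
    // the 2.0-era caching options
    enum Caching { ALL, KEYS_ONLY, ROWS_ONLY, NONE }

    // Accept both the 2.0 spelling ("ALL") and the 2.1 JSON-style value
    // ({"keys":"ALL", "rows_per_partition":"NONE"}).
    static Caching fromSchemaValue(String value)
    {
        String v = value.trim();
        if (!v.startsWith("{"))
            return Caching.valueOf(v.toUpperCase());

        // crude JSON sniffing, enough for the four legacy combinations
        boolean cacheKeys = v.contains("\"keys\":\"ALL\"");
        boolean cacheRows = !v.contains("\"rows_per_partition\":\"NONE\"");
        if (cacheKeys && cacheRows) return Caching.ALL;
        if (cacheKeys)              return Caching.KEYS_ONLY;
        if (cacheRows)              return Caching.ROWS_ONLY;
        return Caching.NONE;
    }
}
{code}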



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8461:
---
Reproduced In: 2.1.2

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna
Assignee: Philip Thompson

 I have a column family with following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting error saying
 ErrorMessage code= [Server error] message=java.lang.AssertionError
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 = /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs resolved CASSANDRA-8461.

Resolution: Duplicate

[~philipthompson] before I saw your comment, I went ahead and verified that 
CASSANDRA-8286 was the patch that fixed this, so I'm resolving it as a 
duplicate of that.

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna
Assignee: Philip Thompson

 I have a column family with the following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting an error saying
 ErrorMessage code=0000 [Server error] message="java.lang.AssertionError"
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 => /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242954#comment-14242954
 ] 

jonathan lacefield commented on CASSANDRA-8447:
---

[~benedict]  Interesting about hints.  Just verified hints on the cluster.  
*  CQLSH shows 0 for count
*  data directory locally is empty under hints for all nodes.  
*  For all healthy nodes, tpstats shows no pending/active hints.
*  For the unhealthy node, TPstats shows 2 Active and 3 Pending hint ops  

From unhealthy node
cqlsh> use system;
cqlsh:system> select count(*) from hints ;

 count
-------
     0

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0              9         0                 0
MutationStage                     0         0       16471703         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0            439         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       0         0             24         0                 0
FlushWriter                       0         0            175         0                 0
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0            194         0                 0
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              6         0                 0
CompactionExecutor                1        17             18         0                 0
commitlog_archiver                0         0              0         0                 0
HintedHandoff                     2         3              0         0                 0

Here is the excerpt from the current hints config items in the .yaml from all 4 
nodes
hinted_handoff_enabled: false
# this defines the maximum amount of time a dead host will have hints
# generated.  After it has been dead this long, new hints for it will not be
max_hint_window_in_ms: 10800000 # 3 hours
# since we expect two nodes to be delivering hints simultaneously.)
hinted_handoff_throttle_in_kb: 1024
# Number of threads with which to deliver hints;
max_hints_delivery_threads: 2

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using write 
 only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured heap dump from 50 thread 
 load.  Hints were disabled for testing on all nodes to alleviate GC noise due 
 to hints backing up.
 Data load test through Cassandra stress - /usr/bin/cassandra-stress write 
 n=19 -rate threads=<different threads tested> -schema 
 replication\(factor=3\) keyspace=Keyspace1 -node <all nodes listed>
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 

[jira] [Commented] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242955#comment-14242955
 ] 

Philip Thompson commented on CASSANDRA-8461:


[~thobbs], thanks.

[~cdwijayarathna], this issue has already been fixed in 2.1.3. When that minor 
release comes out, upgrading should fix your problem. Thank you for filing the 
JIRA!

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna
Assignee: Philip Thompson

 I have a column family with the following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting an error saying
 ErrorMessage code=0000 [Server error] message="java.lang.AssertionError"
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 => /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8461) java.lang.AssertionError when running select queries

2014-12-11 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8461:
---
Assignee: (was: Philip Thompson)

 java.lang.AssertionError when running select queries
 

 Key: CASSANDRA-8461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8461
 Project: Cassandra
  Issue Type: Bug
  Components: API
 Environment: ubuntu 14.04
Reporter: Chamila Dilshan Wijayarathna

 I have a column family with the following schema.
 CREATE TABLE corpus.trigram_category_ordered_frequency (
 id bigint,
 word1 varchar,
 word2 varchar,
 word3 varchar,
 category varchar,
 frequency int,
 PRIMARY KEY(category,frequency,word1,word2,word3)
 );
 When I run 
  select word1,word2,word3 from corpus.trigram_category_ordered_frequency 
 where category IN ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 I am getting an error saying
 ErrorMessage code=0000 [Server error] message="java.lang.AssertionError"
 But when I ran 
 select * from corpus.trigram_category_ordered_frequency where category IN 
 ('N','A','C','S','G') order by frequency DESC LIMIT 10;
 it works without any error.
 system log for this error is as follows.
 {code}
 ERROR [SharedPool-Worker-1] 2014-12-11 20:42:20,152 Message.java:538 - 
 Unexpected exception during request; channel = [id: 0xea57d8b6, 
 /127.0.0.1:35624 => /127.0.0.1:9042]
 java.lang.AssertionError: null
   at org.apache.cassandra.cql3.ResultSet.addRow(ResultSet.java:63) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.Selection$ResultSetBuilder.newRow(Selection.java:333)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1227)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1161)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:290)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:267)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:215)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:64)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) 
 ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at 
 io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
  [netty-all-4.0.23.Final.jar:4.0.23.Final]
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_72]
   at 
 org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
  [apache-cassandra-2.1.2.jar:2.1.2]
   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
 [apache-cassandra-2.1.2.jar:2.1.2]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242954#comment-14242954
 ] 

jonathan lacefield edited comment on CASSANDRA-8447 at 12/11/14 6:58 PM:
-

[~benedict]  Interesting about hints.  Just verified hints on the cluster.  
*  CQLSH shows 0 for count
*  data directory locally is empty under hints for all nodes.  
*  For all healthy nodes, tpstats shows no pending/active hints.
*  For the unhealthy node, TPstats shows 2 Active and 3 Pending hint ops  

From unhealthy node
cqlsh> use system;
cqlsh:system> select count(*) from hints ;

 count
-------
     0

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0              9         0                 0
MutationStage                     0         0       16471703         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0            439         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       0         0             24         0                 0
FlushWriter                       0         0            175         0                 0
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0            194         0                 0
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              6         0                 0
CompactionExecutor                1        17             18         0                 0
commitlog_archiver                0         0              0         0                 0
HintedHandoff                     2         3              0         0                 0

Here is the excerpt from the current hints config items in the .yaml from all 4 
nodes
hinted_handoff_enabled: false
# this defines the maximum amount of time a dead host will have hints
# generated.  After it has been dead this long, new hints for it will not be
max_hint_window_in_ms: 10800000 # 3 hours
# since we expect two nodes to be delivering hints simultaneously.)
hinted_handoff_throttle_in_kb: 1024
# Number of threads with which to deliver hints;
max_hints_delivery_threads: 2

(edited - even after restarting dse, the unhealthy node shows 2 active and 3 
pending hints via tpstats)


was (Author: jlacefie):
[~benedict]  Interesting about hints.  Just verified hints on the cluster.  
*  CQLSH shows 0 for count
*  data directory locally is empty under hints for all nodes.  
*  For all healthy nodes, tpstats shows no pending/active hints.
*  For the unhealthy node, TPstats shows 2 Active and 3 Pending hint ops  

From unhealthy node
cqlsh> use system;
cqlsh:system> select count(*) from hints ;

 count
-------
     0

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0              9         0                 0
MutationStage                     0         0       16471703         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0            439         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       0         0             24         0                 0
FlushWriter                       0         0            175         0                 0
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0            194         0                 0
MiscStage                         0         0              0         0                 0

[jira] [Commented] (CASSANDRA-8452) Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows

2014-12-11 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242969#comment-14242969
 ] 

Joshua McKenzie commented on CASSANDRA-8452:


I'm not sure if isPosix is correct in retrospect 
([reference|http://en.wikipedia.org/wiki/POSIX#POSIX-oriented_operating_systems]).
  Not only are we not actually checking if something's truly posix compliant 
(though being mostly compliant serves our needs for now...), but our checks are 
more oriented around granular platform-specific differences, i.e. whether or 
not the underlying filesystem is an ext/hfs vs. ntfs, or whether or not the 
platform has a /proc filesystem  (Note: I don't think the /proc filesystem is 
actually defined in the posix standard ([reference 
2|http://pubs.opengroup.org/onlinepubs/9699919799/])).

While having methods like 'isNTFS' or 'hasProcFilesystem' would be arguably 
more correct, at this point if we slice the ecosystem into Windows vs. 
non-Windows it seems like it would satisfy our requirements.  I could be off 
on that though - do we have areas in the code-base where we support specific 
sub-types of the *nix world w/different checks?  i.e. is Cassandra run on any 
systems that are currently missing a /proc filesystem, or have wacky hard-link 
behavior so early re-open is a problem, etc?
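
For illustration, the coarse Windows vs. non-Windows slice described above could look something like the sketch below; this is a hedged, illustrative sketch only, not the FBUtilities code from the attached patch:

{code}
// Illustrative sketch only; not the actual FBUtilities implementation.
public final class PlatformSketch
{
    private static final String OS_NAME =
            System.getProperty("os.name").toLowerCase();

    // Windows JVMs report an os.name beginning with "Windows".
    public static boolean isWindows()
    {
        return OS_NAME.startsWith("windows");
    }

    // Coarse complement: treat everything non-Windows as *nix-like.
    public static boolean isUnix()
    {
        return !isWindows();
    }
}
{code}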

 Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows
 

 Key: CASSANDRA-8452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8452
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston
Priority: Minor
 Fix For: 2.1.3

 Attachments: CASSANDRA-8452-v2.patch, CASSANDRA-8452.patch


 The isUnix method leaves out a few unix systems, which, after the changes in 
 CASSANDRA-8136, causes some unexpected behavior during shutdown. It would 
 also be clearer if FBUtilities had an isWindows method for branching into 
 Windows specific logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242954#comment-14242954
 ] 

jonathan lacefield edited comment on CASSANDRA-8447 at 12/11/14 7:21 PM:
-

[~benedict]  Interesting about hints.  Just verified hints on the cluster.  
*  CQLSH shows 0 for count
*  data directory locally is empty under hints for all nodes.  
*  For all healthy nodes, tpstats shows no pending/active hints.
*  For the unhealthy node, TPstats shows 2 Active and 3 Pending hint ops  

From unhealthy node
cqlsh> use system;
cqlsh:system> select count(*) from hints ;

 count
-------
     0

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0              9         0                 0
MutationStage                     0         0       16471703         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0            439         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       0         0             24         0                 0
FlushWriter                       0         0            175         0                 0
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0            194         0                 0
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              6         0                 0
CompactionExecutor                1        17             18         0                 0
commitlog_archiver                0         0              0         0                 0
HintedHandoff                     2         3              0         0                 0

Here is the excerpt from the current hints config items in the .yaml from all 4 
nodes
hinted_handoff_enabled: false
# this defines the maximum amount of time a dead host will have hints
# generated.  After it has been dead this long, new hints for it will not be
max_hint_window_in_ms: 10800000 # 3 hours
# since we expect two nodes to be delivering hints simultaneously.)
hinted_handoff_throttle_in_kb: 1024
# Number of threads with which to deliver hints;
max_hints_delivery_threads: 2

(edited - even after restarting dse, the unhealthy node shows 2 active and 3 
pending hints via tpstats)
(edit 2 - was able to clear pending and active hints by dropping the keyspace 
through cqlsh as well as dropping the keyspace folder on this node.  new test 
is executing to see if behavior persists)


was (Author: jlacefie):
[~benedict]  Interesting about hints.  Just verified hints on the cluster.  
*  CQLSH shows 0 for count
*  data directory locally is empty under hints for all nodes.  
*  For all healthy nodes, tpstats shows no pending/active hints.
*  For the unhealthy node, TPstats shows 2 Active and 3 Pending hint ops  

From unhealthy node
cqlsh> use system;
cqlsh:system> select count(*) from hints ;

 count
-------
     0

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0              2         0                 0
RequestResponseStage              0         0              9         0                 0
MutationStage                     0         0       16471703         0                 0
ReadRepairStage                   0         0              0         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0            439         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       0         0             24         0                 0
FlushWriter                       0         0            175         0                 0
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0

[jira] [Created] (CASSANDRA-8463) Upgrading 2.0 to 2.1 causes LCS to recompact all files

2014-12-11 Thread Rick Branson (JIRA)
Rick Branson created CASSANDRA-8463:
---

 Summary: Upgrading 2.0 to 2.1 causes LCS to recompact all files
 Key: CASSANDRA-8463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8463
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Hardware is recent 2-socket, 16-core (x2 Hyperthreaded), 
144G RAM, solid-state storage.
Platform is Linux 3.2.51, Oracle JDK 64-bit 1.7.0_65.
Heap is 32G total, 4G newsize.
8G/8G on-heap/off-heap memtables, offheap_buffer allocator, 0.5 
memtable_cleanup_threshold
concurrent_compactors: 20
Reporter: Rick Branson


It appears that tables configured with LCS will completely re-compact 
themselves over some period of time after upgrading from 2.0 to 2.1 
(2.0.11 -> 2.1.2, specifically). It starts out with 10 pending tasks for an 
hour or so, then starts building up, now with 50-100 tasks pending across the 
cluster after 12 hours. These nodes are under heavy write load, but were 
easily able to keep up in 2.0 (they rarely had >5 pending compaction tasks), 
so I don't think it's LCS in 2.1 actually being worse, just perhaps some 
different LCS behavior that causes the layout of tables from 2.0 to prompt 
the compactor to reorganize them?

The nodes flushed ~11MB SSTables under 2.0. They're currently flushing ~36MB 
SSTables due to the improved memtable setup in 2.1. Before I upgraded the 
entire cluster to 2.1, I noticed the problem and tried several variations on 
the flush size, thinking perhaps the larger tables in L0 were causing some 
kind of cascading compactions. Even if they're sized roughly like the 2.0 
flushes were, the same behavior occurs. I also tried both enabling & 
disabling STCS in L0 with no real change other than L0 began to back up 
faster, so I left the STCS in L0 enabled.

Tables are configured with 32MB sstable_size_in_mb, which was found to be an 
improvement on the 160MB table size for compaction performance. Maybe this is 
wrong now? Otherwise, the tables are configured with defaults. Compaction has 
been unthrottled to help them catch-up. The compaction threads stay very busy, 
with the cluster-wide CPU at 45% nice time. No nodes have completely caught 
up yet. I'll update JIRA with status about their progress if anything 
interesting happens.

From a node around 12 hours ago, around an hour after the upgrade, with 19 
pending compaction tasks:
SSTables in each level: [6/4, 10, 105/100, 268, 0, 0, 0, 0, 0]
SSTables in each level: [6/4, 10, 106/100, 271, 0, 0, 0, 0, 0]
SSTables in each level: [1, 16/10, 105/100, 269, 0, 0, 0, 0, 0]
SSTables in each level: [5/4, 10, 103/100, 272, 0, 0, 0, 0, 0]
SSTables in each level: [4, 11/10, 105/100, 270, 0, 0, 0, 0, 0]
SSTables in each level: [1, 12/10, 105/100, 271, 0, 0, 0, 0, 0]
SSTables in each level: [1, 14/10, 104/100, 267, 0, 0, 0, 0, 0]
SSTables in each level: [9/4, 10, 103/100, 265, 0, 0, 0, 0, 0]

Recently, with 41 pending compaction tasks:
SSTables in each level: [4, 13/10, 106/100, 269, 0, 0, 0, 0, 0]
SSTables in each level: [4, 12/10, 106/100, 273, 0, 0, 0, 0, 0]
SSTables in each level: [5/4, 11/10, 106/100, 271, 0, 0, 0, 0, 0]
SSTables in each level: [4, 12/10, 103/100, 275, 0, 0, 0, 0, 0]
SSTables in each level: [2, 13/10, 106/100, 273, 0, 0, 0, 0, 0]
SSTables in each level: [3, 10, 104/100, 275, 0, 0, 0, 0, 0]
SSTables in each level: [6/4, 11/10, 103/100, 269, 0, 0, 0, 0, 0]
SSTables in each level: [4, 16/10, 105/100, 264, 0, 0, 0, 0, 0]

More information about the use case: writes are roughly uniform across these 
tables. The data is sharded across these 8 tables by key to improve 
compaction parallelism. Each node receives up to 75,000 writes/sec sustained at 
peak, and a small number of reads. This is a pre-production cluster that's 
being warmed up with new data, so the low volume of reads (~100/sec per node) 
is just from automatic sampled data checks, otherwise we'd just use STCS :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2014-12-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243008#comment-14243008
 ] 

Ariel Weisberg commented on CASSANDRA-8457:
---

 bq. To establish if there's likely a benefit to exploit, we could most likely 
refactor this code comparatively minimally (than rewriting to NIO/Netty) to 
make use of the SharedExecutorPool to establish if such a positive effect is 
indeed to be had, as this would reduce the number of threads in flight to those 
actually serving work on the OTCs. This wouldn't affect the ITC, but I am 
dubious of their contribution. We should probably also actually test if this is 
indeed a problem from clusters at scale performing in-memory CL1 reads.
I wonder what there is to be gained by having a single socket for 
inbound/outbound?

Running a representative test will take some doing. cstar doesn't support 
multiple stress clients and it seems like the clusters only have 3 nodes? This 
is another argument for getting some decent size performance runs in CI working 
rather than doing one-off manual tests. Having profiling artifacts collected as 
part of this would also make doing performance research and validation easier. 
I feel pretty under informed when we discuss what to do next due to the lack of 
profiling information and the lack of canonical/repeatable performance data and 
workloads.

 nio MessagingService
 

 Key: CASSANDRA-8457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
  Labels: performance
 Fix For: 3.0


 Thread-per-peer (actually two each incoming and outbound) is a big 
 contributor to context switching, especially for larger clusters.  Let's look 
 at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8429) Stress on trunk fails mixed workload on missing keys

2014-12-11 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-8429:
---
Fix Version/s: 2.1.3

 Stress on trunk fails mixed workload on missing keys
 

 Key: CASSANDRA-8429
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8429
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 14.04
Reporter: Ariel Weisberg
Assignee: Marcus Eriksson
 Fix For: 2.1.3

 Attachments: cluster.conf, run_stress.sh


 Starts as part of merge commit 25be46497a8df46f05ffa102bc645bfd684ea48a
 Stress will say that a key wasn't validated because it isn't returned even 
 though it's loaded. The key will eventually appear and can be queried using 
 cqlsh.
 Reproduce with
 #!/bin/sh
 ROWCOUNT=1000
 SCHEMA='-col n=fixed(1) -schema 
 compaction(strategy=LeveledCompactionStrategy) compression=LZ4Compressor'
 ./cassandra-stress write n=$ROWCOUNT -node xh61 -pop seq=1..$ROWCOUNT no-wrap 
 -rate threads=25 $SCHEMA
 ./cassandra-stress mixed ratio(read=2) n=1 -node xh61 -pop 
 dist=extreme(1..$ROWCOUNT,0.6) -rate threads=25 $SCHEMA



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements

2014-12-11 Thread Oded Peer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oded Peer updated CASSANDRA-7304:
-
Attachment: 7304-04.patch

I appreciate your comments; they are very helpful and I am learning a lot from 
them.
bq. In the spec, instead of changing the meaning of {{\[bytes\]}}, I would 
rather add a new {{\[value\]}} definition that support 'unset', and use that 
exclusively in the definition of values for bind variables in QUERY and EXECUTE 
messages, so as to make it clear that it makes no sense in any other place. I 
would then add a specific {{CBUtil.readBoundValue()}} to read those.
Done
bq. Making {{UNSET_CONSTANT_VALUE}} be {{new Value(null)}} is somewhat 
incorrect, it should be {{new Value(UNSET_BYTE_BUFFER)}} so that we don't lose 
the information that it's 'unset' if {{bindAndGet}} is used. For this 
reason, I'd prefer using {{Constants.UNSET_CONSTANT_VALUE}} (renamed as 
{{UNSET_VALUE}}) in collections too (instead of adding 
{{Lists.UNSET_LIST_VALUE}}, ...).
Done
bq. We can't have an 'unset' value inside a collection since we don't allow 
bind markers in the first place, and so there is a bit of useless 
code/validation related to that.
I verified the changes aren’t useless with unit tests.
bq. There is a bunch of place that don't handle 'UNSET_BYTE_BUFFER' properly: 
Tuples, ColumnCondition (we might want to reject queries for which all 
conditions are 'unset' as going through the paxos code for no reason feels like 
the user is doing something wrong) and SelectStatement where we could get an 
'unset' pretty much anywhere where the {{values()}} or {{bound()}} method of a 
{{Restriction}} is used (and validation might be tricky in SelectStatement: if 
we have {{SELECT * FROM foo WHERE k1 = ? AND k2 = ? AND k3 = ?}}, then we 
shouldn't accept an 'unset' for {{k2}} unless {{k3}} is also unset; note that 
I'd be fine just refusing 'unset' in selects for now to simplify, but we at 
least need the validation code to reject them).

I opted for rejecting unset values in selects. It’s not only to simplify; I 
think it’s the right thing to do. Having a variable assignment or a condition 
with unset variables is undefined.
bq. I'd reject 'unset' indexes in {{UPDATE ... SET l\[?\] = ?}} since it's 
rejected for map keys. Unless maybe if both the key/index and value are 
'unset', but that should be coherent for lists and maps.
Done
bq. In Constants.Marker.bindAndGet, we should skip validation if 'unset' (even 
though the validation will never fail because empty values are always accepted, 
it's still dodgy).
Done
bq. We should have separate error messages when we reject both {{null}} and 
{{unset}}.
Done
bq. I'd prefer rejecting 'unset' inside UDTs (and tuples). Making it equivalent 
to {{null}} gives it a different meaning than usual and we should avoid that.
Done
bq. For the limit in SelectStatement, it would make sense to accept unset and 
to have it mean no limit (instead of being rejected). The same applies for 
the timestamp and ttl in {{Attributes}}.
Done
bq. In CBUtil.readValue(), we should throw a ProtocolException instead of an 
IllegalArgumentException.
Done
bq. I might have put {{UNSET_BYTE_BUFFER}} in {{ByteBufferUtil}} since it's a 
{{ByteBuffer}}.
Done
bq. The patch appears to have windows end-of-line and a few weird indentations. 
Could you check that?
My apologies. I switched to Ubuntu.
bq. I'd have added an unset() in CQLTester to use in tests to make the tests 
terser.
Done
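
As a usage illustration (not part of the patch itself), a client on protocol v4 could leave a column out of a write simply by never binding it; the class and method names below follow the DataStax Java driver and are assumptions for this sketch:

{code}
import java.util.UUID;
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Hedged sketch of client-side behavior under the proposed 'unset' semantics.
public final class UnsetSketch
{
    static void insertWithoutEmail(Session session, UUID userId, String name)
    {
        PreparedStatement ps = session.prepare(
                "INSERT INTO ks.users (id, name, email) VALUES (?, ?, ?)");
        BoundStatement bound = ps.bind()
                .setUUID("id", userId)
                .setString("name", name);
        // 'email' is deliberately never bound: with this change it would be
        // sent as 'unset' and skipped, rather than bound to null and
        // tombstoned.
        session.execute(bound);
    }
}
{code}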



 Ability to distinguish between NULL and UNSET values in Prepared Statements
 ---

 Key: CASSANDRA-7304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Drew Kutcharian
Assignee: Oded Peer
  Labels: cql, protocolv4
 Fix For: 3.0

 Attachments: 7304-03.patch, 7304-04.patch, 7304-2.patch, 7304.patch


 Currently Cassandra inserts tombstones when a value of a column is bound to 
 NULL in a prepared statement. At higher insert rates managing all these 
 tombstones becomes an unnecessary overhead. This limits the usefulness of the 
 prepared statements since developers have to either create multiple prepared 
 statements (each with a different combination of column names, which at times 
 is just unfeasible because of the sheer number of possible combinations) or 
 fall back to using regular (non-prepared) statements.
 This JIRA is here to explore the possibility of either:
 A. Have a flag on prepared statements that once set, tells Cassandra to 
 ignore null columns
 or
 B. Have an UNSET value which makes Cassandra skip the null columns and not 
 tombstone them
 Basically, in the context of a prepared statement, a null value means delete, 
 but we don’t have anything that 

[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2014-12-11 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243076#comment-14243076
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim], Please find the changes for the decommission task - 
https://github.com/rnamboodiri/cassandra/commit/ca6c8e3788f6bdd54f21007524d10e287614cc88

Thanks

 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
 

 Key: CASSANDRA-7124
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
  Labels: lhf
 Fix For: 3.0

 Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
 cassandra-trunk-decommission-7124.txt


 If {{nodetool cleanup}} or some other long-running operation takes too long 
 to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
 tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
 this for repairs with JMX notifications.  We should do something similar for 
 nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2014-12-11 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243095#comment-14243095
 ] 

T Jake Luciani commented on CASSANDRA-8457:
---

bq. Running a representative test will take some doing. cstar doesn't support 
multiple stress clients and it seems like the clusters only have 3 nodes? 

But if you run with RF > 1 you can stress the internal network, which is what 
we are changing in this ticket

 nio MessagingService
 

 Key: CASSANDRA-8457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
  Labels: performance
 Fix For: 3.0


 Thread-per-peer (actually two each incoming and outbound) is a big 
 contributor to context switching, especially for larger clusters.  Let's look 
 at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7032) Improve vnode allocation

2014-12-11 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243148#comment-14243148
 ] 

Jon Haddad commented on CASSANDRA-7032:
---

Was the original 256 number chosen in hopes that it would minimize the chance 
of imbalance?  If so, would this patch result in recommending fewer vnodes, say 
16?  If so, I imagine that would result in less time-consuming repair as well 
as improvements to the spark driver, which, as of the last time I checked, did 
1 query per token to achieve data locality.

I would assume 16 nodes streaming data to a single one would still achieve the 
benefits of vnodes, but I'm just picking a number out of the air.  

 Improve vnode allocation
 

 Key: CASSANDRA-7032
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Branimir Lambov
  Labels: performance, vnodes
 Fix For: 3.0

 Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java, 
 TestVNodeAllocation.java


 It's been known for a little while that random vnode allocation causes 
 hotspots of ownership. It should be possible to improve dramatically on this 
 with deterministic allocation. I have quickly thrown together a simple greedy 
 algorithm that allocates vnodes efficiently, and will repair hotspots in a 
 randomly allocated cluster gradually as more nodes are added, and also 
 ensures that token ranges are fairly evenly spread between nodes (somewhat 
 tunably so). The allocation still permits slight discrepancies in ownership, 
 but it is bound by the inverse of the size of the cluster (as opposed to 
 random allocation, which strangely gets worse as the cluster size increases). 
 I'm sure there is a decent dynamic programming solution to this that would be 
 even better.
 If on joining the ring a new node were to CAS a shared table where a 
 canonical allocation of token ranges lives after running this (or a similar) 
 algorithm, we could then get guaranteed bounds on the ownership distribution 
 in a cluster. This will also help for CASSANDRA-6696.
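
 To make the greedy idea concrete, a toy version of a single allocation step
 is sketched below, with tokens modeled as doubles on a unit ring; it is
 illustrative only and is not the attached TestVNodeAllocation.java:
{code}
import java.util.TreeSet;

// Toy greedy step: split the largest token range, i.e. target the biggest
// ownership hotspot first. Illustrative only; all names are invented.
public final class GreedySplitSketch
{
    static double nextToken(TreeSet<Double> tokens)
    {
        double bestStart = 0, bestSpan = -1;
        double prev = tokens.last() - 1.0;        // wraparound predecessor
        for (double t : tokens)
        {
            double span = t - prev;               // size of range ending at t
            if (span > bestSpan)
            {
                bestSpan = span;
                bestStart = prev;
            }
            prev = t;
        }
        return (bestStart + bestSpan / 2 + 1.0) % 1.0;  // midpoint, wrapped
    }

    public static void main(String[] args)
    {
        TreeSet<Double> ring = new TreeSet<>();
        ring.add(0.1); ring.add(0.2); ring.add(0.9);  // a lopsided ring
        System.out.println(nextToken(ring));          // 0.55: splits (0.2, 0.9)
    }
}
{code}
 Repeating such a step for every vnode a joining node must claim gives the
 flavor of the greedy allocation; as described above, the actual algorithm
 also keeps token ranges fairly evenly spread between nodes.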



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8464) Support direct buffer decompression for reads

2014-12-11 Thread T Jake Luciani (JIRA)
T Jake Luciani created CASSANDRA-8464:
-

 Summary: Support direct buffer decompression for reads
 Key: CASSANDRA-8464
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8464
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 3.0


Currently when we read a compressed sstable we copy the data on heap, then send 
it to be decompressed into another on-heap buffer (albeit pooled).

But now both snappy and lz4 (with CASSANDRA-7039) allow decompression of direct 
byte buffers. This lets us mmap the data and decompress completely off heap 
(and avoids moving bytes over JNI).

One issue is performing the checksum off heap, but Adler32 does support this in 
Java 8 (it's also in Java 7 but marked private?!)

This change yields a >10% boost in read performance on cstar. Locally I see 
up to 30% improvement.

http://cstar.datastax.com/graph?stats=5ebcdd70-816b-11e4-aed6-42010af0688fmetric=op_rateoperation=2_readsmoothing=1show_aggregates=truexmin=0xmax=200.09ymin=0ymax=135908.3
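
For reference, the Java 8 capability mentioned above can be sketched minimally as follows; Adler32.update(ByteBuffer) is the real JDK method, while the surrounding class is illustrative:

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

// Minimal sketch: checksumming a direct (off-heap) buffer without first
// copying it on-heap. Adler32.update(ByteBuffer) is public as of Java 8.
public final class OffHeapChecksumSketch
{
    static long checksum(ByteBuffer buf)
    {
        Adler32 adler = new Adler32();
        adler.update(buf);              // consumes buf from position to limit
        return adler.getValue();
    }

    public static void main(String[] args)
    {
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        direct.put("stand-in for an mmap'd compressed chunk"
                           .getBytes(StandardCharsets.UTF_8));
        direct.flip();
        System.out.println(Long.toHexString(checksum(direct)));
    }
}
{code}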





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2014-12-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243181#comment-14243181
 ] 

Jonathan Ellis commented on CASSANDRA-7937:
---

bq. So in this situation the backpressure information received by the client 
could be used properly, as it would just be understood by the client as a 
request to slow down for this particular replica, it could therefore pick 
another replica.

That is a good point.  However, it's only really useful for reads, since writes 
are always sent to all replicas.  And unfortunately writes are by far a bigger 
problem because of the memory pressure they generate (in queues, as well as in 
the memtable).  I've never seen a node OOM and fall over from too many reads.

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
  Labels: performance

 When writing huge amounts of data into C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
 This is because analytic tools typically write data as fast as they can in 
 parallel, from many nodes and they are not artificially rate-limited, so C* 
 is the bottleneck here. Also, increasing the number of nodes doesn't really 
 help, because in a collocated setup this also increases number of 
 Hadoop/Spark nodes (writers) and although possible write performance is 
 higher, the problem still remains.
 We observe the following behavior:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. the available memory limit for memtables is reached and writes are no 
 longer accepted
 3. the application gets hit by write timeout, and retries repeatedly, in 
 vain 
 4. after several failed attempts to write, the job gets aborted 
 Desired behaviour:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. after exceeding some memtable fill threshold, C* applies adaptive rate 
 limiting to writes - the more the buffers are filled-up, the less writes/s 
 are accepted, however writes still occur within the write timeout.
 3. thanks to slowed down data ingestion, now flush can finish before all the 
 memory gets used
 Of course the details how rate limiting could be done are up for a discussion.
 It may be also worth considering putting such logic into the driver, not C* 
 core, but then C* needs to expose at least the following information to the 
 driver, so we could calculate the desired maximum data rate:
 1. current amount of memory available for writes before they would completely 
 block
 2. total amount of data queued to be flushed and flush progress (amount of 
 data to flush remaining for the memtable currently being flushed)
 3. average flush write speed
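
 As a rough illustration of the adaptive rate limiting described above, a toy
 sketch follows; every name is invented and none of this is Cassandra
 internals or part of any patch:
{code}
import com.google.common.util.concurrent.RateLimiter;

// Toy sketch of memtable-fill-driven write throttling; illustrative only.
public final class AdaptiveWriteThrottleSketch
{
    private final double baseWritesPerSecond;
    private final RateLimiter limiter;

    AdaptiveWriteThrottleSketch(double baseWritesPerSecond)
    {
        this.baseWritesPerSecond = baseWritesPerSecond;
        this.limiter = RateLimiter.create(baseWritesPerSecond);
    }

    // The fuller the memtable buffers, the fewer writes/s are admitted,
    // giving flush a chance to finish before memory runs out.
    void onMemtableUsage(long usedBytes, long limitBytes)
    {
        double fill = usedBytes / (double) limitBytes;
        double fraction = fill < 0.5 ? 1.0 : Math.max(0.1, 1.0 - fill);
        limiter.setRate(baseWritesPerSecond * fraction);
    }

    void admitWrite()
    {
        limiter.acquire();   // blocks just long enough to honor current rate
    }
}
{code}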



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8462) Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes

2014-12-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8462:
--
Assignee: Aleksey Yeschenko

 Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes
 -

 Key: CASSANDRA-8462
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8462
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Rick Branson
Assignee: Aleksey Yeschenko

 Added a 2.1.2 node to a cluster running 2.0.11. Didn't make any schema 
 changes. When I tried to reboot one of the 2.0 nodes, it failed to boot with 
 this exception. Besides an obvious fix, any workarounds for this?
 {code}
 java.lang.IllegalArgumentException: No enum constant 
 org.apache.cassandra.config.CFMetaData.Caching.{"keys":"ALL", 
 "rows_per_partition":"NONE"}
 at java.lang.Enum.valueOf(Enum.java:236)
 at 
 org.apache.cassandra.config.CFMetaData$Caching.valueOf(CFMetaData.java:286)
 at 
 org.apache.cassandra.config.CFMetaData.fromSchemaNoColumnsNoTriggers(CFMetaData.java:1713)
 at 
 org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1793)
 at 
 org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:307)
 at 
 org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:288)
 at 
 org.apache.cassandra.db.DefsTables.loadFromKeyspace(DefsTables.java:131)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:529)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:270)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
 {code}
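
 For clarity, a hedged minimal reproduction of the failure mode is sketched
 below; the enum constants mirror 2.0's CFMetaData.Caching, everything else
 is illustrative:
{code}
// Hedged repro sketch, not actual Cassandra code: 2.1 rewrites the caching
// option as a JSON-style string, which a plain enum lookup on 2.0 rejects.
public final class CachingEnumRepro
{
    enum Caching { ALL, KEYS_ONLY, ROWS_ONLY, NONE }

    public static void main(String[] args)
    {
        // Throws java.lang.IllegalArgumentException: No enum constant ...
        Caching.valueOf("{\"keys\":\"ALL\", \"rows_per_partition\":\"NONE\"}");
    }
}
{code}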



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8463) Upgrading 2.0 to 2.1 causes LCS to recompact all files

2014-12-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8463:
--
Assignee: Marcus Eriksson

 Upgrading 2.0 to 2.1 causes LCS to recompact all files
 --

 Key: CASSANDRA-8463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8463
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Hardware is recent 2-socket, 16-core (x2 Hyperthreaded), 
 144G RAM, solid-state storage.
 Platform is Linux 3.2.51, Oracle JDK 64-bit 1.7.0_65.
 Heap is 32G total, 4G newsize.
 8G/8G on-heap/off-heap memtables, offheap_buffer allocator, 0.5 
 memtable_cleanup_threshold
 concurrent_compactors: 20
Reporter: Rick Branson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 It appears that tables configured with LCS will completely re-compact 
 themselves over some period of time after upgrading from 2.0 to 2.1 
 (2.0.11 -> 2.1.2, specifically). It starts out with 10 pending tasks for an 
 hour or so, then starts building up, now with 50-100 tasks pending across the 
 cluster after 12 hours. These nodes are under heavy write load, but were 
 easily able to keep up in 2.0 (they rarely had >5 pending compaction tasks), 
 so I don't think it's LCS in 2.1 actually being worse, just perhaps some 
 different LCS behavior that causes the layout of tables from 2.0 to prompt 
 the compactor to reorganize them?
 The nodes flushed ~11MB SSTables under 2.0. They're currently flushing ~36MB 
 SSTables due to the improved memtable setup in 2.1. Before I upgraded the 
 entire cluster to 2.1, I noticed the problem and tried several variations on 
 the flush size, thinking perhaps the larger tables in L0 were causing some 
 kind of cascading compactions. Even if they're sized roughly like the 2.0 
 flushes were, the same behavior occurs. I also tried both enabling & 
 disabling STCS in L0 with no real change other than L0 began to back up 
 faster, so I left the STCS in L0 enabled.
 Tables are configured with 32MB sstable_size_in_mb, which was found to be an 
 improvement on the 160MB table size for compaction performance. Maybe this is 
 wrong now? Otherwise, the tables are configured with defaults. Compaction has 
 been unthrottled to help them catch-up. The compaction threads stay very 
 busy, with the cluster-wide CPU at 45% nice time. No nodes have completely 
 caught up yet. I'll update JIRA with status about their progress if anything 
 interesting happens.
 From a node around 12 hours ago, around an hour after the upgrade, with 19 
 pending compaction tasks:
 SSTables in each level: [6/4, 10, 105/100, 268, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 16/10, 105/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 10, 103/100, 272, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 11/10, 105/100, 270, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 12/10, 105/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 14/10, 104/100, 267, 0, 0, 0, 0, 0]
 SSTables in each level: [9/4, 10, 103/100, 265, 0, 0, 0, 0, 0]
 Recently, with 41 pending compaction tasks:
 SSTables in each level: [4, 13/10, 106/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 11/10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 103/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [2, 13/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [3, 10, 104/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 11/10, 103/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 16/10, 105/100, 264, 0, 0, 0, 0, 0]
 More information about the use case: writes are roughly uniform across these 
 tables. The data is sharded across these 8 tables by key to improve 
 compaction parallelism. Each node receives up to 75,000 writes/sec sustained 
 at peak, and a small number of reads. This is a pre-production cluster that's 
 being warmed up with new data, so the low volume of reads (~100/sec per node) 
 is just from automatic sampled data checks, otherwise we'd just use STCS :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8463) Upgrading 2.0 to 2.1 causes LCS to recompact all files

2014-12-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8463:
--
Fix Version/s: 2.1.3

 Upgrading 2.0 to 2.1 causes LCS to recompact all files
 --

 Key: CASSANDRA-8463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8463
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Hardware is recent 2-socket, 16-core (x2 Hyperthreaded), 
 144G RAM, solid-state storage.
 Platform is Linux 3.2.51, Oracle JDK 64-bit 1.7.0_65.
 Heap is 32G total, 4G newsize.
 8G/8G on-heap/off-heap memtables, offheap_buffer allocator, 0.5 
 memtable_cleanup_threshold
 concurrent_compactors: 20
Reporter: Rick Branson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 It appears that tables configured with LCS will completely re-compact 
 themselves over some period of time after upgrading from 2.0 to 2.1 
 (2.0.11 -> 2.1.2, specifically). It starts out with 10 pending tasks for an 
 hour or so, then starts building up, now with 50-100 tasks pending across the 
 cluster after 12 hours. These nodes are under heavy write load, but were 
 easily able to keep up in 2.0 (they rarely had >5 pending compaction tasks), 
 so I don't think it's LCS in 2.1 actually being worse, just perhaps some 
 different LCS behavior that causes the layout of tables from 2.0 to prompt 
 the compactor to reorganize them?
 The nodes flushed ~11MB SSTables under 2.0. They're currently flushing ~36MB 
 SSTables due to the improved memtable setup in 2.1. Before I upgraded the 
 entire cluster to 2.1, I noticed the problem and tried several variations on 
 the flush size, thinking perhaps the larger tables in L0 were causing some 
 kind of cascading compactions. Even if they're sized roughly like the 2.0 
 flushes were, the same behavior occurs. I also tried both enabling & 
 disabling STCS in L0 with no real change other than L0 began to back up 
 faster, so I left the STCS in L0 enabled.
 Tables are configured with 32MB sstable_size_in_mb, which was found to be an 
 improvement on the 160MB table size for compaction performance. Maybe this is 
 wrong now? Otherwise, the tables are configured with defaults. Compaction has 
 been unthrottled to help them catch-up. The compaction threads stay very 
 busy, with the cluster-wide CPU at 45% nice time. No nodes have completely 
 caught up yet. I'll update JIRA with status about their progress if anything 
 interesting happens.
 From a node around 12 hours ago, around an hour after the upgrade, with 19 
 pending compaction tasks:
 SSTables in each level: [6/4, 10, 105/100, 268, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 16/10, 105/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 10, 103/100, 272, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 11/10, 105/100, 270, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 12/10, 105/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 14/10, 104/100, 267, 0, 0, 0, 0, 0]
 SSTables in each level: [9/4, 10, 103/100, 265, 0, 0, 0, 0, 0]
 Recently, with 41 pending compaction tasks:
 SSTables in each level: [4, 13/10, 106/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 11/10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 103/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [2, 13/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [3, 10, 104/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 11/10, 103/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 16/10, 105/100, 264, 0, 0, 0, 0, 0]
 More information about the use case: writes are roughly uniform across these 
 tables. The data is sharded across these 8 tables by key to improve 
 compaction parallelism. Each node receives up to 75,000 writes/sec sustained 
 at peak, and a small number of reads. This is a pre-production cluster that's 
 being warmed up with new data, so the low volume of reads (~100/sec per node) 
 is just from automatic sampled data checks; otherwise we'd just use STCS :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2014-12-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243188#comment-14243188
 ] 

Ariel Weisberg commented on CASSANDRA-8457:
---

Thanks, Jake, that is a good point. 3 nodes is still a problem, as it allows 
these threads to get a lot hotter than they normally would in a larger 
cluster.

I will try Benedict's suggestion since that would be easy to put in.


 nio MessagingService
 

 Key: CASSANDRA-8457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
  Labels: performance
 Fix For: 3.0


 Thread-per-peer (actually two each, incoming and outbound) is a big 
 contributor to context switching, especially for larger clusters.  Let's look 
 at switching to nio, possibly via Netty.
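 As a hedged sketch of the Netty direction only (the handler pipeline, thread 
 counts, and port below are placeholders, not a proposed implementation):
 {code:title=hedged Netty sketch}
 import io.netty.bootstrap.ServerBootstrap;
 import io.netty.channel.ChannelInitializer;
 import io.netty.channel.EventLoopGroup;
 import io.netty.channel.nio.NioEventLoopGroup;
 import io.netty.channel.socket.SocketChannel;
 import io.netty.channel.socket.nio.NioServerSocketChannel;
 
 public class NioMessagingSketch
 {
     public static void main(String[] args) throws InterruptedException
     {
         // A small, fixed pool of event-loop threads multiplexes all peer
         // connections, instead of two dedicated threads per peer.
         EventLoopGroup acceptGroup = new NioEventLoopGroup(1);
         EventLoopGroup ioGroup = new NioEventLoopGroup(4);
         try
         {
             ServerBootstrap bootstrap = new ServerBootstrap()
                 .group(acceptGroup, ioGroup)
                 .channel(NioServerSocketChannel.class)
                 .childHandler(new ChannelInitializer<SocketChannel>()
                 {
                     protected void initChannel(SocketChannel ch)
                     {
                         // framing + message deserialization handlers go here
                     }
                 });
             bootstrap.bind(7000).sync().channel().closeFuture().sync();
         }
         finally
         {
             acceptGroup.shutdownGracefully();
             ioGroup.shutdownGracefully();
         }
     }
 }
 {code}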



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2014-12-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243239#comment-14243239
 ] 

Michaël Figuière commented on CASSANDRA-7937:
-

bq. That is a good point. However, it's only really useful for reads, since 
writes are always sent to all replicas. And unfortunately writes are by far a 
bigger problem because of the memory pressure they generate (in queues, as well 
as in the memtable). I've never seen a node OOM and fall over from too many 
reads.

Indeed, for reads with CL=1 this will bring appropriate backpressure for each 
replica.

For writes, the appropriate backpressure is for the clients to slow down their 
rate for all the replicas, that is, for the entire partition, as you don't want 
to lose it. We could actually get that with this mechanism: the window size of 
each of the replicas would be reduced due to the heavy load they experience, 
and when token awareness is enabled on the client, it could avoid balancing to 
another node when reaching the maximum allowed concurrent-requests threshold 
for each replica, if configured to do so.

Now if the entire cluster starts to be overloaded, this mechanism would make 
sure that the clients slow down their traffic, as there's no point in hammering 
an already overloaded cluster.
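
A minimal sketch of what such a per-replica window could look like on the 
client side; all class and method names are illustrative, not part of any real 
driver API:
{code:title=hypothetical driver-side limiter}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;

public class ReplicaWindowLimiter
{
    // One window per replica; the server could shrink these under load.
    private final ConcurrentMap<String, Semaphore> windows = new ConcurrentHashMap<>();
    private final int defaultWindow;

    public ReplicaWindowLimiter(int defaultWindow)
    {
        this.defaultWindow = defaultWindow;
    }

    // Blocks until every replica of the partition has a free slot, so one
    // overloaded replica slows writes for the whole partition.  (A real
    // implementation would acquire atomically to avoid partial holds.)
    public void acquireForWrite(Iterable<String> replicas) throws InterruptedException
    {
        for (String replica : replicas)
            windowFor(replica).acquire();
    }

    public void release(Iterable<String> replicas)
    {
        for (String replica : replicas)
            windowFor(replica).release();
    }

    private Semaphore windowFor(String replica)
    {
        Semaphore window = windows.get(replica);
        if (window == null)
        {
            Semaphore created = new Semaphore(defaultWindow);
            window = windows.putIfAbsent(replica, created);
            if (window == null)
                window = created;
        }
        return window;
    }
}
{code}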

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
  Labels: performance

 When writing huge amounts of data into a C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that C* often can't keep up with the load. 
 This is because analytic tools typically write data as fast as they can, in 
 parallel, from many nodes, and they are not artificially rate-limited, so C* 
 is the bottleneck here. Also, increasing the number of nodes doesn't really 
 help, because in a collocated setup this also increases the number of 
 Hadoop/Spark nodes (writers), and although the possible write performance is 
 higher, the problem still remains.
 We observe the following behavior:
 1. data is ingested at an extremely fast pace into memtables and the flush 
 queue fills up
 2. the available memory limit for memtables is reached and writes are no 
 longer accepted
 3. the application gets hit by write timeouts, and retries repeatedly, in 
 vain 
 4. after several failed attempts to write, the job gets aborted 
 Desired behavior:
 1. data is ingested at an extremely fast pace into memtables and the flush 
 queue fills up
 2. after exceeding some memtable fill threshold, C* applies adaptive rate 
 limiting to writes - the more the buffers are filled up, the fewer writes/s 
 are accepted, however writes still complete within the write timeout
 3. thanks to the slowed-down data ingestion, the flush can now finish before 
 all the memory gets used
 Of course, the details of how rate limiting could be done are up for 
 discussion. It may also be worth considering putting such logic into the 
 driver, not the C* core, but then C* needs to expose at least the following 
 information to the driver, so we could calculate the desired maximum data 
 rate (see the sketch after this list):
 1. current amount of memory available for writes before they would completely 
 block
 2. total amount of data queued to be flushed and flush progress (amount of 
 data to flush remaining for the memtable currently being flushed)
 3. average flush write speed
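 As a hedged illustration only, here is one way a driver could turn those 
 three numbers into a rate cap; the class and the formula are assumptions, not 
 an existing API:
 {code:title=illustrative rate-cap calculation}
 public final class WriteRateEstimator
 {
     /**
      * Estimate a writes-per-second cap from the three numbers above:
      * (1) memory still available before writes block,
      * (2) data queued to be flushed, and
      * (3) average flush write speed.
      * avgMutationBytes is the client's own estimate of mutation size.
      */
     public static double maxWritesPerSecond(long freeMemtableBytes,
                                             long queuedFlushBytes,
                                             double flushBytesPerSec,
                                             double avgMutationBytes)
     {
         // Seconds until the current flush backlog drains.
         double drainSeconds = queuedFlushBytes / Math.max(flushBytesPerSec, 1.0);
         // Ingest no more than fits in free memory over that window, plus
         // whatever the flusher frees up as it goes.
         double byteBudgetPerSec = freeMemtableBytes / Math.max(drainSeconds, 1.0)
                                 + flushBytesPerSec;
         return byteBudgetPerSec / Math.max(avgMutationBytes, 1.0);
     }
 }
 {code}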



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7708) UDF schema change events/results

2014-12-11 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7708:

Attachment: 7708-1.txt

 UDF schema change events/results
 

 Key: CASSANDRA-7708
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7708
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Robert Stupp
Assignee: Robert Stupp
  Labels: protocolv4
 Fix For: 3.0

 Attachments: 7708-1.txt


 Schema change notifications for UDFs might be interesting for clients.
 This covers both the results of {{CREATE}} + {{DROP}} statements and events.
 Just adding {{FUNCTION}} as a new target for these events breaks the previous 
 native protocol contract.
 The proposal is to introduce a new target {{FUNCTION}} in native protocol v4.
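 To illustrate why the new target has to be gated on protocol v4, a hedged 
 sketch; the enum and parsing code are illustrative, not the actual transport 
 classes:
 {code:title=illustrative sketch, not the actual transport code}
 enum SchemaChangeTarget { KEYSPACE, TABLE, TYPE, FUNCTION }
 
 final class SchemaChangeEvents
 {
     static SchemaChangeTarget parseTarget(String wire, int protocolVersion)
     {
         // A v3 client only knows KEYSPACE/TABLE/TYPE; an unknown target
         // string would make valueOf() throw, breaking the old contract.
         SchemaChangeTarget target = SchemaChangeTarget.valueOf(wire);
         if (target == SchemaChangeTarget.FUNCTION && protocolVersion < 4)
             throw new IllegalStateException("FUNCTION events require protocol v4");
         return target;
     }
 }
 {code}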



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7708) UDF schema change events/results

2014-12-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243243#comment-14243243
 ] 

Robert Stupp commented on CASSANDRA-7708:
-

Who can review the patch?

 UDF schema change events/results
 

 Key: CASSANDRA-7708
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7708
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Robert Stupp
Assignee: Robert Stupp
  Labels: protocolv4
 Fix For: 3.0

 Attachments: 7708-1.txt


 Schema change notifications for UDFs might be interesting for clients.
 This covers both the results of {{CREATE}} + {{DROP}} statements and events.
 Just adding {{FUNCTION}} as a new target for these events breaks the previous 
 native protocol contract.
 The proposal is to introduce a new target {{FUNCTION}} in native protocol v4.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8374) Better support of null for UDF

2014-12-11 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-8374:

Assignee: Robert Stupp

 Better support of null for UDF
 --

 Key: CASSANDRA-8374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Robert Stupp
 Fix For: 3.0


 Currently, every function needs to deal with its arguments potentially being 
 {{null}}. There are many cases where that's just annoying; users should be 
 able to define a function like:
 {noformat}
 CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
 {noformat}
 without having it crash as soon as a column it's applied to doesn't have a 
 value for some rows (I'll note that this definition apparently cannot be 
 compiled currently, which should be looked into).
 In fact, I think that by default methods shouldn't have to care about 
 {{null}} values: if the value is {{null}}, we should not call the method at 
 all and return {{null}}. There are still methods that may explicitly want to 
 handle {{null}} (to return a default value for instance), so maybe we can add 
 an {{ALLOW NULLS}} to the creation syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8192) AssertionError in Memory.java

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8192:
---
Attachment: 8192_v1.txt

Looking at the raw data in these files, in both of the attached cases the file 
size is similar to the others, however they are filled with 0's, meaning none 
of the parameters that CompressionMetadata needs to pull out are correctly 
set.  This worked fine on 2.0.10, as we didn't query the chunkOffsetsSize in 
the constructor and only queried that data lazily - we changed our behavior to 
store it at construction for CASSANDRA-6916.

On 2.0.10, a simple {{select * from sstable_activity}} should get you an 
exception that would uncover that you have corrupt data.

Attaching a patch for 2.1 to check for 0 chunks in a compressed file and throw 
an IOException if that's encountered.
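
A minimal sketch of the guard described above, with assumed names (the actual 
patch may differ in where and how it hooks in):
{code:title=sketch of the zero-chunk guard}
import java.io.IOException;

final class CompressionChunkGuard
{
    static void validateChunkCount(String indexFilePath, int chunkCount) throws IOException
    {
        // A CompressionInfo component zeroed out on disk deserializes to a
        // chunk count of 0; fail fast with a checked exception instead of
        // tripping the AssertionError in Memory.size().
        if (chunkCount <= 0)
            throw new IOException("Compressed file with 0 chunks, likely corrupt: " + indexFilePath);
    }
}
{code}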

 AssertionError in Memory.java
 -

 Key: CASSANDRA-8192
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8192
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows-7-32 bit, 3GB RAM, Java 1.7.0_67
Reporter: Andreas Schnitzerling
Assignee: Joshua McKenzie
 Fix For: 2.1.3

 Attachments: 8192_v1.txt, cassandra.bat, cassandra.yaml, 
 logdata-onlinedata-ka-196504-CompressionInfo.zip, printChunkOffsetErrors.txt, 
 system-compactions_in_progress-ka-47594-CompressionInfo.zip, 
 system-sstable_activity-jb-25-Filter.zip, system.log, system_AssertionTest.log


 Since update of 1 of 12 nodes from 2.1.0-rel to 2.1.1-rel Exception during 
 start up.
 {panel:title=system.log}
 ERROR [SSTableBatchOpen:1] 2014-10-27 09:44:00,079 CassandraDaemon.java:153 - 
 Exception in thread Thread[SSTableBatchOpen:1,5,main]
 java.lang.AssertionError: null
   at org.apache.cassandra.io.util.Memory.size(Memory.java:307) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:135)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
  ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at 
 org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
 ~[apache-cassandra-2.1.1.jar:2.1.1]
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
 ~[na:1.7.0_55]
   at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
 [na:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
 [na:1.7.0_55]
   at java.lang.Thread.run(Unknown Source) [na:1.7.0_55]
 {panel}
 In the attached log you can still see as well CASSANDRA-8069 and 
 CASSANDRA-6283.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8465) Phase 1: Remove Singleton and break statics into classes

2014-12-11 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-8465:
--

 Summary: Phase 1: Remove Singleton and break statics into classes
 Key: CASSANDRA-8465
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8465
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Joshua McKenzie
 Fix For: 3.0


1:  Convert StorageProxy into a non-singleton (see the sketch after this list)

2:  Writes
* Regular
* Counter
* RegularBatch
* CounterBatch
* AtomicBatch

3:  Reads
* Regular
* Range

4:  LightweightTransaction
* Write
* Read

5: Truncate
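
A rough sketch of the direction item 1 describes; the collaborator types are 
placeholders, not the real classes:
{code:title=rough sketch, hypothetical collaborators}
public class StorageProxy
{
    private final WritePath writePath;
    private final ReadPath readPath;

    // Dependencies are injected instead of reached through static
    // singletons, so tests can construct a StorageProxy with fakes.
    public StorageProxy(WritePath writePath, ReadPath readPath)
    {
        this.writePath = writePath;
        this.readPath = readPath;
    }
}

interface WritePath {}
interface ReadPath {}
{code}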



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-8443) Phase 1: Refactor StorageProxy read path into separate classes

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-8443:
---


 Phase 1: Refactor StorageProxy read path into separate classes
 --

 Key: CASSANDRA-8443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8443
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie

 Refactor the read path inside StorageProxy into separate classes.
 All are Request/Response pairs:
 * Regular
 * Range
 Keep them synchronous for now and just break it out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-8442) Phase 1: Refactor StorageProxy write path into separate classes

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-8442:
---


 Phase 1: Refactor StorageProxy write path into separate classes
 ---

 Key: CASSANDRA-8442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8442
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie

 Refactor the write path inside StorageProxy into separate classes.
 All are Request/Response pairs:
 * Regular
 * Counter
 * RegularBatch
 * CounterBatch
 * AtomicBatch
 Keep them synchronous for now and just break it out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-8445) Phase 1: Refactor StorageProxy truncate into separate class

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-8445:
---


 Phase 1: Refactor StorageProxy truncate into separate class
 ---

 Key: CASSANDRA-8445
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8445
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie

 Refactor truncation into a separate class and keep it synchronous.  Should be 
 pretty trivial.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-8441) Phase 1: Un-Singleton the StorageProxy

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-8441:
---


 Phase 1: Un-Singleton the StorageProxy
 --

 Key: CASSANDRA-8441
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8441
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie

 To test and refactor the StorageProxy, it'll help for it not to be a singleton.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-8444) Phase 1: Refactor StorageProxy LightWeightTransactions into separate classes

2014-12-11 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-8444:
---


 Phase 1: Refactor StorageProxy LightWeightTransactions into separate classes
 

 Key: CASSANDRA-8444
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8444
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie

 Refactor the lightweight transaction paths inside StorageProxy into separate 
 classes.
 Request/Response pairs for:
 LightWeightTransactionRead
 LightWeightTransactionWrite
 Keep them synchronous for now and just break it out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8374) Better support of null for UDF

2014-12-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243354#comment-14243354
 ] 

Robert Stupp edited comment on CASSANDRA-8374 at 12/11/14 11:32 PM:


Added optional {{ALLOW NULL}} to arguments and return type.
For _java_ source UDFs {{ALLOW NULL}} also means that the Java primitive types 
(e.g. {{int}} instead of {{j.l.Integer}}) are used making the Java source much 
nicer:
{code:title=current state}
return val == null ? null : Double.valueOf(Math.sin(val.doubleValue()));
{code}
becomes
{code:title=with ALLOW NULL for arg and return}
return Math.sin(val);
{code}

I'm just thinking of whether to use {{ALLOW NULL}} for each individual argument 
and the return type or to use {{ALLOW NULLS}} globally.

It's not much effort to additionally allow something like {{DEFAULT a_value}} 
for a UDF argument. Seems to be a nice option.

(The linked git branch is not ready for review yet)


was (Author: snazy):
Added optional {{ALLOW NULL}} to arguments and return type.
For _java_ source UDFs {{ALLOW NULL}} also means that the Java primitive types 
(e.g. {{int}} instead of {{j.l.Integer}}) are used making the Java source much 
nicer:
{code:title=current state}
return val == null ? null : Double.valueOf(Math.sin(val.doubleValue()));
{code}
becomes
{code:title=with ALLOW NULL for arg and return}
return Math.sin(val);
{code}

I'm just thinking of whether to use {{ALLOW NULL}} for each individual argument 
and the return type or to use {{ALLOW NULLS}} globally.

It's not much effort to alternatively allow something like {{DEFAULT a_value}} 
for a UDF argument. Seems to be a nice option.

(The linked git branch is not ready for review yet)

 Better support of null for UDF
 --

 Key: CASSANDRA-8374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Robert Stupp
 Fix For: 3.0


 Currently, every function needs to deal with its arguments potentially being 
 {{null}}. There are many cases where that's just annoying; users should be 
 able to define a function like:
 {noformat}
 CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
 {noformat}
 without having it crash as soon as a column it's applied to doesn't have a 
 value for some rows (I'll note that this definition apparently cannot be 
 compiled currently, which should be looked into).
 In fact, I think that by default methods shouldn't have to care about 
 {{null}} values: if the value is {{null}}, we should not call the method at 
 all and return {{null}}. There are still methods that may explicitly want to 
 handle {{null}} (to return a default value for instance), so maybe we can add 
 an {{ALLOW NULLS}} to the creation syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8374) Better support of null for UDF

2014-12-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243354#comment-14243354
 ] 

Robert Stupp commented on CASSANDRA-8374:
-

Added optional {{ALLOW NULL}} to arguments and return type.
For _java_ source UDFs {{ALLOW NULL}} also means that the Java primitive types 
(e.g. {{int}} instead of {{j.l.Integer}}) are used making the Java source much 
nicer:
{code:title=current state}
return val == null ? null : Double.valueOf(Math.sin(val.doubleValue()));
{code}
becomes
{code:title=with ALLOW NULL for arg and return}
return Math.sin(val);
{code}

I'm just thinking of whether to use {{ALLOW NULL}} for each individual argument 
and the return type or to use {{ALLOW NULLS}} globally.

It's not much effort to alternatively allow something like {{DEFAULT a_value}} 
for a UDF argument. Seems to be a nice option.

(The linked git branch is not ready for review yet)

 Better support of null for UDF
 --

 Key: CASSANDRA-8374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Robert Stupp
 Fix For: 3.0


 Currently, every function needs to deal with its arguments potentially being 
 {{null}}. There are many cases where that's just annoying; users should be 
 able to define a function like:
 {noformat}
 CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
 {noformat}
 without having it crash as soon as a column it's applied to doesn't have a 
 value for some rows (I'll note that this definition apparently cannot be 
 compiled currently, which should be looked into).
 In fact, I think that by default methods shouldn't have to care about 
 {{null}} values: if the value is {{null}}, we should not call the method at 
 all and return {{null}}. There are still methods that may explicitly want to 
 handle {{null}} (to return a default value for instance), so maybe we can add 
 an {{ALLOW NULLS}} to the creation syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7708) UDF schema change events/results

2014-12-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7708:
--
Reviewer: Tyler Hobbs

[~thobbs]?

 UDF schema change events/results
 

 Key: CASSANDRA-7708
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7708
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Robert Stupp
Assignee: Robert Stupp
  Labels: protocolv4
 Fix For: 3.0

 Attachments: 7708-1.txt


 Schema change notifications for UDFs might be interesting for clients.
 This covers both the results of {{CREATE}} + {{DROP}} statements and events.
 Just adding {{FUNCTION}} as a new target for these events breaks the previous 
 native protocol contract.
 The proposal is to introduce a new target {{FUNCTION}} in native protocol v4.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8374) Better support of null for UDF

2014-12-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243444#comment-14243444
 ] 

Robert Stupp commented on CASSANDRA-8374:
-

To clarify - the syntax would be:
{code}
CREATE FUNCTION foo (
argOne   intDEFAULT 42,
argTwo   double ALLOW NULL,
argThree float )
RETURNS float LANGUAGE java ...
{code}
would generate a Java method with this signature:
{code}
public float execute(int argOne, java.lang.Double argTwo, float argThree)
{code}

If any of the arguments would be {{null}} and neither {{ALLOW NULL}} nor 
{{DEFAULT x}} has been declared, the method wouldn't be executed.

There are also implications to aggregates (need some additional checks - 
{{INITCOND}} might be required for UDFs, otherwise the state/final function 
might never be called at all).
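
A hedged sketch of that gating rule, with all names assumed for illustration:
{code:title=illustrative null-gating dispatch}
final class UdfNullGate
{
    interface UdfMethod { Object execute(Object[] args); }

    static Object invoke(UdfMethod method, Object[] args, boolean[] allowNull, Object[] defaults)
    {
        for (int i = 0; i < args.length; i++)
        {
            if (args[i] != null)
                continue;
            if (defaults[i] != null)
                args[i] = defaults[i];   // DEFAULT a_value declared for this argument
            else if (!allowNull[i])
                return null;             // neither ALLOW NULL nor DEFAULT: skip the call
        }
        return method.execute(args);
    }
}
{code}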


 Better support of null for UDF
 --

 Key: CASSANDRA-8374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Robert Stupp
 Fix For: 3.0


 Currently, every function needs to deal with its arguments potentially being 
 {{null}}. There are many cases where that's just annoying; users should be 
 able to define a function like:
 {noformat}
 CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
 {noformat}
 without having it crash as soon as a column it's applied to doesn't have a 
 value for some rows (I'll note that this definition apparently cannot be 
 compiled currently, which should be looked into).
 In fact, I think that by default methods shouldn't have to care about 
 {{null}} values: if the value is {{null}}, we should not call the method at 
 all and return {{null}}. There are still methods that may explicitly want to 
 handle {{null}} (to return a default value for instance), so maybe we can add 
 an {{ALLOW NULLS}} to the creation syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8452) Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows

2014-12-11 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-8452:
---
Attachment: CASSANDRA-8452-v3.patch

Well, I'm glad you brought up procfs, since it turns out OS X doesn't have 
one. The rest of the POSIX systems we're checking for do, [according to 
Wikipedia | http://en.wikipedia.org/wiki/Procfs]. I guess adding hasProcFS 
would make some sense, just for the sake of being correct. At the moment, it 
only affects whether some warnings are logged: basically, we eagerly try to 
open a proc file, and use isPosix to decide whether to log anything if that 
fails. Relying on isPosix alone causes erroneous startup warnings in Mac dev 
environments.

I would think isWindows is enough for NTFS-specific logic. Linux _can_ mount 
and write to NTFS disks, but I don't know how common it is for C* to be using 
them, outside of maybe some dual-boot dev environments. I'm also not clear 
whether they would show the same behaviors we're coding around with isWindows, 
since Linux's NTFS support is written against a spec that was reverse 
engineered from the Windows implementation.
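
For illustration, a minimal sketch of the kind of checks involved; 
FBUtilities' actual implementation may differ:
{code:title=minimal sketch of the OS checks}
public final class OsCheck
{
    private static final String OS = System.getProperty("os.name").toLowerCase();

    public static boolean isWindows()
    {
        return OS.contains("windows");
    }

    public static boolean isUnix()
    {
        return OS.contains("linux") || OS.contains("mac")
            || OS.contains("bsd") || OS.contains("sunos") || OS.contains("aix");
    }

    public static boolean hasProcFS()
    {
        // OS X is POSIX-ish but ships no procfs, hence a separate check.
        return isUnix() && !OS.contains("mac") && new java.io.File("/proc").exists();
    }
}
{code}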

 Add missing systems to FBUtilities.isUnix, add FBUtilities.isWindows
 

 Key: CASSANDRA-8452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8452
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston
Priority: Minor
 Fix For: 2.1.3

 Attachments: CASSANDRA-8452-v2.patch, CASSANDRA-8452-v3.patch, 
 CASSANDRA-8452.patch


 The isUnix method leaves out a few unix systems, which, after the changes in 
 CASSANDRA-8136, causes some unexpected behavior during shutdown. It would 
 also be clearer if FBUtilities had an isWindows method for branching into 
 Windows specific logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8447) Nodes stuck in CMS GC cycle with very little traffic when compaction is enabled

2014-12-11 Thread jonathan lacefield (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243645#comment-14243645
 ] 

jonathan lacefield commented on CASSANDRA-8447:
---

Please find a new heap dump attached here - 
https://drive.google.com/file/d/0B4Imdpu2YrEbdVRNMGM4X3BTS3M/view?usp=sharing

The test was executed after ensuring hints were disabled and did not exist in 
the cluster.

 Nodes stuck in CMS GC cycle with very little traffic when compaction is 
 enabled
 ---

 Key: CASSANDRA-8447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8447
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cluster size - 4 nodes
 Node size - 12 CPU (hyper threaded to 24 cores), 192 GB RAM, 2 Raid 0 arrays 
 (Data - 10 disk, spinning 10k drives | CL 2 disk, spinning 10k drives)
 OS - RHEL 6.5
 jvm - oracle 1.7.0_71
 Cassandra version 2.0.11
Reporter: jonathan lacefield
 Attachments: Node_with_compaction.png, Node_without_compaction.png, 
 cassandra.yaml, gc.logs.tar.gz, gcinspector_messages.txt, memtable_debug, 
 output.1.svg, output.2.svg, output.svg, results.tar.gz, visualvm_screenshot


 Behavior - If autocompaction is enabled, nodes will become unresponsive due 
 to a full Old Gen heap which is not cleared during CMS GC.
 Test methodology - disabled autocompaction on 3 nodes, left autocompaction 
 enabled on 1 node.  Executed different Cassandra stress loads, using 
 write-only operations.  Monitored visualvm and jconsole for heap pressure.  
 Captured iostat and dstat for most tests.  Captured a heap dump from the 
 50-thread load.  Hints were disabled for testing on all nodes to alleviate GC 
 noise due to hints backing up.
 Data load test through Cassandra stress -  /usr/bin/cassandra-stress  write 
 n=19 -rate threads=different threads tested -schema  
 replication\(factor=3\)  keyspace=Keyspace1 -node all nodes listed
 Data load thread count and results:
 * 1 thread - Still running but looks like the node can sustain this load 
 (approx 500 writes per second per node)
 * 5 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range (approx 2k writes per second per node)
 * 10 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range
 * 50 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 10k writes per second per node)
 * 100 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 20k writes per second per node)
 * 200 threads - Nodes become unresponsive due to full Old Gen Heap.  CMS 
 measured in the 60 second range  (approx 25k writes per second per node)
 Note - the observed behavior was the same for all tests except for the 
 single-threaded test, which does not appear to show this behavior.
 Tested different GC and Linux OS settings with a focus on the 50 and 200 
 thread loads.  
 JVM settings tested:
 #  default, out of the box, env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #  10 G Max | 1 G New - default env-sh settings
 #* JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
 #   20 G Max | 10 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=12"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
 # 20 G Max | 1 G New 
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=6"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=3"
JVM_OPTS=$JVM_OPTS 

[jira] [Commented] (CASSANDRA-8463) Upgrading 2.0 to 2.1 causes LCS to recompact all files

2014-12-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243740#comment-14243740
 ] 

Marcus Eriksson commented on CASSANDRA-8463:


[~rbranson] could you attach logs for the upgraded node?

 Upgrading 2.0 to 2.1 causes LCS to recompact all files
 --

 Key: CASSANDRA-8463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8463
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Hardware is recent 2-socket, 16-core (x2 Hyperthreaded), 
 144G RAM, solid-state storage.
 Platform is Linux 3.2.51, Oracle JDK 64-bit 1.7.0_65.
 Heap is 32G total, 4G newsize.
 8G/8G on-heap/off-heap memtables, offheap_buffer allocator, 0.5 
 memtable_cleanup_threshold
 concurrent_compactors: 20
Reporter: Rick Branson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


 It appears that tables configured with LCS will completely re-compact 
 themselves over some period of time after upgrading from 2.0 to 2.1 (2.0.11 
 -> 2.1.2, specifically). It starts out with 10 pending tasks for an hour or 
 so, then starts building up, now with 50-100 tasks pending across the cluster 
 after 12 hours. These nodes are under heavy write load, but were easily able 
 to keep up in 2.0 (they rarely had >5 pending compaction tasks), so I don't 
 think it's LCS in 2.1 actually being worse, just perhaps some different LCS 
 behavior that causes the layout of tables from 2.0 to prompt the compactor to 
 reorganize them?
 The nodes flushed ~11MB SSTables under 2.0. They're currently flushing ~36MB 
 SSTables due to the improved memtable setup in 2.1. Before I upgraded the 
 entire cluster to 2.1, I noticed the problem and tried several variations on 
 the flush size, thinking perhaps the larger tables in L0 were causing some 
 kind of cascading compactions. Even if they're sized roughly like the 2.0 
 flushes were, the same behavior occurs. I also tried both enabling & disabling 
 STCS in L0, with no real change other than that L0 began to back up faster, so 
 I left the STCS in L0 enabled.
 Tables are configured with 32MB sstable_size_in_mb, which was found to be an 
 improvement on the 160MB table size for compaction performance. Maybe this is 
 wrong now? Otherwise, the tables are configured with defaults. Compaction has 
 been unthrottled to help them catch up. The compaction threads stay very 
 busy, with the cluster-wide CPU at 45% nice time. No nodes have completely 
 caught up yet. I'll update JIRA with status on their progress if anything 
 interesting happens.
 From a node around 12 hours ago, around an hour after the upgrade, with 19 
 pending compaction tasks:
 SSTables in each level: [6/4, 10, 105/100, 268, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 16/10, 105/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 10, 103/100, 272, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 11/10, 105/100, 270, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 12/10, 105/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [1, 14/10, 104/100, 267, 0, 0, 0, 0, 0]
 SSTables in each level: [9/4, 10, 103/100, 265, 0, 0, 0, 0, 0]
 Recently, with 41 pending compaction tasks:
 SSTables in each level: [4, 13/10, 106/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [5/4, 11/10, 106/100, 271, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 12/10, 103/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [2, 13/10, 106/100, 273, 0, 0, 0, 0, 0]
 SSTables in each level: [3, 10, 104/100, 275, 0, 0, 0, 0, 0]
 SSTables in each level: [6/4, 11/10, 103/100, 269, 0, 0, 0, 0, 0]
 SSTables in each level: [4, 16/10, 105/100, 264, 0, 0, 0, 0, 0]
 More information about the use case: writes are roughly uniform across these 
 tables. The data is sharded across these 8 tables by key to improve 
 compaction parallelism. Each node receives up to 75,000 writes/sec sustained 
 at peak, and a small number of reads. This is a pre-production cluster that's 
 being warmed up with new data, so the low volume of reads (~100/sec per node) 
 is just from automatic sampled data checks; otherwise we'd just use STCS :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)