[jira] [Updated] (CASSANDRA-8253) cassandra-stress 2.1 doesn't support LOCAL_ONE
[ https://issues.apache.org/jira/browse/CASSANDRA-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated CASSANDRA-8253: - Attachment: CASSANDRA-8253.txt one line patch cassandra-stress 2.1 doesn't support LOCAL_ONE -- Key: CASSANDRA-8253 URL: https://issues.apache.org/jira/browse/CASSANDRA-8253 Project: Cassandra Issue Type: Bug Reporter: J.B. Langston Assignee: Liang Xie Attachments: CASSANDRA-8253.txt Looks like a simple oversight in argument parsing: ➜ bin ./cassandra-stress write cl=LOCAL_ONE Invalid value LOCAL_ONE; must match pattern ONE|QUORUM|LOCAL_QUORUM|EACH_QUORUM|ALL|ANY Also, CASSANDRA-7077 argues that it should be using LOCAL_ONE by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
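The error message shows the legal-values pattern is a hard-coded alternation that simply omits LOCAL_ONE. A minimal sketch of the underlying idea (class and enum here are hypothetical stand-ins, not the actual stress-tool code or its one-line patch): deriving the validation pattern from the consistency-level enum instead of hard-coding it makes this class of omission impossible.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ClPattern {
    // Hypothetical stand-in for the real ConsistencyLevel enum.
    enum ConsistencyLevel { ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY, LOCAL_ONE }

    // Build the "ONE|QUORUM|..." validation pattern from the enum's own
    // values so newly added levels (like LOCAL_ONE) are accepted automatically.
    static String pattern() {
        return Stream.of(ConsistencyLevel.values())
                     .map(Enum::name)
                     .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        System.out.println(pattern());
    }
}
```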
[jira] [Assigned] (CASSANDRA-7960) cassandra-stress should support LWT
[ https://issues.apache.org/jira/browse/CASSANDRA-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie reassigned CASSANDRA-7960: Assignee: Liang Xie cassandra-stress should support LWT --- Key: CASSANDRA-7960 URL: https://issues.apache.org/jira/browse/CASSANDRA-7960 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Liang Xie Priority: Minor Labels: stress -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7960) cassandra-stress should support LWT
[ https://issues.apache.org/jira/browse/CASSANDRA-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219115#comment-14219115 ] Liang Xie commented on CASSANDRA-7960: -- I just want to benchmark LWT soon (with cassandra-stress), so I am assigning it to myself for now; if anybody has made a patch in the meantime, feel free to reassign. :) cassandra-stress should support LWT --- Key: CASSANDRA-7960 URL: https://issues.apache.org/jira/browse/CASSANDRA-7960 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Liang Xie Priority: Minor Labels: stress -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-7933) Update cassandra-stress README
[ https://issues.apache.org/jira/browse/CASSANDRA-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie reassigned CASSANDRA-7933: Assignee: Liang Xie Update cassandra-stress README -- Key: CASSANDRA-7933 URL: https://issues.apache.org/jira/browse/CASSANDRA-7933 Project: Cassandra Issue Type: Task Reporter: Benedict Assignee: Liang Xie Priority: Minor Attachments: CASSANDRA-7933.txt There is a README in the tools/stress directory. It is completely out of date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7933) Update cassandra-stress README
[ https://issues.apache.org/jira/browse/CASSANDRA-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated CASSANDRA-7933: - Attachment: CASSANDRA-7933.txt I am not good at English; I tried my best to make a rough patch, so please feel free to modify it. Update cassandra-stress README -- Key: CASSANDRA-7933 URL: https://issues.apache.org/jira/browse/CASSANDRA-7933 Project: Cassandra Issue Type: Task Reporter: Benedict Priority: Minor Attachments: CASSANDRA-7933.txt There is a README in the tools/stress directory. It is completely out of date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8253) cassandra-stress 2.1 doesn't support LOCAL_ONE
[ https://issues.apache.org/jira/browse/CASSANDRA-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated CASSANDRA-8253: - Attachment: CASSANDRA-8253.txt cassandra-stress 2.1 doesn't support LOCAL_ONE -- Key: CASSANDRA-8253 URL: https://issues.apache.org/jira/browse/CASSANDRA-8253 Project: Cassandra Issue Type: Bug Reporter: J.B. Langston Assignee: Liang Xie Attachments: CASSANDRA-8253.txt Looks like a simple oversight in argument parsing: ➜ bin ./cassandra-stress write cl=LOCAL_ONE Invalid value LOCAL_ONE; must match pattern ONE|QUORUM|LOCAL_QUORUM|EACH_QUORUM|ALL|ANY Also, CASSANDRA-7077 argues that it should be using LOCAL_ONE by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8253) cassandra-stress 2.1 doesn't support LOCAL_ONE
[ https://issues.apache.org/jira/browse/CASSANDRA-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated CASSANDRA-8253: - Attachment: (was: CASSANDRA-8253.txt) cassandra-stress 2.1 doesn't support LOCAL_ONE -- Key: CASSANDRA-8253 URL: https://issues.apache.org/jira/browse/CASSANDRA-8253 Project: Cassandra Issue Type: Bug Reporter: J.B. Langston Assignee: Liang Xie Attachments: CASSANDRA-8253.txt Looks like a simple oversight in argument parsing: ➜ bin ./cassandra-stress write cl=LOCAL_ONE Invalid value LOCAL_ONE; must match pattern ONE|QUORUM|LOCAL_QUORUM|EACH_QUORUM|ALL|ANY Also, CASSANDRA-7077 argues that it should be using LOCAL_ONE by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8341) Expose time spent in each thread pool
[ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219174#comment-14219174 ] Robert Stupp commented on CASSANDRA-8341: - Yea - that {{ThreadMXBean.getCurrentThread...()}} hopefully directly accesses the CPU registers for that. Not sure about that since the implementations are very OS specific (in {{hotspot/os/*/vm/os_*.cpp}} in the OpenJDK source) and as such also specific to the CPU model. If all the implementations access the CPU registers for that, it is definitely better than {{System.nanoTime()}}, which is consistent over CPU cores (NUMA, hence expensive). Note: I don't want to make this ticket more complicated than necessary - just want to mention that metrics have their own overhead and should be used with care in sensitive areas like thread scheduling. Expose time spent in each thread pool - Key: CASSANDRA-8341 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Chris Lohfink Priority: Minor Labels: metrics Attachments: 8341.patch, 8341v2.txt Can increment a counter with time spent in each queue. This can provide context on how much time is spent percentage-wise in each stage. Additionally it can be used with Little's law in future if we ever want to try to tune the size of the pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
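The comment contrasts two timing sources for such a metric: per-thread CPU time via {{ThreadMXBean}} and monotonic wall-clock time via {{System.nanoTime()}}. A minimal sketch comparing them (class and method names are hypothetical; support for per-thread CPU time is platform-dependent, as the comment notes):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadTimeProbe {
    // Returns {cpuNs, wallNs} for a burst of busy work, or {-1, wallNs} when
    // per-thread CPU time is unsupported on this platform.
    static long[] measure() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean supported = mx.isCurrentThreadCpuTimeSupported();
        long cpu0 = supported ? mx.getCurrentThreadCpuTime() : 0;
        long wall0 = System.nanoTime(); // monotonic wall clock, consistent across cores
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) acc += i; // busy work so both clocks advance
        long wall = System.nanoTime() - wall0;
        long cpu = supported ? mx.getCurrentThreadCpuTime() - cpu0 : -1;
        if (acc < 0) throw new AssertionError(); // keep acc observable
        return new long[] { cpu, wall };
    }

    public static void main(String[] args) {
        long[] t = measure();
        System.out.println("cpu ns=" + t[0] + " wall ns=" + t[1]);
    }
}
```

Calling either clock on every queue entry/exit adds overhead, which is exactly the caution raised in the comment.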
[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
[ https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219184#comment-14219184 ] Leonid Shalupov commented on CASSANDRA-8325: {code}
public class A {
    public static void main(String[] args) throws Exception {
        java.lang.reflect.Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
        long l = unsafe.allocateMemory(900L * 1024 * 1024);
        System.err.println(l);
    }
}
{code}
$ javac A.java
$ java A
36679188480
Cassandra 2.1.x fails to start on FreeBSD (JVM crash) - Key: CASSANDRA-8325 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325 Project: Cassandra Issue Type: Bug Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit Server VM Reporter: Leonid Shalupov Attachments: hs_err_pid1856.log, system.log See attached error file after JVM crash {quote} FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 {quote} {quote} % java -version openjdk version 1.7.0_71 OpenJDK Runtime Environment (build 1.7.0_71-b14) OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode) {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8346) Paxos operation can use stale data during multiple range movements
Sylvain Lebresne created CASSANDRA-8346: --- Summary: Paxos operation can use stale data during multiple range movements Key: CASSANDRA-8346 URL: https://issues.apache.org/jira/browse/CASSANDRA-8346 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Paxos operations correctly account for pending ranges for all operations pertaining to the Paxos state, but those pending ranges are not taken into account when reading the data to check for the conditions or during a serial read. It's thus possible to break the LWT guarantees by reading a stale value. This requires 2 node movements (on the same token range) to be a problem though. Basically, we have {{RF}} replicas + {{P}} pending nodes. For the Paxos prepare/propose phases, the number of required participants (the Paxos QUORUM) is {{(RF + P + 1) / 2}} ({{SP.getPaxosParticipants}}), but the read done to check conditions or for serial reads is done at a normal QUORUM (or LOCAL_QUORUM), and so a weaker {{(RF + 1) / 2}}. We have a problem if it's possible that said read can read only from nodes that were not part of the paxos participants, and so we have a problem if: {noformat} normal quorum == (RF + 1) / 2 <= (RF + P) - ((RF + P + 1) / 2) == participants considered - blocked for {noformat} We're good if {{P = 0}} or {{P = 1}} since this inequality gives us respectively {{RF + 1 <= RF - 1}} and {{RF + 1 <= RF}}, both of which are impossible. But at {{P = 2}} (2 pending nodes), this inequality is equivalent to {{RF <= RF}} and so we might read stale data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
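The inequality from the description can be sanity-checked numerically. A minimal sketch (class name hypothetical) that clears the denominators rather than relying on integer division, treating the divisions in the ticket's derivation as exact:

```java
public class PaxosQuorumCheck {
    // Stale-read condition from the ticket:
    //   (RF + 1) / 2 <= (RF + P) - ((RF + P + 1) / 2)
    // Multiplying both sides by 2 gives RF + 1 <= RF + P - 1, i.e. P >= 2.
    static boolean staleReadPossible(int rf, int pending) {
        return rf + 1 <= 2 * (rf + pending) - (rf + pending + 1);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 9; rf++) {
            System.out.printf("RF=%d  P=0:%b  P=1:%b  P=2:%b%n",
                    rf,
                    staleReadPossible(rf, 0),
                    staleReadPossible(rf, 1),
                    staleReadPossible(rf, 2));
        }
    }
}
```

For every RF, the condition is false at P = 0 and P = 1 and true from P = 2 onward, matching the conclusion that two pending nodes on the same range are needed to read stale data.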
[jira] [Commented] (CASSANDRA-8332) Null pointer after droping keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219210#comment-14219210 ] Marcus Eriksson commented on CASSANDRA-8332: [~cnlwsu] you sure you have seen this after CASSANDRA-8027 was committed? Null pointer after droping keyspace --- Key: CASSANDRA-8332 URL: https://issues.apache.org/jira/browse/CASSANDRA-8332 Project: Cassandra Issue Type: Bug Reporter: Chris Lohfink Assignee: Marcus Eriksson Priority: Minor Fix For: 2.1.3 After dropping keyspace, sometimes I see this in logs: {code} ERROR 03:40:29 Exception in thread Thread[CompactionExecutor:2,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.compress.CompressionParameters.setLiveMetadata(CompressionParameters.java:108) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getCompressionMetadata(SSTableReader.java:1142) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1896) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableScanner.init(SSTableScanner.java:68) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1681) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1693) ~[main/:na] at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getScanners(LeveledCompactionStrategy.java:181) ~[main/:na] at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:320) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:340) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:151) ~[main/:na] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:75) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:233) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] {code} Minor issue since it doesn't really affect anything, but the error makes it look like something's wrong. Seen on the 2.1 branch (1b21aef8152d96a180e75f2fcc5afad9ded6c595), not sure how far back (may be post 2.1.2). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8346) Paxos operation can use stale data during multiple range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-8346: Attachment: 8346.txt I don't think there is much of a fix we can do (reading from the pending endpoints could also return stale data since those aren't yet up to date), so I think the simplest fix is to throw an UnavailableException if we have more than 2 pending endpoints. Attaching patch to do that. Paxos operation can use stale data during multiple range movements -- Key: CASSANDRA-8346 URL: https://issues.apache.org/jira/browse/CASSANDRA-8346 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 2.0.12 Attachments: 8346.txt Paxos operations correctly account for pending ranges for all operations pertaining to the Paxos state, but those pending ranges are not taken into account when reading the data to check for the conditions or during a serial read. It's thus possible to break the LWT guarantees by reading a stale value. This requires 2 node movements (on the same token range) to be a problem though. Basically, we have {{RF}} replicas + {{P}} pending nodes. For the Paxos prepare/propose phases, the number of required participants (the Paxos QUORUM) is {{(RF + P + 1) / 2}} ({{SP.getPaxosParticipants}}), but the read done to check conditions or for serial reads is done at a normal QUORUM (or LOCAL_QUORUM), and so a weaker {{(RF + 1) / 2}}. We have a problem if it's possible that said read can read only from nodes that were not part of the paxos participants, and so we have a problem if: {noformat} normal quorum == (RF + 1) / 2 <= (RF + P) - ((RF + P + 1) / 2) == participants considered - blocked for {noformat} We're good if {{P = 0}} or {{P = 1}} since this inequality gives us respectively {{RF + 1 <= RF - 1}} and {{RF + 1 <= RF}}, both of which are impossible.
But at {{P = 2}} (2 pending nodes), this inequality is equivalent to {{RF <= RF}} and so we might read stale data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8341) Expose time spent in each thread pool
[ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219174#comment-14219174 ] Robert Stupp edited comment on CASSANDRA-8341 at 11/20/14 11:04 AM: Yea - that {{ThreadMXBean.getCurrentThread...()}} hopefully directly accesses the CPU registers for that. Not sure about that since the implementations are very OS specific (in {{hotspot/os/\*/vm/os_\*.cpp}} in the OpenJDK source) and as such also specific to the CPU model. If all the implementations access the CPU registers for that, it is definitely better than {{System.nanoTime()}}, which is consistent over CPU cores (NUMA, hence expensive). Note: I don't want to make this ticket more complicated than necessary - just want to mention that metrics have their own overhead and should be used with care in sensitive areas like thread scheduling. was (Author: snazy): Yea - that {{ThreadMXBean.getCurrentThread...()}} hopefully directly accesses the CPU registers for that. Not sure about that since the implementations are very OS specific (in {{hotspot/os/*/vm/os_*.cpp}} in the OpenJDK source) and as such also specific to the CPU model. If all the implementations access the CPU registers for that, it is definitely better than {{System.nanoTime()}}, which is consistent over CPU cores (NUMA, hence expensive). Note: I don't want to make this ticket more complicated than necessary - just want to mention that metrics have their own overhead and should be used with care in sensitive areas like thread scheduling. Expose time spent in each thread pool - Key: CASSANDRA-8341 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Chris Lohfink Priority: Minor Labels: metrics Attachments: 8341.patch, 8341v2.txt Can increment a counter with time spent in each queue. This can provide context on how much time is spent percentage-wise in each stage.
Additionally it can be used with Little's law in future if we ever want to try to tune the size of the pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219318#comment-14219318 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 12:22 PM: --- Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch !test_regression_no_patch.jpg! All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch !test_regression_with_patch.jpg! Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case? was (Author: aboudreault): Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch !test_regression_no_patch.jpg|thumbnail! All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch !test_regression_with_patch.jpg|thumbnail! Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case?
JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full, it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
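The proposed rule (prefer load normally, but switch to free space once utilization drifts too far apart) can be sketched as follows. Everything here is hypothetical illustration, not Cassandra's actual directory-selection code: the {{Disk}} class, the 20% threshold, and the method names are all made up for the example.

```java
import java.util.Comparator;
import java.util.List;

public class DiskPickerSketch {
    // Hypothetical stand-in for a JBOD data directory.
    static class Disk {
        final String path;
        final int activeTasks;
        final double utilization; // fraction of the disk in use, 0.0 - 1.0

        Disk(String path, int activeTasks, double utilization) {
            this.path = path;
            this.activeTasks = activeTasks;
            this.utilization = utilization;
        }
    }

    // Hypothetical threshold: once the utilization spread exceeds this,
    // prefer free space over current load (accepting that it can be slower).
    static final double MAX_SPREAD = 0.20;

    // Normally pick the least-busy disk; when the most-full and least-full
    // disks drift too far apart, pick by free space until they balance out.
    static Disk pick(List<Disk> disks) {
        double max = disks.stream().mapToDouble(d -> d.utilization).max().orElse(0);
        double min = disks.stream().mapToDouble(d -> d.utilization).min().orElse(0);
        Comparator<Disk> byLoad = Comparator.comparingInt(d -> d.activeTasks);
        Comparator<Disk> bySpace = Comparator.comparingDouble(d -> d.utilization);
        return disks.stream()
                    .min(max - min > MAX_SPREAD ? bySpace : byLoad)
                    .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        Disk a = new Disk("/d1", 0, 0.30);
        Disk b = new Disk("/d2", 3, 0.05);
        // Spread 0.25 exceeds the threshold, so free space wins: /d2 is
        // picked despite having more active tasks, matching the 30%/5%
        // example in the description.
        System.out.println(pick(List.of(a, b)).path);
    }
}
```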
[jira] [Updated] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Boudreault updated CASSANDRA-7386: --- Attachment: test_regression_with_patch.jpg test_regression_no_patch.jpg Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch !test_regression_no_patch.jpg|thumbnail! All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch !test_regression_with_patch.jpg|thumbnail! Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case? JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC.
With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full, it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219318#comment-14219318 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 12:23 PM: --- Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch [^test_regression_no_patch.jpg] All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch [^test_regression_with_patch.jpg] Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case? was (Author: aboudreault): Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch !test_regression_no_patch.jpg! All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch !test_regression_with_patch.jpg! Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case?
JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full, it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219318#comment-14219318 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 12:24 PM: --- Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch [^test_regression_no_patch.jpg] All disks are filled in ~6 minutes. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch [^test_regression_with_patch.jpg] Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case? was (Author: aboudreault): Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch [^test_regression_no_patch.jpg] All disks are filled in ~420 seconds. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch [^test_regression_with_patch.jpg] Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case?
JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full, it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219318#comment-14219318 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 12:37 PM: --- Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disks: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch [^test_regression_no_patch.jpg] All disks are filled in ~6 minutes. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch [^test_regression_with_patch.jpg] Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case? was (Author: aboudreault): Devs, this is the result of my regression test without and with the patch. Note: the compaction concurrency is set to 4 and the throughput unlimited. h4. Test * 12 disks total of 2G of size. * Goal: run the following command to fill the disk: cassandra-stress WRITE n=200 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1 h5. Result - No Patch [^test_regression_no_patch.jpg] All disks are filled in ~6 minutes. Cassandra-stress crashed with write timeouts at around n=65 h5. Result - With Patch [^test_regression_with_patch.jpg] Cassandra-stress finished all its work (~13 minutes, n=200) and all disks are under 60% of disk usage. Any idea what's going on? Am I doing something wrong in my test case?
JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full, it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8329) LeveledCompactionStrategy should split large files across data directories when compacting
[ https://issues.apache.org/jira/browse/CASSANDRA-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219346#comment-14219346 ] Alan Boudreault commented on CASSANDRA-8329: [~krummas] Is there a patch for cassandra 2.1? I assume that this issue also affects 2.1 and master. LeveledCompactionStrategy should split large files across data directories when compacting -- Key: CASSANDRA-8329 URL: https://issues.apache.org/jira/browse/CASSANDRA-8329 Project: Cassandra Issue Type: Improvement Components: Core Reporter: J.B. Langston Assignee: Marcus Eriksson Fix For: 2.0.12 Attachments: 0001-get-new-sstable-directory-for-every-new-file-during-.patch Because we fall back to STCS for L0 when LCS gets behind, the sstables in L0 can get quite large during sustained periods of heavy writes. This can result in large imbalances between data volumes when using JBOD support. Eventually these large files get broken up as L0 sstables are moved up into higher levels; however, because LCS only chooses a single volume on which to write all of the sstables created during a single compaction, the imbalance is persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8329) LeveledCompactionStrategy should split large files across data directories when compacting
[ https://issues.apache.org/jira/browse/CASSANDRA-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219351#comment-14219351 ] Marcus Eriksson commented on CASSANDRA-8329: [~aboudreault] Yes, it does. I think testing on 2.0 is enough here, unless you really want to test 2.1+? LeveledCompactionStrategy should split large files across data directories when compacting -- Key: CASSANDRA-8329 URL: https://issues.apache.org/jira/browse/CASSANDRA-8329 Project: Cassandra Issue Type: Improvement Components: Core Reporter: J.B. Langston Assignee: Marcus Eriksson Fix For: 2.0.12 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
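The idea behind the attached patch, as described in the summary, can be sketched like this: ask for a destination data directory once per output sstable rather than once per compaction, so large L0 outputs spread across volumes. This is an illustrative sketch with hypothetical names (the real patch works against Cassandra's Directories machinery):

```java
// Hypothetical sketch: a least-loaded directory is chosen for every new
// sstable a compaction writes, instead of one directory for the whole
// compaction. Class and method names are invented for this example.
class PerFileDirectoryChooser {
    private final long[] bytesWritten;
    private final String[] directories;

    PerFileDirectoryChooser(String[] directories) {
        this.directories = directories;
        this.bytesWritten = new long[directories.length];
    }

    // Called once per output sstable, not once per compaction.
    String directoryForNewSSTable(long estimatedBytes) {
        int best = 0;
        for (int i = 1; i < bytesWritten.length; i++)
            if (bytesWritten[i] < bytesWritten[best]) best = i;
        bytesWritten[best] += estimatedBytes;
        return directories[best];
    }
}
```

Successive calls alternate between the volumes as each fills, so a single compaction's output no longer lands entirely on one disk.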
[jira] [Commented] (CASSANDRA-8340) Use sstable min timestamp when deciding if an sstable should be included in DTCS compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219354#comment-14219354 ] Björn Hegerfors commented on CASSANDRA-8340: No drawback, really. It doesn't make a big difference. Whatever is easiest to reason about would be best. It's true that in your repair example, it would have some effect, but only when the repair SSTables are not older than max_sstable_age_days while the big one is. I would imagine that repair would be likely to bring in a bunch of files that are older than max_sstable_age_days, which will stay scattered anyway. I suppose using min timestamp would align more with what the rest of the strategy uses to determine age. In fact, something that would work even more consistently with the strategy would be to specify a maximum window size, perhaps in terms of the initial window size. We have * up to min_threshold windows of size 1, followed by * up to min_threshold windows of size min_threshold, followed by * up to min_threshold windows of size min_threshold^2, followed by * up to min_threshold windows of size min_threshold^3, followed by * etc. And then we can simply stop generating larger windows after some point. The simplest, yet perhaps least intuitive, option would be max_window_exponent. If we set max_window_exponent=n, then we would stop after windows of size min_threshold^n. Example: max_window_exponent=3, min_threshold=4. The last few windows would be 64*base_time_seconds in size; no 256 window is ever created. Other option alternatives are max_window or max_window_seconds. WDYT [~krummas]? Use sstable min timestamp when deciding if an sstable should be included in DTCS compactions Key: CASSANDRA-8340 URL: https://issues.apache.org/jira/browse/CASSANDRA-8340 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Priority: Minor Currently we check how old the newest data (max timestamp) in an sstable is when deciding if it should be compacted. 
If we instead switch to using min timestamp for this, we have a pretty clean migration path from STCS/LCS to DTCS. My thinking is that before migrating, the user does a major compaction, which creates a huge sstable containing all data, with min timestamp very far back in time. Then, switching to DTCS, we will have a big sstable that we never compact (i.e., the min timestamp of this big sstable is before max_sstable_age_days), all newer data will be after that, and that new data will be properly compacted. WDYT [~Bj0rn]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
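Björn's window progression above can be sketched numerically. A hypothetical helper follows (max_window_exponent is a proposed option from this discussion, not a shipped DTCS parameter): it generates window sizes in multiples of base_time_seconds and caps growth at min_threshold^n.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the window-size progression described in the comment, with
// the proposed max_window_exponent cap. Names are illustrative only.
class DtcsWindows {
    // Sizes (in multiples of base_time_seconds) of the first `count` windows.
    static List<Long> windowSizes(int minThreshold, int maxWindowExponent, int count) {
        List<Long> sizes = new ArrayList<>();
        long size = 1;
        int exponent = 0;
        while (sizes.size() < count) {
            // up to min_threshold windows of the current size
            for (int i = 0; i < minThreshold && sizes.size() < count; i++)
                sizes.add(size);
            if (exponent < maxWindowExponent) { // stop growing past min_threshold^n
                size *= minThreshold;
                exponent++;
            }
        }
        return sizes;
    }
}
```

With min_threshold=4 and max_window_exponent=3, the sizes run 1, 1, 1, 1, 4, ..., 16, ..., 64 and then stay at 64: no 256 window is ever created, matching the example in the comment.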
[jira] [Reopened] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Boudreault reopened CASSANDRA-7386: Tester: Alan Boudreault JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8340) Use sstable min timestamp when deciding if an sstable should be included in DTCS compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219354#comment-14219354 ] Björn Hegerfors edited comment on CASSANDRA-8340 at 11/20/14 1:16 PM: -- No drawback, really. It doesn't make a big difference. Whatever is easiest to reason about would be best. It's true that in your repair example, it would have some effect, but only when the repair SSTables are not older than max_sstable_age_days while the big one is. I would imagine that repair would be equally likely to bring in a bunch of files that are older than max_sstable_age_days, which will stay scattered (uncompacted) anyway. I suppose using min timestamp would align more with what the rest of the strategy uses to determine age. In fact, something that would work even more consistently with the strategy would be to specify a maximum window size, perhaps in terms of the initial window size. We have * up to min_threshold windows of size 1, followed by * up to min_threshold windows of size min_threshold, followed by * up to min_threshold windows of size min_threshold^2, followed by * up to min_threshold windows of size min_threshold^3, followed by * etc. And then we can simply stop generating larger windows after some point. The simplest, yet perhaps least intuitive, option would be max_window_exponent. If we set max_window_exponent=n, then we would stop after windows of size min_threshold^n. Example: max_window_exponent=3, min_threshold=4. The last few windows would be 64*base_time_seconds in size; no 256 window is ever created. Other option alternatives are max_window or max_window_seconds. WDYT [~krummas]? Use sstable min timestamp when deciding if an sstable should be included in DTCS compactions Key: CASSANDRA-8340 URL: https://issues.apache.org/jira/browse/CASSANDRA-8340 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219360#comment-14219360 ] Robert Stupp commented on CASSANDRA-7386: - [~aboudreault] I guess you mean disks 2 + 7 in test_regression_with_patch.jpg. I assume that disk2 got a lot of new sstables shortly after second 221 and that disk7 got another bunch of sstables just before second 701 - just because these were the foolish disks that were nearly empty. It might be a consequence of unlimited compaction throughput - can you verify that with a conservative compaction throughput? Maybe we have to (re)introduce reservation of disk space - it's not a big deal to implement that and provide a patch this evening (CET) so that you can verify it. JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8332) Null pointer after dropping keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219362#comment-14219362 ] Chris Lohfink commented on CASSANDRA-8332: -- No, I didn't see that change, sorry - it looks fixed. Null pointer after dropping keyspace --- Key: CASSANDRA-8332 URL: https://issues.apache.org/jira/browse/CASSANDRA-8332 Project: Cassandra Issue Type: Bug Reporter: Chris Lohfink Assignee: Marcus Eriksson Priority: Minor Fix For: 2.1.3 After dropping keyspace, sometimes I see this in logs: {code} ERROR 03:40:29 Exception in thread Thread[CompactionExecutor:2,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.compress.CompressionParameters.setLiveMetadata(CompressionParameters.java:108) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getCompressionMetadata(SSTableReader.java:1142) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1896) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableScanner.init(SSTableScanner.java:68) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1681) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1693) ~[main/:na] at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getScanners(LeveledCompactionStrategy.java:181) ~[main/:na] at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:320) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:340) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:151) ~[main/:na] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:75) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:233) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] {code} Minor issue since it doesn't really affect anything, but the error makes it look like something's wrong. Seen on 2.1 branch (1b21aef8152d96a180e75f2fcc5afad9ded6c595), not sure how far back (may be post 2.1.2). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-8332) Null pointer after dropping keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lohfink updated CASSANDRA-8332: - Comment: was deleted (was: no, I didn't see that change sorry - It looks fixed.) Null pointer after dropping keyspace --- Key: CASSANDRA-8332 URL: https://issues.apache.org/jira/browse/CASSANDRA-8332 Project: Cassandra Issue Type: Bug Reporter: Chris Lohfink Assignee: Marcus Eriksson Priority: Minor Fix For: 2.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8332) Null pointer after dropping keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219365#comment-14219365 ] Chris Lohfink commented on CASSANDRA-8332: -- Sorry about the last comment (deleted); I was thinking of a different issue. Yes, I was seeing it after 8027. Same error, but I think it is caused by something different. I can work on coming up with something that's a little more reproducible. Not sure if it makes a difference, but I mostly ran into it when using cqlstress with the default cqlstress-example.yaml, so it had a lot of batch mutations. Null pointer after dropping keyspace --- Key: CASSANDRA-8332 URL: https://issues.apache.org/jira/browse/CASSANDRA-8332 Project: Cassandra Issue Type: Bug Reporter: Chris Lohfink Assignee: Marcus Eriksson Priority: Minor Fix For: 2.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-8332) Null pointer after dropping keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reassigned CASSANDRA-8332: -- Assignee: T Jake Luciani (was: Marcus Eriksson) [~tjake] could you have a look, as it seems to be related to CASSANDRA-8027? Null pointer after dropping keyspace --- Key: CASSANDRA-8332 URL: https://issues.apache.org/jira/browse/CASSANDRA-8332 Project: Cassandra Issue Type: Bug Reporter: Chris Lohfink Assignee: T Jake Luciani Priority: Minor Fix For: 2.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8341) Expose time spent in each thread pool
[ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219374#comment-14219374 ] Chris Lohfink commented on CASSANDRA-8341: -- Yep, understandable. I'd like this if possible though, so let me know if there's anything I can do. A lot of the metrics library work happens after the task is complete; in cases like reads/writes they send back to a request-response stage, so at least the post-task work shouldn't affect the read/write latency until the thread pool is saturated. I am not really sure what I can do to counter the ~100ns-1us overhead though - any ideas? Expose time spent in each thread pool - Key: CASSANDRA-8341 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Chris Lohfink Priority: Minor Labels: metrics Attachments: 8341.patch, 8341v2.txt Can increment a counter with the time spent in each queue. This can provide context on how much time, percentage-wise, is spent in each stage. Additionally it can be used with Little's law in the future if we ever want to try to tune the size of the pools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7981) Refactor SelectStatement
[ https://issues.apache.org/jira/browse/CASSANDRA-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219394#comment-14219394 ] Benjamin Lerer commented on CASSANDRA-7981: --- The new fixes are there: [branch | https://github.com/blerer/cassandra/compare/CASSANDRA-7981] {quote}It looks like the mergeWith() conflict is still present. For some reason, the ant build doesn't normally complain, but my IDE does. Either way, it's a legit problem that needs to be fixed.{quote} It is a perfectly valid thing from the Java point of view, but I agree that it is a bit of a hack. I have tried to simplify the hierarchy a bit by having {{SingleColumnPrimaryKeyRestrictions}} use an instance of {{SingleColumnRestrictions}} instead of extending it. This has allowed me to remove the {{mergeWith()}} problem. {quote}Can you explain why that's correct (and add a comment)?{quote} I cannot. This code is completely wrong. Hopefully it should now be fixed. Have a look at the new unit tests and tell me if they look fine to you. Refactor SelectStatement Key: CASSANDRA-7981 URL: https://issues.apache.org/jira/browse/CASSANDRA-7981 Project: Cassandra Issue Type: Bug Reporter: Benjamin Lerer Assignee: Benjamin Lerer Fix For: 3.0 The current state of the SelectStatement code makes fixing some issues or adding new functionality really hard. It also contains some functionality that we would like to reuse in ModificationStatement but cannot for the moment. Ideally I would like to: * Perform as much validation as possible on Relations instead of performing it on Restrictions, as it will help with problems like the one of CASSANDRA-6075 (I believe that by clearly separating validation and Restrictions building we will also make the code a lot clearer) * Provide a way to easily merge restrictions on the same columns, as needed for CASSANDRA-7016 * Have a preparation logic (validation + pre-processing) that we can easily reuse for Delete statements (CASSANDRA-6237) * Make the code much easier to read and safer to modify. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219426#comment-14219426 ] Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-4123: - Is the whole idea to re-architect the replication strategy based on the CRUSH paper? Thanks. vnodes aware Replication Strategy -- Key: CASSANDRA-4123 URL: https://issues.apache.org/jira/browse/CASSANDRA-4123 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Sam Overton Labels: vnodes Fix For: 3.0 The simplest implementation for this would be if NTS regarded a single host as a distinct rack. This would prevent replicas being placed on the same host. The rest of the logic for replica selection would be identical to NTS (but this would be removing a level of topology hierarchy). This would be achievable just by writing a snitch to place hosts in their own rack. A better solution would be to add an extra level of hierarchy to NTS so that it still supported DC and rack, and IP would be the new level at the bottom of the hierarchy. The logic would remain largely the same. I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a replica set for each host and then treating it in a similar way to a DC (i.e. RF replicas are chosen from that set, instead of from the whole cluster). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
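The "each host is its own rack" idea from the description can be sketched as a ring walk that accepts an endpoint only if its rack (here, the host itself) has not yet supplied a replica. This is a toy illustration, not the real NetworkTopologyStrategy code; with vnodes, the same host appears at many ring positions, and the rack check is what keeps two replicas off one host.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy sketch: replica selection where the snitch reports each host as
// its own rack, so duplicate ring entries for one host are skipped.
class OwnRackPlacement {
    static List<String> pickReplicas(List<String> ringOrder, int rf) {
        Set<String> usedRacks = new LinkedHashSet<>();
        List<String> replicas = new ArrayList<>();
        for (String endpoint : ringOrder) {
            String rack = endpoint; // "host as its own rack"
            if (usedRacks.add(rack)) {
                replicas.add(endpoint);
                if (replicas.size() == rf) break;
            }
        }
        return replicas;
    }
}
```

Walking a vnode ring where 10.0.0.1 owns several consecutive tokens still yields three distinct hosts for RF=3.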
[jira] [Commented] (CASSANDRA-8341) Expose time spent in each thread pool
[ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219444#comment-14219444 ] Chris Lohfink commented on CASSANDRA-8341: -- In the SEPExecutor, since the same thread will continually execute tasks if there's work, I think it's possible to just collect the start time before starting any work and then the end time once there's a gap in processing. This way, if the thread pool is saturated it should incur no overhead from monitoring, but once there is a break in the work or an executor has run out of task permits it would update the CPU time. Then on a saturated system the metric may be out of date, but it would have less of an overhead. I am not completely familiar with the SEPWorker so I need to do a little more research, but would that be adequate? I don't think the type of work in tasks using the ThreadPoolExecutor (i.e. migration, repairs, commitlog archiver, etc.) would be impacted by the metrics overhead, so those can probably stick with the impl used in the patched JMXEnabledThreadPoolExecutor, with CPU time instead of nanotime. Expose time spent in each thread pool - Key: CASSANDRA-8341 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Chris Lohfink Priority: Minor Labels: metrics Attachments: 8341.patch, 8341v2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
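The burst-boundary accounting described in the comment can be sketched as follows (a hypothetical helper, not the real SEPWorker): time is recorded only when a worker picks up work after being idle and when it next finds no work, so a saturated pool pays no per-task timing cost.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: account busy time only at the boundaries of a work burst.
// Names are illustrative; callers would pass System.nanoTime() values.
class BurstTimer {
    private final AtomicLong busyNanos = new AtomicLong();
    private long burstStart = -1;

    // Called when a worker picks up its first task after being idle.
    void burstBegins(long nowNanos) { burstStart = nowNanos; }

    // Called when the worker finds no more work (a gap in processing).
    void burstEnds(long nowNanos) {
        if (burstStart >= 0) {
            busyNanos.addAndGet(nowNanos - burstStart);
            burstStart = -1;
        }
    }

    long totalBusyNanos() { return busyNanos.get(); }
}
```

However many tasks ran inside a burst, only two timestamps are taken per burst; the trade-off, as noted in the comment, is that the counter lags while a burst is still in progress.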
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219459#comment-14219459 ] Alan Boudreault commented on CASSANDRA-7386: [~snazy] In fact, my concern is not really the 2 full disks but more why I can fill all my disks in 6 minutes without the patch, while with the patch 7/9 of my disks are under 60% usage after 15 minutes. I might be wrong since this stuff is new to me, but is there some *better* compaction/compression happening with your patch, or was something wrong happening before? Thanks! Yes, will try with a *conservative* compaction throughput, like 20mb/s. JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC, with both LCS and STCS (although my suspicion is that STCS makes it worse since it's harder to keep balanced). I propose the algorithm be changed a little to have some maximum range of utilization beyond which it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full it will never pick A over B until they balance out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
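The rule proposed in the description can be sketched as follows. This is a hypothetical illustration of the idea, not the attached patch; the threshold name and the Disk fields are invented for the example.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Toy sketch: pick by pending task count as today, unless the utilization gap
// between the fullest and emptiest candidate exceeds a threshold, in which
// case fall back to picking the least-utilized (most free space) disk.
class DiskPicker {
    static class Disk {
        public final String path;
        public final int tasks;            // pending flush/compaction tasks
        public final double utilization;   // 0.0 .. 1.0
        Disk(String path, int tasks, double utilization) {
            this.path = path; this.tasks = tasks; this.utilization = utilization;
        }
    }

    public static Disk pick(List<Disk> disks, double maxUtilSpread) {
        double min = disks.stream().mapToDouble(d -> d.utilization).min().getAsDouble();
        double max = disks.stream().mapToDouble(d -> d.utilization).max().getAsDouble();
        Comparator<Disk> byTasks = Comparator.comparingInt(d -> d.tasks);
        Comparator<Disk> byFree = Comparator.comparingDouble(d -> d.utilization);
        // Past the threshold, balance wins over throughput (acknowledged slower).
        return Collections.min(disks, max - min > maxUtilSpread ? byFree : byTasks);
    }
}
```

With the example from the description (A 30% full, B 5% full) and any threshold below 25 points, this picks B regardless of task counts until the spread closes.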
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219459#comment-14219459 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 3:14 PM: -- [~snazy] In fact, my concern is not really the 2 full disks but more why I can fill all my disks in 6 minutes without the patch, while with the patch 7/9 of my disks are under 60% usage after 15 minutes. I might be wrong since this stuff is new to me, but is there some *better* compaction/compression happening with your patch, or was something wrong happening before? Thanks! Yes, will try with a *conservative* compaction throughput, like 16mb/s (default). was (Author: aboudreault): [~snazy] In fact, my concern is not really the 2 full disks but more why I can fill all my disks in 6 minutes without the patch, while with the patch 7/9 of my disks are under 60% usage after 15 minutes. I might be wrong since this stuff is new to me, but is there some *better* compaction/compression happening with your patch, or was something wrong happening before? Thanks! Yes, will try with a *conservative* compaction throughput, like 20mb/s.
[jira] [Created] (CASSANDRA-8347) 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException after accidental computer crash
Evgeny Pasynkov created CASSANDRA-8347: -- Summary: 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException after accidental computer crash Key: CASSANDRA-8347 URL: https://issues.apache.org/jira/browse/CASSANDRA-8347 Project: Cassandra Issue Type: Bug Reporter: Evgeny Pasynkov 9:08:56.972 [SSTableBatchOpen:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[SSTableBatchOpen:1,5,main] org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException at org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:129) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_65] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65] at 
java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] Caused by: java.io.EOFException: null at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_65] at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_65] at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_65] at org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:104) ~[cassandra-all-2.1.1.jar:2.1.1] ... 13 common frames omitted -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219507#comment-14219507 ] Robert Stupp commented on CASSANDRA-7386: - Hm ... I see. Have you seen any errors in system.log during the no-patch run (besides disk full)? Or any unfinished compactions? IMO the with-patch graph shows typical compaction spikes - but the no-patch graph doesn't. The patch itself has no direct influence on compactions - but since disk assignment is influenced by the patch, it has some indirect influence.
[jira] [Commented] (CASSANDRA-8337) mmap underflow during validation compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219510#comment-14219510 ] Joshua McKenzie commented on CASSANDRA-8337: [~sterligovak]: Could you give some more details about your environment? OS/kernel? Did these problems occur on a subset of your hosts, or were they scattered? I haven't split out and analyzed them yet, but given the sheer variety of exceptions (mmap underflow, corrupt sstable while iterating during compaction, NPE) my first suspicion leans toward an underlying problem with the file system that's showing up with a variety of symptoms, though a reproducible corruption and/or race on streaming could definitely be the culprit too. [~enigmacurry]: have we ever seen something like this on the 2.1.X line while testing with repairs? mmap underflow during validation compaction --- Key: CASSANDRA-8337 URL: https://issues.apache.org/jira/browse/CASSANDRA-8337 Project: Cassandra Issue Type: Bug Reporter: Alexander Sterligov Assignee: Joshua McKenzie Attachments: thread_dump During full parallel repair I often get errors like the following {quote} [2014-11-19 01:02:39,355] Repair session 116beaf0-6f66-11e4-afbb-c1c082008cbe for range (3074457345618263602,-9223372036854775808] failed with error org.apache.cassandra.exceptions.RepairException: [repair #116beaf0-6f66-11e4-afbb-c1c082008cbe on iss/target_state_history, (3074457345618263602,-9223372036854775808]] Validation failed in /95.108.242.19 {quote} In the log of the node there are always the same exceptions: {quote} ERROR [ValidationExecutor:2] 2014-11-19 01:02:10,847 JVMStabilityInspector.java:94 - JVM state determined to be unstable. 
Exiting forcefully due to: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: mmap segment underflow; remaining is 15 but 47 requested at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1518) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1385) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1315) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1706) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1694) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:276) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:320) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:917) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:557) ~[apache-cassandra-2.1.2.jar:2.1.2] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.io.IOException: mmap segment underflow; remaining is 15 but 47 requested at org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:135) 
~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:348) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:327) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1460) ~[apache-cassandra-2.1.2.jar:2.1.2] ... 13 common frames omitted {quote} Now I'm using the die disk_failure_policy to detect such conditions faster, but I get them even with the stop policy. Streams related to a host with such an exception are hung. Thread dump is attached. Only a restart helps. After retry I get errors from other nodes. scrub doesn't help and reports that the sstables are ok. Sequential repairs don't cause such exceptions. Load is about 1000 write rps and 50 read rps per node.
[jira] [Commented] (CASSANDRA-8337) mmap underflow during validation compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219523#comment-14219523 ] Alexander Sterligov commented on CASSANDRA-8337: Linux 3.10.28. 48GB, 64GB or 128GB RAM. 16 or 32 cores. 1000MBps network, highest latency between nodes is 10ms. Problem was scattered.
[jira] [Commented] (CASSANDRA-8244) Token, DecoratedKey, RowPosition and all bound types should not make any hidden references to the database partitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219526#comment-14219526 ] Joshua McKenzie commented on CASSANDRA-8244: getHeapSize() on BigIntegerToken has a 'TODO: Probably wrong' comment. While I agree, we should probably either fix that or create another ticket to address it, and maybe reference the ticket in the comment. nits: * annotate @VisibleForTesting on RandomPartitioner.BigIntegerToken.BigIntegerToken(String token) * clean up import order in FBUtilities.java * looks like changes to a couple of the .db files under test snuck in on commit Looks good - nice cleanup! Token, DecoratedKey, RowPosition and all bound types should not make any hidden references to the database partitioner -- Key: CASSANDRA-8244 URL: https://issues.apache.org/jira/browse/CASSANDRA-8244 Project: Cassandra Issue Type: Bug Reporter: Branimir Lambov Assignee: Branimir Lambov Priority: Minor Fix For: 3.0 Currently some of the functionality of Token refers to StorageService.getPartitioner() to avoid needing an extra argument. This is in turn implicitly used by RowPosition and then Range, causing possible problems, for example when ranges on secondary indices are used in a murmur-partitioned database. These references should be removed to force explicit choice of the partitioner by callers; alternatively, the Token interface could be changed to provide a reference to the partitioner that created it. (Note: the hidden reference to the partitioner in serialization is a separate issue.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
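The second alternative in the description — a Token that carries a reference to the partitioner that created it — could be sketched like this. These are hypothetical types for illustration, not Cassandra's actual Token/IPartitioner API.

```java
// Illustrative sketch, not Cassandra's API: a Token that knows which
// partitioner created it instead of consulting a hidden global.
interface Partitioner { String name(); }

class Token implements Comparable<Token> {
    public final long value;
    public final Partitioner partitioner; // explicit; no StorageService.getPartitioner()
    public Token(long value, Partitioner partitioner) {
        this.value = value;
        this.partitioner = partitioner;
    }
    @Override public int compareTo(Token o) {
        // Mixing tokens from different partitioners (e.g. a secondary-index
        // range in a murmur-partitioned database) becomes a loud error rather
        // than a silent fallback to the database-wide default.
        if (partitioner != o.partitioner)
            throw new IllegalArgumentException("partitioner mismatch");
        return Long.compare(value, o.value);
    }
}
```

The alternative (forcing every caller to pass the partitioner explicitly) keeps Token lighter but threads an extra argument through every call site; carrying the reference trades a field per token for call-site simplicity.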
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219531#comment-14219531 ] Jeremiah Jordan commented on CASSANDRA-7386: [~aboudreault] The fact that you crash without the patch is exactly what this issue is trying to fix. So it is a GOOD thing that it happens without the patch, and that the patch prevents it. [~snazy] nothing to change here, and no, we do not want to bring back disk reservation; that only caused problems.
[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219550#comment-14219550 ] Chris Burroughs commented on CASSANDRA-8188: I'm investigating an issue with our dual-DC clusters. There was a network outage, after which each cluster spit out 'Cannot handshake version with ' for minutes (after network connectivity was restored). I didn't get a stack trace in time, but this appears similar. Is there something that makes this ticket 2.1.x specific? don't block SocketThread for MessagingService - Key: CASSANDRA-8188 URL: https://issues.apache.org/jira/browse/CASSANDRA-8188 Project: Cassandra Issue Type: Improvement Components: Core Reporter: yangwei Assignee: yangwei Fix For: 2.1.2 Attachments: 0001-don-t-block-SocketThread-for-MessagingService.patch We have two datacenters, A and B. The nodes in A cannot handshake version with nodes in B; the logs in A are as follows: {noformat} INFO [HANDSHAKE-/B] 2014-10-24 04:29:49,075 OutboundTcpConnection.java (line 395) Cannot handshake version with B TRACE [WRITE-/B] 2014-10-24 11:02:49,044 OutboundTcpConnection.java (line 368) unable to connect to /B java.net.ConnectException: Connection refused at sun.nio.ch.Net.connect0(Native Method) at sun.nio.ch.Net.connect(Net.java:364) at sun.nio.ch.Net.connect(Net.java:356) at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623) at java.nio.channels.SocketChannel.open(SocketChannel.java:184) at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:134) at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:119) at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:299) at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150) {noformat} The jstack output of the nodes in B shows they block in inputStream.readInt, resulting in the SocketThread not accepting sockets any more; the stack is as follows: {noformat} java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) - locked 0x0007963747e8 (a java.lang.Object) at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203) - locked 0x000796374848 (a java.lang.Object) at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream) at java.io.InputStream.read(InputStream.java:101) at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81) - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:879) {noformat} On the nodes in B, tcpdump shows retransmission of SYN,ACK during the TCP three-way handshake phase because the TCP implementation drops the last ACK when the backlog queue is full. On the nodes in B, ss -tl shows Recv-Q 51 Send-Q 50. On the nodes in B, netstat -s shows “SYNs to LISTEN sockets dropped” and “times the listen queue of a socket overflowed” are both increasing. This patch sets the read timeout to 2 * OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME for the accepted socket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
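The fix described in the last sentence amounts to giving the accepted socket a read timeout before the blocking readInt. A minimal sketch, assuming an illustrative 1500 ms value for WAIT_FOR_VERSION_MAX_TIME (the real constant lives in OutboundTcpConnection); the class and method names here are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch of the accept-side handshake read with a bounded wait.
class HandshakeAccept {
    // Assumed value for illustration; the real constant is defined in
    // OutboundTcpConnection.
    static final int WAIT_FOR_VERSION_MAX_TIME_MS = 1500;

    // Read the peer's version int, but bound the wait so a connected-but-silent
    // peer cannot block the single accept thread indefinitely.
    public static int readVersion(Socket s) throws IOException {
        s.setSoTimeout(2 * WAIT_FOR_VERSION_MAX_TIME_MS);
        try {
            return new DataInputStream(s.getInputStream()).readInt();
        } catch (SocketTimeoutException e) {
            s.close(); // drop the stuck peer; the accept loop keeps going
            return -1;
        }
    }
}
```

Without the `setSoTimeout`, `readInt()` can block forever on a peer that completed the TCP handshake but never sent its version, which is exactly how the single SocketThread stops accepting and the listen backlog overflows.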
[jira] [Updated] (CASSANDRA-8188) don't block SocketThread for MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Burroughs updated CASSANDRA-8188: --- Attachment: handshake.stack.txt Co-worker was faster than me and did get a stack trace.
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219562#comment-14219562 ] Alan Boudreault commented on CASSANDRA-7386: Yep, discussed with Jeremiah on hipchat, he clarified things. Thanks! Closing.
[jira] [Resolved] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Boudreault resolved CASSANDRA-7386. Resolution: Fixed
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219566#comment-14219566 ] Robert Stupp commented on CASSANDRA-7386: - bq. we do not want to bring back disk reservation good to hear :) [~jjordan] just for me to understand it - what were the problems? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219577#comment-14219577 ] Alan Boudreault commented on CASSANDRA-7386: [~snazy] From what I understand, the whole compaction process crashed as soon as it hit one full disk, so no more compaction was happening after that. This makes sense: in my prior tests I had only made the compaction process very, very slow, so nothing was crashing. [~yukim] Will this be backported to the cassandra-2.0 branch? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219577#comment-14219577 ] Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 4:54 PM: -- [~snazy] From what I understand, the whole compaction process crashed as soon as it hit one full disk, so no more compaction was happening after that. This makes sense: in my prior tests I had only made the compaction process very, very slow, so nothing was crashing. [~jjordan] can confirm if I'm right here. [~yukim] Will this be backported to the cassandra-2.0 branch? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8231) Wrong size of cached prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-8231: -- Attachment: CASSANDRA-8231-V2.txt Same patch as the previous one, with the renamed license file for Jamm. Wrong size of cached prepared statements Key: CASSANDRA-8231 URL: https://issues.apache.org/jira/browse/CASSANDRA-8231 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jaroslav Kamenik Assignee: Benjamin Lerer Fix For: 2.1.3 Attachments: 8231-notes.txt, CASSANDRA-8231-V2.txt, CASSANDRA-8231.txt, Unsafes.java Cassandra counts the memory footprint of prepared statements for caching purposes. It seems that there is a problem with some statements, e.g. SelectStatement. Even a simple select is counted as a 100KB object, while updates, deletes, etc. weigh in at a few hundred or a few thousand bytes. The result is that the cache - QueryProcessor.preparedStatements - holds just a fraction of the statements. I dug a little into the code, and it seems the problem is in jamm, in the MemoryMeter class: if an instance contains a reference to a class, it counts the size of the whole class too. SelectStatement references an EnumSet through ResultSet.Metadata, and EnumSet holds a reference to the Enum class... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219615#comment-14219615 ] Yuki Morishita commented on CASSANDRA-7124: --- [~rnamboodiri] Thanks for the patch. Here are some comments. - You cannot just 'cast' the executor to ListeningExecutorService; it throws ClassCastException. - Can you add tests that perform an async cleanup to CleanupTest? - Make the original sync method use the async method and block there; the two currently duplicate code. Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: lhf Fix For: 3.0 Attachments: cassandra-trunk-cleanup-7124.txt If {{nodetool cleanup}} or some other long-running operation takes too long to complete, you'll see an error like the one in CASSANDRA-2126, so you can't tell whether the operation completed successfully or not. CASSANDRA-4767 fixed this for repairs with JMX notifications. We should do something similar for nodetool cleanup, compact, decommission, move, relocate, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7981) Refactor SelectStatement
[ https://issues.apache.org/jira/browse/CASSANDRA-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219621#comment-14219621 ] Tyler Hobbs commented on CASSANDRA-7981: I should be able to review the latest changes today, but for now, we finally have some test coverage analysis (unit tests + dtests): https://cassci.datastax.com/job/scratch-7981_coverage_unit_and_dtest/JaCoCo_Coverage_Report/. Since trunk still has problems with some of the tests, there are a lot of failures, but that should give you a pretty good idea of the test coverage on the new code. It looks like there are a few spots in the new code that could use some additional tests. Refactor SelectStatement Key: CASSANDRA-7981 URL: https://issues.apache.org/jira/browse/CASSANDRA-7981 Project: Cassandra Issue Type: Bug Reporter: Benjamin Lerer Assignee: Benjamin Lerer Fix For: 3.0 The current state of the SelectStatement code makes fixing some issues or adding new functionality really hard. It also contains some functionality that we would like to reuse in ModificationStatement but cannot for the moment. Ideally I would like to: * Perform as much validation as possible on Relations instead of performing it on Restrictions, as it will help with problems like the one of CASSANDRA-6075 (I believe that by clearly separating validation and Restrictions building we will also make the code a lot clearer) * Provide a way to easily merge restrictions on the same columns, as needed for CASSANDRA-7016 * Have a preparation logic (validation + pre-processing) that we can easily reuse for the Delete statement (CASSANDRA-6237) * Make the code much easier to read and safer to modify. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7281) SELECT on tuple relations are broken for mixed ASC/DESC clustering order
[ https://issues.apache.org/jira/browse/CASSANDRA-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcin Szymaniuk updated CASSANDRA-7281: Attachment: 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v2.patch New patch uploaded. I decided to just change = to . So far I create as many restrictions (slices) as the number of clustering columns in tuple. The fact we query two slices next to each other instead of one is ok in terms of the final effect. I don't think it harms performance. We might want to change it to be more consistent from conceptual point of view. I will wait for your thoughts before doing anything more. Also I did dtests pull-request related to that change: https://github.com/riptano/cassandra-dtest/pull/118 SELECT on tuple relations are broken for mixed ASC/DESC clustering order Key: CASSANDRA-7281 URL: https://issues.apache.org/jira/browse/CASSANDRA-7281 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Fix For: 2.0.12 Attachments: 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v2.patch As noted on [CASSANDRA-6875|https://issues.apache.org/jira/browse/CASSANDRA-6875?focusedCommentId=13992153page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13992153], the tuple notation is broken when the clustering order mixes ASC and DESC directives because the range of data they describe don't correspond to a single continuous slice internally. 
To copy the example from CASSANDRA-6875: {noformat}
cqlsh:ks> create table foo (a int, b int, c int, PRIMARY KEY (a, b, c)) WITH CLUSTERING ORDER BY (b DESC, c ASC);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 2, 0);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 0);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 1);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 0, 0);
cqlsh:ks> SELECT * FROM foo WHERE a=0;

 a | b | c
---+---+---
 0 | 2 | 0
 0 | 1 | 0
 0 | 1 | 1
 0 | 0 | 0

(4 rows)

cqlsh:ks> SELECT * FROM foo WHERE a=0 AND (b, c) > (1, 0);

 a | b | c
---+---+---
 0 | 2 | 0

(1 rows)
{noformat} The last query should really return {{(0, 2, 0)}} and {{(0, 1, 1)}}. For that specific example we should generate 2 internal slices, but I believe that with more clustering columns we may have more slices. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8348) allow takeColumnFamilySnapshot to take a list of ColumnFamilies
Peter Halliday created CASSANDRA-8348: - Summary: allow takeColumnFamilySnapshot to take a list of ColumnFamilies Key: CASSANDRA-8348 URL: https://issues.apache.org/jira/browse/CASSANDRA-8348 Project: Cassandra Issue Type: Improvement Reporter: Peter Halliday Priority: Minor Within StorageServiceMBean.java the function takeSnapshot allows for a list of keyspaces to snapshot. However, the function takeColumnFamilySnapshot only allows for a single ColumnFamily to snapshot. This should allow for multiple ColumnFamilies within the same Keyspace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8348) allow takeColumnFamilySnapshot to take a list of ColumnFamilies
[ https://issues.apache.org/jira/browse/CASSANDRA-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219742#comment-14219742 ] Nick Bailey commented on CASSANDRA-8348: It may make sense to include a method that takes a list of ks.cf pairs to snapshot as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
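A minimal sketch of what Nick's suggestion could look like on the caller side. The class and method names below are illustrative, not the actual StorageServiceMBean API; it only shows parsing and validating "keyspace.columnfamily" pairs before a grouped snapshot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a snapshot method that accepts "ks.cf" pairs.
public class SnapshotArgs {
    // Splits "ks.cf" into its two parts, rejecting malformed entries up front
    // so a bad argument fails before any snapshot work starts.
    static String[] parsePair(String pair) {
        int dot = pair.indexOf('.');
        if (dot <= 0 || dot == pair.length() - 1 || pair.indexOf('.', dot + 1) != -1)
            throw new IllegalArgumentException("expected keyspace.columnfamily, got: " + pair);
        return new String[]{ pair.substring(0, dot), pair.substring(dot + 1) };
    }

    public static void main(String[] args) {
        List<String[]> parsed = new ArrayList<>();
        for (String p : new String[]{ "ks1.users", "ks2.events" })
            parsed.add(parsePair(p));
        for (String[] kscf : parsed)
            System.out.println(kscf[0] + " -> " + kscf[1]);
    }
}
```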
[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
[ https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219824#comment-14219824 ] graham sanderson commented on CASSANDRA-8325: - Just for sanity checking's sake - otherwise my previous comments about {{peer}} still stand, can you check: {code} System.err.println(byteA + theUnsafe.getByte(null, l)); // that is your long l System.err.println(byteB + theUnsafe.getByte(l)); // that is your long l {code} To make sure neither fail. Cassandra 2.1.x fails to start on FreeBSD (JVM crash) - Key: CASSANDRA-8325 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325 Project: Cassandra Issue Type: Bug Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit Server VM Reporter: Leonid Shalupov Attachments: hs_err_pid1856.log, system.log See attached error file after JVM crash {quote} FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 {quote} {quote} % java -version openjdk version 1.7.0_71 OpenJDK Runtime Environment (build 1.7.0_71-b14) OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode) {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
[ https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219824#comment-14219824 ] graham sanderson edited comment on CASSANDRA-8325 at 11/20/14 7:23 PM: --- Just for sanity checking's sake - otherwise my previous comments about {{peer}} still stand, can you check: {code} System.err.println(byteA + unsafe.getByte(null, l)); // that is your long l System.err.println(byteB + unsafe.getByte(l)); // that is your long l {code} To make sure neither fail. was (Author: graham sanderson): Just for sanity checking's sake - otherwise my previous comments about {{peer}} still stand, can you check: {code} System.err.println(byteA + theUnsafe.getByte(null, l)); // that is your long l System.err.println(byteB + theUnsafe.getByte(l)); // that is your long l {code} To make sure neither fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
[ https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219829#comment-14219829 ] graham sanderson commented on CASSANDRA-8325: - You might also want to try putting them (without the println) in a long, tight loop, to get them out of the interpreter as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
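The "long tight loop" idea above can be sketched like this. A plain arithmetic probe stands in for the `Unsafe.getByte()` calls so the sketch compiles without `sun.misc.Unsafe`; the iteration count and names are assumptions, not from the ticket.

```java
// Warm-up harness sketch: run the suspect code enough times that HotSpot
// JIT-compiles it (the default -XX:CompileThreshold is 10000 invocations),
// so a crash, if any, reproduces in compiled code rather than the interpreter.
public class WarmupProbe {
    static long sink; // accumulated so the loop cannot be dead-code eliminated

    static long probe(long l) {
        return l ^ (l >>> 7); // stand-in for theUnsafe.getByte(null, l) / theUnsafe.getByte(l)
    }

    public static void main(String[] args) {
        final int iterations = 200_000;
        for (int i = 0; i < iterations; i++)
            sink += probe(i);
        // Print once, outside the loop, as the comment above suggests.
        System.out.println("completed " + iterations + " iterations");
    }
}
```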
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219863#comment-14219863 ] Matt Stump commented on CASSANDRA-8150: --- I wanted to add more evidence and exposition for the above recommendations so that my argument can be better understood. Young generation GC is pretty simple. The new heap is broken down into three segments, eden, and two survivor spaces s0 and s1. All new objects are allocated in eden, and once eden reaches a size threshold a minor GC is triggered. After the minor GC all surviving objects are moved to one of the survivor spaces, only one survivor space is active at a time. At the same time that GC for eden is triggered a GC for the active survivor space is also triggered. All live objects from both eden and the active survivor space are copied to the other survivor space and both eden and the previously active survivor space is wiped clean. Objects will bounce between the different survivor spaces until the MaxTenuringThreshold is hit (default in C* is 1). Once an object survives MaxTenuringThreshold number of collections it's copied to the tenured space which is governed by a different collector, in our instance CMS, but it could just as easily be G1. This act of copying is called promotion. The promotion from young generation to tenured space is what takes a long time. So if you see long ParNew GC pauses it's because many objects are being promoted. You decrease ParNew collection times by decreasing promotion. What can cause many objects to be promoted? It's objects that have survived both the initial eden space collection and MaxTenuringThreshold number of collections in the survivor space. The main tunables are the size of the various spaces in young gen, and the MaxTenuringThreshold. By increasing the young generation space it decreases the frequency at which we have to run GC because more objects can accumulate before we reach 75% capacity. 
By increasing the young generation and the MaxTenuringThreshold you give the short-lived objects more time to die, and dead objects don't get promoted. The vast majority of objects in C* are ephemeral short-lived objects. The only things that should live in tenured space are the key cache and, from release 2.1, memtables. If most objects die in survivor space, you've solved the long GC pauses for both young gen and tenured spaces. As a data point, on the mixed cluster where we've been experimenting with these options most aggressively, the longest CMS pause in a 24 hour period went from 10s to less than 900ms, and most nodes experienced a max of less than 500ms. This is just the max CMS, which could include an outlier like defragmentation; average CMS is significantly less, under 100ms. For ParNew collections we went from many, many pauses in excess of 200ms to a max of 15ms cluster wide and an average of 5ms. ParNew collection frequency decreased from one per second to one every 10s in the worst case, with an average of one every 16 seconds. This also unlocks additional throughput on large machines. For 20-core machines I was able to increase throughput from 75k TPS to 110-120k TPS. For a 40-core machine we more than doubled request throughput and significantly increased compaction throughput. I've asked a number of other larger customers to help validate the new settings. I now view GC pauses as a mostly solvable issue. Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams It's been found that the old Twitter recommendations of 100m per core up to 800m are harmful and should no longer be used. Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 1/3 is probably better for releases greater than 2.1.
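In cassandra-env.sh terms, the tuning direction described above might look like the fragment below. The flag names are real HotSpot options, but every value here is an assumption for illustration, not a recommendation from the ticket; tune per machine.

```shell
# Hypothetical cassandra-env.sh overrides reflecting the comment above:
# a larger young generation plus a higher tenuring threshold gives
# short-lived objects more collections in which to die before promotion.
JVM_OPTS="$JVM_OPTS -Xmn4G"                     # assumed young-gen size
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4" # up from Cassandra's default of 1
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"        # eden : survivor sizing
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
```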
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-4175) Reduce memory, disk space, and cpu usage with a column name/id map
[ https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219899#comment-14219899 ] Jon Haddad commented on CASSANDRA-4175: --- Probably a stupid question, but will making the schema be set external to the SSTable make it harder or impossible to move an sstable to a different cluster, since the column data is no longer there? Reduce memory, disk space, and cpu usage with a column name/id map -- Key: CASSANDRA-4175 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Labels: performance Fix For: 3.0 We spend a lot of memory on column names, both transiently (during reads) and more permanently (in the row cache). Compression mitigates this on disk but not on the heap. The overhead is significant for typical small column values, e.g., ints. Even though we intern once we get to the memtable, this affects writes too via very high allocation rates in the young generation, hence more GC activity. Now that CQL3 provides us some guarantees that column names must be defined before they are inserted, we could create a map of (say) 32-bit int column id, to names, and use that internally right up until we return a resultset to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/68fdb2db Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/68fdb2db Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/68fdb2db Branch: refs/heads/trunk Commit: 68fdb2db2f47ce7fafdc7c09c9805286eb133029 Parents: 201a055 705e5e4 Author: Brandon Williams brandonwilli...@apache.org Authored: Thu Nov 20 14:32:08 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Thu Nov 20 14:32:08 2014 -0600 -- debian/control | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[1/3] cassandra git commit: bump python version to 2.7
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 2291a60e9 -> 705e5e47d refs/heads/trunk 201a05511 -> 68fdb2db2 bump python version to 2.7 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/705e5e47 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/705e5e47 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/705e5e47 Branch: refs/heads/cassandra-2.1 Commit: 705e5e47d18d572cbd7907f264bcd02d87890c02 Parents: 2291a60 Author: Brandon Williams brandonwilli...@apache.org Authored: Thu Nov 20 14:31:59 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Thu Nov 20 14:31:59 2014 -0600 -- debian/control | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/705e5e47/debian/control --
diff --git a/debian/control b/debian/control
index a48441b..8a30de1 100644
--- a/debian/control
+++ b/debian/control
@@ -11,7 +11,7 @@ Standards-Version: 3.8.3
 Package: cassandra
 Architecture: all
-Depends: openjdk-7-jre-headless | java7-runtime, adduser, python (>= 2.5), python-support (>= 0.90.0), ${misc:Depends}
+Depends: openjdk-7-jre-headless | java7-runtime, adduser, python (>= 2.7), python-support (>= 0.90.0), ${misc:Depends}
 Recommends: ntp | time-daemon
 Suggests: cassandra-tools
 Conflicts: apache-cassandra1
[2/3] cassandra git commit: bump python version to 2.7
bump python version to 2.7 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/705e5e47 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/705e5e47 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/705e5e47 Branch: refs/heads/trunk Commit: 705e5e47d18d572cbd7907f264bcd02d87890c02 Parents: 2291a60 Author: Brandon Williams brandonwilli...@apache.org Authored: Thu Nov 20 14:31:59 2014 -0600 Committer: Brandon Williams brandonwilli...@apache.org Committed: Thu Nov 20 14:31:59 2014 -0600 -- debian/control | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/705e5e47/debian/control --
diff --git a/debian/control b/debian/control
index a48441b..8a30de1 100644
--- a/debian/control
+++ b/debian/control
@@ -11,7 +11,7 @@ Standards-Version: 3.8.3
 Package: cassandra
 Architecture: all
-Depends: openjdk-7-jre-headless | java7-runtime, adduser, python (>= 2.5), python-support (>= 0.90.0), ${misc:Depends}
+Depends: openjdk-7-jre-headless | java7-runtime, adduser, python (>= 2.7), python-support (>= 0.90.0), ${misc:Depends}
 Recommends: ntp | time-daemon
 Suggests: cassandra-tools
 Conflicts: apache-cassandra1
[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219946#comment-14219946 ] Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124: - [~yukim], Thanks for the response. Some clarifications please. 1) Can you please tell me how you captured the exception? I searched system.log and could not find it. Instead of casting the executor, is it OK if I cast the Future<Object> returned by executor.submit() to ListenableFuture<Object>? When I executed the command ./bin/nodetool -h localhost cleanup, it produced the following lines in the console and in system.log: INFO 20:07:08 No sstables for system_traces.sessions INFO 20:07:08 No sstables for system_traces.events The code that generates the above output, from StorageService.java, is below: {code}
private FutureTask<Object> createCleanupTask(final int cmd, final String keyspace, final ColumnFamilyStore cfStore)
{
    return new FutureTask<Object>(new WrappedRunnable()
    {
        protected void runMayThrow() throws Exception
        {
            Iterable<SSTableReader> compactingSSTables = cfStore.markAllCompacting();
            if (compactingSSTables == null)
            {
                logger.info("Aborting operation on {}.{} after failing to interrupt other compaction operations", cfStore.keyspace.getName(), cfStore.name);
                return;
            }
            if (Iterables.isEmpty(compactingSSTables))
            {
                logger.info("No sstables for {}.{}", cfStore.keyspace.getName(), cfStore.name);
                return;
            }
            String message = String.format("Starting cleanup command #%d, cleaning up keyspace %s with column family store %s", cmd, keyspace, cfStore.name);
            logger.info(message);
            sendNotification("cleanup", message, new int[]{cmd, ActiveRepairService.Status.STARTED.ordinal()});
            List<ListenableFuture<Object>> futures = cfStore.forceAsyncCleanup();
            for (final ListenableFuture<Object> future : futures)
            {
                Futures.addCallback(future, new FutureCallback<Object>()
                {
                    public void onFailure(Throwable thrown)
                    {
                        String message = "Failed cleanup job " + future.toString() + " with exception: " + thrown.getMessage();
                        logger.info(message);
                        sendNotification("cleanup", message, new int[]{cmd, ActiveRepairService.Status.SESSION_FAILED.ordinal()});
                    }
                    public void onSuccess(Object result)
                    {
                        String message = "Cleanup Session: " + future.toString();
                        logger.info(message);
                        sendNotification("cleanup", message, new int[]{cmd, ActiveRepairService.Status.SESSION_SUCCESS.ordinal()});
                    }
                });
                future.get();
            }
            cfStore.getDataTracker().unmarkCompacting(compactingSSTables);
            message = String.format("Ending cleanup command #%d, cleaning up keyspace %s with column family store %s", cmd, keyspace, cfStore.name);
            logger.info(message);
            sendNotification("cleanup", message, new int[]{cmd, ActiveRepairService.Status.FINISHED.ordinal()});
        }
    }, null);
}
{code} 2) Where are the current tests for the cleanup located? 3) Regarding your comment "Make original sync method to use async method and block there. Those two have duplicate codes." - did you mean changing the original forceKeyspaceCleanup method in StorageService.java? Please clarify. Thanks a lot Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana
[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219965#comment-14219965 ] Yuki Morishita commented on CASSANDRA-7124: --- 1) 2) CleanupTest is 'test/unit/org/apache/cassandra/db/CleanupTest.java'. Add a test there and check that the async version works the same as the sync version. 3) I mean the sync and async methods in CompactionManager. {{performCleanup}} can call {{performAsyncCleanup}} internally so that you don't have to duplicate code. Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: lhf Fix For: 3.0 Attachments: cassandra-trunk-cleanup-7124.txt If {{nodetool cleanup}} or some other long-running operation takes too long to complete, you'll see an error like the one in CASSANDRA-2126, so you can't tell if the operation completed successfully or not. CASSANDRA-4767 fixed this for repairs with JMX notifications. We should do something similar for nodetool cleanup, compact, decommission, move, relocate, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
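Yuki's point 3 can be sketched in a few lines: the synchronous method delegates to the asynchronous one and blocks on its futures, so the cleanup logic lives in exactly one place. The class and method signatures below are illustrative stand-ins, not Cassandra's actual CompactionManager API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CleanupSketch
{
    // Daemon threads so the JVM can exit without an explicit shutdown.
    static final ExecutorService executor = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // Async version: submits one task per sstable and returns the futures immediately.
    static List<Future<String>> performAsyncCleanup(List<String> sstables)
    {
        List<Future<String>> futures = new ArrayList<>();
        for (String sstable : sstables)
            futures.add(executor.submit(() -> "cleaned " + sstable));
        return futures;
    }

    // Sync version: delegates to the async one and blocks on each future,
    // so no logic is duplicated between the two entry points.
    static List<String> performCleanup(List<String> sstables)
    {
        List<String> results = new ArrayList<>();
        try
        {
            for (Future<String> future : performAsyncCleanup(sstables))
                results.add(future.get());
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
        return results;
    }

    public static void main(String[] args)
    {
        System.out.println(performCleanup(Arrays.asList("a", "b"))); // prints [cleaned a, cleaned b]
    }
}
```

The JMX-notification callbacks from the patch would attach to the futures returned by the async method, exactly as in the {{Futures.addCallback}} code quoted above.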
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219967#comment-14219967 ] Jeremiah Jordan commented on CASSANDRA-8150: While I agree all of that sounds nice for read heavy workloads, have you used these settings with a write heavy workload? In my experience, when you have a write heavy workload your young gen fills up with memtable data, which will and should be promoted to old gen. So if you set your young gen size high, it takes forever to copy all that stuff to old gen. If you increase the MaxTenuringThreshold it makes that even worse, as all of the memtable data has to get copied back and forth inside young gen X times, and then even more memtable stuff builds up, so the copy to old gen takes that much longer. Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams It's been found that the old twitter recommendations of 100m per core up to 800m is harmful and should no longer be used. Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
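For concreteness, the sizing rule proposed in the ticket (1/4 of max heap, capped at 2G; whether 1/3 or 1/4 is better is still being debated) works out as below. This is just arithmetic on the proposal, not code from Cassandra's actual cassandra-env.sh heap calculation.

```java
public class NewGenSizing
{
    static final long GB = 1024L * 1024 * 1024;

    // Proposed rule: new gen = min(maxHeap / 4, 2G).
    // The old Twitter-era rule was 100 MB per core, capped at 800 MB.
    static long proposedNewGen(long maxHeapBytes)
    {
        return Math.min(maxHeapBytes / 4, 2 * GB);
    }

    public static void main(String[] args)
    {
        System.out.println(proposedNewGen(8 * GB) / (1024 * 1024)); // 8G heap -> 2048 MB
        System.out.println(proposedNewGen(4 * GB) / (1024 * 1024)); // 4G heap -> 1024 MB
        System.out.println(proposedNewGen(16 * GB) / (1024 * 1024)); // 16G heap -> capped at 2048 MB
    }
}
```

Under the old rule an 8-core box got 800 MB of new gen regardless of heap size; under the proposal an 8G heap gets 2G, which is the behavior the attached GC graphs are arguing for.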
[jira] [Commented] (CASSANDRA-4175) Reduce memory, disk space, and cpu usage with a column name/id map
[ https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219973#comment-14219973 ] Aleksey Yeschenko commented on CASSANDRA-4175: -- bq. Probably a stupid question, but will making the schema be set external to the SSTable make it harder or impossible to move an sstable to a different cluster, since the column data is no longer there? Nah, we can just encode the {name - id} map in sstable metadata. Reduce memory, disk space, and cpu usage with a column name/id map -- Key: CASSANDRA-4175 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Labels: performance Fix For: 3.0 We spend a lot of memory on column names, both transiently (during reads) and more permanently (in the row cache). Compression mitigates this on disk but not on the heap. The overhead is significant for typical small column values, e.g., ints. Even though we intern once we get to the memtable, this affects writes too via very high allocation rates in the young generation, hence more GC activity. Now that CQL3 provides us some guarantees that column names must be defined before they are inserted, we could create a map of (say) 32-bit int column id, to names, and use that internally right up until we return a resultset to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219986#comment-14219986 ] Jeremiah Jordan commented on CASSANDRA-7386: bq. what were there problems? The problem was without the patch, the test hit the issue the patch was meant to fix... one disk filling up completely and crashing things. JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm change a little to have some maximum range of utilization where it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full it will never pick A over B until it balances out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
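The threshold rule proposed in the ticket description can be sketched directly: when the utilization spread across disks exceeds some maximum range, pick by free space; otherwise keep the current least-busy-first behavior. The class, field names, and 10% threshold below are illustrative assumptions, not Cassandra's actual Directories code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DiskPickerSketch
{
    // Hypothetical model of a JBOD data directory.
    static class Disk
    {
        final String path;
        final int activeTasks;
        final double utilization; // fraction of the disk in use

        Disk(String path, int activeTasks, double utilization)
        {
            this.path = path;
            this.activeTasks = activeTasks;
            this.utilization = utilization;
        }
    }

    // If the utilization spread exceeds maxSpread, pick the emptiest disk
    // (accepting it may be slower); otherwise pick the least busy one.
    static Disk pick(List<Disk> disks, double maxSpread)
    {
        double min = disks.stream().mapToDouble(d -> d.utilization).min().getAsDouble();
        double max = disks.stream().mapToDouble(d -> d.utilization).max().getAsDouble();
        Comparator<Disk> byFreeSpace = Comparator.comparingDouble(d -> d.utilization);
        Comparator<Disk> byLoad = Comparator.comparingInt(d -> d.activeTasks);
        return disks.stream().min(max - min > maxSpread ? byFreeSpace : byLoad).get();
    }

    public static void main(String[] args)
    {
        Disk a = new Disk("/data1", 0, 0.30); // idle but fuller
        Disk b = new Disk("/data2", 3, 0.05); // busy but emptier
        // Spread is 25%, above the 10% threshold, so free space wins over load:
        System.out.println(pick(Arrays.asList(a, b), 0.10).path); // prints /data2
    }
}
```

With a looser threshold (say 50%), the same call falls back to the current behavior and returns the idle /data1, which is the two-mode design the ticket is asking for.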
[jira] [Updated] (CASSANDRA-4123) vnodes aware Replication Strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-4123: --- Description: I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a replica set for each host and then treating it in a similar way to a DC (ie. RF replicas are chosen from that set, instead of from the whole cluster). was: The simplest implementation for this would be if NTS regarded a single host as a distinct rack. This would prevent replicas being placed on the same host. The rest of the logic for replica selection would be identical to NTS (but this would be removing a level of topology hierarchy). This would be achievable just by writing a snitch to place hosts in their own rack. A better solution would be to add an extra level of hierarchy to NTS so that it still supported DC rack, and IP would be the new level at the bottom of the hierarchy. The logic would remain largely the same. I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a replica set for each host and then treating it in a similar way to a DC (ie. RF replicas are chosen from that set, instead of from the whole cluster). vnodes aware Replication Strategy -- Key: CASSANDRA-4123 URL: https://issues.apache.org/jira/browse/CASSANDRA-4123 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Sam Overton Labels: vnodes Fix For: 3.0 I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a replica set for each host and then treating it in a similar way to a DC (ie. 
RF replicas are chosen from that set, instead of from the whole cluster). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-4123) Investigate Distribution Factor based Replication Strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-4123: --- Summary: Investigate Distribution Factor based Replication Strategy (was: vnodes aware Replication Strategy ) Investigate Distribution Factor based Replication Strategy --- Key: CASSANDRA-4123 URL: https://issues.apache.org/jira/browse/CASSANDRA-4123 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Sam Overton Labels: vnodes Fix For: 3.0 I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a replica set for each host and then treating it in a similar way to a DC (ie. RF replicas are chosen from that set, instead of from the whole cluster). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8345) Client notifications should carry the entire delta of the information that changed
[ https://issues.apache.org/jira/browse/CASSANDRA-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219994#comment-14219994 ] Aleksey Yeschenko commented on CASSANDRA-8345: -- CASSANDRA-6038 is supposed to do something like this for the schema push protocol. However doing this for the native protocol is not trivial at all - in a useful way, anyway. The issue here is that schema changes can happen on different nodes, and thus events can also be triggered by different nodes, and out of their logical order. Internally, we rely on the change timestamps (and timestamps in the system schema tables) for reconciliation. So for this to work, the drivers would have to replicate that same reconciliation logic. What we could do safely is have ALTER TABLE client notifications specify whether the change was made to params or to columns, so that you'd only have to fetch updated info from one table (pre-3.0, either schema_columnfamilies or schema_columns). But I'm not sure it improves things enough to warrant a change there. Client notifications should carry the entire delta of the information that changed -- Key: CASSANDRA-8345 URL: https://issues.apache.org/jira/browse/CASSANDRA-8345 Project: Cassandra Issue Type: Improvement Reporter: Michaël Figuière Labels: protocolv4 Currently when the schema changes, a {{SCHEMA_CHANGE}} notification is sent to the client to let it know that a modification happened in a specific table or keyspace. If the client registers for these notifications, it is likely that it actually cares to have an up to date version of this information, so the next step is logically for the client to query the {{system}} keyspace to retrieve the latest version of the schema for the particular element that was mentioned in the notification. The same thing happens with the {{TOPOLOGY_CHANGE}} notification, as the client will follow up with a query to retrieve the details that changed in the {{system.peers}} table. It would be interesting to send the entire delta of the information that changed within the notification. I see several advantages to this: * This would ensure that the data sent to the client is as small as possible, as such a delta will always be smaller than the resultset that would eventually be received for a formal query on the {{system}} keyspace. * This avoids the Cassandra node receiving plenty of queries after it issues a notification; instead it prepares a delta once and sends it to everybody. * This should improve the overall behaviour when dealing with very large schemas with frequent changes (typically due to an attempt at implementing multitenancy through separate keyspaces), as it has been observed that the notifications and subsequent queries traffic can become non-negligible in this case. * This would eventually simplify the driver design by removing the need for an extra asynchronous operation to follow up with, although the benefit of this point will only be real once the previous versions of the protocol are far behind. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220007#comment-14220007 ] Robert Stupp commented on CASSANDRA-7386: - bq. one disk filling up completely and crashing things just because of reservation? oops JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm change a little to have some maximum range of utilization where it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full it will never pick A over B until it balances out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8231) Wrong size of cached prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220009#comment-14220009 ] Dave Brosius commented on CASSANDRA-8231: - +1 Wrong size of cached prepared statements Key: CASSANDRA-8231 URL: https://issues.apache.org/jira/browse/CASSANDRA-8231 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jaroslav Kamenik Assignee: Benjamin Lerer Fix For: 2.1.3 Attachments: 8231-notes.txt, CASSANDRA-8231-V2.txt, CASSANDRA-8231.txt, Unsafes.java Cassandra counts the memory footprint of prepared statements for caching purposes. It seems that there is a problem with some statements, e.g. SelectStatement. Even simple selects are counted as 100KB objects; updates, deletes etc. come to a few hundred or thousand bytes. The result is that the cache - QueryProcessor.preparedStatements - holds just a fraction of the statements. I dug a little into the code, and it seems the problem is in jamm, in class MemoryMeter. It seems that if an instance contains a reference to a class, it counts the size of the whole class too. SelectStatement references EnumSet through ResultSet.Metadata, and EnumSet holds a reference to the Enum class... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220017#comment-14220017 ] Jeremiah Jordan commented on CASSANDRA-7386: bq. just because of reservation? oops oh sorry, misunderstood you. The problem with reservations was CASSANDRA-5605. We would end up reserving the whole disk, so flushing couldn't happen, and then the heap would fill up, and you would OOM. We reserved the max possible space, but for workloads with overwrites the resulting file is way smaller than the reservation, so we didn't actually need all that space. Basically we were declaring the disk full before the disk was actually full. The other problem is that when reserving space, if multiple compactions are in progress, you reserve the max needed by all of them. But they finish at different times, and when they finish all the old files get removed. So again you are declaring the disk full when it will not actually be full. JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm change a little to have some maximum range of utilization where it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full it will never pick A over B until it balances out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8345) Client notifications should carry the entire delta of the information that changed
[ https://issues.apache.org/jira/browse/CASSANDRA-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220039#comment-14220039 ] Bulat Shakirzyanov commented on CASSANDRA-8345: --- Each driver receives notifications only from a single node, so why would it need to reconcile? Isn't the notification pushed from the node after the schema has been reconciled? Client notifications should carry the entire delta of the information that changed -- Key: CASSANDRA-8345 URL: https://issues.apache.org/jira/browse/CASSANDRA-8345 Project: Cassandra Issue Type: Improvement Reporter: Michaël Figuière Labels: protocolv4 Currently when the schema changes, a {{SCHEMA_CHANGE}} notification is sent to the client to let it know that a modification happened in a specific table or keyspace. If the client registers for these notifications, it is likely that it actually cares to have an up to date version of this information, so the next step is logically for the client to query the {{system}} keyspace to retrieve the latest version of the schema for the particular element that was mentioned in the notification. The same thing happens with the {{TOPOLOGY_CHANGE}} notification, as the client will follow up with a query to retrieve the details that changed in the {{system.peers}} table. It would be interesting to send the entire delta of the information that changed within the notification. I see several advantages to this: * This would ensure that the data sent to the client is as small as possible, as such a delta will always be smaller than the resultset that would eventually be received for a formal query on the {{system}} keyspace. * This avoids the Cassandra node receiving plenty of queries after it issues a notification; instead it prepares a delta once and sends it to everybody. * This should improve the overall behaviour when dealing with very large schemas with frequent changes (typically due to an attempt at implementing multitenancy through separate keyspaces), as it has been observed that the notifications and subsequent queries traffic can become non-negligible in this case. * This would eventually simplify the driver design by removing the need for an extra asynchronous operation to follow up with, although the benefit of this point will only be real once the previous versions of the protocol are far behind. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220050#comment-14220050 ] Robert Stupp commented on CASSANDRA-7386: - Thanks :) JBOD threshold to prevent unbalanced disk utilization - Key: CASSANDRA-7386 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Robert Stupp Priority: Minor Fix For: 2.1.3 Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, test_regression_no_patch.jpg, test_regression_with_patch.jpg Currently the disks are picked first by number of current tasks, then by free space. This helps with performance but can lead to large differences in utilization in some (unlikely but possible) scenarios. I've seen 55% to 10% and heard reports of 90% to 10% on IRC. With both LCS and STCS (although my suspicion is that STCS makes it worse since it is harder to keep balanced). I propose the algorithm change a little to have some maximum range of utilization where it will pick by free space over load (acknowledging it can be slower). So if disk A is 30% full and disk B is 5% full it will never pick A over B until it balances out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
[ https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219829#comment-14219829 ] graham sanderson edited comment on CASSANDRA-8325 at 11/20/14 10:16 PM: You might also want to try putting them (without the println) in a long tight loop to get them out of the interpreter was (Author: graham sanderson): You might also want to try putting them (without the println in a long tight loop to get them out of the interpreter also) Cassandra 2.1.x fails to start on FreeBSD (JVM crash) - Key: CASSANDRA-8325 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325 Project: Cassandra Issue Type: Bug Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit Server VM Reporter: Leonid Shalupov Attachments: hs_err_pid1856.log, system.log See attached error file after JVM crash {quote} FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 {quote} {quote} % java -version openjdk version 1.7.0_71 OpenJDK Runtime Environment (build 1.7.0_71-b14) OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode) {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8349) ALTER KEYSPACE causes tables not to be found
Joseph Chu created CASSANDRA-8349: - Summary: ALTER KEYSPACE causes tables not to be found Key: CASSANDRA-8349 URL: https://issues.apache.org/jira/browse/CASSANDRA-8349 Project: Cassandra Issue Type: Bug Reporter: Joseph Chu Priority: Minor Running Cassandra 2.1.2 on a single node. Reproduction steps in cqlsh: CREATE KEYSPACE a WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; CREATE TABLE a.a (a INT PRIMARY KEY); INSERT INTO a.a (a) VALUES (1); SELECT * FROM a.a; ALTER KEYSPACE a WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}; SELECT * FROM a.a; DESCRIBE KEYSPACE a Errors: Column family 'a' not found Workaround(?): Restart the instance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220149#comment-14220149 ] Matt Stump commented on CASSANDRA-8150: --- I don't disagree with your experience but I do disagree with the description of what is happening. With the GC frequency that I described above, the memtable will be moved to tenured space after about 60-80 seconds. All of the individual requests will create ephemeral objects which would be ideally handled by ParNew. Where we went wrong was growing the heap but not also increasing MaxTenuringThreshold. By default we set MaxTenuringThreshold to 1, which means promote everything that survives 2 GCs to tenured; coupled with a small heap for the workload, this results in a very high promotion rate, which is why we see the delays. The key is to always increase MaxTenuringThreshold and young gen more or less proportionally. From the perspective of GC and the creation rate of ephemeral objects, reads and writes are more or less identical. One could possibly even make the case that writes are better suited for the settings I've outlined above, because writes should put less pressure on eden due to the simpler request path. In my opinion, and I hope to have data to back this up soon, write heavy vs read heavy GC tuning is mostly a red herring. Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams It's been found that the old twitter recommendations of 100m per core up to 800m is harmful and should no longer be used. Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8350) Drop,Create CF, insert 658, one record fail to insert
Harpreet Kaur created CASSANDRA-8350: Summary: Drop,Create CF, insert 658, one record fail to insert Key: CASSANDRA-8350 URL: https://issues.apache.org/jira/browse/CASSANDRA-8350 Project: Cassandra Issue Type: Bug Reporter: Harpreet Kaur There was a change in the definition of a CF, so we dropped the CF, created the CF, and inserted 658 records. cqlsh: select count(*) from CF; returns 657. One record did not insert. Tried inserting it as a one-off: no errors in cqlsh, but the record still did not insert. With tracing on, select * from CF where id='blah'; shows Read 0 live and 73 tombstoned cells. Changed gc_grace_seconds to 600 for the CF, then performed nodetool flush, compact, and repair on all 12 nodes; it did not help. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Stump updated CASSANDRA-8150: -- Attachment: upload.png Just to emphasize the point I just got word of another unrelated customer that rolled out the changes. Here is a graph of their GC activity. Additionally, write latency was cut in half. Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams Attachments: upload.png It's been found that the old twitter recommendations of 100m per core up to 800m is harmful and should no longer be used. Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8351) Running COPY FROM in cqlsh aborts with errors or segmentation fault
Joseph Chu created CASSANDRA-8351: - Summary: Running COPY FROM in cqlsh aborts with errors or segmentation fault Key: CASSANDRA-8351 URL: https://issues.apache.org/jira/browse/CASSANDRA-8351 Project: Cassandra Issue Type: Bug Reporter: Joseph Chu Priority: Minor Attachments: stress.cql, stress.csv Running the Cassandra 2.1.2 binary tarball on a single instance. Put together a script to try to reproduce this using data generated by cassandra-stress. Reproduction steps: download the attached files and run cqlsh -f stress.cql. This may need to be run a couple of times before errors are encountered; it reproduces most readily after a fresh install. Errors seen: 1. Segmentation fault (core dumped) 2. stress.cql:24:line contains NULL byte stress.cql:24:Aborting import at record #0. Previously-inserted values still present. 71 rows imported in 0.100 seconds. 3. *** glibc detected *** python: corrupted double-linked list: 0x01121ad0 *** === Backtrace: === /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7f80fe0cdb96] /lib/x86_64-linux-gnu/libc.so.6(+0x7fead)[0x7f80fe0ceead] python[0x42615d] python[0x501dc8] python[0x4ff715] python[0x425d02] python(PyEval_EvalCodeEx+0x1c4)[0x575db4] python[0x577be2] python(PyObject_Call+0x36)[0x4d91b6] python(PyEval_EvalFrameEx+0x2035)[0x54d8a5] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python(PyEval_EvalFrameEx+0x7b8)[0x54c028] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python(PyEval_EvalFrameEx+0x7b8)[0x54c028] python(PyEval_EvalFrameEx+0xa02)[0x54c272] python(PyEval_EvalFrameEx+0xa02)[0x54c272] python(PyEval_EvalFrameEx+0xa02)[0x54c272] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python(PyEval_EvalFrameEx+0x7b8)[0x54c028] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python(PyEval_EvalFrameEx+0x7b8)[0x54c028] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python[0x577be2] python(PyObject_Call+0x36)[0x4d91b6] python(PyEval_EvalFrameEx+0x2035)[0x54d8a5] python(PyEval_EvalFrameEx+0xa02)[0x54c272] python(PyEval_EvalFrameEx+0xa02)[0x54c272] python(PyEval_EvalCodeEx+0x1a2)[0x575d92] python[0x577ab0] python(PyObject_Call+0x36)[0x4d91b6] python[0x4c91fa] python(PyObject_Call+0x36)[0x4d91b6] python(PyEval_CallObjectWithKeywords+0x36)[0x4d97c6] python[0x4f7f58] /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f80ff369e9a] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f80fe1433fd] === Memory map: === (process memory map omitted; its hex address ranges were corrupted in transit)
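The "line contains NULL byte" message in error 2 is the complaint Python 2's csv module raises when the input file carries a NUL (0x00) byte, and cqlsh's COPY FROM parses CSV input with that module. A minimal, illustrative way to check whether a rejected file actually contains NUL bytes (the function name and sample data are made up for this sketch, not part of cqlsh):

```python
def find_nul_lines(data):
    """Return the 1-based line numbers whose raw bytes contain a NUL (0x00).

    Python 2's csv module aborts with "line contains NULL byte" on such
    lines, which is what cqlsh's COPY FROM surfaces to the user.
    """
    return [i for i, line in enumerate(data.splitlines(), start=1)
            if b"\x00" in line]

# Example: the second line carries a stray NUL byte.
sample = b"k,p\n1,1\x00\n1,2\n"
print(find_nul_lines(sample))  # [2]
```

Running a check like this against the attached stress.csv would distinguish a genuinely corrupt input file from corruption introduced by cqlsh itself.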
[jira] [Updated] (CASSANDRA-8087) Multiple non-DISTINCT rows returned when page_size set
[ https://issues.apache.org/jira/browse/CASSANDRA-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8087: --- Reproduced In: 2.0.11, 2.0.10, 2.0.9 (was: 2.0.10, 2.0.11) Since Version: 2.0.9 Multiple non-DISTINCT rows returned when page_size set -- Key: CASSANDRA-8087 URL: https://issues.apache.org/jira/browse/CASSANDRA-8087 Project: Cassandra Issue Type: Bug Components: Core Reporter: Adam Holmberg Assignee: Tyler Hobbs Priority: Minor Fix For: 2.0.12 Using the following statements to reproduce: {code} CREATE TABLE test ( k int, p int, s int static, PRIMARY KEY (k, p) ); INSERT INTO test (k, p) VALUES (1, 1); INSERT INTO test (k, p) VALUES (1, 2); SELECT DISTINCT k, s FROM test; {code} Native clients that set result_page_size in the query message receive multiple non-distinct rows back (one per clustering value p in partition k). This reproduces on 2.0.10 but does not appear in 2.1.0. It also does not appear in cqlsh for 2.0.10, because cqlsh uses Thrift there. See https://datastax-oss.atlassian.net/browse/PYTHON-164 for background -- This message was sent by Atlassian JIRA (v6.3.4#6332)
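As a reference point for the expected behavior: SELECT DISTINCT should yield exactly one row per partition key, no matter how many clustering rows (values of p) the partition holds. A client-side sketch of that deduplication, using plain tuples in place of driver rows (illustrative only, not the driver's API):

```python
def distinct_by_partition_key(rows):
    """Collapse (k, s) result rows to one entry per partition key k,
    preserving first-seen order -- what SELECT DISTINCT k, s should
    return even when the server paginates the response."""
    seen = set()
    out = []
    for k, s in rows:
        if k not in seen:
            seen.add(k)
            out.append((k, s))
    return out

# The buggy paged response for the repro above: one (k, s) row per
# clustering row p in partition k=1, instead of a single row.
paged_rows = [(1, None), (1, None)]
print(distinct_by_partition_key(paged_rows))  # [(1, None)]
```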
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220212#comment-14220212 ] T Jake Luciani commented on CASSANDRA-8150: --- Let's run some cstar tests with write and read workloads... Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams Attachments: upload.png It's been found that the old Twitter recommendation of 100m per core up to 800m is harmful and should no longer be used. Instead, the formula should be 1/3 or 1/4 of max heap, with a cap of 2G. 1/3 vs. 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
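The proposed change is easy to sketch numerically. The helper below contrasts the old rule (100 MB per core, capped at 800 MB) with the ticket's proposal (a fraction of max heap, capped at 2 GB); the function names are illustrative, not the variables used in cassandra-env.sh:

```python
def old_newgen_mb(cores):
    # Old Twitter recommendation: 100 MB per core, capped at 800 MB.
    return min(100 * cores, 800)

def proposed_newgen_mb(max_heap_mb, fraction=0.25):
    # Proposal: 1/4 (or 1/3) of max heap, capped at 2 GB.
    return min(int(max_heap_mb * fraction), 2048)

# An 8 GB heap on a 16-core box: the old rule stops at 800 MB,
# while the proposed rule grows the new gen to the 2 GB cap.
print(old_newgen_mb(16), proposed_newgen_mb(8192))  # 800 2048
```

The practical effect is that large-heap nodes get a much bigger new generation, which is the point of the ticket title.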
[jira] [Commented] (CASSANDRA-8350) Drop,Create CF, insert 658, one record fail to insert
[ https://issues.apache.org/jira/browse/CASSANDRA-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220218#comment-14220218 ] Jeremiah Jordan commented on CASSANDRA-8350: Most likely you have a future-dated tombstone or something similar. I would dump the sstable with sstable2json and see what is in there for that id. Drop,Create CF, insert 658, one record fail to insert -- Key: CASSANDRA-8350 URL: https://issues.apache.org/jira/browse/CASSANDRA-8350 Project: Cassandra Issue Type: Bug Reporter: Harpreet Kaur There was a change in the definition of a CF, so we dropped the CF, recreated it, and inserted 658 rows. cqlsh: select count(*) from CF; returns 657. One record did not insert. Inserting it as a one-off shows no errors in cqlsh, but the record is still missing. With tracing on, select * from CF where id='blah'; shows Read 0 live and 73 tombstoned cells. We changed gc_grace_seconds to 600 for the CF and then performed nodetool flush, compact, and repair on all 12 nodes; this did not help. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220242#comment-14220242 ] T Jake Luciani commented on CASSANDRA-8150: --- I also learned we should not be using biased locking. Here is a sample run showing 2.1 without and with biased locking disabled: http://cstar.datastax.com/graph?stats=0f0ec9a6-710c-11e4-af11-bc764e04482c&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=98.89&ymin=0&ymax=273028.8 {code} -XX:-UseBiasedLocking {code} Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams Attachments: upload.png It's been found that the old Twitter recommendation of 100m per core up to 800m is harmful and should no longer be used. Instead, the formula should be 1/3 or 1/4 of max heap, with a cap of 2G. 1/3 vs. 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajanarayanan Thottuvaikkatumana updated CASSANDRA-7124: Attachment: cassandra-trunk-cleanup-7124.txt Patch for cleanup as in CASSANDRA-7124 Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: lhf Fix For: 3.0 Attachments: cassandra-trunk-cleanup-7124.txt If {{nodetool cleanup}} or some other long-running operation takes too long to complete, you'll see an error like the one in CASSANDRA-2126, so you can't tell if the operation completed successfully or not. CASSANDRA-4767 fixed this for repairs with JMX notifications. We should do something similar for nodetool cleanup, compact, decommission, move, relocate, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajanarayanan Thottuvaikkatumana updated CASSANDRA-7124: Attachment: (was: cassandra-trunk-cleanup-7124.txt) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: lhf Fix For: 3.0 Attachments: cassandra-trunk-cleanup-7124.txt If {{nodetool cleanup}} or some other long-running operation takes too long to complete, you'll see an error like the one in CASSANDRA-2126, so you can't tell if the operation completed successfully or not. CASSANDRA-4767 fixed this for repairs with JMX notifications. We should do something similar for nodetool cleanup, compact, decommission, move, relocate, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
[ https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220264#comment-14220264 ] Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124: - [~yukim], as per your suggestions, I consolidated the performCleanup call, fixed the class cast exception, etc., did some testing, and attached the patch. Please have a look at it. Here are some of the test results. I ran the ./bin/nodetool -h localhost cleanup command and found it working; here is the output I could see on the console {code} INFO 23:44:21 Starting cleanup command #2, cleaning up keyspace system_traces with column family store sessions INFO 23:44:21 Starting cleanup command #1, cleaning up keyspace system_traces with column family store events INFO 23:44:21 No sstables for system_traces.events INFO 23:44:21 No sstables for system_traces.sessions INFO 23:44:21 Ending cleanup command #2, cleaning up keyspace system_traces with column family store sessions INFO 23:44:21 Ending cleanup command #1, cleaning up keyspace system_traces with column family store events {code} I ran the CleanupTest (since performCleanup calls the changed code, I didn't have to make any changes to the test code) and found it passing. Here is the test output.
{code} Rajanarayanans-MacBook-Pro:cassandra-trunk RajT$ ant test -Dtest.name=CleanupTest Buildfile: /Users/RajT/cassandra-source/cassandra-trunk/build.xml init: maven-ant-tasks-localrepo: maven-ant-tasks-download: maven-ant-tasks-init: maven-declare-dependencies: maven-ant-tasks-retrieve-build: init-dependencies: [echo] Loading dependency paths from file: /Users/RajT/cassandra-source/cassandra-trunk/build/build-dependencies.xml [unzip] Expanding: /Users/RajT/cassandra-source/cassandra-trunk/build/lib/jars/org.jacoco.agent-0.7.1.201405082137.jar into /Users/RajT/cassandra-source/cassandra-trunk/build/lib/jars check-gen-cql3-grammar: gen-cql3-grammar: build-project: [echo] apache-cassandra: /Users/RajT/cassandra-source/cassandra-trunk/build.xml createVersionPropFile: [propertyfile] Updating property file: /Users/RajT/cassandra-source/cassandra-trunk/src/resources/org/apache/cassandra/config/version.properties [copy] Copying 1 file to /Users/RajT/cassandra-source/cassandra-trunk/build/classes/main build: build-test: test: testlist: [echo] running test bucket 0 tests [mkdir] Created dir: /Users/RajT/cassandra-source/cassandra-trunk/build/test/cassandra [mkdir] Created dir: /Users/RajT/cassandra-source/cassandra-trunk/build/test/output [junit] WARNING: multiple versions of ant detected in path for junit [junit] jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant.jar!/org/apache/tools/ant/Project.class [junit] and jar:file:/Users/RajT/cassandra-source/cassandra-trunk/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class [junit] objc[3711]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/bin/java and /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined. 
[junit] Testsuite: org.apache.cassandra.db.CleanupTest [junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.217 sec [junit] [junit] - Standard Output --- [junit] WARN 23:43:07 JNA link failure, one or more native method will be unavailable. [junit] WARN 23:43:07 JNA link failure, one or more native method will be unavailable. [junit] WARN 23:43:07 Couldn't open /proc/stats [junit] WARN 23:43:07 Couldn't open /proc/stats [junit] - --- [junitreport] Processing /Users/RajT/cassandra-source/cassandra-trunk/build/test/TESTS-TestSuites.xml to /var/folders/nf/trtmyt9534z03kq8p8zgbnxhgn/T/null1933090667 [junitreport] Loading stylesheet jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl [junitreport] Transform time: 387ms [junitreport] Deleting: /var/folders/nf/trtmyt9534z03kq8p8zgbnxhgn/T/null1933090667 BUILD SUCCESSFUL Total time: 4 seconds {code} Use JMX Notifications to Indicate Success/Failure of Long-Running Operations Key: CASSANDRA-7124 URL: https://issues.apache.org/jira/browse/CASSANDRA-7124 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: lhf Fix For: 3.0 Attachments: cassandra-trunk-cleanup-7124.txt
[jira] [Commented] (CASSANDRA-7563) UserType, TupleType and collections in UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220311#comment-14220311 ] Tyler Hobbs commented on CASSANDRA-7563: This is almost good to go, but it looks like there are some problems with Javascript in Java 7: {noformat} [junit] Testcase: testJavascriptTupleTypeCollection(org.apache.cassandra.cql3.UFTest): Caused an ERROR [junit] Failed to compile function 'cql_test_keyspace_alt.function_75' for language javascript: javax.script.ScriptException: sun.org.mozilla.javascript.internal.EvaluatorException: missing name after . operator (Unknown Source#1) [junit] org.apache.cassandra.exceptions.InvalidRequestException: Failed to compile function 'cql_test_keyspace_alt.function_75' for language javascript: javax.script.ScriptException: sun.org.mozilla.javascript.internal.EvaluatorException: missing name after . operator (Unknown Source#1) [junit] at org.apache.cassandra.cql3.functions.ScriptBasedUDF.init(ScriptBasedUDF.java:86) [junit] at org.apache.cassandra.cql3.functions.UDFunction.create(UDFunction.java:202) [junit] at org.apache.cassandra.cql3.statements.CreateFunctionStatement.announceMigration(CreateFunctionStatement.java:132) [junit] at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.executeInternal(SchemaAlteringStatement.java:92) [junit] at org.apache.cassandra.cql3.QueryProcessor.executeOnceInternal(QueryProcessor.java:349) [junit] at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:436) [junit] at org.apache.cassandra.cql3.CQLTester.createFunctionOverload(CQLTester.java:298) [junit] at org.apache.cassandra.cql3.CQLTester.createFunction(CQLTester.java:289) [junit] at org.apache.cassandra.cql3.UFTest.testJavascriptTupleTypeCollection(UFTest.java:1138) [junit] [junit] [junit] Testcase: testJavascriptUTCollections(org.apache.cassandra.cql3.UFTest):Caused an ERROR [junit] Execution of user-defined function 'cql_test_keyspace.function_83' failed: 
javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: TypeError: Cannot call method getString of null (Unknown Source#1) in Unknown Source at line number 1 [junit] org.apache.cassandra.exceptions.InvalidRequestException: Execution of user-defined function 'cql_test_keyspace.function_83' failed: javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: TypeError: Cannot call method getString of null (Unknown Source#1) in Unknown Source at line number 1 [junit] at org.apache.cassandra.cql3.functions.ScriptBasedUDF.execute(ScriptBasedUDF.java:142) [junit] at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:60) [junit] at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:397) [junit] at org.apache.cassandra.cql3.selection.Selection$ResultSetBuilder.build(Selection.java:241) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1168) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:304) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:328) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:65) [junit] at org.apache.cassandra.cql3.QueryProcessor.executeOnceInternal(QueryProcessor.java:349) [junit] at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:436) [junit] at org.apache.cassandra.cql3.UFTest.testJavascriptUTCollections(UFTest.java:1278) [junit] {noformat} I believe we are considering requiring Java 8 for Cassandra 3.0, so if there's not a straightforward fix for whatever the underlying problem is, we could potentially just advertise the limitation. 
UserType, TupleType and collections in UDFs --- Key: CASSANDRA-7563 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.0 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 7563v4.txt, 7563v5.txt * is the Java Driver required as a dependency? * is it possible to extract parts of the Java Driver for UDT/TT/coll support? * CQL {{DROP TYPE}} must check UDFs * must check keyspace access permissions (if those exist) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8350) Drop,Create CF, insert 658, one record fail to insert
[ https://issues.apache.org/jira/browse/CASSANDRA-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220336#comment-14220336 ] Harpreet Kaur commented on CASSANDRA-8350: -- Found it: a future-dated tombstone. How do I get rid of it? I found a Google doc mentioning a dump/filter/rewrite workflow using sstable2json/json2sstable. Has anyone tried it? Drop,Create CF, insert 658, one record fail to insert -- Key: CASSANDRA-8350 URL: https://issues.apache.org/jira/browse/CASSANDRA-8350 Project: Cassandra Issue Type: Bug Reporter: Harpreet Kaur There was a change in the definition of a CF, so we dropped the CF, recreated it, and inserted 658 rows. cqlsh: select count(*) from CF; returns 657. One record did not insert. Inserting it as a one-off shows no errors in cqlsh, but the record is still missing. With tracing on, select * from CF where id='blah'; shows Read 0 live and 73 tombstoned cells. We changed gc_grace_seconds to 600 for the CF and then performed nodetool flush, compact, and repair on all 12 nodes; this did not help. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
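The mechanics behind the missing row can be sketched with Cassandra's last-write-wins reconciliation: a cell is visible only if its write timestamp is strictly newer than any deletion covering it, so a tombstone stamped with a future timestamp shadows every normal insert until the clock catches up. The timestamps below are simplified microsecond integers for illustration, not real sstable data:

```python
def is_visible(write_ts, tombstone_ts):
    """Last-write-wins reconciliation: a write survives only if it is
    strictly newer than the deletion that covers it."""
    return write_ts > tombstone_ts

now = 1_416_000_000_000_000              # "current" time in microseconds
future_tombstone = now + 86_400_000_000  # tombstone stamped a day ahead

# A fresh insert at "now" is older than the tombstone, so it is shadowed:
# the read traces "0 live" cells even though the insert reported success.
print(is_visible(now, future_tombstone))                   # False
# Only a write stamped beyond the tombstone becomes visible again.
print(is_visible(future_tombstone + 1, future_tombstone))  # True
```

This is why flush/compact/repair don't help: every node agrees the tombstone wins. The dump/filter/rewrite workflow (or re-inserting with an explicit USING TIMESTAMP beyond the tombstone) addresses the timestamp ordering itself.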
[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-7563: Attachment: 7563v6.txt Updated patch as v6 with the Rhino tests passing. Don't ask me why it works that way in one case but not in the other... I think that's one of the reasons Nashorn was developed. ;) UserType, TupleType and collections in UDFs --- Key: CASSANDRA-7563 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.0 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 7563v4.txt, 7563v5.txt, 7563v6.txt * is the Java Driver required as a dependency? * is it possible to extract parts of the Java Driver for UDT/TT/coll support? * CQL {{DROP TYPE}} must check UDFs * must check keyspace access permissions (if those exist) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7563) UserType, TupleType and collections in UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220348#comment-14220348 ] Robert Stupp commented on CASSANDRA-7563: - FYI - diff for the Rhino fix: https://github.com/snazy/cassandra/commit/4268d2ac148bcda407fc407ae4f5f9d498ee81da UserType, TupleType and collections in UDFs --- Key: CASSANDRA-7563 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.0 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 7563v4.txt, 7563v5.txt, 7563v6.txt * is the Java Driver required as a dependency? * is it possible to extract parts of the Java Driver for UDT/TT/coll support? * CQL {{DROP TYPE}} must check UDFs * must check keyspace access permissions (if those exist) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8150) Simplify and enlarge new heap calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220480#comment-14220480 ] Liang Xie commented on CASSANDRA-8150: -- IMHO, the MaxTenuringThreshold setting should differ per cluster, or rather per read/write pattern. In the past, when I tuned a similar NoSQL system in our internal production clusters, I found I needed different values to reach an optimal state (e.g. 3 on one cluster but 8 or so on another). [~tjake], totally agreed with your point! I have seen several long safepoint pauses caused by biased locking in Hadoop systems :) Simplify and enlarge new heap calculation - Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Brandon Williams Attachments: upload.png It's been found that the old Twitter recommendation of 100m per core up to 800m is harmful and should no longer be used. Instead, the formula should be 1/3 or 1/4 of max heap, with a cap of 2G. 1/3 vs. 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8352) trange problem regarding Cassandra
Akhtar Hussain created CASSANDRA-8352: - Summary: trange problem regarding Cassandra Key: CASSANDRA-8352 URL: https://issues.apache.org/jira/browse/CASSANDRA-8352 Project: Cassandra Issue Type: Bug Reporter: Akhtar Hussain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8352) Strange problem regarding Cassandra nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhtar Hussain updated CASSANDRA-8352: -- Since Version: 2.0.3 Description: We have a geo-redundant setup with 2 data centers of 3 nodes each. When we bring a single Cassandra node down in DC2 with kill -9 Cassandra-pid, reads fail on DC1 with TimedOutException for a brief period (~15-20 sec). Questions: 1. Why do reads fail on DC1 when a node in the other DC (DC2) fails? Since we use LOCAL_QUORUM for both reads and writes in DC1, a request should return once 2 nodes in the local DC have replied, instead of timing out because of a node in the remote DC. 2. We want to make sure that no Cassandra requests fail in case of node failures. We tried rapid read protection with ALWAYS/99percentile/10ms as described in http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2, but nothing worked. How can we ensure zero request failures when a node fails? 3. What is the right way of handling HTimedOutException in Hector? 4. Please confirm whether we are using public/private hostnames as expected. We are using Cassandra 2.0.3.
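On question 1, the consistency-level arithmetic itself supports the reporter's expectation: LOCAL_QUORUM requires floor(local_RF/2) + 1 replies from the coordinator's data center only, so a dead node in the remote DC does not change the required or available count. A sketch of that arithmetic (the RF=3-per-DC topology is assumed here, since the keyspace definition isn't shown in the report):

```python
def local_quorum(local_rf):
    # LOCAL_QUORUM: a majority of the replicas in the coordinator's DC only.
    return local_rf // 2 + 1

# Assumed topology: NetworkTopologyStrategy with RF=3 in DC1 and DC2.
needed_in_dc1 = local_quorum(3)
alive_in_dc1 = 3   # the killed node was in DC2, so DC1 is at full strength

print(needed_in_dc1, alive_in_dc1 >= needed_in_dc1)  # 2 True
```

Since the local requirement is satisfiable throughout, the reported timeouts point at something other than replica counting, e.g. coordinators briefly routing to the dead remote node before failure detection kicks in.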
Environment: Unix, Cassandra 2.0.3 Labels: DataCenter GEO-Red (was: ) Summary: Strange problem regarding Cassandra nodes (was: trange problem regarding Cassandra) Exception in Application Logs: 2014-11-20 15:36:50.653 WARN m.p.c.connection.HConnectionManager - Exception: me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException() at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42) ~[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:286) ~[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:269) ~[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104) ~[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258) ~[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:132) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:290) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:101) 
[com.ericsson.bss.common.hector-client_3.4.12.jar:na] at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48) [com.ericsson.bss.common.hector-client_3.4.12.jar:na] at com.ericsson.rm.cassandra.xa.keyspace.row.KeyedRowQuery.execute(KeyedRowQuery.java:77) [com.ericsson.bss.common.cassandra.xa_3.4.12.jar:na] at com.ericsson.rm.voucher.traffic.persistence.cassandra.CassandraPersistence.getRow(CassandraPersistence.java:765) [com.ericsson.bss.voucher.traffic.persistence.cassandra_4.7.11.jar:na] at com.ericsson.rm.voucher.traffic.persistence.cassandra.CassandraPersistence.deleteVoucher(CassandraPersistence.java:400) [com.ericsson.bss.voucher.traffic.persistence.cassandra_4.7.11.jar:na] at com.ericsson.rm.voucher.traffic.VoucherTraffic.commit(VoucherTraffic.java:647) [com.ericsson.bss.voucher.traffic_4.7.11.jar:na] at com.ericsson.bss.voucher.traffic.proxy.VoucherTrafficDeproxy.callCommit(VoucherTrafficDeproxy.java:448) [com.ericsson.bss.voucher.traffic.proxy_4.7.11.jar:na] at com.ericsson.bss.voucher.traffic.proxy.VoucherTrafficDeproxy.call(VoucherTrafficDeproxy.java:312) [com.ericsson.bss.voucher.traffic.proxy_4.7.11.jar:na] at com.ericsson.rm.cluster.router.jgroups.destination.RouterDestination$RouterMessageTask.run(RouterDestination.java:333)