[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977165#comment-13977165
 ] 

Benedict commented on CASSANDRA-6696:
-

bq. Okay, but I think that's clearly a different ticket. In the meantime, 
sstable-per-vnode has a lot of advantages.

Agreed, it's CASSANDRA-7032 :-)

But I guess what I'm saying is let's hold off on knapsacking and rebalancing, as 
that's a lot of added complexity to this ticket, and we can probably fix it 
more easily with CASSANDRA-7032.

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. The same is true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
 row=sankalp col=sankalp is written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than gc grace, it got compacted away on nodes A and B 
 along with the actual data, so there is no trace of this row/column on nodes A 
 and B.
 Now in node C, say the original data is in drive1 and tombstone is in drive2. 
 Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and was replaced with new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2014-04-22 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977170#comment-13977170
 ] 

Yuki Morishita commented on CASSANDRA-7066:
---

Ancestors are used to prevent over-counting counters that could result from loading 
unnecessary SSTables (leftovers from compaction).
CASSANDRA-5151 was created to improve this.

As Jonathan mentioned, we store them in SSTable metadata, but that caused heap 
pressure when using LCS (CASSANDRA-5342).
CASSANDRA-5342 changed it to read ancestors as necessary rather than keeping them 
in heap forever.
In fact, even in 2.1 ancestors are recorded in [one of the SSTable 
metadata components|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/sstable/metadata/CompactionMetadata.java#L42]
 and used to calculate the new ancestors during compaction.


 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Ryan McGuire (JIRA)
Ryan McGuire created CASSANDRA-7068:
---

 Summary: AssertionError when running putget_test
 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko


running the putget_test like so:

{code}
nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
{code}

Yields this error in the logs on cassandra-2.0:

{code}
ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
Exception in thread Thread[Thrift:1,5,main]
java.lang.AssertionError
at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
at 
org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
at 
org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{code}

On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
instead I get a 'TSocket read 0 bytes' error. 

Test on 1.2 is fine.

After bisecting, it appears that a common commit 
3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977174#comment-13977174
 ] 

Benedict commented on CASSANDRA-7066:
-

So we already have it; the problem is we don't want to keep it resident? What's 
the problem with just reading it on startup for cleaning up compaction 
leftovers?

For cleaning up compaction all we need is direct ancestors, but it's not really 
a problem to read all ancestors and filter based on those files we know exist, 
to avoid unnecessary heap pressure. Seems much more robust and simple to reason 
about to me...?
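As an illustration of that startup pass, here is a minimal sketch with hypothetical helpers (not Cassandra's actual API): take the union of the ancestor generations recorded in each live sstable's compaction metadata, then treat any live sstable whose own generation appears in that union as a compaction leftover.

{code}
import java.util.*;

// Hypothetical sketch, not Cassandra's real API: each live sstable is keyed by its
// integer generation, and the value is the ancestor set read from its compaction metadata.
final class LeftoverCleanupSketch
{
    static List<Integer> findLeftovers(Map<Integer, Set<Integer>> ancestorsByGeneration)
    {
        // Union of every ancestor recorded by any live sstable
        Set<Integer> allAncestors = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values())
            allAncestors.addAll(ancestors);

        // A live sstable whose generation is some other sstable's ancestor has already
        // been superseded by a completed compaction, so it is a candidate for deletion on startup.
        List<Integer> leftovers = new ArrayList<>();
        for (Integer generation : ancestorsByGeneration.keySet())
            if (allAncestors.contains(generation))
                leftovers.add(generation);
        return leftovers;
    }

    public static void main(String[] args)
    {
        Map<Integer, Set<Integer>> live = new HashMap<>();
        live.put(12, new HashSet<>(Arrays.asList(7, 9))); // result of compacting 7 and 9
        live.put(7, Collections.<Integer>emptySet());     // leftover input that never got deleted
        System.out.println(findLeftovers(live));          // prints [7]
    }
}
{code}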

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977179#comment-13977179
 ] 

Brandon Williams commented on CASSANDRA-7068:
-

Sounds related to CASSANDRA-6476

 AssertionError when running putget_test
 ---

 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko

 running the putget_test like so:
 {code}
 nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
 {code}
 Yields this error in the logs on cassandra-2.0:
 {code}
 ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
 Exception in thread Thread[Thrift:1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
 at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
 at 
 org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
 at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
 at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
 at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
 instead I get a 'TSocket read 0 bytes' error. 
 Test on 1.2 is fine.
 After bisecting, it appears that a common commit 
 3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Issue Comment Deleted] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7068:


Comment: was deleted

(was: Sounds related to CASSANDRA-6476)

 AssertionError when running putget_test
 ---

 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko

 running the putget_test like so:
 {code}
 nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
 {code}
 Yields this error in the logs on cassandra-2.0:
 {code}
 ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
 Exception in thread Thread[Thrift:1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
 at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
 at 
 org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
 at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
 at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
 at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
 instead I get a 'TSocket read 0 bytes' error. 
 Test on 1.2 is fine.
 After bisecting, it appears that a common commit 
 3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7042) Disk space growth until restart

2014-04-22 Thread Zach Aller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach Aller updated CASSANDRA-7042:
--

Attachment: Screen Shot 2014-04-22 at 1.40.41 PM.png

 Disk space growth until restart
 ---

 Key: CASSANDRA-7042
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7042
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 12.04
 Sun Java 7
 Cassandra 2.0.6
Reporter: Zach Aller
 Attachments: Screen Shot 2014-04-17 at 11.07.24 AM.png, Screen Shot 
 2014-04-18 at 11.47.30 AM.png, Screen Shot 2014-04-22 at 1.40.41 PM.png, 
 after.log, before.log


 Cassandra will constantly eat disk space; not sure what's causing it. The only 
 thing that seems to fix it is a restart of cassandra. This happens about every 
 3-5 hrs: we will grow from about 350GB to 650GB with no end in sight. Once we 
 restart cassandra it usually all clears itself up and disks return to normal 
 for a while, then something triggers it and it starts climbing again. Sometimes 
 when we restart, compactions pending skyrocket, and if we restart a second time 
 the compactions pending drop back off to a normal level. One other thing to 
 note is the space is not freed until cassandra starts back up, not when it is 
 shut down.
 I will get a clean log of before and after restarting next time it happens 
 and post it.
 Here is a common ERROR in our logs that might be related
 ERROR [CompactionExecutor:46] 2014-04-15 09:12:51,040 CassandraDaemon.java 
 (line 196) Exception in thread Thread[CompactionExecutor:46,1,main]
 java.lang.RuntimeException: java.io.FileNotFoundException: 
 /local-project/cassandra_data/data/wxgrid/grid/wxgrid-grid-jb-468677-Data.db 
 (No such file or directory)
 at 
 org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:53)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1355)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:67)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1161)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1173)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getScanners(LeveledCompactionStrategy.java:194)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:258)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:126)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
 at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 Caused by: java.io.FileNotFoundException: 
 /local-project/cassandra_data/data/wxgrid/grid/wxgrid-grid-jb-468677-Data.db 
 (No such file or directory)
 at java.io.RandomAccessFile.open(Native Method)
 at java.io.RandomAccessFile.<init>(Unknown Source)
 at 
 org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
 at 
 org.apache.cassandra.io.util.ThrottledReader.<init>(ThrottledReader.java:35)
 at 
 org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:49)
 ... 17 more



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977174#comment-13977174
 ] 

Benedict edited comment on CASSANDRA-7066 at 4/22/14 6:42 PM:
--

So we already have it; the problem is we don't want to keep it resident? What's 
the problem with just reading it on startup for cleaning up compaction 
leftovers?

For cleaning up compaction all we need is direct ancestors, but it's not really 
a problem to read all ancestors and filter based on those files we know exist, 
to avoid unnecessary heap pressure. Seems much more robust and simple to reason 
about to me...?

If we only have direct ancestors, we wouldn't be permitted to delete any 
sstable that shadows another that isn't itself already deleted (which can be 
delayed for a period)... but this still gives us better behaviour than the 
current setup. This could be achieved with an OpOrder reasonably easily, or 
simply rescheduling the DeletingTask if any of its ancestors are still alive.

Right now, we can finish compaction with the replacement sstable finished and 
ready to read, but the replaced/expired sstables untidied. If something happens 
at this point, we end up with all of the sstables, old and new.


was (Author: benedict):
So we already have it; the problem is we don't want to keep it resident? What's 
the problem with just reading it on startup for cleaning up compaction 
leftovers?

For cleaning up compaction all we need is direct ancestors, but it's not really 
a problem to read all ancestors and filter based on those files we know exist, 
to avoid unnecessary heap pressure. Seems much more robust and simple to reason 
about to me...?

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977188#comment-13977188
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

Okay, it's coming back to me.  Compaction works like this:

# Compact and write the new sstables as .tmp
# Promote the .tmp sstables to live
# Delete the obsolete source files

The problem was that if we just relied on ancestors, we didn't know where we 
were between steps 2 and 3 when we restarted.  We could be partway done 
renaming .tmp to live (in which case it's not safe to delete the originals and 
just keep the new ones), or we could instead be partway done deleting the 
originals (in which case it's not safe to keep the originals).

The obvious solution would be to just keep everything and not try to cleanup at 
all, but until CASSANDRA-6880 that wasn't safe either because you could double 
up on counter increments.

So I'd say the logical simplification for 3.0 would be to get rid of the 
cleanup and ancestor tracking entirely.

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7032) Improve vnode allocation

2014-04-22 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977190#comment-13977190
 ] 

Tupshin Harper commented on CASSANDRA-7032:
---

It seems like there might be a way to constrain vnode RDF (replication 
distribution factor) in the general scope of this ticket as well. Call it a 
nice to have, should it present itself.

 Improve vnode allocation
 

 Key: CASSANDRA-7032
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
  Labels: performance, vnodes
 Fix For: 3.0

 Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java


 It's been known for a little while that random vnode allocation causes 
 hotspots of ownership. It should be possible to improve dramatically on this 
 with deterministic allocation. I have quickly thrown together a simple greedy 
 algorithm that allocates vnodes efficiently, and will repair hotspots in a 
 randomly allocated cluster gradually as more nodes are added, and also 
 ensures that token ranges are fairly evenly spread between nodes (somewhat 
 tunably so). The allocation still permits slight discrepancies in ownership, 
 but it is bound by the inverse of the size of the cluster (as opposed to 
 random allocation, which strangely gets worse as the cluster size increases). 
 I'm sure there is a decent dynamic programming solution to this that would be 
 even better.
 If on joining the ring a new node were to CAS a shared table where a 
 canonical allocation of token ranges lives after running this (or a similar) 
 algorithm, we could then get guaranteed bounds on the ownership distribution 
 in a cluster. This will also help for CASSANDRA-6696.
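 For a concrete feel of the greedy idea, here is a toy sketch on a unit ring (illustrative only, not the algorithm in the attached TestVNodeAllocation.java): each new token is placed at the midpoint of the currently largest range, so every allocation halves the worst ownership hotspot.

{code}
import java.util.*;

// Toy sketch of greedy token allocation on a [0, 1) ring; illustrative only.
final class GreedyTokenSketch
{
    static double nextToken(TreeSet<Double> ring)
    {
        // Start with the wrap-around range from the last token back to the first
        double bestStart = ring.last(), bestSize = (1.0 - ring.last()) + ring.first();
        Double prev = null;
        for (double t : ring)
        {
            if (prev != null && t - prev > bestSize)
            {
                bestSize = t - prev;
                bestStart = prev;
            }
            prev = t;
        }
        return (bestStart + bestSize / 2) % 1.0;   // midpoint of the largest range
    }

    public static void main(String[] args)
    {
        // A deliberately skewed starting ring: one range covers most of the ownership
        TreeSet<Double> ring = new TreeSet<>(Arrays.asList(0.05, 0.10, 0.80));
        for (int i = 0; i < 4; i++)
        {
            double token = nextToken(ring);
            ring.add(token);
            System.out.println("allocated token " + token);
        }
    }
}
{code}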



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977201#comment-13977201
 ] 

Benedict commented on CASSANDRA-7066:
-

bq. We could be partway done renaming .tmp to live

Ah, because we could have multiple target sstables. Yes, sticky. I still think 
performing some cleanup is a good thing, else we can end up with unnecessary 
data, but I don't think the current setup is actually much better than dropping 
the tracking entirely. So long as any failure is transient (not a bug that 
occurs predictably) it should be a relatively minor concern. So let's go with 
that.



 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superceded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Post-CASSANDRA-7058 fix

2014-04-22 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 3a73e392f -> b9324e1b9


Post-CASSANDRA-7058 fix


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9324e1b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9324e1b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9324e1b

Branch: refs/heads/cassandra-2.0
Commit: b9324e1b94f67f3d89096fcef4d157f9505364e9
Parents: 3a73e39
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:13:57 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:13:57 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9324e1b/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index fc6ee3a..8196352 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -635,7 +635,7 @@ public class StorageProxy implements StorageProxyMBean
 {
 MessageOut<RowMutation> message = rm.createMessage();
 for (InetAddress target : endpoints)
-MessagingService.instance().sendRR(message, target, handler);
+MessagingService.instance().sendRR(message, target, handler, 
false);
 }
 }
 
@@ -814,7 +814,7 @@ public class StorageProxy implements StorageProxyMBean
 // (1.1 knows how to forward old-style String message IDs; 
updated to int in 2.0)
 if (localDataCenter.equals(dc) || 
MessagingService.instance().getVersion(destination) < 
MessagingService.VERSION_20)
 {
-MessagingService.instance().sendRR(message, 
destination, responseHandler);
+MessagingService.instance().sendRR(message, 
destination, responseHandler, true);
 }
 else
 {
@@ -937,7 +937,7 @@ public class StorageProxy implements StorageProxyMBean
 }
 message = message.withParameter(RowMutation.FORWARD_TO, 
out.getData());
 // send the combined message + forward headers
-int id = MessagingService.instance().sendRR(message, target, 
handler);
+int id = MessagingService.instance().sendRR(message, target, 
handler, true);
 logger.trace("Sending message to {}@{}", id, target);
 }
 catch (IOException e)
@@ -1000,7 +1000,7 @@ public class StorageProxy implements StorageProxyMBean
 AbstractWriteResponseHandler responseHandler = new 
WriteResponseHandler(endpoint, WriteType.COUNTER);
 
 Tracing.trace("Enqueuing counter update to {}", endpoint);
-MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler);
+MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler, false);
 return responseHandler;
 }
 }



[jira] [Updated] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6746:


 Priority: Minor  (was: Major)
Fix Version/s: (was: 2.1 beta2)
   Issue Type: Improvement  (was: Bug)

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Minor
  Labels: performance
 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four node cluster I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977235#comment-13977235
 ] 

Benedict commented on CASSANDRA-6746:
-

I've updated the ticket to a minor improvement and moved it to no-release for 
now, as CASSANDRA-6916 looks to solve the regression; not sure if you still 
want to investigate the read-ahead changes you were looking at as an 
improvement, [~xedin]? If not I'll resolve the ticket.

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Minor
  Labels: performance
 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four node cluster I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[1/2] git commit: Post-CASSANDRA-7058 fix

2014-04-22 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 a16adba9b -> 2c7622a65


Post-CASSANDRA-7058 fix


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9324e1b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9324e1b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9324e1b

Branch: refs/heads/cassandra-2.1
Commit: b9324e1b94f67f3d89096fcef4d157f9505364e9
Parents: 3a73e39
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:13:57 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:13:57 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9324e1b/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index fc6ee3a..8196352 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -635,7 +635,7 @@ public class StorageProxy implements StorageProxyMBean
 {
 MessageOut<RowMutation> message = rm.createMessage();
 for (InetAddress target : endpoints)
-MessagingService.instance().sendRR(message, target, handler);
+MessagingService.instance().sendRR(message, target, handler, 
false);
 }
 }
 
@@ -814,7 +814,7 @@ public class StorageProxy implements StorageProxyMBean
 // (1.1 knows how to forward old-style String message IDs; 
updated to int in 2.0)
 if (localDataCenter.equals(dc) || 
MessagingService.instance().getVersion(destination) < 
MessagingService.VERSION_20)
 {
-MessagingService.instance().sendRR(message, 
destination, responseHandler);
+MessagingService.instance().sendRR(message, 
destination, responseHandler, true);
 }
 else
 {
@@ -937,7 +937,7 @@ public class StorageProxy implements StorageProxyMBean
 }
 message = message.withParameter(RowMutation.FORWARD_TO, 
out.getData());
 // send the combined message + forward headers
-int id = MessagingService.instance().sendRR(message, target, 
handler);
+int id = MessagingService.instance().sendRR(message, target, 
handler, true);
 logger.trace("Sending message to {}@{}", id, target);
 }
 catch (IOException e)
@@ -1000,7 +1000,7 @@ public class StorageProxy implements StorageProxyMBean
 AbstractWriteResponseHandler responseHandler = new 
WriteResponseHandler(endpoint, WriteType.COUNTER);
 
 Tracing.trace("Enqueuing counter update to {}", endpoint);
-MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler);
+MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler, false);
 return responseHandler;
 }
 }



[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-04-22 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
src/java/org/apache/cassandra/service/StorageProxy.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2c7622a6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2c7622a6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2c7622a6

Branch: refs/heads/cassandra-2.1
Commit: 2c7622a65ce747819931bd52bc576a4cd055ba3d
Parents: a16adba b9324e1
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:16:27 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:16:27 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2c7622a6/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --cc src/java/org/apache/cassandra/service/StorageProxy.java
index 33f6ff0,8196352..d8c5813
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@@ -642,15 -620,22 +642,15 @@@ public class StorageProxy implements St
  
Keyspace.open(Keyspace.SYSTEM_KS),
  null,
  
WriteType.SIMPLE);
 -RowMutation rm = new RowMutation(Keyspace.SYSTEM_KS, 
UUIDType.instance.decompose(uuid), cf);
 -updateBatchlog(rm, endpoints, handler);
 -}
 -
 -private static void updateBatchlog(RowMutation rm, 
Collection<InetAddress> endpoints, AbstractWriteResponseHandler handler)
 -{
 -if (endpoints.contains(FBUtilities.getBroadcastAddress()))
 -{
 -assert endpoints.size() == 1;
 -insertLocal(rm, handler);
 -}
 -else
 +Mutation mutation = new Mutation(Keyspace.SYSTEM_KS, 
UUIDType.instance.decompose(uuid));
 +mutation.delete(SystemKeyspace.BATCHLOG_CF, 
FBUtilities.timestampMicros());
 +MessageOut<Mutation> message = mutation.createMessage();
 +for (InetAddress target : endpoints)
  {
 -MessageOut<RowMutation> message = rm.createMessage();
 -for (InetAddress target : endpoints)
 +if (target.equals(FBUtilities.getBroadcastAddress()) && 
OPTIMIZE_LOCAL_REQUESTS)
 +insertLocal(message.payload, handler);
 +else
- MessagingService.instance().sendRR(message, target, handler);
+ MessagingService.instance().sendRR(message, target, handler, 
false);
  }
  }
  
@@@ -823,9 -812,9 +823,9 @@@
  String dc = 
DatabaseDescriptor.getEndpointSnitch().getDatacenter(destination);
  // direct writes to local DC or old Cassandra versions
  // (1.1 knows how to forward old-style String message 
IDs; updated to int in 2.0)
 -if (localDataCenter.equals(dc) || 
MessagingService.instance().getVersion(destination) < 
MessagingService.VERSION_20)
 +if (localDataCenter.equals(dc))
  {
- MessagingService.instance().sendRR(message, 
destination, responseHandler);
+ MessagingService.instance().sendRR(message, 
destination, responseHandler, true);
  }
  else
  {
@@@ -946,9 -935,9 +946,9 @@@
  out.writeInt(id);
  logger.trace("Adding FWD message to {}@{}", id, destination);
  }
 -message = message.withParameter(RowMutation.FORWARD_TO, 
out.getData());
 +message = message.withParameter(Mutation.FORWARD_TO, 
out.getData());
  // send the combined message + forward headers
- int id = MessagingService.instance().sendRR(message, target, 
handler);
+ int id = MessagingService.instance().sendRR(message, target, 
handler, true);
  logger.trace("Sending message to {}@{}", id, target);
  }
  catch (IOException e)



[jira] [Commented] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977243#comment-13977243
 ] 

Aleksey Yeschenko commented on CASSANDRA-7068:
--

Fixed by b9324e1b94f67f3d89096fcef4d157f9505364e9

In some instances a wrong overload was being picked.
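For readers following the diff above, the failure mode is plain Java overload selection. The sketch below uses made-up names (these are not Cassandra's actual sendRR signatures): a two-argument call keeps binding to the older overload even when the caller intends the newer write path, and passing the extra boolean explicitly selects the intended one.

{code}
// Made-up types and signatures, purely illustrative; not Cassandra's real API.
final class OverloadSketch
{
    interface Handler {}
    interface WriteHandler extends Handler {}

    // Older overload: registers a read-style callback.
    static void sendRR(String message, Handler handler)
    {
        System.out.println("read-style overload");
    }

    // Newer overload added for writes; callers must opt in with the extra argument.
    static void sendRR(String message, WriteHandler handler, boolean allowHints)
    {
        System.out.println("write-style overload, allowHints=" + allowHints);
    }

    public static void main(String[] args)
    {
        WriteHandler handler = new WriteHandler() {};
        sendRR("mutation", handler);          // compiles, but takes the read-style path
        sendRR("mutation", handler, false);   // explicit argument picks the write path
    }
}
{code}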

 AssertionError when running putget_test
 ---

 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko

 running the putget_test like so:
 {code}
 nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
 {code}
 Yields this error in the logs on cassandra-2.0:
 {code}
 ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
 Exception in thread Thread[Thrift:1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
 at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
 at 
 org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
 at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
 at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
 at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
 instead I get a 'TSocket read 0 bytes' error. 
 Test on 1.2 is fine.
 After bisecting, it appears that a common commit 
 3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6809) Compressed Commit Log

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict reassigned CASSANDRA-6809:
---

Assignee: T Jake Luciani

 Compressed Commit Log
 -

 Key: CASSANDRA-6809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 3.0


 It seems an unnecessary oversight that we don't compress the commit log. 
 Doing so should improve throughput, but some care will need to be taken to 
 ensure we use as much of a segment as possible. I propose decoupling the 
 writing of the records from the segments. Basically write into a (queue of) 
 DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
 MB written to the CL (where X is ordinarily CLS size), and then pack as many 
 of the compressed chunks into a CLS as possible.
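 To make the chunking idea concrete, here is a minimal standalone sketch (the names and the use of java.util.zip are illustrative assumptions, not the eventual patch): drain queued record buffers, compress them in ~64K chunks, and pack length-prefixed compressed chunks into a fixed-size segment until it is full.

{code}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.zip.Deflater;

// Illustrative sketch only: pack compressed ~64K chunks of queued records into one segment.
final class ChunkPackerSketch
{
    static final int CHUNK_SIZE = 64 * 1024;

    static ByteBuffer packSegment(Deque<ByteBuffer> pending, int segmentSize)
    {
        ByteBuffer segment = ByteBuffer.allocate(segmentSize);
        ByteBuffer chunk = ByteBuffer.allocate(CHUNK_SIZE);
        byte[] compressed = new byte[CHUNK_SIZE + 64];        // margin for worst-case deflate expansion
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        while (!pending.isEmpty())
        {
            // Fill one chunk from the queue of serialized records (assumes each record fits a chunk)
            while (!pending.isEmpty() && pending.peek().remaining() <= chunk.remaining())
                chunk.put(pending.poll());
            if (chunk.position() == 0)
                throw new IllegalArgumentException("record larger than chunk size");

            chunk.flip();
            deflater.reset();
            deflater.setInput(chunk.array(), 0, chunk.limit());
            deflater.finish();
            int length = deflater.deflate(compressed);

            if (Integer.BYTES + length > segment.remaining())
                break;                          // segment full; a real implementation would carry this chunk over
            segment.putInt(length);             // length prefix so replay can find chunk boundaries
            segment.put(compressed, 0, length);
            chunk.clear();
        }
        deflater.end();
        segment.flip();
        return segment;
    }

    public static void main(String[] args)
    {
        Deque<ByteBuffer> queue = new ArrayDeque<>();
        for (int i = 0; i < 100; i++)
            queue.add(ByteBuffer.wrap(new byte[1024]));       // stand-ins for serialized mutations
        System.out.println("packed bytes: " + packSegment(queue, 1024 * 1024).remaining());
    }
}
{code}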



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-7068.
--

Resolution: Fixed

 AssertionError when running putget_test
 ---

 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko

 running the putget_test like so:
 {code}
 nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
 {code}
 Yields this error in the logs on cassandra-2.0:
 {code}
 ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
 Exception in thread Thread[Thrift:1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
 at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
 at 
 org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
 at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
 at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
 at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
 instead I get a 'TSocket read 0 bytes' error. 
 Test on 1.2 is fine.
 After bisecting, it appears that a common commit 
 3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread Brandon Williams (JIRA)
Brandon Williams created CASSANDRA-7069:
---

 Summary: Prevent operator mistakes due to simultaneous bootstrap
 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


Cassandra has always had the '2 minute rule' between beginning topology changes 
to ensure the range announcement is known to all nodes before the next one 
begins.  Trying to bootstrap a bunch of nodes simultaneously is a common 
mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking 
for other joining nodes in the shadow round, then comparing their generation 
against our own and if there isn't a large enough difference, bail out or sleep 
until there it is large enough.
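
A minimal sketch of that check, with hypothetical names (the shadow-round state is shown as a plain map rather than Cassandra's Gossiper API): if any other joining node's gossip generation is within the required window of our own, refuse to proceed.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed guard; not Cassandra's Gossiper API.
// Generations are the per-node gossip generation numbers seen in the shadow round.
final class BootstrapGuardSketch
{
    static final int MIN_GENERATION_GAP_SECONDS = 120;   // the '2 minute rule'

    static boolean safeToBootstrap(int ourGeneration, Map<String, Integer> joiningGenerations)
    {
        for (Map.Entry<String, Integer> entry : joiningGenerations.entrySet())
        {
            int gap = Math.abs(ourGeneration - entry.getValue());
            if (gap < MIN_GENERATION_GAP_SECONDS)
            {
                System.out.println(entry.getKey() + " is joining only " + gap
                                   + "s apart from us; bail out or sleep until the gap is large enough");
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args)
    {
        Map<String, Integer> joining = new HashMap<>();
        joining.put("10.0.0.5", 1398200000);
        System.out.println(safeToBootstrap(1398200030, joining));   // false: only 30s apart
        System.out.println(safeToBootstrap(1398200300, joining));   // true: 300s apart
    }
}
{code}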



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7069:


Description: 
Cassandra has always had the '2 minute rule' between beginning topology changes 
to ensure the range announcement is known to all nodes before the next one 
begins.  Trying to bootstrap a bunch of nodes simultaneously is a common 
mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking 
for other joining nodes in the shadow round, then comparing their generation 
against our own and if there isn't a large enough difference, bail out or sleep 
until it is large enough.

  was:
Cassandra has always had the '2 minute rule' between beginning topology changes 
to ensure the range announcement is known to all nodes before the next one 
begins.  Trying to bootstrap a bunch of nodes simultaneously is a common 
mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking 
for other joining nodes in the shadow round, then comparing their generation 
against our own and if there isn't a large enough difference, bail out or sleep 
until there it is large enough.


 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6755) Optimise CellName/Composite comparisons for NativeCell

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6755:


Fix Version/s: (was: 3.0)
   2.1.0
 Assignee: T Jake Luciani

 Optimise CellName/Composite comparisons for NativeCell
 --

 Key: CASSANDRA-6755
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6755
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 2.1.0


 As discussed in CASSANDRA-6694, to reduce temporary garbage generation we 
 should minimise the incidence of CellName component extraction. The biggest 
 win will be to perform comparisons on Cell where possible, instead of 
 CellName, so that Native*Cell can use its extra information to avoid creating 
 any ByteBuffer objects.
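
 As a rough illustration of the comparison path being asked for (not the actual Native*Cell layout), the win comes from comparing name components as raw byte ranges instead of wrapping each component in a fresh ByteBuffer first:

{code}
// Illustration only: an unsigned lexicographic compare over raw byte ranges, the kind of
// primitive an off-heap NativeCell could use directly instead of materialising a
// ByteBuffer per CellName component before comparing.
final class RawCompareSketch
{
    static int compareUnsigned(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen)
    {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++)
        {
            int cmp = (a[aOff + i] & 0xFF) - (b[bOff + i] & 0xFF);
            if (cmp != 0)
                return cmp;
        }
        return aLen - bLen;   // shorter component sorts first when one is a prefix of the other
    }

    public static void main(String[] args)
    {
        byte[] left = {1, 2, 3, 9};
        byte[] right = {1, 2, 4};
        // Compare the first three bytes of each region with no intermediate allocation
        System.out.println(compareUnsigned(left, 0, 3, right, 0, 3));   // negative: 3 < 4
    }
}
{code}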



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-5086) MultiSlice bug in SimpleBlockFetcher

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977257#comment-13977257
 ] 

Jonathan Ellis commented on CASSANDRA-5086:
---

Is this still a problem?

 MultiSlice bug in SimpleBlockFetcher
 

 Key: CASSANDRA-5086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5086
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.0 beta 1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 2.1 beta2

 Attachments: 5086.txt


 Related to CASSANDRA-3885: it works for most cases, but the logic when there 
 are no index entries is wrong.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6861) Eliminate garbage in server-side native transport

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict reassigned CASSANDRA-6861:
---

Assignee: T Jake Luciani

 Eliminate garbage in server-side native transport
 -

 Key: CASSANDRA-6861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6861
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta2


 Now we've upgraded to Netty 4, we're generating a lot of garbage that could 
 be avoided, so we should probably stop that. Should be reasonably easy to 
 hook into Netty's pooled buffers, returning them to the pool once a given 
 message is completed.
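
 A standalone sketch of the Netty 4 pattern being suggested (illustrative, not the actual transport code): take buffers from the pooled allocator and release them back to the pool once the message they carried is done, rather than letting a fresh heap buffer become garbage per message.

{code}
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

// Standalone illustration of Netty 4 pooled buffers: allocate from the pool, use the
// buffer for one message, and release it back when the message is completed.
public final class PooledBufferSketch
{
    public static void main(String[] args)
    {
        PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;

        ByteBuf frame = alloc.directBuffer(256);
        try
        {
            frame.writeBytes("serialized response frame".getBytes());
            // ... hand the buffer to the channel; whoever consumes the message owns the release ...
            System.out.println("readable bytes: " + frame.readableBytes());
        }
        finally
        {
            frame.release();   // returns the buffer to the pool instead of discarding it
        }
    }
}
{code}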



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-4762) Support IN clause for any clustering column

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4762:
--

Reviewer:   (was: Sylvain Lebresne)
Assignee: (was: T Jake Luciani)

 Support IN clause for any clustering column
 ---

 Key: CASSANDRA-4762
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4762
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: T Jake Luciani
  Labels: cql3
 Fix For: 3.0

 Attachments: 4762-1.txt


 Given CASSANDRA-3885, it seems it should be possible to store multiple ranges 
 for many predicates, even for the inner parts of a composite column.
 They could be expressed as an expanded set of filter queries.
 example:
 {code}
 CREATE TABLE test (
name text,
tdate timestamp,
tdate2 timestamp,
tdate3 timestamp,
num double,
PRIMARY KEY(name,tdate,tdate2,tdate3)
  ) WITH COMPACT STORAGE;
 SELECT * FROM test WHERE 
   name IN ('a','b') and
   tdate IN ('2010-01-01','2011-01-01') and
   tdate2 IN ('2010-01-01','2011-01-01') and
   tdate3 IN ('2010-01-01','2011-01-01') 
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[4/4] git commit: Merge branch 'cassandra-2.1' into trunk

2014-04-22 Thread aleksey
Merge branch 'cassandra-2.1' into trunk

Conflicts:

tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
tools/stress/src/org/apache/cassandra/stress/operations/CqlReader.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/99fbafee
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/99fbafee
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/99fbafee

Branch: refs/heads/trunk
Commit: 99fbafee3e571ed7e73de8ed3fb9d9c27bcdb754
Parents: 5bbc54f 2c7622a
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:43:16 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:43:16 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--




[1/4] git commit: fix cassandra stress errors on reads with native protocol

2014-04-22 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 5bbc54fe7 -> 99fbafee3


fix cassandra stress errors on reads with native protocol

patch by belliottsmith; reviewed by jasobrown for CASSANDRA-7033


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a16adba9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a16adba9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a16adba9

Branch: refs/heads/trunk
Commit: a16adba9b89661f95d167d056d128ed388c4e7a7
Parents: 5045d3e
Author: Jason Brown jasobr...@apple.com
Authored: Tue Apr 22 10:04:11 2014 -0700
Committer: Jason Brown jasobr...@apple.com
Committed: Tue Apr 22 10:09:58 2014 -0700

--
 CHANGES.txt |  1 +
 .../operations/CqlIndexedRangeSlicer.java   |  9 ++-
 .../stress/operations/CqlOperation.java | 14 ---
 .../cassandra/stress/operations/CqlReader.java  | 26 +++-
 4 files changed, 22 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a16adba9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2f70c63..844df95 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -48,6 +48,7 @@
  * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924)
  * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024)
  * Require nodetool rebuild_index to specify index names (CASSANDRA-7038)
+ * fix cassandra stress errors on reads with native protocol (CASANDRA-7033)
 Merged from 2.0:
  * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939)
  * Log a warning for large batches (CASSANDRA-6487)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a16adba9/tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
--
diff --git 
a/tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
 
b/tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
index c971844..046381e 100644
--- 
a/tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
+++ 
b/tools/stress/src/org/apache/cassandra/stress/operations/CqlIndexedRangeSlicer.java
@@ -47,13 +47,8 @@ public class CqlIndexedRangeSlicer extends 
CqlOperation<byte[][]>
 @Override
 protected String buildQuery()
 {
-StringBuilder query = new StringBuilder("SELECT ");
-
-if (state.isCql2())
-query.append(state.settings.columns.maxColumnsPerKey).append(" ''..''");
-else
-query.append("*");
-
+StringBuilder query = new StringBuilder("SELECT");
+query.append(wrapInQuotesIfRequired(key));
 query.append(" FROM ");
 query.append(wrapInQuotesIfRequired(state.type.table));
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a16adba9/tools/stress/src/org/apache/cassandra/stress/operations/CqlOperation.java
--
diff --git 
a/tools/stress/src/org/apache/cassandra/stress/operations/CqlOperation.java 
b/tools/stress/src/org/apache/cassandra/stress/operations/CqlOperation.java
index 5b27146..1c59e2d 100644
--- a/tools/stress/src/org/apache/cassandra/stress/operations/CqlOperation.java
+++ b/tools/stress/src/org/apache/cassandra/stress/operations/CqlOperation.java
@@ -145,7 +145,7 @@ public abstract class CqlOperation<V> extends Operation
 @Override
 public boolean validate(Integer result)
 {
-return true;
+return result > 0;
 }
 
 @Override
@@ -195,12 +195,8 @@ public abstract class CqlOperation<V> extends Operation
 if (result.length != expect.size())
 return false;
 for (int i = 0 ; i < result.length ; i++)
-{
-List<ByteBuffer> resultRow = Arrays.asList(result[i]);
-resultRow = resultRow.subList(1, resultRow.size());
-if (expect.get(i) != null && !expect.get(i).equals(resultRow))
+if (expect.get(i) != null && 
!expect.get(i).equals(Arrays.asList(result[i])))
 return false;
-}
 return true;
 }
 }
@@ -510,9 +506,9 @@ public abstract class CqlOperation<V> extends Operation
 for (int i = 0 ; i < r.length ; i++)
 {
 Row row = rows.get(i);
-r[i] = new 
ByteBuffer[row.getColumnDefinitions().size() - 1];
-for (int j = 1 ; j < row.getColumnDefinitions().size() 
; j++)
-r[i][j - 1] = row.getBytes(j);
+r[i] = new 

[2/4] git commit: Post-CASSANDRA-7058 fix

2014-04-22 Thread aleksey
Post-CASSANDRA-7058 fix


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9324e1b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9324e1b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9324e1b

Branch: refs/heads/trunk
Commit: b9324e1b94f67f3d89096fcef4d157f9505364e9
Parents: 3a73e39
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:13:57 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:13:57 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9324e1b/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index fc6ee3a..8196352 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -635,7 +635,7 @@ public class StorageProxy implements StorageProxyMBean
 {
 MessageOut<RowMutation> message = rm.createMessage();
 for (InetAddress target : endpoints)
-MessagingService.instance().sendRR(message, target, handler);
+MessagingService.instance().sendRR(message, target, handler, 
false);
 }
 }
 
@@ -814,7 +814,7 @@ public class StorageProxy implements StorageProxyMBean
 // (1.1 knows how to forward old-style String message IDs; 
updated to int in 2.0)
 if (localDataCenter.equals(dc) || 
MessagingService.instance().getVersion(destination) < 
MessagingService.VERSION_20)
 {
-MessagingService.instance().sendRR(message, 
destination, responseHandler);
+MessagingService.instance().sendRR(message, 
destination, responseHandler, true);
 }
 else
 {
@@ -937,7 +937,7 @@ public class StorageProxy implements StorageProxyMBean
 }
 message = message.withParameter(RowMutation.FORWARD_TO, 
out.getData());
 // send the combined message + forward headers
-int id = MessagingService.instance().sendRR(message, target, 
handler);
+int id = MessagingService.instance().sendRR(message, target, 
handler, true);
 logger.trace("Sending message to {}@{}", id, target);
 }
 catch (IOException e)
@@ -1000,7 +1000,7 @@ public class StorageProxy implements StorageProxyMBean
 AbstractWriteResponseHandler responseHandler = new 
WriteResponseHandler(endpoint, WriteType.COUNTER);
 
 Tracing.trace("Enqueuing counter update to {}", endpoint);
-MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler);
+MessagingService.instance().sendRR(cm.makeMutationMessage(), 
endpoint, responseHandler, false);
 return responseHandler;
 }
 }



[3/4] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-04-22 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
src/java/org/apache/cassandra/service/StorageProxy.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2c7622a6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2c7622a6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2c7622a6

Branch: refs/heads/trunk
Commit: 2c7622a65ce747819931bd52bc576a4cd055ba3d
Parents: a16adba b9324e1
Author: Aleksey Yeschenko alek...@apache.org
Authored: Tue Apr 22 22:16:27 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Tue Apr 22 22:16:27 2014 +0300

--
 src/java/org/apache/cassandra/service/StorageProxy.java | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2c7622a6/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --cc src/java/org/apache/cassandra/service/StorageProxy.java
index 33f6ff0,8196352..d8c5813
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@@ -642,15 -620,22 +642,15 @@@ public class StorageProxy implements St
  
Keyspace.open(Keyspace.SYSTEM_KS),
  null,
  
WriteType.SIMPLE);
 -RowMutation rm = new RowMutation(Keyspace.SYSTEM_KS, 
UUIDType.instance.decompose(uuid), cf);
 -updateBatchlog(rm, endpoints, handler);
 -}
 -
 -private static void updateBatchlog(RowMutation rm, 
Collection<InetAddress> endpoints, AbstractWriteResponseHandler handler)
 -{
 -if (endpoints.contains(FBUtilities.getBroadcastAddress()))
 -{
 -assert endpoints.size() == 1;
 -insertLocal(rm, handler);
 -}
 -else
 +Mutation mutation = new Mutation(Keyspace.SYSTEM_KS, 
UUIDType.instance.decompose(uuid));
 +mutation.delete(SystemKeyspace.BATCHLOG_CF, 
FBUtilities.timestampMicros());
 +MessageOut<Mutation> message = mutation.createMessage();
 +for (InetAddress target : endpoints)
  {
 -MessageOut<RowMutation> message = rm.createMessage();
 -for (InetAddress target : endpoints)
 +if (target.equals(FBUtilities.getBroadcastAddress()) && 
OPTIMIZE_LOCAL_REQUESTS)
 +insertLocal(message.payload, handler);
 +else
- MessagingService.instance().sendRR(message, target, handler);
+ MessagingService.instance().sendRR(message, target, handler, 
false);
  }
  }
  
@@@ -823,9 -812,9 +823,9 @@@
  String dc = 
DatabaseDescriptor.getEndpointSnitch().getDatacenter(destination);
  // direct writes to local DC or old Cassandra versions
  // (1.1 knows how to forward old-style String message 
IDs; updated to int in 2.0)
 -if (localDataCenter.equals(dc) || 
MessagingService.instance().getVersion(destination) < 
MessagingService.VERSION_20)
 +if (localDataCenter.equals(dc))
  {
- MessagingService.instance().sendRR(message, 
destination, responseHandler);
+ MessagingService.instance().sendRR(message, 
destination, responseHandler, true);
  }
  else
  {
@@@ -946,9 -935,9 +946,9 @@@
  out.writeInt(id);
 logger.trace("Adding FWD message to {}@{}", id, destination);
  }
 -message = message.withParameter(RowMutation.FORWARD_TO, 
out.getData());
 +message = message.withParameter(Mutation.FORWARD_TO, 
out.getData());
  // send the combined message + forward headers
- int id = MessagingService.instance().sendRR(message, target, 
handler);
+ int id = MessagingService.instance().sendRR(message, target, 
handler, true);
  logger.trace("Sending message to {}@{}", id, target);
  }
  catch (IOException e)



[jira] [Commented] (CASSANDRA-7068) AssertionError when running putget_test

2014-04-22 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977289#comment-13977289
 ] 

Ryan McGuire commented on CASSANDRA-7068:
-

confirmed. Thanks!

 AssertionError when running putget_test
 ---

 Key: CASSANDRA-7068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7068
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Aleksey Yeschenko

 running the putget_test like so:
 {code}
 nosetests2 -x -s -v putget_test.py:TestPutGet.non_local_read_test
 {code}
 Yields this error in the logs on cassandra-2.0:
 {code}
 ERROR [Thrift:1] 2014-04-22 14:25:37,584 CassandraDaemon.java (line 198) 
 Exception in thread Thread[Thrift:1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:542)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:595)
 at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:579)
 at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:817)
 at 
 org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:119)
 at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:693)
 at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:465)
 at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:535)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:542)
 at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:526)
 at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
 at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
 at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1959)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 On cassandra-2.1 I don't get any errors in the logs, but the test doesn't run; 
 instead I get a 'TSocket read 0 bytes' error. 
 Test on 1.2 is fine.
 After bisecting, it appears that a common commit 
 3a73e392fa424bff5378d4bb72117cfa28f9b0b7 is the cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977322#comment-13977322
 ] 

Benedict commented on CASSANDRA-6106:
-

In case it helps at all, I've commented it heavily and simplified the logic 
quite a bit by removing the test on time elapsed to grab the realtime offset, 
as the effect will be pretty minimal even if it temporarily gets a whack value. 
It isn't actually that complicated, but it was a bit ugly to read and 
non-obvious without comments.

It's worth noting that having a monotonically increasing time source is 
probably a good thing in and of itself, which this also provides.

I've rebased and pushed -f
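
A minimal sketch of what such a source might look like (an illustration only, 
not the attached patch): anchor realtime once, derive microseconds from 
System.nanoTime() offsets, and force the result to be monotonically increasing.
{code}
// Hedged sketch, not the CASSANDRA-6106 patch: a microsecond-resolution,
// monotonically increasing timestamp source.
import java.util.concurrent.atomic.AtomicLong;

public final class MicrosecondClockSketch
{
    // Realtime anchor (micros) captured once, paired with the nanoTime at capture
    private static final long BASE_MICROS = System.currentTimeMillis() * 1000;
    private static final long BASE_NANOS  = System.nanoTime();

    private static final AtomicLong LAST = new AtomicLong(Long.MIN_VALUE);

    public static long nowMicros()
    {
        long candidate = BASE_MICROS + (System.nanoTime() - BASE_NANOS) / 1000;
        // Never go backwards, and always move forward by at least one microsecond
        return LAST.updateAndGet(prev -> Math.max(prev + 1, candidate));
    }
}
{code}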

 Provide timestamp with true microsecond resolution
 --

 Key: CASSANDRA-6106
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: DSE Cassandra 3.1, but also HEAD
Reporter: Christopher Smith
Assignee: Benedict
Priority: Minor
  Labels: timestamps
 Fix For: 2.1 beta2

 Attachments: microtimstamp.patch, microtimstamp_random.patch, 
 microtimstamp_random_rev2.patch


 I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
 mentioned issues with millisecond rounding in timestamps and was able to 
 reproduce the issue. If I specify a timestamp in a mutating query, I get 
 microsecond precision, but if I don't, I get timestamps rounded to the 
 nearest millisecond, at least for my first query on a given connection, which 
 substantially increases the possibilities of collision.
 I believe I found the offending code, though I am by no means sure this is 
 comprehensive. I think we probably need a fairly comprehensive replacement of 
 all uses of System.currentTimeMillis() with System.nanoTime().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977355#comment-13977355
 ] 

Michael Shuler commented on CASSANDRA-7011:
---

checked with latest ccm changes, and this test is still hanging at runtime

 auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0
 -

 Key: CASSANDRA-7011
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7011
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Michael Shuler

 This test hangs forever. When I hit ctl-c after running the test, then the 
 ccm nodes actually continue running - I think ccm is looking for log lines 
 that never occur until the test is killed(?).
 {noformat}
 $ export MAX_HEAP_SIZE=1G; export HEAP_NEWSIZE=256M; ENABLE_VNODES=true 
 PRINT_DEBUG=true nosetests --nocapture --nologcapture --verbosity=3 
 auth_test.py:TestAuth.system_auth_ks_is_alterable_test
 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
 system_auth_ks_is_alterable_test (auth_test.TestAuth) ... cluster ccm 
 directory: /tmp/dtest-O3AAJr
 ^C
 {noformat}
 Search for (hanging here) below - I typed this prior to hitting ctl-c. Then 
 the nodes start running again and I see Listening for thrift clients later 
 on.
 {noformat}
 mshuler@hana:~$ tail -f /tmp/dtest-O3AAJr/test/node*/logs/system.log
 == /tmp/dtest-O3AAJr/test/node1/logs/system.log ==
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,599 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@1792243696(1627 serialized bytes, 27 
 ops, 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:2] 2014-04-08 16:45:12,603 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-13,].
   14,454 bytes to 11,603 (~80% of original) in 105ms = 0.105386MB/s.  7 total 
 partitions merged to 3.  Partition merge counts were {1:1, 2:1, 4:1, }
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,668 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697/system-schema_columnfamilies-ka-14-Data.db
  (956 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193292)
 INFO  [MigrationStage:1] 2014-04-08 16:45:12,669 ColumnFamilyStore.java:853 - 
 Enqueuing flush of schema_columns: 6806 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,670 Memtable.java:344 - 
 Writing Memtable-schema_columns@352928691(1014 serialized bytes, 21 ops, 
 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:1] 2014-04-08 16:45:12,672 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-17,].
   710 bytes to 233 (~32% of original) in 70ms = 0.003174MB/s.  6 total 
 partitions merged to 3.  Partition merge counts were {1:2, 4:1, }
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,721 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-14-Data.db
  (435 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193830)
 WARN  [NonPeriodicTasks:1] 2014-04-08 16:45:20,566 FBUtilities.java:359 - 
 Trigger directory doesn't exist, please create it and try again.
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:20,570 
 PasswordAuthenticator.java:220 - PasswordAuthenticator created default user 
 'cassandra'
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:21,806 Auth.java:232 - Created 
 default superuser 'cassandra'
 == /tmp/dtest-O3AAJr/test/node2/logs/system.log ==
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,214 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_keyspaces: 1004 (0%) 
 on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,215 Memtable.java:344 - 
 Writing Memtable-schema_keyspaces@781373873(276 serialized bytes, 6 ops, 
 0%/0% of on/off-heap limit)
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,295 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node2/flush/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-15-Data.db
  (179 bytes) for commitlog position ReplayPosition(segmentId=1396993504760, 
 position=243552)
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,296 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_columnfamilies: 34190 
 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,297 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@2077216447(5746 serialized bytes, 108 
 ops, 0%/0% 

[jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977357#comment-13977357
 ] 

Benedict commented on CASSANDRA-6476:
-

Isn't this most likely a duplicate of CASSANDRA-6948? Caused by never bouncing 
your node between bootstrap and hitting 4B+ messages, coupled with some dropped 
messages along the way.

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Sylvain Lebresne

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977360#comment-13977360
 ] 

T Jake Luciani commented on CASSANDRA-7069:
---

CASSANDRA-2434 will require one node at a time no?

 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.
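
 A rough sketch of the proposed check (hypothetical names, not an actual patch): 
 compare our gossip generation against the generations of other joining nodes 
 seen in the shadow round, and refuse to proceed until the gap is large enough.
 {code}
 // Hedged sketch only; method and constant names are illustrative.
 import java.util.Map;

 public class BootstrapGuardSketch
 {
     // Minimum generation gap (seconds) before it is considered safe to proceed
     static final int MIN_GENERATION_GAP_SECONDS = 120;

     static boolean safeToBootstrap(int ownGeneration, Map<String, Integer> joiningGenerations)
     {
         for (Map.Entry<String, Integer> other : joiningGenerations.entrySet())
         {
             // Another node began joining too recently: bail out (or sleep and retry)
             if (Math.abs(ownGeneration - other.getValue()) < MIN_GENERATION_GAP_SECONDS)
                 return false;
         }
         return true;
     }
 }
 {code}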



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977357#comment-13977357
 ] 

Benedict edited comment on CASSANDRA-6476 at 4/22/14 8:31 PM:
--

Isn't this most likely a duplicate of CASSANDRA-6948? Hit by never bouncing 
your node between bootstrap and hitting 4B+ messages, coupled with some dropped 
messages along the way, caused by shutting down the expiringmap reaper during 
bootstrap


was (Author: benedict):
Isn't this most likely a duplicate of CASSANDRA-6948? Caused by never bouncing 
your node between bootstrap and hitting 4B+ messages, coupled with some dropped 
messages along the way.

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Sylvain Lebresne

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977362#comment-13977362
 ] 

Brandon Williams commented on CASSANDRA-7069:
-

Without looking at that patch, will it gracefully handle starting a bunch of 
nodes up in bootstrap mode at once?

 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reassigned CASSANDRA-7011:


Assignee: Aleksey Yeschenko

 auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0
 -

 Key: CASSANDRA-7011
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7011
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Michael Shuler
Assignee: Aleksey Yeschenko

 This test hangs forever. When I hit ctl-c after running the test, then the 
 ccm nodes actually continue running - I think ccm is looking for log lines 
 that never occur until the test is killed(?).
 {noformat}
 $ export MAX_HEAP_SIZE=1G; export HEAP_NEWSIZE=256M; ENABLE_VNODES=true 
 PRINT_DEBUG=true nosetests --nocapture --nologcapture --verbosity=3 
 auth_test.py:TestAuth.system_auth_ks_is_alterable_test
 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
 system_auth_ks_is_alterable_test (auth_test.TestAuth) ... cluster ccm 
 directory: /tmp/dtest-O3AAJr
 ^C
 {noformat}
 Search for (hanging here) below - I typed this prior to hitting ctl-c. Then 
 the nodes start running again and I see Listening for thrift clients later 
 on.
 {noformat}
 mshuler@hana:~$ tail -f /tmp/dtest-O3AAJr/test/node*/logs/system.log
 == /tmp/dtest-O3AAJr/test/node1/logs/system.log ==
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,599 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@1792243696(1627 serialized bytes, 27 
 ops, 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:2] 2014-04-08 16:45:12,603 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-13,].
   14,454 bytes to 11,603 (~80% of original) in 105ms = 0.105386MB/s.  7 total 
 partitions merged to 3.  Partition merge counts were {1:1, 2:1, 4:1, }
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,668 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697/system-schema_columnfamilies-ka-14-Data.db
  (956 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193292)
 INFO  [MigrationStage:1] 2014-04-08 16:45:12,669 ColumnFamilyStore.java:853 - 
 Enqueuing flush of schema_columns: 6806 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,670 Memtable.java:344 - 
 Writing Memtable-schema_columns@352928691(1014 serialized bytes, 21 ops, 
 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:1] 2014-04-08 16:45:12,672 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-17,].
   710 bytes to 233 (~32% of original) in 70ms = 0.003174MB/s.  6 total 
 partitions merged to 3.  Partition merge counts were {1:2, 4:1, }
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,721 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-14-Data.db
  (435 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193830)
 WARN  [NonPeriodicTasks:1] 2014-04-08 16:45:20,566 FBUtilities.java:359 - 
 Trigger directory doesn't exist, please create it and try again.
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:20,570 
 PasswordAuthenticator.java:220 - PasswordAuthenticator created default user 
 'cassandra'
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:21,806 Auth.java:232 - Created 
 default superuser 'cassandra'
 == /tmp/dtest-O3AAJr/test/node2/logs/system.log ==
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,214 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_keyspaces: 1004 (0%) 
 on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,215 Memtable.java:344 - 
 Writing Memtable-schema_keyspaces@781373873(276 serialized bytes, 6 ops, 
 0%/0% of on/off-heap limit)
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,295 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node2/flush/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-15-Data.db
  (179 bytes) for commitlog position ReplayPosition(segmentId=1396993504760, 
 position=243552)
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,296 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_columnfamilies: 34190 
 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,297 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@2077216447(5746 serialized bytes, 108 
 ops, 0%/0% of on/off-heap limit)
 INFO  

[jira] [Commented] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977365#comment-13977365
 ] 

Aleksey Yeschenko commented on CASSANDRA-7011:
--

[~mshuler] could be another case of CASSANDRA-7058. Could you bisect?

 auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0
 -

 Key: CASSANDRA-7011
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7011
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Michael Shuler
Assignee: Aleksey Yeschenko

 This test hangs forever. When I hit ctl-c after running the test, then the 
 ccm nodes actually continue running - I think ccm is looking for log lines 
 that never occur until the test is killed(?).
 {noformat}
 $ export MAX_HEAP_SIZE=1G; export HEAP_NEWSIZE=256M; ENABLE_VNODES=true 
 PRINT_DEBUG=true nosetests --nocapture --nologcapture --verbosity=3 
 auth_test.py:TestAuth.system_auth_ks_is_alterable_test
 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
 system_auth_ks_is_alterable_test (auth_test.TestAuth) ... cluster ccm 
 directory: /tmp/dtest-O3AAJr
 ^C
 {noformat}
 Search for (hanging here) below - I typed this prior to hitting ctl-c. Then 
 the nodes start running again and I see Listening for thrift clients later 
 on.
 {noformat}
 mshuler@hana:~$ tail -f /tmp/dtest-O3AAJr/test/node*/logs/system.log
 == /tmp/dtest-O3AAJr/test/node1/logs/system.log ==
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,599 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@1792243696(1627 serialized bytes, 27 
 ops, 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:2] 2014-04-08 16:45:12,603 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-13,].
   14,454 bytes to 11,603 (~80% of original) in 105ms = 0.105386MB/s.  7 total 
 partitions merged to 3.  Partition merge counts were {1:1, 2:1, 4:1, }
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,668 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697/system-schema_columnfamilies-ka-14-Data.db
  (956 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193292)
 INFO  [MigrationStage:1] 2014-04-08 16:45:12,669 ColumnFamilyStore.java:853 - 
 Enqueuing flush of schema_columns: 6806 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,670 Memtable.java:344 - 
 Writing Memtable-schema_columns@352928691(1014 serialized bytes, 21 ops, 
 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:1] 2014-04-08 16:45:12,672 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-17,].
   710 bytes to 233 (~32% of original) in 70ms = 0.003174MB/s.  6 total 
 partitions merged to 3.  Partition merge counts were {1:2, 4:1, }
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,721 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-14-Data.db
  (435 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193830)
 WARN  [NonPeriodicTasks:1] 2014-04-08 16:45:20,566 FBUtilities.java:359 - 
 Trigger directory doesn't exist, please create it and try again.
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:20,570 
 PasswordAuthenticator.java:220 - PasswordAuthenticator created default user 
 'cassandra'
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:21,806 Auth.java:232 - Created 
 default superuser 'cassandra'
 == /tmp/dtest-O3AAJr/test/node2/logs/system.log ==
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,214 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_keyspaces: 1004 (0%) 
 on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,215 Memtable.java:344 - 
 Writing Memtable-schema_keyspaces@781373873(276 serialized bytes, 6 ops, 
 0%/0% of on/off-heap limit)
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,295 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node2/flush/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-15-Data.db
  (179 bytes) for commitlog position ReplayPosition(segmentId=1396993504760, 
 position=243552)
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,296 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_columnfamilies: 34190 
 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,297 Memtable.java:344 - 
 Writing 

[jira] [Assigned] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict reassigned CASSANDRA-6746:
---

Assignee: (was: Benedict)

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Ryan McGuire
Priority: Minor
  Labels: performance
 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four node cluister I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.jsonmetric=interval_op_rateoperation=stress-readsmoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977378#comment-13977378
 ] 

Richard Low commented on CASSANDRA-6476:


Yes, most likely it is. We see it correlated across nodes that were 
bootstrapped at the same time, which makes sense.

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Sylvain Lebresne

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict reassigned CASSANDRA-6476:
---

Assignee: Brandon Williams  (was: Sylvain Lebresne)

Want to backport your patch [~brandon.williams]?

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Brandon Williams

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7053) USING TIMESTAMP for batches does not work

2014-04-22 Thread Robert Supencheck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977389#comment-13977389
 ] 

Robert Supencheck commented on CASSANDRA-7053:
--

For what it is worth, the patch provided by Mikhail has been verified to work.

 USING TIMESTAMP for batches does not work
 -

 Key: CASSANDRA-7053
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7053
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Supencheck
Assignee: Mikhail Stepura
  Labels: cqlsh
 Fix For: 2.0.8, 2.1 beta2

 Attachments: cassandra-2.0-7053.patch


 When using the USING TIMESTAMP timestamp syntax for a batch statement, 
 the supplied timestamp is ignored.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977397#comment-13977397
 ] 

Brandon Williams commented on CASSANDRA-6476:
-

Unfortunately, I don't think your analysis is correct, because in 1.2 we only 
spin MS up/down for replace, not bootstrap.  The bootstrap shadow round was 
added in CASSANDRA-5571 which is 2.0-only.

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Brandon Williams

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7058) HHOM and BM direct delivery should not cause hints to be written on timeout

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977399#comment-13977399
 ] 

Jonathan Ellis commented on CASSANDRA-7058:
---

(2.0 patch LGTM.)

 HHOM and BM direct delivery should not cause hints to be written on timeout
 ---

 Key: CASSANDRA-7058
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7058
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 1.2.17, 2.0.8, 2.1 beta2

 Attachments: 7058-2.0.txt, 7058-final.txt, 7058-simplified.txt, 
 7058.txt, 7058.txt


 Currently, a timed out HHOM hint delivery would create a further hint, with a 
 wrong TTL. BM direct delivery code is using the same code snippet basically, 
 so is also affected (with slightly worse consequences).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977403#comment-13977403
 ] 

Michael Shuler commented on CASSANDRA-7011:
---

1.2 works, so I'll bisect with 2.0 and double check if I arrive at the same 
place with 2.1

 auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0
 -

 Key: CASSANDRA-7011
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7011
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Michael Shuler
Assignee: Aleksey Yeschenko

 This test hangs forever. When I hit ctl-c after running the test, then the 
 ccm nodes actually continue running - I think ccm is looking for log lines 
 that never occur until the test is killed(?).
 {noformat}
 $ export MAX_HEAP_SIZE=1G; export HEAP_NEWSIZE=256M; ENABLE_VNODES=true 
 PRINT_DEBUG=true nosetests --nocapture --nologcapture --verbosity=3 
 auth_test.py:TestAuth.system_auth_ks_is_alterable_test
 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
 system_auth_ks_is_alterable_test (auth_test.TestAuth) ... cluster ccm 
 directory: /tmp/dtest-O3AAJr
 ^C
 {noformat}
 Search for (hanging here) below - I typed this prior to hitting ctl-c. Then 
 the nodes start running again and I see Listening for thrift clients later 
 on.
 {noformat}
 mshuler@hana:~$ tail -f /tmp/dtest-O3AAJr/test/node*/logs/system.log
 == /tmp/dtest-O3AAJr/test/node1/logs/system.log ==
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,599 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@1792243696(1627 serialized bytes, 27 
 ops, 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:2] 2014-04-08 16:45:12,603 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-13,].
   14,454 bytes to 11,603 (~80% of original) in 105ms = 0.105386MB/s.  7 total 
 partitions merged to 3.  Partition merge counts were {1:1, 2:1, 4:1, }
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,668 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697/system-schema_columnfamilies-ka-14-Data.db
  (956 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193292)
 INFO  [MigrationStage:1] 2014-04-08 16:45:12,669 ColumnFamilyStore.java:853 - 
 Enqueuing flush of schema_columns: 6806 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,670 Memtable.java:344 - 
 Writing Memtable-schema_columns@352928691(1014 serialized bytes, 21 ops, 
 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:1] 2014-04-08 16:45:12,672 CompactionTask.java:287 
 - Compacted 4 sstables to 
 [/tmp/dtest-O3AAJr/test/node1/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-17,].
   710 bytes to 233 (~32% of original) in 70ms = 0.003174MB/s.  6 total 
 partitions merged to 3.  Partition merge counts were {1:2, 4:1, }
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,721 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node1/flush/system/schema_columns-296e9c049bec3085827dc17d3df2122a/system-schema_columns-ka-14-Data.db
  (435 bytes) for commitlog position ReplayPosition(segmentId=1396993504671, 
 position=193830)
 WARN  [NonPeriodicTasks:1] 2014-04-08 16:45:20,566 FBUtilities.java:359 - 
 Trigger directory doesn't exist, please create it and try again.
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:20,570 
 PasswordAuthenticator.java:220 - PasswordAuthenticator created default user 
 'cassandra'
 INFO  [NonPeriodicTasks:1] 2014-04-08 16:45:21,806 Auth.java:232 - Created 
 default superuser 'cassandra'
 == /tmp/dtest-O3AAJr/test/node2/logs/system.log ==
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,214 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_keyspaces: 1004 (0%) 
 on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,215 Memtable.java:344 - 
 Writing Memtable-schema_keyspaces@781373873(276 serialized bytes, 6 ops, 
 0%/0% of on/off-heap limit)
 INFO  [MemtableFlushWriter:1] 2014-04-08 16:45:12,295 Memtable.java:378 - 
 Completed flushing 
 /tmp/dtest-O3AAJr/test/node2/flush/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-15-Data.db
  (179 bytes) for commitlog position ReplayPosition(segmentId=1396993504760, 
 position=243552)
 INFO  [InternalResponseStage:4] 2014-04-08 16:45:12,296 
 ColumnFamilyStore.java:853 - Enqueuing flush of schema_columnfamilies: 34190 
 (0%) on-heap, 0 (0%) off-heap
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,297 Memtable.java:344 - 
 Writing 

[jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977408#comment-13977408
 ] 

Brandon Williams commented on CASSANDRA-6476:
-

[~rlow] are you doing a bootstrap, or are you doing a replace?

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Brandon Williams

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977409#comment-13977409
 ] 

Richard Low commented on CASSANDRA-6476:


Sorry, the instances we've seen were replaced, not bootstrapped. This is on 
1.2.15.

Theo, had you replaced instances too?

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Brandon Williams

 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strange about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6948) After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and cannot be restarted, causing callbacks to collect indefinitely

2014-04-22 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977420#comment-13977420
 ] 

Richard Low commented on CASSANDRA-6948:


See the discussion on CASSANDRA-6476; it applies to 1.2.15 on replace. Can this 
be reopened and fixed on 1.2?

 After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and 
 cannot be restarted, causing callbacks to collect indefinitely
 ---

 Key: CASSANDRA-6948
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6948
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Keith Wright
Assignee: Brandon Williams
 Fix For: 2.0.7, 2.1 beta2

 Attachments: 6948-v2.txt, 6948.debug.txt, 6948.txt, Screen Shot 
 2014-03-28 at 11.27.56 AM.png, Screen Shot 2014-03-28 at 11.29.24 AM.png, 
 cassandra.log.min, cassandra.yaml, logs.old.tar.gz, logs.tar.gz, 
 system.log.1.gz, system.log.gz


 Since ExpiringMap.shutdown() shuts down the static executor service, it 
 cannot be restarted (and in fact reset() makes no attempt to do so). As such, 
 callbacks that receive no response are never removed from the map, and 
 eventually either the server will run out of memory or it will loop around 
 the integer space and start reusing message ids that have not yet expired, 
 causing assertions to be thrown and messages to fail to be sent. It appears 
 that this situation only arises on bootstrap or node replacement, as 
 MessagingService is shut down before being attached to the listen address.
 This can cause the following errors to begin occurring in the log:
 ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 
 ErrorMessage.java (line 222) Unexpected exception during request
 java.lang.AssertionError: Callback already exists for id -1665979622! 
 (CallbackInfo(target=/10.106.160.84, 
 callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, 
 serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
   at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
   at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
   at 
 org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
   at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
   at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
   at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
   at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
   at 
 org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
   at 
 org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
   at 
 org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
   at 
 org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
   at 
 org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 
 CassandraDaemon.java (line 196) Exception in thread 
 Thread[ReplicateOnWriteStage:102766,5,main]
 java.lang.AssertionError: Callback already exists for id -1665979620! 
 (CallbackInfo(target=/10.106.160.84, 
 callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, 
 serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
   at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
   at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
   at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
   at 
 org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
   at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
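To illustrate the failure mode described above, here is a minimal hypothetical sketch in plain Java (not Cassandra's actual ExpiringMap code): once a static ScheduledExecutorService has been shut down it can never be restarted, so the reaping task stops for good and the callback map only grows.
{noformat}
import java.util.concurrent.*;

// Hypothetical sketch: a statically shared reaper cannot survive shutdown().
class CallbackMap
{
    private static final ScheduledExecutorService REAPER =
            Executors.newSingleThreadScheduledExecutor();

    private final ConcurrentMap<Integer, Object> callbacks = new ConcurrentHashMap<>();

    CallbackMap(long expireAfterMillis)
    {
        // Reap periodically; this task silently stops once REAPER is shut down.
        REAPER.scheduleAtFixedRate(this::reap, expireAfterMillis, expireAfterMillis,
                                   TimeUnit.MILLISECONDS);
    }

    void add(int messageId, Object callback)
    {
        // With the reaper dead, entries accumulate until OOM or message id reuse.
        callbacks.put(messageId, callback);
    }

    private void reap()
    {
        callbacks.clear(); // real code would only drop expired entries
    }

    static void shutdown()
    {
        REAPER.shutdown(); // permanent: a later reset() cannot revive the executor
    }
}
{noformat}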



--
This message was sent by Atlassian JIRA

[jira] [Updated] (CASSANDRA-6476) Assertion error in MessagingService.addCallback

2014-04-22 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6476:


Attachment: 6476.txt

Half a backport from CASSANDRA-6948 to only affect replace.

 Assertion error in MessagingService.addCallback
 ---

 Key: CASSANDRA-6476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.2 DCE, Cassandra 1.2.15
Reporter: Theo Hultberg
Assignee: Brandon Williams
 Attachments: 6476.txt


 Two of the three Cassandra nodes in one of our clusters just started behaving 
 very strangely about an hour ago. Within a minute of each other they started 
 logging AssertionErrors (see stack traces here: 
 https://gist.github.com/iconara/7917438) over and over again. The client lost 
 connection with the nodes at roughly the same time. The nodes were still up, 
 and even if no clients were connected to them they continued logging the same 
 errors over and over.
 The errors are in the native transport (specifically 
 MessagingService.addCallback) which makes me suspect that it has something to 
 do with a test that we started running this afternoon. I've just implemented 
 support for frame compression in my CQL driver cql-rb. About two hours before 
 this happened I deployed a version of the application which enabled Snappy 
 compression on all frames larger than 64 bytes. It's not impossible that 
 there is a bug somewhere in the driver or compression library that caused 
 this -- but at the same time, it feels like it shouldn't be possible to make 
 C* a zombie with a bad frame.
 Restarting seems to have got them back running again, but I suspect they will 
 go down again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6588) Add a 'NO EMPTY RESULTS' filter to SELECT

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977437#comment-13977437
 ] 

Jonathan Ellis commented on CASSANDRA-6588:
---

So no new syntax is implied, right?  +1 from me, too.

 Add a 'NO EMPTY RESULTS' filter to SELECT
 -

 Key: CASSANDRA-6588
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6588
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1 beta2


 It is the semantic of CQL that a (CQL) row exists as long as it has one 
 non-null column (including the PK columns, which, given that no PK columns 
 can be null, means that it's enough to have the PK set for a row to exist). 
 This does mean that the result to
 {noformat}
 CREATE TABLE test (k int PRIMARY KEY, v1 int, v2 int);
 INSERT INTO test(k, v1) VALUES (0, 4);
 SELECT v2 FROM test;
 {noformat}
 must be (and is)
 {noformat}
  v2
 --
  null
 {noformat}
 That fact does mean, however, that when we only select a few columns of a 
 row, we still need to return rows that exist but have no values for the 
 selected columns. Long story short, given how the storage engine works, this 
 means we need to query full (CQL) rows even when only some of the columns are 
 selected, because that's the only way to distinguish between "the row exists 
 but has no value for the selected columns" and "the row doesn't exist". I'll 
 note in particular that, due to CASSANDRA-5762, we unfortunately can't rely 
 on the row marker to optimize that out.
 Now, when you select only a subset of the columns of a row, there are many 
 cases where you don't care about rows that exist but have no value for the 
 columns you requested and are happy to filter those out. So, for those cases, 
 we could provide a new SELECT filter. Outside of the potential convenience 
 (not having to filter empty results client side), one interesting part is 
 that when this filter is provided, we could optimize a bit by only querying 
 the columns selected, since we wouldn't need to return rows that exist but 
 have no values for the selected columns.
 For the exact syntax, there are probably a bunch of options. For instance:
 * {{SELECT NON EMPTY(v2, v3) FROM test}}: the vague rationale for putting it 
 in the SELECT part is that such a filter is kind of in the spirit of 
 DISTINCT. Possibly a bit ugly outside of that.
 * {{SELECT v2, v3 FROM test NO EMPTY RESULTS}} or {{SELECT v2, v3 FROM test 
 NO EMPTY ROWS}} or {{SELECT v2, v3 FROM test NO EMPTY}}: the last one is 
 shorter but maybe a bit less explicit. As for {{RESULTS}} versus {{ROWS}}, 
 the only small objection to {{NO EMPTY ROWS}} could be that it might suggest 
 it is filtering non-existing rows (I mean, the fact that we never ever return 
 non-existing rows should hint that it's not what it does, but well...) while 
 we're just filtering empty resultSet rows.
 Of course, if there is a pre-existing SQL syntax for that, it's even better, 
 though a very quick search didn't turn up anything. Other suggestions welcome 
 too.
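For comparison, the client-side filtering that such an option would remove might look roughly like the following (a sketch only, assuming the DataStax Java driver and the {{test}} table from the example above):
{noformat}
import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

final class EmptyRowFilter
{
    // Keep only rows with at least one non-null value among the selected columns
    // (here just v2), which is what the proposed server-side filter would do for us.
    static List<Row> selectNonEmpty(Session session)
    {
        List<Row> rows = new ArrayList<>();
        for (Row row : session.execute("SELECT v2 FROM test"))
            if (!row.isNull("v2"))
                rows.add(row);
        return rows;
    }
}
{noformat}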



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6948) After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and cannot be restarted, causing callbacks to collect indefinitely

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977452#comment-13977452
 ] 

Brandon Williams commented on CASSANDRA-6948:
-

Already have a patch on CASSANDRA-6476, let's handle it there.

 After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and 
 cannot be restarted, causing callbacks to collect indefinitely
 ---

 Key: CASSANDRA-6948
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6948
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Keith Wright
Assignee: Brandon Williams
 Fix For: 2.0.7, 2.1 beta2

 Attachments: 6948-v2.txt, 6948.debug.txt, 6948.txt, Screen Shot 
 2014-03-28 at 11.27.56 AM.png, Screen Shot 2014-03-28 at 11.29.24 AM.png, 
 cassandra.log.min, cassandra.yaml, logs.old.tar.gz, logs.tar.gz, 
 system.log.1.gz, system.log.gz


 Since ExpiringMap.shutdown() shuts down the static executor service, it 
 cannot be restarted (and in fact reset() makes no attempt to do so). As such, 
 callbacks that receive no response are never removed from the map, and 
 eventually either the server will run out of memory or it will loop around 
 the integer space and start reusing message ids that have not yet expired, 
 causing assertions to be thrown and messages to fail to be sent. It appears 
 that this situation only arises on bootstrap or node replacement, as 
 MessagingService is shut down before being attached to the listen address.
 This can cause the following errors to begin occurring in the log:
 ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 
 ErrorMessage.java (line 222) Unexpected exception during request
 java.lang.AssertionError: Callback already exists for id -1665979622! 
 (CallbackInfo(target=/10.106.160.84, 
 callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, 
 serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
   at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
   at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
   at 
 org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
   at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
   at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
   at 
 org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
   at 
 org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
   at 
 org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
   at 
 org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
   at 
 org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
   at 
 org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
   at 
 org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 
 CassandraDaemon.java (line 196) Exception in thread 
 Thread[ReplicateOnWriteStage:102766,5,main]
 java.lang.AssertionError: Callback already exists for id -1665979620! 
 (CallbackInfo(target=/10.106.160.84, 
 callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, 
 serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
   at 
 org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
   at 
 org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
   at 
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
   at 
 org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
   at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7070) Virtual column name aliasing

2014-04-22 Thread Jay Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Patel updated CASSANDRA-7070:
-

Component/s: (was: API)

 Virtual column name aliasing
 

 Key: CASSANDRA-7070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7070
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jay Patel
 Fix For: 3.0


 Hi folks!
 Currently, storage space is saved significantly (sometimes by terabytes for 
 static tables!) by shortening column names, since they are repeated in each 
 row; however, these short column names can be very unreadable. So far, I've 
 seen 10s of tables with 100s of convoluted names. They are hard to debug 
 issues with and hard to work with on a day-to-day basis. This can make smart 
 engineers quit the project or even the company :) 
 Another reason: most of the time, folks are not even aware that column names 
 are repeated, and end up with really descriptive column names. Then they 
 realize the waste of disk/ram/network, and spend time on re-implementation 
 and/or a crazy migration to a new table.
 Yet another reason: a table might be shared by multiple systems, use cases 
 and people in an organization, e.g. primary/analytics use cases, 
 Ops/Developers, etc. It then becomes an issue of reliably keeping the mapping 
 from convoluted names to descriptive names. Usually these mappings are kept 
 in Java enums; I've seen them in the DB as well, just so Ops folks don't have 
 to interpret Java code :)
 It would be great if Cassandra internally could virtually alias the column 
 names to a more efficient representation in storage. 
 I can take a shot at this feature if there are no major concerns. We ideally 
 want the user to work with the descriptive alias everywhere and not even be 
 aware of the internal storage name of the column. Also, I think the 
 name/alias mapping needs to be cached all the time to avoid any performance 
 hit. Any thoughts? How difficult is it to accommodate? BTW, I think this may 
 not directly apply to dynamic tables, as we rely on the column name for 
 proper ordering of columns in wide rows. However, we may have some room there 
 if it's not a clustered column.
 Thanks,
 Jay



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7070) Virtual column name aliasing

2014-04-22 Thread Jay Patel (JIRA)
Jay Patel created CASSANDRA-7070:


 Summary: Virtual column name aliasing
 Key: CASSANDRA-7070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7070
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Jay Patel
 Fix For: 3.0


Hi folks!

Currently, storage space is saved significantly (sometimes by terabytes for 
static tables!) by shortening column names, since they are repeated in each row; 
however, these short column names can be very unreadable. So far, I've seen 10s 
of tables with 100s of convoluted names. They are hard to debug issues with and 
hard to work with on a day-to-day basis. This can make smart engineers quit the 
project or even the company :) 

Another reason: most of the time, folks are not even aware that column names 
are repeated, and end up with really descriptive column names. Then they realize 
the waste of disk/ram/network, and spend time on re-implementation and/or a 
crazy migration to a new table.

Yet another reason: a table might be shared by multiple systems, use cases and 
people in an organization, e.g. primary/analytics use cases, Ops/Developers, 
etc. It then becomes an issue of reliably keeping the mapping from convoluted 
names to descriptive names. Usually these mappings are kept in Java enums; I've 
seen them in the DB as well, just so Ops folks don't have to interpret Java 
code :)

It would be great if Cassandra internally could virtually alias the column 
names to a more efficient representation in storage. 

I can take a shot at this feature if there are no major concerns. We ideally 
want the user to work with the descriptive alias everywhere and not even be 
aware of the internal storage name of the column. Also, I think the name/alias 
mapping needs to be cached all the time to avoid any performance hit. Any 
thoughts? How difficult is it to accommodate? BTW, I think this may not 
directly apply to dynamic tables, as we rely on the column name for proper 
ordering of columns in wide rows. However, we may have some room there if it's 
not a clustered column.

Thanks,
Jay



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (CASSANDRA-7070) Virtual column name aliasing

2014-04-22 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-7070.
-

Resolution: Duplicate

 Virtual column name aliasing
 

 Key: CASSANDRA-7070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7070
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jay Patel
 Fix For: 3.0


 Hi folks!
 Currently, storage space is saved significantly (sometimes by terabytes for 
 static tables!) by shortening column names, since they are repeated in each 
 row; however, these short column names can be very unreadable. So far, I've 
 seen 10s of tables with 100s of convoluted names. They are hard to debug 
 issues with and hard to work with on a day-to-day basis. This can make smart 
 engineers quit the project or even the company :) 
 Another reason: most of the time, folks are not even aware that column names 
 are repeated, and end up with really descriptive column names. Then they 
 realize the waste of disk/ram/network, and spend time on re-implementation 
 and/or a crazy migration to a new table.
 Yet another reason: a table might be shared by multiple systems, use cases 
 and people in an organization, e.g. primary/analytics use cases, 
 Ops/Developers, etc. It then becomes an issue of reliably keeping the mapping 
 from convoluted names to descriptive names. Usually these mappings are kept 
 in Java enums; I've seen them in the DB as well, just so Ops folks don't have 
 to interpret Java code :)
 It would be great if Cassandra internally could virtually alias the column 
 names to a more efficient representation in storage. 
 I can take a shot at this feature if there are no major concerns. We ideally 
 want the user to work with the descriptive alias everywhere and not even be 
 aware of the internal storage name of the column. Also, I think the 
 name/alias mapping needs to be cached all the time to avoid any performance 
 hit. Any thoughts? How difficult is it to accommodate? BTW, I think this may 
 not directly apply to dynamic tables, as we rely on the column name for 
 proper ordering of columns in wide rows. However, we may have some room there 
 if it's not a clustered column.
 Thanks,
 Jay



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7070) Virtual column name aliasing

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977464#comment-13977464
 ] 

Benedict commented on CASSANDRA-7070:
-

Hi Jay,

This is indeed a good idea - one there's already a ticket open for. Please feel 
free to join the discussion there.

 Virtual column name aliasing
 

 Key: CASSANDRA-7070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7070
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jay Patel
 Fix For: 3.0


 Hi folks!
 Currently, storage space is saved significantly (sometimes by terabytes for 
 static tables!) by shortening column names, since they are repeated in each 
 row; however, these short column names can be very unreadable. So far, I've 
 seen 10s of tables with 100s of convoluted names. They are hard to debug 
 issues with and hard to work with on a day-to-day basis. This can make smart 
 engineers quit the project or even the company :) 
 Another reason: most of the time, folks are not even aware that column names 
 are repeated, and end up with really descriptive column names. Then they 
 realize the waste of disk/ram/network, and spend time on re-implementation 
 and/or a crazy migration to a new table.
 Yet another reason: a table might be shared by multiple systems, use cases 
 and people in an organization, e.g. primary/analytics use cases, 
 Ops/Developers, etc. It then becomes an issue of reliably keeping the mapping 
 from convoluted names to descriptive names. Usually these mappings are kept 
 in Java enums; I've seen them in the DB as well, just so Ops folks don't have 
 to interpret Java code :)
 It would be great if Cassandra internally could virtually alias the column 
 names to a more efficient representation in storage. 
 I can take a shot at this feature if there are no major concerns. We ideally 
 want the user to work with the descriptive alias everywhere and not even be 
 aware of the internal storage name of the column. Also, I think the 
 name/alias mapping needs to be cached all the time to avoid any performance 
 hit. Any thoughts? How difficult is it to accommodate? BTW, I think this may 
 not directly apply to dynamic tables, as we rely on the column name for 
 proper ordering of columns in wide rows. However, we may have some room there 
 if it's not a clustered column.
 Thanks,
 Jay



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-4175) Reduce memory, disk space, and cpu usage with a column name/id map

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977467#comment-13977467
 ] 

Benedict commented on CASSANDRA-4175:
-

See also CASSANDRA-6917 - IMO the best solution to this problem is an enum data 
type, and then to convert all column names to that type.

 Reduce memory, disk space, and cpu usage with a column name/id map
 --

 Key: CASSANDRA-4175
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
  Labels: performance
 Fix For: 3.0


 We spend a lot of memory on column names, both transiently (during reads) and 
 more permanently (in the row cache).  Compression mitigates this on disk but 
 not on the heap.
 The overhead is significant for typical small column values, e.g., ints.
 Even though we intern once we get to the memtable, this affects writes too 
 via very high allocation rates in the young generation, hence more GC 
 activity.
 Now that CQL3 provides us some guarantees that column names must be defined 
 before they are inserted, we could create a map of (say) 32-bit int column 
 ids to names, and use that internally right up until we return a resultset to 
 the client.
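Purely as an illustration of the idea (hypothetical names, not a proposed patch), such a dictionary could look like this, with int ids used internally and names restored only at the resultset boundary:
{noformat}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a column name <-> 32-bit id dictionary.
final class ColumnNameDictionary
{
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final List<String> namesById = new ArrayList<>();

    // Register a name at definition time (CQL3 guarantees names are defined before use).
    synchronized int idFor(String name)
    {
        Integer id = idsByName.get(name);
        if (id == null)
        {
            id = namesById.size();
            namesById.add(name);
            idsByName.put(name, id);
        }
        return id;
    }

    // Translate back just before a resultset is returned to the client.
    synchronized String nameFor(int id)
    {
        return namesById.get(id);
    }
}
{noformat}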



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6810) SSTable and Index Layout Improvements/Modifications

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977484#comment-13977484
 ] 

Jonathan Ellis commented on CASSANDRA-6810:
---

related: CASSANDRA-4324

 SSTable and Index Layout Improvements/Modifications
 ---

 Key: CASSANDRA-6810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6810
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
  Labels: performance
 Fix For: 3.0


 Right now SSTables are somewhat inefficient in their storage of composite 
 keys. I propose resolving this by merging (some of) the index functionality 
 with the storage of keys, through introducing a composite btree/trie 
 structure (e.g. string b-tree) to represent the key, and for this structure 
 to index into the cell position in the file. This structure can then serve as 
 both an efficient index and the key data itself. 
 If we then offer the option of (possibly automatically decided for you at 
 flush) storing this either packed into the same file directly prepending the 
 data, or in a separate key file (with small pages), with an uncompressed page 
 cache we can get good performance for wide rows by storing it separately and 
 relying on the page cache for CQL row index lookups, whereas storing it 
 inline will allow very efficient lookups of small rows where index lookups 
 aren't particularly helpful. This removal of extra data from the index file, 
 however, will allow CASSANDRA-6709 to massively scale up the efficiency of 
 the key cache, whilst also reducing the total disk footprint of sstables and 
 (most likely) offering better indexing capability in similar space



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6887) LOCAL_ONE read repair only does local repair, in spite of global digest queries

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-6887:
-

Assignee: Aleksey Yeschenko

 LOCAL_ONE read repair only does local repair, in spite of global digest 
 queries
 ---

 Key: CASSANDRA-6887
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6887
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.6, x86-64 ubuntu precise
Reporter: Duncan Sands
Assignee: Aleksey Yeschenko
 Fix For: 2.0.8


 I have a cluster spanning two data centres.  Almost all of the writing (and a 
 lot of reading) is done in DC1.  DC2 is used for running the occasional 
 analytics query.  Reads in both data centres use LOCAL_ONE.  Read repair 
 settings are set to the defaults on all column families.
 I had a long network outage between the data centres; it lasted longer than 
 the hints window, so after it was over DC2 didn't have the latest 
 information.  Even after reading data many many times in DC2, the returned 
 data was still out of date: read repair was not correcting it.
 I then investigated using cqlsh in DC2, with tracing on.
 What I saw was:
   - with consistency ONE, after about 10 read requests a digest request would 
 be sent to many nodes (spanning both data centres), and the data in DC2 would 
 be repaired.
  - with consistency LOCAL_ONE, after about 10 read requests a digest request 
 would be sent to many nodes (spanning both data centres), but the data in DC2 
 would not be repaired.  This is in spite of digest requests being sent to 
 DC1, as shown by the tracing.
 So it looks like digest requests are being sent to both data centres, but 
 replies from outside the local data centre are ignored when using LOCAL_ONE.
 The same data is being queried all the time in DC1 with consistency 
 LOCAL_ONE, but this didn't result in the data in DC2 being read repaired 
 either.  This is a slightly different case to what I described above: in that 
 case the local node was out of date and the remote node had the latest data, 
 while here it is the other way round.
 It could be argued that you don't want cross data centre read repair when 
 using LOCAL_ONE.  But then why bother sending cross data centre digest 
 requests?  And if only doing local read repair is how it is supposed to work 
 then it would be good to document this somewhere.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6546) disablethrift results in unclosed file descriptors

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977497#comment-13977497
 ] 

Jonathan Ellis commented on CASSANDRA-6546:
---

Do you want to take a stab at this, [~mishail]?

 disablethrift results in unclosed file descriptors
 --

 Key: CASSANDRA-6546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6546
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Harvey
Assignee: Ryan McGuire
Priority: Minor

 Disabling thrift results in unclosed thrift sockets being left around.
 Steps to reproduce and observe:
 1. Have a handful of clients connect via thrift.
 2. Disable thrift.
 3. Enable thrift, have the clients reconnect.
 4. Observe netstat or lsof, and you'll find a lot of thrift sockets in 
 CLOSE_WAIT state, and they'll never go away.
   * Also verifiable from 
 org.apache.cassandra.metrics:type=Client,name=connectedThriftClients MBean.
 What's extra fun about this is the leaked sockets still count towards your 
 maximum RPC thread count. As a result, toggling thrift enough times will 
 result in an rpc_max_threads number of CLOSE_WAIT sockets, with no new 
 clients able to connect.
 This was reproduced with HSHA. I haven't tried it in sync yet.
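One way to watch the leak via the MBean mentioned above is plain JMX; a rough sketch (the localhost:7199 endpoint and the "Value" attribute name are assumptions about a default local setup):
{noformat}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Poll the connected-thrift-clients gauge over JMX.
public final class ThriftClientCount
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName gauge = new ObjectName(
                    "org.apache.cassandra.metrics:type=Client,name=connectedThriftClients");
            System.out.println("connectedThriftClients = " + mbs.getAttribute(gauge, "Value"));
        }
        finally
        {
            connector.close();
        }
    }
}
{noformat}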



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6887) LOCAL_ONE read repair only does local repair, in spite of global digest queries

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6887:
--

Fix Version/s: 2.0.8

 LOCAL_ONE read repair only does local repair, in spite of global digest 
 queries
 ---

 Key: CASSANDRA-6887
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6887
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.6, x86-64 ubuntu precise
Reporter: Duncan Sands
Assignee: Aleksey Yeschenko
 Fix For: 2.0.8


 I have a cluster spanning two data centres.  Almost all of the writing (and a 
 lot of reading) is done in DC1.  DC2 is used for running the occasional 
 analytics query.  Reads in both data centres use LOCAL_ONE.  Read repair 
 settings are set to the defaults on all column families.
 I had a long network outage between the data centres; it lasted longer than 
 the hints window, so after it was over DC2 didn't have the latest 
 information.  Even after reading data many many times in DC2, the returned 
 data was still out of date: read repair was not correcting it.
 I then investigated using cqlsh in DC2, with tracing on.
 What I saw was:
   - with consistency ONE, after about 10 read requests a digest request would 
 be sent to many nodes (spanning both data centres), and the data in DC2 would 
 be repaired.
  - with consistency LOCAL_ONE, after about 10 read requests a digest request 
 would be sent to many nodes (spanning both data centres), but the data in DC2 
 would not be repaired.  This is in spite of digest requests being sent to 
 DC1, as shown by the tracing.
 So it looks like digest requests are being sent to both data centres, but 
 replies from outside the local data centre are ignored when using LOCAL_ONE.
 The same data is being queried all the time in DC1 with consistency 
 LOCAL_ONE, but this didn't result in the data in DC2 being read repaired 
 either.  This is a slightly different case to what I described above: in that 
 case the local node was out of date and the remote node had the latest data, 
 while here it is the other way round.
 It could be argued that you don't want cross data centre read repair when 
 using LOCAL_ONE.  But then why bother sending cross data centre digest 
 requests?  And if only doing local read repair is how it is supposed to work 
 then it would be good to document this somewhere.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6944) liveRatio jumps to max when Memtable is empty

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-6944:
-

Assignee: Aleksey Yeschenko

 liveRatio jumps to max when Memtable is empty
 -

 Key: CASSANDRA-6944
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6944
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: SUSE Linux Enterprise 11 (64-bit)
Reporter: Erik Hansen
Assignee: Aleksey Yeschenko
Priority: Minor
  Labels: memtables

 liveRatio calculation on an empty memtable results in a value of Infinity 
 since memtable.currentSize=0.  Infinity then gets capped at the liveRatio max 
 of 64.
 {noformat}
 WARN [MemoryMeter:1] 2014-03-19 09:26:59,483 Memtable.java (line 441) setting 
 live ratio to maximum of 64.0 instead of Infinity
 INFO [MemoryMeter:1] 2014-03-19 09:26:59,485 Memtable.java (line 452) 
 CFS(Keyspace='system', ColumnFamily='compactions_in_progress') liveRatio is 
 64.0 (just-counted was 64.0).  calculation took 7ms for 0 cells
 {noformat}
 Jumping liveRatio to the max value based on an empty Memtable leads to more 
 frequent flushing than may be necessary.
 CASSANDRA-4243 previously addressed this issue, but was resolved as fixed by 
 CASSANDRA-3741.  It does not appear this issue has been fixed as of 2.0.5
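The arithmetic behind the warning is just liveSize / currentSize with currentSize == 0; a minimal sketch of a guard that keeps the previous value instead of capping Infinity (illustrative names, not the actual Memtable code):
{noformat}
final class LiveRatio
{
    static final double MAX_LIVE_RATIO = 64.0; // the cap mentioned in the warning above

    // Skip recalculation while the memtable is empty instead of jumping to the cap.
    static double recalculate(long liveSize, long currentSize, double previousRatio)
    {
        if (currentSize == 0)
            return previousRatio;
        return Math.min(MAX_LIVE_RATIO, (double) liveSize / currentSize);
    }
}
{noformat}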



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (CASSANDRA-876) Support session (read-after-write) consistency

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-876.
--

Resolution: Won't Fix

Going to close as wontfix; I don't see "be careful not to do too many writes in 
your session or you'll OOM because we're saving them for CL.RYW" as a fun 
explanation to give people.

 Support session (read-after-write) consistency
 --

 Key: CASSANDRA-876
 URL: https://issues.apache.org/jira/browse/CASSANDRA-876
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
  Labels: gsoc, gsoc2010
 Attachments: 876-v2.txt, CASSANDRA-876.patch


 In http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html and 
 http://www.allthingsdistributed.com/2008/12/eventually_consistent.html Amazon 
 discusses the concept of eventual consistency.  Cassandra uses eventual 
 consistency in a design similar to Dynamo.
 Supporting session consistency would be useful and relatively easy to add: we 
 already have the concept of a Memtable (see 
 http://wiki.apache.org/cassandra/MemtableSSTable ) to stage updates in 
 before flushing to disk; if we applied mutations to a session-level memtable 
 on the coordinator machine (that is, the machine the client is connected to), 
 and then did a final merge from that table against query results before 
 handing them to the client, we'd get it almost for free.
 Of course, the devil is in the details; thrift doesn't provide any hooks for 
 session-level data out of the box, but we could do this with a threadlocal 
 approach fairly easily.  CASSANDRA-569 has some (probably out of date now) 
 code that might be useful here.
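A very rough sketch of the coordinator-side idea (all names hypothetical, and nothing here reflects actual Cassandra internals): keep a per-session overlay of recent writes and merge it over replica results before replying.
{noformat}
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-session overlay: writes are recorded here as well as sent to
// replicas, and reads merge the overlay over whatever the replicas returned.
final class SessionOverlay
{
    private final Map<String, String> recentWrites = new HashMap<>();

    void recordWrite(String key, String value)
    {
        recentWrites.put(key, value);
    }

    // Final merge before handing results to the client: the session's own writes win.
    Map<String, String> merge(Map<String, String> replicaResults)
    {
        Map<String, String> merged = new HashMap<>(replicaResults);
        merged.putAll(recentWrites);
        return merged;
    }
}
{noformat}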



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6950) Secondary index query fails with tc range query when ordered by DESC

2014-04-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-6950:
-

Assignee: Sylvain Lebresne

duplicate?

 Secondary index query fails with tc range query when ordered by DESC
 

 Key: CASSANDRA-6950
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6950
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: RHEL 6.3 virtual guest, 
 apache-cassandra-2.0.6-SNAPSHOT-src.tar.gz from build #284 (also tried with 
 2.0.5 with CASSANDRA- patch custom-applied with same result).
Reporter: Andre Campeau
Assignee: Sylvain Lebresne

 create table test4 ( name text, lname text, tc bigint, record text, 
 PRIMARY KEY ((name, lname), tc)) WITH CLUSTERING ORDER BY (tc DESC) AND 
 compaction={'class': 'LeveledCompactionStrategy'};
 create index test4_index ON test4(lname);
 Populate it with some data and non-zero tc values, then try:
 select * from test4 where lname='blah' and tc > 0 allow filtering;
 And (0 rows) is returned, even though there are rows which should be found.
 When I create the table using CLUSTERING ORDER BY (tc ASC), the above query 
 works. Rows are correctly returned based on the range check.
 Tried various combinations but with descending order on tc nothing works.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6944) liveRatio jumps to max when Memtable is empty

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977506#comment-13977506
 ] 

Jonathan Ellis commented on CASSANDRA-6944:
---

Not sure how you'd see this with 2.0.5 but it will definitely be a problem 
after CASSANDRA-6945 unless it was fixed as part of that ticket.

 liveRatio jumps to max when Memtable is empty
 -

 Key: CASSANDRA-6944
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6944
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: SUSE Linux Enterprise 11 (64-bit)
Reporter: Erik Hansen
Assignee: Aleksey Yeschenko
Priority: Minor
  Labels: memtables

 liveRatio calculation on an empty memtable results in a value of Infinity 
 since memtable.currentSize=0.  Infinity then gets capped at the liveRatio max 
 of 64.
 {noformat}
 WARN [MemoryMeter:1] 2014-03-19 09:26:59,483 Memtable.java (line 441) setting 
 live ratio to maximum of 64.0 instead of Infinity
 INFO [MemoryMeter:1] 2014-03-19 09:26:59,485 Memtable.java (line 452) 
 CFS(Keyspace='system', ColumnFamily='compactions_in_progress') liveRatio is 
 64.0 (just-counted was 64.0).  calculation took 7ms for 0 cells
 {noformat}
 Jumping liveRatio to the max value based on an empty Memtable leads to more 
 frequent flushing than may be necessary.
 CASSANDRA-4243 previously addressed this issue, but was resolved as fixed by 
 CASSANDRA-3741.  It does not appear this issue has been fixed as of 2.0.5



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-6546) disablethrift results in unclosed file descriptors

2014-04-22 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura reassigned CASSANDRA-6546:
--

Assignee: Mikhail Stepura  (was: Ryan McGuire)

 disablethrift results in unclosed file descriptors
 --

 Key: CASSANDRA-6546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6546
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Harvey
Assignee: Mikhail Stepura
Priority: Minor

 Disabling thrift results in unclosed thrift sockets being left around.
 Steps to reproduce and observe:
 1. Have a handful of clients connect via thrift.
 2. Disable thrift.
 3. Enable thrift, have the clients reconnect.
 4. Observe netstat or lsof, and you'll find a lot of thrift sockets in 
 CLOSE_WAIT state, and they'll never go away.
   * Also verifiable from 
 org.apache.cassandra.metrics:type=Client,name=connectedThriftClients MBean.
 What's extra fun about this is the leaked sockets still count towards your 
 maximum RPC thread count. As a result, toggling thrift enough times will 
 result in an rpc_max_threads number of CLOSE_WAIT sockets, with no new 
 clients able to connect.
 This was reproduced with HSHA. I haven't tried it in sync yet.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6546) disablethrift results in unclosed file descriptors

2014-04-22 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977509#comment-13977509
 ] 

Mikhail Stepura commented on CASSANDRA-6546:


[~jbellis] will do

 disablethrift results in unclosed file descriptors
 --

 Key: CASSANDRA-6546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6546
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Harvey
Assignee: Ryan McGuire
Priority: Minor

 Disabling thrift results in unclosed thrift sockets being left around.
 Steps to reproduce and observe:
 1. Have a handful of clients connect via thrift.
 2. Disable thrift.
 3. Enable thrift, have the clients reconnect.
 4. Observe netstat or lsof, and you'll find a lot of thrift sockets in 
 CLOSE_WAIT state, and they'll never go away.
   * Also verifiable from 
 org.apache.cassandra.metrics:type=Client,name=connectedThriftClients MBean.
 What's extra fun about this is the leaked sockets still count towards your 
 maximum RPC thread count. As a result, toggling thrift enough times will 
 result in an rpc_max_threads number of CLOSE_WAIT sockets, with no new 
 clients able to connect.
 This was reproduced with HSHA. I haven't tried it in sync yet.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (CASSANDRA-6944) liveRatio jumps to max when Memtable is empty

2014-04-22 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-6944.
--

Resolution: Duplicate

 liveRatio jumps to max when Memtable is empty
 -

 Key: CASSANDRA-6944
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6944
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: SUSE Linux Enterprise 11 (64-bit)
Reporter: Erik Hansen
Assignee: Aleksey Yeschenko
Priority: Minor
  Labels: memtables

 liveRatio calculation on an empty memtable results in a value of Infinity 
 since memtable.currentSize=0.  Infinity then gets capped at the liveRatio max 
 of 64.
 {noformat}
 WARN [MemoryMeter:1] 2014-03-19 09:26:59,483 Memtable.java (line 441) setting 
 live ratio to maximum of 64.0 instead of Infinity
 INFO [MemoryMeter:1] 2014-03-19 09:26:59,485 Memtable.java (line 452) 
 CFS(Keyspace='system', ColumnFamily='compactions_in_progress') liveRatio is 
 64.0 (just-counted was 64.0).  calculation took 7ms for 0 cells
 {noformat}
 Jumping liveRatio to the max value based on an empty Memtable leads to more 
 frequent flushing than may be necessary.
 CASSANDRA-4243 previously addressed this issue, but was resolved as fixed by 
 CASSANDRA-3741.  It does not appear this issue has been fixed as of 2.0.5



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6944) liveRatio jumps to max when Memtable is empty

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977512#comment-13977512
 ] 

Aleksey Yeschenko commented on CASSANDRA-6944:
--

I say it's not a problem after CASSANDRA-6945 (in fact, this scenario was one 
of the reasons for 6945 in the first place). The new memtable would inherit 
both the live ratio and calculatedAt. So as soon as some data starts flowing 
into the memtable, the ratio will be recalculated. 64 will never be stuck for long.

 liveRatio jumps to max when Memtable is empty
 -

 Key: CASSANDRA-6944
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6944
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: SUSE Linux Enterprise 11 (64-bit)
Reporter: Erik Hansen
Assignee: Aleksey Yeschenko
Priority: Minor
  Labels: memtables

 liveRatio calculation on an empty memtable results in a value of Infinity 
 since memtable.currentSize=0.  Infinity then gets capped at the liveRatio max 
 of 64.
 {noformat}
 WARN [MemoryMeter:1] 2014-03-19 09:26:59,483 Memtable.java (line 441) setting 
 live ratio to maximum of 64.0 instead of Infinity
 INFO [MemoryMeter:1] 2014-03-19 09:26:59,485 Memtable.java (line 452) 
 CFS(Keyspace='system', ColumnFamily='compactions_in_progress') liveRatio is 
 64.0 (just-counted was 64.0).  calculation took 7ms for 0 cells
 {noformat}
 Jumping liveRatio to the max value based on an empty Memtable leads to more 
 frequent flushing than may be necessary.
 CASSANDRA-4243 previously addressed this issue, but was resolved as fixed by 
 CASSANDRA-3741.  It does not appear this issue has been fixed as of 2.0.5



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977530#comment-13977530
 ] 

Michael Shuler commented on CASSANDRA-7011:
---

1.2-2.0 bisect:
{noformat}
((20c2adc...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect good
41ffca1281dcdc69b1b843b47a5bb6dc3c462aac is the first bad commit
commit 41ffca1281dcdc69b1b843b47a5bb6dc3c462aac
Author: Yuki Morishita yu...@apache.org
Date:   Tue Jan 28 09:17:30 2014 -0600

Fix NPE when streaming connection is not yet ready

patch by yukim; reviewed by Russell Spitzer for CASSANDRA-6210

:100644 100644 68727dcd5bee6265d31f79223b47aba6c5c5b166 
46b14fca2f621f97c0ea553a1d736ecbe794255b M  CHANGES.txt
:04 04 a297635e8cc415fa34d03270625dc1d53b838ab2 
4f1d6a09cd7d0bd8a8ad4374e15350d75b9c8458 M  src

((20c2adc...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect log 
git bisect start
# good: [2890cc5be986740cadf491bb5efbb49af2b11c57] Ensure that batchlog and 
hint timeouts do not produce hints
git bisect good 2890cc5be986740cadf491bb5efbb49af2b11c57
# bad: [b9324e1b94f67f3d89096fcef4d157f9505364e9] Post-CASSANDRA-7058 fix
git bisect bad b9324e1b94f67f3d89096fcef4d157f9505364e9
# good: [76664d2c1cbd94fb902c4867f28f7e07e2c284a6] ninja-cleanup 
ColumnDefinition, update to match TriggerDefinition
git bisect good 76664d2c1cbd94fb902c4867f28f7e07e2c284a6
# good: [1240c9bd228da81c4052eade48e40bc34ec1d34d] Fix potential NPE while 
loading paxos state
git bisect good 1240c9bd228da81c4052eade48e40bc34ec1d34d
# bad: [63059a81944094e4946fdc337ff8951dd2c6ca3e] Merge branch 'cassandra-1.2' 
into cassandra-2.0
git bisect bad 63059a81944094e4946fdc337ff8951dd2c6ca3e
# good: [4fae76c146f2856063927c9a8d1c34bd86b30a58] Fix CQLSSTableWriterTest
git bisect good 4fae76c146f2856063927c9a8d1c34bd86b30a58
# good: [2111a20b4b44e557357f81146ead6cf7493a8d31] Fix streaming older SSTable 
yields row tombstones
git bisect good 2111a20b4b44e557357f81146ead6cf7493a8d31
# good: [200e494e7fd305cacb638e13a98b18356d124def] Remove time penalty from 
DES. Patch by Tyler Hobbs, reviewed by brandonwilliams for CASSANDRA-6465
git bisect good 200e494e7fd305cacb638e13a98b18356d124def
# good: [24af3525f2f036ba116941cee94a56f1d0e46e07] Merge branch 'cassandra-1.2' 
into cassandra-2.0
git bisect good 24af3525f2f036ba116941cee94a56f1d0e46e07
# good: [a16986374450c0e8c1bd1de8933042998a079f13] Don't check for expireTime 
is node isn't in REMOVED Patch by thobbs, reviewed by brandonwilliams for 
CASSANDRA-6564
git bisect good a16986374450c0e8c1bd1de8933042998a079f13
# bad: [41ffca1281dcdc69b1b843b47a5bb6dc3c462aac] Fix NPE when streaming 
connection is not yet ready
git bisect bad 41ffca1281dcdc69b1b843b47a5bb6dc3c462aac
# good: [8bbb6eda66412bdf347e302d5677538b82d26948] Merge branch 'cassandra-1.2' 
into cassandra-2.0
git bisect good 8bbb6eda66412bdf347e302d5677538b82d26948
# good: [20c2adc87102963836a59a5e9626005fd9ee08bc] Reduce garbage generated by 
bloom filter lookups patch by Benedict Elliott Smith; reviewed by jbellis for 
CASSANDRA-6609
git bisect good 20c2adc87102963836a59a5e9626005fd9ee08bc
# first bad commit: [41ffca1281dcdc69b1b843b47a5bb6dc3c462aac] Fix NPE when 
streaming connection is not yet ready
{noformat}

I'll run through again to double check

 auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0
 -

 Key: CASSANDRA-7011
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7011
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Michael Shuler
Assignee: Aleksey Yeschenko

 This test hangs forever. When I hit ctl-c after running the test, then the 
 ccm nodes actually continue running - I think ccm is looking for log lines 
 that never occur until the test is killed(?).
 {noformat}
 $ export MAX_HEAP_SIZE=1G; export HEAP_NEWSIZE=256M; ENABLE_VNODES=true 
 PRINT_DEBUG=true nosetests --nocapture --nologcapture --verbosity=3 
 auth_test.py:TestAuth.system_auth_ks_is_alterable_test
 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
 system_auth_ks_is_alterable_test (auth_test.TestAuth) ... cluster ccm 
 directory: /tmp/dtest-O3AAJr
 ^C
 {noformat}
 Search for (hanging here) below - I typed this prior to hitting ctl-c. Then 
 the nodes start running again and I see "Listening for thrift clients" later 
 on.
 {noformat}
 mshuler@hana:~$ tail -f /tmp/dtest-O3AAJr/test/node*/logs/system.log
 == /tmp/dtest-O3AAJr/test/node1/logs/system.log ==
 INFO  [MemtableFlushWriter:2] 2014-04-08 16:45:12,599 Memtable.java:344 - 
 Writing Memtable-schema_columnfamilies@1792243696(1627 serialized bytes, 27 
 ops, 0%/0% of on/off-heap limit)
 INFO  [CompactionExecutor:2] 2014-04-08 16:45:12,603 CompactionTask.java:287 
 - Compacted 4 sstables to 
 

[jira] [Created] (CASSANDRA-7071) Buffer cache metrics in OpsCenter

2014-04-22 Thread Jay Patel (JIRA)
Jay Patel created CASSANDRA-7071:


 Summary: Buffer cache metrics in OpsCenter
 Key: CASSANDRA-7071
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7071
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jay Patel
 Fix For: 3.0


It's currently very difficult to understand how the buffer cache is being used 
by Cassandra. Unlike the key and row cache, for which there are hit rate 
metrics pulled by the datastax agent and visible in opscenter, there are no 
such metrics around the buffer cache. This would be immensely useful in a 
production environment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7071) Buffer cache metrics in OpsCenter

2014-04-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977549#comment-13977549
 ] 

Jonathan Ellis commented on CASSANDRA-7071:
---

We evaluated attempting this with mincore but it didn't work very well.  Did 
you have a better suggestion?

 Buffer cache metrics in OpsCenter
 -

 Key: CASSANDRA-7071
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7071
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jay Patel
 Fix For: 3.0


 It's currently very difficult to understand how the buffer cache is being 
 used by Cassandra. Unlike the key and row cache, for which there are hit rate 
 metrics pulled by the datastax agent and visible in opscenter, there are no 
 such metrics around the buffer cache. This would be immensely useful in a 
 production environment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977700#comment-13977700
 ] 

T Jake Luciani commented on CASSANDRA-7069:
---

It will not throw an error, but it defeats the purpose of the ticket.  I need 
to think about it more deeply, but if two nodes are bootstrapping and fall 
within the bounds of the original token range then you would, in the end, not 
have a consistent bootstrap.

 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.
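Sketching the proposed check (gossip generations are second-granularity timestamps taken at startup; the class, method and constant names below are hypothetical, not an actual patch):
{noformat}
// Hypothetical sketch: if another joining node's gossip generation is too close
// to ours, sleep until the gap is large enough (or refuse to bootstrap at all).
final class BootstrapGuard
{
    static final int MIN_GENERATION_GAP_SECONDS = 120; // the "2 minute rule"

    static void waitForSafeGap(int ourGeneration, Iterable<Integer> joiningGenerations)
            throws InterruptedException
    {
        for (int other : joiningGenerations)
        {
            int gap = Math.abs(ourGeneration - other);
            if (gap < MIN_GENERATION_GAP_SECONDS)
                Thread.sleep((MIN_GENERATION_GAP_SECONDS - gap) * 1000L);
        }
    }
}
{noformat}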



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977705#comment-13977705
 ] 

Brandon Williams commented on CASSANDRA-7069:
-

bq. It will not throw an error but it defeats the purpose of the ticket

We know from experience that telling people "don't do that" isn't good 
enough... what I'm proposing here is to either not allow it, or sleep long 
enough that it avoids any issues.

 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size

2014-04-22 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977707#comment-13977707
 ] 

Ryan McGuire commented on CASSANDRA-7031:
-

[~benedict] - here are some graphs:

[write n=2500 -key populate=1..1 -rate threads=50 -mode 
thrift|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.7031.populate_1_1.json]

[write n=2500 -key populate=1..100 -rate threads=50 -mode 
thrift|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.7031.populate_1_100.json]

 Increase default commit log total space + segment size
 --

 Key: CASSANDRA-7031
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Trivial
 Fix For: 2.1 beta2

 Attachments: 7031.txt


 I would like to increase the default commit log total space and segment size 
 options for 64-bit JVMs:
 The current defaults of 1Gb and 32Mb are quite constrained and can have some 
 (very minor) negative performance implications, for no major benefit: 
 # 32Mb files are actually quite small, and if during the 10s interval we have 
 completely filled multiple of them (quite easy) it would be more efficient to 
 write fewer larger files, as we can issue fewer fsyncs and permit the OS to 
 schedule the writes more efficiently. On my box this has a small but 
 noticeable impact. Although I would expect on decent server hardware this 
 would be smaller still, since we immediately drop the pages from cache on 
 writing there isn't a great deal of advantage to keeping the files so small. 
 The only advantage I can see is that during a drop KS/CF or other event that 
 forces log rollover we're wasting less space until log recycling. 128-256Mb 
 are modest increases that seem more appropriate to me.
 # 1Gb is too small for the default total log space. We can find that we force 
 memtable flushes as a result of log utilisation instead of memtable occupancy 
 quite often (esp. as a result of increased effective memtable space from 
 recent improvements), especially on machines with more addressable memory. I 
 suggest 8Gb as a minimum. The only disadvantage of having more log data is 
 that replay on restart may be slightly slower, but since most of the events 
 will be ignored it should be relatively benign, and I would rather take the 
 penalty on startup instead of during running, no matter how small the running 
 penalty.
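
For reference, the proposal corresponds to something like the following cassandra.yaml 
settings (the option names are the existing ones; the values are the increases 
suggested above, not necessarily what will ship):
{noformat}
# proposed defaults for 64-bit JVMs, per this ticket
commitlog_segment_size_in_mb: 128    # previously 32
commitlog_total_space_in_mb: 8192    # previously 1024
{noformat}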



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap

2014-04-22 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977715#comment-13977715
 ] 

T Jake Luciani commented on CASSANDRA-7069:
---

Right, I agree.  What I'm saying is we may need to error out on any simultaneous 
bootstraps; they would need to happen fully one at a time. 

Honestly I don't understand how the shadow round works well enough to know if 
two bootstraps placed N minutes apart will end up with consistency issues a la 
CASSANDRA-2434, but I suspect it would be an issue.

 Prevent operator mistakes due to simultaneous bootstrap
 ---

 Key: CASSANDRA-7069
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 3.0


 Cassandra has always had the '2 minute rule' between beginning topology 
 changes to ensure the range announcement is known to all nodes before the 
 next one begins.  Trying to bootstrap a bunch of nodes simultaneously is a 
 common mistake and seems to be on the rise as of late.
 We can prevent users from shooting themselves in the foot this way by looking 
 for other joining nodes in the shadow round, then comparing their generation 
 against our own and if there isn't a large enough difference, bail out or 
 sleep until it is large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-04-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1394#comment-1394
 ] 

Aleksey Yeschenko commented on CASSANDRA-6694:
--

h3. Benedict’s original branch

ABTC.ColumnUpdater#apply() calls update.reconcile(existing) and skips 
localCopy() if reconciled == existing. This means that we should optimise all 
reconcile() implementations to prioritise the argument cell in case of ties for 
optimal savings (and ties happen often enough, from retries and whatnot + 
potentially counter updates if we decide to do that thing when batch commit log 
is enabled). Currently we do the opposite. Would be easiest to simply swap the 
call to existing.reconcile(update).
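
To illustrate the point, here is a standalone toy (not Cassandra's actual Cell or 
AtomicBTreeColumns code): when reconcile() prefers the cell the memtable already 
holds on a timestamp tie, the updater can return it unchanged and skip the off-heap 
copy of the incoming cell entirely.
{noformat}
// Toy illustration only -- class and method names are stand-ins.
public class ReconcileTieBreak
{
    static class Cell
    {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) { this.timestamp = timestamp; this.value = value; }

        /** Return the winning cell; on a tie, prefer 'this' (the cell we already hold). */
        Cell reconcile(Cell other) { return other.timestamp > this.timestamp ? other : this; }
    }

    /** Mimics ColumnUpdater#apply(): only copy when the incoming update actually wins. */
    static Cell apply(Cell existing, Cell update)
    {
        Cell reconciled = existing.reconcile(update); // 'existing' wins ties here...
        if (reconciled == existing)
            return existing;                          // ...so retries/duplicates need no copy
        return copyIntoMemtable(reconciled);
    }

    /** Stand-in for localCopy(): in Cassandra this is where the off-heap allocation happens. */
    static Cell copyIntoMemtable(Cell c) { return new Cell(c.timestamp, c.value); }

    public static void main(String[] args)
    {
        Cell existing = new Cell(10, "a");
        Cell retry = new Cell(10, "a"); // same timestamp, e.g. a client retry
        System.out.println(apply(existing, retry) == existing); // true: copy skipped
    }
}
{noformat}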

getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush() and 
CFS.FLCF#run() should just use SecondaryIndexManager#getIndexesNotBackedByCfs() 
and get their allocators directly instead of using SIM#getIndexes() and 
checking for null.

Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData 
argument, while only NativeCell really uses it. Can we isolate its usage to 
NativeCell-specific methods and leave the rest alone?

At least NativeCell#cql3ColumnName() can throw an NPE when calling 
metadata.getColumnDefinition(buffer).name. Just because it’s SIMPLE_SPARSE 
doesn’t mean all the column names are predefined - it’s legal to insert 
non-predefined cells with the default_validator via Thrift/CQL2.

NativeCell#copy(), COMPOUND_SPARSE branch - there is no way a compound sparse 
comparator and cfType = Super can coexist. Supers are all compound dense.

Generally, NativeCell methods seem to assume a bit too much about the sizes and 
about what can and what can’t be present/absent. You can’t even guarantee the 
presence of a ColumnIdentifier for COMPOUND_SPARSE, and yet NativeCell#copy() 
would throw an AssertionError if that’s the case. And CFMetaData is mutable, 
too - it is possible to remove a column via ALTER TABLE at any time.

I’m not comfortable +1-ing it until Sylvain has a look at at least these bits 
(just the NativeCell methods).

The Allocator hierarchy is confusing - I won’t claim to have understood it 
entirely - and so are the names there. The ‘Data’ prefix in DataAllocator is 
meaningless in this context. Maybe MemtableAllocator would be more meaningful? 
I don’t have suggestions for the rest of the names or for making that hierarchy 
more straightforward, but I can live with it as it is.

I very much dislike the Impl thing though. This is an uncomfortable step back 
in Cell* hierarchy readability. Basic things like IDEA’s Find Usages on 
Cell.Impl#localCopy() not showing the Counter/Expiring/Deleted counterparts’ 
usages are annoying. This is my largest, and really the only, fundamental issue 
with the branch. Other than that, and the excessive assumptions in certain 
NativeCell methods, I’m okay with the branch.

Overall it looks reasonable, and is actually less invasive than I was afraid it 
would be.

Nits: AbstractMemory formatting is all messed up.

h3. Pavel’s refactoring branch

Doesn’t build (although trivial-ish to make it build) and is incomplete (as 
expected), and that does complicate judging the ugliness of the result.

Same issues and potential issues in AbstractNativeCells methods as in 
NativeCell methods in the other branch.

Can’t form an opinion on Pavel’s Allocator/Pool approach, because it’s not here 
yet, and I’m not sure I got it right from just reading the comments.

This *Cell hierarchy, though, I feel a lot more comfortable with.

I feel strongly that we should borrow the Impl-less Cell hierarchy from this 
branch, if nothing else (and there isn’t much else yet) - this is my biggest 
issue with the original.

As for the rest of it - time is running low, and we have to ship 2.1 
eventually. Any chance you could flesh it out in the next few days, maybe by 
Monday, Pavel? If not, I’m not sure we should block beta2 further :\

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to 8-byte object overhead, 

[jira] [Updated] (CASSANDRA-6932) StorageServiceMbean needs to expose flush directory.

2014-04-22 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-6932:


Attachment: 6932.txt

against 2.1

 StorageServiceMbean needs to expose flush directory.
 

 Key: CASSANDRA-6932
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6932
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
 Fix For: 2.1 beta2

 Attachments: 6932.txt


 Storage service currently exposes data dirs, commitlog dir, and saved caches 
 dir. Should add the flush dir now that we have that as well.
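
 For illustration, the change amounts to something like the following MBean 
 addition (a hedged sketch: the existing getters below are current 
 StorageServiceMBean methods, while getFlushLocation() is a hypothetical name, 
 not necessarily what the attached patch uses):
{noformat}
public interface StorageServiceMBean
{
    // already exposed today
    String[] getAllDataFileLocations();
    String getCommitLogLocation();
    String getSavedCachesLocation();

    // proposed: also expose the dedicated flush directory added in 2.1
    // (method name is illustrative)
    String getFlushLocation();
}
{noformat}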



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7011) auth_test system_auth_ks_is_alterable_test dtest hangs in 2.1 and 2.0

2014-04-22 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977781#comment-13977781
 ] 

Michael Shuler commented on CASSANDRA-7011:
---

The 1.2-2.1 bisect didn't go so well... guess I'll try again, or we can look at 
the above
{noformat}
((82b086f...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect skip 
There are only 'skip'ped commits left to test.
The first bad commit could be any of:
379212d5deb2f0a3c399fc447808b2de2559341d
bef7146c73ff986f01da8b8674b2f860d0e5e201
db07b20edbcd2a23b0669e64e466cd13ce47e2f3
eca02fd2551d36ccbcf7a2f5e7aeafed12b0a869
892d8e699cf5ca3807da288bd08c73319c35c3b8
c2294aa21eb6310b6d5c05d6d9ff505f59b376c2
b5e2a01af7ca1911d6779651fb647f9dc455f4e3
d63d07b9270d73a289086c69002b5a0023b2d233
0a1b277d659e20cd19eaedaa7220b0fc55950dc4
ebc712aeb9f8730c5b5a73eba6261f566d79956e
5e304eb78a3e9227260998c335ee0e01ebab07d7
82b086f478d1979e2776fdb724c228a19a9de05b
8ebeee104bc985ab6dd7515851747cbd93e898b2
We cannot bisect more!
((82b086f...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect log
# bad: [2c7622a65ce747819931bd52bc576a4cd055ba3d] Merge branch 'cassandra-2.0' 
into cassandra-2.1
# good: [2890cc5be986740cadf491bb5efbb49af2b11c57] Ensure that batchlog and 
hint timeouts do not produce hints
git bisect start 'cassandra-2.1' 'cassandra-1.2'
# good: [2c7b61b76ec034afe4267fdaecd0905db16b40eb] Merge branch 'cassandra-2.0' 
into trunk
git bisect good 2c7b61b76ec034afe4267fdaecd0905db16b40eb
# skip: [7263584c44f4af9044b875fa549abc08e6e6bc21] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip 7263584c44f4af9044b875fa549abc08e6e6bc21
# skip: [458bcf238d3a5ad8f3b756e8806ca63bf0057aeb] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip 458bcf238d3a5ad8f3b756e8806ca63bf0057aeb
# good: [fe58dffef5d4f44255ff47623b7d4d50a2f4e56d] merge from 2.0
git bisect good fe58dffef5d4f44255ff47623b7d4d50a2f4e56d
# bad: [8c6541715067a4ae9e3bb583c49d4b7ac0bb2fff] Undo CASSANDRA-6707 from trunk
git bisect bad 8c6541715067a4ae9e3bb583c49d4b7ac0bb2fff
# bad: [680f2bda4d0d51e023bcad2d160883ab408cca8f] Merge branch 'cassandra-2.0' 
into trunk
git bisect bad 680f2bda4d0d51e023bcad2d160883ab408cca8f
# skip: [0a1b277d659e20cd19eaedaa7220b0fc55950dc4] CF id is now 
non-deterministic
git bisect skip 0a1b277d659e20cd19eaedaa7220b0fc55950dc4
# good: [49efc13cd530735ad802769e7f5322f3c79085ef] Merge branch 'cassandra-2.0' 
into trunk
git bisect good 49efc13cd530735ad802769e7f5322f3c79085ef
# skip: [0bfe9efd859eccd6bb6c6a253ad3912650831ec0] Don't scrub 2i CF if index 
type is CUSTOM
git bisect skip 0bfe9efd859eccd6bb6c6a253ad3912650831ec0
# good: [b40b98d361bab28dcf5bd3902aa306ee3e852d30] Merge branch 'cassandra-1.2' 
into cassandra-2.0
git bisect good b40b98d361bab28dcf5bd3902aa306ee3e852d30
# good: [1dc43bdad2beb52793253c4aacb737c8de74cd4a] Merge branch 'cassandra-2.0' 
into trunk
git bisect good 1dc43bdad2beb52793253c4aacb737c8de74cd4a
# bad: [602dd62e5a93ae1ada5d63f8db5cc3e77fe0e5ee] fix build
git bisect bad 602dd62e5a93ae1ada5d63f8db5cc3e77fe0e5ee
# skip: [fd888bcbcce2fda02c636e9b6d45cae9097180ca] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip fd888bcbcce2fda02c636e9b6d45cae9097180ca
# good: [e2209d5983a623659bac4693e8afb815771cd54c] Merge branch 'cassandra-1.2' 
into cassandra-2.0
git bisect good e2209d5983a623659bac4693e8afb815771cd54c
# skip: [e3d1e38a44a84f4689c4f352144a2132d0da6fa2] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip e3d1e38a44a84f4689c4f352144a2132d0da6fa2
# skip: [5e304eb78a3e9227260998c335ee0e01ebab07d7] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip 5e304eb78a3e9227260998c335ee0e01ebab07d7
# bad: [f6f50ddffe0821617fe29482f9ec918608560381] Fix previous merge
git bisect bad f6f50ddffe0821617fe29482f9ec918608560381
# bad: [8ebeee104bc985ab6dd7515851747cbd93e898b2] Merge branch 'cassandra-2.0' 
into trunk
git bisect bad 8ebeee104bc985ab6dd7515851747cbd93e898b2
# skip: [379212d5deb2f0a3c399fc447808b2de2559341d] Fix logback config in 
scripts and packaging. Patch by Michael Shuler, reviewed by brandonwilliams for 
CASSANDRA-6530
git bisect skip 379212d5deb2f0a3c399fc447808b2de2559341d
# skip: [bef7146c73ff986f01da8b8674b2f860d0e5e201] Merge branch 'cassandra-2.0' 
into trunk
git bisect skip bef7146c73ff986f01da8b8674b2f860d0e5e201
# skip: [c2294aa21eb6310b6d5c05d6d9ff505f59b376c2] Secondary indexing of map 
keys
git bisect skip c2294aa21eb6310b6d5c05d6d9ff505f59b376c2
# skip: [db07b20edbcd2a23b0669e64e466cd13ce47e2f3] merge from 2.0
git bisect skip db07b20edbcd2a23b0669e64e466cd13ce47e2f3
# skip: [d63d07b9270d73a289086c69002b5a0023b2d233] Make user types keyspace 
scoped
git bisect skip d63d07b9270d73a289086c69002b5a0023b2d233
# skip: [eca02fd2551d36ccbcf7a2f5e7aeafed12b0a869] Ninja fix simple bugs from 
#6438
git bisect skip eca02fd2551d36ccbcf7a2f5e7aeafed12b0a869
# skip: [b5e2a01af7ca1911d6779651fb647f9dc455f4e3] Merge branch 'cassandra-2.0' 

[jira] [Updated] (CASSANDRA-6932) StorageServiceMbean needs to expose flush directory.

2014-04-22 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-6932:


Priority: Minor  (was: Major)

 StorageServiceMbean needs to expose flush directory.
 

 Key: CASSANDRA-6932
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6932
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
Assignee: Dave Brosius
Priority: Minor
 Fix For: 2.1 beta2

 Attachments: 6932.txt


 Storage service currently exposes data dirs, commitlog dir, and saved caches 
 dir. Should add the flush dir now that we have that as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator

2014-04-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977788#comment-13977788
 ] 

Vijay commented on CASSANDRA-7030:
--

Hi Benedict, sorry I missed the update earlier. Not sure why we are comparing 
synchronization, hence I removed synchronization and here are the results 
on RHEL (32 core box): http://pastebin.com/ZXSytn70. JEMalloc, even with the JNI 
overhead, is faster and more efficient.

 Remove JEMallocAllocator
 

 Key: CASSANDRA-7030
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7030
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7030.txt, benchmark.21.diff.txt


 JEMalloc, whilst having some nice performance properties by comparison to 
 Doug Lea's standard malloc algorithm in principle, is pointless in practice 
 because of the JNA cost. In general it is around 30x more expensive to call 
 than unsafe.allocate(); malloc does not have a variability of response time 
 as extreme as the JNA overhead, so using JEMalloc in Cassandra is never a 
 sensible idea. I doubt if custom JNI would make it worthwhile either.
 I propose removing it.
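
 For context, the kind of comparison described can be reproduced with a rough 
 micro-benchmark along these lines (a sketch only, not the attached 
 benchmark.21.diff.txt; it pits sun.misc.Unsafe against malloc called through 
 JNA, and the iteration count and allocation size are arbitrary):
{noformat}
import com.sun.jna.Native;
import sun.misc.Unsafe;
import java.lang.reflect.Field;

public class AllocationBench
{
    public static void main(String[] args) throws Exception
    {
        // Obtain the Unsafe instance reflectively (the usual trick for benchmarks).
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        final int iterations = 1000000;
        final int size = 64;

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
        {
            long address = unsafe.allocateMemory(size);
            unsafe.freeMemory(address);
        }
        long unsafeNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
        {
            long address = Native.malloc(size); // allocation through a JNA call
            Native.free(address);
        }
        long jnaNanos = System.nanoTime() - start;

        System.out.printf("unsafe: %d ns/op, JNA malloc: %d ns/op%n",
                          unsafeNanos / iterations, jnaNanos / iterations);
    }
}
{noformat}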



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996

2014-04-22 Thread vijay
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 b9324e1b9 -> 4e4d7bbcb


Setting severity via JMX broken
patch by Vijay; reviewed by rbranson for CASSANDRA-6996


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc

Branch: refs/heads/cassandra-2.0
Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3
Parents: b9324e1
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:30:38 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:30:38 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 00c3618..c76a196 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -300,7 +300,7 @@ public class DynamicEndpointSnitch extends 
AbstractEndpointSnitch implements ILa
 
 public void setSeverity(double severity)
 {
-StorageService.instance.reportSeverity(severity);
+StorageService.instance.reportManualSeverity(severity);
 }
 
 public double getSeverity()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 7382cbd..75f6427 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1054,6 +1054,11 @@ public class StorageService extends 
NotificationBroadcasterSupport implements IE
 bgMonitor.incrCompactionSeverity(incr);
 }
 
+public void reportManualSeverity(double incr)
+{
+bgMonitor.incrManualSeverity(incr);
+}
+
 public double getSeverity(InetAddress endpoint)
 {
 return bgMonitor.getSeverity(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
--
diff --git a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java 
b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
index bad9a17..93906eb 100644
--- a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
+++ b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
@@ -56,6 +56,7 @@ public class BackgroundActivityMonitor
 private static final String PROC_STAT_PATH = "/proc/stat";
 
 private final AtomicDouble compaction_severity = new AtomicDouble();
+private final AtomicDouble manual_severity = new AtomicDouble();
 private final ScheduledExecutorService reportThread = new 
DebuggableScheduledThreadPoolExecutor("Background_Reporter");
 
 private RandomAccessFile statsFile;
@@ -112,6 +113,11 @@ public class BackgroundActivityMonitor
 compaction_severity.addAndGet(sev);
 }
 
+public void incrManualSeverity(double sev)
+{
+manual_severity.addAndGet(sev);
+}
+
 public double getIOWait() throws IOException
 {
 if (statsFile == null)
@@ -157,6 +163,7 @@ public class BackgroundActivityMonitor
 
 if (!Gossiper.instance.isEnabled())
 return;
+report += manual_severity.get(); // add manual severity setting.
 VersionedValue updated = 
StorageService.instance.valueFactory.severity(report);
 
Gossiper.instance.addLocalApplicationState(ApplicationState.SEVERITY, updated);
 }



[1/2] git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996

2014-04-22 Thread vijay
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 2c7622a65 -> ad57cb010


Setting severity via JMX broken
patch by Vijay; reviewed by rbranson for CASSANDRA-6996


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc

Branch: refs/heads/cassandra-2.1
Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3
Parents: b9324e1
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:30:38 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:30:38 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 00c3618..c76a196 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -300,7 +300,7 @@ public class DynamicEndpointSnitch extends 
AbstractEndpointSnitch implements ILa
 
 public void setSeverity(double severity)
 {
-StorageService.instance.reportSeverity(severity);
+StorageService.instance.reportManualSeverity(severity);
 }
 
 public double getSeverity()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 7382cbd..75f6427 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1054,6 +1054,11 @@ public class StorageService extends 
NotificationBroadcasterSupport implements IE
 bgMonitor.incrCompactionSeverity(incr);
 }
 
+public void reportManualSeverity(double incr)
+{
+bgMonitor.incrManualSeverity(incr);
+}
+
 public double getSeverity(InetAddress endpoint)
 {
 return bgMonitor.getSeverity(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
--
diff --git a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java 
b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
index bad9a17..93906eb 100644
--- a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
+++ b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
@@ -56,6 +56,7 @@ public class BackgroundActivityMonitor
 private static final String PROC_STAT_PATH = "/proc/stat";
 
 private final AtomicDouble compaction_severity = new AtomicDouble();
+private final AtomicDouble manual_severity = new AtomicDouble();
 private final ScheduledExecutorService reportThread = new 
DebuggableScheduledThreadPoolExecutor("Background_Reporter");
 
 private RandomAccessFile statsFile;
@@ -112,6 +113,11 @@ public class BackgroundActivityMonitor
 compaction_severity.addAndGet(sev);
 }
 
+public void incrManualSeverity(double sev)
+{
+manual_severity.addAndGet(sev);
+}
+
 public double getIOWait() throws IOException
 {
 if (statsFile == null)
@@ -157,6 +163,7 @@ public class BackgroundActivityMonitor
 
 if (!Gossiper.instance.isEnabled())
 return;
+report += manual_severity.get(); // add manual severity setting.
 VersionedValue updated = 
StorageService.instance.valueFactory.severity(report);
 
Gossiper.instance.addLocalApplicationState(ApplicationState.SEVERITY, updated);
 }



[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-04-22 Thread vijay
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad57cb01
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad57cb01
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad57cb01

Branch: refs/heads/cassandra-2.1
Commit: ad57cb010231a89d8795a0944bd99eb6e72079cc
Parents: 2c7622a 4e4d7bb
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:34:22 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:34:22 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad57cb01/src/java/org/apache/cassandra/service/StorageService.java
--



[1/3] git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996

2014-04-22 Thread vijay
Repository: cassandra
Updated Branches:
  refs/heads/trunk 99fbafee3 -> 902925716


Setting severity via JMX broken
patch by Vijay; reviewed by rbranson for CASSANDRA-6996


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc

Branch: refs/heads/trunk
Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3
Parents: b9324e1
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:30:38 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:30:38 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 00c3618..c76a196 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -300,7 +300,7 @@ public class DynamicEndpointSnitch extends 
AbstractEndpointSnitch implements ILa
 
 public void setSeverity(double severity)
 {
-StorageService.instance.reportSeverity(severity);
+StorageService.instance.reportManualSeverity(severity);
 }
 
 public double getSeverity()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 7382cbd..75f6427 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1054,6 +1054,11 @@ public class StorageService extends 
NotificationBroadcasterSupport implements IE
 bgMonitor.incrCompactionSeverity(incr);
 }
 
+public void reportManualSeverity(double incr)
+{
+bgMonitor.incrManualSeverity(incr);
+}
+
 public double getSeverity(InetAddress endpoint)
 {
 return bgMonitor.getSeverity(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
--
diff --git a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java 
b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
index bad9a17..93906eb 100644
--- a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
+++ b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
@@ -56,6 +56,7 @@ public class BackgroundActivityMonitor
 private static final String PROC_STAT_PATH = "/proc/stat";
 
 private final AtomicDouble compaction_severity = new AtomicDouble();
+private final AtomicDouble manual_severity = new AtomicDouble();
 private final ScheduledExecutorService reportThread = new 
DebuggableScheduledThreadPoolExecutor("Background_Reporter");
 
 private RandomAccessFile statsFile;
@@ -112,6 +113,11 @@ public class BackgroundActivityMonitor
 compaction_severity.addAndGet(sev);
 }
 
+public void incrManualSeverity(double sev)
+{
+manual_severity.addAndGet(sev);
+}
+
 public double getIOWait() throws IOException
 {
 if (statsFile == null)
@@ -157,6 +163,7 @@ public class BackgroundActivityMonitor
 
 if (!Gossiper.instance.isEnabled())
 return;
+report += manual_severity.get(); // add manual severity setting.
 VersionedValue updated = 
StorageService.instance.valueFactory.severity(report);
 
Gossiper.instance.addLocalApplicationState(ApplicationState.SEVERITY, updated);
 }



[3/3] git commit: Merge branch 'cassandra-2.1' into trunk

2014-04-22 Thread vijay
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/90292571
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/90292571
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/90292571

Branch: refs/heads/trunk
Commit: 90292571663e7ba8cf9b625b0d89fd67ae1bcc3e
Parents: 99fbafe ad57cb0
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:36:20 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:36:20 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/90292571/src/java/org/apache/cassandra/service/StorageService.java
--



[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-04-22 Thread vijay
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad57cb01
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad57cb01
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad57cb01

Branch: refs/heads/trunk
Commit: ad57cb010231a89d8795a0944bd99eb6e72079cc
Parents: 2c7622a 4e4d7bb
Author: Vijay vijay2...@gmail.com
Authored: Tue Apr 22 21:34:22 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Tue Apr 22 21:34:22 2014 -0700

--
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 5 +
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad57cb01/src/java/org/apache/cassandra/service/StorageService.java
--



[jira] [Commented] (CASSANDRA-6499) Shuffle fails if PasswordAuthenticator is enabled

2014-04-22 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977824#comment-13977824
 ] 

Dave Brosius commented on CASSANDRA-6499:
-

Are you suggesting passing credentials through JMX (along with the JMX 
credentials), or subverting the security check once on the server?

 Shuffle fails if PasswordAuthenticator is enabled
 -

 Key: CASSANDRA-6499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6499
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Adam Hattrell
Assignee: Brandon Williams
Priority: Minor

 If you attempt to run shuffle whilst authenticator: 
 org.apache.cassandra.auth.PasswordAuthenticator is set in the cassandra.yaml 
 you get the following error:
 Exception in thread main java.lang.RuntimeException: 
 InvalidRequestException(why:You have not logged in)
 at 
 org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:516)
 at org.apache.cassandra.tools.Shuffle.shuffle(Shuffle.java:359)
 at org.apache.cassandra.tools.Shuffle.main(Shuffle.java:681)
 Caused by: InvalidRequestException(why:You have not logged in)
 at 
 org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:37849)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1562)
 at 
 org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1547)
 at 
 org.apache.cassandra.tools.CassandraClient.execute_cql_query(Shuffle.java:736)
 at 
 org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:502)
 ... 2 more
 I've logged this as Minor as I wouldn't really recommend using shuffle in 
 production.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)

