[jira] [Created] (CASSANDRA-8002) Cassandra SSTableReader hang on startup

2014-09-25 Thread Tim Isted (JIRA)
Tim Isted created CASSANDRA-8002:


 Summary: Cassandra SSTableReader hang on startup
 Key: CASSANDRA-8002
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8002
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: centos6.5 
jre-7u60.x86_64
kernel:2.6.32-431.17.1.el6.x86_64
Reporter: Tim Isted
Priority: Minor
 Fix For: 2.0.10
 Attachments: htop.png

We are experiencing a hang on startup when SSTableReader tries to open certain 
DataIndex-jb files; each time it's the same SSTable.

#/var/log/system.log
 INFO [SSTableBatchOpen:1] 2014-09-25 07:58:13,660 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25125 
(291271971 bytes)
 INFO [SSTableBatchOpen:2] 2014-09-25 07:58:13,660 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25182 
(97969101 bytes)
 INFO [SSTableBatchOpen:3] 2014-09-25 07:58:13,660 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25217 
(8787563 bytes)
 INFO [SSTableBatchOpen:4] 2014-09-25 07:58:13,661 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-23265 
(395830480191 bytes)
 INFO [SSTableBatchOpen:5] 2014-09-25 07:58:13,661 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25219 
(80412 bytes)
 INFO [SSTableBatchOpen:7] 2014-09-25 07:58:13,663 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25130 
(52552441 bytes)
 INFO [SSTableBatchOpen:6] 2014-09-25 07:58:13,663 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25106 
(1160245456 bytes)
 INFO [SSTableBatchOpen:8] 2014-09-25 07:58:13,663 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25218 
(5068542 bytes)
 INFO [SSTableBatchOpen:5] 2014-09-25 07:58:13,737 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25021 
(1169149628 bytes)
 INFO [SSTableBatchOpen:8] 2014-09-25 07:58:13,745 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25163 
(64078835 bytes)
 INFO [SSTableBatchOpen:3] 2014-09-25 07:58:13,746 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25187 
(53540520 bytes)
 INFO [SSTableBatchOpen:7] 2014-09-25 07:58:13,771 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24537 
(18890626389 bytes)
 INFO [SSTableBatchOpen:3] 2014-09-25 07:58:13,779 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24936 
(1183874336 bytes)
 INFO [SSTableBatchOpen:8] 2014-09-25 07:58:13,785 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25158 
(99217260 bytes)
 INFO [SSTableBatchOpen:8] 2014-09-25 07:58:13,813 SSTableReader.java (line 
223) Opening /cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859 
(4761451567 bytes)

#/var/log/cassandra.log
 INFO 07:58:13,663 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25106 (1160245456 
bytes)
 INFO 07:58:13,663 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25218 (5068542 bytes)
 INFO 07:58:13,737 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25021 (1169149628 
bytes)
 INFO 07:58:13,745 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25163 (64078835 
bytes)
 INFO 07:58:13,746 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25187 (53540520 
bytes)
 INFO 07:58:13,771 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24537 (18890626389 
bytes)
 INFO 07:58:13,779 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24936 (1183874336 
bytes)
 INFO 07:58:13,785 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-25158 (99217260 
bytes)
 INFO 07:58:13,813 Opening 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859 (4761451567 
bytes)

-rw-r--r--. 1 cassandra cassandra 1.4M Sep 24 07:23 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 4.5G Sep 24 07:23 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859-Data.db
-rw-r--r--. 1 cassandra cassandra  53M Sep 24 07:23 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859-Filter.db
-rw-r--r--. 1 cassandra cassandra 3.3G Sep 24 07:23 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859-Index.db
-rw-r--r--. 1 cassandra cassandra 5.9K Sep 24 07:23 
/cassandra/data/Resources/DataIndex/Resources-DataIndex-jb-24859-Statistics.db
-rw-r--r--. 1 

[jira] [Created] (CASSANDRA-8003) Allow the CassandraDaemon to be managed externally

2014-09-25 Thread Heiko Braun (JIRA)
Heiko Braun created CASSANDRA-8003:
--

 Summary: Allow the CassandraDaemon to be managed externally
 Key: CASSANDRA-8003
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8003
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Heiko Braun


This is related to CASSANDRA-7998 and deals with the control flow, if the 
CassandraDaemon is managed by another Java process. In that case it should not 
exit the VM, but instead delegate that decision to the process that crated the 
daemon in the first place.
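
A minimal sketch of the kind of hook this implies, assuming an illustrative callback 
interface (names and structure here are assumptions, not the attached patch): the daemon 
reports fatal errors to a registered handler, and only the default, standalone handler 
actually calls System.exit().

{code}
// Illustrative sketch only: not the actual patch.
public class ManagedDaemonSketch
{
    public interface ExitHandler
    {
        void onFatalError(Throwable cause);
    }

    // Default, standalone behaviour: exit the VM as today.
    private static volatile ExitHandler handler = new ExitHandler()
    {
        public void onFatalError(Throwable cause)
        {
            System.exit(100);
        }
    };

    // A managing Java process registers its own handler instead.
    public static void setExitHandler(ExitHandler h)
    {
        handler = h;
    }

    static void fail(Throwable cause)
    {
        // Instead of System.exit() deep inside the daemon, delegate the decision.
        handler.onFatalError(cause);
    }
}
{code}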



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

2014-09-25 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147705#comment-14147705
 ] 

Jason Brown commented on CASSANDRA-7282:


I ran Benedict's 7282 patch vs. the commit prior in his branch, for several use 
cases. TL;DR: the patch was a clear winner for the specific use case he 
called out, and marginally better, if not the same, for a 'typical' use case.

For the first use case, I used Benedict's attached profile.yaml, although I 
changed the RF from 1 to 2. I then used this stress command to execute 
(basically the same as Benedict's above) {code}./bin/cassandra-stress user 
profile=~/jasobrown/b7282.yaml ops\(insert=5,read=5\) n=2000 -pop 
seq=1..10M read-lookback=extreme\(1..1M,2\) -rate threads=200 -mode cql3 native 
prepared -node node_addresses{code}

I've attached the results of running the above command three times successively 
on both patch and unpatched code, and the results can be summarized:
- 15% improvement in throughput
- 10% improvement at 95%/99%-iles
- 50% improvement at 99.9%-ile
- ~40% less garbage created / ~40% less time in GC

Note that over the life of the stress run, memtables were flushing and sstables 
compacting, so not all reads were coming directly from the memtable.

One thing I perhaps could have tried (and would have liked to) was switching 
the CL from ONE to LOCAL_QUORUM, although I think stress would have applied 
that to both reads and writes for the above command, whereas I would have 
wanted to affect only reads or only writes.

I also ran stress against a more 'standard' use case of mine (I'm still 
developing it alongside the new stress), and results were about even between 
patched and unpatched, although there may have been a very slight advantage 
toward the patched version.

So, I think in the case where you are reading your most recent writes, and the 
data in the memtable is small (not wide), there is a performance gain in this 
patch. In a more typical case, it wasn't necessarily proven that the patch 
boosts performance (but then the patch wasn't attempting to help the general 
case, anyway).


 Faster Memtable map
 ---

 Key: CASSANDRA-7282
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 3.0

 Attachments: profile.yaml, reads.svg, run1.svg, writes.svg


 Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
 our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
 majority of users use a hash partitioner, it occurs to me we could maintain a 
 hybrid ordered list / hash map. The list would impose the normal order on the 
 collection, but a hash index would live alongside as part of the same data 
 structure, simply mapping into the list and permitting O(1) lookups and 
 inserts.
 I've chosen to implement this initial version as a linked-list node per item, 
 but we can optimise this in future by storing fatter nodes that permit a 
 cache-line's worth of hashes to be checked at once,  further reducing the 
 constant factor costs for lookups.
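
A toy sketch of that shape, not the actual patch: one list node per item carries the 
sort order, and a hash index over the same nodes serves point lookups. The ordered 
insert below is deliberately naive (O(n)), since the patch's real trick is avoiding 
exactly that walk.

{code}
import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;

// Toy illustration only: a singly linked list keeps iteration order, while a
// hash map indexes the same nodes so point reads never walk the list.
class HashIndexedOrderedMap<K, V>
{
    static final class Node<K, V>
    {
        final K key;
        volatile V value;
        volatile Node<K, V> next;   // next node in sort order
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final Node<K, V> head = new Node<>(null, null);   // sentinel
    private final ConcurrentHashMap<K, Node<K, V>> index = new ConcurrentHashMap<>();
    private final Comparator<K> order;

    HashIndexedOrderedMap(Comparator<K> order) { this.order = order; }

    // O(1): lookups go straight through the hash index.
    V get(K key)
    {
        Node<K, V> node = index.get(key);
        return node == null ? null : node.value;
    }

    // Simplified, single-threaded ordered insert; kept naive on purpose.
    void put(K key, V value)
    {
        Node<K, V> existing = index.get(key);
        if (existing != null) { existing.value = value; return; }
        Node<K, V> node = new Node<>(key, value);
        Node<K, V> prev = head;
        while (prev.next != null && order.compare(prev.next.key, key) < 0)
            prev = prev.next;
        node.next = prev.next;
        prev.next = node;
        index.put(key, node);
    }
}
{code}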



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7282) Faster Memtable map

2014-09-25 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-7282:
---
Attachment: jasobrown-sample-run.txt

 Faster Memtable map
 ---

 Key: CASSANDRA-7282
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 3.0

 Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
 run1.svg, writes.svg


 Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
 our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
 majority of users use a hash partitioner, it occurs to me we could maintain a 
 hybrid ordered list / hash map. The list would impose the normal order on the 
 collection, but a hash index would live alongside as part of the same data 
 structure, simply mapping into the list and permitting O(1) lookups and 
 inserts.
 I've chosen to implement this initial version as a linked-list node per item, 
 but we can optimise this in future by storing fatter nodes that permit a 
 cache-line's worth of hashes to be checked at once,  further reducing the 
 constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8001) CQLSH broken in Mac OS 10.9.5

2014-09-25 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8001:
---
Labels: cqlsh  (was: )

 CQLSH broken in Mac OS 10.9.5
 -

 Key: CASSANDRA-8001
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8001
 Project: Cassandra
  Issue Type: Bug
 Environment: Mac OS 10.9.5, homebrew python 2.7, cql 4.1.1
Reporter: Jonathan Hoskin
  Labels: cqlsh

 Since upgrading to Mac OS 10.9.5, cqlsh has stopped working.
 Here is the error trace I get:
 {quote}
 $ cqlsh -u USER -k KEYSPACE HOST
 Traceback (most recent call last):
   File "/usr/local/bin/cqlsh", line 2044, in <module>
     main(*read_options(sys.argv[1:], os.environ))
   File "/usr/local/bin/cqlsh", line 2030, in main
     display_float_precision=options.float_precision)
   File "/usr/local/bin/cqlsh", line 480, in __init__
     cql_version=cqlver, transport=transport)
   File "/usr/local/lib/python2.7/site-packages/cql/connection.py", line 143, in connect
     consistency_level=consistency_level, transport=transport)
   File "/usr/local/lib/python2.7/site-packages/cql/connection.py", line 59, in __init__
     self.establish_connection()
   File "/usr/local/lib/python2.7/site-packages/cql/thrifteries.py", line 157, in establish_connection
     self.client.login(AuthenticationRequest(credentials=self.credentials))
   File "/usr/local/lib/python2.7/site-packages/cql/cassandra/Cassandra.py", line 454, in login
     self.send_login(auth_request)
   File "/usr/local/lib/python2.7/site-packages/cql/cassandra/Cassandra.py", line 461, in send_login
     args.write(self._oprot)
   File "/usr/local/lib/python2.7/site-packages/cql/cassandra/Cassandra.py", line 2769, in write
     oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
 TypeError: expected string or Unicode object, NoneType found
 {quote}
 At first I suspected that some system dependency had changed, so I did a clean 
 reinstall of python, pip and cql. The error persists.
 I have also tested the cqlsh command on another Mac running OS 10.9.2 and it 
 can connect fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8003) Allow the CassandraDaemon to be managed externally

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-8003:
---
Description: This is related to CASSANDRA-7998 and deals with the control 
flow, if the CassandraDaemon is managed by another Java process. In that case 
it should not exit the VM, but instead delegate that decision to the process 
that created the daemon in the first place.  (was: This is related to 
CASSANDRA-7998 and deals with the control flow, if the CassandraDaemon is 
managed by another Java process. In that case it should not exit the VM, but 
instead delegate that decision to the process that crated the daemon in the 
first place.)

 Allow the CassandraDaemon to be managed externally
 --

 Key: CASSANDRA-8003
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8003
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun

 This is related to CASSANDRA-7998 and deals with the control flow, if the 
 CassandraDaemon is managed by another Java process. In that case it should 
 not exit the VM, but instead delegate that decision to the process that 
 created the daemon in the first place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7282) Faster Memtable map

2014-09-25 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147705#comment-14147705
 ] 

Jason Brown edited comment on CASSANDRA-7282 at 9/25/14 1:05 PM:
-

I ran Benedict's 7282 patch vs. the commit prior in his branch, for several use 
cases. TL;DR: the patch was a clear winner for the specific use case he 
called out, and marginally better, if not the same, for a 'typical' use case.

For the record, I set mct = 0.5 and was using offheap_objects.

For the first use case, I used Benedict's attached profile.yaml, although I 
changed the RF from 1 to 2. I then used this stress command to execute 
(basically the same as Benedict's above) {code}./bin/cassandra-stress user 
profile=~/jasobrown/b7282.yaml ops\(insert=5,read=5\) n=2000 -pop 
seq=1..10M read-lookback=extreme\(1..1M,2\) -rate threads=200 -mode cql3 native 
prepared -node node_addresses{code}

I've attached the results of running the above command three times successively 
on both patch and unpatched code, and the results can be summarized:
- 15% improvement in throughput
- 10% improvement at 95%/99%-iles
- 50% improvement at 99.9%-ile
- ~40% less garbage created / ~40% less time in GC

Note that over the life of the stress run, memtables were flushing and sstables 
compacting, so not all reads were coming directly from the memtable.

One thing I perhaps could have tried (and would have liked to) was switching 
the CL from ONE to LOCAL_QUORUM, although I think stress would have applied 
that to both reads and writes for the above command, whereas I would have 
wanted to affect only reads or only writes.

I also ran stress against a more 'standard' use case of mine (I'm still 
developing it alongside the new stress), and results were about even between 
patched and unpatched, although there may have been a very slight advantage 
toward the patched version.

So, I think in the case where you are reading your most recent writes, and the 
data in the memtable is small (not wide), there is a performance gain in this 
patch. In a more typical case, it wasn't necessarily proven that the patch 
boosts performance (but then the patch wasn't attempting to help the general 
case, anyway).



was (Author: jasobrown):
I ran benedict's 7282 patch vs. the commit prior in his branch, for several use 
cases. TL;DR the patch was a clear winner for the the specific use case he 
called out, and marginally better, if not the same, for a 'typical' use case.

For the first use case, I used Benedict's attached profile.yaml, although i 
changed the RF from 1 to 2. I then used this stress command to execute 
(basically the same as Benedict's above) {code}./bin/cassandra-stress user 
profile=~/jasobrown/b7282.yaml ops\(insert=5,read=5\) n=2000 -pop 
seq=1..10M read-lookback=extreme\(1..1M,2\) -rate threads=200 -mode cql3 native 
prepared -node node_addresses{code}

I've attached the results of running the above command three times successively 
on both patch and unpatched code, and the results can be summarized:
- 15% improvement in throughput
- 10% improvement at 95%/99%-iles
- 50% improvement at 99.9%-ile
~ 40% less garbage created / 40% less time in GC

Note that over the life of stress run, memtables were flushing and sstables 
compacting, so not all reads were coming directly from the memtable.

One thing I perhaps could have tried (and would have liked to) was switching 
the CL from ONE to LOCAL_QUORUM, although I think stress would have applied 
that to both reads and writes for the above command, whereas I would have 
wanted to affect either read or write.

I also ran the stress a more 'standard' use case of mine (I'm still developing 
it alongside the new stress), and results were about even between patch and 
unpatched, although there may have been very slight advantage toward the 
patched version.

So, I think in the case where you are reading your most recent writes, and the 
data in the memtable is small (not wide), there is a performance gain in this 
patch. In a more typical case, it wasn't necessarily proven that the patch 
boosts performance (but then the patch wasn't attempting to help the general 
case, anyway).


 Faster Memtable map
 ---

 Key: CASSANDRA-7282
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 3.0

 Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
 run1.svg, writes.svg


 Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
 our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
 majority of users use a hash partitioner, it occurs to me we could maintain a 
 hybrid 

[jira] [Created] (CASSANDRA-8004) Run LCS for both repaired and unrepaired data

2014-09-25 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-8004:
--

 Summary: Run LCS for both repaired and unrepaired data
 Key: CASSANDRA-8004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8004
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
 Fix For: 2.1.1


If a user has leveled compaction configured, we should run that for both the 
unrepaired and the repaired data. I think this would make things a lot easier 
for end users

It would simplify migration to incremental repairs as well, if a user runs 
incremental repair on its nicely leveled unrepaired data, we won't need to drop it 
all to L0, instead we can just start moving sstables from the unrepaired 
leveling straight into the repaired leveling

Idea could be to have two instances of LeveledCompactionStrategy and move 
sstables between the instances after an incremental repair run (and let LCS be 
totally oblivious to whether it handles repaired or unrepaired data). Same 
should probably apply to any compaction strategy, run two instances and remove 
all repaired/unrepaired logic from the strategy itself.
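
A rough sketch of that wrapper idea, with illustrative names only (none of this is the 
actual implementation): two strategy instances, one per repaired state, with sstables 
simply moved between them after an incremental repair.

{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustrative sketch only: the real strategies stay oblivious to repair state;
// the wrapper just decides which instance an sstable belongs to.
class SplitByRepairedSketch
{
    interface SSTableLike { boolean isRepaired(); }

    private final List<SSTableLike> repairedSet = new ArrayList<>();    // fed to one LCS instance
    private final List<SSTableLike> unrepairedSet = new ArrayList<>();  // fed to the other

    void add(SSTableLike sstable)
    {
        (sstable.isRepaired() ? repairedSet : unrepairedSet).add(sstable);
    }

    // After an incremental repair run, newly repaired sstables move across
    // without being dropped to L0; they keep whatever leveling they had.
    void onRepairFinished(Collection<SSTableLike> newlyRepaired)
    {
        unrepairedSet.removeAll(newlyRepaired);
        repairedSet.addAll(newlyRepaired);
    }
}
{code}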



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8004) Run LCS for both repaired and unrepaired data

2014-09-25 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147728#comment-14147728
 ] 

T Jake Luciani commented on CASSANDRA-8004:
---

+1

 Run LCS for both repaired and unrepaired data
 -

 Key: CASSANDRA-8004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8004
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
  Labels: compaction
 Fix For: 2.1.1


 If a user has leveled compaction configured, we should run that for both the 
 unrepaired and the repaired data. I think this would make things a lot easier 
 for end users
 It would simplify migration to incremental repairs as well, if a user runs 
 incremental repair on its nicely leveled unrepaired data, we won't need to drop 
 it all to L0, instead we can just start moving sstables from the unrepaired 
 leveling straight into the repaired leveling
 Idea could be to have two instances of LeveledCompactionStrategy and move 
 sstables between the instances after an incremental repair run (and let LCS 
 be totally oblivious to whether it handles repaired or unrepaired data). Same 
 should probably apply to any compaction strategy, run two instances and 
 remove all repaired/unrepaired logic from the strategy itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8004) Run LCS for both repaired and unrepaired data

2014-09-25 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147728#comment-14147728
 ] 

T Jake Luciani edited comment on CASSANDRA-8004 at 9/25/14 1:29 PM:


+1 this idea


was (Author: tjake):
+1

 Run LCS for both repaired and unrepaired data
 -

 Key: CASSANDRA-8004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8004
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
  Labels: compaction
 Fix For: 2.1.1


 If a user has leveled compaction configured, we should run that for both the 
 unrepaired and the repaired data. I think this would make things a lot easier 
 for end users
 It would simplify migration to incremental repairs as well, if a user runs 
 incremental repair on its nicely leveled unrepaired data, we won't need to drop 
 it all to L0, instead we can just start moving sstables from the unrepaired 
 leveling straight into the repaired leveling
 Idea could be to have two instances of LeveledCompactionStrategy and move 
 sstables between the instances after an incremental repair run (and let LCS 
 be totally oblivious to whether it handles repaired or unrepaired data). Same 
 should probably apply to any compaction strategy, run two instances and 
 remove all repaired/unrepaired logic from the strategy itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8003) Allow the CassandraDaemon to be managed externally

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-8003:
---
Attachment: CASSANDRA-8003.patch

 Allow the CassandraDaemon to be managed externally
 --

 Key: CASSANDRA-8003
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8003
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun
 Attachments: CASSANDRA-8003.patch


 This is related to CASSANDRA-7998 and deals with the control flow, if the 
 CassandraDaemon is managed by another Java process. In that case it should 
 not exit the VM, but instead delegate that decision to the process that 
 created the daemon in the first place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7998) Remove the usage of System.exit() calls in core services

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-7998:
---
Attachment: CASSANDRA-7998.patch

 Remove the usage of System.exit() calls in core services
 

 Key: CASSANDRA-7998
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7998
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun
 Attachments: CASSANDRA-7998.patch


 The use of System.exit() prevents using the CassandraDaemon as a managed 
 service (managed from another Java process). The core services 
 (StorageService, DatabaseDescriptor, SSTableReader) should propagate 
 exceptions back to the caller so the decision to exit the VM (unmanaged case) 
 or further delegate that decision (managed case) can be handled in a 
 well-defined place.
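
The intent, roughly, in a self-contained sketch (exception and method names are made 
up for illustration, not taken from the attached patch): the service throws instead of 
exiting, and the entry point decides what "fatal" means.

{code}
// Illustrative sketch only: surface failures to the caller instead of exiting
// deep inside a core service.
public class StartupFlowSketch
{
    static class FatalStartupException extends Exception
    {
        FatalStartupException(String message) { super(message); }
    }

    static void loadDescriptor() throws FatalStartupException
    {
        // before: log and call System.exit(1) right here
        // after:  report the problem to whoever called us
        throw new FatalStartupException("bad cassandra.yaml");
    }

    public static void main(String[] args)
    {
        try
        {
            loadDescriptor();
        }
        catch (FatalStartupException e)
        {
            // Unmanaged (standalone) case: exiting here is still reasonable.
            // Managed case: the embedding process catches this and decides.
            System.exit(1);
        }
    }
}
{code}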



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7997) Improve the ability to run the CassandraDaemon as a managed service

2014-09-25 Thread Heiko Braun (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147761#comment-14147761
 ] 

Heiko Braun commented on CASSANDRA-7997:


I've separated the issues into two, same with the patch files that need to be 
applied in order. If it's more convenient I can collapse both issues into one. 
Juts let me know what works best for you.

 Improve the ability to run the CassandraDaemon as a managed service
 ---

 Key: CASSANDRA-7997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Heiko Braun
Priority: Minor

 See a transcript of the discussion here:
 http://www.mail-archive.com/dev@cassandra.apache.org/msg07529.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7998) Remove the usage of System.exit() calls in core services

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-7998:
---
Attachment: (was: CASSANDRA-7998.patch)

 Remove the usage of System.exit() calls in core services
 

 Key: CASSANDRA-7998
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7998
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun

 The use of System.exit() prevents using the CassandraDaemon as a managed 
 service (managed from another Java process). The core services 
 (StorageService,DatabaseDescriptor, SSTableReader) should propagate 
 exceptions back to the callee so the decision to exit the VM (unmanaged case) 
 or further delegate that decision (managed case) can be handled in a well 
 defined place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8003) Allow the CassandraDaemon to be managed externally

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-8003:
---
Attachment: (was: CASSANDRA-8003.patch)

 Allow the CassandraDaemon to be managed externally
 --

 Key: CASSANDRA-8003
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8003
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun

 This is related to CASSANDRA-7998 and deals with the control flow, if the 
 CassandraDaemon is managed by another Java process. In that case it should 
 not exit the VM, but instead delegate that decision to the process that 
 created the daemon in the first place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7997) Improve the ability to run the CassandraDaemon as a managed service

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-7997:
---
Attachment: CASSANDRA-8003.patch
CASSANDRA-7998.patch

 Improve the ability to run the CassandraDaemon as a managed service
 ---

 Key: CASSANDRA-7997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Heiko Braun
Priority: Minor
 Attachments: CASSANDRA-7998.patch, CASSANDRA-8003.patch


 See a transcript of the discussion here:
 http://www.mail-archive.com/dev@cassandra.apache.org/msg07529.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7997) Improve the ability to run the CassandraDaemon as a managed service

2014-09-25 Thread Heiko Braun (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147761#comment-14147761
 ] 

Heiko Braun edited comment on CASSANDRA-7997 at 9/25/14 2:14 PM:
-

I've separated the issues into two, same with the patch files that need to be 
applied in order. If it's more convenient I can collapse both issues into one. 
Just let me know what works best for you.


was (Author: hbraun):
I've separated the issues into two, same with the patch files that need to be 
applied in order. If it's more convenient I can collapse both issues into one. 
Juts let me know what works best for you.

 Improve the ability to run the CassandraDaemon as a managed service
 ---

 Key: CASSANDRA-7997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Heiko Braun
Priority: Minor
 Attachments: CASSANDRA-7998.patch, CASSANDRA-8003.patch


 See a transcript of the discussion here:
 http://www.mail-archive.com/dev@cassandra.apache.org/msg07529.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7998) Remove the usage of System.exit() calls in core services

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-7998:
---
Priority: Minor  (was: Major)

 Remove the usage of System.exit() calls in core services
 

 Key: CASSANDRA-7998
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7998
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun
Priority: Minor

 The use of System.exit() prevents using the CassandraDaemon as a managed 
 service (managed from another Java process). The core services 
 (StorageService,DatabaseDescriptor, SSTableReader) should propagate 
 exceptions back to the callee so the decision to exit the VM (unmanaged case) 
 or further delegate that decision (managed case) can be handled in a well 
 defined place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8003) Allow the CassandraDaemon to be managed externally

2014-09-25 Thread Heiko Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heiko Braun updated CASSANDRA-8003:
---
Priority: Minor  (was: Major)

 Allow the CassandraDaemon to be managed externally
 --

 Key: CASSANDRA-8003
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8003
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Heiko Braun
Priority: Minor

 This is related to CASSANDRA-7998 and deals with the control flow, if the 
 CassandraDaemon is managed by another Java process. In that case it should 
 not exit the VM, but instead delegate that decision to the process that 
 created the daemon in the first place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147823#comment-14147823
 ] 

Marcus Eriksson commented on CASSANDRA-7019:


branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o ks cf

It writes fully compacted partitions - each partition will only be in one 
single sstable  - my first idea was to put the cells back in the corresponding 
files where they were found (minus tombstones), but it felt wrong to not 
actually write the compacted partition out when we have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, 
and starts filling each level from L0 up
** note that (if we have token range 0 - 1000) L1 will get tokens 0-10, L2 
11-100 and L3 101 - 1000. Not though much about if this is good/bad for 
future compactions.

STCS:
* calculates an 'optimal' distribution of sstables, currently it makes them 
50%, 25%, 12.5% ... of total data size until the smallest sstable would be sub 
50MB, then puts all the rest in the last sstable. If anyone has a more optimal 
sstable distribution, please let me know
** the sstables will be non-overlapping, it starts writing the biggest sstable 
first and continues with the rest once 50% is in that
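
In other words, the target sizes halve until the next one would dip below the 50MB 
floor, and whatever is left goes into the final sstable. A small sketch of that 
arithmetic (the 50MB cutoff is the one mentioned above; everything else is illustrative):

{code}
import java.util.ArrayList;
import java.util.List;

// Sketch of the size distribution described above: 50%, 25%, 12.5%, ... of the
// total, stopping once the next sstable would be under 50MB, with the
// remainder lumped into the last (smallest) sstable.
class StcsMajorDistributionSketch
{
    static final long MIN_SSTABLE_BYTES = 50L * 1024 * 1024;

    static List<Long> targetSizes(long totalBytes)
    {
        List<Long> sizes = new ArrayList<>();
        long remaining = totalBytes;
        while (remaining / 2 >= MIN_SSTABLE_BYTES)
        {
            sizes.add(remaining / 2);   // take half of what is left each step
            remaining -= remaining / 2;
        }
        sizes.add(remaining);           // everything else goes in the last one
        return sizes;
    }

    public static void main(String[] args)
    {
        // 1 GiB of data -> roughly 512MB, 256MB, 128MB, 64MB, 64MB
        System.out.println(targetSizes(1024L * 1024 * 1024));
    }
}
{code}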

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction

 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.
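
A rough sketch of what the second trigger could look like; the 20% threshold is the 
"x%" guess above, and the interface and method names here are assumptions for 
illustration only.

{code}
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustration of option 2 above: when an sstable's droppable-tombstone ratio
// crosses the threshold, compact it together with overlapping older sstables.
class TombstoneTriggerSketch
{
    static final double TOMBSTONE_THRESHOLD = 0.20;   // the "x% (20%?)" above

    interface SSTableLike
    {
        double droppableTombstoneRatio(int gcBefore);
        long maxTimestamp();
        boolean overlaps(SSTableLike other);
    }

    static Set<SSTableLike> candidates(Collection<SSTableLike> live, int gcBefore)
    {
        Set<SSTableLike> selected = new LinkedHashSet<>();
        for (SSTableLike table : live)
        {
            if (table.droppableTombstoneRatio(gcBefore) <= TOMBSTONE_THRESHOLD)
                continue;
            selected.add(table);
            for (SSTableLike other : live)
                if (other != table && other.overlaps(table) && other.maxTimestamp() < table.maxTimestamp())
                    selected.add(other);   // overlapping sstables holding older data
        }
        return selected;
    }
}
{code}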



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147823#comment-14147823
 ] 

Marcus Eriksson edited comment on CASSANDRA-7019 at 9/25/14 2:56 PM:
-

branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o ks cf

It writes fully compacted partitions - each partition will only be in one 
single sstable  - my first idea was to put the cells back in the corresponding 
files where they were found (minus tombstones), but it felt wrong to not 
actually write the compacted partition out when we have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, 
and starts filling each level from L0 up
** note that (if we have token range 0 - 1000) L1 will get tokens 0 - 10, L2 
11 - 100 and L3 101 - 1000. Not thought much about if this is good/bad for 
future compactions.

STCS:
* calculates an 'optimal' distribution of sstables, currently it makes them 
50%, 25%, 12.5% ... of total data size until the smallest sstable would be sub 
50MB, then puts all the rest in the last sstable. If anyone has a more optimal 
sstable distribution, please let me know
** the sstables will be non-overlapping, it starts writing the biggest sstable 
first and continues with the rest once 50% is in that


was (Author: krummas):
branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o ks cf

It writes fully compacted partitions - each partition will only be in one 
single sstable  - my first idea was to put the cells back in the corresponding 
files where they were found (minus tombstones), but it felt wrong to not 
actually write the compacted partition out when we have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, 
and starts filling each level from L0 up
** note that (if we have token range 0 - 1000) L1 will get tokens 0 - 10, L2 
11 - 100 and L3 101 - 1000. Not though much about if this is good/bad for 
future compactions.

STCS:
* calculates an 'optimal' distribution of sstables, currently it makes them 
50%, 25%, 12.5% ... of total data size until the smallest sstable would be sub 
50MB, then puts all the rest in the last sstable. If anyone has a more optimal 
sstable distribution, please let me know
** the sstables will be non-overlapping, it starts writing the biggest sstable 
first and continues with the rest once 50% is in that

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction

 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147823#comment-14147823
 ] 

Marcus Eriksson edited comment on CASSANDRA-7019 at 9/25/14 2:55 PM:
-

branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o ks cf

It writes fully compacted partitions - each partition will only be in one 
single sstable  - my first idea was to put the cells back in the corresponding 
files where they were found (minus tombstones), but it felt wrong to not 
actually write the compacted partition out when we have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, 
and starts filling each level from L0 up
** note that (if we have token range 0 - 1000) L1 will get tokens 0 - 10, L2 
11 - 100 and L3 101 - 1000. Not though much about if this is good/bad for 
future compactions.

STCS:
* calculates an 'optimal' distribution of sstables, currently it makes them 
50%, 25%, 12.5% ... of total data size until the smallest sstable would be sub 
50MB, then puts all the rest in the last sstable. If anyone has a more optimal 
sstable distribution, please let me know
** the sstables will be non-overlapping, it starts writing the biggest sstable 
first and continues with the rest once 50% is in that


was (Author: krummas):
branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o ks cf

It writes fully compacted partitions - each partition will only be in one 
single sstable  - my first idea was to put the cells back in the corresponding 
files where they were found (minus tombstones), but it felt wrong to not 
actually write the compacted partition out when we have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, 
and starts filling each level from L0 up
** note that (if we have token range 0 - 1000) L1 will get tokens 0-10, L2 
11-100 and L3 101 - 1000. Not though much about if this is good/bad for 
future compactions.

STCS:
* calculates an 'optimal' distribution of sstables, currently it makes them 
50%, 25%, 12.5% ... of total data size until the smallest sstable would be sub 
50MB, then puts all the rest in the last sstable. If anyone has a more optimal 
sstable distribution, please let me know
** the sstables will be non-overlapping, it starts writing the biggest sstable 
first and continues with the rest once 50% is in that

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction

 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147831#comment-14147831
 ] 

Jeremiah Jordan commented on CASSANDRA-7019:


Since this is going in 3.0, maybe we should make this the default nodetool 
compact.  I don't know of any case where the STCS behaviour of putting everything 
in one file is really what people want.  And for LCS, all we used to do is run the 
compaction task like normal.  If we still want a way to kick off compaction for 
LCS, we could add a new nodetool checkcompaction command or something that 
just schedules the compaction manager to run (and does that for both STCS and LCS).  
Doing that is useful when someone changes compaction settings and there are no 
writes currently happening to the system, so making it an explicit command 
sounds right to me.

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-5589) ArrayIndexOutOfBoundsException in LeveledManifest

2014-09-25 Thread Jeff Griffith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147833#comment-14147833
 ] 

Jeff Griffith edited comment on CASSANDRA-5589 at 9/25/14 3:05 PM:
---

hi [~jbellis], it looks like this problem may not have been the total # of 
generations allocated because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's however the index out of bounds is 9 (log 10 of 1 
billion from your fix.) skipLevels does not check newLevel against 
generations.length however the fix obviously isn't that simple since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}


was (Author: jeffery.griffith):
hi [~jbellis], it looks like this problem not have been the total # of 
generations allocated because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's however the index out of bounds is 9 (log 10 of 1 
billion from your fix.) skipLevels does not check newLevel against 
generations.length however the fix obviously isn't that simple since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}

 ArrayIndexOutOfBoundsException in LeveledManifest
 -

 Key: CASSANDRA-5589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5589
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jeremy Hanna
Assignee: Jonathan Ellis
Priority: Minor
  Labels: compaction
 Fix For: 1.2.6

 Attachments: 5589.txt


 The following stack trace was in the system.log:
 {quote}
 ERROR [CompactionExecutor:2] 2013-05-22 16:19:32,402 CassandraDaemon.java 
 (line 174) Exception in thread Thread[CompactionExecutor:2,1,main]
  java.lang.ArrayIndexOutOfBoundsException: 5
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.skipLevels(LeveledManifest.java:176)
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:215)
   at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:155)
   at 
 org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:410)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:223)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:991)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:188)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-5589) ArrayIndexOutOfBoundsException in LeveledManifest

2014-09-25 Thread Jeff Griffith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147833#comment-14147833
 ] 

Jeff Griffith commented on CASSANDRA-5589:
--

hi [~jbellis], it looks like this problem not have been the total # of 
generations allocated because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's however the index out of bounds is 9 (log 10 of 1 
billion from your fix.) skipLevels does not check newLevel against 
generations.length however the fix obviously isn't that simple since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}
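
As a concrete illustration of the missing bound (array length, byte sizes and growth 
factor below are invented for the example; this is not Cassandra code): with nine 
levels, indices 0..8, the generations[newLevel + 1] access throws as soon as newLevel 
reaches 8, which matches the reported index of 9.

{code}
import java.util.ArrayList;
import java.util.List;

// Toy reproduction of the missing bounds check; all values are made up.
class SkipLevelsBoundsSketch
{
    static final int LEVELS = 9;                      // indices 0..8
    static final List<String>[] generations = newEmptyLevels();

    @SuppressWarnings("unchecked")
    static List<String>[] newEmptyLevels()
    {
        List<String>[] levels = new List[LEVELS];
        for (int i = 0; i < LEVELS; i++)
            levels[i] = new ArrayList<String>();
        return levels;
    }

    static long maxBytesForLevel(int level)
    {
        return 160L * 1024 * 1024 * (long) Math.pow(10, level);   // made-up growth
    }

    static int skipLevels(int newLevel, long totalBytesAdded)
    {
        // Same shape as the loop quoted above, with no check against
        // generations.length: once newLevel reaches 8, generations[9] throws.
        while (maxBytesForLevel(newLevel) < totalBytesAdded && generations[newLevel + 1].isEmpty())
            newLevel++;
        return newLevel;
    }

    public static void main(String[] args)
    {
        try
        {
            skipLevels(0, Long.MAX_VALUE / 2);   // big enough to walk past the top level
        }
        catch (ArrayIndexOutOfBoundsException e)
        {
            System.out.println("out of bounds: " + e.getMessage());   // the offending index
        }
    }
}
{code}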

 ArrayIndexOutOfBoundsException in LeveledManifest
 -

 Key: CASSANDRA-5589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5589
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jeremy Hanna
Assignee: Jonathan Ellis
Priority: Minor
  Labels: compaction
 Fix For: 1.2.6

 Attachments: 5589.txt


 The following stack trace was in the system.log:
 {quote}
 ERROR [CompactionExecutor:2] 2013-05-22 16:19:32,402 CassandraDaemon.java 
 (line 174) Exception in thread Thread[CompactionExecutor:2,1,main]
  java.lang.ArrayIndexOutOfBoundsException: 5
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.skipLevels(LeveledManifest.java:176)
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:215)
   at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:155)
   at 
 org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:410)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:223)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:991)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:188)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-5589) ArrayIndexOutOfBoundsException in LeveledManifest

2014-09-25 Thread Jeff Griffith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147833#comment-14147833
 ] 

Jeff Griffith edited comment on CASSANDRA-5589 at 9/25/14 3:05 PM:
---

hi [~jbellis], it looks like this problem may not have been the total # of 
generations allocated because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's however the index out of bounds is 9 (log 10 of 1 
billion=9 from your fix.) skipLevels does not check newLevel against 
generations.length however the fix obviously isn't that simple since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}


was (Author: jeffery.griffith):
hi [~jbellis], it looks like this problem may not have been the total # of 
generations allocated because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's however the index out of bounds is 9 (log 10 of 1 
billion from your fix.) skipLevels does not check newLevel against 
generations.length however the fix obviously isn't that simple since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}

 ArrayIndexOutOfBoundsException in LeveledManifest
 -

 Key: CASSANDRA-5589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5589
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jeremy Hanna
Assignee: Jonathan Ellis
Priority: Minor
  Labels: compaction
 Fix For: 1.2.6

 Attachments: 5589.txt


 The following stack trace was in the system.log:
 {quote}
 ERROR [CompactionExecutor:2] 2013-05-22 16:19:32,402 CassandraDaemon.java 
 (line 174) Exception in thread Thread[CompactionExecutor:2,1,main]
  java.lang.ArrayIndexOutOfBoundsException: 5
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.skipLevels(LeveledManifest.java:176)
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:215)
   at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:155)
   at 
 org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:410)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:223)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:991)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:188)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147843#comment-14147843
 ] 

Carl Yeksigian commented on CASSANDRA-7019:
---

For LCS, we might be artificially penalizing early tokens. What if we started at 
the highest level in which we are currently storing data, instead of at L1? It 
would be a good proxy for the size of the data that we are currently storing, 
and it would avoid unnecessarily recompacting data because we placed it in such 
a low level.

I'm +1 to [~jjordan]'s proposal to change the default to this; I'd rather just 
add an option to compact to start minor compactions instead of adding a new 
command.

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[Cassandra Wiki] Update of HowToContribute by TylerHobbs

2014-09-25 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The HowToContribute page has been changed by TylerHobbs:
https://wiki.apache.org/cassandra/HowToContribute?action=diffrev1=55rev2=56

Comment:
Update instructions for running tests

  == Testing and Coverage ==
  Setting up and running system tests:
  
+ === Running the Unit Tests ===
+ Simply run `ant` to run all unit tests. To run a specific test class, run 
`ant -Dtest.name=ClassName`.  To run a specific test method, run `ant 
-Dtestsome.name=ClassName -Dtestsome.methods=comma-separated list of method 
names`.
- === Running the functional tests for Thrift RPC ===
-  1. Install CQL: `svn checkout 
https://svn.apache.org/repos/asf/cassandra/drivers; cd drivers/py; python 
setup.py build; sudo python setup.py install`.
-  1. Install the 
[[http://somethingaboutorange.com/mrl/projects/nose/0.11.1/|nose]] test runner 
(`aptitude install python-nose`, `easy_install nose`, etc).
-  1. Install the Thrift compiler (see InstallThrift) and Python libraries (`cd 
thrift/lib/py  python setup.py install`).
-  1. Generate Cassandra's Python code using `ant gen-thrift-py`.
-  1. Build the source `ant clean build`.
-  1. Run `nosetests test/system/` from the top-level source directory.
  
- If you need to modify the system tests, you probably only need to care about 
test/system/test_thrift_server.py.  (test/system/__init__.py takes care of 
spawning new cassandra instances for each test and cleaning up afterwards so 
they are isolated.)
+ You can also run tests in parallel: `ant test -Dtest.runners=4`.
+ 
+ === Running the dtests ===
+ The dtests use [[https://github.com/pcmanus/ccm|ccm]] to test a local cluster.
+  1. Install ccm.  You can do this with pip by running `pip install ccm`.
+  1. Install nosetests.  With pip, this is `pip install nose`.
+  1. Clone the dtest repo: https://github.com/riptano/cassandra-dtest.git
+  1. Set `$CASSANDRA_DIR` to the location of your cassandra checkout.  For 
example: `export CASSANDRA_DIR=/home/joe/cassandra`.  Make sure you've already 
built cassandra in this directory.
+  1. Run all tests by running `nosetests` from the dtest checkout.  You can 
run a specific module like so: `nosetests cql_tests.py`.  You can run a 
specific test method like this: `nosetests cql_tests.py:TestCQL.counters_test`
  
  === Running the code coverage task ===
   1. Unzip this one: 
http://sourceforge.net/projects/cobertura/files/cobertura/1.9.4.1/cobertura-1.9.4.1-bin.zip/download


[jira] [Comment Edited] (CASSANDRA-5589) ArrayIndexOutOfBoundsException in LeveledManifest

2014-09-25 Thread Jeff Griffith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147833#comment-14147833
 ] 

Jeff Griffith edited comment on CASSANDRA-5589 at 9/25/14 3:49 PM:
---

hi [~jbellis], it looks like this problem may not have been caused by the total # of 
generations allocated, because I can still see this in 1.2.19. The stack trace 
is the same as Jeremy's, but the out-of-bounds index is 9 (log 10 of 1 
billion = 9, from your fix). skipLevels does not check newLevel against 
generations.length; the fix obviously isn't that simple, though, since it needs 
to return the newLevel... some other logic problem in the loop termination?

private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}

This apparently began when someone set the max sstable size to 400MB; after 
going back to 300MB the problem seems to have gone away.
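
To make the missing bounds check concrete, here is a hedged sketch of the guard 
being described, reusing the fields from the method above. This is not the 
actual fix (as noted, it isn't that simple, since the caller still needs a 
usable newLevel once the top of the array is reached):

{code}
// Illustrative sketch only: same method shape as above, with the loop
// additionally stopping before (newLevel + 1) can run past generations.length.
private int skipLevels(int newLevel, Iterable<SSTableReader> added)
{
    while (newLevel + 1 < generations.length
           && maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
           && generations[(newLevel + 1)].isEmpty())
    {
        newLevel++;
    }
    return newLevel;
}
{code}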



 ArrayIndexOutOfBoundsException in LeveledManifest
 -

 Key: CASSANDRA-5589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5589
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Jeremy Hanna
Assignee: Jonathan Ellis
Priority: Minor
  Labels: compaction
 Fix For: 1.2.6

 Attachments: 5589.txt


 The following stack trace was in the system.log:
 {quote}
 ERROR [CompactionExecutor:2] 2013-05-22 16:19:32,402 CassandraDaemon.java 
 (line 174) Exception in thread Thread[CompactionExecutor:2,1,main]
  java.lang.ArrayIndexOutOfBoundsException: 5
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.skipLevels(LeveledManifest.java:176)
   at 
 org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:215)
   at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:155)
   at 
 org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:410)
   at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:223)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:991)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:188)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147892#comment-14147892
 ] 

sankalp kohli commented on CASSANDRA-7019:
--

[~carlyeks] Can you explain your idea about placing sstables? If the application 
is using up to, say, L4, should we fill L4, then L3, and so on? 

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[Cassandra Wiki] Trivial Update of HowToContribute by TylerHobbs

2014-09-25 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The HowToContribute page has been changed by TylerHobbs:
https://wiki.apache.org/cassandra/HowToContribute?action=diff&rev1=56&rev2=57

Comment:
Fix testsome invocation

  Setting up and running system tests:
  
  === Running the Unit Tests ===
- Simply run `ant` to run all unit tests. To run a specific test class, run 
`ant -Dtest.name=ClassName`.  To run a specific test method, run `ant 
-Dtestsome.name=ClassName -Dtestsome.methods=comma-separated list of method 
names`.
+ Simply run `ant` to run all unit tests. To run a specific test class, run 
`ant -Dtest.name=ClassName`.  To run a specific test method, run `ant 
testsome -Dtest.name=ClassName -Dtest.methods=comma-separated list of method 
names`.
  
  You can also run tests in parallel: `ant test -Dtest.runners=4`.
  


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147903#comment-14147903
 ] 

Carl Yeksigian commented on CASSANDRA-7019:
---

I was thinking L4, then L5 (as in this patch, currently). Ideally, we would 
pick the level where all of the sstables would fit, but we don't know how many 
sstables the compaction will end up producing, so this seems like a compromise. 
This would be similar to the thinking in CASSANDRA-6323.
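
As a rough, purely illustrative sketch of "pick the level where all of the 
sstables would fit", assuming the usual 10x-per-level LCS sizing (hypothetical 
helper, not part of the patch):

{code}
final class LevelEstimate
{
    // Illustrative only: the lowest LCS level whose capacity could hold
    // totalBytes, assuming each level holds roughly 10x the previous one,
    // i.e. about maxSSTableSizeBytes * 10^level bytes at level L.
    static int lowestLevelThatFits(long totalBytes, long maxSSTableSizeBytes)
    {
        int level = 1;
        long capacity = maxSSTableSizeBytes * 10;   // capacity of L1
        while (capacity < totalBytes)
        {
            level++;
            capacity *= 10;
        }
        return level;
    }
}
{code}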

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8000) Schema Corruption when 1.2.15-2.0.9 rolling upgrade and in mixed mode

2014-09-25 Thread Yeshvanthni (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148047#comment-14148047
 ] 

Yeshvanthni commented on CASSANDRA-8000:


A full rolling upgrade to 1.2.16 and then to 2.0.9 avoids this issue.

 Schema Corruption when 1.2.15-2.0.9 rolling upgrade and in mixed mode
 --

 Key: CASSANDRA-8000
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8000
 Project: Cassandra
  Issue Type: Bug
Reporter: Yeshvanthni

 Steps to reproduce:
 1. Setup multi-node Cassandra 1.2.15 with following schema 
 {code}
 CREATE KEYSPACE testkeyspace WITH replication = {
   'class': 'SimpleStrategy',
   'replication_factor':2
 };
 USE testkeyspace;
 CREATE TABLE test (
   testid timeuuid PRIMARY KEY,
   businesskey timestamp,
   createdby text,
   createdtimestamp timestamp,
   testname text
 ) ;
 insert into test(testid,businesskey,createdby,createdtimestamp,testname) 
 VALUES (now(),dateOf(now()),'user',dateOf(now()),'test');
 {code}
 2. Roll one node to Cassandra 2.0.9 
  - Snapshot 1.2.15 
  - Decommission the old 1.2.15
  - Start Cassandra 2.0.9 pointing to the same data folder  as 1.2.15
  - nodetool upgradesstables 
 3. Query against 1.2.15 nodes of the cluster with CQLSH
 It returns an additional primary key column with null value in it. Describe 
 shows that the table has somehow got the additional column
 {code}
 CREATE TABLE test (
   testid timeuuid PRIMARY KEY,
   testid timeuuid,
   businesskey timestamp,
   createdby text,
   createdtimestamp timestamp,
   testname text
 ) ;
 {code}
 Observation:
 This could be because of the change in Cassandra 2.x to store all columns, 
 including the key columns, in schema_columns, while earlier the key columns were 
 stored in schema_columnfamilies.
 This blocks rolling upgrades and fails the cluster when in mixed mode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148058#comment-14148058
 ] 

Marcus Eriksson commented on CASSANDRA-7019:


The problem with starting in high levels is that it will take a long time 
before that data gets included in a (minor) compaction. This is basically a 
major compaction (like in current STCS).

The alternative to putting the low tokens in the lower levels is to write all 
levels at the same time and randomly distribute the tokens over them (putting 
roughly 1% in L1, 10% in L2, 89% in L3), but I can't really see any difference 
compared to having the low tokens in one sstable; the number of overlapping 
tokens between a newly flushed file in L0 and L1 should be the same (if tokens 
are evenly distributed over the flushed sstable)
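
The 1% / 10% / 89% split is roughly what falls out of the usual 10x-per-level 
LCS sizing; a small, purely illustrative Java snippet (not from the patch) 
showing the arithmetic:

{code}
public class LevelShare
{
    public static void main(String[] args)
    {
        // Capacities of L1..L3 in units of one sstable, assuming 10x growth per level.
        double[] capacity = { 10, 100, 1000 };
        double total = 10 + 100 + 1000;           // 1110
        for (int i = 0; i < capacity.length; i++)
            System.out.printf("L%d: %.1f%%%n", i + 1, 100 * capacity[i] / total);
        // Prints roughly: L1: 0.9%, L2: 9.0%, L3: 90.1%
    }
}
{code}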

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

2014-09-25 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148186#comment-14148186
 ] 

Carl Yeksigian commented on CASSANDRA-7019:
---

I have no problem with making it consistent but arbitrary which tokens go into 
L1/L2, just thought it would be better to put all of them in the same level 
since they'll move there eventually. I think you're right, though; they will 
end up not being included in minor compactions, so it would continually require 
a major tombstone compaction.

 Major tombstone compaction
 --

 Key: CASSANDRA-7019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
  Labels: compaction
 Fix For: 3.0


 It should be possible to do a major tombstone compaction by including all 
 sstables, but writing them out 1:1, meaning that if you have 10 sstables 
 before, you will have 10 sstables after the compaction with the same data, 
 minus all the expired tombstones.
 We could do this in two ways:
 # a nodetool command that includes _all_ sstables
 # once we detect that an sstable has more than x% (20%?) expired tombstones, 
 we start one of these compactions, and include all overlapping sstables that 
 contain older data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7579) File descriptor exhaustion can lead to unreliable state in exception condition

2014-09-25 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-7579:
---
Fix Version/s: 2.1.1
 Assignee: Joshua McKenzie

 File descriptor exhaustion can lead to unreliable state in exception condition
 --

 Key: CASSANDRA-7579
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7579
 Project: Cassandra
  Issue Type: New Feature
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
Priority: Minor
 Fix For: 2.1.1


 If the JVM runs out of file descriptors we can get into an unreliable state 
 (similar to CASSANDRA-7507 on OOM) where we cannot trust our shutdown hook to 
 run successfully to completion.  We need to check IOExceptions and the 
 appropriate Throwable cause chain to see if we have a FileNotFoundException 
 with a Too many files open message, and forcefully shut down the Daemon in 
 these cases.
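
For illustration only, a hedged sketch of the kind of check the ticket 
describes (hypothetical helper, not Cassandra's actual code): walk the cause 
chain and look for a FileNotFoundException whose message indicates descriptor 
exhaustion.

{code}
import java.io.FileNotFoundException;

final class FdExhaustionCheck
{
    // Walk the cause chain and decide whether a failure looks like file
    // descriptor exhaustion. "Too many open files" is the usual Linux EMFILE
    // message; the exact wording can vary by platform.
    static boolean isFdExhaustion(Throwable t)
    {
        for (Throwable cause = t; cause != null; cause = cause.getCause())
        {
            if (cause instanceof FileNotFoundException
                && cause.getMessage() != null
                && cause.getMessage().contains("Too many open files"))
                return true;
        }
        return false;
    }
}
{code}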



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7775) Cassandra attempts to flush an empty memtable into disk and fails

2014-09-25 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148312#comment-14148312
 ] 

Russ Hatch commented on CASSANDRA-7775:
---

[~omribahumi] Since it has been a while, are you able to provide steps that 
reliably reproduce this issue? /cc [~merentafo]

 Cassandra attempts to flush an empty memtable into disk and fails
 -

 Key: CASSANDRA-7775
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7775
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: $ nodetool version
 ReleaseVersion: 2.0.6
 $ java -version
 java version "1.7.0_51"
 Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
 Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Reporter: Omri Bahumi

 I'm not sure what triggers this flush, but when it happens the following 
 appears in our logs:
 {code}
  INFO [OptionalTasks:1] 2014-08-15 02:24:20,115 ColumnFamilyStore.java (line 
 785) Enqueuing flush of Memtable-app_recs_best_in_expr_prefix2@1219170646(0/0 
 serialized/live bytes, 0 ops)
  INFO [FlushWriter:34] 2014-08-15 02:24:20,116 Memtable.java (line 331) 
 Writing Memtable-app_recs_best_in_expr_prefix2@1219170646(0/0 serialized/live 
 bytes, 0 ops)
 ERROR [FlushWriter:34] 2014-08-15 02:24:20,127 CassandraDaemon.java (line 
 196) Exception in thread Thread[FlushWriter:34,5,main]
 java.lang.RuntimeException: Cannot get comparator 1 in 
 org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type).
  This might due to a mismatch between the schema and the data read
 at 
 org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:133)
 at 
 org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:140)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:96)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
 at 
 org.apache.cassandra.db.RangeTombstone$Tracker$1.compare(RangeTombstone.java:125)
 at 
 org.apache.cassandra.db.RangeTombstone$Tracker$1.compare(RangeTombstone.java:122)
 at java.util.TreeMap.compare(TreeMap.java:1188)
 at java.util.TreeMap$NavigableSubMap.<init>(TreeMap.java:1264)
 at java.util.TreeMap$AscendingSubMap.<init>(TreeMap.java:1699)
 at java.util.TreeMap.tailMap(TreeMap.java:905)
 at java.util.TreeSet.tailSet(TreeSet.java:350)
 at java.util.TreeSet.tailSet(TreeSet.java:383)
 at 
 org.apache.cassandra.db.RangeTombstone$Tracker.update(RangeTombstone.java:203)
 at 
 org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:192)
 at 
 org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:138)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
 at 
 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:365)
 at 
 org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:318)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.IndexOutOfBoundsException: index (1) must be less than 
 size (1)
 at 
 com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
 at 
 com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
 at 
 com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:45)
 at 
 org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:124)
 ... 23 more
 {code}
 After this happens, the MemtablePostFlusher thread pool starts piling up.
 When trying to restart the cluster, a similar exception occurs when trying to 
 replay the commit log.
 Our way of recovering from this is to delete all commit logs in the faulty 
 node, start it and issue a repair.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7859) Extend freezing to collections

2014-09-25 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148341#comment-14148341
 ] 

Tyler Hobbs commented on CASSANDRA-7859:


For secondary indexes, do we want to support indexing keys and values of frozen 
collections?  If so, do we want to introduce a {{values()}} modifier to 
indicate that values should be indexed and index the whole collection by 
default?  If we assume that the values should be indexed when no modifier is 
used, it will be consistent with the behavior for non-frozen collections.  On 
the other hand, we actually have the ability to index the entire frozen 
collection, and this may be more in line with how frozen collections will be 
used.

My personal opinion is that we should:
# Introduce a {{values()}} modifier that can be used on collection indexes.  
For non-frozen collections, this is implied.
# Accept both {{keys()}} and {{values()}} indexes on frozen collections.
# By default, indexes on frozen collections will index the entire collection.

I can (and probably should) split secondary index work into a second ticket, 
but it would be good to decide some of this upfront in case that ticket can't 
make it into the same release as this one.

 Extend freezing to collections
 --

 Key: CASSANDRA-7859
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7859
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Tyler Hobbs
 Fix For: 2.1.1


 This is the follow-up to CASSANDRA-7857, to extend {{frozen}} to collections. 
 This will allow things like {{map<text, frozen<map<int, int>>>}} for 
 instance, as well as allowing {{frozen}} collections in PK columns.
 Additionally (and that's almost a separate ticket but I figured we can start 
 discussing it here), we could decide that tuple is a frozen type by default. 
 This means that we would allow {{tuple<int, text>}} without needing to add 
 {{frozen}}, but we would require {{frozen}} for complex types inside tuples, so 
 {{tuple<int, list<text>>}} would be rejected, but not 
 {{tuple<int, frozen<list<text>>>}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-25 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Comment: was deleted

(was: Maybe this is not related, but I have another small cluster with similar 
data. I have just upgraded that one to 2.0.10 (not DSE, the original open-source 
version). On all machines in this cluster I have many thousands of sstables, 
all 160Mb, with a few that are smaller. So they are all L0; no L1 or higher level 
sstables exist. LCS is used. Number of pending compactions: 0. There is even 
incoming traffic that writes into that keyspace. nodetool compact returns 
immediately.

)

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating a new cluster of 15 nodes (32 cores, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates load 
 similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and the 
 DataStax Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between a few dozen and 
 a few thousand columns in each row.
 This data generation process was generating a massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of traffic that the system will ultimately have to deal with; it 
 will be a mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not to mention the expensive SSD disk 
 space.
 At some point while generating the data I noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests had been stopped I noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of a few dozen compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
   compaction type   keyspace   table         completed     total          unit   progress
        Compaction   myks       table_list1   66499295588   1910515889913  bytes  3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name **table_list1**Data** | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, and one each of 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days, it's been close to 3 
 days since then, and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes starting to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see them sitting idle most of the time, lazily 
 compacting in one stream with CPU at ~140% and occasionally doing bursts 
 of compaction work for a few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only 

[jira] [Created] (CASSANDRA-8005) Server-side DESCRIBE

2014-09-25 Thread Tyler Hobbs (JIRA)
Tyler Hobbs created CASSANDRA-8005:
--

 Summary: Server-side DESCRIBE
 Key: CASSANDRA-8005
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8005
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Tyler Hobbs
Priority: Minor
 Fix For: 3.0


The various {{DESCRIBE}} commands are currently implemented by cqlsh, and 
nearly identical implementations exist in many drivers.  There are several 
motivations for making {{DESCRIBE}} part of the CQL language:
* Eliminate the (fairly complex) duplicate implementations across drivers and 
cqlsh
* Get closer to allowing drivers to not have to fetch the schema tables. (Minor 
changes to prepared statements are also needed.)
* Have instantaneous support for new schema features in cqlsh.  (You currently 
have to update the bundled python driver.)
* Support writing out schemas where it makes sense.  One good example of this 
is backups.  You need to restore the schema before restoring data in the case 
of total loss, so it makes sense to write out the schema alongside snapshots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7728) ConcurrentModificationException after upgrade to trunk

2014-09-25 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148489#comment-14148489
 ] 

Russ Hatch commented on CASSANDRA-7728:
---

Not reproducing anymore; adding the qa-resolved label.

 ConcurrentModificationException after upgrade to trunk
 --

 Key: CASSANDRA-7728
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7728
 Project: Cassandra
  Issue Type: Bug
Reporter: Russ Hatch
  Labels: qa-resolved

 Trying to repro another issue, I ran across this exception. It occurred 
 during a rolling upgrade to trunk. It happened during or right after the 
 test script checked counters to see if they are correct.
 {noformat}
 ERROR [Thrift:2] 2014-08-11 13:47:09,668 CustomTThreadPoolServer.java:219 - 
 Error occurred during processing of message.
 java.util.ConcurrentModificationException: null
   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) 
 ~[na:1.7.0_65]
   at java.util.ArrayList$Itr.next(ArrayList.java:831) ~[na:1.7.0_65]
   at 
 org.apache.cassandra.service.RowDigestResolver.getData(RowDigestResolver.java:40)
  ~[main/:na]
   at 
 org.apache.cassandra.service.RowDigestResolver.getData(RowDigestResolver.java:28)
  ~[main/:na]
   at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:110) 
 ~[main/:na]
   at 
 org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:144)
  ~[main/:na]
   at 
 org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1262) 
 ~[main/:na]
   at 
 org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1188) 
 ~[main/:na]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256)
  ~[main/:na]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:212)
  ~[main/:na]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:61)
  ~[main/:na]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:186)
  ~[main/:na]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:205) 
 ~[main/:na]
   at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1916)
  ~[main/:na]
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4588)
  ~[thrift/:na]
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4572)
  ~[thrift/:na]
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
 ~[libthrift-0.9.1.jar:0.9.1]
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
 ~[libthrift-0.9.1.jar:0.9.1]
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
  ~[main/:na]
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  [na:1.7.0_65]
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_65]
   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
 {noformat}
 It's not happening 100% of the time, but may be triggered by running this 
 dtest:
 {noformat}
 nosetests -vs 
 upgrade_through_versions_test.py:TestUpgradeThroughVersions.upgrade_test_mixed
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)