[jira] [Commented] (CASSANDRA-7957) improve active/pending compaction monitoring

2016-03-03 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177915#comment-15177915
 ] 

Nikolai Grigoriev commented on CASSANDRA-7957:
--

OK, I see your point. Well, then what I was thinking about is simply not 
doable, I guess.

> improve active/pending compaction monitoring
> 
>
> Key: CASSANDRA-7957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Nikolai Grigoriev
>Priority: Minor
>
> I think it might be useful to create a way to see what sstables are being 
> compacted into what new sstable. Something like an extension of "nodetool 
> compactionstats". I think it would be easier with this feature to 
> troubleshoot and understand how compactions are happening on your data. Not 
> sure how it is useful in everyday life but I could use such a feature when 
> dealing with CASSANDRA-7949.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7957) improve active/pending compaction monitoring

2016-03-02 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175833#comment-15175833
 ] 

Nikolai Grigoriev commented on CASSANDRA-7957:
--

I was talking specifically about the *current* status. That's why I mentioned 
"nodetool compactionstats". I think that in the case of Leveled Compaction we could 
expose a bit more data to make nodetool more informative.

Here is the use case. I look at "nodetool compactionstats" and see that my 
compaction process is doing almost nothing - say, running a single compaction. At the 
same time, I see hundreds of pending compactions. What I was asking for is 
additional information that shows what is in that list of pending compactions 
and what is blocked by what. I do not think you can easily get this information 
today with the default log settings.

If you look at CASSANDRA-7949 you will probably better understand why I was 
looking for this information.

> improve active/pending compaction monitoring
> 
>
> Key: CASSANDRA-7957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Nikolai Grigoriev
>Priority: Minor
>
> I think it might be useful to create a way to see what sstables are being 
> compacted into what new sstable. Something like an extension of "nodetool 
> compactionstats". I think it would be easier with this feature to 
> troubleshoot and understand how compactions are happening on your data. Not 
> sure how it is useful in everyday life but I could use such a feature when 
> dealing with CASSANDRA-7949.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8740) java.lang.AssertionError when reading saved cache

2015-02-04 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-8740:


 Summary: java.lang.AssertionError when reading saved cache
 Key: CASSANDRA-8740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8740
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: OEL 6.5, DSE 4.6.0, Cassandra 2.0.11.83
Reporter: Nikolai Grigoriev


I have started seeing this recently. I am not sure since which version, but now it 
happens relatively often on some of my nodes.

{code}
 INFO [main] 2015-02-04 18:18:09,253 ColumnFamilyStore.java (line 249) 
Initializing duo_xxx
 INFO [main] 2015-02-04 18:18:09,254 AutoSavingCache.java (line 114) reading 
saved cache /var/lib/cassandra/saved_caches/duo_xxx-RowCache-b.db
ERROR [main] 2015-02-04 18:18:09,256 CassandraDaemon.java (line 513) Exception 
encountered during startup
java.lang.AssertionError
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:41)
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37)
at 
org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118)
at 
org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:177)
at 
org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:44)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:130)
at 
org.apache.cassandra.db.ColumnFamilyStore.initRowCache(ColumnFamilyStore.java:592)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:92)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:305)
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:419)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659)
 INFO [Thread-2] 2015-02-04 18:18:09,259 DseDaemon.java (line 505) DSE shutting 
down...
ERROR [Thread-2] 2015-02-04 18:18:09,279 CassandraDaemon.java (line 199) 
Exception in thread Thread[Thread-2,5,main]
java.lang.AssertionError
at 
org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1274)
at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:506)
at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:408)
 INFO [main] 2015-02-04 18:18:49,144 CassandraDaemon.java (line 135) Logging 
initialized
 INFO [main] 2015-02-04 18:18:49,169 DseDaemon.java (line 382) DSE version: 
4.6.0
{code}

Cassandra version: 2.0.11.83 (DSE 4.6.0)

Looks like similar issues were reported and fixed in the past - like 
CASSANDRA-6325.

Maybe I am missing something, but I think that Cassandra should not crash and 
stop at startup if it cannot read a saved cache. An unreadable saved cache does not 
make the node inoperable and does not necessarily indicate severe data corruption. I 
applied a small change to my cluster config, restarted it, and 30% of my nodes 
did not start because of that. Of course the solution is simple, but it 
requires going to every node that failed to start, wiping the cache and starting it again.
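
To make the suggestion concrete, here is a minimal illustrative sketch (hypothetical 
class and method names, not Cassandra code) of what best-effort cache loading could 
look like: if deserialization fails, log it, drop the unreadable file and continue 
startup with a cold cache.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;

/**
 * Illustrative sketch only - not Cassandra code. Load a saved cache on a
 * best-effort basis: a broken cache file should cost us the cache, not the node.
 */
public final class BestEffortCacheLoad {

    /** Runs the loader; on any failure (including AssertionError) falls back to an empty cache. */
    public static <T> T loadSafely(Path savedCacheFile, Callable<T> loader, T emptyCache) {
        try {
            return loader.call();                       // e.g. deserialize the saved row cache
        } catch (Throwable t) {
            System.err.printf("Could not read saved cache %s (%s); starting with a cold cache%n",
                              savedCacheFile, t);
            try {
                Files.deleteIfExists(savedCacheFile);   // drop the unreadable file so the next restart is clean
            } catch (IOException e) {
                System.err.println("Could not delete " + savedCacheFile + ": " + e);
            }
            return emptyCache;                          // continue startup without the saved cache
        }
    }

    private BestEffortCacheLoad() {}
}
{code}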



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-6325) AssertionError on startup reading saved Serializing row cache

2015-02-04 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev reopened CASSANDRA-6325:
--

I have started seeing this recently. I am not sure since which version, but now it 
happens relatively often on some of my nodes.

{code}
 INFO [main] 2015-02-04 18:18:09,253 ColumnFamilyStore.java (line 249) 
Initializing duo_xxx
 INFO [main] 2015-02-04 18:18:09,254 AutoSavingCache.java (line 114) reading 
saved cache /var/lib/cassandra/saved_caches/duo_xxx-RowCache-b.db
ERROR [main] 2015-02-04 18:18:09,256 CassandraDaemon.java (line 513) Exception 
encountered during startup
java.lang.AssertionError
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:41)
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37)
at 
org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118)
at 
org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:177)
at 
org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:44)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:130)
at 
org.apache.cassandra.db.ColumnFamilyStore.initRowCache(ColumnFamilyStore.java:592)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:92)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:305)
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:419)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659)
 INFO [Thread-2] 2015-02-04 18:18:09,259 DseDaemon.java (line 505) DSE shutting 
down...
ERROR [Thread-2] 2015-02-04 18:18:09,279 CassandraDaemon.java (line 199) 
Exception in thread Thread[Thread-2,5,main]
java.lang.AssertionError
at 
org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1274)
at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:506)
at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:408)
 INFO [main] 2015-02-04 18:18:49,144 CassandraDaemon.java (line 135) Logging 
initialized
 INFO [main] 2015-02-04 18:18:49,169 DseDaemon.java (line 382) DSE version: 
4.6.0
{code}


Cassandra version: 2.0.11.83 (DSE 4.6.0)

 AssertionError on startup reading saved Serializing row cache
 -

 Key: CASSANDRA-6325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6325
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: upgrade from 1.2.9ish to 1.2.11ish
Reporter: Chris Burroughs
Assignee: Mikhail Stepura
Priority: Minor
 Fix For: 1.2.12, 2.0.3

 Attachments: 6325-v2.txt, CASSANDRA-1.2-6325.patch


 I don't see any reason what this could have to do with the upgrade, but don't 
 have a large enough non-prod cluster to just keep restarting on.  Occurred on 
 roughly 2 out of 100 restarted nodes. 
 {noformat}
 ERROR [main] 2013-11-08 14:40:13,535 CassandraDaemon.java (line 482) 
 Exception encountered during startup
 java.lang.AssertionError
 at 
 org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:41)
 at 
 org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37)
 at 
 org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118)
 at 
 org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:176)
 at 
 org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:44)
 at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:156)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.initRowCache(ColumnFamilyStore.java:444)
 at org.apache.cassandra.db.Table.open(Table.java:114)
 at org.apache.cassandra.db.Table.open(Table.java:87)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:278)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:465)
 {noformat}
 I have the files if there is any useful analysis that can be run.  Looked 
 'normal' to a cursory `less` inspection.
 Possibly related: CASSANDRA-4463



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-27 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227931#comment-14227931
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
--

Oh...I think I see what you mean. I only created the situation where nothing 
overlaps at each level - but I've done nothing to respect this rule about the 
target number of overlapping sstables between the levels. So, if I understand 
correctly, this will (or may - depending on how lucky I am) result in slower 
promotion of the sstables to the upper levels, right?

Yes, I was checking the logs carefully to see the result of my manipulations - 
the only error I saw was about the keys out-of-order in a single sstable file - 
this could not be caused by my re-leveling.

What I observe now is that the remaining ~280 pending compactions go very 
slowly, and there are quite a few sstables at level 0. Under normal traffic this 
number seems to hover around ~600 and is probably even increasing. Each 
compaction grabs some, but while it is working new ones get created :) I think 
new sstables get created a bit faster than they are compacted and promoted. 
Could it be due to bad leveling?

Regarding the initial set of L0 sstables...in my case I had 79 sstables with 
token ranges like (-9010847458915378120, 9190536470441980462). I believe they 
are original L0 sstables from other machines. I think for those there is no 
choice but to put them in L0, otherwise they would overlap with all other 
sstables.

I think I will try to implement that algorithm differently. So, just to confirm 
I get it right:

- no overlap allowed at any level except L0
- for each sstable at level N there should be no more than 10 at level N+1
- anything that does not fit goes to L0
- sstables with large token ranges have to go to L0 anyway

Interesting...these first two rules most likely create a number of different 
possible combinations. 
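
Just to make the first two rules checkable, a tiny illustrative helper (hypothetical 
types, not Cassandra code; an sstable is reduced to its [min, max] token bounds):

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only - an "sstable" here is just its token bounds. */
public final class LevelingCheck {

    record Range(long min, long max) {
        boolean overlaps(Range o) { return min() <= o.max() && o.min() <= max(); }
    }

    /** Rule 1: within a single level (L1+), no two sstables may overlap. */
    static boolean noOverlapWithinLevel(List<Range> level) {
        List<Range> sorted = new ArrayList<>(level);
        sorted.sort(Comparator.comparingLong(Range::min));
        for (int i = 1; i < sorted.size(); i++)
            if (sorted.get(i - 1).overlaps(sorted.get(i)))
                return false;
        return true;
    }

    /** Rule 2: each sstable at level N should overlap at most `fanout` (10) sstables at level N+1. */
    static boolean fanoutRespected(List<Range> levelN, List<Range> levelN1, int fanout) {
        for (Range r : levelN)
            if (levelN1.stream().filter(r::overlaps).count() > fanout)
                return false;
        return true;
    }

    private LevelingCheck() {}
}
{code}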

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226614#comment-14226614
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
--

I have attempted to write a simple prototype (very ugly :) ) of such a tool. I 
am very interested in it because I do suffer from that problem. In fact, 
without such a tool I simply cannot bootstrap a node. I have tried and the node 
*never* recovers. 

So, anyway, I have tried my prototype on a freshly bootstrapped node and it 
seems to be working. Instead of the initial ~7.5K pending compactions I got 
only about 600; a few hours later it was down to ~450 and still going down. 
cfstats also look quite good (to me ;) ):

{code}
SSTable count: 6311
SSTables in each level: [571/4, 10, 80, 1411/1000, 4239, 0, 0, 0, 0]
{code}

I do have some sstables at L0 because the node is taking normal (heavy) traffic 
at the same time. But this number is already down from the original ~700.

I think I could try to make the prototype tool less ugly and submit it here, if 
you do not mind.

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226744#comment-14226744
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
--

The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF and sort 
them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, and put it in the current level. Then I skip to 
the next one that does not intersect with these bounds, assign it to the 
current level and extend the bounds. And so on until the end of the list or 
until I use up all available size. Then I move to the lower level and repeat on 
the remaining sstables. And so on. The remainder goes to L0, where overlaps are 
allowed (right?).

I also had to come up with some logic to exclude the sstables that cover a large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the source node - they cover whatever was recently written into them, 
right? I exclude those from my logic and leave them for L0.

Or did I get it completely wrong?
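
Roughly, that logic as a compact sketch (hypothetical types and names; it assumes 
each sstable is reduced to its token bounds and on-disk size, and that the 
wide-range sstables described above have already been filtered out):

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only - hypothetical types, not Cassandra code. */
public final class GreedyReleveler {

    record SStable(long firstToken, long lastToken, long sizeBytes) {}

    /**
     * @param sstables   sstables to place, with wide-range ones already filtered out
     * @param maxLevel   highest level to target (derived from the total data size)
     * @param targetSize target size in bytes per level, indexed 1..maxLevel
     * @return index 0 = leftovers for L0 (overlaps allowed there), index i = sstables assigned to level i
     */
    static List<List<SStable>> relevel(List<SStable> sstables, int maxLevel, long[] targetSize) {
        List<SStable> remaining = new ArrayList<>(sstables);
        remaining.sort(Comparator.comparingLong(SStable::firstToken));   // sort by left bound
        List<List<SStable>> levels = new ArrayList<>();
        for (int i = 0; i <= maxLevel; i++)
            levels.add(new ArrayList<>());

        for (int level = maxLevel; level >= 1; level--) {                // start from the highest level
            long used = 0;
            Long rightEdge = null;            // right bound of the last sstable placed at this level
            List<SStable> next = new ArrayList<>();
            for (SStable s : remaining) {
                boolean noOverlap = rightEdge == null || s.firstToken() > rightEdge;
                if (noOverlap && used + s.sizeBytes() <= targetSize[level]) {
                    levels.get(level).add(s); // assign to the current level and extend the bounds
                    used += s.sizeBytes();
                    rightEdge = s.lastToken();
                } else {
                    next.add(s);              // try again at a lower level
                }
            }
            remaining = next;
        }
        levels.get(0).addAll(remaining);      // whatever did not fit goes to L0
        return levels;
    }

    private GreedyReleveler() {}
}
{code}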

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a decent sstable leveling

2014-11-26 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226744#comment-14226744
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-8301 at 11/26/14 8:04 PM:


The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF and sort 
them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, and put it in the current level. Then I skip to 
the next one that does not intersect with these bounds, assign it to the 
current level and extend the bounds. And so on until the end of the list or 
until I use up all available size. Then I move to the lower level and repeat on 
the remaining sstables. And so on. The remainder goes to L0, where overlaps are 
allowed (right?).

I also had to come up with some logic to exclude the sstables that cover a large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the original node - they cover whatever was recently written into them, 
right? I exclude those from my logic and leave them for L0.

Or did I get it completely wrong?


was (Author: ngrigor...@gmail.com):
The logic I have built is very simple. And probably has some fundamental flaws 
:)

First I calculate the target size for each level (in bytes) to accommodate all 
my data - i.e. to distribute the total size of all my sstables. This also gives 
me the maximum level to target. Then I take all sstables for the given CF and sort 
them by the beginning (left) of their bounds. Then I start from the highest 
level (L4 in my example) and iterate over that list of sstables. I grab the 
first sstable, remember its bounds, and put it in the current level. Then I skip to 
the next one that does not intersect with these bounds, assign it to the 
current level and extend the bounds. And so on until the end of the list or 
until I use up all available size. Then I move to the lower level and repeat on 
the remaining sstables. And so on. The remainder goes to L0, where overlaps are 
allowed (right?).

I also had to come up with some logic to exclude the sstables that cover a large 
range of tokens. Most likely these are the ones that were recently written at 
L0 on the source node - they cover whatever was recently written into them, 
right? I exclude those from my logic and leave them for L0.

Or did I get it completely wrong?

 Create a tool that given a bunch of sstables creates a decent sstable 
 leveling
 

 Key: CASSANDRA-8301
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson

 In old versions of cassandra (i.e. not trunk/3.0), when bootstrapping a new 
 node, you will end up with a ton of files in L0 and it might be extremely 
 painful to get LCS to compact into a new leveling
 We could probably exploit the fact that we have many non-overlapping sstables 
 in L0, and offline-bump those sstables into higher levels. It does not need 
 to be perfect, just get the majority of the data into L1+ without creating 
 overlaps.
 So, suggestion is to create an offline tool that looks at the range each 
 sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-11-24 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223036#comment-14223036
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

I have recently realized that there may be a relatively cheap (operationally and 
development-wise) workaround for that limitation. It would also partially 
address the problem with bootstrapping a new node. The root cause of all this is 
a large amount of data in a single CF on a single node when using LCS for that 
CF. The performance of a single compaction task running on a single thread is 
limited anyway. One of the obvious ways to break this limitation is to shard 
the data across multiple clones of that CF at the application level - 
something as dumb as taking the row key hash mod X and appending that suffix to the 
CF name. In my case it looks like X=4 would be more than enough to solve the problem.
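
Just to illustrate the idea (the naming scheme, shard count and helper are 
assumptions, not an existing API):

{code}
/** Illustrative only - the naming scheme, shard count and helper are assumptions. */
public final class CfSharding {

    private static final int SHARDS = 4;   // X = 4 in my case

    /** Derive the CF "clone" name for a given row key, e.g. mytable -> mytable_0 .. mytable_3. */
    static String shardedTable(String baseTable, String rowKey) {
        int shard = Math.floorMod(rowKey.hashCode(), SHARDS);   // dumb row-key-hash mod X
        return baseTable + "_" + shard;
    }

    public static void main(String[] args) {
        // the application then reads/writes this key only through its shard table
        System.out.println(shardedTable("mytable", "some-row-key"));
    }
}
{code}

Reads and writes for a given key always go to the same clone, so each clone holds 
roughly 1/X of the data per node and its compactions stay proportionally smaller.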

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating a new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
    compaction type   keyspace         table     completed           total   unit   progress
         Compaction       myks   table_list1   66499295588   1910515889913   bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name **table_list1**Data** | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sized - between few dozens of Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots 

[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-11-12 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208154#comment-14208154
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

I had to rebuild one of the nodes in that test cluster. After bootstrapping it 
I checked the results - I had over 6.5K pending compactions and many large 
sstables (between a few Gb and 40-60Gb). I knew that under traffic this would 
*never* return to a reasonable number of pending compactions.

I decided to give it another try, enable the option from CASSANDRA-6621 and 
re-bootstrap. This time I did not end up with huge sstables but, I think, the 
node will also never recover. This is, essentially, what it does most of the 
time:

{code}
pending tasks: 7217
  compaction type   keyspace      table    completed         total   unit   progress
       Compaction       myks   mytable1   5434997373   10667184206   bytes     50.95%
       Compaction       myks   mytable2   1080506914    7466286503   bytes     14.47%
Active compaction remaining time :   0h00m09s
{code}

while:

{code}
# nodetool cfstats myks.mytable1
Keyspace: myks
Read Count: 49783
Read Latency: 38.612470602414476 ms.
Write Count: 521971
Write Latency: 1.3617571608384373 ms.
Pending Tasks: 0
Table: mytable1
SSTable count: 7893
SSTables in each level: [7828/4, 10, 56, 0, 0, 0, 0, 0, 0]
Space used (live), bytes: 1181508730955
Space used (total), bytes: 1181509085659
SSTable Compression Ratio: 0.3068450302663634
Number of keys (estimate): 28180352
Memtable cell count: 153554
Memtable data size, bytes: 41190431
Memtable switch count: 178
Local read count: 49826
Local read latency: 38.886 ms
Local write count: 522464
Local write latency: 1.392 ms
Pending tasks: 0
Bloom filter false positives: 11802553
Bloom filter false ratio: 0.98767
Bloom filter space used, bytes: 17686928
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 3379391
Compacted partition mean bytes: 142171
Average live cells per slice (last five minutes): 537.5
Average tombstones per slice (last five minutes): 0.0
{code}

By the way, this is the picture from another node that functions normally:

{code}
# nodetool cfstats myks.mytable1
Keyspace: myks
Read Count: 4638154
Read Latency: 20.784106776316612 ms.
Write Count: 15067667
Write Latency: 1.7291775639188205 ms.
Pending Tasks: 0
Table: mytable1
SSTable count: 4561
SSTables in each level: [37/4, 15/10, 106/100, 1053/1000, 3350, 0, 0, 0, 0]
Space used (live), bytes: 1129716897255
Space used (total), bytes: 1129752918759
SSTable Compression Ratio: 0.33488717551698993
Number of keys (estimate): 25036672
Memtable cell count: 334212
Memtable data size, bytes: 115610737
Memtable switch count: 4476
Local read count: 4638155
Local read latency: 20.784 ms
Local write count: 15067679
Local write latency: 1.729 ms
Pending tasks: 0
Bloom filter false positives: 104377
Bloom filter false ratio: 0.59542
Bloom filter space used, bytes: 20319608
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 3379391
Compacted partition mean bytes: 152368
Average live cells per slice (last five minutes): 529.5
Average tombstones per slice (last five minutes): 0.0
{code}

So not only has the streaming created an excessive number of sstables, the 
compactions are not advancing at all. In fact, the number of pending 
compactions grows slowly on that (first) node. New L0 sstables keep getting added 
because write activity is taking place.

Just simple math: if I take the compaction throughput of the node when it 
uses only one thread and compare it to my write rate, the latter is roughly 
4x the former. Under these conditions this node will never recover - despite 
having plenty of resources and very fast I/O.
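
To spell that math out (illustrative numbers only; the single-thread compaction 
throughput is an assumption, the ~4x ratio is what I observe):

{code}
/** Illustrative numbers only - the point is the ratio, not the absolute values. */
public final class CompactionBacklog {
    public static void main(String[] args) {
        double compactionMBps = 25;                    // assumed single-thread compaction throughput
        double writeMBps = 4 * compactionMBps;         // observed: write rate ~4x the compaction rate
        double backlogGrowthMBps = writeMBps - compactionMBps;
        System.out.printf("Backlog grows by ~%.0f MB/s, i.e. ~%.0f GB per hour%n",
                          backlogGrowthMBps, backlogGrowthMBps * 3600 / 1024);
    }
}
{code}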

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949

[jira] [Commented] (CASSANDRA-8211) Overlapping sstables in L1+

2014-11-10 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204994#comment-14204994
 ] 

Nikolai Grigoriev commented on CASSANDRA-8211:
--

Could it happen at any level? I did use sstablesplit in my cluster and I have 
recently spotted a number of messages like:

{code}
system.log.1: WARN [main] 2014-11-09 03:32:17,434 LeveledManifest.java (line 
164) At level 1, 
SSTableReader(path='/cassandra-data/disk2/myks/mytable1/myks-mytable1-jb-200217-Data.db')
 [DecoratedKey(-163275977074170, 
001001164335100116433510400100),
 DecoratedKey(2116162112767472431, 
001001432a4c1001432a4c10400100)]
 overlaps 
SSTableReader(path='/cassandra-data/disk3/myks/mytable1/myks-mytable1-jb-200215-Data.db')
 [DecoratedKey(665029536263181199, 
0010052d6e8d10052d6e8d10400100),
 DecoratedKey(1008355148187355376, 
001001135f971001135f9710400100)].
  This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact 
that you have dropped sstables from another node into the data directory. 
Sending back to L0.  If you didn’t drop in sstables, and have not yet run 
scrub, you should do so since you may also have rows out-of-order within an 
sstable
{code}

 Overlapping sstables in L1+
 ---

 Key: CASSANDRA-8211
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8211
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 2.0.12

 Attachments: 0001-Avoid-overlaps-in-L1-v2.patch, 
 0001-Avoid-overlaps-in-L1.patch


 Seems we have a bug that can create overlapping sstables in L1:
 {code}
 WARN [main] 2014-10-28 04:09:42,295 LeveledManifest.java (line 164) At level 
 2, SSTableReader(path='sstable') [DecoratedKey(2838397575996053472, 00
 10066059b210066059b210400100),
  DecoratedKey(5516674013223138308, 
 001000ff2d161000ff2d160
 00010400100)] overlaps 
 SSTableReader(path='sstable') [DecoratedKey(2839992722300822584, 
 0010
 00229ad21000229ad210400100),
  DecoratedKey(5532836928694021724, 
 0010034b05a610034b05a6100
 000400100)].  This could be caused by a bug in 
 Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables 
 from another node into the data directory. Sending back to L0.  If
  you didn't drop in sstables, and have not yet run scrub, you should do so 
 since you may also have rows out-of-order within an sstable
 {code}
 Which might manifest itself during compaction with this exception:
 {code}
 ERROR [CompactionExecutor:3152] 2014-10-28 00:24:06,134 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:3152,1,main]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(5516674013223138308, 
 001000ff2d161000ff2d1610400100)
 >= current key DecoratedKey(2839992722300822584, 
 001000229ad21000229ad210400100)
  writing into sstable
 {code}
 since we use LeveledScanner when compacting (the backing sstable scanner 
 might go beyond the start of the next sstable scanner)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-11-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203751#comment-14203751
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Here is another extreme (but, unfortunately, real) example of LCS going a bit 
crazy.

{code}
# nodetool cfstats myks.mytable
Keyspace: myks
Read Count: 3006212
Read Latency: 21.02595119106703 ms.
Write Count: 11226340
Write Latency: 1.8405579886231844 ms.
Pending Tasks: 0
Table: wm_contacts
SSTable count: 6530
SSTables in each level: [2369/4, 10, 104/100, 1043/1000, 3004, 0, 0, 0, 0]
Space used (live), bytes: 1113384288740
Space used (total), bytes: 1113406795020
SSTable Compression Ratio: 0.3307170610260717
Number of keys (estimate): 26294144
Memtable cell count: 782994
Memtable data size, bytes: 213472460
Memtable switch count: 3493
Local read count: 3006239
Local read latency: 21.026 ms
Local write count: 11226517
Local write latency: 1.841 ms
Pending tasks: 0
Bloom filter false positives: 41835779
Bloom filter false ratio: 0.97500
Bloom filter space used, bytes: 19666944
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 3379391
Compacted partition mean bytes: 139451
Average live cells per slice (last five minutes): 444.0
Average tombstones per slice (last five minutes): 0.0
{code}

{code}
# nodetool compactionstats
pending tasks: 190
  compaction type   keyspace      table    completed         total   unit   progress
       Compaction       myks   mytable2   7198353690    7446734394   bytes     96.66%
       Compaction       myks   mytable    4851429651   10717052513   bytes     45.27%
Active compaction remaining time :   0h00m04s
{code}


Note the cfstats. The number of sstables at L0 is insane. Yet C* is sitting 
quietly compacting the data using 2 cores out of 32.

Once it gets into this state I immediately start seeing large sstables forming 
- instead of 256Mb, sstables of 1-2Gb and more start appearing. And that 
creates the snowball effect.



 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating a new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many 

[jira] [Commented] (CASSANDRA-7108) Enabling the Repair Service in OpsCenter generates imprecise repair errors

2014-10-31 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191738#comment-14191738
 ] 

Nikolai Grigoriev commented on CASSANDRA-7108:
--

Just wanted to add my $.02 - I am experiencing the identical issue. I suspect this 
issue (at least on my side) results in snapshot leftovers in the cluster, 
which lead to higher disk usage until I clean them up manually.

 Enabling the Repair Service in OpsCenter generates imprecise repair errors
 

 Key: CASSANDRA-7108
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7108
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 12.04, 12.10, 14.04
 DSE version: 4.0.0
 Cassandra version: 2.0.5.x (x = multiple, e.g. 22, 24)
Reporter: nayden kolev

 Enabling the Repair Service in OpsCenter seems to trigger an error on every 
 node, logged every few minutes (sample below). This does not happen if a 
 nodetool repair keyspace command is issued. I have been able to reproduce 
 it on 4 separate clusters over the past month or so, all of them running the 
 latest DSE and Cassandra (2.0.5+)
 Error logged
  INFO [RMI TCP Connection(1350)-127.0.0.1] 2014-04-29 18:22:17,705 
 StorageService.java (line 2539) Starting repair command #6311, repairing 1 
 ranges for keyspace OpsCenter
 ERROR [RMI TCP Connection(1350)-127.0.0.1] 2014-04-29 18:22:17,710 
 StorageService.java (line 2560) Repair session failed:
 java.lang.IllegalArgumentException: Requested range intersects a local range 
 but is not fully contained in one; this would lead to imprecise repair
 at 
 org.apache.cassandra.service.ActiveRepairService.getNeighbors(ActiveRepairService.java:164)
 at 
 org.apache.cassandra.repair.RepairSession.init(RepairSession.java:128)
 at 
 org.apache.cassandra.repair.RepairSession.init(RepairSession.java:117)
 at 
 org.apache.cassandra.service.ActiveRepairService.submitRepairSession(ActiveRepairService.java:97)
 at 
 org.apache.cassandra.service.StorageService.forceKeyspaceRepair(StorageService.java:2620)
 at 
 org.apache.cassandra.service.StorageService$5.runMayThrow(StorageService.java:2556)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 org.apache.cassandra.service.StorageService.forceKeyspaceRepairRange(StorageService.java:2519)
 at 
 org.apache.cassandra.service.StorageService.forceKeyspaceRepairRange(StorageService.java:2512)
 at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
 at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
 at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
 at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
 at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
 at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
 at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
 at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
 at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
 at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
 at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
 at 

[jira] [Commented] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188364#comment-14188364
 ] 

Nikolai Grigoriev commented on CASSANDRA-8190:
--

[~krummas]

Marcus, believe me I do not really enjoy hitting this weird stuff lately ;)

Most of the background is in CASSANDRA-7949 (the one you have marked resolved, 
although I am not sure I fully agree with that resolution).

The only detail I would add to CASSANDRA-7949 is that finally, after ~3 weeks, 
the cluster has managed to finish all compactions. 3 weeks to compact the data 
created in ~4 days. In between I lost patience, stopped it and ran 
sstablesplit on all large sstables (anything larger than 1Gb) on each node. 
Then I started the nodes one by one once they were done with the split. Upon 
restart each node had between ~2K and 7K compactions to complete, and I had to 
let them finish. Along the way I have seen these errors on different nodes at 
different times - so I reported them.

Last night the last node finished its compactions. I've been scrubbing 
each node after the compactions were done to make sure the data integrity is 
not broken. Now I am about to restart the load that updates and fetches the 
data. We are doing some kind of modelling of our real data, a capacity 
exercise to determine the size of the production cluster.

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
Assignee: Marcus Eriksson
 Attachments: jstack.txt.gz, system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
    compaction type   keyspace   table   completed   total   unit   progress
 Active compaction remaining time :        n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188364#comment-14188364
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-8190 at 10/29/14 2:09 PM:


[~krummas]

Marcus, believe me I do not really enjoy hitting this weird stuff lately ;)

Most of the background is in CASSANDRA-7949 (the one you have marked resolved, 
although I am not sure I fully agree with that resolution).

The only detail I would add to CASSANDRA-7949 is that finally, after ~3 weeks, 
the cluster has managed to finish all compactions. 3 weeks to compact the data 
created in ~4 days. In between I lost patience, stopped it and ran 
sstablesplit on all large sstables (anything larger than 1Gb) on each node. 
Then I started the nodes one by one once they were done with the split. Upon 
restart each node had between ~2K and 7K compactions to complete, and I had to 
let them finish. Along the way I have seen these errors on different nodes at 
different times - so I reported them. My goal was to get the system to a state 
with no pending compactions and all sstables close to the target size. This is 
why I used the flag from CASSANDRA-6621 (cassandra.disable_stcs_in_l0), 
otherwise the cluster would stay in an unusable state forever.

Last night the last node finished its compactions. I've been scrubbing 
each node after the compactions were done to make sure the data integrity is 
not broken. Now I am about to restart the load that updates and fetches the 
data. We are doing some kind of modelling of our real data, a capacity 
exercise to determine the size of the production cluster.


was (Author: ngrigor...@gmail.com):
[~krummas]

Marcus, believe me I do not really enjoy hitting this weird stuff lately ;)

Most of the background is in CASSANDRA-7949 (the one you have marked resolved, 
although I am not sure I fully agree with that resolution).

The only detail I would add to CASSANDRA-7949 is that finally, after ~3 weeks, 
the cluster has managed to finish all compactions. 3 weeks to compact the data 
created in ~4 days. In between I lost patience, stopped it and ran 
sstablesplit on all large sstables (anything larger than 1Gb) on each node. 
Then I started the nodes one by one once they were done with the split. Upon 
restart each node had between ~2K and 7K compactions to complete, and I had to 
let them finish. Along the way I have seen these errors on different nodes at 
different times - so I reported them.

Last night the last node finished its compactions. I've been scrubbing 
each node after the compactions were done to make sure the data integrity is 
not broken. Now I am about to restart the load that updates and fetches the 
data. We are doing some kind of modelling of our real data, a capacity 
exercise to determine the size of the production cluster.

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
Assignee: Marcus Eriksson
 Attachments: jstack.txt.gz, system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 

[jira] [Comment Edited] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188364#comment-14188364
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-8190 at 10/29/14 2:10 PM:


[~krummas]

Marcus, believe me I do not really enjoy hitting this weird stuff lately ;)

Most of the background is in CASSANDRA-7949 (the one you have marked resolved, 
although I am not sure I fully agree with that resolution).

The only detail I would add to CASSANDRA-7949 is that finally, after ~3 weeks, 
the cluster has managed to finish all compactions. 3 weeks to compact the data 
created in ~4 days. In between I lost patience, stopped it and ran 
sstablesplit on all large sstables (anything larger than 1Gb) on each node. 
Then I started the nodes one by one once they were done with the split. Upon 
restart each node had between ~2K and 7K compactions to complete, and I had to 
let them finish. Along the way I have seen these errors on different nodes at 
different times - so I reported them. My goal was to get the system to a state 
with no pending compactions and all sstables close to the target size. This is 
why I used the flag from CASSANDRA-6621 (cassandra.disable_stcs_in_l0), 
otherwise the cluster would stay in an unusable state forever.

Last night the last node finished its compactions. I've been scrubbing 
each node after the compactions were done to make sure the data integrity is 
not broken. Now I am about to restart the load that updates and fetches the 
data. We are doing some kind of modelling of our real data, a capacity 
exercise to determine the size of the production cluster.

Note that the configuration I am attaching was modified a bit to attempt to 
speed up compactions. There was not too much to tune, but still - like setting the 
compaction throughput limit to 0, etc.


was (Author: ngrigor...@gmail.com):
[~krummas]

Marcus, believe me I do not really enjoy hitting this weird stuff lately ;)

Most of the background is in CASSANDRA-7949 (the one you have marked resolved, 
although I am not sure I fully agree with that resolution).

The only detail I would add to CASSANDRA-7949 is that finally, after ~3 weeks, 
the cluster has managed to finish all compactions. 3 weeks to compact the data 
created in ~4 days. In between I lost patience, stopped it and ran 
sstablesplit on all large sstables (anything larger than 1Gb) on each node. 
Then I started the nodes one by one once they were done with the split. Upon 
restart each node had between ~2K and 7K compactions to complete, and I had to 
let them finish. Along the way I have seen these errors on different nodes at 
different times - so I reported them. My goal was to get the system to a state 
with no pending compactions and all sstables close to the target size. This is 
why I used the flag from CASSANDRA-6621 (cassandra.disable_stcs_in_l0), 
otherwise the cluster would stay in an unusable state forever.

Last night the last node finished its compactions. I've been scrubbing 
each node after the compactions were done to make sure the data integrity is 
not broken. Now I am about to restart the load that updates and fetches the 
data. We are doing some kind of modelling of our real data, a capacity 
exercise to determine the size of the production cluster.

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
Assignee: Marcus Eriksson
 Attachments: cassandra-env.sh, cassandra.yaml, jstack.txt.gz, 
 system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 

[jira] [Updated] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8190:
-
Attachment: cassandra.yaml
cassandra-env.sh

config files

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
Assignee: Marcus Eriksson
 Attachments: cassandra-env.sh, cassandra.yaml, jstack.txt.gz, 
 system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
   compaction type   keyspace   table   completed   total   unit   progress
 Active compaction remaining time :n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8190:
-
Attachment: system.log.gz

a sample log

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
Assignee: Marcus Eriksson
 Attachments: cassandra-env.sh, cassandra.yaml, jstack.txt.gz, 
 system.log.gz, system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
   compaction type   keyspace   table   completed   total   unit   progress
 Active compaction remaining time :n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-5256:
-
Attachment: cassandra.yaml
cassandra-env.sh

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt, 
 cassandra-env.sh, cassandra.yaml, occurence frequency.png


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:158)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:71)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:342)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 ---
 I've invoked the `nodetool compact` three times; this occurred after each. 
 The node has been up for a couple days accepting writes and has not been 
 restarted.
 Here's the server's log since it was started a few days ago: 
 

[jira] [Updated] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-5256:
-
Attachment: occurence frequency.png

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt, 
 cassandra-env.sh, cassandra.yaml, occurence frequency.png


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:158)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:71)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:342)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 ---
 I've invoked the `nodetool compact` three times; this occurred after each. 
 The node has been up for a couple days accepting writes and has not been 
 restarted.
 Here's the server's log since it was started a few days ago: 
 https://gist.github.com/cscotta/4956472/raw/95e7cbc68de1aefaeca11812cbb98d5d46f534e8/cassandra.log
 Here's the code 

[jira] [Created] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-8210:


 Summary: java.lang.AssertionError: Memory was freed exception in 
CompactionExecutor
 Key: CASSANDRA-8210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 
3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01
Reporter: Nikolai Grigoriev
Priority: Minor


I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). 
After looking through the history I have found that it was actually happening 
on all nodes since the start of the large compaction process (I've loaded tons of 
data in the system and then turned off all load to let it compact the data).

{code}
ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:1196,1,main]
java.lang.AssertionError: Memory was freed
at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
at 
org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
at 
org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84)
at 
org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58)
at 
org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692)
at 
org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663)
at 
org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}
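
(A quick way to see how often it has been happening on a node - a sketch; the log 
path is the packaged default and may differ in your install:)

{code}
# count the assertion errors across the current and rotated system logs
zgrep -c 'AssertionError: Memory was freed' /var/log/cassandra/system.log*
{code}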



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8210:
-
Attachment: cassandra.yaml
cassandra-env.sh
occurence frequency.png

 java.lang.AssertionError: Memory was freed exception in CompactionExecutor
 

 Key: CASSANDRA-8210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 
 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01
Reporter: Nikolai Grigoriev
Priority: Minor
 Attachments: cassandra-env.sh, cassandra.yaml, occurence frequency.png


 I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). 
 After looking through the history I have found that it was actually happening 
 on all nodes since the start of the large compaction process (I've loaded tons of 
 data in the system and then turned off all load to let it compact the data).
 {code}
 ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main]
 java.lang.AssertionError: Memory was freed
 at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
 at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-5256:
-
Attachment: (was: occurence frequency.png)

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:158)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:71)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:342)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 ---
 I've invoked the `nodetool compact` three times; this occurred after each. 
 The node has been up for a couple days accepting writes and has not been 
 restarted.
 Here's the server's log since it was started a few days ago: 
 https://gist.github.com/cscotta/4956472/raw/95e7cbc68de1aefaeca11812cbb98d5d46f534e8/cassandra.log
 Here's the code being used to issue writes to the datastore: 
 

[jira] [Updated] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-5256:
-
Attachment: (was: cassandra.yaml)

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:158)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:71)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:342)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 ---
 I've invoked the `nodetool compact` three times; this occurred after each. 
 The node has been up for a couple days accepting writes and has not been 
 restarted.
 Here's the server's log since it was started a few days ago: 
 https://gist.github.com/cscotta/4956472/raw/95e7cbc68de1aefaeca11812cbb98d5d46f534e8/cassandra.log
 Here's the code being used to issue writes to the datastore: 
 

[jira] [Updated] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-5256:
-
Attachment: (was: cassandra-env.sh)

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
   at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
   at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:158)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:71)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:342)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 ---
 I've invoked the `nodetool compact` three times; this occurred after each. 
 The node has been up for a couple days accepting writes and has not been 
 restarted.
 Here's the server's log since it was started a few days ago: 
 https://gist.github.com/cscotta/4956472/raw/95e7cbc68de1aefaeca11812cbb98d5d46f534e8/cassandra.log
 Here's the code being used to issue writes to the datastore: 
 

[jira] [Updated] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8210:
-
Attachment: system.log.gz

 java.lang.AssertionError: Memory was freed exception in CompactionExecutor
 

 Key: CASSANDRA-8210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 
 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01
Reporter: Nikolai Grigoriev
Priority: Minor
 Attachments: cassandra-env.sh, cassandra.yaml, occurence 
 frequency.png, system.log.gz


 I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). 
 After looking through the history I have found that it was actually happening 
 on all nodes since the start of the large compaction process (I've loaded tons of 
 data in the system and then turned off all load to let it compact the data).
 {code}
 ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main]
 java.lang.AssertionError: Memory was freed
 at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
 at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor

2014-10-29 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188456#comment-14188456
 ] 

Nikolai Grigoriev commented on CASSANDRA-8210:
--

Opened a new ticket as per [~jbellis]'s recommendation, in response to my comment 
on CASSANDRA-5256

 java.lang.AssertionError: Memory was freed exception in CompactionExecutor
 

 Key: CASSANDRA-8210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 
 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01
Reporter: Nikolai Grigoriev
Priority: Minor
 Attachments: cassandra-env.sh, cassandra.yaml, occurence 
 frequency.png, system.log.gz


 I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). 
 After looking through the history I have found that it was actually happening 
 on all nodes since the start of the large compaction process (I've loaded tons of 
 data in the system and then turned off all load to let it compact the data).
 {code}
 ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main]
 java.lang.AssertionError: Memory was freed
 at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
 at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84)
 at 
 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125)
 at 
 org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-5256) Memory was freed AssertionError During Major Compaction

2014-10-28 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187189#comment-14187189
 ] 

Nikolai Grigoriev commented on CASSANDRA-5256:
--

I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). 
Should I reopen?

{code}
ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:1196,1,main]
java.lang.AssertionError: Memory was freed
at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
at 
org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
at 
org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84)
at 
org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58)
at 
org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692)
at 
org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663)
at 
org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

 Memory was freed AssertionError During Major Compaction
 -

 Key: CASSANDRA-5256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5256
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Linux ashbdrytest01p 3.2.0-37-generic #58-Ubuntu SMP Thu 
 Jan 24 15:28:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_30
 Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
 Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
 Ubuntu 12.04.2 LTS
Reporter: C. Scott Andreas
Assignee: Jonathan Ellis
Priority: Critical
  Labels: compaction
 Fix For: 1.2.2

 Attachments: 5256-v2.txt, 5256-v4.txt, 5256-v5.txt, 5256.txt


 When initiating a major compaction with `./nodetool -h localhost compact`, an 
 AssertionError is thrown in the CompactionExecutor from o.a.c.io.util.Memory:
 ERROR [CompactionExecutor:41495] 2013-02-14 14:38:35,720 CassandraDaemon.java 
 (line 133) Exception in thread Thread[CompactionExecutor:41495,1,RMI Runtime]
 java.lang.AssertionError: Memory was freed
   at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:146)
   at org.apache.cassandra.io.util.Memory.getLong(Memory.java:116)
   at 
 org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:176)
   at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:88)
   at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:327)
   at java.io.RandomAccessFile.readInt(RandomAccessFile.java:755)
   at java.io.RandomAccessFile.readLong(RandomAccessFile.java:792)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:114)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:101)
   at 
 org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:235)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:109)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:93)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:162)
   at 
 

[jira] [Commented] (CASSANDRA-8167) sstablesplit tool can be made much faster with few JVM settings

2014-10-27 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185573#comment-14185573
 ] 

Nikolai Grigoriev commented on CASSANDRA-8167:
--

No, unfortunately, I did not think about capturing it :( I only saved the stack 
trace. I can share my cassandra.yaml if needed. Plus - I was splitting the 
sstables for a table that has relatively wide rows. Not necessarily in terms of 
the number of columns, but size-wise there are a few rows there that may be up to 
3.5Mb (the average row size is about 140Kb for that table).
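
(Those numbers come straight from the table statistics - a sketch, with 
keyspace/table names as placeholders:)

{code}
# mean/max compacted row sizes are reported per table
nodetool cfstats myks.mytable | grep -i 'row.*size'
# the full distribution, if needed
nodetool cfhistograms myks mytable
{code}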

 sstablesplit tool can be made much faster with few JVM settings
 ---

 Key: CASSANDRA-8167
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8167
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Nikolai Grigoriev
Priority: Trivial

 I had to use the sstablesplit tool intensively to split some really huge 
 sstables. The tool is painfully slow as it does the compaction in a single 
 thread.
 I have just found that on one of my machines the tool has crashed when I was 
 almost done with a 152Gb sstable (!!!). 
 {code}
  INFO 16:59:22,342 Writing Memtable-compactions_in_progress@1948660572(0/0 
 serialized/live bytes, 1 ops)
  INFO 16:59:22,352 Completed flushing 
 /cassandra-data/disk1/system/compactions_in_progress/system-compactions_in_progress-jb-79242-Data.db
  (42 bytes) for commitlog position ReplayPosition(segmentId=1413904450653, 
 position=69178)
 Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
 exceeded
 at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:586)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
 at 
 org.apache.cassandra.db.RangeTombstoneList$InOrderTester.isDeleted(RangeTombstoneList.java:751)
 at 
 org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:422)
 at 
 org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:403)
 at 
 org.apache.cassandra.db.ColumnFamily.hasIrrelevantData(ColumnFamily.java:489)
 at 
 org.apache.cassandra.db.compaction.PrecompactedRow.removeDeleted(PrecompactedRow.java:66)
 at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
 at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
 at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
 at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
 at 
 org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:204)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:154)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.SSTableSplitter.split(SSTableSplitter.java:38)
 at 
 org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:150)
 {code}
 This has triggered my desire to see what memory settings are used for the JVM 
 running the tool... and I have found that it runs with the default Java settings 
 (no settings at all).
 I have tried to apply the settings from C* itself and this resulted in an over 
 40% speed increase: it went from ~5Mb/s to ~7Mb/s, measured on the compressed 
 output. I believe this is mostly due to concurrent GC. I see my CPU 
 usage has increased to ~200%. But this is fine, this is an offline tool, the 
 node is down anyway. I know that concurrent GC (at least something like 
 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled) 
 normally improves the performance of even primitive single-threaded 
 heap-intensive Java programs.
 I think it should be acceptable to apply the server JVM settings to this tool.
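 For illustration, applying them could look roughly like this (a sketch only: 
 whether the wrapper script picks up an environment variable such as JVM_OPTS is 
 an assumption, and the heap size, split size and paths are just examples):
 {code}
 # server-like GC settings for the standalone tool (illustrative values)
 export JVM_OPTS="-Xmx8G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled"
 # split the large sstables into ~256Mb pieces; paths are examples
 sstablesplit --no-snapshot -s 256 /cassandra-data/disk1/myks/mytable/myks-mytable-jb-*-Data.db
 {code}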



[jira] [Updated] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-27 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8190:
-
Attachment: jstack.txt.gz

Captured when the node was in this state: pending compactions and no 
compactions active.

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
 Attachments: jstack.txt.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
   compaction type   keyspace   table   completed   total   unit   progress
 Active compaction remaining time :n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-27 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8190:
-
Attachment: system.log.gz

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
 Attachments: jstack.txt.gz, system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
 400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
   compaction type   keyspace   table   completed   total   unit   progress
 Active compaction remaining time :n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-27 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186231#comment-14186231
 ] 

Nikolai Grigoriev commented on CASSANDRA-8190:
--

This happens quite often. I have captured the thread dump and the server log (both 
attached) when I hit this issue again on one of the nodes.

{code}
pending tasks: 601
Active compaction remaining time :n/a
{code}

 Compactions stop completely because of RuntimeException in CompactionExecutor
 -

 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev
 Attachments: jstack.txt.gz, system.log.gz


 I have a cluster that is recovering from being overloaded with writes.  I am 
 using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which 
 is killing the cluster - see CASSANDRA-7949). 
 I have observed that after one or more exceptions like this
 {code}
 ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 the node completely stops the compactions and I end up in the state like this:
 {code}
 # nodetool compactionstats
 pending tasks: 1288
   compaction type        keyspace           table       completed           total      unit  progress
 Active compaction remaining time :        n/a
 {code}
 The node recovers if restarted and starts compactions - until getting more 
 exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor

2014-10-26 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-8190:


 Summary: Compactions stop completely because of RuntimeException 
in CompactionExecutor
 Key: CASSANDRA-8190
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev


I have a cluster that is recovering from being overloaded with writes.  I am 
using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which is 
killing the cluster - see CASSANDRA-7949). 

I have observed that after one or more exceptions like this

{code}
ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
0010033523da10033523da10
400100) >= current key DecoratedKey(-8778432288598355336, 
0010040c7a8f10040c7a8f10
400100) writing into 
/cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

the node completely stops the compactions and I end up in the state like this:

{code}
# nodetool compactionstats
pending tasks: 1288
  compaction type        keyspace           table       completed           total      unit  progress
Active compaction remaining time :        n/a
{code}

The node recovers if restarted and starts compactions - until getting more 
exceptions like this.
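
For illustration, the stuck state described above can usually be spotted from the 
command line roughly like this (a sketch only; the restart command depends on the 
packaging, a DSE service install is assumed here):

{code}
# Illustrative check for the stuck state described above
$ nodetool compactionstats                 # pending tasks stay high, nothing active
$ nodetool tpstats | grep -i compaction    # CompactionExecutor shows no active tasks
# Workaround described above: restart the node (service name is packaging-specific)
$ sudo service dse restart
{code}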




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8191) After sstablesplit all nodes log RuntimeException in CompactionExecutor (Last written key DecoratedKey... >= current key DecoratedKey)

2014-10-26 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-8191:
-
Summary: After sstablesplit all nodes log RuntimeException in 
CompactionExecutor (Last written key DecoratedKey... >= current key 
DecoratedKey)  (was: All nodes log RuntimeException in CompactionExecutor (Last 
written key DecoratedKey... >= current key DecoratedKey))

 After sstablesplit all nodes log RuntimeException in CompactionExecutor (Last 
 written key DecoratedKey... >= current key DecoratedKey)
 --

 Key: CASSANDRA-8191
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8191
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev

 While recovering the cluster from CASSANDRA-7949 (using the flag from 
 CASSANDRA-6621) I had to use sstablesplit tool to split large sstables. Nodes 
 were off while using this tool and only one sstablesplit instance was 
 running, of course. 
 After splitting was done I have restarted the nodes and they all started 
 compacting the data. All the nodes are logging the exceptions like this:
 {code}
 ERROR [CompactionExecutor:4028] 2014-10-26 23:14:52,653 CassandraDaemon.java 
 (line 199) Exception in thread Thread[CompactionExecutor:4028,1,main]
 java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
 0010033523da10033523da10
400100) >= current key DecoratedKey(-8778432288598355336, 
 0010040c7a8f10040c7a8f10
 400100) writing into 
 /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130525-Data.db
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}
 It seems that scrubbing helps but scrubbing blocks the compactions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8191) All nodes log RuntimeException in CompactionExecutor (Last written key DecoratedKey... >= current key DecoratedKey)

2014-10-26 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-8191:


 Summary: All nodes log RuntimeException in CompactionExecutor 
(Last written key DecoratedKey... >= current key DecoratedKey)
 Key: CASSANDRA-8191
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8191
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.2 (Cassandra 2.0.10)
Reporter: Nikolai Grigoriev


While recovering the cluster from CASSANDRA-7949 (using the flag from 
CASSANDRA-6621) I had to use sstablesplit tool to split large sstables. Nodes 
were off while using this tool and only one sstablesplit instance was running, 
of course. 

After splitting was done I have restarted the nodes and they all started 
compacting the data. All the nodes are logging the exceptions like this:

{code}
ERROR [CompactionExecutor:4028] 2014-10-26 23:14:52,653 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:4028,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 
0010033523da10033523da10
400100) >= current key DecoratedKey(-8778432288598355336, 
0010040c7a8f10040c7a8f10
400100) writing into 
/cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130525-Data.db
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

It seems that scrubbing helps but scrubbing blocks the compactions.
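
For reference, the two flavours of scrubbing in play here, sketched with the 
keyspace/table names from the paths above used as placeholders:

{code}
# Online scrub: runs through the compaction executor, which is why it competes
# with (blocks) regular compactions, as noted above.
$ nodetool scrub myks mytable

# Offline scrub: run against a stopped node with the standalone tool.
$ sstablescrub myks mytable
{code}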



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-10-23 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181742#comment-14181742
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

By the way, I am getting 

{code}
ERROR [CompactionExecutor:2333] 2014-10-23 18:29:53,590 CassandraDaemon.java 
(line 199) Exception in thread Thread[CompactionExecutor:2333,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(1156541975678546868, 
001003bc510f1003bc510f10400100) >= 
current key DecoratedKey(36735936098318717, 00100000015feb8a10015feb8a10400100) 
writing into 
/cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-94445-Data.db
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

with the 2.0.10 release. I am using the native protocol. I believe the native protocol 
handler is based on HSHA, am I right? Anyway, I am getting those too.

 2.0 HSHA server introduces corrupt data
 ---

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 2.0.8

 Attachments: 6285_testnotes1.txt, 
 CASSANDRA-6285-disruptor-heap.patch, cassandra-attack-src.zip, 
 compaction_test.py, disruptor-high-cpu.patch, 
 disruptor-memory-corruption.patch, enable_reallocate_buffers.txt


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 

[jira] [Created] (CASSANDRA-8167) sstablesplit tool can be made much faster with few JVM settings

2014-10-22 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-8167:


 Summary: sstablesplit tool can be made much faster with few JVM 
settings
 Key: CASSANDRA-8167
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8167
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Nikolai Grigoriev
Priority: Trivial


I had to use the sstablesplit tool intensively to split some really huge sstables. 
The tool is painfully slow as it does the compaction in one single thread.

I have just found that on one of my machines the tool has crashed when I was 
almost done with a 152Gb sstable (!!!). 

{code}
 INFO 16:59:22,342 Writing Memtable-compactions_in_progress@1948660572(0/0 
serialized/live bytes, 1 ops)
 INFO 16:59:22,352 Completed flushing 
/cassandra-data/disk1/system/compactions_in_progress/system-compactions_in_progress-jb-79242-Data.db
 (42 bytes) for commitlog position ReplayPosition(segmentId=1413904450653, 
position=69178)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
exceeded
at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
at 
org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:586)
at 
org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
at 
org.apache.cassandra.db.RangeTombstoneList$InOrderTester.isDeleted(RangeTombstoneList.java:751)
at 
org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:422)
at 
org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:403)
at 
org.apache.cassandra.db.ColumnFamily.hasIrrelevantData(ColumnFamily.java:489)
at 
org.apache.cassandra.db.compaction.PrecompactedRow.removeDeleted(PrecompactedRow.java:66)
at 
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
at 
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
at 
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
at 
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
at 
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:204)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:154)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.SSTableSplitter.split(SSTableSplitter.java:38)
at 
org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:150)

{code}

This has triggered my desire to see what memory settings are used for the JVM 
running the tool... and I have found that it runs with the default Java settings (no 
settings at all).

I have tried to apply the settings from C* itself and this resulted in a speed 
increase of over 40%. It went from ~5Mb/s to ~7Mb/s - from the compressed output 
perspective. I believe this is mostly due to concurrent GC. I see my CPU usage 
has increased to ~200%. But this is fine, this is an offline tool, the node is 
down anyway. I know that concurrent GC (at least something like 
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled) 
normally improves the performance of even primitive single-threaded 
heap-intensive Java programs.

I think it should be acceptable to apply the server JVM settings to this tool.
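
As an illustration, something along these lines (a sketch only: the heap size is 
arbitrary and the exact way the wrapper script picks up extra JVM options depends 
on the packaging, so treat the JVM_OPTS variable as an assumption):

{code}
# Server-like GC settings for the splitter JVM (flags quoted from above); how they
# reach the tool depends on the packaging - shown here via an environment variable.
JVM_OPTS="-Xmx8G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled" \
  sstablesplit /cassandra-data/disk2/myks/mytable/myks-mytable-jb-*-Data.db
{code}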



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-10-21 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178828#comment-14178828
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

[~sterligovak] I was always wondering why I always saw these problems 
appearing for the OpsCenter keyspace. My keyspace had much more traffic, but when I 
had this problem it always manifested itself in the OpsCenter keyspace. Even 
when I was also using Thrift (we use the native protocol now).

I even remember disabling OpsCenter to prove the point :) 



 2.0 HSHA server introduces corrupt data
 ---

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 2.0.8

 Attachments: 6285_testnotes1.txt, 
 CASSANDRA-6285-disruptor-heap.patch, cassandra-attack-src.zip, 
 compaction_test.py, disruptor-high-cpu.patch, 
 disruptor-memory-corruption.patch, enable_reallocate_buffers.txt


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STCS worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STCS the move to LCS fails with the same 
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-10-21 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14179431#comment-14179431
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

I think this is the error that you cannot fix by scrubbing. Corrupted sstable. 
I was fixing those by deleting the sstables and doing repairs. Unfortunately, 
if that happens on many nodes there is a risk of data loss.

As for the OpsCenter - do not get me wrong ;) I did not want to say that 
OpsCenter was directly responsible for these troubles. But I do believe that 
OpsCenter does something particular that reveals the bug in hsha server. At 
least this was my impression. After disabling OpsCenter and fixing the 
outstanding problems I do not recall seeing those errors anymore. And I was 
also using Thrift and I was writing and reading 100x more data than OpsCenter.



 2.0 HSHA server introduces corrupt data
 ---

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 2.0.8

 Attachments: 6285_testnotes1.txt, 
 CASSANDRA-6285-disruptor-heap.patch, cassandra-attack-src.zip, 
 compaction_test.py, disruptor-high-cpu.patch, 
 disruptor-memory-corruption.patch, enable_reallocate_buffers.txt


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STCS worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STCS the move to LCS fails with the same 
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-20 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176884#comment-14176884
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Then I doubt I can really try it. We are quite close to production deployment, 
and testing with something that far from what we will use in prod is pointless 
(for me, not for the fix ;) ).

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
    compaction type   keyspace         table      completed          total   unit  progress
         Compaction       myks   table_list1    66499295588  1910515889913  bytes     3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name "*table_list1*Data*" | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176426#comment-14176426
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

[~krummas]

Marcus,

Which patch are you talking about? I am running latest DSE with Cassandra 
2.0.10.

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
    compaction type   keyspace         table      completed          total   unit  progress
         Compaction       myks   table_list1    66499295588  1910515889913  bytes     3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name "*table_list1*Data*" | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14174702#comment-14174702
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating a bunch of nice small 
sstables from them. It is slower than using sstablesplit, I believe, because it 
actually does real compactions and, thus, processes and reprocesses different 
sets of sstables. My understanding is that every time I get a new bunch of L0 
sstables there is a phase for updating the other levels, and it repeats and repeats.
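
For reference, a minimal sketch of how that switch can be wired in at startup 
(the property name comes from CASSANDRA-6621; double-check it against the 
Cassandra/DSE version in use before relying on it):

{code}
# In cassandra-env.sh (sketch only; verify the property name for your version)
JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"
{code}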

With that property set I see that my total number of sstables grows, my number 
of huge sstables decreases and the average size of the sstable decreases as 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and they 
need to be merged. But the use of STCS also results in the generation of 
super-sized sstables. These become a large headache when the fallback stops and 
LCS is supposed to resume normal operations. It appears to me (my humble 
opinion) that the fallback should be done to some kind of specialized rescue STCS 
flavor that merges the small sstables to approximately the LCS target sstable 
size BUT DOES NOT create sstables that are much larger than the target size. 
With this approach LCS will resume normal operations much faster once the 
cause for the fallback (abnormally high write load) is gone.

2. LCS has major (performance?) issue when you have super-large sstables in the 
system. It often gets stuck with single long (many hours) compaction stream 
that, by itself, will increase the probability of another STCS fallback even 
with reasonable write load. As a possible workaround I was recommended to 
consider running multiple C* instances on our relatively powerful machines - to 
significantly reduce the amount of data per node and increase compaction 
throughput.

3. In existing systems, depending on the severity of the STCS fallback 
work, the fix from CASSANDRA-6621 may help to recover while keeping the nodes 
up. It will take a very long time to recover but the nodes will be online.

4. Recovery (see above) is very long. It is much much longer than the duration 
of the stress period that causes the condition. In my case I was writing like 
crazy for about 4 days and it's been over a week of compactions after that. I 
am still very far from 0 pending compactions. Considering this it makes sense 
to artificially throttle the write speed when generating the data (like in the 
use case I described in previous comments). Extra time spent on writing the 
data will still be significantly shorter than the amount of time required to 
recover from the consequences of abusing the available write bandwidth.

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been 

[jira] [Comment Edited] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14174702#comment-14174702
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-7949 at 10/17/14 3:57 AM:


Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating bunch of nice small 
sstables from them. It is slower than using sstablesplit, I believe, because it 
actually does real compactions and, thus, processes and reprocesses different 
sets of sstables. My understanding is that every time I get new bunch of L0 
sstables there is a phase for updating other levels and it repeats and repeats.

With that property set I see that my total number of sstables grows, my number 
of huge sstables decreases and the average size of the sstable decreases as 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and they 
need to be merged. But the use of STCS also results in the generation of 
super-sized sstables. These become a large headache when the fallback stops and 
LCS is supposed to resume normal operations. It appears to me (my humble 
opinion) that the fallback should be done to some kind of specialized rescue STCS 
flavor that merges the small sstables to approximately the LCS target sstable 
size BUT DOES NOT create sstables that are much larger than the target size. 
With this approach LCS will resume normal operations much faster once the 
cause for the fallback (abnormally high write load) is gone.

2. LCS has major (performance?) issue when you have super-large sstables in the 
system. It often gets stuck with single long (many hours) compaction stream 
that, by itself, will increase the probability of another STCS fallback even 
with reasonable write load. As a possible workaround I was recommended to 
consider running multiple C* instances on our relatively powerful machines - to 
significantly reduce the amount of data per node and increase compaction 
throughput.

3. In the existing systems, depending on the severity of the STCS fallback 
work, the fix from CASSANDRA-6621 may help to recover while keeping the nodes 
up. It will take a very long time to recover but the nodes will be online.

4. Recovery (see above) is very long. It is much much longer than the duration 
of the stress period that causes the condition. In my case I was writing like 
crazy for about 4 days and it's been over a week of compactions after that. I 
am still very far from 0 pending compactions. Considering this it makes sense 
to artificially throttle the write speed when generating the data (like in the 
use case I described in previous comments). Extra time spent on writing the 
data will still be significantly shorter than the amount of time required to 
recover from the consequences of abusing the available write bandwidth.


was (Author: ngrigor...@gmail.com):
Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating bunch of nice small 
sstables from them. It is slower than using sstablesplit, I believe, because it 
actually does real compactions and, thus, processes and reprocesses different 
sets of sstables. My understanding is that every time I get new bunch of L0 
sstables there is a phase for updating other levels and it repeats and repeats.

With that property set I see that my total number of sstables grows, my number 
of huge sstables decreases and the average size of the sstable decreases as 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and they 
need to be merged. But also the use of STCS results in generation of the 
super-sized sstables. These become a large headache when the fallback stops and 
LCS is supposed to resume normal operations.  It appears to me (my humble 
opinion) that fallback should be done to some kind of specialized rescue STCS 
flavor that merges the small sstables to approximately the LCS target sstable 
size BUT DOES NOT create sstables that are much larger than the target size. 
With this approach the LCS will resume normal operations much faster than the 
cause for the fallback (abnormally high write load) is gone.

2. LCS has major (performance?) issue when you have super-large sstables in the 
system. It often gets stuck with single long (many hours) compaction stream 
that, by itself, will increase the probability of another STCS fallback even 
with reasonable write 

[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-12 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14168822#comment-14168822
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

I did another round of testing and I can confirm my previous suspicion. If LCS 
goes into STCS fallback mode there seems to be some kind of point of no 
return. After loading a fairly large amount of data I end up with a number of 
large (from a few Gb to 200+Gb) sstables. After that the cluster simply goes 
downhill - it never recovers. Even if there is no traffic except the repair 
service (DSE OpsCenter) the number of pending compactions never declines. It 
actually grows. Sstables also grow and grow in size until the moment one of the 
compactions runs out of disk space and crashes the node.

Also I believe once in this state there is no way out. The sstablesplit tool, as 
far as I understand, cannot be used with a live node. And the tool splits the 
data in a single thread. I have measured its performance on my system: it 
processes about 13Mb/s on average, thus, to split all these large sstables it 
would take many DAYS.
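
A rough back-of-the-envelope check of that claim, using the ~13Mb/s figure and 
the largest sstable sizes reported earlier (370Gb, 70Gb, 55Gb):

{code}
# Rough estimate only: hours to split just the three largest files on one node
# at ~13 MB/s single-threaded throughput.
$ echo $(( (370 + 70 + 55) * 1024 / 13 / 3600 ))
10
{code}

That is roughly 10 hours per node for the biggest files alone, with the node 
offline, before touching the hundreds of smaller sstables - hence the days 
mentioned above for the whole cluster.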

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
    compaction type   keyspace         table      completed          total   unit  progress
         Compaction       myks   table_list1    66499295588  1910515889913  bytes     3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name "*table_list1*Data*" | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% 

[jira] [Comment Edited] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-12 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14168822#comment-14168822
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-7949 at 10/12/14 11:59 PM:
-

I did another round of testing and I can confirm my previous suspicion. If LCS 
goes into STCS fallback mode there seems to be some kind of point of no 
return. After loading fairly large amount of data I end up with a number of 
large (from few Gb to 200+Gb) sstables. After that the cluster simply goes 
downhill - it never recovers. Even if there is no traffic except the repair 
service (DSE OpsCenter) the number of pending compactions never declines. It 
actually grows. Sstables also grow and grow in size until the moment one of the 
compactions runs out of disk space and crashes the node.

Also I believe once in this state there is no way out. sstablesplit tool, as 
far as I understand, cannot be used with the live node. And the tool splits the 
data in single thread. I have measured its performance on my system, it 
processes about 13Mb/s on average, thus, to split all these large sstables it 
would take many DAYS.

I have got an idea that might actually help. That JVM property from 
CASSANDRA-6621 - it seems to be what I need right now. I have tried it and it 
seems (so far) that when compacting my nodes produce only the sstables of the 
target size, i.e. (I may be wrong but so far it seems so) it is splitting the 
large sstables into the small ones while the nodes are on. If it continues like 
this I may hope to eventually get rid of mega-huge-sstables and then LCS 
performance should be back to normal. Will provide an update later.


was (Author: ngrigor...@gmail.com):
I did another round of testing and I can confirm my previous suspicion. If LCS 
goes into STCS fallback mode there seems to be some kind of point of no 
return. After loading fairly large amount of data I end up with a number of 
large (from few Gb to 200+Gb) sstables. After that the cluster simply goes 
downhill - it never recovers. Even if there is no traffic except the repair 
service (DSE OpsCenter) the number of pending compactions never declines. It 
actually grows. Sstables also grow and grow in size until the moment one of the 
compactions runs out of disk space and crashes the node.

Also I believe once in this state there is no way out. sstablesplit tool, as 
far as I understand, cannot be used with the live node. And the tool splits the 
data in single thread. I have measured its performance on my system, it 
processes about 13Mb/s on average, thus, to split all these large sstables it 
would take many DAYS.

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process 

[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-07 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162263#comment-14162263
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

It seems that what I am suffering from in this specific test is similar to 
CASSANDRA-6621. When I write all unique data to create my initial snapshot, I 
effectively do something similar to what happens when a new node is 
bootstrapped, I think.

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating a new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates a 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and the 
 DataStax Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between a few dozen 
 and a few thousand columns in each row.
 This data generation process was producing a massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of traffic that the system will ultimately have to deal with; it 
 will be a mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests had been stopped I noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of a few dozen compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
   compaction type   keyspace   table         completed     total           unit   progress
        Compaction   myks       table_list1   66499295588   1910515889913   bytes  3.48%
 Active compaction remaining time : n/a
 # df -h
 ...
 /dev/sdb        1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc        1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd        1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name '*table_list1*Data*' | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that, and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time, lazily 
 compacting in one stream with CPU at ~140% and occasionally doing bursts 
 of compaction work for a few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of the two tables - the one that 
 has about 4 times more data than the other (space-wise, the number of rows is 
 the same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.
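
(A sketch of how the size breakdown above can be reproduced - the data paths 
below are hypothetical and need to be adjusted to the actual keyspace/table 
directories:)

{code}
# bucket the table's sstables by size: <=200Mb, 200Mb-2Gb, >2Gb
find /cassandra-data/disk*/myks/table_list1 -name '*Data.db' \
  | grep -v snapshot \
  | xargs -r du -m \
  | awk '{ if ($1 <= 200) s++; else if ($1 <= 2048) m++; else l++ }
         END { printf "<=200Mb: %d   200Mb-2Gb: %d   >2Gb: %d\n", s, m, l }'
{code}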



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-29 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14152569#comment-14152569
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Upgraded to Cassandra 2.0.10 (via DSE 4.5.2) today. Switched my tables that 
used STCS to LCS. Restarted. For the last 8 hours I have been observing this 
on all nodes:

{code}
# nodetool compactionstats
pending tasks: 13808
  compaction type   keyspace     table     completed      total           unit   progress
       Compaction   mykeyspace   table_1   528230773591   1616185183262   bytes  32.68%
       Compaction   mykeyspace   table_2   456361916088   4158821946280   bytes  10.97%
Active compaction remaining time :   3h57m56s
{code}

At the beginning of these 8 hours the remaining time was about 4h08m. CPU 
activity - almost nothing (between 2 and 3 cores), disk I/O - nearly zero. So 
it is clearly compacting in a single stream per table and making almost no 
progress.
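
For completeness, here is roughly what I checked to make sure the single 
stream is not just a throttling artifact (a sketch; the yaml path is for my 
install and may differ, and concurrent_compactors/multithreaded_compaction are 
simply the knobs I looked at):

{code}
# remove the compaction bandwidth cap (0 = unthrottled)
nodetool setcompactionthroughput 0

# check how many compactor threads the node is allowed to use
grep -E 'concurrent_compactors|multithreaded_compaction' /etc/dse/cassandra/cassandra.yaml
{code}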


[jira] [Issue Comment Deleted] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-25 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Comment: was deleted

(was: Maybe this is not related, but I have another small cluster with similar 
data. I have just upgraded that one to 2.0.10 (not DSE, the original 
open-source version). On all machines in this cluster I have many thousands of 
sstables, all 160Mb, plus a few smaller ones. So they are all L0; no L1 or 
higher-level sstables exist. LCS is used. Number of pending compactions: 0. 
There is even incoming traffic that writes into that keyspace. nodetool 
compact returns immediately.

)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-24 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14146632#comment-14146632
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Maybe this is not related, but I have another small cluster with similar data. 
I have just upgraded that one to 2.0.10 (not DSE, the original open-source 
version). On all machines in this cluster I have many thousands of sstables, 
all 160Mb, plus a few smaller ones. So they are all L0; no L1 or higher-level 
sstables exist. LCS is used. Number of pending compactions: 0. There is even 
incoming traffic that writes into that keyspace. nodetool compact returns 
immediately.
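
If it helps anyone reproduce this observation, this is roughly how the level 
distribution can be checked (a sketch; it assumes the sstablemetadata tool 
from the Cassandra tools package prints an "SSTable Level" line, and the paths 
are hypothetical):

{code}
# count sstables per LCS level for one table
for f in /var/lib/cassandra/data/myks/mytable/*Data.db; do
  sstablemetadata "$f" | grep 'SSTable Level'
done | sort | uniq -c
{code}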




[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-22 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143214#comment-14143214
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Update: I have completed my last data writing test, so now I have enough data 
to start another phase. I did that last test with the compaction strategy set 
to STCS but disabled for the duration of the test. Once all writers had 
finished I re-enabled compactions. In under one day STCS completed the job on 
all nodes; I ended up with a few dozen (~40 or so) large sstables and about 
23Tb of data in total on 15 nodes.

I switched back to LCS this morning and immediately observed the hockey stick 
on the pending compactions graph. Now each node reports about 8-10K pending 
compactions; they are all compacting in one stream per CF and consume 
virtually no resources:

{code}
# nodetool compactionstats
pending tasks: 9900
  compaction type   keyspace   table        completed     total           unit   progress
       Compaction   testks     test_list2   26630083587   812539331642    bytes  3.28%
       Compaction   testks     test_list1   24071738534   1994877844635   bytes  1.21%
Active compaction remaining time :   2h16m55s


# w
 13:41:45 up 23 days, 18:13,  2 users,  load average: 1.81, 2.13, 2.51
...


# iostat -mdx 5
Linux 3.8.13-44.el6uek.x86_64 (cassandra01.mydomain.com)  22/09/14  _x86_64_  (32 CPU)

Device:  rrqm/s  wrqm/s     r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    5.73   88.00   13.33   5.47   5.16    214.84      0.51   5.08   0.39   3.98
sda        0.00    8.16    0.13   65.80   0.00   3.28    101.80      0.06   0.87   0.11   0.71
sdc        0.00    4.93   75.05   13.34   4.67   5.42    233.62      0.49   5.55   0.39   3.42
sdd        0.00    5.82   86.40   14.10   5.37   5.52    221.83      0.56   5.59   0.38   3.81

Device:  rrqm/s  wrqm/s     r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  134.60    0.00   8.37   0.00    127.30      0.06   0.42   0.42   5.64
sda        0.00   13.00    0.00  220.40   0.00   0.96      8.94      0.01   0.05   0.01   0.32
sdc        0.00    0.00   36.40    0.00   2.27   0.00    128.00      0.01   0.41   0.41   1.50
sdd        0.00    0.00   21.20    0.00   1.32   0.00    128.00      0.00   0.19   0.19   0.40
{code}
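
The hockey stick itself comes from the OpsCenter graphs, but the same trend 
can be tracked with a trivial loop (a sketch, assuming nodetool is on the 
PATH):

{code}
# append a timestamped pending-compactions sample every minute, for graphing
while true; do
  echo "$(date +%FT%T) $(nodetool compactionstats | awk '/pending tasks/ {print $3}')"
  sleep 60
done >> /tmp/pending_compactions.log
{code}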


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141087#comment-14141087
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Yes and no. Yes - the number of pending compactions started to go down and I 
ended up with fewer (and larger) sstables. But I think the issue is more about 
LCS compaction performance. Is it normal that LCS cannot use the host's 
resources efficiently while having tons of pending compactions?




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141154#comment-14141154
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

I understand that it was an estimate, but my cluster spent almost 3 full days 
trying to work through that estimate with little progress. About 1.5 days of 
data injection and then 3 days of compaction with no progress - that does not 
sound right. And STCS was able to crunch most of the data in about one day 
after the switch.

I strongly suspect that the fact that I was loading, and not updating, the 
data at a high rate resulted in some sort of edge case scenario for LCS. But 
considering that the cluster could not recover in a reasonable amount of time 
(exceeding the original load time by a factor of 2+), I do believe that 
something may need to be improved in the LCS logic, OR some kind of diagnostic 
message needs to be generated to request a specific action from the cluster 
owner. In my case the problem was easy to spot as it was highly visible - but 
if this happens to one of 50 CFs it may take a while before someone spots the 
endless compactions happening.
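
Until such a diagnostic message exists, a crude watchdog is easy to script (a 
sketch; the threshold of 1000 and the mail address are arbitrary assumptions):

{code}
#!/bin/sh
# warn when the pending compaction backlog stays suspiciously high
PENDING=$(nodetool compactionstats | awk '/pending tasks/ {print $3}')
if [ "${PENDING:-0}" -gt 1000 ]; then
  echo "WARNING: $PENDING pending compactions on $(hostname)" \
    | mail -s "compaction backlog" ops@example.com
fi
{code}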


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141174#comment-14141174
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Just a small clarification: "just not very many" is not exactly what I 
observed. Mainly there was one active compaction, but once in a while there 
was a burst of compactions with high CPU usage, GOSSIP issues caused by nodes 
being less responsive, etc.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7956) nodetool compactionhistory crashes because of low heap size (GC overhead limit exceeded)

2014-09-18 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139342#comment-14139342
 ] 

Nikolai Grigoriev commented on CASSANDRA-7956:
--

I think that setting is not effective for nodetool status because of the GC 
settings. I have seen before in other apps that the default GC settings may be 
very ineffective; mostly it was due to the parallel GC not being enabled. 
Maybe adding -XX:+UseParNewGC -XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled would be enough. Although, of course, in that 
case nodetool will use more CPU resources.
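
A sketch of what I mean (illustrative only - it relies on the stock nodetool 
launcher hardcoding -Xmx32m, and the script location varies between packages):

{code}
# append the parallel/CMS GC flags right after the hardcoded -Xmx32m
sed -i 's/-Xmx32m/-Xmx32m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled/' \
  /usr/bin/nodetool
{code}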



 nodetool compactionhistory crashes because of low heap size (GC overhead 
 limit exceeded)
 --

 Key: CASSANDRA-7956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7956
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.8
Reporter: Nikolai Grigoriev
Priority: Trivial
 Fix For: 2.0.11

 Attachments: 7956.txt, 
 nodetool_compactionhistory_128m_heap_output.txt.gz


 {code}
 ]# nodetool compactionhistory
 Compaction History:
 Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
 exceeded
 at java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:967)
 at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1782)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
 at java.util.HashMap.readObject(HashMap.java:1180)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
 at 
 java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
 at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
 at 
 java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
 at 
 java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
 at 
 javax.management.openmbean.TabularDataSupport.readObject(TabularDataSupport.java:912)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
 at 
 java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
 at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
 at sun.rmi.server.UnicastRef.unmarshalValue(UnicastRef.java:325)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:174)
 at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
 at 
 javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown 
 Source)
 at 
 javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:906)
 at 
 javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:267)
 at com.sun.proxy.$Proxy3.getCompactionHistory(Unknown Source)
 {code}
 nodetool starts with -Xmx32m. This seems to be not enough, at least in my 
 case, to show the history. I am not sure what the appropriate amount would 
 be, but increasing it to 128m definitely solves the problem. Output from the 
 modified nodetool is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-17 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14137054#comment-14137054
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

I see. I could try to switch to STCS now and see what happens.

My concern is that the issue seems to be permanent. Even after last night none 
of the nodes (being virtually idle - the load was over) was able to eat 
through the pending compactions. And, to my surprise, half of the nodes in the 
cluster do not even compact fast enough - look at the graphs attached.




--
This message was 

[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-17 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: pending compactions 2day




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-17 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: (was: pending compactions 2day)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-17 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: pending compactions 2day.png

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *table_list1*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-7956) nodetool compactionhistory crashes because of low heap size (GC overhead limit exceeded)

2014-09-17 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-7956:


 Summary: nodetool compactionhistory crashes because of low heap 
size (GC overhead limit exceeded)
 Key: CASSANDRA-7956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7956
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.8
Reporter: Nikolai Grigoriev
Priority: Trivial
 Attachments: nodetool_compactionhistory_128m_heap_output.txt.gz

{code}
]# nodetool compactionhistory
Compaction History:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
exceeded
at java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:967)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1782)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at 
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
at 
javax.management.openmbean.TabularDataSupport.readObject(TabularDataSupport.java:912)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at sun.rmi.server.UnicastRef.unmarshalValue(UnicastRef.java:325)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:174)
at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
at 
javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
at 
javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:906)
at 
javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:267)
at com.sun.proxy.$Proxy3.getCompactionHistory(Unknown Source)
{code}

nodetool starts with -Xmx32m. This seems not to be enough, at least in my case, 
to show the history. I am not sure what the appropriate amount would be, but 
increasing it to 128m definitely solves the problem. Output from the modified 
nodetool is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-7957) improve active/pending compaction monitoring

2014-09-17 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-7957:


 Summary: improve active/pending compaction monitoring
 Key: CASSANDRA-7957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7957
 Project: Cassandra
  Issue Type: Improvement
  Components: Core, Tools
Reporter: Nikolai Grigoriev
Priority: Minor


I think it might be useful to create a way to see what sstables are being 
compacted into what new sstable. Something like an extension of "nodetool 
compactionstats". I think it would be easier with this feature to troubleshoot 
and understand how compactions are happening on your data. Not sure how it is 
useful in everyday life but I could use such a feature when dealing with 
CASSANDRA-7949.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-17 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138137#comment-14138137
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Just an update: I have switched to STCS early this morning and by now half of 
the nodes are getting close to zero pending compactions. Half of the remaining 
nodes seem to be behind, but they are compacting at full speed (smoke coming 
from the lab ;) ) and I see the number of pending compactions going down on 
them as well. On the nodes where compactions are almost over, the number of 
sstables is now very small - less than a hundred.
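
For reference, the switch itself is a per-table setting. Below is a minimal 
sketch of applying it with the DataStax Java driver; the contact point and the 
keyspace/table names are assumptions taken from the compactionstats output 
above, not confirmed deployment details, and the same ALTER TABLE statement can 
simply be run from cqlsh instead:

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Sketch: switch one table from LeveledCompactionStrategy to
// SizeTieredCompactionStrategy. Contact point, keyspace and table
// names are illustrative assumptions.
public class SwitchToStcs {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        session.execute("ALTER TABLE myks.table_list1 WITH compaction = "
                + "{'class': 'SizeTieredCompactionStrategy'}");
        cluster.close();
    }
}
{code}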


 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *table_list1*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger 

[jira] [Created] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-7949:


 Summary: LCS compaction low performance, many pending compactions, 
nodes are almost idle
 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev


I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
load similar to the load in our future product. Before running the simulator I 
had to pre-generate enough data. This was done using Java code and DataStax 
Java driver. To avoid going deep into details, two tables have been generated. 
Each table currently has about 55M rows and between few dozens and few 
thousands of columns in each row.

This data generation process was generating massive amount of non-overlapping 
data. Thus, the activity was write-only and highly parallel. This is not the 
type of the traffic that the system will have ultimately to deal with, it will 
be mix of reads and updates to the existing data in the future. This is just to 
explain the choice of LCS, not mentioning the expensive SSD disk space.
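
To make the shape of this load concrete, a minimal sketch of the kind of 
pre-generation loop described above is shown below. The schema, column names, 
contact point and the concurrency cap are illustrative assumptions for the 
sketch, not the actual generator code:

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;

import java.util.ArrayList;
import java.util.List;

// Assumed schema: CREATE TABLE myks.table_list1 (row_key text, col_name text,
// col_value text, PRIMARY KEY (row_key, col_name))
public class DataGenerator {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("myks");
        PreparedStatement insert = session.prepare(
                "INSERT INTO table_list1 (row_key, col_name, col_value) VALUES (?, ?, ?)");
        List<ResultSetFuture> inFlight = new ArrayList<ResultSetFuture>();
        for (long row = 0; row < 1000000; row++) {       // the real test wrote ~55M rows
            for (int col = 0; col < 50; col++) {         // dozens..thousands of columns per row
                inFlight.add(session.executeAsync(
                        insert.bind("row-" + row, "col-" + col, "value-" + col)));
                if (inFlight.size() >= 256) {            // crude back-pressure on async writes
                    for (ResultSetFuture f : inFlight) {
                        f.getUninterruptibly();
                    }
                    inFlight.clear();
                }
            }
        }
        cluster.close();
    }
}
{code}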

At some point while generating the data I have noticed that the compactions 
started to pile up. I knew that I was overloading the cluster but I still 
wanted the generation test to complete. I was expecting to give the cluster 
enough time to finish the pending compactions and get ready for real traffic.

However, after the storm of write requests have been stopped I have noticed 
that the number of pending compactions remained constant (and even climbed up a 
little bit) on all nodes. After trying to tune some parameters (like setting 
the compaction bandwidth cap to 0) I have noticed a strange pattern: the nodes 
were compacting one of the CFs in a single stream using virtually no CPU and no 
disk I/O. This process was taking hours. After that it would be followed by a 
short burst of few dozens of compactions running in parallel (CPU at 2000%, 
some disk I/O - up to 10-20%) and then getting stuck again for many hours doing 
one compaction at a time. So it looks like this:

# nodetool compactionstats
pending tasks: 3351
          compaction type   keyspace         table      completed          total   unit   progress
               Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
Active compaction remaining time :        n/a

# df -h

...
/dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
/dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
/dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3

# find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
1310

Among these files I see:

1043 files of 161Mb (my sstable size is 160Mb)
9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
263 files of various sizes - between a few dozen Kb and 160Mb

I've been running the heavy load for about 1.5 days and it's been close to 3 
days after that and the number of pending compactions does not go down.

I have applied one of the not-so-obvious recommendations to disable 
multithreaded compactions and that seems to be helping a bit - I see some nodes 
started to have fewer pending compactions. About half of the cluster, in fact. 
But even there I see they are sitting idle most of the time lazily compacting 
in one stream with CPU at ~140% and occasionally doing the bursts of compaction 
work for few minutes.

I am wondering if this is really a bug or something in the LCS logic that would 
manifest itself only in such an edge case scenario where I have loaded lots of 
unique data quickly.

I'll be attaching the relevant logs shortly.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Description: 
I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
load similar to the load in our future product. Before running the simulator I 
had to pre-generate enough data. This was done using Java code and DataStax 
Java driver. To avoid going deep into details, two tables have been generated. 
Each table currently has about 55M rows and between few dozens and few 
thousands of columns in each row.

This data generation process was generating massive amount of non-overlapping 
data. Thus, the activity was write-only and highly parallel. This is not the 
type of the traffic that the system will have ultimately to deal with, it will 
be mix of reads and updates to the existing data in the future. This is just to 
explain the choice of LCS, not mentioning the expensive SSD disk space.

At some point while generating the data I have noticed that the compactions 
started to pile up. I knew that I was overloading the cluster but I still 
wanted the generation test to complete. I was expecting to give the cluster 
enough time to finish the pending compactions and get ready for real traffic.

However, after the storm of write requests have been stopped I have noticed 
that the number of pending compactions remained constant (and even climbed up a 
little bit) on all nodes. After trying to tune some parameters (like setting 
the compaction bandwidth cap to 0) I have noticed a strange pattern: the nodes 
were compacting one of the CFs in a single stream using virtually no CPU and no 
disk I/O. This process was taking hours. After that it would be followed by a 
short burst of few dozens of compactions running in parallel (CPU at 2000%, 
some disk I/O - up to 10-20%) and then getting stuck again for many hours doing 
one compaction at a time. So it looks like this:

# nodetool compactionstats
pending tasks: 3351
          compaction type   keyspace         table      completed          total   unit   progress
               Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
Active compaction remaining time :        n/a

# df -h

...
/dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
/dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
/dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3

# find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
1310

Among these files I see:

1043 files of 161Mb (my sstable size is 160Mb)
9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
263 files of various sizes - between a few dozen Kb and 160Mb

I've been running the heavy load for about 1.5 days and it's been close to 3 
days after that and the number of pending compactions does not go down.

I have applied one of the not-so-obvious recommendations to disable 
multithreaded compactions and that seems to be helping a bit - I see some nodes 
started to have fewer pending compactions. About half of the cluster, in fact. 
But even there I see they are sitting idle most of the time lazily compacting 
in one stream with CPU at ~140% and occasionally doing the bursts of compaction 
work for few minutes.

I am wondering if this is really a bug or something in the LCS logic that would 
manifest itself only in such an edge case scenario where I have loaded lots of 
unique data quickly.

By the way, I see this pattern only for one of two tables - the one that has 
about 4 times more data than another (space-wise, number of rows is the same). 
Looks like all these pending compactions are really only for that larger table.

I'll be attaching the relevant logs shortly.


  was:
I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
load similar to the load in our future product. Before running the simulator I 
had to pre-generate enough data. This was done using Java code and DataStax 
Java driver. To avoid going deep into details, two tables have been generated. 
Each table currently has about 55M rows and between few dozens and few 
thousands of columns in each row.

This data generation process was generating massive amount of non-overlapping 
data. Thus, the activity was write-only and highly parallel. This is not the 
type of the traffic that the system will have ultimately to deal with, it will 
be mix of reads and updates to the existing data in the future. This is just to 
explain the choice of LCS, not mentioning the expensive SSD disk space.

At some point while generating the data I have noticed that the compactions 
started to pile up. I knew that I was overloading the cluster but I still 
wanted the generation test 

[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: system.log.gz

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: iostats.txt

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, system.log.gz


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: vmstat.txt

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: nodetool_compactionstats.txt

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Attachment: nodetool_tpstats.txt

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136549#comment-14136549
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

system log already includes the logs from 
log4j.logger.org.apache.cassandra.db.compaction (except 
log4j.logger.org.apache.cassandra.db.compaction.ParallelCompactionIterable)

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, system.log.gz, vmstat.txt


 I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and DataStax 
 Java driver. To avoid going deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between few dozens and 
 few thousands of columns in each row.
 This data generation process was generating massive amount of non-overlapping 
 data. Thus, the activity was write-only and highly parallel. This is not the 
 type of the traffic that the system will have ultimately to deal with, it 
 will be mix of reads and updates to the existing data in the future. This is 
 just to explain the choice of LCS, not mentioning the expensive SSD disk 
 space.
 At some point while generating the data I have noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster but I still 
 wanted the generation test to complete. I was expecting to give the cluster 
 enough time to finish the pending compactions and get ready for real traffic.
 However, after the storm of write requests have been stopped I have noticed 
 that the number of pending compactions remained constant (and even climbed up 
 a little bit) on all nodes. After trying to tune some parameters (like 
 setting the compaction bandwidth cap to 0) I have noticed a strange pattern: 
 the nodes were compacting one of the CFs in a single stream using virtually 
 no CPU and no disk I/O. This process was taking hours. After that it would be 
 followed by a short burst of few dozens of compactions running in parallel 
 (CPU at 2000%, some disk I/O - up to 10-20%) and then getting stuck again for 
 many hours doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
           compaction type   keyspace         table      completed          total   unit   progress
                Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
 Active compaction remaining time :        n/a
 # df -h
 ...
 /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
 /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
 /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
 # find . -name *wm_contacts*Data* | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days and it's been close to 3 
 days after that and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compactions and that seems to be helping a bit - I see some 
 nodes started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there I see they are sitting idle most of the time lazily 
 compacting in one stream with CPU at ~140% and occasionally doing the bursts 
 of compaction work for few minutes.
 I am wondering if this is really a bug or something in the LCS logic that 
 would manifest itself only in such an edge case scenario where I have loaded 
 lots of unique data quickly.
 By the way, I see this pattern only for one of two tables - the one that has 
 about 4 times more data than another (space-wise, number of rows is the 
 same). Looks like all these pending compactions are really only for that 
 larger table.
 I'll be attaching the relevant logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-09-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7949:
-
Description: 
I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
load similar to the load in our future product. Before running the simulator I 
had to pre-generate enough data. This was done using Java code and DataStax 
Java driver. To avoid going deep into details, two tables have been generated. 
Each table currently has about 55M rows and between few dozens and few 
thousands of columns in each row.

This data generation process was generating massive amount of non-overlapping 
data. Thus, the activity was write-only and highly parallel. This is not the 
type of the traffic that the system will have ultimately to deal with, it will 
be mix of reads and updates to the existing data in the future. This is just to 
explain the choice of LCS, not mentioning the expensive SSD disk space.

At some point while generating the data I have noticed that the compactions 
started to pile up. I knew that I was overloading the cluster but I still 
wanted the generation test to complete. I was expecting to give the cluster 
enough time to finish the pending compactions and get ready for real traffic.

However, after the storm of write requests have been stopped I have noticed 
that the number of pending compactions remained constant (and even climbed up a 
little bit) on all nodes. After trying to tune some parameters (like setting 
the compaction bandwidth cap to 0) I have noticed a strange pattern: the nodes 
were compacting one of the CFs in a single stream using virtually no CPU and no 
disk I/O. This process was taking hours. After that it would be followed by a 
short burst of few dozens of compactions running in parallel (CPU at 2000%, 
some disk I/O - up to 10-20%) and then getting stuck again for many hours doing 
one compaction at a time. So it looks like this:

# nodetool compactionstats
pending tasks: 3351
          compaction type   keyspace         table      completed          total   unit   progress
               Compaction       myks   table_list1    66499295588  1910515889913  bytes      3.48%
Active compaction remaining time :        n/a

# df -h

...
/dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
/dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
/dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3

# find . -name *table_list1*Data* | grep -v snapshot | wc -l
1310

Among these files I see:

1043 files of 161Mb (my sstable size is 160Mb)
9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
263 files of various sizes - between a few dozen Kb and 160Mb

I've been running the heavy load for about 1.5 days and it's been close to 3 
days after that and the number of pending compactions does not go down.

I have applied one of the not-so-obvious recommendations to disable 
multithreaded compactions and that seems to be helping a bit - I see some nodes 
started to have fewer pending compactions. About half of the cluster, in fact. 
But even there I see they are sitting idle most of the time lazily compacting 
in one stream with CPU at ~140% and occasionally doing the bursts of compaction 
work for few minutes.

I am wondering if this is really a bug or something in the LCS logic that would 
manifest itself only in such an edge case scenario where I have loaded lots of 
unique data quickly.

By the way, I see this pattern only for one of two tables - the one that has 
about 4 times more data than another (space-wise, number of rows is the same). 
Looks like all these pending compactions are really only for that larger table.

I'll be attaching the relevant logs shortly.


  was:
I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
load similar to the load in our future product. Before running the simulator I 
had to pre-generate enough data. This was done using Java code and DataStax 
Java driver. To avoid going deep into details, two tables have been generated. 
Each table currently has about 55M rows and between few dozens and few 
thousands of columns in each row.

This data generation process was generating massive amount of non-overlapping 
data. Thus, the activity was write-only and highly parallel. This is not the 
type of the traffic that the system will have ultimately to deal with, it will 
be mix of reads and updates to the existing data in the future. This is just to 
explain the choice of LCS, not mentioning the expensive SSD disk space.

At some point while generating the data I have noticed that the compactions 
started to pile up. I knew that I was overloading the cluster but I still 
wanted the generation 

[jira] [Commented] (CASSANDRA-6173) Unable to delete multiple entries using In clause on clustering part of compound key

2014-08-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091595#comment-14091595
 ] 

Nikolai Grigoriev commented on CASSANDRA-6173:
--

In the absence of range-based deletes (because deleting a slice is not 
supported), this option is quite important for some data structures. I have just 
hit a case myself where I need to delete a slice of columns (a range of the last 
component of the clustering key, in CQL terms). So, first I found that 
deleting a slice is not possible (CASSANDRA-494) - so I need to read the 
list of values before deleting them :) And then I found that I will have to 
issue a separate delete statement for each value :(
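
For illustration, a minimal sketch of that read-then-delete workaround with the 
DataStax Java driver, written against the user_relation table from this ticket. 
The contact point and the u2 bounds are assumptions for the example:

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Since neither "DELETE ... WHERE u2 IN (...)" nor a slice delete is available,
// read the clustering values in the range first and then issue one DELETE per
// value. Keyspace "bm" is taken from the cqlsh prompt in the ticket.
public class DeleteSlice {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("bm");
        long u1 = 755349113L;
        ResultSet rs = session.execute(
                "SELECT u2 FROM user_relation WHERE u1 = ? AND u2 >= ? AND u2 <= ?",
                u1, 12537242743L, 13404014120L);
        for (Row row : rs) {
            session.execute("DELETE FROM user_relation WHERE u1 = ? AND u2 = ?",
                    u1, row.getLong("u2"));
        }
        cluster.close();
    }
}
{code}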

 Unable to delete multiple entries using In clause on clustering part of 
 compound key
 

 Key: CASSANDRA-6173
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6173
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Ashot Golovenko
Priority: Minor

 I have the following table:
 CREATE TABLE user_relation (
 u1 bigint,
 u2 bigint,
 mf int,
 i boolean,
 PRIMARY KEY (u1, u2));
 And I'm trying to delete two entries using an IN clause on the clustering part 
 of the compound key and I fail to do so:
 cqlsh:bm> DELETE from user_relation WHERE u1 = 755349113 and u2 in 
 (13404014120, 12537242743);
 Bad Request: Invalid operator IN for PRIMARY KEY part u2
 Although the select statement works just fine:
 cqlsh:bm> select * from user_relation WHERE u1 = 755349113 and u2 in 
 (13404014120, 12537242743);
  u1        | u2          | i    | mf
 -----------+-------------+------+----
  755349113 | 12537242743 | null | 27
  755349113 | 13404014120 | null |  0
 (2 rows)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7415) COPY command does not quote map keys

2014-06-18 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7415:
-

Description: 
{code}
create table test (pk text primary key, props map<ascii, blob>);
cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
'prop2': 0x4050});
cqlsh:myks> copy test to 't.csv';
1 rows exported in 0.056 seconds.
cqlsh:myks> copy test from 't.csv';
Bad Request: line 1:74 no viable alternative at input ':'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.012 seconds.
cqlsh:myks>
{code}

t.csv:

{code}
# cat t.csv
aaa,{prop1: 0x1020, prop2: 0x4050}
{code}

I believe the missing quotes in the CSV file cause INSERT to fail. 

  was:
{quote}
create table test (pk text primary key, props map<ascii, blob>);
cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
'prop2': 0x4050});
cqlsh:myks> copy test to 't.csv';
1 rows exported in 0.056 seconds.
cqlsh:myks> copy test from 't.csv';
Bad Request: line 1:74 no viable alternative at input ':'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.012 seconds.
cqlsh:myks>
{quote}

t.csv:

{code}
# cat t.csv
aaa,{prop1: 0x1020, prop2: 0x4050}
{code}

I believe the missing quotes in the CSV file cause INSERT to fail. 


 COPY command does not quote map keys
 

 Key: CASSANDRA-7415
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7415
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.0.5, Linux
Reporter: Nikolai Grigoriev
Priority: Minor

 {code}
 create table test (pk text primary key, props map<ascii, blob>);
 cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
 'prop2': 0x4050});
 cqlsh:myks> copy test to 't.csv';
 1 rows exported in 0.056 seconds.
 cqlsh:myks> copy test from 't.csv';
 Bad Request: line 1:74 no viable alternative at input ':'
 Aborting import at record #0 (line 1). Previously-inserted values still 
 present.
 0 rows imported in 0.012 seconds.
 cqlsh:myks>
 {code}
 t.csv:
 {code}
 # cat t.csv
 aaa,{prop1: 0x1020, prop2: 0x4050}
 {code}
 I believe the missing quotes in the CSV file cause INSERT to fail. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7415) COPY command does not quote map keys

2014-06-18 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-7415:


 Summary: COPY command does not quote map keys
 Key: CASSANDRA-7415
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7415
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.0.5, Linux
Reporter: Nikolai Grigoriev
Priority: Minor


{quote}
create table test (pk text primary key, props map<ascii, blob>);
cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
'prop2': 0x4050});
cqlsh:myks> copy test to 't.csv';
1 rows exported in 0.056 seconds.
cqlsh:myks> copy test from 't.csv';
Bad Request: line 1:74 no viable alternative at input ':'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.012 seconds.
cqlsh:myks>
{quote}

t.csv:

{code}
# cat t.csv
aaa,{prop1: 0x1020, prop2: 0x4050}
{code}

I believe the missing quotes in the CSV file cause INSERT to fail. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7415) COPY command does not quote map keys

2014-06-18 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-7415:
-

Description: 
{code}
cqlsh:myks> create table test (pk text primary key, props map<ascii, blob>);
cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
'prop2': 0x4050});
cqlsh:myks> copy test to 't.csv';
1 rows exported in 0.056 seconds.
cqlsh:myks> copy test from 't.csv';
Bad Request: line 1:74 no viable alternative at input ':'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.012 seconds.
cqlsh:myks>
{code}

t.csv:

{code}
# cat t.csv
aaa,{prop1: 0x1020, prop2: 0x4050}
{code}

I believe the missing quotes in the CSV file cause INSERT to fail. 

  was:
{code}
create table test (pk text primary key, props map<ascii, blob>);
cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
'prop2': 0x4050});
cqlsh:myks> copy test to 't.csv';
1 rows exported in 0.056 seconds.
cqlsh:myks> copy test from 't.csv';
Bad Request: line 1:74 no viable alternative at input ':'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.012 seconds.
cqlsh:myks>
{code}

t.csv:

{code}
# cat t.csv
aaa,{prop1: 0x1020, prop2: 0x4050}
{code}

I believe the missing quotes in the CSV file cause INSERT to fail. 


 COPY command does not quote map keys
 

 Key: CASSANDRA-7415
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7415
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.0.5, Linux
Reporter: Nikolai Grigoriev
Priority: Minor

 {code}
 cqlsh:myks> create table test (pk text primary key, props map<ascii, blob>);
 cqlsh:myks> insert into test (pk, props) values ('aaa', {'prop1': 0x1020, 
 'prop2': 0x4050});
 cqlsh:myks> copy test to 't.csv';
 1 rows exported in 0.056 seconds.
 cqlsh:myks> copy test from 't.csv';
 Bad Request: line 1:74 no viable alternative at input ':'
 Aborting import at record #0 (line 1). Previously-inserted values still 
 present.
 0 rows imported in 0.012 seconds.
 cqlsh:myks>
 {code}
 t.csv:
 {code}
 # cat t.csv
 aaa,{prop1: 0x1020, prop2: 0x4050}
 {code}
 I believe the missing quotes in the CSV file cause INSERT to fail. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-05-22 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005971#comment-14005971
 ] 

Nikolai Grigoriev commented on CASSANDRA-6716:
--

I have made two more observations; one of them may be unrelated, but still:

1. I had tons of these exceptions when doing compaction or scrubbing on some of 
the nodes. Disabling the DataStax agent on them and restarting the nodes 
eliminated the exceptions completely. All under heavy load.

2. I just started having these exceptions again on one of the nodes after a 
minor configuration change (compaction throughput) and a restart. Restarted 
again - same thing, several exceptions per second, all FileNotFoundException 
when compacting. Stopped the node. Removed the caches stored in 
/var/lib/cassandra/saved_caches. Started the node. Not a single exception in 
~1.5 hours. Again, all this under heavy load.
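
For reference, the sequence that made the exceptions go away on that node was 
roughly the following (how the service is stopped and started depends on the 
installation; the saved_caches path is the one from my configuration):

{code}
# service cassandra stop
# rm -f /var/lib/cassandra/saved_caches/*
# service cassandra start
{code}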

Now I am wondering - where else could a reference to a non-existent sstable 
live, except in the cache? If a simple restart does not help and the filesystem 
really does not have the file the server tries to access, then it cannot be the 
in-memory cache being out of sync, so it has to be the persistent one.

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
  

[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-03-03 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918048#comment-13918048
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

[~krummas]

I think using HSHA makes it easier to reproduce, but... I have been running SYNC 
for over a week now and recently I have experienced the same issue again.

We had another unclean shutdown (hrrr... some people are smarter than the UPSes 
;) ) and after bringing the nodes back I found that on one node my 
compactions constantly fail with FileNotFoundException. Even worse, I can't 
scrub the keyspace/CF in question because scrub fails instantly with 
"RuntimeException: Tried to hard link to file that does not exist" - I have 
reported that one too. It is impossible to scrub. The only way to fix the 
issue I have found so far is to restart Cassandra on that node, stop 
compactions as soon as it starts (well, I assume I could disable them 
differently) and then scrub. Sometimes I have to do it in several iterations to 
complete the process. Once I scrub all the problematic KS/CFs I see no more 
exceptions.
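
Concretely, after restarting Cassandra on the node, what I end up doing is 
roughly this (stopping the running compactions via "nodetool stop COMPACTION" is 
just one way to do it, and the keyspace/table names below are placeholders):

{code}
# /opt/cassandra/bin/nodetool stop COMPACTION
# /opt/cassandra/bin/nodetool scrub mykeyspace mytable
{code}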

 LCS compaction failing with Exception
 -

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Marcus Eriksson
 Fix For: 2.0.6

 Attachments: compaction_test.py


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STC worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STC the move to LCS fails with the same 
 Exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-03-03 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918463#comment-13918463
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

[~xedin]

That seems to be a parameter of the Thrift server... How do I control this 
parameter? Or should I just disable JNA?

 2.0 HSHA server introduces corrupt data
 ---

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 2.0.6

 Attachments: compaction_test.py


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STC worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STC the move to LCS fails with the same 
 Exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-02-20 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907434#comment-13907434
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

I can confirm this on my side. I have switched to the sync RPC server and, after 
a few scrubs/restarts, I am running my load tests on a 6-node 2.0.5 cluster 
without a single exception in the last ~8 hours.
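
For the record, the switch itself is just this one setting in cassandra.yaml, 
followed by a restart of each node (the config path below is from my setup):

{code}
# grep rpc_server_type /opt/cassandra/conf/cassandra.yaml
rpc_server_type: sync
{code}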

I tried to correlate the moment I started getting a large number of 
FileNotFoundExceptions with other events in my cluster and realized that it was 
not exactly the 2.0.5 upgrade. It seems to correlate mostly with the moment when 
my jmeter server ran out of free space and a bunch of tests crashed. Obviously, 
these crashes terminated a few hundred client connections to Cassandra.

Not sure if it is related, but it seems that from that moment on it was some 
sort of snowball effect.

 LCS compaction failing with Exception
 -

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Tyler Hobbs
 Fix For: 2.0.6

 Attachments: compaction_test.py


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STC worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STC the move to LCS fails with the same 
 Exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905486#comment-13905486
 ] 

Nikolai Grigoriev commented on CASSANDRA-6716:
--

OK, I am observing *massive* problems with the sstables since moving from 2.0.4 
to 2.0.5. I am rolling back now and scrubbing (I wish I had Mr. Net ;) ). Just 
from scrubbing the OpsCenter keyspaces I see tons of messages like this:

{quote}
WARN [CompactionExecutor:110] 2014-02-19 14:25:13,811 OutputHandler.java (line 
52) 1 out of order rows found while scrubbing 
SSTableReader(path='/hadoop/disk2/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-jb-1901-Data.db');
 Those have been written (in order) to a new sstable 
(SSTableReader(path='/hadoop/disk5/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-jb-15423-Data.db'))
{quote}

I am not exaggerating - tens of thousands. To be fair, I am not 100% sure if the 
problem was there with 2.0.4. But as of 2.0.5 I have noticed frequent 
exceptions about the key ordering, which is what caught my attention.

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   

[jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905506#comment-13905506
 ] 

Nikolai Grigoriev commented on CASSANDRA-6716:
--

And a scarier one:

{quote}
 WARN [CompactionExecutor:84] 2014-02-19 14:35:25,418 OutputHandler.java (line 
52) Unable to recover 8 rows that were skipped.  You can attempt manual recovery 
from the pre-scrub snapshot.  You can also run nodetool repair to transfer the 
data from a healthy replica, if any
{quote}

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
   at 
 

[jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906444#comment-13906444
 ] 

Nikolai Grigoriev commented on CASSANDRA-6716:
--

I have switched from the hsha RPC server to sync to test this theory from 
CASSANDRA-6285. It seems that things are getting a bit better.

I did some scrubbing. It seems that the most affected sstables were the 
OpsCenter ones.

Even after scrubbing the entire OpsCenter keyspace on all nodes, shutting down 
OpsCenter and its agents, and restarting Cassandra, I am still getting this in 
the logs:

{code}
 INFO [CompactionExecutor:239] 2014-02-20 01:22:38,931 OutputHandler.java (line 
42) Scrubbing 
SSTableReader(path='/hadoop/disk2/cassandra/data/OpsCenter/pdps/OpsCenter-pdps-jb-2152-Data.db') 
(249309 bytes)
 WARN [CompactionExecutor:239] 2014-02-20 01:22:38,958 OutputHandler.java (line 
57) Error reading row (stacktrace follows):
org.apache.cassandra.io.sstable.CorruptSSTableException: 
org.apache.cassandra.serializers.MarshalException: String didn't validate.
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:152)
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:32)
at 
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:203)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at 
org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:156)
at 
org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:101)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:169)
at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:199)
at 
org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:443)
at 
org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:432)
at 
org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:62)
at 
org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:236)
at 
org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:222)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.cassandra.serializers.MarshalException: String didn't 
validate.
at 
org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
at 
org.apache.cassandra.db.marshal.AbstractType.validate(AbstractType.java:172)
at org.apache.cassandra.db.Column.validateName(Column.java:295)
at org.apache.cassandra.db.Column.validateFields(Column.java:300)
at 
org.apache.cassandra.db.ExpiringColumn.validateFields(ExpiringColumn.java:181)
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:147)
... 21 more
{code}



 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception 

[jira] [Created] (CASSANDRA-6720) Implement support for Log4j DOMConfigurator for Cassandra daemon

2014-02-18 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-6720:


 Summary: Implement support for Log4j DOMConfigurator for Cassandra 
daemon
 Key: CASSANDRA-6720
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6720
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Nikolai Grigoriev
Priority: Trivial


Currently CassandraDaemon explicitly uses PropertyConfigurator to load log4j 
settings if log4j.defaultInitOverride is set to true, which is done by 
default. This does not allow using a log4j XML configuration file, because that 
requires DOMConfigurator to be used in a similar fashion. The only way to use an 
XML file is to change the value of the log4j.defaultInitOverride property in the 
startup script.

Here is the background on why I think it might be useful to support the XML 
configuration, even if you hate XML ;)

I wanted to ship my Cassandra logs to Logstash and I have been using 
SocketAppender. But then I discovered that any issue with the Logstash log4j 
server results in significant performance degradation for Cassandra, as the 
logger blocks. I was able to easily reproduce the problem with a separate test. 
The obvious solution seems to be to put an AsyncAppender in front of the 
SocketAppender, which eliminates the blocking. However, AsyncAppender can only 
be configured via DOMConfigurator, at least in Log4j 1.2.

I think it does not hurt to make a little change to support both configuration 
types, in a way similar to Spring's Log4jConfigurer:

{code}
public static void initLogging(String location, long refreshInterval) throws FileNotFoundException {
    String resolvedLocation = SystemPropertyUtils.resolvePlaceholders(location);
    File file = ResourceUtils.getFile(resolvedLocation);
    if (!file.exists()) {
        throw new FileNotFoundException("Log4j config file [" + resolvedLocation + "] not found");
    }
    if (resolvedLocation.toLowerCase().endsWith(XML_FILE_EXTENSION)) {
        DOMConfigurator.configureAndWatch(file.getAbsolutePath(), refreshInterval);
    }
    else {
        PropertyConfigurator.configureAndWatch(file.getAbsolutePath(), refreshInterval);
    }
}
{code}

I would be happy to submit the change unless there are any objections.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-02-17 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903449#comment-13903449
 ] 

Nikolai Grigoriev commented on CASSANDRA-6285:
--

I have started seeing these too - surprisingly, after adding OpsCenter CE to my 
cluster. I do not see them associated with my own data.

{code}
java.lang.RuntimeException: Last written key DecoratedKey(3542937286762954312, 
31302e332e34352e3135382d676574466c757368657350656e64696e67) >= current
key DecoratedKey(-2152912038130700738, 
31302e332e34352e3135362d77696e7465726d7574655f6a6d657465722d776d5f6170706c69636174696f6e732d676574526563656e744
26c6f6f6d46) writing into 
/hadoop/disk1/cassandra/data/OpsCenter/rollups300/OpsCenter-rollups300-tmp-jb-5055-Data.db
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
{code}

 LCS compaction failing with Exception
 -

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Tyler Hobbs
 Fix For: 2.0.6

 Attachments: compaction_test.py


 After altering everything to LCS the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
 (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key 
 DecoratedKey(1326283851463420237, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
  >= current key DecoratedKey(954210699457429663, 
 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
  writing into 
 /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at 
 org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STC worked to keep the compactions running.
 Especially my own table I would like to move to LCS.
 After a major compaction with STC the move to LCS fails with the same 
 Exception.




[jira] [Created] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-17 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-6716:


 Summary: nodetool scrub constantly fails with RuntimeException 
(Tried to hard link to file that does not exist)
 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz

It seems that recently I have started getting a number of exceptions like 
"File not found" on all Cassandra nodes. Currently I am getting an exception 
like this every couple of seconds on each node, for different keyspaces and CFs.

I have tried to restart the nodes, tried to scrub them. No luck so far. It 
seems that scrub cannot complete on any of these nodes; at some point it fails 
because of the file that it can't find.

On one of the nodes, the "nodetool scrub" command currently fails instantly 
and consistently with this exception:

{code}
# /opt/cassandra/bin/nodetool scrub 
Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
file that does not exist 
/mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
at 
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
at 
org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
at 
org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
at 
org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
at 
org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
{code}

Also I have noticed that the files that are missing are often (or maybe 
always?) referred to in the log as follows:

{quote}
 WARN 00:06:10,597 At level 3, 

[jira] [Updated] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-17 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6716:
-

Attachment: system.log.gz

log from one of the nodes

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at 

[jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-17 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903662#comment-13903662
 ] 

Nikolai Grigoriev commented on CASSANDRA-6716:
--

Yes, but I was not sure if the problem with missing sstables was a consequence 
of that issue.

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes, tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of the file that it can't find.
 On one of the nodes, the "nodetool scrub" command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   

[jira] [Comment Edited] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)

2014-02-17 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903662#comment-13903662
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-6716 at 2/18/14 12:54 AM:


Yes, but I was not sure if the problem with missing sstables was a consequence 
of that issue. And, unlike in that issue, I did not upgrade from 1.2.


was (Author: ngrigoriev):
Yes, but I was not sure if the problem with missing sstables is the consequence 
of that issue.

 nodetool scrub constantly fails with RuntimeException (Tried to hard link to 
 file that does not exist)
 --

 Key: CASSANDRA-6716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 
 1.7
Reporter: Nikolai Grigoriev
 Attachments: system.log.gz


 It seems that recently I have started getting a number of exceptions 
 like "File not found" on all Cassandra nodes. Currently I am getting an 
 exception like this every couple of seconds on each node, for different 
 keyspaces and CFs.
 I have tried to restart the nodes and tried to scrub them. No luck so far. It 
 seems that scrub cannot complete on any of these nodes; at some point it 
 fails because of a file that it can't find.
 On one of the nodes the nodetool scrub command currently fails instantly 
 and consistently with this exception:
 {code}
 # /opt/cassandra/bin/nodetool scrub 
 Exception in thread main java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
   at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
   at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
   at 
 org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
   at 
 

[jira] [Commented] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-09 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867189#comment-13867189
 ] 

Nikolai Grigoriev commented on CASSANDRA-6407:
--

[~xedin] Source patch will be OK too, whichever is simpler for you. We are 
building our Cassandra from source with two patches that are scheduled for 
2.0.5. I do not mind rebuilding another dependency :) Thanks!

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz, system.log.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-09 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867356#comment-13867356
 ] 

Nikolai Grigoriev commented on CASSANDRA-6407:
--

I have tested the updated Thrift server with a single-node cluster using my 
test case and in my larger cluster with my original test - it seems to be 
working correctly now with large responses! Thanks!!!

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz, disruptor-thrift-server-0.3.3-SNAPSHOT.jar, 
 system.log.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6407) CQLSH hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866117#comment-13866117
 ] 

Nikolai Grigoriev commented on CASSANDRA-6407:
--

Some additional details.

I can confirm that the problem is not limited to CQLSH; it can be reproduced 
via CQL/Thrift. That does not surprise me, as I was assuming that is what CQLSH 
uses today.

One of my coworkers pointed out that he did not observe this problem in his 
small single-node cluster, even with larger amounts of data in one response. I 
was curious enough to try it, so I configured a single-node Cassandra 2.0.4 
cluster on a spare Linux machine, loaded my schema there and generated the 
problematic test data set. I could not reproduce the problem, i.e. I was 
getting back a much larger result set than in my larger cluster. After that I 
took my production cassandra.yaml, changed the cluster name to a dummy one, 
reinitialized that single-node cluster with the new config, reloaded the data 
and I could immediately reproduce the problem. To keep a long story short, I 
compared the parameters I had changed in my config with the defaults and 
finally found THE parameter that is clearly responsible for this issue: 
rpc_server_type. If it is set to sync, I can query the larger data set. If it 
is set to hsha, I can only query up to ~256Kb of data before the connection 
gets stuck forever.
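
For reference, the whole comparison comes down to a single cassandra.yaml line; 
a minimal sketch of the two configurations I was switching between (only this 
value differs, everything else is left at the defaults):

{code}
# cassandra.yaml
# rpc_server_type selects the Thrift RPC server implementation:
#   sync - one thread per Thrift connection; large responses come back fine
#   hsha - half synchronous / half asynchronous; responses over ~256Kb get stuck
rpc_server_type: hsha
{code}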

Anything obvious that I am missing about the limitations of hsha? 

 CQLSH hangs forever when querying more than certain amount of data
 --

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Reproduced In: 2.0.4

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Component/s: (was: Tools)
 Core

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Summary: CQL/Thrift request hangs forever when querying more than certain 
amount of data  (was: CQLSH hangs forever when querying more than certain 
amount of data)

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866133#comment-13866133
 ] 

Nikolai Grigoriev commented on CASSANDRA-6407:
--

It sounds somewhat related to:

CASSANDRA-4573

CASSANDRA-6373



 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Attachment: cassandra.yaml

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Attachment: cassandra6407test.cql.gz

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866245#comment-13866245
 ] 

Nikolai Grigoriev commented on CASSANDRA-6407:
--

[~xedin] I have prepared a simple test that does demonstrate the problem even 
in a small single-node cluster. Interestingly enough, with this test and such a 
small cluster with no load at all sometimes it actually works.

So, here is how I use it:

1. Set the RPC server type to hsha
2. Load the attached CQL file
3. Use CQLSH
   use cassandra6407test ;
   select * from my_test_table ;

In most of the cases this SELECT gets stuck forever. Sometimes if you interrupt 
it (after a while) and do it again it actually returns all the data on the 
second attempt. Sometimes it does not. If you restart CQLSH and do it again - 
it will get stuck again. Specifying a LIMIT above 24-25 demonstrates similar 
behavior.

If you switch the RPC server type to sync and restart, then select * from 
my_test_table ; works all the time.

It almost feels like some sort of race condition or a timing issue somewhere 
between the part that produces the query result and the part that streams it 
back to the client.
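
Condensed into commands, the reproduction looks roughly like this (a sketch 
only; I am assuming the attached cassandra6407test.cql.gz is gunzipped first 
and loaded with cqlsh's -f option):

{code}
# cassandra.yaml: switch the RPC server type, then restart the node
rpc_server_type: hsha

# load the attached schema and data, then query it interactively
gunzip cassandra6407test.cql.gz
cqlsh -f cassandra6407test.cql
cqlsh
cqlsh> use cassandra6407test ;
cqlsh> select * from my_test_table ;           -- usually gets stuck with hsha
cqlsh> select * from my_test_table LIMIT 24 ;  -- small limits still return
{code}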

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6407:
-

Attachment: system.log.gz

This is the DEBUG log - I have tried that select * request 3 times after 
restarting the server with the RPC server type set to hsha.
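
For completeness, the DEBUG level was turned on by raising the root logger; a 
minimal sketch, assuming the stock conf/log4j-server.properties that ships with 
Cassandra 2.0.x:

{code}
# conf/log4j-server.properties -- Cassandra 2.0.x still uses log4j
# change the root logger level from INFO to DEBUG and restart the node
log4j.rootLogger=DEBUG,stdout,R
{code}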

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz, system.log.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb for each entity it *seems* like the query 
 hangs when the total size of the response exceeds 252..256Kb. Looks quite 
 suspicious especially because 256Kb is such a particular number. I am 
 wondering if this has something to do with the result paging.
 I did not test if the issue is reproducible outside of CQLSH but I do recall 
 that I observed somewhat similar behavior when fetching relatively large data 
 sets.
 I can consistently reproduce this problem on my cluster. I am also attaching 
 the jstack output that I have captured when CQLSH was hanging on one of these 
 queries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (CASSANDRA-6407) CQL/Thrift request hangs forever when querying more than certain amount of data

2014-01-08 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866245#comment-13866245
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-6407 at 1/9/14 3:21 AM:
--

[~xedin] I have prepared a simple test that does demonstrate the problem even 
in a small single-node cluster. Interestingly enough, with this test and such a 
small cluster with no load at all sometimes it actually works.

So, here is how I use it:

1. Set the RPC server type to hsha
2. Load the attached CQL file
3. Use CQLSH
   use cassandra6407test ;
   select * from my_test_table ;

In most of the cases this SELECT gets stuck forever. Sometimes if you interrupt 
it (after a while) and do it again it actually returns all the data on the 
second attempt. Sometimes it does not. If you restart CQLSH and do it again - 
it will get stuck again. Specifying a LIMIT above 24-25 demonstrates similar 
behavior.

If you switch the RPC server type to sync and restart, then select * from 
my_test_table ; works all the time.

It almost feels like some sort of race condition or a timing issue somewhere 
between the part that produces the query result and the part that streams it 
back to the client.

The server config I have attached is simplified: I have disabled JNA, JEMalloc, 
etc., to have a configuration that is as close as possible to the default 
installation.


was (Author: ngrigoriev):
[~xedin] I have prepared a simple test that does demonstrate the problem even 
in a small single-node cluster. Interestingly enough, with this test and such a 
small cluster with no load at all sometimes it actually works.

So, here is how I use it:

1. Set the RPC server type to hsha
2. Load the attached CQL file
3. Use CQLSH
   use cassandra6407test ;
   select * from my_test_table ;

In most of the cases this SELECT gets stuck forever. Sometimes if you interrupt 
it (after a while) and do it again it actually returns all the data on the 
second attempt. Sometimes it does not. If you restart CQLSH and do it again - 
it will get stuck again. Specifying a LIMIT above 24-25 demonstrates similar 
behavior.

If you switch the RPC server type to sync and restart, then select * from 
my_test_table ; works all the time.

It almost feels like some sort of race condition or a timing issue somewhere 
between the part that produces the query result and the part that streams it 
back to the client.

 CQL/Thrift request hangs forever when querying more than certain amount of 
 data
 ---

 Key: CASSANDRA-6407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Oracle Linux 6.4, JDK 1.7.0_25-b15, Cassandra 2.0.2
Reporter: Nikolai Grigoriev
 Attachments: cassandra.jstack.gz, cassandra.yaml, 
 cassandra6407test.cql.gz, system.log.gz


 I have a table like this (slightly simplified for clarity):
 {code}
 CREATE TABLE my_test_table (
   uid  uuid,
   d_id uuid,
   a_id uuid,  
   c_id text,
   i_id blob,
   data text,
   PRIMARY KEY ((uid, d_id, a_id), c_id, i_id)
 );
 {code}
 I have created a little over a hundred (117 to be specific) sample entities 
 with the same row key and different clustering keys. Each has a blob of 
 approximately 4Kb.
 I have tried to fetch all of them with a query like this via CQLSH:
 {code}
 select * from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 This query simply hangs in CQLSH, it does not return at all until I abort it.
 Then I started playing with LIMIT clause and found that this query returns 
 instantly (with good data) when I use LIMIT 55 but hangs forever when I use 
 LIMIT 56.
 Then I tried to just query all i_id values like this:
 {code}
 select i_id from my_test_table where uid=44338526-7aac-4640-bcde-0f4663c07572 
 and a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2'
 {code}
 And this query returns instantly with the complete set of 117 values. So I 
 started thinking that it must be something about the total size of the 
 response, not the number of results or the number of columns to be fetched in 
 slices. And I have tried another test:
 {code}
 select cdata from my_test_table where 
 uid=44338526-7aac-4640-bcde-0f4663c07572 and 
 a_id=--4000--0002 and 
 d_id=--1e64--0001 and c_id='list-2' LIMIT 63
 {code}
 This query returns instantly but if I change the limit to 64 it hangs 
 forever. Since my blob is about 4Kb 

[jira] [Resolved] (CASSANDRA-6528) TombstoneOverwhelmingException is thrown while populating data in recently truncated CF

2014-01-06 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev resolved CASSANDRA-6528.
--

Resolution: Cannot Reproduce

Closing since I cannot reproduce it anymore. Will reopen if I manage to 
reproduce it again and capture the debug information as per instructions above.

 TombstoneOverwhelmingException is thrown while populating data in recently 
 truncated CF
 ---

 Key: CASSANDRA-6528
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6528
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassadra 2.0.3, Linux, 6 nodes
Reporter: Nikolai Grigoriev
Priority: Minor

 I am running some performance tests and recently I had to flush the data from 
 one of the tables and repopulate it. I have about 30M rows with a few columns 
 in each, about 5kb per row in total. In order to repopulate the data I run 
 truncate table from CQLSH and then relaunch the test. The test simply 
 inserts the data into the table and does not read anything. Shortly after 
 restarting the data generator I see this on one of the nodes:
 {code}
  INFO [HintedHandoff:655] 2013-12-26 16:45:42,185 HintedHandOffManager.java 
 (line 323) Started hinted handoff for host: 985c8a08-3d92-4fad-a1d1-7135b2b9774a 
 with IP: /10.5.45.158
 ERROR [HintedHandoff:655] 2013-12-26 16:45:42,680 SliceQueryFilter.java (line 
 200) Scanned over 10 tombstones; query aborted (see tombstone_fail_threshold)
 ERROR [HintedHandoff:655] 2013-12-26 16:45:42,680 CassandraDaemon.java (line 
 187) Exception in thread Thread[HintedHandoff:655,1,main]
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:56)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
 at 
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
 at 
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
 at 
 org.apache.cassandra.db.HintedHandOffManager.access$4(HintedHandOffManager.java:281)
 at 
 org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
  INFO [OptionalTasks:1] 2013-12-26 16:45:53,946 MeteredFlusher.java (line 63) 
 flushing high-traffic column family CFS(Keyspace='test_jmeter', 
 ColumnFamily='test_profiles') (estimated 192717267 bytes)
 {code}
 I am inserting the data with CL=1.
 It seems to be happening every time I do it. But I do not see any errors on 
 the client side and the node seems to continue operating, which is why I think 
 it is not a major issue. Maybe not an issue at all, but the message is logged 
 as ERROR.
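
 For reference, the threshold that error message points at is a cassandra.yaml 
 setting; a minimal sketch, with the option names and defaults as I recall them 
 for Cassandra 2.0.x:
 {code}
 # cassandra.yaml
 tombstone_warn_threshold: 1000       # warn once a slice scans this many tombstones
 tombstone_failure_threshold: 100000  # abort the query, producing the error above
 {code}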



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

