Re: compaction throughput

2016-01-21 Thread PenguinWhispererThe .
Thanks for that clarification, Sebastian! That's really good to know. I never
considered increasing this value because of my previous experience.

In my case I had a table that was compacting over and over, and only one
CPU was used. That made me believe compaction was not multithreaded (I believe
I asked about this on IRC, but it's been a few months, so I might be wrong).

Has this behavior changed recently? (I was using 2.0.9 or 2.0.11, I believe.)
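
A rough way to check this on a running node (a sketch; it assumes the JVM's
main class is CassandraDaemon and that the hottest thread during a compaction
is the compaction thread):

nodetool compactionstats                  # how many compactions are active
top -H -p "$(pgrep -f CassandraDaemon)"   # per-thread CPU; a single hot thread means one core doing the work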

2016-01-21 14:15 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com
>:

> >So compaction of one table will NOT spread over different cores.
>
> This is not exactly true. You can actually have multiple compactions
> running at the same time on the same table; it just doesn't happen all that
> often. You would essentially need two sets of sstables that are both
> eligible for compaction at the same time.
>
> all the best,
>
> Sebastián
> On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <
> th3penguinwhispe...@gmail.com> wrote:
>
>> After having some issues with compaction myself, I think it's worth stating
>> explicitly that compaction of a single table can only run on one CPU, so
>> compaction of one table will NOT spread over different cores.
>> To really benefit from concurrent_compactors you need multiple table
>> compactions initiated at the same time. If those are small, they'll finish
>> much earlier, leaving only one core at 100%, as compaction is generally CPU
>> bound (unless your disks can't keep up).
>> I believe it's better for compaction to be CPU bound on one core (or at
>> least not on all of them) than disk I/O bound, since being I/O bound would
>> impact read and write performance.
>> Compaction is a maintenance task, so it shouldn't be eating all your
>> resources.
>>
>>
>>
>> 2016-01-16 0:18 GMT+01:00 Kai Wang <dep...@gmail.com>:
>>
>>> Jeff & Sebastian,
>>>
>>> Thanks for the reply. There are 12 cores, but in my case C* only uses one
>>> core most of the time. *nodetool compactionstats* shows there's only
>>> one compactor running, and I can see the C* process only uses one core. So
>>> I guess I should've asked the question more clearly:
>>>
>>> 1. Is ~25 MB/s a reasonable compaction throughput for one core?
>>> 2. Is there any configuration that affects single-core compaction
>>> throughput?
>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>> If so, I guess it's the compaction strategy itself that decides when to
>>> parallelize and when to block on one core. Then there's not much we can do
>>> here.
>>>
>>> Thanks.
>>>
>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>> wrote:
>>>
>>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>>> (depending on other load).  How many CPU cores do you have?
>>>>
>>>>
>>>> From: Kai Wang
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 15, 2016 at 12:53 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: compaction throughput
>>>>
>>>> Hi,
>>>>
>>>> I am trying to figure out the bottleneck of compaction on my node. The
>>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>>> LCS. Here are my compaction-related configs in cassandra.yaml:
>>>>
>>>> compaction_throughput_mb_per_sec: 160
>>>> concurrent_compactors: 4
>>>>
>>>> I insert about 10G of data and start observing compaction.
>>>>
>>>> *nodetool compactionstats* shows that most of the time there is one
>>>> compaction. Sometimes there are 3-4 (I suppose this is controlled by
>>>> concurrent_compactors). During the compaction, I see one CPU core at 100%.
>>>> At that point, disk IO is about 20-25 MB/s of writes, which is much lower
>>>> than the disk is capable of. Even when there are 4 compactions running, I
>>>> see CPU go to +400%, but disk IO is still at 20-25 MB/s of writes. I used
>>>> *nodetool setcompactionthroughput 0* to disable compaction throttling but
>>>> don't see any difference.
>>>>
>>>> Does this mean compaction is CPU bound? If so, 20 MB/s is kind of low. Is
>>>> there any way to improve the throughput?
>>>>
>>>> Thanks.
>>>>
>>>
>>>
>>


Re: Cassandra compaction stuck? Should I disable?

2015-12-02 Thread PenguinWhispererThe .
So it seems I found the problem.

The node opening a stream is waiting for the other node to respond, but that
node never responds due to a broken pipe, which makes Cassandra wait forever.

It's basically this issue:
https://issues.apache.org/jira/browse/CASSANDRA-8472
And this is the workaround/fix:
https://issues.apache.org/jira/browse/CASSANDRA-8611

So:
- update cassandra to >=2.0.11
- add option streaming_socket_timeout_in_ms = 1
- do rolling restart of cassandra
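
Roughly, per node and one node at a time (a sketch; the yaml path and service
name depend on your install, and the timeout value is a placeholder, not a
recommendation):

# set a non-zero timeout in cassandra.yaml, e.g.:
#   streaming_socket_timeout_in_ms: <some non-zero value>
nodetool drain
service cassandra restart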

What's weird is that the "IOException: Broken pipe" is never shown in my logs
(on any node), and my logging is set to INFO in the log4j config.
I have this config in log4j-server.properties:
# output messages into a rolling log file as well as stdout
log4j.rootLogger=INFO,stdout,R

# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n

# rolling log file
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.maxFileSize=20MB
log4j.appender.R.maxBackupIndex=50
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
# Edit the next line to point to your logs directory
log4j.appender.R.File=/var/log/cassandra/system.log

# Application logging options
#log4j.logger.org.apache.cassandra=DEBUG
#log4j.logger.org.apache.cassandra.db=DEBUG
#log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG

# Adding this to avoid thrift logging disconnect errors.
log4j.logger.org.apache.thrift.server.TNonblockingServer=ERROR
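
If you want the stream-level errors (like the broken pipe) to actually show
up, a hedged sketch is to raise the log level for the streaming package; I'm
assuming the package name org.apache.cassandra.streaming here:

log4j.logger.org.apache.cassandra.streaming=DEBUG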

Too bad nobody else could point me to those. Hopefully this saves someone else
from wasting a lot of time.

2015-11-11 15:42 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com
>:

> Use 'nodetool compactionhistory'
>
> all the best,
>
> Sebastián
> On Nov 11, 2015 3:23 AM, "PenguinWhispererThe ." <
> th3penguinwhispe...@gmail.com> wrote:
>
>> Does compactionstats show only stats for completed compactions (100%)?
>> It might be that the compaction is running constantly, over and over again.
>> In that case I need to know what I can do to stop this constant
>> compaction so I can start a nodetool repair.
>>
>> Note that there is a lot of traffic on this columnfamily so I'm not sure
>> if temporarily disabling compaction is an option. The repair will probably
>> take long as well.
>>
>> Sebastian and Rob: might you have any more ideas about the things I
>> put in this thread? Any help is appreciated!
>>
>> 2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com>:
>>
>>> Hi Sebastian,
>>>
>>> Thanks for your response.
>>>
>>> No swap is used. No offense, I just don't see a reason why having swap
>>> would be the issue here. I put swappiness at 1. I also have jna installed.
>>> That should prevent java being swapped out as well, AFAIK.
>>>
>>>
>>> 2015-11-10 19:50 GMT+01:00 Sebastian Estevez <
>>> sebastian.este...@datastax.com>:
>>>
>>>> Turn off Swap.
>>>>
>>>>
>>>> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>>>>
>>>>
>>>> All the best,
>>>>
>>>>
>>>> Sebastián Estévez
>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>>
>>>>

Re: Cassandra compaction stuck? Should I disable?

2015-11-11 Thread PenguinWhispererThe .
Does compactionstats show only stats for completed compactions (100%)? It
might be that the compaction is running constantly, over and over again.
In that case I need to know what I can do to stop this constant compaction
so I can start a nodetool repair.

Note that there is a lot of traffic on this columnfamily, so I'm not sure if
temporarily disabling compaction is an option. The repair will probably take
long as well.

Sebastian and Rob: might you have any more ideas about the things I put
in this thread? Any help is appreciated!

2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <
th3penguinwhispe...@gmail.com>:

> Hi Sebastian,
>
> Thanks for your response.
>
> No swap is used. No offense, I just don't see a reason why having swap
> would be the issue here. I put swappiness at 1. I also have jna installed.
> That should prevent java being swapped out as well, AFAIK.
>
>
> 2015-11-10 19:50 GMT+01:00 Sebastian Estevez <
> sebastian.este...@datastax.com>:
>
>> Turn off Swap.
>>
>>
>> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com> wrote:
>>
>>> I also have the following memory usage:
>>> [root@US-BILLINGDSX4 cassandra]# free -m
>>>              total       used       free     shared    buffers     cached
>>> Mem:         12024       9455       2569          0        110       2163
>>> -/+ buffers/cache:        7180       4844
>>> Swap:         2047          0       2047
>>>
>>> Still a lot free and a lot of free buffers/cache.
>>>
>>> 2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com>:
>>>
>>>> Still stuck with this. However I enabled GC logging. This shows the
>>>> following:
>>>>
>>>> [root@myhost cassandra]# tail -f gc-1447180680.log
>>>> 2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K), 0.0199040 secs]
>>>> 2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K), 0.0221910 secs]
>>>> 2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K), 0.0222140 secs]
>>>> 2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K), 0.0224140 secs]
>>>> 2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K), 0.0302520 secs]
>>>> 2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K), 0.0280530 secs]
>>>> 2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K), 0.0218000 secs]
>>>> 2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K), 0.0292470 secs]
>>>> 2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K), 0.0288740 secs]
>>>> 2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K), 0.0283380 secs]
>>>> 2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K), 0.0218690 secs]
>>>> 2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K), 0.0278810 secs]
>>>>
>>>> This is a VM with 12GB of RAM. Raised the HEAP_SIZE to 6GB

Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread PenguinWhispererThe .
I also have the following memory usage:
[root@US-BILLINGDSX4 cassandra]# free -m
             total       used       free     shared    buffers     cached
Mem:         12024       9455       2569          0        110       2163
-/+ buffers/cache:        7180       4844
Swap:         2047          0       2047

Still a lot free and a lot of free buffers/cache.

2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <
th3penguinwhispe...@gmail.com>:

> Still stuck with this. However I enabled GC logging. This shows the
> following:
>
> [root@myhost cassandra]# tail -f gc-1447180680.log
> 2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K), 0.0199040 secs]
> 2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K), 0.0221910 secs]
> 2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K), 0.0222140 secs]
> 2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K), 0.0224140 secs]
> 2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K), 0.0302520 secs]
> 2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K), 0.0280530 secs]
> 2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K), 0.0218000 secs]
> 2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K), 0.0292470 secs]
> 2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K), 0.0288740 secs]
> 2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K), 0.0283380 secs]
> 2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K), 0.0218690 secs]
> 2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K), 0.0278810 secs]
>
> This is a VM with 12GB of RAM. Raised the HEAP_SIZE to 6GB and
> HEAP_NEWSIZE to 800MB.
>
> Still the same result.
>
> This looks very similar to following issue:
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAJ=3xgRLsvpnZe0uXEYjG94rKhfXeU+jBR=q3a-_c3rsdd5...@mail.gmail.com%3E
>
> Is the only possibility to upgrade memory? I mean, I can't believe it's
> just loading all its data in memory. That would mean having to keep scaling up
> the node to keep it working?
>
>
> 2015-11-10 9:36 GMT+01:00 PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com>:
>
>> Correction...
>> I was grepping for Segmentation in the strace output and it happens a lot.
>>
>> Do I need to run a scrub?
>>
>> 2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com>:
>>
>>> Hi Rob,
>>>
>>> Thanks for your reply.
>>>
>>> 2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>>>
>>>> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <
>>>> th3penguinwhispe...@gmail.com> wrote:
>>>>>
>>>>> In Opscenter I see one of the nodes is orange. It seems like it's
>>>>> working on compaction. I used nodetool compactionstats and whenever I did
>>>>> this the Completed and percentage stays the same (even with hours in
>>>>> between).
>>>>>
>>>> Are you the same person from IRC, or a second report today of
>>>> compaction hanging in this way?
>>>>
>>> Same person ;) I just didn't have much to work with from the chat there.
>>> I want to understand the issue more, see what I can tune or fix. I want to
>>> do nodetool repair before upgrading to 2.1.11 but the compaction is
>>> blocking it.
>>>
>>>>
>>>>
>>>>
>>> What version of Cassandra?
>>>>
>>> 2.0.9
>>>
>>>> I currently don't see cpu load from cassandra on that node. So it seems
>>>>> stuck (somewhere mid 60%). Also some other nodes have compaction on the
>>>>> same columnfamily. I don't see any progress.
>>>>>
>>>>>  WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 
>>>>> ColumnFamilyStore.java (line 2101) Unable to cancel in-progress 
>>>>> compactions for usage_record_ptd.  Probably there is an unusually large 
>>>>> row in progress somewhere.  It is also possible that buggy code left some 
>>>>> sstables compacting after it was done with them
>>>>>
>>>>>
>>>>>- How can I assure that nothing is happening?
>>>>>
>>>>> Find the thread that is doing compaction and strace it. Generally it
>>>> is one of the threads with a lower thread priority.
>>>

Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread PenguinWhispererThe .
Still stuck with this. However I enabled GC logging. This shows the
following:

[root@myhost cassandra]# tail -f gc-1447180680.log
2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K), 0.0199040 secs]
2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K), 0.0221910 secs]
2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K), 0.0222140 secs]
2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K), 0.0224140 secs]
2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K), 0.0302520 secs]
2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K), 0.0280530 secs]
2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K), 0.0218000 secs]
2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K), 0.0292470 secs]
2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K), 0.0288740 secs]
2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K), 0.0283380 secs]
2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K), 0.0218690 secs]
2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K), 0.0278810 secs]

This is a VM with 12GB of RAM. I raised the HEAP_SIZE to 6GB and
HEAP_NEWSIZE to 800MB.
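
For reference, a sketch of where I changed this, assuming the packaged
cassandra-env.sh (the gc-<epoch>.log file name above matches the usual -Xloggc
pattern; the exact GC flags vary by version):

MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="800M"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"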

Still the same result.

This looks very similar to following issue:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAJ=3xgRLsvpnZe0uXEYjG94rKhfXeU+jBR=q3a-_c3rsdd5...@mail.gmail.com%3E

Is adding memory the only option? I mean, I can't believe it's just loading
all its data into memory. That would mean having to keep scaling up the node
to keep it working.


2015-11-10 9:36 GMT+01:00 PenguinWhispererThe . <
th3penguinwhispe...@gmail.com>:

> Correction...
> I was grepping for Segmentation in the strace output and it happens a lot.
>
> Do I need to run a scrub?
>
> 2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com>:
>
>> Hi Rob,
>>
>> Thanks for your reply.
>>
>> 2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>>
>>> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com> wrote:
>>>>
>>>> In Opscenter I see one of the nodes is orange. It seems like it's
>>>> working on compaction. I used nodetool compactionstats and whenever I did
>>>> this the Completed and percentage stays the same (even with hours in
>>>> between).
>>>>
>>> Are you the same person from IRC, or a second report today of compaction
>>> hanging in this way?
>>>
>> Same person ;) I just didn't have much to work with from the chat there. I
>> want to understand the issue more, see what I can tune or fix. I want to do
>> nodetool repair before upgrading to 2.1.11 but the compaction is blocking
>> it.
>>
>>>
>>>
>>>
>> What version of Cassandra?
>>>
>> 2.0.9
>>
>>> I currently don't see cpu load from cassandra on that node. So it seems
>>>> stuck (somewhere mid 60%). Also some other nodes have compaction on the
>>>> same columnfamily. I don't see any progress.
>>>>
>>>>  WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 
>>>> ColumnFamilyStore.java (line 2101) Unable to cancel in-progress 
>>>> compactions for usage_record_ptd.  Probably there is an unusually large 
>>>> row in progress somewhere.  It is also possible that buggy code left some 
>>>> sstables compacting after it was done with them
>>>>
>>>>
>>>>- How can I assure that nothing is happening?
>>>>
>>>> Find the thread that is doing compaction and strace it. Generally it is
>>> one of the threads with a lower thread priority.
>>>
>>
>> I have 141 threads. Not sure if that's normal.
>>
>> This seems to be the one:
>>  61404 cassandr  24   4 8948m 4.3g 820m R 90.2 36.8 292:54.47 java
>>
>> In the strace I see basically this part repeating (with once in a while
>> the "resource temporarily unavailable"):
>> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
>> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
>> getpriority(PRIO_PROCESS, 61404)= 16
>> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
>> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 0
>> futex(0x1233854, FUTEX_WAIT_PRIVATE, 494045, NULL) = -1 EAGAIN (Resource
>> temporarily unavailable)
>> 

Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread PenguinWhispererThe .
Hi Sebastian,

Thanks for your response.

No swap is used. No offense, but I just don't see why swap would be the issue
here. I put swappiness at 1, and I also have JNA installed. That should
prevent Java from being swapped out as well, AFAIK.
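
Concretely, what I checked (a sketch):

cat /proc/sys/vm/swappiness    # 1 on this node
swapon -s                      # confirms swap is not being used
# with JNA present, Cassandra mlockall()s the heap, so it shouldn't get swapped out anyway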


2015-11-10 19:50 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com
>:

> Turn off Swap.
>
>
> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>
>
> All the best,
>
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com> wrote:
>
>> I also have the following memory usage:
>> [root@US-BILLINGDSX4 cassandra]# free -m
>>              total       used       free     shared    buffers     cached
>> Mem:         12024       9455       2569          0        110       2163
>> -/+ buffers/cache:        7180       4844
>> Swap:         2047          0       2047
>>
>> Still a lot free and a lot of free buffers/cache.
>>
>> 2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com>:
>>
>>> Still stuck with this. However I enabled GC logging. This shows the
>>> following:
>>>
>>> [root@myhost cassandra]# tail -f gc-1447180680.log
>>> 2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K), 0.0199040 secs]
>>> 2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K), 0.0221910 secs]
>>> 2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K), 0.0222140 secs]
>>> 2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K), 0.0224140 secs]
>>> 2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K), 0.0302520 secs]
>>> 2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K), 0.0280530 secs]
>>> 2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K), 0.0218000 secs]
>>> 2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K), 0.0292470 secs]
>>> 2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K), 0.0288740 secs]
>>> 2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K), 0.0283380 secs]
>>> 2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K), 0.0218690 secs]
>>> 2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K), 0.0278810 secs]
>>>
>>> This is a VM with 12GB of RAM. Raised the HEAP_SIZE to 6GB and
>>> HEAP_NEWSIZE to 800MB.
>>>
>>> Still the same result.
>>>
>>> This looks very similar to following issue:
>>>
>>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAJ=3xgRLsvpnZe0uXEYjG94rKhfXeU+jBR=q3a-_c3rsdd5...@mail.gmail.com%3E
>>>
>>> Is the only possibility to upgrade memory? I mean, I can't believe it's
>>> just loading all its data in memory. That would mean having to keep scaling up
>>> the node to keep it working?
>>>
>>>
>>> 2015-11-10 9:36 GMT+01:00 PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com>:
>>>
>>>> Correction...
>>>> I was grepping for Segmentation in the strace output and it happens a lot.
>>>>
>>>> Do I need to run a scrub?
>>>>
>>>> 2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <
>>>> th3penguinwhispe...@gmail.com&

Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread PenguinWhispererThe .
Hi Rob,

Thanks for your reply.

2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>:

> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com> wrote:
>>
>> In Opscenter I see one of the nodes is orange. It seems like it's working
>> on compaction. I used nodetool compactionstats and whenever I did this the
>> Completed and percentage stays the same (even with hours in between).
>>
> Are you the same person from IRC, or a second report today of compaction
> hanging in this way?
>
Same person ;) I just didn't have much to work with from the chat there. I
want to understand the issue more and see what I can tune or fix. I want to
run nodetool repair before upgrading to 2.1.11, but the compaction is
blocking it.

>
>
>
What version of Cassandra?
>
2.0.9

> I currently don't see cpu load from cassandra on that node. So it seems
>> stuck (somewhere mid 60%). Also some other nodes have compaction on the
>> same columnfamily. I don't see any progress.
>>
>>  WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 
>> ColumnFamilyStore.java (line 2101) Unable to cancel in-progress compactions 
>> for usage_record_ptd.  Probably there is an unusually large row in progress 
>> somewhere.  It is also possible that buggy code left some sstables 
>> compacting after it was done with them
>>
>>
>>- How can I assure that nothing is happening?
>>
>> Find the thread that is doing compaction and strace it. Generally it is
> one of the threads with a lower thread priority.
>

I have 141 threads. Not sure if that's normal.

This seems to be the one:
 61404 cassandr  24   4 8948m 4.3g 820m R 90.2 36.8 292:54.47 java
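
For reference, roughly how I found and traced it (a sketch; the pgrep pattern
is an assumption and 61404 is the thread id on my node):

top -H -p "$(pgrep -f CassandraDaemon)"    # the busiest thread turned out to be 61404
strace -tt -p 61404 -o /tmp/compaction-thread.strace
jstack "$(pgrep -f CassandraDaemon)" | grep -i "nid=0x$(printf '%x' 61404)"   # map the TID to the Java thread name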

In the strace I see basically this part repeating (with an occasional
"Resource temporarily unavailable"):
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
getpriority(PRIO_PROCESS, 61404)= 16
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494045, NULL) = -1 EAGAIN (Resource
temporarily unavailable)
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494047, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
getpriority(PRIO_PROCESS, 61404)= 16
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494049, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
getpriority(PRIO_PROCESS, 61404)= 16

But wait!
I also see this:
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494055, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---

This doesn't seem to happen that often though.

>
> Compaction often appears hung when decompressing a very large row, but
> usually not for "hours".
>
>>
>>- Is it recommended to disable compaction from a certain data size?
>>(I believe 25GB on each node).
>>
>> It is almost never recommended to disable compaction.
>
>>
>>- Can I stop this compaction? nodetool stop compaction doesn't seem
>>to work.
>>
>> Killing the JVM ("the dungeon collapses!") would certainly stop it, but
> it'd likely just start again when you restart the node.
>
>>
>>- Is stopping the compaction dangerous?
>>
>>  Not if you're in a version that properly cleans up partial compactions,
> which is most of them.
>
>>
>>- Is killing the cassandra process dangerous while compacting(I did
>>nodetool drain on one node)?
>>
>> No. But probably nodetool drain couldn't actually stop the in-progress
> compaction either, FWIW.
>
>> This is output of nodetool compactionstats grepped for the keyspace that
>> seems stuck.
>>
>> Do you have gigantic rows in that keyspace? What does cfstats say about
> the largest row compaction has seen/do you have log messages about
> compacting large rows?
>

I don't know about the gigantic rows. How can I check?
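
A sketch of what I'd look at to answer that (keyspace/table names are
placeholders; older nodetool versions may not accept the keyspace.table
argument, in which case grep the full cfstats output):

nodetool cfstats mykeyspace.mycolumnfamily | grep -i 'Compacted row'
grep -i 'large row' /var/log/cassandra/system.log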

I've checked the logs and f

Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread PenguinWhispererThe .
Correction...
I was grepping for Segmentation in the strace output and it happens a lot.

Do I need to run a scrub?
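
If a scrub does turn out to be needed, a sketch of the invocation
(keyspace/table names are placeholders):

nodetool scrub mykeyspace mycolumnfamily
# or, with the node stopped:
sstablescrub mykeyspace mycolumnfamily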

2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <
th3penguinwhispe...@gmail.com>:

> Hi Rob,
>
> Thanks for your reply.
>
> 2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>
>> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com> wrote:
>>>
>>> In Opscenter I see one of the nodes is orange. It seems like it's
>>> working on compaction. I used nodetool compactionstats and whenever I did
>>> this the Completed and percentage stays the same (even with hours in
>>> between).
>>>
>> Are you the same person from IRC, or a second report today of compaction
>> hanging in this way?
>>
> Same person ;) I just didn't have much to work with from the chat there. I
> want to understand the issue more, see what I can tune or fix. I want to do
> nodetool repair before upgrading to 2.1.11 but the compaction is blocking
> it.
>
>>
>>
>>
> What version of Cassandra?
>>
> 2.0.9
>
>> I currently don't see cpu load from cassandra on that node. So it seems
>>> stuck (somewhere mid 60%). Also some other nodes have compaction on the
>>> same columnfamily. I don't see any progress.
>>>
>>>  WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 
>>> ColumnFamilyStore.java (line 2101) Unable to cancel in-progress compactions 
>>> for usage_record_ptd.  Probably there is an unusually large row in progress 
>>> somewhere.  It is also possible that buggy code left some sstables 
>>> compacting after it was done with them
>>>
>>>
>>>- How can I assure that nothing is happening?
>>>
>>> Find the thread that is doing compaction and strace it. Generally it is
>> one of the threads with a lower thread priority.
>>
>
> I have 141 threads. Not sure if that's normal.
>
> This seems to be the one:
>  61404 cassandr  24   4 8948m 4.3g 820m R 90.2 36.8 292:54.47 java
>
> In the strace I see basically this part repeating (with once in a while
> the "resource temporarily unavailable"):
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
> getpriority(PRIO_PROCESS, 61404)= 16
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x1233854, FUTEX_WAIT_PRIVATE, 494045, NULL) = -1 EAGAIN (Resource
> temporarily unavailable)
> futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
> futex(0x1233854, FUTEX_WAIT_PRIVATE, 494047, NULL) = 0
> futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
> getpriority(PRIO_PROCESS, 61404)= 16
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
> futex(0x1233854, FUTEX_WAIT_PRIVATE, 494049, NULL) = 0
> futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
> getpriority(PRIO_PROCESS, 61404)= 16
>
> But wait!
> I also see this:
> futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x1233854, FUTEX_WAIT_PRIVATE, 494055, NULL) = 0
> futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>
> This doesn't seem to happen that often though.
>
>>
>> Compaction often appears hung when decompressing a very large row, but
>> usually not for "hours".
>>
>>>
>>>- Is it recommended to disable compaction from a certain data size?
>>>(I believe 25GB on each node).
>>>
>>> It is almost never recommended to disable compaction.
>>
>>>
>>>- Can I stop this compaction? nodetool stop compaction doesn't seem
>>>to work.
>>>
>>> Killing the JVM ("the dungeon collapses!") would certainly stop it, but
>> it'd likely just start again when you restart the node.
>>
>>>
>>>- Is stopping the compaction dangerous?
>>>
>>>  Not if you're in a version that properly cleans up

Fwd: Cassandra compaction stuck? Should I disable?

2015-11-09 Thread PenguinWhispererThe .
Hi all,

In OpsCenter I see one of the nodes is orange. It seems like it's working
on compaction. I used nodetool compactionstats, and whenever I did, the
Completed value and percentage stayed the same (even with hours in between). I
currently don't see CPU load from Cassandra on that node, so it seems stuck
(somewhere in the mid-60% range). Some other nodes also have compactions on
the same columnfamily. I don't see any progress.

 WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677
ColumnFamilyStore.java (line 2101) Unable to cancel in-progress
compactions for usage_record_ptd.  Probably there is an unusually
large row in progress somewhere.  It is also possible that buggy code
left some sstables compacting after it was done with them


   - How can I confirm whether anything is actually happening?
   - Is it recommended to disable compaction above a certain data size? (I
   believe about 25GB on each node.)
   - Can I stop this compaction? nodetool stop compaction doesn't seem to
   work (see the sketch after this list).
   - Is stopping the compaction dangerous?
   - Is killing the Cassandra process dangerous while compacting (I did
   nodetool drain on one node)?
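
On the "can I stop this compaction" question, a sketch of the syntax I'd try;
nodetool stop takes the operation type as an argument, in upper case:

nodetool stop COMPACTION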


This is output of nodetool compactionstats grepped for the keyspace that
seems stuck.

4e48f940-86c6-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447062197972 52321301
16743606   {1:2, 4:248}
94acec50-86c8-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447063175061 48992375
13420862   {3:3, 4:245}
3210c9b0-8707-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447090067915 52763216
17732003   {1:2, 4:248}
24f96fe0-86ce-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447065564638 44909171
17029440   {1:2, 3:39, 4:209}
06d58370-86ef-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447079687463 53570365
17873962   {1:2, 3:2, 4:246}
f7aa5fa0-86c7-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447062911642 47701016
13291915   {3:2, 4:246}
806a4380-86f7-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447083327416 52644411
17363023   {1:2, 2:1, 4:247}
c845b900-86c5-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447061973136 48944530
16698191   {1:2, 3:6, 4:242}
bb44a0b0-8718-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447097599547 48768463
13518523   {2:2, 3:5, 4:242}
f2c17ea0-86c3-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447061185418 90367799
13904914   {5:4, 6:7, 7:52, 8:185}
1aae6590-86ce-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447065547369 53190698
17228121   {1:2, 4:248}
d7ca8d00-86d5-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447068871120 52422499
16995963   {1:2, 3:3, 4:245}
6e890290-86df-11e5-96be-dd3c9e46ec74 mykeyspace
mycolumnfamily 1447072989497 45218168
17174468   {1:2, 3:21, 4:227}

I also frequently see lines like this in system.log:

WARN [Native-Transport-Requests:11935] 2015-11-09 20:10:41,886
BatchStatement.java (line 223) Batch of prepared statements for
[billing.usage_record_by_billing_period, billing.metric] is of size
53086, exceeding specified threshold of 5120 by 47966.
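
As a side note on that warning (a hedged sketch): the 5120 threshold comes
from a cassandra.yaml knob, I believe batch_size_warn_threshold_in_kb, which
defaults to 5 (KB):

grep batch_size_warn_threshold_in_kb /etc/cassandra/conf/cassandra.yaml
# batch_size_warn_threshold_in_kb: 5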


Any other remarks? Thanks a lot in advance!