Re: compaction throughput
Thanks for that clarification, Sebastian! That's really good to know. I never considered increasing this value because of my previous experience: I had a table that was compacting over and over, and only one CPU was used. That made me believe compaction was not multithreaded (I actually believe I asked about this on IRC, but it's been a few months, so I might be wrong). Have there been behavioral changes in this area lately? (I was using 2.0.9 or 2.0.11, I believe.)

2016-01-21 14:15 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com>:

> >So compaction of one table will NOT spread over different cores.
>
> This is not exactly true. You actually can have multiple compactions
> running at the same time on the same table; it just doesn't happen all
> that often. You would essentially have to have two sets of sstables that
> are both eligible for compaction at the same time.
>
> all the best,
>
> Sebastián
>
> On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <th3penguinwhispe...@gmail.com> wrote:
>
>> After having some issues with compaction myself, I think it's noteworthy
>> to explicitly state that compaction of a table can only run on one CPU.
>> So compaction of one table will NOT spread over different cores.
>> To really make use of concurrent_compactors you need multiple table
>> compactions initiated at the same time. If those are small, they'll
>> finish much earlier, leaving only one core at 100%, since compaction is
>> generally CPU bound (unless your disks can't keep up).
>> I believe it's better for compaction to be CPU bound on one core (or at
>> least not all of them) than disk IO bound, as the latter would hurt the
>> performance of reads and writes. Compaction is a maintenance task, so it
>> shouldn't be eating all your resources.
>>
>> 2016-01-16 0:18 GMT+01:00 Kai Wang <dep...@gmail.com>:
>>
>>> Jeff & Sebastian,
>>>
>>> Thanks for the reply. There are 12 cores, but in my case C* only uses
>>> one core most of the time. *nodetool compactionstats* shows there's
>>> only one compactor running, and I can see the C* process using only one
>>> core. So I guess I should've asked the question more clearly:
>>>
>>> 1. Is ~25 MB/s a reasonable compaction throughput for one core?
>>> 2. Is there any configuration that affects single-core compaction
>>> throughput?
>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>> If so, I guess it's the compaction strategy itself that decides when to
>>> parallelize and when to block on one core, and there's not much we can
>>> do here.
>>>
>>> Thanks.
>>>
>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>
>>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per
>>>> core (depending on other load). How many CPU cores do you have?
>>>>
>>>> From: Kai Wang
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 15, 2016 at 12:53 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: compaction throughput
>>>>
>>>> Hi,
>>>>
>>>> I am trying to figure out the bottleneck of compaction on my node.
>>>> The node is CentOS 7 and has SSDs installed. The table is configured
>>>> to use LCS. Here are my compaction-related settings in cassandra.yaml:
>>>>
>>>> compaction_throughput_mb_per_sec: 160
>>>> concurrent_compactors: 4
>>>>
>>>> I insert about 10 GB of data and start observing compaction.
>>>>
>>>> *nodetool compactionstats* shows that most of the time there is one
>>>> compaction running. Sometimes there are 3-4 (I suppose this is
>>>> controlled by concurrent_compactors). During compaction, I see one CPU
>>>> core at 100%. At that point, disk IO is about 20-25 MB/s of writes,
>>>> which is much lower than the disk is capable of. Even when there are 4
>>>> compactions running, CPU goes above 400% but disk IO stays at 20-25
>>>> MB/s of writes. I used *nodetool setcompactionthroughput 0* to disable
>>>> compaction throttling but don't see any difference.
>>>>
>>>> Does this mean compaction is CPU bound? If so, 20 MB/s per core is
>>>> kind of low. Is there any way to improve the throughput?
>>>>
>>>> Thanks.
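A quick way to confirm Kai's diagnosis on any node is to watch per-thread CPU next to the compaction stats. A rough sketch (assumes Linux and that the Cassandra process matches "CassandraDaemon"; adjust to your setup):

    # one Java thread pinned near 100% while compactionstats shows a
    # single compaction running means that compaction is CPU bound
    top -H -p "$(pgrep -f CassandraDaemon)"

    # remove the rate limit and see whether throughput moves at all
    nodetool setcompactionthroughput 0
    nodetool compactionstats

If disk IO stays well below what the disks can deliver while that one thread stays pinned, raising compaction_throughput_mb_per_sec won't help that single large compaction; only additional compactions running in parallel can use the extra cores.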
Re: Cassandra compaction stuck? Should I disable?
So it seems I found the problem. The node opening a stream is waiting for the other node to respond, but that node never responds due to a broken pipe, which makes Cassandra wait forever. It's basically this issue:
https://issues.apache.org/jira/browse/CASSANDRA-8472
And this is the workaround/fix:
https://issues.apache.org/jira/browse/CASSANDRA-8611

So:
- update Cassandra to >=2.0.11
- add the option streaming_socket_timeout_in_ms = 1
- do a rolling restart of Cassandra

What's weird is that the "IOException: Broken pipe" is never shown in my logs (not on any node), even though my logging is set to INFO in the log4j config. I have this config in log4j-server.properties:

# output messages into a rolling log file as well as stdout
log4j.rootLogger=INFO,stdout,R

# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n

# rolling log file
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.maxFileSize=20MB
log4j.appender.R.maxBackupIndex=50
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
# Edit the next line to point to your logs directory
log4j.appender.R.File=/var/log/cassandra/system.log

# Application logging options
#log4j.logger.org.apache.cassandra=DEBUG
#log4j.logger.org.apache.cassandra.db=DEBUG
#log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG

# Adding this to avoid thrift logging disconnect errors.
log4j.logger.org.apache.thrift.server.TNonblockingServer=ERROR

Too bad nobody else could point to those. Hope this helps someone else avoid wasting a lot of time.

2015-11-11 15:42 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com>:

> Use 'nodetool compactionhistory'
>
> all the best,
>
> Sebastián
>
> On Nov 11, 2015 3:23 AM, "PenguinWhispererThe ." <th3penguinwhispe...@gmail.com> wrote:
>
>> Does compactionstats show only stats for completed compactions (100%)?
>> It might be that the compaction is running constantly, over and over
>> again. [...]
>>
>> 2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:
>>
>>> Hi Sebastian,
>>>
>>> Thanks for your response.
>>>
>>> No swap is used. [...]
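For anyone applying the same workaround, the per-node steps amount to something like this (a sketch only: the paths assume a package install, the service command varies by distro, and the timeout value below is a placeholder you should choose for your environment):

    # 1. set/verify the option in cassandra.yaml on the node
    grep streaming_socket_timeout_in_ms /etc/cassandra/conf/cassandra.yaml
    #   streaming_socket_timeout_in_ms: <non-zero timeout in ms>

    # 2. restart the node cleanly (rolling: one node at a time)
    nodetool drain
    sudo service cassandra restart

    # 3. before moving to the next node, check no streams are stuck
    nodetool netstats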
Re: Cassandra compaction stuck? Should I disable?
Does compactionstats show only stats for completed compactions (100%)? It might be that the compaction is running constantly, over and over again. In that case I need to know what I can do to stop this constant compaction so I can start a nodetool repair.

Note that there is a lot of traffic on this columnfamily, so I'm not sure if temporarily disabling compaction is an option. The repair will probably take long as well.

Sebastian and Rob: do you have any more ideas about the things I put in this thread? Any help is appreciated!

2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:

> Hi Sebastian,
>
> Thanks for your response.
>
> No swap is used. [...]
>
> 2015-11-10 19:50 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com>:
>
>> Turn off Swap.
>>
>> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>> [...]
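One way to tell a hung compaction from one that keeps re-running is simply to sample the stats over time: a progressing compaction keeps moving its completed-bytes counter, while a re-running one keeps starting over from zero on the same table. A sketch:

    # sample every 30s and compare the 'completed' column between samples
    while true; do date; nodetool compactionstats; sleep 30; done

And 'nodetool compactionhistory', as Sebastián suggests above, lists finished compactions, so the same table showing up there again and again points at constant re-compaction rather than a hang.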
Re: Cassandra compaction stuck? Should I disable?
I also have the following memory usage:

[root@US-BILLINGDSX4 cassandra]# free -m
             total       used       free     shared    buffers     cached
Mem:         12024       9455       2569          0        110       2163
-/+ buffers/cache:        7180       4844
Swap:         2047          0       2047

Still a lot free, and a lot of free buffers/cache.

2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:

> Still stuck with this. However, I enabled GC logging. [...]
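A note on reading that free output: the "-/+ buffers/cache" row is the one that matters for applications, so this node still has roughly 4.8 GB genuinely available; the kernel reclaims buffers/cache on demand. To check whether the JVM heap (rather than system RAM) is the squeezed resource, something like this works (a sketch, assuming the JDK's jstat is on the path and the PID can be found this way):

    # old-gen occupancy (O, %) and cumulative GC time (GCT, s),
    # sampled every second, ten times
    jstat -gcutil "$(pgrep -f CassandraDaemon)" 1000 10

Old gen parked near 100% with GCT climbing fast would indicate heap pressure; numbers like the GC log in the next message suggest the opposite.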
Re: Cassandra compaction stuck? Should I disable?
Still stuck with this. However, I enabled GC logging, which shows the following:

[root@myhost cassandra]# tail -f gc-1447180680.log
2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K), 0.0199040 secs]
2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K), 0.0221910 secs]
2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K), 0.0222140 secs]
2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K), 0.0224140 secs]
2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K), 0.0302520 secs]
2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K), 0.0280530 secs]
2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K), 0.0218000 secs]
2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K), 0.0292470 secs]
2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K), 0.0288740 secs]
2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K), 0.0283380 secs]
2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K), 0.0218690 secs]
2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K), 0.0278810 secs]

This is a VM with 12GB of RAM. I raised HEAP_SIZE to 6GB and HEAP_NEWSIZE to 800MB.

Still the same result.

This looks very similar to the following issue:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAJ=3xgRLsvpnZe0uXEYjG94rKhfXeU+jBR=q3a-_c3rsdd5...@mail.gmail.com%3E

Is upgrading memory the only option? I mean, I can't believe it's just loading all its data into memory. That would mean having to keep scaling up the node to keep it working.

2015-11-10 9:36 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:

> Correction...
> I was grepping for Segmentation in the strace, and it happens a lot.
>
> Do I need to run a scrub?
>
> 2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:
>
>> Hi Rob,
>>
>> Thanks for your reply. [...]
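For what it's worth, those GC lines look healthy: each is a young-generation collection pausing for ~20-30 ms, with the heap settling around 2 GB of the 6 GB available, so the collector isn't thrashing and memory doesn't look like what's keeping compaction stuck. A rough way to total the pauses across the log (a sketch matched to this log format):

    # sum the pause durations (next-to-last field of each [GC line)
    awk '/\[GC/ {sum += $(NF-1)} END {printf "total GC pause: %.2f s\n", sum}' gc-1447180680.log

If the total is a tiny fraction of the JVM's uptime, GC can be ruled out and the strace route discussed below is the better lead.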
Re: Cassandra compaction stuck? Should I disable?
Hi Sebastian,

Thanks for your response.

No swap is used. No offense, but I just don't see a reason why swap would be the issue here. I set swappiness to 1, and I also have JNA installed; that should prevent Java from being swapped out as well, AFAIK.

2015-11-10 19:50 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com>:

> Turn off Swap.
>
> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>
> On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <th3penguinwhispe...@gmail.com> wrote:
>
>> I also have the following memory usage:
>> [root@US-BILLINGDSX4 cassandra]# free -m [...]
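Both of those claims are easy to verify directly rather than argue about. A sketch (the mlockall line is roughly what 2.0-era Cassandra logs at startup when JNA memory locking succeeds; the exact wording may differ between versions):

    # swap usage and swappiness
    swapon -s
    cat /proc/sys/vm/swappiness

    # did JNA actually lock the heap at startup?
    grep -i mlockall /var/log/cassandra/system.log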
Re: Cassandra compaction stuck? Should I disable?
Hi Rob,

Thanks for your reply.

2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>:

> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <th3penguinwhispe...@gmail.com> wrote:
>>
>> In Opscenter I see one of the nodes is orange. It seems like it's
>> working on compaction. I used nodetool compactionstats, and whenever I
>> did this the Completed count and percentage stayed the same (even with
>> hours in between).
>>
> Are you the same person from IRC, or a second report today of compaction
> hanging in this way?

Same person ;) I just didn't have enough to work with from the chat there. I want to understand the issue more and see what I can tune or fix. I want to do a nodetool repair before upgrading to 2.1.11, but the compaction is blocking it.

> What version of Cassandra?

2.0.9

>> I currently don't see CPU load from Cassandra on that node. So it seems
>> stuck (somewhere mid 60%). Some other nodes also have compaction on the
>> same columnfamily. I don't see any progress.
>>
>> WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677
>> ColumnFamilyStore.java (line 2101) Unable to cancel in-progress
>> compactions for usage_record_ptd. Probably there is an unusually large
>> row in progress somewhere. It is also possible that buggy code left
>> some sstables compacting after it was done with them
>>
>> - How can I assure that nothing is happening?
>>
> Find the thread that is doing compaction and strace it. Generally it is
> one of the threads with a lower thread priority.

I have 141 threads. Not sure if that's normal.

This seems to be the one:
61404 cassandr  24   4 8948m 4.3g 820m R 90.2 36.8 292:54.47 java

In the strace I see basically this part repeating (with, once in a while, the "resource temporarily unavailable"):

futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
getpriority(PRIO_PROCESS, 61404) = 16
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494045, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494047, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
getpriority(PRIO_PROCESS, 61404) = 16
futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f5c64145e28, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494049, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
getpriority(PRIO_PROCESS, 61404) = 16

But wait! I also see this:

futex(0x7f5c64145e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f5c64145e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x1233854, FUTEX_WAIT_PRIVATE, 494055, NULL) = 0
futex(0x1233828, FUTEX_WAKE_PRIVATE, 1) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---

This doesn't seem to happen that often, though.

> Compaction often appears hung when decompressing a very large row, but
> usually not for "hours".

>> - Is it recommended to disable compaction from a certain data size?
>>   (I believe 25GB on each node.)
>>
> It is almost never recommended to disable compaction.

>> - Can I stop this compaction? nodetool stop compaction doesn't seem to
>>   work.
>>
> Killing the JVM ("the dungeon collapses!") would certainly stop it, but
> it'd likely just start again when you restart the node.

>> - Is stopping the compaction dangerous?
>>
> Not if you're in a version that properly cleans up partial compactions,
> which is most of them.

>> - Is killing the cassandra process dangerous while compacting (I did
>>   nodetool drain on one node)?
>>
> No. But probably nodetool drain couldn't actually stop the in-progress
> compaction either, FWIW.

>> This is the output of nodetool compactionstats grepped for the keyspace
>> that seems stuck.
>>
> Do you have gigantic rows in that keyspace? What does cfstats say about
> the largest row compaction has seen? Do you have log messages about
> compacting large rows?

I don't know about the gigantic rows. How can I check? I've checked the logs and f
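(To answer the "how can I check" question: two places to look, sketched below. The exact cfstats label varies by version; on 2.0 it is along the lines of "Compacted row maximum size", reported in bytes.)

    # largest row compaction has seen for the tables in the keyspace
    nodetool cfstats mykeyspace | grep -i 'row.*size'

    # 2.0-era Cassandra warns when it compacts very large rows incrementally
    grep -i 'large row' /var/log/cassandra/system.log

A maximum row size in the hundreds of MB or more, or a stream of "Compacting large row ... incrementally" messages for this columnfamily, would support Rob's gigantic-row theory.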
Re: Cassandra compaction stuck? Should I disable?
Correction...
I was grepping for Segmentation in the strace, and it happens a lot.

Do I need to run a scrub?

2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <th3penguinwhispe...@gmail.com>:

> Hi Rob,
>
> Thanks for your reply.
>
> 2015-11-09 23:17 GMT+01:00 Robert Coli <rc...@eventbrite.com>: [...]
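(On the scrub question: the SIGSEGV lines alone are not evidence of corruption. HotSpot handles SIGSEGV internally, e.g. for implicit null checks and safepoint polling, so an strace of any busy JVM shows them routinely. Scrub is the tool for suspected sstable corruption and can be targeted at one table; a sketch, noting that it rewrites the sstables and on most versions takes a snapshot first:)

    nodetool scrub mykeyspace mycolumnfamily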
Fwd: Cassandra compaction stuck? Should I disable?
Hi all,

In Opscenter I see that one of the nodes is orange. It seems like it's working on compaction. I used nodetool compactionstats, and whenever I did this the Completed count and percentage stayed the same (even with hours in between).

I currently don't see CPU load from Cassandra on that node, so it seems stuck (somewhere mid 60%). Some other nodes also have compaction running on the same columnfamily; I don't see any progress there either.

WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 ColumnFamilyStore.java (line 2101) Unable to cancel in-progress compactions for usage_record_ptd. Probably there is an unusually large row in progress somewhere. It is also possible that buggy code left some sstables compacting after it was done with them

- How can I assure that nothing is happening?
- Is it recommended to disable compaction from a certain data size? (I believe 25GB on each node.)
- Can I stop this compaction? nodetool stop compaction doesn't seem to work.
- Is stopping the compaction dangerous?
- Is killing the cassandra process dangerous while compacting (I did nodetool drain on one node)?

This is the output of nodetool compactionstats, grepped for the keyspace that seems stuck:

4e48f940-86c6-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447062197972 52321301 16743606 {1:2, 4:248}
94acec50-86c8-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447063175061 48992375 13420862 {3:3, 4:245}
3210c9b0-8707-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447090067915 52763216 17732003 {1:2, 4:248}
24f96fe0-86ce-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447065564638 44909171 17029440 {1:2, 3:39, 4:209}
06d58370-86ef-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447079687463 53570365 17873962 {1:2, 3:2, 4:246}
f7aa5fa0-86c7-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447062911642 47701016 13291915 {3:2, 4:246}
806a4380-86f7-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447083327416 52644411 17363023 {1:2, 2:1, 4:247}
c845b900-86c5-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447061973136 48944530 16698191 {1:2, 3:6, 4:242}
bb44a0b0-8718-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447097599547 48768463 13518523 {2:2, 3:5, 4:242}
f2c17ea0-86c3-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447061185418 90367799 13904914 {5:4, 6:7, 7:52, 8:185}
1aae6590-86ce-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447065547369 53190698 17228121 {1:2, 4:248}
d7ca8d00-86d5-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447068871120 52422499 16995963 {1:2, 3:3, 4:245}
6e890290-86df-11e5-96be-dd3c9e46ec74 mykeyspace mycolumnfamily 1447072989497 45218168 17174468 {1:2, 3:21, 4:227}

I also frequently see lines like this in system.log:

WARN [Native-Transport-Requests:11935] 2015-11-09 20:10:41,886 BatchStatement.java (line 223) Batch of prepared statements for [billing.usage_record_by_billing_period, billing.metric] is of size 53086, exceeding specified threshold of 5120 by 47966.

Any other remarks? Thanks a lot in advance!
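(One remark on those BatchStatement warnings: the "threshold of 5120" is batch_size_warn_threshold_in_kb from cassandra.yaml, which defaults to 5 (KB) on the versions that have it. It flags oversized client-side batches and is unrelated to the stuck compaction, so it can be addressed separately, either by shrinking the batches in the client or by raising the threshold. A sketch; the path assumes a package install:)

    # check the configured warning threshold
    grep batch_size_warn_threshold_in_kb /etc/cassandra/conf/cassandra.yaml
    # batch_size_warn_threshold_in_kb: 5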