[ https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei Deng updated CASSANDRA-7949:
--------------------------------
Labels: lcs (was: )
> LCS compaction low performance, many pending compactions, nodes are almost idle
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-7949
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
> Project: Cassandra
> Issue Type: Bug
> Environment: DSE 4.5.1-1, Cassandra 2.0.8
> Reporter: Nikolai Grigoriev
> Labels: lcs
> Attachments: iostats.txt, nodetool_compactionstats.txt,
> nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt
>
>
> I've been evaluating a new cluster of 15 nodes (32 cores, 6x800Gb SSD disks +
> 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates load
> similar to that of our future product. Before running the simulator I had to
> pre-generate enough data. This was done with Java code using the DataStax
> Java driver. Without going deep into details, two tables were generated. Each
> table currently has about 55M rows, with between a few dozen and a few
> thousand columns per row.
> This data generation process produced a massive amount of non-overlapping
> data, so the activity was write-only and highly parallel. This is not the
> type of traffic the system will ultimately have to deal with; in the future
> it will be a mix of reads and updates to existing data. I mention it only to
> explain the choice of LCS, not to mention the expensive SSD disk space (LCS
> needs far less free-space headroom than size-tiered compaction).
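> For reference, a table opts into LCS through its compaction options. A
> minimal cqlsh sketch (the column layout here is illustrative, not our actual
> DDL; the 160Mb sstable size matches what we use):
>
> cqlsh> CREATE TABLE myks.table_list1 (
>    ...     id text,
>    ...     col_name text,
>    ...     col_value blob,
>    ...     PRIMARY KEY (id, col_name)
>    ... ) WITH compaction = {'class': 'LeveledCompactionStrategy',
>    ...                      'sstable_size_in_mb': 160};
>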
> At some point while generating the data I noticed that the compactions
> started to pile up. I knew that I was overloading the cluster, but I still
> wanted the generation test to complete. I was expecting to give the cluster
> enough time to finish the pending compactions and get ready for real traffic.
> However, after the storm of write requests had stopped I noticed that the
> number of pending compactions remained constant (and even climbed up a
> little) on all nodes. After trying to tune some parameters (like setting the
> compaction bandwidth cap to 0; see the nodetool sketch after the listings
> below) I noticed a strange pattern: the nodes were compacting one of the CFs
> in a single stream using virtually no CPU and no disk I/O. This process was
> taking hours. It would then be followed by a short burst of a few dozen
> compactions running in parallel (CPU at 2000%, some disk I/O - up to 10-20%)
> before getting stuck again for many hours doing one compaction at a time. So
> it looks like this:
> # nodetool compactionstats
> pending tasks: 3351
>      compaction type   keyspace         table      completed           total   unit   progress
>           Compaction       myks   table_list1    66499295588   1910515889913   bytes      3.48%
> Active compaction remaining time :        n/a
> # df -h
> ...
> /dev/sdb 1.5T 637G 854G 43% /cassandra-data/disk1
> /dev/sdc 1.5T 425G 1.1T 29% /cassandra-data/disk2
> /dev/sdd 1.5T 429G 1.1T 29% /cassandra-data/disk3
> # find . -name '*table_list1*Data*' | grep -v snapshot | wc -l
> 1310
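> (For the record, the compaction bandwidth cap mentioned above is the standard
> nodetool knob; setting it to 0 disables compaction throttling entirely:)
>
> # nodetool setcompactionthroughput 0
>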
> Among these files I see:
> 1043 files of 161Mb (my sstable size is 160Mb)
> 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, and one each of 55Gb, 70Gb
> and 370Gb
> 263 files of various sizes - between a few dozen Kb and 160Mb
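> (The tally above came from a quick shell pipeline along these lines - run
> from the data directory, it counts Data files per size in Mb:)
>
> # find . -name '*table_list1*Data*' | grep -v snapshot \
>     | xargs ls -l | awk '{print int($5/1024/1024)}' | sort -n | uniq -c
>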
> I ran the heavy load for about 1.5 days; it has now been close to 3 days
> since it stopped, and the number of pending compactions does not go down.
> I have applied one of the not-so-obvious recommendations - disabling
> multithreaded compaction (see the cassandra.yaml snippet below) - and that
> seems to be helping a bit: some nodes have started to have fewer pending
> compactions - about half of the cluster, in fact. But even those nodes sit
> idle most of the time, lazily compacting in one stream with CPU at ~140% and
> occasionally doing bursts of compaction work for a few minutes.
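> (That recommendation amounts to flipping this cassandra.yaml setting and
> restarting the node; the config path below is the DSE package default and may
> differ on other installs:)
>
> # grep multithreaded_compaction /etc/dse/cassandra/cassandra.yaml
> multithreaded_compaction: false
>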
> I am wondering if this is really a bug, or something in the LCS logic that
> manifests itself only in such an edge-case scenario, where lots of unique
> data has been loaded quickly.
> By the way, I see this pattern only for one of the two tables - the one that
> has about 4 times more data than the other (space-wise; the number of rows is
> the same). It looks like all these pending compactions are really only for
> that larger table.
> I'll be attaching the relevant logs shortly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)