RE: Compaction Strategy guidance
ABUSE. I DON'T WANT ANY MORE MAILS. I'M FROM MEXICO.

From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
Sent: Saturday, November 22, 2014 07:13 PM
To: user@cassandra.apache.org
Subject: Re: Compaction Strategy guidance
Importance: High

Stephane,

Like everything good, LCS comes at a certain price.

LCS puts the most load on your I/O system (if you use spindles, you may need to be careful about that) and on the CPU. Also, LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy write activity), and this will result in higher disk space usage. LCS also has a certain limitation that I discovered lately: sometimes it may not be able to use all of your node's resources (algorithm limitations), and this reduces the overall compaction throughput. This may happen if you have a large column family with lots of data per node. STCS does not have this limitation.

By the way, the primary goal of LCS is to reduce the number of sstables C* has to look at to find your data. With LCS functioning properly, this number will most likely be between 1 and 3 for most reads. But if you do few reads and are not concerned about latency today, LCS will most likely only save you some disk space.

On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay wrote:

Hi there,

use case:

- Heavy write app, few reads.
- Lots of updates of rows / columns.
- Current performance is fine, for both writes and reads.
- Currently using SizeTieredCompactionStrategy

We're trying to limit the amount of storage used during compaction. Should we switch to LeveledCompactionStrategy?

Thanks

--
Nikolai Grigoriev
(514) 772-5178
Re: Compaction Strategy guidance
Stephane,

Like everything good, LCS comes at a certain price.

LCS puts the most load on your I/O system (if you use spindles, you may need to be careful about that) and on the CPU. Also, LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy write activity), and this will result in higher disk space usage. LCS also has a certain limitation that I discovered lately: sometimes it may not be able to use all of your node's resources (algorithm limitations), and this reduces the overall compaction throughput. This may happen if you have a large column family with lots of data per node. STCS does not have this limitation.

By the way, the primary goal of LCS is to reduce the number of sstables C* has to look at to find your data. With LCS functioning properly, this number will most likely be between 1 and 3 for most reads. But if you do few reads and are not concerned about latency today, LCS will most likely only save you some disk space.

On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay wrote:

> Hi there,
>
> use case:
>
> - Heavy write app, few reads.
> - Lots of updates of rows / columns.
> - Current performance is fine, for both writes and reads.
> - Currently using SizeTieredCompactionStrategy
>
> We're trying to limit the amount of storage used during compaction. Should
> we switch to LeveledCompactionStrategy?
>
> Thanks
>

--
Nikolai Grigoriev
(514) 772-5178
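For reference, the compaction strategy discussed above is a per-table schema setting, so switching it comes down to one ALTER TABLE statement. Below is a minimal sketch in Python that only builds the CQL text and does not connect to a cluster. The keyspace and table names are hypothetical placeholders, and `build_compaction_alter` is an illustrative helper, not part of any driver.

```python
def build_compaction_alter(keyspace: str, table: str, strategy: str) -> str:
    """Return a CQL ALTER TABLE statement that sets the compaction strategy.

    The keyspace/table names passed in are the caller's; the ones used in
    the example below are hypothetical placeholders.
    """
    # CQL map literals use single-quoted keys and values.
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compaction = {{'class': '{strategy}'}};"
    )


# Hypothetical names, for illustration only.
stmt = build_compaction_alter("my_keyspace", "my_table", "LeveledCompactionStrategy")
print(stmt)
```

The resulting statement could be pasted into cqlsh. Passing `SizeTieredCompactionStrategy` instead would produce the statement for switching back.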
Compaction Strategy guidance
Hi there,

use case:

- Heavy write app, few reads.
- Lots of updates of rows / columns.
- Current performance is fine, for both writes and reads.
- Currently using SizeTieredCompactionStrategy

We're trying to limit the amount of storage used during compaction. Should we switch to LeveledCompactionStrategy?

Thanks
Re: Problem with performance, memory consumption, and RLIMIT_MEMLOCK
Hi Dmitri,

I have not used the CPP driver, but maybe you have forgotten to set the equivalent of the Java driver's fetch size to something sensible?

Just an idea,
Jens

Sent from Mailbox

On Sun, Nov 16, 2014 at 6:09 PM, Dmitri Dmitrienko wrote:

> Hi,
> I have a very simple table in cassandra that contains only three columns:
> id, time and a blob with data. I added 1M rows of data and now the database
> is about 12GB on disk.
> 1M is only part of the data I want to store in the database, and it's
> necessary to synchronize this table with an external source. In order to do
> this, I have to read the id and time columns of all the rows, compare them
> with what I see in the external source, and insert/update/delete the rows
> where I see a difference.
> So, I'm trying to fetch the id and time columns from cassandra. In 100% of
> my attempts, the server hangs for ~1 minute, loading >100% CPU while doing
> so, then terminates abnormally with an error saying I have to run
> cassandra as root or increase RLIMIT_MEMLOCK.
> I increased RLIMIT_MEMLOCK to 1GB and it seems that is still not sufficient.
> It seems cassandra tries to read and lock the whole table in memory,
> ignoring the fact that I need only two tiny columns (~12MB of data).
> This is how it works when I use the latest cpp-driver.
> With cqlsh it works differently: it shows the first page of data almost
> immediately, without any noticeable delay.
> Is there a way to have the cpp-driver work like cqlsh? I'd like to have data
> sent to the client immediately upon availability, without any attempts to
> lock huge chunks of virtual memory.
> My platform is 64-bit linux (centos) with all necessary updates installed,
> openjdk. I also tried macosx with oracle jdk. In this case I don't get the
> RLIMIT_MEMLOCK error, but a regular out of memory error in system.log,
> although I provided the server with a sufficiently large heap, as
> recommended, 8GB.
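The synchronization described in the quoted message (read only id and time, compare with the external source, then insert/update/delete where they differ) can be sketched as follows. This is an illustration under assumptions, not the poster's code: both sides are modeled as plain Python dicts from id to time, rows arrive as pages of (id, time) tuples, and the names `collect_index` and `plan_sync` are hypothetical. How pages are actually requested depends on the driver and its fetch or paging size setting.

```python
def collect_index(pages):
    """Build an id -> time index from an iterable of pages.

    Each page is a list of (id, time) tuples, so only the two small
    columns are ever held in memory, one page at a time.
    """
    index = {}
    for page in pages:
        for row_id, row_time in page:
            index[row_id] = row_time
    return index


def plan_sync(stored, external):
    """Compare two id -> time dicts and return sorted id lists:
    (to_insert, to_update, to_delete)."""
    to_insert = sorted(k for k in external if k not in stored)
    to_update = sorted(
        k for k in external if k in stored and stored[k] != external[k]
    )
    to_delete = sorted(k for k in stored if k not in external)
    return to_insert, to_update, to_delete


# Toy data standing in for paged query results and the external source.
pages = [[(1, 10), (2, 20)], [(3, 30)]]
stored = collect_index(pages)
external = {2: 20, 3: 31, 4: 40}

to_insert, to_update, to_delete = plan_sync(stored, external)
print(to_insert, to_update, to_delete)  # [4] [3] [1]
```

Consuming the result page by page keeps client memory proportional to the two small columns rather than to the blobs.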
Re: bootstrapping node stuck in JOINING state
Hello,

I posted a similar issue the other day. We wound up not nuking the data dir; instead, we simply deleted the system keyspace from the data dir and then restarted the node. This actually worked: our never-ending join process completed and the node is now a part of the cluster.

Stan Lemon

On Fri, Nov 21, 2014 at 1:30 PM, Robert Coli wrote:

> On Fri, Nov 21, 2014 at 9:44 AM, Chris Hornung wrote:
>
>> On bootstrapping the third node, the data streaming sessions completed
>> without issue, but bootstrapping did not finish. The node is stuck in
>> JOINING state even 19 hours or so after data streaming completed.
>>
>
> Stop the joining node. Wipe the data dir including system keyspace.
> Re-bootstrap.
>
> =Rob
> http://twitter.com/rcolidba
>