Re: disk space issue
This is a shot in the dark, but you could check whether you have too many snapshots lying around that you don't actually need. You can get rid of those with a quick "nodetool clearsnapshot".

On Wed, Oct 1, 2014 at 5:49 AM, cem cayiro...@gmail.com wrote:

> Hi All,
>
> I have a 7 node cluster. One node ran out of disk space and the others are at around 80% disk utilization. The data has a 10-day TTL, but I think compaction wasn't fast enough to clean up the expired data. gc_grace is set to the default, and I have a replication factor of 3.
>
> Do you think it would help to delete all data on that node and run repair? Does repair check the TTL value before retrieving data from other nodes?
>
> Do you have any other suggestions?
>
> Best Regards,
> Cem

--
Dominic Letz
Director of R&D
Exosite
http://exosite.com
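One likely reason Cem's expired data lingers on disk: an expired TTL cell becomes a tombstone, and compaction can only purge a tombstone after gc_grace_seconds has also elapsed. A rough sketch of the timing (the 10-day TTL is from the original post; Cassandra's default gc_grace_seconds is 864000, i.e. 10 days):

```python
# Sketch (illustrative arithmetic, not Cassandra code): expired TTL'd cells
# turn into tombstones and can only be physically dropped by a compaction
# after gc_grace_seconds has also passed since expiry.
TTL_SECONDS = 10 * 24 * 3600       # 10-day TTL from Cem's post
GC_GRACE_SECONDS = 864000          # Cassandra's default gc_grace (10 days)

def earliest_purge_delay(ttl_s: int, gc_grace_s: int) -> int:
    """Seconds after a write before compaction may drop the cell for good."""
    return ttl_s + gc_grace_s

days = earliest_purge_delay(TTL_SECONDS, GC_GRACE_SECONDS) / 86400
print(days)  # 20.0 -- the disk holds roughly 2x the TTL window of data
```

So with a default gc_grace, the cluster has to hold roughly twice the TTL window of data even when compaction keeps up; lowering gc_grace_seconds on a TTL-only table is often part of the fix.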
Re: disk space issue
In the past, in such scenarios, it has helped us to check the partition where Cassandra is installed and allocate more space to that partition. It may simply be a disk space issue, but it is worth checking whether it is really a matter of how space was allocated to the partition. My 2 cents.

Sent from my iPhone

On 01-Oct-2014, at 11:53 am, Dominic Letz dominicl...@exosite.com wrote:

> [...]
Re: disk space issue
My 2 cents: try a major compaction on the column family with the TTLs - it will certainly be faster than a full rebuild. Also try non-Cassandra things, such as checking for and removing old log files, backups, etc.

On Wed, Oct 1, 2014 at 9:34 AM, Sumod Pawgi spa...@gmail.com wrote:

> [...]
Re: Not-Equals (!=) in Where Clause
Right, my bad - thanks Tyler for the correction.

On Tue, Sep 30, 2014 at 5:44 PM, Tyler Hobbs ty...@datastax.com wrote:

> I think Sylvain may not have had his coffee yet. You can't use IFs in SELECT statements, but you can in INSERT/UPDATE/DELETE:
>
>     UPDATE foo SET a = 0 WHERE k = 0 IF b != 0;
>
> On Tue, Sep 30, 2014 at 2:36 AM, Sylvain Lebresne sylv...@datastax.com wrote:
>
>> Is != supported as part of the where clause in Cassandra?
>>
>> It's not.
>>
>> Or is it the grammar for some other purpose?
>>
>> It's supported in 'IF' conditions. You can do something like:
>>
>>     SELECT * FROM foo WHERE k = 0 IF v != 3;
>>
>> --
>> Sylvain

--
Tyler Hobbs
DataStax
http://datastax.com/
Regarding Cassandra-Stress tool
Hi,

I am trying to benchmark our custom schema in Cassandra, and I managed to run the stress tool. However, there are a couple of settings and issues I couldn't find any solution/explanation for. I'd appreciate any comments.

1- The default number of warm-up iterations in the stress tool is about 5. I would like to reduce this number (due to my storage space limitations), but I couldn't find an input parameter for this. Is this setting possible?

2- I did not fully understand what the output of the cassandra-stress tool means. I read http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html, but, for example, what does latency mean here? Does it mean how long a read/write operation is delayed until it is executed? In that case, what is the measure of the actual read/write operation? The documentation also seems to be outdated: there is an output parameter, partition_rate, which is not explained there.

best,
/Shahab
Re: disk space issue
Major compaction is bad if you're using size-tiered, especially if you're already having capacity issues. Once you have one huge table, with default settings you'll need ~4x that huge table's worth of free storage for it to ever compact again and reclaim your TTL'd data.

If you're running into space issues that are ultimately going to get your system wedged and you're using columns with TTLs, I'd recommend using the JMX operation to compact individual tables. This will free the TTL'd data, assuming you've exceeded your gc_grace_seconds. It can probably be scripted up fairly easily with a nice (Shellshock-vulnerable) bash script and jmxterm.

On Wed, Oct 1, 2014 at 2:43 AM, Nikolay Mihaylov n...@nmmm.nu wrote:

> [...]

--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International, 50 Nagog Park, Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329

This e-mail and any attachments may contain information which is SeaChange International confidential. The information enclosed is intended only for the addressees herein and may not be copied or forwarded without permission from SeaChange International.
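Ken's ~4x figure is a rule of thumb for size-tiered compaction: merging the top tier needs room for the inputs plus the merged output while both exist on disk. A tiny illustrative sketch of the headroom arithmetic (the factor is Ken's estimate, not a Cassandra constant):

```python
# Illustrative sketch of the size-tiered headroom problem Ken describes:
# after a major compaction leaves one huge SSTable, the next compaction of
# its tier needs several multiples of that size free (his estimate: ~4x).
def stcs_headroom_gb(largest_sstable_gb: float, factor: float = 4.0) -> float:
    """Free disk (GB) needed before the big SSTable's tier can compact again."""
    return largest_sstable_gb * factor

# On a node that is already ~80% full, a 200 GB SSTable created by a major
# compaction would demand 800 GB free to ever merge again:
print(stcs_headroom_gb(200.0))  # 800.0
```

This is why the thread steers away from major compaction on a nearly-full node and toward compacting individual tables instead.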
cassandra stress tools
Hi,

I am trying to benchmark our custom schema in Cassandra, and I managed to run the stress tool. However, there are a couple of settings and issues I couldn't find any solution/explanation for. I'd appreciate any comments.

1- The default number of warm-up iterations in the stress tool is about 5. I would like to reduce this number (due to my storage space limitations), but I couldn't find an input parameter for this. Is this setting possible?

2- I did not fully understand what the output of the cassandra-stress tool means. I read http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html, but, for example, what does latency mean here? Does it mean how long a read/write operation is delayed until it is executed? In that case, what is the measure of the actual read/write operation? The documentation also seems to be outdated: there is an output parameter, partition_rate, which is not explained there.

best,
/Shahab
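On the latency question: in stress output, per-operation latency is the wall-clock time from issuing a request to receiving its response (not a queueing delay before execution), and the tool summarizes it as percentiles. A minimal sketch of that summarization, with made-up sample data and a simple nearest-rank style percentile (cassandra-stress itself uses its own histogram implementation):

```python
# Sketch: how "latency" lines in stress-style output are produced.
# Each sample is the elapsed time of one complete request/response round trip;
# the report then shows percentiles over those samples. Data below is made up.
def percentile(samples, p):
    """Simple percentile by index into the sorted samples (illustrative only)."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[k]

latencies_ms = [2.1, 2.3, 2.2, 3.0, 2.4, 9.8, 2.2, 2.5, 2.3, 2.6]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))  # 2.3 9.8
```

The median tells you typical response time, while the high percentiles surface the occasional slow operations (GC pauses, compaction interference) that averages hide.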
CASSANDRA-7649 : upgrade existing db to 2.0.10
I deploy/distribute the Cassandra database as an embedded service, which lets me generate a basic cassandra.yaml from the global cluster of machines (seeds, non-seeds, ports, disks, etc.). That allows me to configure and upgrade both my own software and the Cassandra software using the same cassandra.yaml. That yaml file has no tokens specified in it, while still being a vnode cluster (thanks, Cassandra).

In previous versions that was OK, since the Cassandra code simply accepted the tokens it had saved in its own database, disregarding any changes made in the yaml file (there was no test like bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()). I guess there was some logic to that: at that point the system is not bootstrapping, and thus should/could use the known token configuration without consulting the yaml token parameter.

Also, wasn't the small code change of CASSANDRA-7649 motivated by balancing problems when moving to vnodes (CASSANDRA-7601) with the random partitioner? In my case I'm using a ByteOrdered partitioner, forcing me to balance/move/add nodes/tokens myself. And while the description says the change was meant to avoid 'changing the number of tokens', that test is doing a little more than that (from my point of view).

In short: I would be in favor of removing that test, and instead clearly logging a message that the saved tokens are used, not the yaml-configured tokens.

Regards,
Ignace
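The check Ignace objects to can be sketched as a plain predicate (names are illustrative Python, not the actual Cassandra classes; the assumption that num_tokens falls back to 1 when the yaml omits it is mine):

```python
# Sketch of the CASSANDRA-7649 startup check as Ignace describes it:
# the node now refuses to start when the token count it saved in its own
# system tables differs from num_tokens in cassandra.yaml.
def tokens_check_passes(saved_tokens: list, yaml_num_tokens: int) -> bool:
    """Mirrors `bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()`."""
    return len(saved_tokens) == yaml_num_tokens

# An upgraded vnode node that saved 256 tokens, but whose yaml specifies no
# num_tokens (assumed here to default to 1), now fails on startup:
print(tokens_check_passes(list(range(256)), 1))  # False
```

This captures his complaint: the saved tokens are already authoritative for a non-bootstrapping node, so the yaml comparison only breaks previously-valid configurations.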
Re: disk space issue
thanks for the answers!

Cem

On Wed, Oct 1, 2014 at 2:38 PM, Ken Hancock ken.hanc...@schange.com wrote:

> https://github.com/hancockks/cassandra-compact-cf
>
> On Tue, Sep 30, 2014 at 5:49 PM, cem cayiro...@gmail.com wrote:
>
>> [...]
Cassandra Java 8
Hi All,

Has anyone done any performance testing of, say, Cassandra 2.1 using Java 8?

Thanks,
-Tony
Question about incremental repair
If you only run incremental repairs, does that mean that bitrot will go undetected for already-repaired sstables? If so, is there any other process that will detect bitrot for all the repaired sstables, other than a full repair (or an unfortunate user)?

John...

NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
Re: Question about incremental repair
Compressed SSTables store a checksum for every compressed block, which is checked each time the block is decompressed. I believe there's a ticket out there to add something similar for non-compressed SSTables. We also store the SHA-1 hash of each SSTable in its own file on disk.

On Wed, Oct 1, 2014 at 4:45 PM, John Sumsion sumsio...@familysearch.org wrote:

> [...]

--
Tyler Hobbs
DataStax
http://datastax.com/
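The per-SSTable hash Tyler mentions enables a simple offline bitrot check: recompute the data file's SHA-1 and compare it to the stored digest. A minimal sketch of the comparison (exact digest-file component names vary by Cassandra version, so this only shows the hashing step on raw bytes):

```python
# Sketch: detect bitrot by comparing a file's SHA-1 against a stored digest,
# as with the per-SSTable digest file Cassandra writes next to the data file.
import hashlib

def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 of the raw file contents."""
    return hashlib.sha1(data).hexdigest()

def digest_matches(data: bytes, stored_digest_line: str) -> bool:
    """Compare computed hash against the first token of the digest file."""
    return sha1_hex(data) == stored_digest_line.strip().split()[0]

# A single flipped byte changes the hash, so the mismatch flags corruption:
good = b"sstable bytes"
stored = sha1_hex(good)
print(digest_matches(good, stored), digest_matches(b"sstable byteZ", stored))  # True False
```

Note this only detects corruption when something actually rereads and verifies the file, which is the gap John is asking about for already-repaired sstables.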
Re: Question about incremental repair
On Wed, Oct 1, 2014 at 3:11 PM, Tyler Hobbs ty...@datastax.com wrote:

> Compressed SSTables store a checksum for every compressed block [...]

@OP: this came up a few weeks ago on the list; search for "bitrot" to find the previous thread.

Expanding on that discussion, I plan to file a JIRA on you-must-mark-all-sstables-for-that-range-unrepaired-if-you-fail-CRC-on-read. I'll try to remember to reply on this thread when I do. Once there is a CRC on uncompressed reads, marking all sstables unrepaired on a failed CRC would handle the bitrot case for both uncompressed and compressed reads.

=Rob
Re: cassandra stress tools
Not a direct answer to your post, but you could also take a look at YCSB.

Sent from my iPhone

On 01-Oct-2014, at 8:38 pm, shahab shahab.mok...@gmail.com wrote:

> [...]