Re: disk space and tombstones

2014-08-19 Thread Vitaly Chirkov
DuyHai Doan wrote: it looks like there is a need for a tool to take care of the bucketing switch. But I still can't understand why bucketing should be better than `DELETE row USING TIMESTAMP`. Looks like the only source of truth about this topic is the source code of Cassa.

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Mikhail Stepura
Are you interested in cassandra-stress in particular, or in any tool that will allow you to stress test your schema? I believe Apache JMeter + the CQL plugin may be useful in the latter case: https://github.com/Mishail/CqlJmeter -M On 8/17/14 12:26, Clint Kelly wrote: Hi all, Is there a way

Re: Best way to format a ResultSet / Row ?

2014-08-19 Thread Fabrice Larcher
Hello, I would try something like this (I have not tested it, no guarantees): import com.datastax.driver.core.ColumnDefinitions; import com.datastax.driver.core.ResultSet; import com.datastax.driver.core.Row; import com.datastax.driver.core.utils.Bytes; /* ... */ ResultSet result =

[RELEASE CANDIDATE] Apache Cassandra 2.1.0-rc6 released

2014-08-19 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the sixth release candidate for the future Apache Cassandra version 2.1.0. Please note that this is not yet the final 2.1.0 release and as such, it should not be considered for production use. We'd appreciate testing; please let us know if you encounter any

Re: Best way to format a ResultSet / Row ?

2014-08-19 Thread Sylvain Lebresne
This kind of question belongs on the Java driver mailing list, not the Cassandra one; please try to use the proper mailing list in the future. On Tue, Aug 19, 2014 at 10:11 AM, Fabrice Larcher fabrice.larc...@level5.fr wrote: But this is probably not very useful, since you get only prints of

Options for expanding Cassandra cluster on AWS

2014-08-19 Thread Oleg Dulin
Distinguished Colleagues: Our current Cassandra cluster on AWS looks like this: 3 nodes in N. Virginia, one per zone. RF=3. Each node is a c3.4xlarge with 2x160G SSDs in RAID-0 (~300 Gig SSD on each node). Works great; I find it the optimal configuration for a Cassandra node. But the

Re: Options for expanding Cassandra cluster on AWS

2014-08-19 Thread Brian Tarbox
The last guidance I heard from DataStax was to use m2.2xlarges on AWS and put data on the ephemeral drive... have they changed this guidance? Brian On Tue, Aug 19, 2014 at 9:41 AM, Oleg Dulin oleg.du...@gmail.com wrote: Distinguished Colleagues: Our current Cassandra cluster on AWS looks

Re: Options for expanding Cassandra cluster on AWS

2014-08-19 Thread Russell Bradberry
I’m not sure about DataStax’s official stance, but the SSD-backed instances (e.g. i2.2xl, c3.4xl, etc.) greatly outperform the m2.2xl. Also, since DataStax is pro-SSD, I doubt they would still recommend staying on magnetic disks. That said, I have benchmarked all the way up to the c3.8xl

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
Hi Mikhail, This plugin looks great! I have actually been using JMeter + a custom REST endpoint driving Cassandra. It would be great to compare the results I got from that against the pure JMeter + Cassandra (to evaluate the REST endpoint's performance). Thanks! I'll check this out. Best

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Benedict Elliott Smith
The stress tool in 2.1 also now supports clustering columns: http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema There are however some features up for revision before release in order to help generate realistic workloads. See

EC2 SSD cluster costs

2014-08-19 Thread Jeremy Jongsma
The latest consensus around the web for running Cassandra on EC2 seems to be to use the new SSD instances. I've not seen any mention of the elephant in the room: using the new SSD instances significantly raises the cluster cost per TB. With Cassandra's strength being linear scalability to many terabytes

Re: EC2 SSD cluster costs

2014-08-19 Thread Russell Bradberry
Short answer: it depends on your use case. We migrated to i2.xlarge nodes and saw an immediate increase in performance. If you just need plain ole raw disk space and don’t have a performance requirement to meet, then the m1 machines would work, or hell, even SSD EBS volumes may work for you.

Re: EC2 SSD cluster costs

2014-08-19 Thread Kevin Burton
You're pricing it out at $ per GB… that's not the way to look at it. Price it out at $ per IO… Once you price it that way, SSD makes a LOT more sense. Of course, it depends on your workload. If you're just doing writes, and they're all sequential, then cost per IO might not make a lot of sense.
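The two pricing views above can be compared with a little arithmetic. The following is only a sketch: the class name `StorageCost` and every figure in `main` are made-up placeholders for illustration, not AWS prices or measured IOPS, and real numbers should be substituted before drawing any conclusion.

```java
// Hypothetical helper comparing cost per GB with cost per IO.
final class StorageCost {
    /** Monthly cost divided by usable capacity in GB. */
    static double perGb(double monthlyCostUsd, double capacityGb) {
        if (capacityGb <= 0) {
            throw new IllegalArgumentException("capacityGb must be positive");
        }
        return monthlyCostUsd / capacityGb;
    }

    /** Monthly cost divided by sustained I/O operations per second. */
    static double perIops(double monthlyCostUsd, double sustainedIops) {
        if (sustainedIops <= 0) {
            throw new IllegalArgumentException("sustainedIops must be positive");
        }
        return monthlyCostUsd / sustainedIops;
    }

    public static void main(String[] args) {
        // Placeholder figures only; NOT real prices or benchmark results.
        double magneticCost = 300, magneticGb = 1600, magneticIops = 400;
        double ssdCost = 600, ssdGb = 800, ssdIops = 40000;

        System.out.printf("magnetic: %.4f $/GB, %.4f $/IOPS%n",
                perGb(magneticCost, magneticGb), perIops(magneticCost, magneticIops));
        System.out.printf("ssd:      %.4f $/GB, %.4f $/IOPS%n",
                perGb(ssdCost, ssdGb), perIops(ssdCost, ssdIops));
    }
}
```

With placeholder inputs like these, the SSD node looks worse per GB and better per IO, which is the trade-off the message describes; which metric matters depends on the workload.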

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
Thanks for the update, Benedict. We are still using 2.0.9 unfortunately. :/ I will keep that in mind for when we upgrade. On Tue, Aug 19, 2014 at 10:51 AM, Benedict Elliott Smith belliottsm...@datastax.com wrote: The stress tool in 2.1 also now supports clustering columns:

Re: Best way to format a ResultSet / Row ?

2014-08-19 Thread Kevin Burton
I agree that it belongs on that mailing list, but it's set up weird... I can't subscribe to it in Google Groups… I am not sure what exactly is wrong with it. I mailed the admins but it hasn't been resolved. On Tue, Aug 19, 2014 at 1:49 AM, Sylvain Lebresne sylv...@datastax.com wrote: This kind of

Re: EC2 SSD cluster costs

2014-08-19 Thread Shane Hansen
Again, depends on your use case. But we wanted to keep the data per node below 500 GB, and we found RAIDed SSDs to be the best bang for the buck for our cluster. I think we moved from the i2 to the c3 because our bottleneck tended to be CPU utilization (from parsing requests). (Disclaimer, we're

Manually deleting sstables

2014-08-19 Thread Parag Patel
After we dropped a table, we noticed that the sstables are still there. After searching through the forum history, I noticed that this is known behavior. 1) Is there any negative impact of deleting the sstables off disk and then restarting Cassandra? 2) Are there any other

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Benedict Elliott Smith
The stress tool will work against any version of Cassandra; it's only released alongside for ease of deployment. You can safely use the tool from pre-release versions. On Tue, Aug 19, 2014 at 11:03 PM, Clint Kelly clint.ke...@gmail.com wrote: Thanks for the update, Benedict. We are still

Re: [RELEASE CANDIDATE] Apache Cassandra 2.1.0-rc6 released

2014-08-19 Thread Tony Anecito
That is great news, keep up the great work! Best Regards, Tony Anecito Founder/President, MyUniPortal LLC http://www.myuniportal.com On Tuesday, August 19, 2014 2:38 AM, Sylvain Lebresne sylv...@datastax.com wrote: The Cassandra team is pleased to announce the sixth release candidate for the

Re: LOCAL_QUORUM without a replica in current data center

2014-08-19 Thread Viswanathan Ramachandran
Sorry for the spam - but I wanted to double check if anyone had experience with such a scenario. Thanks. On Sun, Aug 17, 2014 at 7:11 PM, Viswanathan Ramachandran vish.ramachand...@gmail.com wrote: Hi, How does LOCAL_QUORUM read/write behave when the data center on which query is

Re: Manually deleting sstables

2014-08-19 Thread Robert Coli
On Tue, Aug 19, 2014 at 8:59 AM, Parag Patel ppa...@clearpoolgroup.com wrote: After we dropped a table, we noticed that the sstables are still there. After searching through the forum history, I noticed that this is known behavior. Yes, it's providing protection in this case, though many

Re: EC2 SSD cluster costs

2014-08-19 Thread Paulo Ricardo Motta Gomes
Still using good ol' m1.xlarge here + external caching (memcached). Trying to adapt our use case to have different clusters for different use cases so we can leverage SSD at an acceptable cost in some of them. On Tue, Aug 19, 2014 at 1:05 PM, Shane Hansen shanemhan...@gmail.com wrote: Again,

Re: EC2 SSD cluster costs

2014-08-19 Thread Aiman Parvaiz
I completely agree with others here. It depends on your use case. We were using hi1.4xlarge boxes and paying a huge amount to Amazon. Lately our requirements changed: we are not hammering C* as much and our data size has gone down too, so given the new conditions we reserved and migrated to

Re: Cassandra Wiki Immutable?

2014-08-19 Thread Dave Brosius
added, thanks. On 08/18/2014 06:15 AM, Otis Gospodnetic wrote: Hi, What is the state of Cassandra Wiki -- http://wiki.apache.org/cassandra ? I tried to update a few pages, but it looks like pages are immutable. Do I need to have my Wiki username (OtisGospodnetic) added to some ACL?

Cassandra Consistency Level

2014-08-19 Thread Check Peck
We have a Cassandra cluster in three different datacenters (DC1, DC2 and DC3), with 10 machines in each datacenter. We have a few tables in Cassandra in which we have fewer than 100 records. What we are seeing: some tables are out of sync between machines in DC3 as compared to DC1 or DC2 when

Re: Cassandra Consistency Level

2014-08-19 Thread Robert Coli
On Tue, Aug 19, 2014 at 4:14 PM, Check Peck comptechge...@gmail.com wrote: What could be the reason for this sync issue? Can anyone shed some light on this? Our Java driver code and DataStax C++ driver code are using these tables with CONSISTENCY LEVEL ONE. 1) write with CL.ONE 2)

updated num_tokens value while changing replication factor and getting a nodetool repair error

2014-08-19 Thread Bryan Holladay
I have 1 DC that was originally 3 nodes each set with a single token: '-9223372036854775808', '-3074457345618258603', '3074457345618258602' I added two more nodes and ran nodetool move and nodetool cleanup one server at a time with these tokens: '-9223372036854775808', '-5534023222112865485',
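The token values quoted above look like evenly spaced positions on a signed 64-bit ring. As a rough sketch, assuming the Murmur3Partitioner range of -2^63 to 2^63 - 1 (the message does not name the partitioner) and using a hypothetical class name `EvenTokens`, they can be recomputed as follows:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes tokens span the signed 64-bit range.
final class EvenTokens {
    private static final BigInteger MIN_TOKEN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(64); // 2^64

    /** Returns nodeCount tokens spaced evenly around the ring, starting at -2^63. */
    static List<Long> evenTokens(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<Long> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            // offset = floor(i * 2^64 / nodeCount), computed without overflow
            BigInteger offset = RING_SIZE
                    .multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(nodeCount));
            tokens.add(MIN_TOKEN.add(offset).longValueExact());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(evenTokens(3));
        System.out.println(evenTokens(5));
    }
}
```

For three nodes this reproduces the three tokens listed in the message, and for five nodes the second token comes out as -5534023222112865485, matching the value quoted before the snippet is cut off.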

Re: Cassandra Consistency Level

2014-08-19 Thread Mark Reddy
Hi, As you are writing at CL.ONE and cqlsh reads at CL.ONE by default, there is a probability that you are reading stale data, i.e. the node you have contacted for the read may not have the most recent data. If you have a higher consistency requirement, you should look at increasing your
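The stale-read risk described here follows from the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write only when read replicas plus write replicas exceed the replication factor. Below is a minimal sketch of that check, assuming a single datacenter and a replication factor of 3 (the thread does not state the actual RF); `ConsistencyCheck` is a hypothetical name.

```java
// Hypothetical helper illustrating the R + W > RF overlap rule.
final class ConsistencyCheck {
    /** Number of replicas that make up a quorum for the given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * True when every read must overlap with the replicas that acknowledged
     * the most recent write, i.e. writeReplicas + readReplicas > replicationFactor.
     */
    static boolean readsSeeLatestWrite(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, not taken from the thread
        // Write at ONE, read at ONE: replica sets may not overlap.
        System.out.println(readsSeeLatestWrite(1, 1, rf));
        // Write at QUORUM, read at QUORUM: replica sets always overlap.
        System.out.println(readsSeeLatestWrite(quorum(rf), quorum(rf), rf));
    }
}
```

Under these assumptions, ONE plus ONE fails the check while QUORUM plus QUORUM passes it; multi-datacenter levels such as LOCAL_QUORUM apply the same arithmetic per datacenter and are not modelled here.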