Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Shane Hansen
I'd be really interested to know what sort of performance or load improvements you see by doing client-side partitioning. Please post back some results if you've tried that strategy. On Thu, Dec 4, 2014 at 11:46 AM, Tyler Hobbs ty...@datastax.com wrote: On Thu, Dec 4, 2014 at 11:50 AM, Dong
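
One possible reading of "client-side partitioning" is grouping rows by partition key on the client, so that each batch only touches a single partition. The sketch below illustrates that idea in plain Python. The row layout, the `sensor_id` key, and the `group_by_partition` helper are all hypothetical, and actually sending the batches through a driver is left out.

```python
from collections import defaultdict


def group_by_partition(rows, key_fn, max_batch=50):
    """Yield (partition_key, batch) pairs; every batch holds rows of one partition.

    key_fn extracts the (hypothetical) partition key from a row.
    max_batch caps the number of rows per batch.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[key_fn(row)].append(row)
    for key, bucket in buckets.items():
        # Split large partitions into chunks of at most max_batch rows.
        for start in range(0, len(bucket), max_batch):
            yield key, bucket[start:start + max_batch]


# Hypothetical sample data: five readings spread over two partitions.
rows = [
    {"sensor_id": "a", "ts": 1, "value": 0.5},
    {"sensor_id": "b", "ts": 1, "value": 1.5},
    {"sensor_id": "a", "ts": 2, "value": 0.7},
    {"sensor_id": "b", "ts": 2, "value": 1.1},
    {"sensor_id": "a", "ts": 3, "value": 0.9},
]

batches = list(group_by_partition(rows, lambda r: r["sensor_id"], max_batch=2))
for key, batch in batches:
    print(key, len(batch))
```

Each yielded batch could then be turned into one batch statement, which makes it straightforward to compare load against unpartitioned batches.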

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Shane Hansen
Not sure if this is what you're looking for, but the API docs can be useful (I won't copy/paste the docs themselves): http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/exceptions/NoHostAvailableException.html

Re: Better option to load data to cassandra

2014-11-13 Thread Shane Hansen
So sstableloader is a CPU-efficient online method of loading data if you already have sstables. An option you may not have considered is just using batch inserts. It was a surprise to me coming from another database system, but C*'s primary use case is shoving data to an append-only log. Is there

Re: Exploring Simply Queueing

2014-10-06 Thread Shane Hansen
Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier. That being said, there might be a better way to play to the strengths of C*. Ideally everything

Re: Increasing size of Batch of prepared statements

2014-10-03 Thread Shane Hansen
It appears to be configurable in cassandra.yaml using batch_size_warn_threshold https://issues.apache.org/jira/browse/CASSANDRA-6487 On Fri, Oct 3, 2014 at 10:47 AM, shahab shahab.mok...@gmail.com wrote: Hi, I am getting the following warning in the cassandra log: BatchStatement.java:258

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread Shane Hansen
My understanding is that an update is the same as an insert, so I would think delete+insert is a bad idea. Also, insert+delete would put 2 entries in the commit log. On Sep 10, 2014 9:49 AM, Michal Budzyn michalbud...@gmail.com wrote: Is there any serious difference in the used disk and memory

Re: are dynamic columns supported at all in CQL 3?

2014-08-26 Thread Shane Hansen
Does this answer your question, Ian? http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Tue, Aug 26, 2014 at 1:12 PM, Ian Rose ianr...@fullstory.com wrote: Is it possible in CQL to create a table that supports dynamic column names? I am using C* v2.0.9, which I

Re: EC2 SSD cluster costs

2014-08-19 Thread Shane Hansen
Again, it depends on your use case. But we wanted to keep the data per node below 500 GB, and we found RAIDed SSDs to be the best bang for the buck for our cluster. I think we moved from the i2 to the c3 because our bottleneck tended to be CPU utilization (from parsing requests). (Disclaimer, we're

Re: too many open files

2014-08-08 Thread Shane Hansen
Are you using Apache or DataStax Cassandra? The DataStax distribution ups the file handle limit to 10. That number's hard to exceed. On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Hi, I am using Cassandra 2.0.9 running on Debian Wheezy, and I am

Re: Cassandra Scaling Alerts

2014-07-22 Thread Shane Hansen
I would look at load (disk space used) and system.compactions_in_progress. On Tue, Jul 22, 2014 at 3:49 PM, Arup Chakrabarti a...@pagerduty.com wrote: We have been going through and setting up alerts on our Cassandra clusters. We have catastrophic alerts set up to let us know when things are

Re: Case Study from Migrating from RDBMS to Cassandra

2014-07-22 Thread Shane Hansen
There's lots of info on migrating from a relational database to Cassandra here: http://www.datastax.com/relational-database-to-nosql On Tue, Jul 22, 2014 at 7:45 PM, Surbhi Gupta surbhi.gupt...@gmail.com wrote: Hi, Does anybody have a case study for migrating from RDBMS to Cassandra?

Re: Easy diff of schema from dev-production

2014-07-08 Thread Shane Hansen
I'd suggest looking at the system keyspace, for example schema_columns. On Jul 8, 2014 9:39 AM, Kevin Burton bur...@spinn3r.com wrote: Are there any easy/elegant ways to compare dev schema to production schema? I want to find if there are any rows/columns we need to add. I could try to format the
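
As a rough illustration of the suggestion above, column metadata pulled from the system keyspace on each cluster can be compared as plain sets. This is only a sketch: it assumes the rows have already been exported as (table, column, type) tuples, the sample rows are made up, and `diff_schema` is a hypothetical helper rather than anything shipped with Cassandra.

```python
def diff_schema(dev_rows, prod_rows):
    """Compare two iterables of (table, column, type) tuples.

    Returns three sorted lists:
      - columns present in dev but missing in production
      - columns present in production but missing in dev
      - columns present in both whose types differ
    """
    dev = {(table, column): col_type for table, column, col_type in dev_rows}
    prod = {(table, column): col_type for table, column, col_type in prod_rows}

    missing_in_prod = sorted(dev.keys() - prod.keys())
    missing_in_dev = sorted(prod.keys() - dev.keys())
    type_changes = sorted(
        (key, dev[key], prod[key])
        for key in dev.keys() & prod.keys()
        if dev[key] != prod[key]
    )
    return missing_in_prod, missing_in_dev, type_changes


# Hypothetical metadata, standing in for rows exported from each cluster.
dev_rows = [
    ("users", "id", "uuid"),
    ("users", "email", "text"),
    ("users", "age", "int"),
]
prod_rows = [
    ("users", "id", "uuid"),
    ("users", "email", "ascii"),
    ("users", "legacy_flag", "boolean"),
]

missing_in_prod, missing_in_dev, type_changes = diff_schema(dev_rows, prod_rows)
print("add to production:", missing_in_prod)
print("only in production:", missing_in_dev)
print("type differences:", type_changes)
```

The first list is the one that answers "which columns do we need to add to production".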