Combining Cassandra with some SQL language

2012-02-26 Thread R. Verlangen
Hi there, I'm currently busy with the technical design of a new project. Of course it will depend on your needs, but is it weird to combine Cassandra with a SQL database like MySQL? In my use case it would be nice because we have some tables/CFs with lots and lots of data that does not really

Re: Combining Cassandra with some SQL language

2012-02-26 Thread Benjamin Hawkes-Lewis
On Sun, Feb 26, 2012 at 1:06 PM, R. Verlangen ro...@us2.nl wrote: I'm currently busy with the technical design of a new project. Of course it will depend on your needs, but is it weird to combine Cassandra with a SQL database like MySQL? In my use case it would be nice because we have some

Re: Combining Cassandra with some SQL language

2012-02-26 Thread Adam Haney
I've been using a combination of MySQL and Cassandra for about a year now on a project that now serves about 20k users. We use Cassandra for storing large entities and MySQL to store metadata that allows us to do better ad hoc querying. It's worked quite well for us. During this time we have also

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Radim Kolar
if a node goes down, it will take longer for commitlog replay. commit log replay time is insignificant. most time during node startup is wasted on index sampling. Index sampling here runs for about 15 minutes.

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Edward Capriolo
If you are doing planned maintenance, you can flush first as well, ensuring that the commit logs will not be as large. On Sun, Feb 26, 2012 at 10:09 AM, Radim Kolar h...@sendmail.cz wrote: if a node goes down, it will take longer for commitlog replay. commit log replay time is

Re: unidirectional communication/replication

2012-02-26 Thread aaron morton
All nodes in the cluster need two-way communication. Nodes need to gossip with each other so they know the others are alive. If you need to dump a lot of data, consider the Hadoop integration. http://wiki.apache.org/cassandra/HadoopSupport It can run a bit faster than going through the Thrift

Re: How to delete a range of columns using first N components of CompositeType Column?

2012-02-26 Thread aaron morton
it has been discussed a few times :) https://issues.apache.org/jira/browse/CASSANDRA-494 A - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/02/2012, at 8:06 AM, Praveen Baratam wrote: Thank you Aaron for the clarification. Maybe this

Re: Server crashed due to OutOfMemoryError: Java heap space

2012-02-26 Thread aaron morton
several compactions on few 200-300 GB SSTables Sounds like some big files. Out of interest, how much data do you have per node? Also, do you have wide rows? You can check via nodetool cfstats. In cases where OOM / GC is related to compaction, these are the steps I take first. It's heavy-handed

Re: Querying all keys in a column family

2012-02-26 Thread aaron morton
When you say query 1 million records, in my mind I'm saying dump 1 million records to another system as a back office job. Hadoop will split the job over multiple nodes and will assign a task to read the range owned by each node. From memory, it uses CL ONE (by default) for the read, so the node

Re: Frequency of Flushing in 1.0

2012-02-26 Thread aaron morton
Nathan Milford has a post about taking a node down http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ The only thing I would do differently would be to turn off Thrift first. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

how to cast traditional sql schema to nosql

2012-02-26 Thread Michael Cherkasov
Hi all, I'm a newbie in NoSQL and can't understand how to create a NoSQL-style schema. First, I want to describe my problem: I need to store results of tests. Each test consists of a list of parameters (if tests have the same list of parameters, that means the two tests belong to the same testcase), tag or

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Mohit Anchlia
On Sun, Feb 26, 2012 at 12:18 PM, aaron morton aa...@thelastpickle.com wrote: Nathan Milford has a post about taking a node down http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ The only thing I would do differently would be to turn off Thrift first. Cheers Isn't decommission

CounterColumn java.lang.AssertionError: Wrong class type.

2012-02-26 Thread Gary Ogasawara
Using v1.0.7, we see many of the following errors. Any thoughts on why this is occurring? Thanks in advance. -gary ERROR [ReadRepairStage:9] 2012-02-24 18:31:28,623 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReadRepairStage:9,5,main] java.lang.AssertionError: Wrong

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Xaero S
The challenge that we face is that our commitlog disk capacity is much, much less (under 10 GB in some cases) than the disk capacity for SSTables. So we cannot really have the commitlog data continuously growing. This is the reason that we need to be able to tune the way we flush the memtables.

Cassandra 1.1 beta on Maven?

2012-02-26 Thread Praveen Sadhu
Hi, I could not find the Cassandra 1.1 jars in the Maven repo. Can a beta version be released? Thanks, Praveen

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Peter Schuller
if a node goes down, it will take longer for commitlog replay. commit log replay time is insignificant. most time during node startup is wasted on index sampling. Index sampling here runs for about 15 minutes. Depends entirely on your situation. If you have few keys and lots of writes, index

RE: Combining Cassandra with some SQL language

2012-02-26 Thread Sanjay Sharma
Kundera (https://github.com/impetus-opensource/Kundera), an open source APL Java ORM, allows polyglot persistence between RDBMS and NoSQL databases such as Cassandra, MongoDB, HBase etc., transparently to the business logic developer. A note of caution: this does not mean that Cassandra data

Re: newer Cassandra + Hadoop = TimedOutException()

2012-02-26 Thread Patrik Modesto
On Sun, Feb 26, 2012 at 04:25, Edward Capriolo edlinuxg...@gmail.com wrote: Did you see the notes here? I'm not sure what you mean by the notes. I'm using the mapred.* settings suggested there: <property> <name>mapred.max.tracker.failures</name> <value>20</value> </property>

Re: Combining Cassandra with some SQL language

2012-02-26 Thread R. Verlangen
Ok, thank you all for your opinions. Seems that I can continue without any extra db-model headaches ;-) 2012/2/27 Sanjay Sharma sanjay.sha...@impetus.co.in Kundera (https://github.com/impetus-opensource/Kundera), an open source APL Java ORM, allows polyglot persistence between RDBMS and NoSQL