Re: Upgrade to Cassandra 1.2

2013-02-15 Thread Alain RODRIGUEZ
There was a webinar (Datastax C*llege) about Vnodes. It will be available soon there I guess: http://www.datastax.com/resources/webinars/collegecredit. You could have watched it live and asked your own questions. Here is a howto:

heap usage

2013-02-15 Thread Reik Schatz
Hi, recently we have been hitting some OOM: Java heap space, so I was investigating how the heap is used in Cassandra 1.2+. We use the calculated 4G heap. Our cluster is 6 nodes, around 750 GB data and a replication factor of 3. Row cache is disabled. All key cache and memtable settings are left at

Error in cassandra-cli : ERROR 11:05:06,144 Fatal configuration error error

2013-02-15 Thread Sloot, Hans-Peter
Hi, I am trying to do some examples with cassandra-cli and encounter the error further below. Is this really a configuration error ? Regards HansP [default@MyCassandraKS] SET blog_entry['yomama'][timeuuid()] = 'I love my new shoes!'; ERROR 11:05:06,144 Fatal configuration error error Can't

[nodetool] repair with vNodes

2013-02-15 Thread Haithem Jarraya
Hi, I am new to Cassandra and I would like to hear your thoughts on this. We are running our tests with Cassandra 1.2.1, on a relatively small dataset (~60GB). The nodetool repair command has been running for almost 24 hours and I can't see any activity from the logs or JMX. What am I missing? Or

RE: Error in cassandra-cli : ERROR 11:05:06,144 Fatal configuration error error

2013-02-15 Thread Sloot, Hans-Peter
But if I add 'key_cache_size_in_mb' to the yaml file it will not even start. BTW I have Datastax Enterprise 2.1 installed. From: Vivek Mishra [mailto:mishra.v...@gmail.com] Sent: vrijdag 15 februari 2013 11:22 To: user@cassandra.apache.org Subject: Re: Error in cassandra-cli : ERROR

Re: heap usage

2013-02-15 Thread Blake Manders
You probably want to look at your bloom filters. Be forewarned though, they're difficult to change; changes to bloom filter settings only apply to new SSTables, so they might not be noticeable until a few compactions have taken place. If that is your issue, and your usage model fits it, a good

Re: Cassandra Geospatial Search

2013-02-15 Thread Hiller, Dean
Yes, this is in PlayOrm's roadmap as well but not there yet. Dean On 2/13/13 6:42 PM, Drew Kutcharian d...@venarc.com wrote: Hi Guys, Has anyone on this mailing list tried to build a bounding box style (get the records inside a known bounding box) geospatial search? I've been researching this

RE: Question on Cassandra Snapshot

2013-02-15 Thread S C
I appreciate any advice or pointers on this. Thanks in advance. From: as...@outlook.com To: user@cassandra.apache.org Subject: Question on Cassandra Snapshot Date: Thu, 14 Feb 2013 20:47:14 -0600 I have been looking at incremental backups and snapshots. I have done some experimentation but

Re: heap usage

2013-02-15 Thread Edward Capriolo
It is not going to be true for long that LCS does not require bloom filters. https://issues.apache.org/jira/browse/CASSANDRA-5029 Apparently, without bloom filters there are issues. On Fri, Feb 15, 2013 at 7:29 AM, Blake Manders bl...@crosspixel.net wrote: You probably want to look at your

odd production issue today 1.1.4

2013-02-15 Thread Hiller, Dean
We ran into an issue today where our website became around 10 times slower. We found out node 5 out of our 6 nodes was hitting 2100% cpu (cat /proc/cpuinfo reveals a 16 processor machine). I am really not sure how to hit 2100% unless we had 21 processors. It bounces between 300% and 2100% so I

virtual nodes + map reduce = too many mappers

2013-02-15 Thread cem
Hi All, I have just started to use virtual nodes. I set the number of nodes to 256 as recommended. The problem that I have is when I run a mapreduce job it creates node * 256 mappers. It creates node * 256 splits. This affects the performance since the range queries have a lot of overhead. Any

Deletion consistency

2013-02-15 Thread Víctor Hugo Oliveira Molinar
Hello everyone! I have a column family filled with event objects which need to be processed by query threads. Once each thread queries for those objects (spread among columns below a row), it performs a delete operation for each object in Cassandra. It's done in order to ensure that these events

Re: Deletion consistency

2013-02-15 Thread Mike
If you increase the number of nodes to 3, with an RF of 3, then you should be able to read/delete utilizing a quorum consistency level, which I believe will help here. Also, make sure the clocks on your servers are in sync, utilizing NTP, as drifting time between your client and server could
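The arithmetic behind the quorum suggestion above can be sketched as follows. This is only an illustration, not code from the thread or from Cassandra itself: the function names `quorum` and `reads_see_latest_write` are hypothetical, and the sketch assumes the usual definition of a quorum as a strict majority of replicas.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge a QUORUM operation.

    Assumes quorum means a strict majority: floor(RF / 2) + 1.
    """
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           read_replicas: int,
                           write_replicas: int) -> bool:
    """True when every read set must overlap every write (or delete) set.

    The overlap is guaranteed only if R + W > RF.
    """
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)  # majority of 3 replicas
# Quorum reads plus quorum deletes always share at least one replica.
print(q, reads_see_latest_write(rf, q, q))
# Reading and deleting at ONE with RF=3 gives no such guarantee.
print(reads_see_latest_write(rf, 1, 1))
```

With an RF of 3, a quorum is 2 replicas, so a delete at quorum and a later read at quorum always touch at least one common replica. Clock synchronization still matters, because the newest timestamp wins when replicas disagree.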

Re: heap usage

2013-02-15 Thread Wei Zhu
We have 250G data and are running with an 8GB heap, and one of the nodes went OOM during repair. I checked the bloom filter, only 200M. Not sure how the memory is used, maybe take a memory dump and examine that. - Original Message - From: Edward Capriolo edlinuxg...@gmail.com To:

Re: heap usage

2013-02-15 Thread Bryan Talbot
Aren't bloom filters kept off heap in 1.2? https://issues.apache.org/jira/browse/CASSANDRA-4865 Disabling bloom filters also disables tombstone removal, so don't disable them if you delete anything. https://issues.apache.org/jira/browse/CASSANDRA-5182 I believe that the index samples

Re: Deletion consistency

2013-02-15 Thread Bryan Talbot
With a RF and CL of one, there is no replication so there can be no issue with distributed deletes. Writes (and reads) can only go to the one host that has the data and will be refused if that node is down. I'd guess that your app isn't deleting records when you think that it is, or that the

Re: Deletion consistency

2013-02-15 Thread Víctor Hugo Oliveira Molinar
*Mike*, for now I can't upgrade my cluster. I'm going to check the servers' time sync. Thanks; *Bryan*, so you think it's not a distributed delete problem. Thanks for bringing it up. Btw, hector should not be hiding any exception from me. Although there's a mutator reuse in my application. I'm

Re: odd production issue today 1.1.4

2013-02-15 Thread Edward Capriolo
With hyper threading a core can show up as two or maybe even four physical system processors, this is something the kernel does. On Fri, Feb 15, 2013 at 11:41 AM, Hiller, Dean dean.hil...@nrel.gov wrote: We ran into an issue today where website became around 10 times slower. We found out node

cassandra vs. mongodb quick question

2013-02-15 Thread Hiller, Dean
So I found out mongodb varies their node size from 1T to 42T per node depending on the profile. So if I was going to be writing a lot but rarely changing rows, could I also use cassandra with a per node size of +20T or is that not advisable? Thanks, Dean

can we pull rows out compressed from cassandra(lots of rows)?

2013-02-15 Thread Hiller, Dean
Thanks, Dean

[RELEASE] Apache Cassandra 1.1.10 released

2013-02-15 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 1.1.10. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here:

Re: Cassandra Geospatial Search

2013-02-15 Thread Drew Kutcharian
Hey Dean, do you guys have any thoughts on how to implement it yet? On Feb 15, 2013, at 6:18 AM, Hiller, Dean dean.hil...@nrel.gov wrote: Yes, this is in PlayOrm's roadmap as well but not there yet. Dean On 2/13/13 6:42 PM, Drew Kutcharian d...@venarc.com wrote: Hi Guys, Has anyone

Re: virtual nodes + map reduce = too many mappers

2013-02-15 Thread Edward Capriolo
Seems like the hadoop Input format should combine the splits that are on the same node into the same map task, like Hadoop's CombinedInputFormat can. I am not sure who recommends vnodes as the default, because this is now the second problem (that I know of) of this class where vnodes has extra
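The idea of combining splits that live on the same node can be sketched in a few lines. This is a hedged illustration of the grouping step only, not Hadoop or Cassandra code: the function `combine_splits_by_host`, the `(host, token_range)` tuple shape, and the `max_per_task` parameter are all assumptions made for the example. In Hadoop itself the class with this role is presumably `CombineFileInputFormat`.

```python
from collections import defaultdict


def combine_splits_by_host(splits, max_per_task=None):
    """Group (host, token_range) splits so each map task covers one host.

    splits: iterable of (host, token_range) tuples (hypothetical shape).
    max_per_task: optional cap on ranges per task; None means one task per host.
    Returns a list of (host, [token_range, ...]) tasks.
    """
    grouped = defaultdict(list)
    for host, token_range in splits:
        grouped[host].append(token_range)

    tasks = []
    for host, ranges in grouped.items():
        step = max_per_task or len(ranges)
        # Slice this host's ranges into chunks of at most `step` ranges.
        for start in range(0, len(ranges), step):
            tasks.append((host, ranges[start:start + step]))
    return tasks


# 6 nodes with 256 vnodes each, as in the original question.
splits = [(f"node{n}", (n, v)) for n in range(6) for v in range(256)]
print(len(splits))                                   # one mapper per split
print(len(combine_splits_by_host(splits)))           # one mapper per node
print(len(combine_splits_by_host(splits, 64)))       # capped at 64 ranges each
```

Capping the ranges per task keeps some parallelism within a node while still avoiding one mapper per vnode.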

Re: Cassandra 1.20 with Cloudera Hadoop (CDH4) Compatibility Issue

2013-02-15 Thread Dave Brosius
see https://issues.apache.org/jira/browse/CASSANDRA-5201 On 02/15/2013 10:05 PM, Yang Song wrote: Hi, Does anyone use CDH4's Hadoop with Cassandra to interact? The goal is simply read/write to Cassandra from Hadoop directly using ColumnFamilyInput(Output)Format, but seems a bit

Re: virtual nodes + map reduce = too many mappers

2013-02-15 Thread Eric Evans
On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Seems like the hadoop Input format should combine the splits that are on the same node into the same map task, like Hadoop's CombinedInputFormat can. I am not sure who recommends vnodes as the default, because this

Re: Upgrade to Cassandra 1.2

2013-02-15 Thread Eric Evans
On Thu, Feb 14, 2013 at 5:48 PM, Daning Wang dan...@netseer.com wrote: Thanks! suppose I can upgrade to 1.2.x with 1 token by commenting out num_tokens, how can I changed to multiple tokens? could not find doc clearly stating about this. If you decided to move to virtual nodes after upgrading

Re: Cassandra 1.20 with Cloudera Hadoop (CDH4) Compatibility Issue

2013-02-15 Thread Michael Kjellman
That bug is kinda wrong though. 1.0.x is current for like a year now and C* works great with it :) On Feb 15, 2013, at 7:38 PM, Dave Brosius dbros...@mebigfatguy.com wrote: see https://issues.apache.org/jira/browse/CASSANDRA-5201 On 02/15/2013 10:05 PM, Yang

Re: Cassandra 1.20 with Cloudera Hadoop (CDH4) Compatibility Issue

2013-02-15 Thread Michael Kjellman
Sorry. I meant to say even though there *wasn't* a major change between 1.0.x and 0.22. The big change was 0.20 to 0.22. Sorry for the confusion. On Feb 15, 2013, at 9:53 PM, Michael Kjellman mkjell...@barracuda.com wrote: There were pretty big changes in Hadoop