Re: Many creation/inserts in parallel

2013-04-29 Thread Sasha Yanushkevich
1) We’ve tested 100 threads in parallel, and each thread created 10 tables. I think we will change our data model, but another problem may occur. About 80% of these CFs should be truncated every day, and if we reduce the number of CFs by adding a key field to a single CF, a huge number of tombstones will

Understanding the source code

2013-04-29 Thread Mahmood Naderan
Dear all, I am trying to understand and analyze the source code of Cassandra. What I expect (and see in other codebases) is that the code should have three sections: 1) initialization and input reading, 2) core computation, and 3) finalizing and gathering the output. However, I cannot find

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-29 Thread aaron morton
Is this understanding correct: "we had a 12-node cluster with 256 vnodes on each node (upgraded from 1.1), we added two additional nodes that streamed so much data (600+ GB when other nodes had 150-200 GB) during the joining phase that they filled their local disks and had to be killed"? Can you

Re: CQL Clarification

2013-04-29 Thread aaron morton
Not really, I've passed on the comments to the doc teams. The column timestamp is just a 64 bit int like I said. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 29/04/2013, at 10:06 AM, Michael Theroux

Re: cassandra-shuffle time to completion and required disk space

2013-04-29 Thread Sam Overton
An alternative to running shuffle is to do a rolling bootstrap/decommission. You would set num_tokens on the existing hosts (and restart them) so that they split their ranges, then bootstrap in N new hosts, then decommission the old ones. On 28 April 2013 22:21, John Watson j...@disqus.com

Fwd: Inter-DC communication optimization

2013-04-29 Thread Sergey Naumov
Hello. I would like to know whether updates are propagated from the local DC to remote DCs simultaneously (so all-to-all network connections are preferable), or whether Cassandra can somehow determine the nearest DCs and send updates only to them (so these nearest DCs have to propagate updates further). Is there

Cass 1.1.1 and 1.1.11 Exception during compactions

2013-04-29 Thread Oleg Dulin
We saw this exception with 1.1.1 and also with 1.1.11 (we upgraded for unrelated reasons, to fix the FD leak during slice queries) -- name of the CF replaced with * for confidentiality: 10419 ERROR [CompactionExecutor:36] 2013-04-29 07:50:49,060 AbstractCassandraDaemon.java (line 132)

normal thread counts?

2013-04-29 Thread William Oberman
Hi, I'm having some issues. I keep getting: ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[GossipStage:1,5,main] java.lang.OutOfMemoryError: unable to create new native thread -- after a day or two of

Re: Deletes, null values

2013-04-29 Thread Alain RODRIGUEZ
I created it almost a year ago with cassandra-cli. Now show_schema returns: create column family myCF with column_type = 'Standard' and comparator = 'UTF8Type' and default_validation_class = 'UTF8Type' and key_validation_class = 'UTF8Type' and read_repair_chance = 0.1 and

Exception when setting tokens for the cassandra nodes

2013-04-29 Thread Rahul
Hi, I am testing out Cassandra 1.2 on two of my local servers, but I face problems with assigning tokens to my nodes. When I use nodetool to set the token, I end up getting a Java exception. My test setup is as follows: Node1: local ip 1 (seed), Node2: local ip 2 (seed). Since I have two nodes, I

RE: Exception when setting tokens for the cassandra nodes

2013-04-29 Thread moshe.kranc
For starters: If you are using the Murmur3 partitioner, which is the default in cassandra.yaml, then you need to calculate the tokens using: python -c 'print [str(((2**64 / 2) * i) - 2**63) for i in range(2)]' which gives the following values: ['-9223372036854775808', '0'] From: Rahul
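The one-liner quoted above uses Python 2 syntax (a `print` statement and `/` as integer division). As a rough sketch only, the same even spacing over the Murmur3 token range can be written for Python 3 as below. The function name `murmur3_initial_tokens` and the `node_count` parameter are hypothetical, introduced here for illustration; they are not part of Cassandra or nodetool.

```python
def murmur3_initial_tokens(node_count):
    """Return evenly spaced tokens over the Murmur3 range [-2**63, 2**63).

    Hypothetical helper for illustration; mirrors the arithmetic of the
    quoted one-liner, generalised from 2 nodes to node_count nodes.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Width of each node's slice of the 2**64-wide token space.
    step = 2**64 // node_count
    # Shift so that the first token sits at the bottom of the range.
    return [step * i - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # For two nodes this prints the same two values given in the message.
    for token in murmur3_initial_tokens(2):
        print(token)
```

Each printed value would then be assigned to one node as its initial token, which is the step the original poster was attempting.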

Fwd: error casandra ring an hadoop connection ¿?

2013-04-29 Thread Miguel Angel Martin junquera
Hi all: I can run Pig with Cassandra and Hadoop in EC2. I'm trying to run Pig with a Cassandra ring and Hadoop. The Cassandra ring has the tasktrackers and datanodes, too. And I am running Pig from another machine where I have installed the namenode-jobtracker. I have

Re: cassandra-shuffle time to completion and required disk space

2013-04-29 Thread John Watson
That's what we tried first, before the shuffle, and we ran into the space issue. That's detailed in another thread titled: Adding nodes in 1.2 with vnodes requires huge disks On Mon, Apr 29, 2013 at 4:08 AM, Sam Overton s...@acunu.com wrote: An alternative to running shuffle is to do a rolling

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-29 Thread Sam Overton
Did you update num_tokens on the existing hosts and restart them, before you tried bootstrapping in the new node? If the new node tried to stream all the data in the cluster then this would be consistent with you having missed that step. You should see "Calculating new tokens" in the logs of the

Compaction, Slow Ring, and bad behavior

2013-04-29 Thread Drew from Zhrodague
Hi, we have a 9-node ring on m1.xlarge AWS hosts. We started having some trouble a while ago, and it's making me pull out all of my hair. The host in position #3 has been replaced 4 times. Each time, the host joins the ring, I do a nodetool repair -pr, and she seems fine for about a day.

Re: Cass 1.1.1 and 1.1.11 Exception during compactions

2013-04-29 Thread aaron morton
nodetool scrub will repair out-of-order rows in the source SSTables for the compaction process. Or you can stop the node and use the offline bin/sstablescrub tool. Not sure how they got there; there was a ticket for similar problems in 1.1.1. Cheers - Aaron Morton Freelance

Re: normal thread counts?

2013-04-29 Thread aaron morton
I used JMX to check the current number of threads in a production Cassandra machine, and it was ~27,000. That does not sound too good. My first guess would be lots of client connections. What client are you using, and does it do connection pooling? See the comments in cassandra.yaml around

Re: setcompactionthroughput and setstreamthroughput have no effect

2013-04-29 Thread John Watson
Same behavior on 1.1.3, 1.1.5 and 1.1.9. Currently: 1.2.3 On Mon, Apr 29, 2013 at 11:43 AM, Robert Coli rc...@eventbrite.com wrote: On Sun, Apr 28, 2013 at 2:28 PM, John Watson j...@disqus.com wrote: Running these 2 commands are noop IO wise: nodetool setcompactionthroughput 0

Re: setcompactionthroughput and setstreamthroughput have no effect

2013-04-29 Thread Robert Coli
On Mon, Apr 29, 2013 at 3:52 PM, John Watson j...@disqus.com wrote: Same behavior on 1.1.3, 1.1.5 and 1.1.9. Currently: 1.2.3 (below snippets are from trunk) ./src/java/org/apache/cassandra/tools/NodeCmd.java case SETCOMPACTIONTHROUGHPUT : if (arguments.length

Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE Cassandra 1.1.9

2013-04-29 Thread Robert Coli
On Mon, Apr 29, 2013 at 1:17 PM, aaron morton aa...@thelastpickle.com wrote: Bulk Loader does not use CL, it's more like a repair / bootstrap. If you have to skip a node then use repair. The bulk loader (sstableloader) can ignore replica nodes via -i option :

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-29 Thread John Watson
Opened a ticket: https://issues.apache.org/jira/browse/CASSANDRA-5525 On Mon, Apr 29, 2013 at 2:24 AM, aaron morton aa...@thelastpickle.com wrote: is this understanding correct we had a 12 node cluster with 256 vnodes on each node (upgraded from 1.1), we added two additional nodes that

Kundera 2.5 released

2013-04-29 Thread Vivek Mishra
Hi All, We are happy to announce the release of Kundera 2.5. Kundera is a JPA 2.0 compliant, object-datastore mapping library for NoSQL datastores. The idea behind Kundera is to make working with NoSQL databases drop-dead simple and fun. It currently supports Cassandra, HBase, MongoDB,

Re: Deletes, null values

2013-04-29 Thread aaron morton
I thought that C* had no null values... I use a lot of CFs in which only the column names are filled in, and I request a range of columns to see which references (like 1228#16866) exist. So I would like those columns to simply disappear from the table. Cassandra does not store null values.