Unable to run hadoop_cql3_word_count examples

2013-12-03 Thread Parth Patil
Hi, I am new to Cassandra and I am exploring the Hadoop integration (MapReduce) provided by Cassandra. I am trying to run the Hadoop examples provided in Cassandra's repo under examples/hadoop_cql3_word_count. I am using the cassandra-2.0 branch. I have a single node cassandra running

How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
Hi, Is there a way to monitor the progress of a hinted handoff task? I found the following two mbeans providing some info: org.apache.cassandra.internal:type=HintedHandoff, which tells me that there is 1 active task, and org.apache.cassandra.db:type=HintedHandoffManager#countPendingHints(),
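A minimal sketch of polling the two MBeans named in the message above over JMX. It assumes the node exposes unauthenticated JMX on port 7199 (Cassandra's default), that the thread-pool MBean exposes attributes named ActiveCount and PendingTasks, and that countPendingHints() takes no arguments. The class name HintMonitor and the helper serviceUrl are hypothetical, introduced only for illustration.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class; not part of Cassandra.
class HintMonitor {
    // MBean names as quoted in the message above.
    static final String POOL_MBEAN = "org.apache.cassandra.internal:type=HintedHandoff";
    static final String MANAGER_MBEAN = "org.apache.cassandra.db:type=HintedHandoffManager";

    // Builds a standard JMX-over-RMI service URL for the given host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        // Assumption: JMX listens on 7199 and requires no credentials.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7199;

        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName pool = new ObjectName(POOL_MBEAN);
            ObjectName manager = new ObjectName(MANAGER_MBEAN);

            // Assumption: these attribute names exist on the thread-pool MBean.
            Object active = mbs.getAttribute(pool, "ActiveCount");
            Object pending = mbs.getAttribute(pool, "PendingTasks");

            // Operation name taken from the message; a no-argument signature is assumed.
            Object hints = mbs.invoke(manager, "countPendingHints",
                    new Object[0], new String[0]);

            System.out.println("active handoff tasks:  " + active);
            System.out.println("pending handoff tasks: " + pending);
            System.out.println("pending hints:         " + hints);
        }
    }
}
```

Running this periodically (for example `java HintMonitor localhost 7199`) and watching whether the pending-hint figures shrink gives a rough view of progress; it does not report a completion percentage.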

Re: bin/cqlsh is missing cqlshlib

2013-12-03 Thread Jason Wee
Hi, if you download the rpm from http://rpm.datastax.com/community/noarch/, for example cassandra20-2.0.3-1.noarch.rpm, it should contain the cqlshlib and it is packaged into /usr/lib/python2.6/site-packages/cqlshlib hth /Jason On Tue, Dec 3, 2013 at 10:17 AM, Ritchie Iu r...@ixl.com wrote: No,

How to measure data transfer between data centers?

2013-12-03 Thread Tom van den Berge
Is there a way to know how much data is transferred between two nodes, or more specifically, between two data centers? I'm especially interested in how much data is being replicated from one data center to another, to know how much of the available bandwidth is used. Thanks, Tom

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Rahul Menon
Tom, You should check the size of the hints column family to determine how many are present. The hints are a super column family and its keys are destination tokens. You could look at it if you would like. Hint sends and timeouts are logged, you should be seeing something like Timed out

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
Hi Rahul, Thanks for your reply. I have never seen a message like Timed out replaying hints to..., which is a good thing then, I suppose ;) Normally, I do see the Finished hinted handoff... log message. However, every now and then this message is not logged, not even after several hours. This is

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Rahul Menon
Tom, Do you know why these hints are piling up? What is the size of the hints cf? Thanks Rahul On Tue, Dec 3, 2013 at 6:41 PM, Tom van den Berge t...@drillster.com wrote: Hi Rahul, Thanks for your reply. I have never seen a message like Timed out replaying hints to..., which is a good

Commitlog replay makes dropped and recreated keyspace and column family rows reappear

2013-12-03 Thread Desimpel, Ignace
Hi, I have the impression that there is an issue with dropping a keyspace and then recreating the keyspace (and column families), combined with a restart of the database. My test goes as follows: Create keyspace K and column families C. Insert rows X0 into column family C0. Query for X0 : found

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
Rahul, This problem occurs every now and then, and currently everything is ok, so there are no hints. But whenever it happens, the hints are quickly piling up. This results in heap problems on the node (Heap is 0.813462 full... appears many times). This in turn results in the flushing of the

Re: Stack trace from a node during a repair

2013-12-03 Thread John Pyeatt
Then my issue must be the 0.01% because 1) I'm running the repair as root. 2) The directory exists and the permissions are appropriate. root:root 755 3) The three times it occurred during the repair it always complained about backups directories. But there are dozens of other backups directories

Re: Stack trace from a node during a repair

2013-12-03 Thread Hannu Kröger
Hi, Are you running nodetool or cassandra as root? I think it doesn't really matter what user is running the nodetool. Those directories should be writable by the user who is running the actual cassandra process. Hannu 2013/12/3 John Pyeatt john.pye...@singlewire.com Then my issue must be

Re: data dropped when using sstableloader?

2013-12-03 Thread Francisco Nogueira Calmon Sobral
Hi, Ross. We had the same problem under the same version of Cassandra. We opted to copy ALL the sstables from the old cluster to each new node, then run nodetool refresh. The missing rows appeared after this procedure. Best regards, Francisco. On Nov 27, 2013, at 7:49 PM, Ross Black

Re: Stack trace from a node during a repair

2013-12-03 Thread John Pyeatt
Both cassandra and nodetool are running as root. also ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 59450 max locked memory

Re: Stack trace from a node during a repair

2013-12-03 Thread Robert Coli
On Tue, Dec 3, 2013 at 6:19 AM, John Pyeatt john.pye...@singlewire.com wrote: Then my issue must be the 0.01% because 1) I'm running the repair as root. Huh? Repair doesn't care what user your shell is. It is a process built into cassandra and has the permissions that cassandra does?

Re: Stack trace from a node during a repair

2013-12-03 Thread John Pyeatt
This is running the Amazon Linux OS which is essentially CentOS 6 I believe. java version 1.6.0_45 Java(TM) SE Runtime Environment (build 1.6.0_45-b06) Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) Installed cassandra 1.2.9 from

CQL workaround for modifying a primary key

2013-12-03 Thread Ike Walker
What is the best practice for modifying the primary key definition of a table in Cassandra 1.2.9? Say I have this table: CREATE TABLE temperature ( weatherstation_id text, event_time timestamp, temperature text, PRIMARY KEY (weatherstation_id,event_time) ); I want to add a new

Re: Exactly one wide row per node for a given CF?

2013-12-03 Thread Vivek Mishra
So basically you want to create a cluster of multiple unique keys, but data which belongs to one unique key should be colocated. Correct? -Vivek On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com wrote: Subject says it all. I want to be able to randomly distribute a large set