cassandra boot is stuck in hint compaction.

2014-05-25 Thread Igor Shprukh




Hi guys, we have a 6-node cluster consisting of 5 Linux machines and one Windows machine.

After a hard shutdown of the Windows machine, the node has been stuck on hints compaction for more than half an hour, and Cassandra won't start. I should say that it is a strong machine, with 16 GB of RAM and 250 GB of disk space dedicated to the node. All other nodes are up.

What could be causing this?

Thank you in advance.



Re: cassandra boot is stuck in hint compaction.

2014-05-25 Thread Paulo Ricardo Motta Gomes
What is the Cassandra version? Are the same sstables being compacted over
and over?

Please post a sample of the compaction log and the output of DESCRIBE TABLE system.hints; in cqlsh.
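
For reference, a hedged sketch of the two things being asked for (DESCRIBE can be run from any live node; the log path assumes a packaged install, so adjust for a tarball layout):

cqlsh> DESCRIBE TABLE system.hints;

# compaction/hint activity on the stuck node
grep -i -E 'compact|hint' /var/log/cassandra/system.log | tail -50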

Cheers,


-- 
Paulo Motta

Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200


decommissioning a node

2014-05-25 Thread Tim Dunphy
Hey all,

I'm attempting to decommission a node I want to remove.

First I get a status of the ring

[root@beta-new:~] #nodetool status

Datacenter: datacenter1

===

Status=Up/Down

|/ State=Normal/Leaving/Joining/Moving

--  Address     Load       Tokens  Owns   Host ID                               Rack
UN  10.10.1.94  197.37 KB  256     49.4%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
UN  10.10.1.18  216.95 KB  256     50.6%  f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack


I see that the node I want to remove is up. Though I believe UN means up, I don't know exactly what it stands for.


[root@beta-new:~] #nodetool -host 10.10.1.18 decommission

Failed to connect to '10.10.1.18': Connection timed out

The connection to the node I want to decommission times out. :(

I'm running this command from the seed node, and while I do see port 7199 active and listening there, I do NOT see this port active and listening on the node that I want to decommission.


Seed node:

[root@beta-new:~] #lsof -i :7199

COMMAND   PID USER   FD   TYPEDEVICE SIZE/OFF NODE NAME

java15331 root   51u  IPv4 566368606  0t0  TCP *:7199 (LISTEN)


Node to decommission:

[root@beta:/etc/alternatives/cassandrahome] #lsof -i :7199

[root@beta:/etc/alternatives/cassandrahome] #


However, Cassandra does seem to be running on the node I want to decommission, in addition to it being shown as UN by nodetool status:


[root@beta:/etc/alternatives/cassandrahome] #netstat -tulpn | grep -i listen | grep java

tcp  0  0 0.0.0.0:46755     0.0.0.0:*  LISTEN  23039/java
tcp  0  0 10.10.1.18:9160   0.0.0.0:*  LISTEN  23039/java
tcp  0  0 0.0.0.0:42990     0.0.0.0:*  LISTEN  23039/java
tcp  0  0 10.10.1.18:8081   0.0.0.0:*  LISTEN  23039/java
tcp  0  0 10.10.1.18:9042   0.0.0.0:*  LISTEN  23039/java
tcp  0  0 10.10.1.18:7000   0.0.0.0:*  LISTEN  23039/java
tcp  0  0 0.0.0.0:7198      0.0.0.0:*  LISTEN  23039/java


So why do you think my seed is listening on port 7199 but the node I want
to get rid of is not? And how can I accomplish my goal of deleting the
unwanted node?


Thanks

Tim



-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Possible to Add multiple columns in one query ?

2014-05-25 Thread Mark Farnan
I'm sure this is a CQL 101 question, but...

Is it possible to add MULTIPLE rows/columns to a single partition in a single CQL 3 query/call?

Need:

I'm trying to find the most efficient way to add multiple time series events to a table in a single call.

Whilst most time series data comes in sequentially, we have a case where it is often loaded in bulk, say 100,000 points for 50 channels/tags in one go (sometimes more), and this needs to be loaded as quickly and efficiently as possible.

Fairly standard time-series schema (this is for testing purposes only at this point and doesn't represent final schemas):

CREATE TABLE tag (
  tagid int,
  idx timestamp,
  value double,
  PRIMARY KEY (channelid, idx)
) WITH CLUSTERING ORDER BY (idx DESC);

Currently I'm using Batch statements, but even that is not fast enough.

Note: At this point I'm testing on a single-node cluster on a laptop, to compare different versions.

We are using the DataStax C# 2.0 (beta) client and Cassandra 2.0.7.
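
For illustration, a minimal sketch of a multi-row write in a single CQL 3 statement, assuming (as noted in a reply below) that the channelid in the PRIMARY KEY is meant to be tagid; rows that share a partition key in an unlogged batch are applied as a single mutation on the replicas:

BEGIN UNLOGGED BATCH
  INSERT INTO tag (tagid, idx, value) VALUES (42, '2014-05-25 09:00:00', 101.5);
  INSERT INTO tag (tagid, idx, value) VALUES (42, '2014-05-25 09:00:01', 101.7);
  INSERT INTO tag (tagid, idx, value) VALUES (42, '2014-05-25 09:00:02', 101.9);
APPLY BATCH;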

 

Regards

Mark. 



Re: decommissioning a node

2014-05-25 Thread Tim Dunphy
Also, for information that may help diagnose this issue: I am running Cassandra 2.0.7.

I am also using these Java options:

[root@beta:/etc/alternatives/cassandrahome] #grep -i jvm_opts conf/cassandra-env.sh | grep -v '#'
JVM_OPTS="$JVM_OPTS -ea"
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities"
JVM_OPTS="$JVM_OPTS -XX:ThreadPriorityPolicy=42"
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
JVM_OPTS="$JVM_OPTS -Xss256k"
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=103"
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS"


Still need to figure out why the node I want to decommission isn't
listening on port 7199 and how I can actually decommission it.

Thanks
Tim



-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: decommissioning a node

2014-05-25 Thread Colin Clark
Try this:

nodetool decommission host-id-of-node-to-decommission

UN means UP, NORMAL
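
For reference, a hedged sketch of the two forms as they exist in 2.0.x (decommission takes no argument and must be pointed at the node that is leaving; removing a node by host id is a separate command). The IP, JMX port, and host id below are taken from the nodetool status output earlier in the thread:

# run against the node that should leave the ring; it streams its data away first
nodetool -h 10.10.1.18 -p 7199 decommission

# or, from any live node, if the leaving node is unreachable or already dead
nodetool removenode f2a48fc7-a362-43f5-9061-4bb3739fdeaf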

--
Colin
+1 320 221 9531





Re: decommissioning a node

2014-05-25 Thread Tim Dunphy
OK, I copied the cassandra-env.sh from the host that had Cassandra listening on port 7199 to the node that wasn't.

That got it listening on the JMX port:

[root@beta:~] #lsof -i :7199
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
java9197 root   45u  IPv4 6411278  0t0  TCP *:7199 (LISTEN)

But even though I can telnet to that port from the seed node:

[root@beta-new:~] #telnet  10.10.1.18 7199
Trying 166.78.27.18...
Connected to 166.78.27.18.
Escape character is '^]'.


I still get connection refused when trying to decommission the node:

[root@beta-new:~] #nodetool -host 10.10.1.18 decommission
Failed to connect to '166.78.27.18:7199': Connection refused
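
A few hedged checks for the refused JMX connection (the refusal on 166.78.27.18 while telnet to 10.10.1.18 succeeds hints at an address mapping or firewall sitting in between); failing that, running nodetool decommission locally on the node avoids remote JMX altogether:

# on 10.10.1.18: which address the JMX port is actually bound to
lsof -i :7199

# on beta-new: which address 10.10.1.18 resolves to for nodetool
getent hosts 10.10.1.18

# on either host: any firewall rule touching 7199
iptables -L -n | grep 7199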

To Colin, thanks for the information!

nodetool decommission host-id-of-node-to-decommission

UN means UP, NORMAL

Oh and thanks I tried that and it seems to be working!

[root@beta-new:~] #nodetool decommission
f2a48fc7-a362-43f5-9061-4bb3739fdeaf
Decommission will decommission the node you are connected to and does not
take arguments!

Sorry guys, wrote this email in a hurry as we're checking out of a hotel
room currently. :) I'll let you know if this does work.




Re: cassandra boot is stuck in hint compaction.

2014-05-25 Thread Michael Shuler

On 05/25/2014 04:12 AM, Igor Shprukh wrote:

Hi guys, we have a 6-node cluster consisting of 5 Linux machines and one Windows machine.


Mixed Linux/Windows clusters are not a supported configuration. It might work.

https://www.google.com/search?q=cassandra+mixed+linux+windows+cluster


After a hard shutdown of the Windows machine, the node has been stuck on hints compaction for more than half an hour and Cassandra won't start. I should say that it is a strong machine with 16 GB of RAM and 250 GB of space dedicated to the node. All other nodes are up.

What could be causing this?


I'm not sure if your mixed-OS cluster is by choice, or if you are simply testing the feasibility of running mixed-OS clusters. It may be interesting to see how far you get working through the problem, and if improvements could be made to fix whatever problem you are having, please open a JIRA ticket with your discoveries.


As Paulo mentioned, you'll need to dig through log details and do some 
thorough investigation to help others help you, since you are in 
untested waters.  :)


--
Kind regards,
Michael


Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Jack Krupansky
Typo: I presume “channelid” should be “tagid” for the partition key for your 
table.

Yes, BATCH statements are the way to go, but be careful not to make your batches too large; otherwise you could lose performance, with Cassandra relatively idle while the batch slowly streams in to the coordinator node over the network. Better to break up a large batch into multiple moderate-size batches (exact size and number will vary and need testing to deduce) that will transmit more quickly and can be executed in parallel.

I’m not sure Cassandra on a laptop would be the best measure of performance for 
a real cluster, especially compared to a server with more CPU cores than your 
laptop.

And for a real cluster, rows with different partition keys can be sent to a coordinator node that owns that partition key, which could be multiple nodes for RF > 1.
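
To make the splitting concrete, a hedged sketch of what the client would send (statement text only, reusing the illustrative tagid values from the sketch in the original message; in practice each batch is issued as its own statement so they can run in parallel):

-- one moderate-size unlogged batch per partition (tagid), sent as separate statements
BEGIN UNLOGGED BATCH
  INSERT INTO tag (tagid, idx, value) VALUES (1, '2014-05-25 09:00:00', 10.0);
  INSERT INTO tag (tagid, idx, value) VALUES (1, '2014-05-25 09:00:01', 10.1);
APPLY BATCH;

BEGIN UNLOGGED BATCH
  INSERT INTO tag (tagid, idx, value) VALUES (2, '2014-05-25 09:00:00', 20.0);
  INSERT INTO tag (tagid, idx, value) VALUES (2, '2014-05-25 09:00:01', 20.1);
APPLY BATCH;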

-- Jack Krupansky



Avoiding High Cell Tombstone Count

2014-05-25 Thread Charlie Mason
Hi All,

I have a table which has one column per user. It receives a lot of updates to these columns throughout its lifetime, and they are always updates to a few specific columns. Firstly, is Cassandra storing a tombstone for each of these old column values?

I have run a simple select and seen the following tracing results:

activity | timestamp | source | source_elapsed
-----------------------------------------------
execute_cql3_query | 19:48:36,582 | 127.0.0.1 | 0
Parsing SELECT Account, Balance FROM AccountBalances WHERE Account = 'test9' LIMIT 1; | 19:48:36,582 | 127.0.0.1 | 56
Preparing statement | 19:48:36,582 | 127.0.0.1 | 181
Executing single-partition query on accountbalances | 19:48:36,583 | 127.0.0.1 | 878
Acquiring sstable references | 19:48:36,583 | 127.0.0.1 | 895
Merging memtable tombstones | 19:48:36,583 | 127.0.0.1 | 918
Key cache hit for sstable 569 | 19:48:36,583 | 127.0.0.1 | 997
Seeking to partition beginning in data file | 19:48:36,583 | 127.0.0.1 | 1034
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 19:48:36,583 | 127.0.0.1 | 1383
Merging data from memtables and 1 sstables | 19:48:36,583 | 127.0.0.1 | 1402
Read 1 live and 123780 tombstoned cells | 19:48:36,710 | 127.0.0.1 | 128631
Request complete | 19:48:36,711 | 127.0.0.1 | 129276


As you can see, that's an awful lot of tombstoned cells, and that's after a full compaction as well. Just so you are aware, this table is updated using a Paxos IF statement.

It still seems fairly snappy; however, I am concerned it's only going to get worse.

Would I be better off adding a time-based key to the primary key, then doing a separate insert and deleting the original? If I did the query with a limit of one, it should always find the first row before hitting a tombstone. Is that correct?
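
A minimal sketch of the layout being described, with illustrative names loosely based on the trace above (column types are assumptions):

CREATE TABLE AccountBalances (
  Account text,
  Updated timeuuid,
  Balance double,
  PRIMARY KEY (Account, Updated)
) WITH CLUSTERING ORDER BY (Updated DESC);

-- newest entry sorts first, so LIMIT 1 returns the latest balance
SELECT Balance FROM AccountBalances WHERE Account = 'test9' LIMIT 1;

Because newer timestamps sort first under the descending clustering order, the LIMIT 1 read should reach the latest live row before any tombstones left by deleting older entries.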

Thanks,

Charlie M


Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Colin
Try async updates, collect the futures at 1,000, and play around from there.

Also, in the real world, you'd want to use load-balancing and token-aware policies when connecting to the cluster. This will actually bypass the coordinator and write directly to the correct nodes.

I will post a link to my GitHub with an example when I get off the road.

--
Colin
320-221-9531




Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Colin
Also, make sure you're using prepared statements.

--
Colin
320-221-9531




New node Unable to gossip with any seeds

2014-05-25 Thread Tim Dunphy
Hello,

I am trying to spin up a new node using Cassandra 2.0.7. Both nodes are at Digital Ocean. The seed node is up and running, and I can telnet to port 7000 on that host from the node I'm trying to start.

[root@cassandra02 apache-cassandra-2.0.7]# telnet 10.10.1.94 7000

Trying 10.10.1.94...

Connected to 10.10.1.94.

Escape character is '^]'.

But when I start cassandra on the new node I see the following exception:


INFO 00:01:34,744 Handshaking version with /10.10.1.94

ERROR 00:02:05,733 Exception encountered during startup

java.lang.RuntimeException: Unable to gossip with any seeds

at
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)

at
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:447)

at
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:656)

at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)

at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)

at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)

at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)

at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)

java.lang.RuntimeException: Unable to gossip with any seeds

at
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)

at
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:447)

at
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:656)

at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)

at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)

at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)

at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)

at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)

Exception encountered during startup: Unable to gossip with any seeds

ERROR 00:02:05,742 Exception in thread
Thread[StorageServiceShutdownHook,5,main]

java.lang.NullPointerException

at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1270)

at
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:573)

at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)

at java.lang.Thread.run(Thread.java:745)



I'm using the Murmur3 partitioner on both nodes, and I have the seed node's IP listed in the cassandra.yaml of the new node. I'm just wondering what the issue might be and how I can get around it.
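
For reference, a minimal sketch of the cassandra.yaml fields that usually matter for this error, with placeholder addresses (the handshake reaching /10.10.1.94 and then failing often comes down to one of these not matching between the two nodes):

cluster_name: 'My Cluster'        # must be identical on every node
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.10.1.94"
listen_address: <this node's reachable IP>    # not 127.0.0.1 when nodes are on separate hosts
# broadcast_address: <public IP>              # only if the nodes must reach each other through NAT
storage_port: 7000

The class name and field names follow the stock 2.0 cassandra.yaml; the addresses are placeholders.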


Thanks

Tim




-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B