Cassandra 1 node crashed in ring

2012-06-07 Thread Adeel Akbar
Hi, 

 

I am running 2 nodes of Cassandra 0.8.1 in a ring with replication factor 2.
Last night one of the Cassandra servers crashed and now we are running on a
single node. Please help me: how do I add a new node to the ring so that it
gets all the updates/data that were lost on the crashed server?

 

Thanks & Regards

 

Adeel Akbar

 



how to configure cassandra as multi tenant

2012-06-07 Thread MOHD ARSHAD SALEEM
Hi All,

I wanted to know how to use Cassandra as a multi-tenant database.

Regards
Arshad


Re: NullPointerException when requesting compactionstats through nodetool

2012-06-07 Thread Sylvain Lebresne
On Thu, Jun 7, 2012 at 1:49 AM, sj.climber sj.clim...@gmail.com wrote:
 Looking at the data file directory, it's clear that the major compaction is
 progressing.  However, I am unable to get stats on the compaction.  More
 specifically, nodetool -h host1 compactionstats yields the following
 NullPointerException.

Obviously, that shouldn't happen, so that's a bug. Would you mind
opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA
with that stack trace?

Thanks,
Sylvain


 Exception in thread "main" java.lang.NullPointerException
        at
 org.apache.cassandra.db.compaction.CompactionInfo.asMap(CompactionInfo.java:103)
        at
 org.apache.cassandra.db.compaction.CompactionManager.getCompactions(CompactionManager.java:1115)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
        at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
        at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
        at 
 com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
        at 
 com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
        at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
        at
 com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
        at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
        at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
        at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
        at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
        at
 javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
        at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
        at sun.rmi.transport.Transport$1.run(Transport.java:159)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
        at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
        at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
        at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

 Ideas?  Thanks in advance!



Re: Cassandra 1 node crashed in ring

2012-06-07 Thread rohit bhatia
Restart Cassandra on the new node with auto_bootstrap set to true, the seed
set to the existing node in the cluster, and an appropriate token...
You should not need to run nodetool repair, as auto-bootstrap takes care
of it.



RE: Cassandra 1 node crashed in ring

2012-06-07 Thread Adeel Akbar
Hi, 

I have done the same and now it displays three nodes in the ring. How do I
remove the crashed node, and what about its data?


root@zerg:~/apache-cassandra-0.8.1/bin# ./nodetool -h XXX.XX.XXX.XX ring
Address         DC          Rack        Status State   Load            Owns
Token

147906224866113468886003862620136792702
XX.XX.XX.XX     16          100         Up     Normal  17.37 MB
14.93%  3159755813495848170708142250209621026
XX.XX.XX.XX     16          100         Down   Normal  ?
23.56%  43237339313998282086051322460691860905
XX.XX.XX.XX     16          100         Up     Normal  15.21 KB
61.52%  147906224866113468886003862620136792702

Thanks & Regards

Adeel Akbar



Re: Cassandra 1 node crashed in ring

2012-06-07 Thread rohit bhatia
Pardon me for assuming that your new node was the same as the failed node.

Please see
http://www.datastax.com/docs/1.0/operations/cluster_management#replacing-a-dead-node

You should be able to proceed with the above link after decommissioning
the new node...
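
For reference, a hedged sketch of the two commands involved (0.8.x nodetool;
the host arguments are placeholders):

    # on the extra node you just bootstrapped: stream its ranges away
    # and leave the ring
    bin/nodetool -h <new-node-ip> decommission

    # then, from any live node, drop the dead node's token from the ring
    bin/nodetool -h <live-node-ip> removetoken 43237339313998282086051322460691860905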



Re: Cassandra 1 node crashed in ring

2012-06-07 Thread rohit bhatia
For 0.8:
http://www.datastax.com/docs/0.8/operations/cluster_management#replacing-a-dead-node



Re: Nodes not picking up data on repair, disk loaded unevenly

2012-06-07 Thread aaron morton
  I am now running major compactions on those nodes (and all is well so far).
Major compaction in this situation will make things worse. When you end up with
one big file, you will need that much space again to compact / upgrade /
re-write it.

 back down to a normal size, can I move all the data back off the ebs volumes?
 something along the lines of:
Yup.  

 Then add some more nodes to the cluster to keep this from happening in the 
 future.
Yep. Once everything is settled and repair is running, it should be a simple
operation.

 I assume all the files stored in any of the data directories are all uniquely 
 named and cassandra won't really care where they are as long as everything it 
 wants is in it's data directories.

Unique on each node. 

 So it looks like I never got the tree from node #2 (the node which has 
 particularly out of control disk usage).
If you look at the logs for node #2 you will probably find an error.
Or it may still be running; check nodetool compactionstats.

 -Is there any way to force replay of hints to empty this out – just a full 
 cluster restart when everything is working again maybe?
Normally I would say stop the nodes and delete the hints CF's. As you have 
deleted CF's from one of the nodes there is a risk of losing data though. 

If you have been working at CL QUORUM and have not been getting 
TimedOutException you can still delete the hints. As the writes they contain 
should be on at least one other node and they will be repaired by repair. 

  I have a high replication factor and all my writes have been at cl=ONE (so 
 all the data in the hints should actually exist in a CF somewhere right?).
There is a chance that a write was only applied locally on the node that you
deleted the data from, and it recorded hints to send to the other nodes. It's a
remote chance, but still there.

  how much working space does this need?  Problem is that node #2 is so full
 I'm not sure any major rebuild or compaction will be successful.  The other
 nodes seem to be handling things ok although they are still heavily loaded.
upgradesstables processes one SSTable at a time; it only needs enough space to
re-write that SSTable.
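
(For reference, a hedged sketch of the command itself, assuming a 1.0-era
nodetool; the keyspace and column family arguments are optional:

    bin/nodetool -h <host> upgradesstables <keyspace> <column_family>)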

This is why major compaction hurts in these situations. If you have 1.5T of 
small files, you may have enough free space to re-write all the files. If you 
have a single 1.5T file you don't. 

 This cluster has a super high write load currently since I'm still building 
 it out.  I frequently update every row in my CFs
Sounds like a lot of overwrites. When you get compaction running it may purge
a lot of data.


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/06/2012, at 2:51 AM, Luke Hospadaruk wrote:

 Thanks for the tips
 
 Some things I found looking around:
 
 grepping the logs for a specific repair I ran yesterday:
 
 /var/log/cassandra# grep df14e460-af48-11e1--e9014560c7bd system.log
 INFO [AntiEntropySessions:13] 2012-06-05 19:58:51,303 AntiEntropyService.java 
 (line 658) [repair #df14e460-af48-11e1--e9014560c7bd] new session: will 
 sync /4.xx.xx.xx, /1.xx.xx.xx, /3.xx.xx.xx, /2.xx.xx.xx on range 
 (85070591730234615865843651857942052864,127605887595351923798765477786913079296]
  for content.[article2]
 INFO [AntiEntropySessions:13] 2012-06-05 19:58:51,304 AntiEntropyService.java 
 (line 837) [repair #df14e460-af48-11e1--e9014560c7bd] requests for merkle 
 tree sent for article2 (to [ /4.xx.xx.xx, /1.xx.xx.xx, /3.xx.xx.xx, 
 /2.xx.xx.xx])
 INFO [AntiEntropyStage:1] 2012-06-05 20:07:01,169 AntiEntropyService.java 
 (line 190) [repair #df14e460-af48-11e1--e9014560c7bd] Received merkle 
 tree for article2 from /4.xx.xx.xx
 INFO [AntiEntropyStage:1] 2012-06-06 04:12:30,633 AntiEntropyService.java 
 (line 190) [repair #df14e460-af48-11e1--e9014560c7bd] Received merkle 
 tree for article2 from /3.xx.xx.xx
 INFO [AntiEntropyStage:1] 2012-06-06 07:02:51,497 AntiEntropyService.java 
 (line 190) [repair #df14e460-af48-11e1--e9014560c7bd] Received merkle 
 tree for article2 from /1.xx.xx.xx
 
 So it looks like I never got the tree from node #2 (the node which has 
 particularly out of control disk usage).
 
 These are running on amazon m1.xlarge instances with all the EBS volumes 
 raided together for a total of 1.7TB.
 
 What version are you using ?
 1.0
 
 Has there been times when nodes were down ?
 Yes, but mostly just restarts, and mostly just one node at a time
 
 Clear as much space as possible from the disk. Check for snapshots in all 
 KS's.
 Already done.
 
 What KS's (including the system KS) are taking up the most space ? Are there 
 a lot of hints in the system KS (they are not replicated)?
 -There's just one KS that I'm actually using, which takes up anywhere
 from about 650GB on the node I was able to scrub and compact (that sounds
 like the right size to me) to 1.3T on the node that is hugely bloated.
 -There are pretty huge hints CFs on all but one node (the 

Re: Secondary Indexes, Quorum and Cluster Availability

2012-06-07 Thread aaron morton
Sounds good. Do you want to make the change ? 

Thanks for taking the time. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/06/2012, at 7:54 AM, Jim Ancona wrote:

 On Tue, Jun 5, 2012 at 4:30 PM, Jim Ancona j...@anconafamily.com wrote:
 It might be a good idea for the documentation to reflect the tradeoffs more 
 clearly.
 
 Here's a proposed addition to the Secondary Index FAQ at 
 http://wiki.apache.org/cassandra/SecondaryIndexes
 
 Q: How does choice of Consistency Level affect cluster availability when 
 using secondary indexes?
 A: Because secondary indexes are distributed, you must have CL level nodes 
 available for all token ranges in the cluster in order to complete a query. 
 For example, with RF = 3, when two out of three consecutive nodes in the ring 
 are unavailable, all secondary index queries at CL = QUORUM will fail, 
 however secondary index queries at CL = ONE will succeed. This is true 
 regardless of cluster size.
 
 Comments?
 
 Jim
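
To see why the proposed answer holds, a quick worked sketch (the node names
are illustrative): with RF = 3, QUORUM needs 2 live replicas per range. If
adjacent nodes N2 and N3 are down, the token range whose replicas are
{N2, N3, N4} has only one live replica, so any QUORUM index query touching
that range fails, while at CL = ONE the single live replica can still answer.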



Re: Cassandra 1.1.1 Fails to Start

2012-06-07 Thread aaron morton
How much data do you have on the node ? 
Was this a previously running system that was upgraded ? 

 with disk_access_mode mmap_index_only and mmap I see OOM map failed error on 
 SSTableBatchOpen thread
Do you have the stack trace from the log ?

 ERROR [CompactionExecutor:6] 2012-06-06 20:24:19,772 
 AbstractCassandraDaemon.java (line 134) Exception in thread 
 Thread[CompactionExecutor:6,1,main]
 java.lang.StackOverflowError
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
Was there more to this stack trace ? 
What were the log messages before this error ? 


  INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 122) 
 Heap size: 1525415936/1525415936
The JVM only has 1.5 GB of RAM, which is at the lower limit. If you have some
data to load, I would not be surprised if it failed to start.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/06/2012, at 8:41 AM, Javier Sotelo wrote:

 Hi All,
 
 On SuSe Linux blade with 6GB of RAM.
 
 with disk_access_mode mmap_index_only and mmap I see OOM map failed error on 
 SSTableBatchOpen thread. cat /proc/pid/maps shows a peak of 53521 right 
 before it dies. vm.max_map_count = 1966080 and /proc/pid/limits shows 
 unlimited locked memory.
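
(For reference, a hedged way to watch the mapping count while the node starts
up; <pid> is the Cassandra process id:

    watch -n1 'wc -l /proc/<pid>/maps'
    sysctl vm.max_map_count)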
 
 with disk_access_mode standard, the node does start up but I see the repeated 
 error:
 ERROR [CompactionExecutor:6] 2012-06-06 20:24:19,772 
 AbstractCassandraDaemon.java (line 134) Exception in thread 
 Thread[CompactionExecutor:6,1,main]
 java.lang.StackOverflowError
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
 at com.google.common.collect.Sets$1.iterator(Sets.java:578)
 ...
 
 I'm not sure the second error is related to the first. I prefer to run with 
 full mmap but I have run out of ideas. Is there anything else I can do to 
 debug this?
 
 Here's startup settings from debug log:
  INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 121) 
 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_31
  INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 122) 
 Heap size: 1525415936/1525415936
  ...
  INFO [main] 2012-06-06 20:17:10,946 CLibrary.java (line 111) JNA mlockall 
 successful
  ...
  INFO [main] 2012-06-06 20:17:11,055 DatabaseDescriptor.java (line 191) 
 DiskAccessMode is standard, indexAccessMode is standard
  INFO [main] 2012-06-06 20:17:11,213 DatabaseDescriptor.java (line 247) 
 Global memtable threshold is enabled at 484MB
  INFO [main] 2012-06-06 20:17:11,499 CacheService.java (line 96) Initializing 
 key cache with capacity of 72 MBs.
  INFO [main] 2012-06-06 20:17:11,509 CacheService.java (line 107) Scheduling 
 key cache save to each 14400 seconds (going to save all keys).
  INFO [main] 2012-06-06 20:17:11,510 CacheService.java (line 121) 
 Initializing row cache with capacity of 0 MBs and provider 
 org.apache.cassandra.cache.SerializingCacheProvider
  INFO [main] 2012-06-06 20:17:11,513 CacheService.java (line 133) Scheduling 
 row cache save to each 0 seconds (going to save all keys).
 
 Thanks In Advance,
 Javier



Re: Cassandra 1 node crashed in ring

2012-06-07 Thread aaron morton
  of Cassandra 0.8.1
I would recommend upgrading to the latest 0.8 release; there are a lot of bug
fixes. (If not 1.0.10.)

 Please help me: how do I add a new node to the ring so that it gets all the
 updates/data that were lost on the crashed server?
Have you been working at CL QUORUM and running repair regularly?

I am assuming the crashed server is totally dead and cannot be started.

You can start a new node with the same token as the dead one. When it joins
the ring it will auto-bootstrap and copy the data from the node that is up.

If you can copy the data off the old one, copy it all to a new node and start
it. Then run nodetool repair.
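
A hedged sketch of that second recovery path (host names and paths are
placeholders; /var/lib/cassandra is the stock data location):

    # copy the dead node's data files onto the new node
    rsync -av old-disk:/var/lib/cassandra/data/ /var/lib/cassandra/data/

    # give the new node the dead node's initial_token in cassandra.yaml,
    # start it, then bring it fully consistent
    bin/nodetool -h <new-node-ip> repair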

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




Re: how to configure cassandra as multi tenant

2012-06-07 Thread aaron morton
Cassandra is not designed to run as a multi tenant database. 

There have been some recent discussions on this; search the user group for more
detailed answers.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




memtable_flush_queue_size and memtable_flush_writers

2012-06-07 Thread rohit bhatia
Hi

I can't find this in any documentation online, so I just wanted to ask:

Do all flush writers share the same flush queue, or do they each maintain
a separate queue?

Thanks
Rohit


Maximum load per node

2012-06-07 Thread Filippo Diotalevi
Hi,
one of Aaron's latest observations about the max load per Cassandra node
caught my attention:

 At ~840GB I'm probably running close
 to the max load I should have on a node,

 [AM] roughly 300GB to 400GB is the max load

Since we currently have a Cassandra node with roughly 330GB of data, it looks
like it's a good time for us to really understand what that limit is in our
case. Also, a (maybe old) Stackoverflow question at
http://stackoverflow.com/questions/4775388/how-much-data-per-node-in-cassandra-cluster
seems to suggest a higher limit per node.

Just considering the compaction issues, what are the factors we need to account
for to determine the max load?

* disk space
The Datastax Cassandra docs state (pg 97) that a major compaction temporarily
doubles disk space usage. Is it a safe estimate to say that the Cassandra
machine needs to have roughly the same amount of free disk space as the current
load of the Cassandra node, or are there other factors to consider?

* RAM
Does the amount of RAM in the machine (or dedicated to the Cassandra node)
affect in any way the speed/efficiency of the compaction process?

* Performance degradation for overloaded nodes?
What kind of performance degradation can we expect for a Cassandra node which
is overloaded (e.g. with 500GB or more of data)?


Thanks for the clarifications,
-- 
Filippo Diotalevi




Re: Maximum load per node

2012-06-07 Thread Ben Kaehne
Does this max load correlate with replication factor?

I.e., for a 3-node cluster with an RF of 3, should I be worried at {max load}
x 3, or at the max load people generally mention?


-- 
-Ben


Failing operations repair

2012-06-07 Thread Віталій Тимчишин
Hello.

I am giving some Cassandra presentations in Kyiv and would like to check
that I am telling people the truth :)
Could the community tell me if the following points are true:
1) A failed (from the client-side view) operation may still be applied to the
cluster.
2) The coordinator does not try to roll back an operation that failed because
it was processed by fewer than the consistency-level number of nodes.
3) Hinted handoff works only for successful operations.
4) Counters are not reliable because of (1).
5) Read repair may help to propagate an operation that failed its consistency
level but was persisted on some nodes.
6) Manual repair is still needed because of (2) and (3).

P.S. If some points apply only to some cassandra versions, I will be happy
to know this too.
-- 
Best regards,
 Vitalii Tymchyshyn


Re: Cassandra 1.1.1 Fails to Start

2012-06-07 Thread Javier Sotelo
nodetool ring showed a 34.89GB load. Upgrading from 1.1.0. One small keyspace
with no compression, about 250MB. The rest is taken by the second keyspace,
with leveled compaction and Snappy compression.

The blade is an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz with 6GB of RAM.





Cassandra 1.1.1 stack overflow on an infinite loop building IntervalTree

2012-06-07 Thread Omid Aladini
Hi,

One of my 1.1.1 nodes doesn't restart, due to a stack overflow while building
the interval tree. Bumping the stack size doesn't help. Here's the stack trace:

https://gist.github.com/2889611

It looks more like an infinite loop in the IntervalNode constructor's logic
than a deep tree, since the DEBUG log shows looping over the same intervals:

https://gist.github.com/2889862

Running it with assertions enabled shows a number of sstables in which the
first key > last key, for example:

2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
DecoratedKey(2254009252149354268486114339861094,
3730343137317c3438333632333932) > last key
DecoratedKey(22166106697727078019854024428005234814,
313138323637397c3432373931353435)

and lets the node come up without hitting the IntervalNode constructor. I
wonder how invalid sstables get created in the first place. Is there a way to
verify whether other nodes in the cluster are affected as well?

As for a solution to get the node back up without wiping the data off and
letting it bootstrap again: if I remove the affected sstables and restart the
node, followed by a repair, will the node end up in a consistent state?

The sstables contain counter columns, and leveled compaction is used.
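
For reference, a hedged sketch of that removal procedure (paths and file names
are placeholders; the affected files are the ones named in the assertion
errors):

    # with the node stopped, set aside the offending sstable's files
    mv /var/lib/cassandra/data/<ks>/<cf>-hc-<gen>-* /var/tmp/bad-sstables/

    # start the node, then repair the keyspace
    bin/nodetool -h localhost repair <ks>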

Thanks,
Omid


Cassandra 1.1.1 crash during compaction

2012-06-07 Thread Dustin Wenz
We observed a JRE crash on one node in a seven-node cluster about half an hour
after upgrading to version 1.1.1 yesterday. Immediately after the upgrade,
everything seemed to be working fine. The last item in the cassandra log was an
info-level notification that compaction had started on a data file. Four
minutes later, the process crashed.

The host OS is FreeBSD 8.2, built for the amd64 architecture. Most of the 
cluster settings are left to their defaults and the replication factor is set 
to 2 for our keyspace. We are using the RandomPartitioner and 
RackInferringSnitch. JNA is enabled, but cannot use mlockall since the process 
runs as a non-privileged user. It was also necessary to build our own Snappy
compressor jar file, since the required architecture was not built into the
public distribution.

Cassandra is a fairly new software deployment for us, and I was hoping someone 
could give me some pointers on interpreting the crash report below.

Thanks,

- .Dustin

#
# An unexpected error has been detected by Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x000801199140, pid=44897, tid=0x8d1fdc80
#
# Java VM: Diablo Java HotSpot(TM) 64-Bit Server VM (10.0-b23 mixed mode 
bsd-amd64)
# Problematic frame:
# V  [libjvm.so+0x599140]
#
# Please submit bug reports to freebsd-j...@freebsd.org
#

---  T H R E A D  ---

Current thread (0x000aa9d56000):  JavaThread CompactionExecutor:30 daemon 
[_thread_in_vm, id=-1927291776, stack(0x7898e000,0x78a8e000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=3 (BUS_OBJERR), 
si_addr=0x000801199140

Registers:
RAX=0x000aa95f8fe8, RBX=0x000aa95fc2c0, RCX=0x000987a334f0, 
RDX=0x0009274e9888
RSP=0x78a8d630, RBP=0x78a8d640, RSI=0x0009274e9888, 
RDI=0xc90009274e99
R8 =0x00098c203bd8, R9 =0x0008809ff4b8, R10=0x000801488580, 
R11=0x0001
R12=0x000aa9d56000, R13=0x000aa95f8c00, R14=0x78a8d818, 
R15=0x000aa95f8c10
RIP=0x000801199140, EFL=0x003b003b0001, ERR=0x
  TRAPNO=0x001b00130009

Top of Stack: (sp=0x78a8d630)
0x78a8d630:   000aa95fc2c0 000aa9d56000
0x78a8d640:   78a8d660 00080119917e
0x78a8d650:   00080345b690 000aa95fc2c0
0x78a8d660:   78a8d6a0 000800f3a1fd
0x78a8d670:   000aa95f8fe8 0009274e9869
0x78a8d680:   000986f7acb0 000986f7e591
0x78a8d690:   78a8d818 000aa9d56000
0x78a8d6a0:   78a8d700 00080346556f
0x78a8d6b0:   0009274e9888 00080346553b
0x78a8d6c0:   78a8d6c0 000986f7e591
0x78a8d6d0:   78a8d818 000987a30198
0x78a8d6e0:   000987a334f0 000986f7e918
0x78a8d6f0:    78a8d810
0x78a8d700:   78a8d870 00080345c04e
0x78a8d710:    
0x78a8d720:    
0x78a8d730:    
0x78a8d740:   0009274aa428 00080817df80
0x78a8d750:   0008081714c0 
0x78a8d760:   00080817df30 
0x78a8d770:   00091b5ed0a0 
0x78a8d780:    
0x78a8d790:    
0x78a8d7a0:    
0x78a8d7b0:    
0x78a8d7c0:    
0x78a8d7d0:    0017974b
0x78a8d7e0:   deaddeaf 0137c3fdd558
0x78a8d7f0:   deaddeaf 
0x78a8d800:    00091b638910
0x78a8d810:    
0x78a8d820:   000882d0d2e8 0008406a3de8 

Instructions: (pc=0x000801199140)
0x000801199130:   55 48 89 e5 48 89 5d f0 4c 89 65 f8 48 83 ec 10
0x000801199140:   0f b7 47 10 48 89 fb 44 8d 60 01 49 63 fc e8 4d 

Stack: [0x7898e000,0x78a8e000],  sp=0x78a8d630,  free 
space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x599140]
V  [libjvm.so+0x59917e]
V  [libjvm.so+0x33a1fd]
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
v  ~BufferBlob::Interpreter
J  java.util.concurrent.ThreadPoolExecutor$Worker.run()V
v  ~BufferBlob::Interpreter
v  ~BufferBlob::StubRoutines (1)

---  P R O C E S S  ---

Java Threads: ( => current thread )

Data corruption issues with 1.1

2012-06-07 Thread Oleg Dulin
I can't quite describe what happened, but essentially one day I found 
that my column values that are supposed to be UTF-8 strings started 
getting bogus characters.


Is there a known data corruption issue with 1.1 ?




nodetool repair -- should I schedule a weekly one ?

2012-06-07 Thread Oleg Dulin
We have a 3-node cluster. We use an RF of 3 and a CL of ONE for both reads
and writes… Is there a reason I should schedule a regular nodetool repair
job?


Thanks,
Oleg
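
(For reference, scheduling one is typically just a cron entry per node; a
hedged sketch, with the install path borrowed from elsewhere in this digest:

    # /etc/cron.d/cassandra-repair -- weekly repair, Sundays at 03:00
    0 3 * * 0  root  /opt/cassandra/bin/nodetool -h localhost repair)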




Re: nodetool repair -- should I schedule a weekly one ?

2012-06-07 Thread ruslan usifov
Yes. With CL ONE you can get inconsistent reads in the case where one of your
nodes dies and the dynamic snitch doesn't do its job.

2012/6/7 Oleg Dulin oleg.du...@gmail.com:
 We have a 3-node cluster. We use an RF of 3 and a CL of ONE for both reads
 and writes… Is there a reason I should schedule a regular nodetool repair job?

 Thanks,
 Oleg




Re: nodetool repair -- should I schedule a weekly one ?

2012-06-07 Thread ruslan usifov
Sorry, not the dynamic snitch, but hinted handoff. Remember, Cassandra is
eventually consistent.





Re: Maximum load per node

2012-06-07 Thread aaron morton
It's not a hard rule; you can put more data on a node. The 300GB to 400GB idea
is mostly concerned with operations; you may want to put less on a node due to
higher throughput demands.

(We are talking about the amount of data on a node, regardless of the RF). 

On the operations side the considerations are:

* If you want to move the node to a new host, moving 400 GB at 35MB/sec takes
about 3 to 4 hours (this is the speed I recently got for moving 500GB on AWS
in the same AZ).

* Repair will need to process all of the data. Assuming the bottleneck is not
the CPU, and there are no other background processes running, it will take 7
hours to read the data at the default 16MB/sec
(compaction_throughput_mb_per_sec). (See the worked numbers after this list.)

* Some throughput considerations for compaction.

* Major compaction compacts all the sstables, and assumes that it needs that
much space again to write the new file. We normally don't want to do major
compactions though.

* If you are in a situation where you have lost redundancy for all or part of 
the key ring, you will want to get new nodes online ASAP. Taking several hours 
to bring new nodes on may not be acceptable. 

* The more data on disk, the more memory needed. The memory is taken up by
bloom filters and index sampling. These can be tuned to reduce the memory
footprint, with a potential reduction in read speed.

* Using compression helps reduce the on-disk size, and makes some things run
faster. My experience is that repair and compaction will still take a while,
as they deal with the uncompressed data.

* Startup time for index sampling is/was an issue (it's faster in 1.1). If the 
node has more memory and more disk the time to get the page cache hot will 
increase.

* As the amount of data per node goes up, potentially so does the working set
of hot data. If the memory per node available for the page cache remains the
same, latency will potentially increase. e.g. 3 nodes with 800GB each have
less memory for the hot set than 6 nodes with 400GB each.

It's just a rule of thumb to avoid getting into trouble, where trouble is
often "help, something went wrong and it takes ages to fix", "why does X take
forever", or "why does it use Y amount of memory". If you are aware of the
issues, there is essentially no upper limit on how much data you can put on a
node.
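
To make the move and repair figures above concrete, a worked sketch at 400GB
per node:

    move:   400 GB at 35 MB/sec = 409,600 MB / 35 MB/sec ≈ 11,700 sec ≈ 3.3 hours
    repair: 400 GB at 16 MB/sec = 409,600 MB / 16 MB/sec = 25,600 sec ≈ 7.1 hours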

Hope that helps. 
Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




Setting column to null

2012-06-07 Thread Leonid Ilyevsky
Is it possible to explicitly set a column value to null?

I see that if an insert statement does not include a specific column, that
column comes up as null (assuming we are creating a record with a new unique
key). But if we want to update a record, how do we set a column to null?

Another situation is when I use a prepared CQL3 statement (in Java) and send
parameters when I execute it. If I want to leave some column unassigned, I
need a special statement without that column.
What I would like is to prepare one statement including all columns, and then
be able to set some of them to null. I tried to set the corresponding
ByteBuffer parameter to null, and obviously got an exception.
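
For context, the usual CQL way to null out a column on an existing row is to
delete just that column (a hedged sketch; the table and column names are made
up):

    -- creating the row without col2: col2 reads back as null
    INSERT INTO t (key, col1) VALUES ('k1', 'v1');

    -- "setting col2 to null" on the existing row
    DELETE col2 FROM t WHERE key = 'k1';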




Re: Secondary Indexes, Quorum and Cluster Availability

2012-06-07 Thread Jim Ancona
On Thu, Jun 7, 2012 at 5:41 AM, aaron morton aa...@thelastpickle.com wrote:

 Sounds good. Do you want to make the change ?

Done.


 Thanks for taking the time.

Thanks for giving the answer!

Jim







Problem joining new node to cluster in 1.1.1

2012-06-07 Thread Bryce Godfrey
As the new node starts up I get this error before bootstrap starts:

INFO 08:20:51,584 Enqueuing flush of Memtable-schema_columns@1493418651(0/0 
serialized/live bytes, 1 ops)
INFO 08:20:51,584 Writing Memtable-schema_columns@1493418651(0/0 
serialized/live bytes, 1 ops)
INFO 08:20:51,589 Completed flushing 
/opt/cassandra/data/system/schema_columns/system-schema_columns-hc-1-Data.db 
(61 bytes)
ERROR 08:20:51,889 Exception in thread Thread[MigrationStage:1,5,main]
java.lang.IllegalArgumentException: value already present: 1015
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at 
com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
at org.apache.cassandra.config.Schema.load(Schema.java:385)
at org.apache.cassandra.db.DefsTable.addColumnFamily(DefsTable.java:426)
at 
org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:361)
at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)
at 
org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
at 
org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
INFO 08:20:51,931 Enqueuing flush of 
Memtable-schema_keyspaces@833041663(943/1178 serialized/live bytes, 20 ops)
INFO 08:20:51,932 Writing Memtable-schema_keyspaces@833041663(943/1178 
serialized/live bytes, 20 ops)


Then it starts spewing these errors nonstop until I kill it.

ERROR 08:21:45,959 Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1019
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
ERROR 08:21:45,814 Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1019
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
ERROR 08:21:45,813 Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1020
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
ERROR 08:21:45,813 Error in row mutation


I'm guessing the first error caused some column families to not be created?  

RE: Problem in getting data from a 2 node cluster

2012-06-07 Thread Prakrati Agrawal
What is the default replication factor? I did not set any replication factor.
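
(For reference, a hedged way to check the actual setting from cassandra-cli;
the keyspace name is a placeholder:

    describe keyspace <your_keyspace>;

Keyspaces created without an explicit setting have typically defaulted to
replication_factor 1, but verify on your cluster.)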

Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com

-Original Message-
From: Tim Wintle [mailto:timwin...@gmail.com]
Sent: Wednesday, June 06, 2012 5:42 PM
To: user@cassandra.apache.org
Subject: RE: Problem in getting data from a 2 node cluster

On Wed, 2012-06-06 at 06:54 -0500, Prakrati Agrawal wrote:
 This node will not auto bootstrap because it is configured to be a
 seed node

This means the cassandra.yaml on that node references itself as a seed
node.


After you decommission the second node, can you still access the entire
dataset in the single-node cluster, or has it been lost along the way?

What is the replication factor for your data?


Tim Wintle





RE: Cassandra not retrieving the complete data on 2 nodes

2012-06-07 Thread Prakrati Agrawal
I have specified the consistency level as 1

Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com

From: Poziombka, Wade L [mailto:wade.l.poziom...@intel.com]
Sent: Wednesday, June 06, 2012 11:11 PM
To: user@cassandra.apache.org
Subject: RE: Cassandra not retrieving the complete data on 2 nodes

what is your consistency level?

From: Prakrati Agrawal [mailto:prakrati.agra...@mu-sigma.com]
Sent: Wednesday, June 06, 2012 4:58 AM
To: user@cassandra.apache.org
Subject: RE: Cassandra not retrieving the complete data on 2 nodes

Please anyone reply to my query

Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com

From: Prakrati Agrawal [mailto:prakrati.agra...@mu-sigma.com]
Sent: Wednesday, June 06, 2012 2:34 PM
To: user@cassandra.apache.org
Subject: Cassandra not retrieving the complete data on 2 nodes

Dear all

I originally had a 1-node cluster. Then I added one more node to it, with the
initial token configured appropriately. Now when I run my queries I am not
getting all my data, i.e. all columns.

Output on 2 nodes
Time taken to retrieve columns 43707 of key range is 1276
Time taken to retrieve columns 2084199 of all tickers is 54334
Time taken to count is 230776
Total number of rows in the database are 183
Total number of columns in the database are 7903753

Output on 1 node
Time taken to retrieve columns 43707 of key range is 767
Time taken to retrieve columns 382 of all tickers is 52793
Time taken to count is 268135
Total number of rows in the database are 396
Total number of columns in the database are 16316426

Please help me. Where is my data going or how should I retrieve it.

Thanks and Regards
Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com





RE: Cassandra not retrieving the complete data on 2 nodes

2012-06-07 Thread Prakrati Agrawal
I was told that the node bootstraps automatically in Cassandra 1.1.0. Please
help me rectify the mistake.

Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com

From: Tyler Hobbs [mailto:ty...@datastax.com]
Sent: Wednesday, June 06, 2012 11:45 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra not retrieving the complete data on 2 nodes

In addition to using a low consistency level, it sounds like you didn't 
bootstrap the node or run a repair after it joined the ring.


--
Tyler Hobbs
DataStax, http://datastax.com/

