unable to find sufficient sources for streaming range
We are running Cassandra 1.2.5 on an 8-node cluster. We removed one machine from the cluster and tried to add it back (we are using vnodes, and some nodes own more tokens than others, so by rejoining this machine we hoped it would take some load off the busy machines). But we got the following exception and the node can no longer join the ring. Please help. Thanks in advance,

 INFO 16:01:56,260 JOINING: Starting to bootstrap...
ERROR 16:01:56,514 Exception encountered during startup
java.lang.IllegalStateException: unable to find sufficient sources for streaming range (131921530760098415548184818173535242096,132123583169200197961735373586277861750]
	at org.apache.cassandra.dht.RangeStreamer.getRangeFetchMap(RangeStreamer.java:205)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:129)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:924)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:693)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:548)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:445)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:325)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:413)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:456)
Exception encountered during startup: unable to find sufficient sources for streaming range (131921530760098415548184818173535242096,132123583169200197961735373586277861750]
ERROR 16:01:56,518 Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:321)
	at org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:362)
	at org.apache.cassandra.service.StorageService.access$000(StorageService.java:88)
	at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:513)

Daning
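[A recovery sequence that is often suggested for this symptom: the bootstrapping node cannot find a live replica for one of the ranges it is claiming, which typically means the ring still remembers the old, dead incarnation of the removed machine. This is a hedged sketch, not a verified fix; the host ID placeholder is illustrative, and the replace_address option requires a sufficiently recent 1.2.x release.]

```shell
# Check whether the removed machine still appears in the ring / gossip state:
nodetool status
nodetool gossipinfo

# If the old incarnation is still listed as down, remove it by host ID
# before bootstrapping the rejoining node:
nodetool removenode <host-id-of-dead-node>

# Alternatively, start the new node as an explicit replacement for the
# dead one by passing this JVM option at startup:
#   -Dcassandra.replace_address=<dead-node-ip>
```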
Bulk writes and key cache
Does Cassandra put keys into the key cache during the write path? If I have two tables, the key cache for the first table is nicely warmed up, and I then insert millions of rows into the second table (with no reads on the second table yet), will that affect the cache hit ratio for the first table? Thanks, Daning
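[My understanding is that the key cache is populated on the read path, not the write path, so pure inserts into the second table should not evict the first table's entries even though the cache is shared globally. A toy model of that behavior — hypothetical code, not Cassandra's actual implementation:]

```python
from collections import OrderedDict

class KeyCache:
    """Toy LRU key cache shared across tables, populated only by reads."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # (table, key) -> sstable position

    def write(self, table, key):
        pass  # writes go to the memtable/commitlog; the key cache is untouched

    def read(self, table, key):
        if (table, key) in self.entries:
            self.entries.move_to_end((table, key))
            return True                      # hit
        self.entries[(table, key)] = object()  # cache the key on a miss
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        return False

cache = KeyCache(capacity=1000)
for k in range(500):            # warm up table1 with reads
    cache.read("table1", k)
for k in range(10**5):          # bulk-insert into table2: no reads issued
    cache.write("table2", k)
hits = sum(cache.read("table1", k) for k in range(500))
print(hits)  # 500 -> table1 stays fully warm
```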
Move token to another node on 1.2.x
How do I move a token to another node on 1.2.x? I have tried the move command:

[cassy@dsat103.e1a ~]$ nodetool move 168755834953206242653616795390304335559
Exception in thread main java.io.IOException: target token 168755834953206242653616795390304335559 is already owned by another node.
	at org.apache.cassandra.service.StorageService.move(StorageService.java:2908)
	at org.apache.cassandra.service.StorageService.move(StorageService.java:2892)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Changing the token number a bit:

[cassy@dsat103.e1a ~]$ nodetool -h localhost move 168755834953206242653616795390304335560
This node has more than one token and cannot be moved thusly

We don't want to use cassandra-shuffle because it puts too much load on the servers; we just want to move some tokens. Thanks, Daning
Re: ReadCount change rate is different across nodes
Thanks. Actually, I forgot to mention that this is a multi-center environment and we have the dynamic snitch disabled, because we saw a performance impact from it in the multi-center environment.

On Wed, Oct 30, 2013 at 11:12 AM, Piavlo lolitus...@gmail.com wrote:

> On 10/30/2013 02:06 AM, Daning Wang wrote:
>> We are running 1.2.5 on 8 nodes (256 tokens). All the nodes run on the same type of machine, and the DB size is about the same. But recently we checked the ReadCount stats through JMX and found that some nodes have 3 times the change rate (we calculated the changes per minute) of others. We are using Hector on the client side, and clients connect to all the servers; we checked the open connections on each server and the numbers are about the same. What could cause this problem, and how can we debug it?
> Check the per-node read latency CF metrics. And I guess you have the dynamic snitch enabled?
> http://www.datastax.com/dev/blog/dynamic-snitching-in-cassandra-past-present-and-future

Thanks in advance, Daning
ReadCount change rate is different across nodes
We are running 1.2.5 on 8 nodes (256 tokens). All the nodes run on the same type of machine, and the DB size is about the same. But recently we checked the ReadCount stats through JMX and found that some nodes have 3 times the change rate (we calculated the changes per minute) of others. We are using Hector on the client side, and clients connect to all the servers; we checked the open connections on each server and the numbers are about the same. What could cause this problem, and how can we debug it? Thanks in advance, Daning
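[For reference, the per-minute change rate described above can be computed from two JMX samples of the cumulative ReadCount counter per node. A minimal sketch with made-up sample numbers:]

```python
def change_rate_per_minute(samples):
    """samples: [(seconds_since_start, cumulative_read_count), ...]
    Returns the read rate per minute between first and last sample."""
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return (c1 - c0) / ((t1 - t0) / 60.0)

# Hypothetical 10-minute samples from two nodes:
node_a = [(0, 1_000_000), (600, 1_030_000)]
node_b = [(0, 1_000_000), (600, 1_090_000)]
print(change_rate_per_minute(node_a))  # 3000.0 reads/min
print(change_rate_per_minute(node_b))  # 9000.0 reads/min -> 3x node_a
```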
Key cache size
We noticed that the key cache could not be fully populated. We have set the key cache size to 1024 MB:

key_cache_size_in_mb: 1024

But none of the nodes shows a cache capacity of 1 GB. We recently upgraded to 1.2.5; could this be an issue in that version?

Token            : (invoke with -T/--tokens to see all 256 tokens)
ID               : 0fd912fb-3187-462b-8c8a-7d223751b649
Gossip active    : true
Thrift active    : true
Load             : 73.16 GB
Generation No    : 1372374984
Uptime (seconds) : 5953779
Heap Memory (MB) : 5440.59 / 10035.25
Data Center      : dc1
Rack             : rac1
Exceptions       : 34601
Key Cache        : size 540060752 (bytes), capacity 540060796 (bytes), 12860975403 hits, 15535054378 requests, 0.839 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

Thanks, Daning
Dynamic Snitch and EC2MultiRegionSnitch
How does the dynamic snitch work with EC2MultiRegionSnitch? Can dynamic routing be confined to one data center? We don't want requests routed to the other center even when nodes on the other side are idle, since the network in between could be slow. Thanks in advance, Daning
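[For reference, the dynamic snitch knobs live in cassandra.yaml. A sketch of the relevant settings in 1.2 — the values shown are illustrative, not recommendations:]

```yaml
# cassandra.yaml (1.2) -- dynamic snitch settings
dynamic_snitch: true
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
# How much worse a node's score must be before requests are routed away
# from the natural replicas; raising this keeps reads pinned to the
# closest (e.g. local-DC) replicas longer:
dynamic_snitch_badness_threshold: 0.1
```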
Re: Multiple data center performance
Sorry for the confusion. Sylvain, what do you think could cause the higher client latency in multi-DC (CL=ONE for both read and write)? Clients only connect to nodes in the same DC. We did see performance improve greatly after changing the replication factor for counters, but it is still slower than when the other DC is shut down. Thanks, Daning

On Wed, Jun 12, 2013 at 7:48 AM, Sylvain Lebresne sylv...@datastax.com wrote:

> Is there something special of this kind regarding counters over multi-DC?

No. Counters behave exactly as other writes as far as the consistency level is concerned. Technically, the counter write path is different from the normal write path in the sense that a counter write is written to one replica first and then written to the rest of the replicas afterwards (with a local read on the first replica in between, which is why counter writes are slower than normal ones). But outside of the obvious performance impact, this has no effect on the behavior observed from a client point of view. The consistency level has the exact same meaning in particular (though one small difference is that counters don't support CL.ANY). -- Sylvain

> Thank you anyway Sylvain

2013/6/12 Sylvain Lebresne sylv...@datastax.com

It is the normal behavior, but that's true of any update, not only of counters. The consistency level does *not* influence which replicas are written to. Cassandra always writes to all replicas. The consistency level only decides how replica acknowledgements are waited for. -- Sylvain

On Wed, Jun 12, 2013 at 4:56 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

> counters will replicate to all replicas during a write regardless of the consistency level

Is that the normal behavior or a bug?

2013/6/11 Daning Wang dan...@netseer.com

It was the counter that caused the problem. Counters replicate to all replicas during a write regardless of the consistency level. In our case we don't need to sync the counters across centers, so moving the counters to a new keyspace with all replicas in one center solved the problem. There is an option on the table, replicate_on_write; if you turn that off for counters you might get better performance, but you are at high risk of losing data and creating inconsistency. I did not try this option. Daning

On Sat, Jun 8, 2013 at 6:53 AM, srmore comom...@gmail.com wrote:

I am seeing similar behavior. In my case I have 2 nodes in each datacenter, and one node always has high latency (equal to the latency between the two datacenters). When one of the datacenters is shut down, the latency drops. I am curious whether anyone else has these issues and, if so, how you got around them. Thanks!

On Fri, Jun 7, 2013 at 11:49 PM, Daning Wang dan...@netseer.com wrote:

We have deployed multi-center but got a performance issue. When the nodes in the other center are up, the read response time from clients is 4 or 5 times higher. When we take those nodes down, the response time becomes normal (comparable to the time before we changed to multi-center). We have high volume on the cluster, and the consistency level is ONE for reads, so my understanding is that most of the traffic between data centers should be read repair; that should not create much delay. What could cause the problem, and how can we debug it?
Here is the keyspace:

[default@dsat] describe dsat;
Keyspace: dsat:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [dc2:1, dc1:3]
  Column Families:
    ColumnFamily: categorization_cache

Ring:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  xx.xx.xx..111  59.2 GB   256     37.5%             4d6ed8d6-870d-4963-8844-08268607757e  rac1
DN  xx.xx.xx..121  99.63 GB  256     37.5%             9d0d56ce-baf6-4440-a233-ad6f1d564602  rac1
UN  xx.xx.xx..120  66.32 GB  256     37.5%             0fd912fb-3187-462b-8c8a-7d223751b649  rac1
UN  xx.xx.xx..118  63.61 GB  256     37.5%             3c6e6862-ab14-4a8c-9593-49631645349d  rac1
UN  xx.xx.xx..117  68.16 GB  256     37.5%             ee6cdf23-d5e4-4998-a2db-f6c0ce41035a  rac1
UN  xx.xx.xx..116  32.41 GB  256     37.5%             f783eeef-1c51-4f91-ab7c-a60669816770  rac1
UN  xx.xx.xx..115  64.24 GB  256     37.5%             e75105fb-b330-4f40-aa4f-8e6e11838e37  rac1
UN  xx.xx.xx..112  61.32 GB  256     37.5%             2547ee54-88dd-4994-a1ad-d9ba367ed11f  rac1

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  xx.xx.xx.199   58.39 GB  256     50.0%             6954754a-e9df-4b3c-aca7-146b938515d8  rac1
DN  xx.xx.xx..61   33.79 GB  256     50.0%             91b8d510-966a-4f2d-a666-d7edbe986a1c  rac1

Thank you in advance, Daning
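[The counter write path Sylvain describes above — write to one leader replica first (with a local read in between), then replicate to every remaining replica regardless of consistency level, which only controls how many acknowledgements are awaited — can be modeled with a toy sketch. This is hypothetical illustration code, not Cassandra's implementation:]

```python
import random

def counter_write(replicas, cl_acks):
    """Toy model of the 1.2 counter write path.

    The increment lands on one leader replica first (after a local read
    on that replica), then is forwarded to every remaining replica no
    matter what consistency level was requested; cl_acks only controls
    how many acknowledgements the coordinator waits for.
    Returns (set_of_replicas_written, acks_waited_for)."""
    leader = random.choice(replicas)
    written = {leader}                                   # local read + write
    written.update(r for r in replicas if r != leader)   # then all the rest
    return written, min(cl_acks, len(replicas))

replicas = ["dc1-a", "dc1-b", "dc1-c", "dc2-a"]
written, acked = counter_write(replicas, cl_acks=1)
print(len(written), acked)  # 4 1 -> all replicas written, one ack awaited
```

This is why, with CL=ONE, cross-DC counter traffic still flows on every write: removing the remote DC from the counter keyspace's replication (as described above) is what stops it.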
Re: Multiple data center performance
It was the counter that caused the problem. Counters replicate to all replicas during a write regardless of the consistency level. In our case we don't need to sync the counters across centers, so moving the counters to a new keyspace with all replicas in one center solved the problem. There is an option on the table, replicate_on_write; if you turn that off for counters you might get better performance, but you are at high risk of losing data and creating inconsistency. I did not try this option. Daning

On Sat, Jun 8, 2013 at 6:53 AM, srmore comom...@gmail.com wrote:

I am seeing similar behavior. In my case I have 2 nodes in each datacenter, and one node always has high latency (equal to the latency between the two datacenters). When one of the datacenters is shut down, the latency drops. I am curious whether anyone else has these issues and, if so, how you got around them. Thanks!

On Fri, Jun 7, 2013 at 11:49 PM, Daning Wang dan...@netseer.com wrote:

We have deployed multi-center but got a performance issue. When the nodes in the other center are up, the read response time from clients is 4 or 5 times higher. When we take those nodes down, the response time becomes normal (comparable to the time before we changed to multi-center). We have high volume on the cluster, and the consistency level is ONE for reads, so my understanding is that most of the traffic between data centers should be read repair; that should not create much delay. What could cause the problem, and how can we debug it?
Here is the keyspace:

[default@dsat] describe dsat;
Keyspace: dsat:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [dc2:1, dc1:3]
  Column Families:
    ColumnFamily: categorization_cache

Ring:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  xx.xx.xx..111  59.2 GB   256     37.5%             4d6ed8d6-870d-4963-8844-08268607757e  rac1
DN  xx.xx.xx..121  99.63 GB  256     37.5%             9d0d56ce-baf6-4440-a233-ad6f1d564602  rac1
UN  xx.xx.xx..120  66.32 GB  256     37.5%             0fd912fb-3187-462b-8c8a-7d223751b649  rac1
UN  xx.xx.xx..118  63.61 GB  256     37.5%             3c6e6862-ab14-4a8c-9593-49631645349d  rac1
UN  xx.xx.xx..117  68.16 GB  256     37.5%             ee6cdf23-d5e4-4998-a2db-f6c0ce41035a  rac1
UN  xx.xx.xx..116  32.41 GB  256     37.5%             f783eeef-1c51-4f91-ab7c-a60669816770  rac1
UN  xx.xx.xx..115  64.24 GB  256     37.5%             e75105fb-b330-4f40-aa4f-8e6e11838e37  rac1
UN  xx.xx.xx..112  61.32 GB  256     37.5%             2547ee54-88dd-4994-a1ad-d9ba367ed11f  rac1

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  xx.xx.xx.199   58.39 GB  256     50.0%             6954754a-e9df-4b3c-aca7-146b938515d8  rac1
DN  xx.xx.xx..61   33.79 GB  256     50.0%             91b8d510-966a-4f2d-a666-d7edbe986a1c  rac1

Thank you in advance, Daning
replication factor is zero
We have a multi-center deployment. There are some tables whose data we don't want to sync to the other center. Could we set the replication factor to 0 for the other data center? What is the best way to avoid syncing some data within a cluster? Thanks in advance, Daning
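[With NetworkTopologyStrategy the usual approach is not an explicit RF of 0 but simply omitting the data center from the keyspace's replication options, and putting the tables you don't want synced into that keyspace. A sketch — keyspace and DC names are illustrative:]

```sql
-- Replicated in both data centers:
CREATE KEYSPACE shared WITH replication =
  {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2};

-- Local-only: dc2 is omitted, so no replicas are placed there:
CREATE KEYSPACE dc1_only WITH replication =
  {'class': 'NetworkTopologyStrategy', 'dc1': 3};
```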
How to change existing cluster to multi-center
Hi All, We have an 8-node cluster (replication factor 3), with about 50 GB of data on each node. We need to change the cluster to a multi-center environment (adding EC2); the data needs one replica on EC2. Here is the plan:

- Change the cluster config to multi-center.
- Add 2 or 3 nodes in the other center, which is EC2.
- Change the replication factor to sync the data to the other center.

We have not done the test yet; is this doable? The main concern is that since the connection to EC2 is slow, it will take a long time to stream the data (which should be more than 100 GB) at the beginning. If anybody has done this before, please shed some light. Thanks in advance, Daning
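[The commonly documented sequence for adding a data center is roughly the plan above, with one refinement: start the new nodes with bootstrap disabled, then stream in bulk with nodetool rebuild rather than letting each node bootstrap over the slow link. A hedged sketch; keyspace and DC names are illustrative:]

```shell
# On each new EC2 node, join the ring without streaming:
#   cassandra.yaml: auto_bootstrap: false

# 1. Add the new DC to the keyspace's replication options, e.g. in cqlsh:
#    ALTER KEYSPACE dsat WITH replication =
#      {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'ec2': 1};

# 2. On each new node, stream its replicas from the existing DC in bulk:
nodetool rebuild dc1
```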
Cassandra remote backup solution
Hi Guys, What is the Cassandra solution for remote backup besides multi-center? I hope I can do incremental backups to a remote data center. Thanks, Daning
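[One common pattern (a sketch, not a complete solution — paths and hostnames are illustrative): enable incremental backups so each flushed SSTable is hard-linked into a backups/ directory, ship those off-site, and periodically take a full snapshot as a baseline:]

```shell
# cassandra.yaml: incremental_backups: true

# Ship incremental backup files off-site, e.g. per node:
rsync -av /var/lib/cassandra/data/*/*/backups/ \
      backuphost:/backups/$(hostname)/

# Periodically take a tagged full snapshot as a restore baseline:
nodetool snapshot -t weekly_$(date +%F)
```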
Re: Upgrade to Cassandra 1.2
Thanks Aaron and Manu. Since we are using 1.1, there is no num_tokens parameter. When I upgrade to 1.2, should I set num_tokens=1 to start up, or can I set it to another number? Daning

On Tue, Feb 12, 2013 at 3:45 PM, Manu Zhang owenzhang1...@gmail.com wrote:

> num_tokens is only used at bootstrap

I think it's also used in this case (already bootstrapped with num_tokens = 1 and now num_tokens > 1). Cassandra will split a node's current range into *num_tokens* parts, and there should be no change to the amount of the ring a node holds before shuffling.

On Wed, Feb 13, 2013 at 3:12 AM, aaron morton aa...@thelastpickle.com wrote:

Restore the settings for num_tokens and initial_token to what they were before you upgraded. They should not be changed just because you are upgrading to 1.2; they are used to enable virtual nodes, which are not necessary to run 1.2. Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 13/02/2013, at 8:02 AM, Daning Wang dan...@netseer.com wrote:

No, I did not run shuffle, since the upgrade was not successful. What do you mean by reverting the changes to num_tokens and initial_token? Set num_tokens=1? initial_token should be ignored since it is not a bootstrap, right? Thanks, Daning

On Tue, Feb 12, 2013 at 10:52 AM, aaron morton aa...@thelastpickle.com wrote:

Were you upgrading to 1.2 AND running the shuffle, or just upgrading to 1.2? If you have not run shuffle, I would suggest reverting the changes to num_tokens and initial_token. This is a guess, because num_tokens is only used at bootstrap. Just get upgraded to 1.2 first, then do the shuffle when things are stable. Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 12/02/2013, at 2:55 PM, Daning Wang dan...@netseer.com wrote:

Thanks Aaron. I tried to migrate an existing cluster (ver 1.1.0) to 1.2.1 but failed.
- I followed http://www.datastax.com/docs/1.2/install/upgrading and merged cassandra.yaml, with the following parameters:

num_tokens: 256
#initial_token: 0

initial_token is commented out; the current token should be obtained from the system schema.

- I did a rolling upgrade. During the upgrade I got Broken Pipe errors from the nodes with the old version; is that normal?

- After I upgraded 3 nodes (still 5 to go), I found it is totally wrong: the first node upgraded owns 99.2% of the ring.

[cassy@d5:/usr/local/cassy conf]$ ~/bin/nodetool -h localhost status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
DN  10.210.101.117  45.01 GB  254     99.2%  f4b6afe3-7e2e-4c61-96e8-12a529a31373  rack1
UN  10.210.101.120  45.43 GB  256     0.4%   0fd912fb-3187-462b-8c8a-7d223751b649  rack1
UN  10.210.101.111  27.08 GB  256     0.4%   bd4c37bc-07dd-488b-bfab-e74e32c26f6e  rack1

What was wrong? Please help. I can provide more information if you need. Thanks, Daning

On Mon, Feb 4, 2013 at 9:16 AM, aaron morton aa...@thelastpickle.com wrote:

There is a command line utility in 1.2 to shuffle the tokens… http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

$ ./cassandra-shuffle --help
Missing sub-command argument.
Usage: shuffle [options] <sub-command>

Sub-commands:
  create     Initialize a new shuffle operation
  ls         List pending relocations
  clear      Clear pending relocations
  en[able]   Enable shuffling
  dis[able]  Disable shuffling

Options:
  -dc,  --only-dc        Apply only to named DC (create only)
  -tp,  --thrift-port    Thrift port number (Default: 9160)
  -p,   --port           JMX port number (Default: 7199)
  -tf,  --thrift-framed  Enable framed transport for Thrift (Default: false)
  -en,  --and-enable     Immediately enable shuffling (create only)
  -H,   --help           Print help information
  -h,   --host           JMX hostname or IP address (Default: localhost)
  -th,  --thrift-host    Thrift hostname or IP address (Default: JMX host)

Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 3/02/2013, at 11:32 PM, Manu Zhang owenzhang1...@gmail.com wrote:

On Sun 03 Feb 2013 05:45:56 AM CST, Daning Wang wrote:

I'd like to upgrade from 1.1.6 to 1.2.1. One big feature in 1.2 is that a node can have multiple tokens, but there is only one token per node in 1.1.6. How can I upgrade to 1.2.1 and then break up the token to take advantage of this feature? I went through this doc but it does not say how to change num_tokens: http://www.datastax.com/docs/1.2/install/upgrading Is there another doc about this upgrade path? Thanks, Daning

I think for each node you need to change the num_tokens option in conf/cassandra.yaml (this only splits the current range into num_tokens parts) and run the bin/cassandra-shuffle command (this spreads it all over the ring).
Re: Upgrade to Cassandra 1.2
Thanks! Suppose I can upgrade to 1.2.x with 1 token by commenting out num_tokens; how can I then change to multiple tokens? I could not find a doc clearly stating this.

On Thu, Feb 14, 2013 at 10:54 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

From: http://www.datastax.com/docs/1.2/configuration/node_configuration#num-tokens

About num_tokens: "If left unspecified, Cassandra uses the default value of 1 token (for legacy compatibility) and uses the initial_token. If you already have a cluster with one token per node, and wish to migrate to multiple tokens per node..." So I would leave num_tokens commented out in cassandra.yaml and set initial_token to the same value as in the pre-C*-1.2.x-upgrade configuration. Alain

2013/2/14 Daning Wang dan...@netseer.com

Thanks Aaron and Manu. Since we are using 1.1, there is no num_tokens parameter. When I upgrade to 1.2, should I set num_tokens=1 to start up, or can I set it to another number? Daning

On Tue, Feb 12, 2013 at 3:45 PM, Manu Zhang owenzhang1...@gmail.com wrote:

> num_tokens is only used at bootstrap

I think it's also used in this case (already bootstrapped with num_tokens = 1 and now num_tokens > 1). Cassandra will split a node's current range into *num_tokens* parts, and there should be no change to the amount of the ring a node holds before shuffling.

On Wed, Feb 13, 2013 at 3:12 AM, aaron morton aa...@thelastpickle.com wrote:

Restore the settings for num_tokens and initial_token to what they were before you upgraded. They should not be changed just because you are upgrading to 1.2; they are used to enable virtual nodes, which are not necessary to run 1.2. Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 13/02/2013, at 8:02 AM, Daning Wang dan...@netseer.com wrote:

No, I did not run shuffle, since the upgrade was not successful. What do you mean by reverting the changes to num_tokens and initial_token? Set num_tokens=1?
initial_token should be ignored since it is not a bootstrap, right? Thanks, Daning

On Tue, Feb 12, 2013 at 10:52 AM, aaron morton aa...@thelastpickle.com wrote:

Were you upgrading to 1.2 AND running the shuffle, or just upgrading to 1.2? If you have not run shuffle, I would suggest reverting the changes to num_tokens and initial_token. This is a guess, because num_tokens is only used at bootstrap. Just get upgraded to 1.2 first, then do the shuffle when things are stable. Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 12/02/2013, at 2:55 PM, Daning Wang dan...@netseer.com wrote:

Thanks Aaron. I tried to migrate an existing cluster (ver 1.1.0) to 1.2.1 but failed.

- I followed http://www.datastax.com/docs/1.2/install/upgrading and merged cassandra.yaml, with the following parameters:

num_tokens: 256
#initial_token: 0

initial_token is commented out; the current token should be obtained from the system schema.

- I did a rolling upgrade. During the upgrade I got Broken Pipe errors from the nodes with the old version; is that normal?

- After I upgraded 3 nodes (still 5 to go), I found it is totally wrong: the first node upgraded owns 99.2% of the ring.

[cassy@d5:/usr/local/cassy conf]$ ~/bin/nodetool -h localhost status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
DN  10.210.101.117  45.01 GB  254     99.2%  f4b6afe3-7e2e-4c61-96e8-12a529a31373  rack1
UN  10.210.101.120  45.43 GB  256     0.4%   0fd912fb-3187-462b-8c8a-7d223751b649  rack1
UN  10.210.101.111  27.08 GB  256     0.4%   bd4c37bc-07dd-488b-bfab-e74e32c26f6e  rack1

What was wrong? Please help. I can provide more information if you need. Thanks, Daning

On Mon, Feb 4, 2013 at 9:16 AM, aaron morton aa...@thelastpickle.com wrote:

There is a command line utility in 1.2 to shuffle the tokens… http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

$ ./cassandra-shuffle --help
Missing sub-command argument.
Usage: shuffle [options] <sub-command>

Sub-commands:
  create     Initialize a new shuffle operation
  ls         List pending relocations
  clear      Clear pending relocations
  en[able]   Enable shuffling
  dis[able]  Disable shuffling

Options:
  -dc,  --only-dc        Apply only to named DC (create only)
  -tp,  --thrift-port    Thrift port number (Default: 9160)
  -p,   --port           JMX port number (Default: 7199)
  -tf,  --thrift-framed  Enable framed transport for Thrift (Default: false)
  -en,  --and-enable     Immediately enable shuffling (create only)
  -H,   --help           Print help information
  -h,   --host           JMX hostname or IP address (Default: localhost)
  -th,  --thrift-host    Thrift hostname or IP address (Default: JMX host)
Re: Upgrade to Cassandra 1.2
No, I did not run shuffle, since the upgrade was not successful. What do you mean by reverting the changes to num_tokens and initial_token? Set num_tokens=1? initial_token should be ignored since it is not a bootstrap, right? Thanks, Daning

On Tue, Feb 12, 2013 at 10:52 AM, aaron morton aa...@thelastpickle.com wrote:

Were you upgrading to 1.2 AND running the shuffle, or just upgrading to 1.2? If you have not run shuffle, I would suggest reverting the changes to num_tokens and initial_token. This is a guess, because num_tokens is only used at bootstrap. Just get upgraded to 1.2 first, then do the shuffle when things are stable. Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 12/02/2013, at 2:55 PM, Daning Wang dan...@netseer.com wrote:

Thanks Aaron. I tried to migrate an existing cluster (ver 1.1.0) to 1.2.1 but failed.

- I followed http://www.datastax.com/docs/1.2/install/upgrading and merged cassandra.yaml, with the following parameters:

num_tokens: 256
#initial_token: 0

initial_token is commented out; the current token should be obtained from the system schema.

- I did a rolling upgrade. During the upgrade I got Broken Pipe errors from the nodes with the old version; is that normal?

- After I upgraded 3 nodes (still 5 to go), I found it is totally wrong: the first node upgraded owns 99.2% of the ring.

[cassy@d5:/usr/local/cassy conf]$ ~/bin/nodetool -h localhost status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
DN  10.210.101.117  45.01 GB  254     99.2%  f4b6afe3-7e2e-4c61-96e8-12a529a31373  rack1
UN  10.210.101.120  45.43 GB  256     0.4%   0fd912fb-3187-462b-8c8a-7d223751b649  rack1
UN  10.210.101.111  27.08 GB  256     0.4%   bd4c37bc-07dd-488b-bfab-e74e32c26f6e  rack1

What was wrong? Please help. I can provide more information if you need.
Thanks, Daning

On Mon, Feb 4, 2013 at 9:16 AM, aaron morton aa...@thelastpickle.com wrote:

There is a command line utility in 1.2 to shuffle the tokens… http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

$ ./cassandra-shuffle --help
Missing sub-command argument.

Usage: shuffle [options] <sub-command>

Sub-commands:
  create     Initialize a new shuffle operation
  ls         List pending relocations
  clear      Clear pending relocations
  en[able]   Enable shuffling
  dis[able]  Disable shuffling

Options:
  -dc,  --only-dc        Apply only to named DC (create only)
  -tp,  --thrift-port    Thrift port number (Default: 9160)
  -p,   --port           JMX port number (Default: 7199)
  -tf,  --thrift-framed  Enable framed transport for Thrift (Default: false)
  -en,  --and-enable     Immediately enable shuffling (create only)
  -H,   --help           Print help information
  -h,   --host           JMX hostname or IP address (Default: localhost)
  -th,  --thrift-host    Thrift hostname or IP address (Default: JMX host)

Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 3/02/2013, at 11:32 PM, Manu Zhang owenzhang1...@gmail.com wrote:

On Sun 03 Feb 2013 05:45:56 AM CST, Daning Wang wrote:

I'd like to upgrade from 1.1.6 to 1.2.1. One big feature in 1.2 is that a node can have multiple tokens, but there is only one token per node in 1.1.6. How can I upgrade to 1.2.1 and then break up the token to take advantage of this feature? I went through this doc but it does not say how to change num_tokens: http://www.datastax.com/docs/1.2/install/upgrading Is there another doc about this upgrade path? Thanks, Daning

I think for each node you need to change the num_tokens option in conf/cassandra.yaml (this only splits the current range into num_tokens parts) and run the bin/cassandra-shuffle command (this spreads it all over the ring).
Re: Upgrade to Cassandra 1.2
Thanks Aaron. I tried to migrate an existing cluster (ver 1.1.0) to 1.2.1 but failed.

- I followed http://www.datastax.com/docs/1.2/install/upgrading and merged cassandra.yaml, with the following parameters:

num_tokens: 256
#initial_token: 0

initial_token is commented out; the current token should be obtained from the system schema.

- I did a rolling upgrade. During the upgrade I got Broken Pipe errors from the nodes with the old version; is that normal?

- After I upgraded 3 nodes (still 5 to go), I found it is totally wrong: the first node upgraded owns 99.2% of the ring.

[cassy@d5:/usr/local/cassy conf]$ ~/bin/nodetool -h localhost status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
DN  10.210.101.117  45.01 GB  254     99.2%  f4b6afe3-7e2e-4c61-96e8-12a529a31373  rack1
UN  10.210.101.120  45.43 GB  256     0.4%   0fd912fb-3187-462b-8c8a-7d223751b649  rack1
UN  10.210.101.111  27.08 GB  256     0.4%   bd4c37bc-07dd-488b-bfab-e74e32c26f6e  rack1

What was wrong? Please help. I can provide more information if you need. Thanks, Daning

On Mon, Feb 4, 2013 at 9:16 AM, aaron morton aa...@thelastpickle.com wrote:

There is a command line utility in 1.2 to shuffle the tokens… http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

$ ./cassandra-shuffle --help
Missing sub-command argument.
Usage: shuffle [options] <sub-command>

Sub-commands:
  create     Initialize a new shuffle operation
  ls         List pending relocations
  clear      Clear pending relocations
  en[able]   Enable shuffling
  dis[able]  Disable shuffling

Options:
  -dc,  --only-dc        Apply only to named DC (create only)
  -tp,  --thrift-port    Thrift port number (Default: 9160)
  -p,   --port           JMX port number (Default: 7199)
  -tf,  --thrift-framed  Enable framed transport for Thrift (Default: false)
  -en,  --and-enable     Immediately enable shuffling (create only)
  -H,   --help           Print help information
  -h,   --host           JMX hostname or IP address (Default: localhost)
  -th,  --thrift-host    Thrift hostname or IP address (Default: JMX host)

Cheers, Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 3/02/2013, at 11:32 PM, Manu Zhang owenzhang1...@gmail.com wrote:

On Sun 03 Feb 2013 05:45:56 AM CST, Daning Wang wrote:

I'd like to upgrade from 1.1.6 to 1.2.1. One big feature in 1.2 is that a node can have multiple tokens, but there is only one token per node in 1.1.6. How can I upgrade to 1.2.1 and then break up the token to take advantage of this feature? I went through this doc but it does not say how to change num_tokens: http://www.datastax.com/docs/1.2/install/upgrading Is there another doc about this upgrade path? Thanks, Daning

I think for each node you need to change the num_tokens option in conf/cassandra.yaml (this only splits the current range into num_tokens parts) and run the bin/cassandra-shuffle command (this spreads it all over the ring).
Cassandra jmx stats ReadCount
We have an 8-node cluster on Cassandra 1.1.0, with a replication factor of 3. We found that when we just insert data, not only does WriteCount increase, ReadCount also increases. How could this happen? I was under the impression that ReadCount only counts reads from clients. Thanks, Daning
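[One candidate explanation, offered tentatively: some write paths perform internal reads, e.g. maintaining a secondary index requires reading the row's previous value to remove the stale index entry (and counter writes with replicate_on_write involve a local read as well), and such internal reads can surface in read statistics. A toy illustration of index read-before-write — hypothetical code, not Cassandra's implementation:]

```python
def insert(row_store, index, key, new_value, metrics):
    """Toy write path for an indexed column: updating the index forces a
    read of the row's previous value before the write is applied."""
    old = row_store.get(key)   # internal read-before-write
    metrics["reads"] += 1      # ...which may show up in read stats
    if old is not None:
        index[old].discard(key)               # drop the stale index entry
    index.setdefault(new_value, set()).add(key)
    row_store[key] = new_value
    metrics["writes"] += 1

store, idx, m = {}, {}, {"reads": 0, "writes": 0}
for k, v in [("a", 1), ("b", 2), ("a", 3)]:   # inserts only, no client reads
    insert(store, idx, k, v, m)
print(m)  # {'reads': 3, 'writes': 3} -> inserts alone drove reads
```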
Upgrade to Cassandra 1.2
I'd like to upgrade from 1.1.6 to 1.2.1. One big feature in 1.2 is that a node can have multiple tokens, but there is only one token per node in 1.1.6. How can I upgrade to 1.2.1 and then break up the token to take advantage of this feature? I went through this doc but it does not say how to change num_tokens: http://www.datastax.com/docs/1.2/install/upgrading Is there another doc about this upgrade path? Thanks, Daning
Problem with a node joining the ring
I added a new node to the ring (version 1.1.6); after more than 30 hours it is still in the 'Joining' state.

Address         DC          Rack   Status  State    Load      Effective-Ownership  Token
                                                                                   141784319550391026443072753096570088105
10.28.78.123    datacenter1 rack1  Up      Normal   18.73 GB  50.00%               0
10.4.17.138     datacenter1 rack1  Up      Normal   15 GB     39.29%               24305883351495604533098186245126300818
10.93.95.51     datacenter1 rack1  Up      Normal   17.96 GB  41.67%               42535295865117307932921825928971026432
10.170.1.26     datacenter1 rack1  Up      Joining  6.89 GB   0.00%                56713727820156410577229101238628035242
10.6.115.239    datacenter1 rack1  Up      Normal   20.3 GB   50.00%               85070591730234615865843651857942052864
10.28.20.200    datacenter1 rack1  Up      Normal   22.68 GB  60.71%               127605887595351923798765477786913079296
10.240.113.171  datacenter1 rack1  Up      Normal   18.4 GB   58.33%               141784319550391026443072753096570088105

After a while the CPU usage goes down to 0, so it looks stuck. I have restarted the server several times in the last 30 hours. When the server has just started you can see streaming in 'nodetool netstats', but after a few minutes there is no streaming anymore. I have turned on debug logging; this is what it is doing now (the CPU is pretty much idle), with no error messages. Please help; I can provide more info if needed. Thanks in advance,

DEBUG [MutationStage:17] 2013-01-28 12:47:59,618 RowMutationVerbHandler.java (line 44) Applying RowMutation(keyspace='dsat', key='52f5298affbb8bf0', modifications=[ColumnFamily(dsatcache [_meta:false:278@1359406079725000!3888000,])]) DEBUG [MutationStage:17] 2013-01-28 12:47:59,618 Table.java (line 395) applying mutation of row 52f5298affbb8bf0 DEBUG [MutationStage:17] 2013-01-28 12:47:59,618 RowMutationVerbHandler.java (line 56) RowMutation(keyspace='dsat', key='52f5298affbb8bf0', modifications=[ColumnFamily(dsatcache [_meta:false:278@1359406079725000!3888000,])]) applied.
Sending response to 571645593@/10.28.78.123 DEBUG [MutationStage:26] 2013-01-28 12:47:59,623 RowMutationVerbHandler.java (line 44) Applying RowMutation(keyspace='dsat', key='57f700499922964b', modifications=[ColumnFamily(dsatcache [cache_type:false:8@1359406079730002,path:false:30@1359406079730001 ,top_node:false:22@135940607973,v0:false:976@1359406079730003 !3888000,])]) DEBUG [MutationStage:26] 2013-01-28 12:47:59,623 Table.java (line 395) applying mutation of row 57f700499922964b DEBUG [MutationStage:26] 2013-01-28 12:47:59,623 Table.java (line 429) mutating indexed column top_node value 6d617474626f7574726f732e74756d626c722e636f6d DEBUG [MutationStage:26] 2013-01-28 12:47:59,623 CollationController.java (line 78) collectTimeOrderedData DEBUG [MutationStage:26] 2013-01-28 12:47:59,623 Table.java (line 453) Pre-mutation index row is null DEBUG [MutationStage:26] 2013-01-28 12:47:59,624 KeysIndex.java (line 119) applying index row mattboutros.tumblr.com in ColumnFamily(dsatcache.dsatcache_top_node_idx [57f700499922964b:false:0@135940607973,]) DEBUG [MutationStage:26] 2013-01-28 12:47:59,624 RowMutationVerbHandler.java (line 56) RowMutation(keyspace='dsat', key='57f700499922964b', modifications=[ColumnFamily(dsatcache [cache_type:false:8@1359406079730002,path:false:30@1359406079730001 ,top_node:false:22@135940607973,v0:false:976@1359406079730003!3888000,])]) applied. 
Sending response to 710680715@/10.28.20.200 DEBUG [MutationStage:22] 2013-01-28 12:47:59,624 RowMutationVerbHandler.java (line 44) Applying RowMutation(keyspace='dsat', key='57f700499922964b', modifications=[ColumnFamily(dsatcache [_meta:false:278@1359406079731000!3888000,])]) DEBUG [MutationStage:22] 2013-01-28 12:47:59,624 Table.java (line 395) applying mutation of row 57f700499922964b DEBUG [MutationStage:22] 2013-01-28 12:47:59,624 RowMutationVerbHandler.java (line 56) RowMutation(keyspace='dsat', key='57f700499922964b', modifications=[ColumnFamily(dsatcache [_meta:false:278@1359406079731000!3888000,])]) applied. Sending response to 710680719@/10.28.20.200 DEBUG [MutationStage:25] 2013-01-28 12:47:59,652 RowMutationVerbHandler.java (line 44) Applying RowMutation(keyspace='dsat', key='2a50083d5332071f', modifications=[ColumnFamily(dsatcache [cache_type:false:8@1359406079692002,path:false:26@1359406079692001 ,top_node:false:18@1359406079692000,v0:false:583@1359406079692003 !3888000,])]) DEBUG [MutationStage:25] 2013-01-28 12:47:59,652 Table.java (line 395) applying mutation of row 2a50083d5332071f DEBUG [MutationStage:25] 2013-01-28 12:47:59,652 Table.java (line 429) mutating indexed column top_node value 772e706163696669632d72652e636f6d DEBUG [MutationStage:25] 2013-01-28 12:47:59,652 CollationController.java (line 78) collectTimeOrderedData DEBUG [MutationStage:25] 2013-01-28 12:47:59,652 Table.java (line 453) Pre-mutation index row is null DEBUG [MutationStage:25] 2013-01-28
1.2 Authentication
We were using SimpleAuthenticator on 1.1.x and it worked fine. While testing 1.2, I put the classes under example/simple_authentication in a jar and copied it to the lib directory; the class is loaded. However, when I try to connect with the correct user/password, it gives me this error: ./cqlsh s2.dsat103-e1a -u -p Traceback (most recent call last): File ./cqlsh, line 2262, in module main(*read_options(sys.argv[1:], os.environ)) File ./cqlsh, line 2248, in main display_float_precision=options.float_precision) File ./cqlsh, line 483, in __init__ cql_version=cqlver, transport=transport) File ./../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/connection.py, line 143, in connect File ./../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/connection.py, line 59, in __init__ File ./../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/thrifteries.py, line 157, in establish_connection File ./../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/cassandra/Cassandra.py, line 455, in login File ./../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/cassandra/Cassandra.py, line 476, in recv_login cql.cassandra.ttypes.AuthenticationException: AuthenticationException(why=User doesn't exist - create it with CREATE USER query first) What does "create it with CREATE USER query first" mean? I put debug statements in the SimpleAuthenticator class, and they showed that authentication passed in the authenticate() method. Thanks, Daning
Re: Replication factor
Thanks guys. Aaron, I am confused about this: from the wiki http://wiki.apache.org/cassandra/ReadRepair, it looks like Read Repair is done either before or after responding with data, for any consistency level.

Read Repair does not run at CL ONE

Daning

On Wed, May 23, 2012 at 3:51 AM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote: When RF == number of nodes, and you read at CL ONE you will always be reading locally. "Always be reading locally" only if the Dynamic Snitch is off. With the dynamic snitch on, a request may be redirected to another node, which may introduce latency spikes.

Best regards / Pagarbiai
Viktor Jevdokimov, Senior Developer
Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Wednesday, May 23, 2012 13:00
To: user@cassandra.apache.org
Subject: Re: Replication factor

RF is normally adjusted to modify availability (see http://thelastpickle.com/2011/06/13/Down-For-Me/)

for example, if I have a 4-node cluster in one data center, how would RF=2 vs RF=4 affect read performance? If the consistency level is ONE, it looks like a read does not need to go to another hop to get the data if RF=4, but it would do more work on read repair in the background.
Read Repair does not run at CL ONE. When RF == number of nodes and you read at CL ONE, you will always be reading locally, but with low consistency. If you read with QUORUM when RF == number of nodes, you will still get some performance benefit from the data being read locally.

Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com

On 23/05/2012, at 9:34 AM, Daning Wang wrote: Hello, what are the pros and cons of choosing different replication factors in terms of performance, if space is not a concern? For example, if I have a 4-node cluster in one data center, how would RF=2 vs RF=4 affect read performance? If the consistency level is ONE, it looks like a read does not need to go to another hop to get the data if RF=4, but it would do more work on read repair in the background. Can you share some insights about this? Thanks in advance, Daning
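Aaron's point about RF vs. consistency level reduces to simple replica counting. A toy sketch (plain arithmetic, not Cassandra code) of how many replica responses each consistency level waits for:

```python
# How many replica responses each consistency level needs, given RF.
# Plain arithmetic sketch, not Cassandra code.

def replicas_needed(cl, rf):
    if cl == "ONE":
        return 1
    if cl == "QUORUM":
        return rf // 2 + 1  # strict majority of replicas
    if cl == "ALL":
        return rf
    raise ValueError("unknown consistency level: %s" % cl)

# With a 4-node cluster and RF=4, every node holds every row, so a
# CL ONE read is always satisfied locally; QUORUM still waits on 3 nodes.
print(replicas_needed("ONE", 4))     # → 1
print(replicas_needed("QUORUM", 4))  # → 3
```

This is why RF == node count makes CL ONE reads one-hop: the coordinator is always a replica. QUORUM reads still touch a majority of nodes, but one of the responses can come from the local replica.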
Re: Couldn't find cfId
Thanks Aaron! We will upgrade to 1.0.9. Just curious: you said to remove the HintedHandoff files from data/system; what do the HintedHandoff files look like? Thanks, Daning On Wed, May 16, 2012 at 2:32 AM, aaron morton aa...@thelastpickle.com wrote: Looks like this https://issues.apache.org/jira/browse/CASSANDRA-3975 Fixed in the latest 1.0.9. Either upgrade (which is always a good idea) or purge the hints from the server, either using JMX or by stopping the node and removing the HintedHandoff files from data/system. In either case you should then run a nodetool repair, as hints for other CFs may have been dropped. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 16/05/2012, at 2:27 AM, Daning Wang wrote: We got the exception UnserializableColumnFamilyException: Couldn't find cfId=1075 in the log of one node; describe cluster showed all the nodes on the same schema version. How do we fix this problem? We did a repair, but it does not look like it worked; we haven't tried scrub yet. We are on v1.0.3. ERROR [HintedHandoff:1631] 2012-05-15 07:13:07,877 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:1631,1,main] java.lang.RuntimeException: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1075 at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1075 at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:129) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401) at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409) at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:344) at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:248) at org.apache.cassandra.db.HintedHandOffManager.access$200(HintedHandOffManager.java:84) at org.apache.cassandra.db.HintedHandOffManager$3.runMayThrow(HintedHandOffManager.java:418) Thanks, Daning
Couldn't find cfId
We got the exception UnserializableColumnFamilyException: Couldn't find cfId=1075 in the log of one node; describe cluster showed all the nodes on the same schema version. How do we fix this problem? We did a repair, but it does not look like it worked; we haven't tried scrub yet. We are on v1.0.3. ERROR [HintedHandoff:1631] 2012-05-15 07:13:07,877 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:1631,1,main] java.lang.RuntimeException: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1075 at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1075 at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:129) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409) at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:344) at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:248) at org.apache.cassandra.db.HintedHandOffManager.access$200(HintedHandOffManager.java:84) at org.apache.cassandra.db.HintedHandOffManager$3.runMayThrow(HintedHandOffManager.java:418) Thanks, Daning
Re: Request timeout and host marked down
Thanks Aaron, we will seek help from the Hector team. On Tue, Apr 10, 2012 at 3:41 AM, aaron morton aa...@thelastpickle.com wrote: Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 31 more This looks like a client-side timeout to me. AFAIK it will use this http://rantav.github.com/hector//source/content/API/core/1.0-1/me/prettyprint/cassandra/service/CassandraHost.html#getCassandraThriftSocketTimeout() ; if that is 0, it uses the value of the CASSANDRA_THRIFT_SOCKET_TIMEOUT JVM param, otherwise 0, I think. Hector is one of the many things I am not an expert on; try the Hector user list if you are still having problems.

[cassy@s2.dsat4 ~]$ ~/bin/nodetool -h localhost tpstats
Pool Name   Active  Pending  Completed  Blocked  All time blocked
ReadStage   3       3        414129625  0        0

Looks fine. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 10/04/2012, at 8:08 AM, Daning Wang wrote: Thanks Aaron! Here is the exception; is that the timeout between nodes? Is there any parameter I can change to reduce timeouts? 
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:33) at me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:130) at me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:100) at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103) at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:246) at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97) at me.prettyprint.cassandra.model.CqlQuery.execute(CqlQuery.java:99) at com.netseer.cassandra.cache.dao.CacheReader.getRows(CacheReader.java:267) at com.netseer.cassandra.cache.dao.CacheReader.getCache0(CacheReader.java:55) at com.netseer.cassandra.cache.dao.CacheDao.getCaches(CacheDao.java:85) at com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:71) at com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:149) at com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:55) at com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:28) at com.netseer.dsat.cache.CassandraDSATCacheImpl.get(CassandraDSATCacheImpl.java:62) at com.netseer.dsat.cache.CassandraDSATCacheImpl.getTimedValue(CassandraDSATCacheImpl.java:144) at com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:427) at com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:423) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at 
java.lang.Thread.run(Thread.java:619) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql_query(Cassandra.java:1698) at org.apache.cassandra.thrift.Cassandra$Client.execute_cql_query(Cassandra.java:1682) at me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:106) ... 21 more Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129
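Aaron's description of how Hector picks the Thrift socket timeout is a fallback chain. A sketch of that precedence as he describes it (the function and parameter names here are illustrative, not Hector's API; check the Hector source for the real logic):

```python
# Illustrative sketch of the timeout fallback chain described above:
# use the per-host cassandraThriftSocketTimeout if it is non-zero,
# otherwise fall back to the CASSANDRA_THRIFT_SOCKET_TIMEOUT JVM property,
# otherwise 0 (which means no client-side read timeout, i.e. block forever).

def effective_socket_timeout(host_timeout_ms, jvm_property_ms=None):
    if host_timeout_ms:
        return host_timeout_ms
    if jvm_property_ms:
        return jvm_property_ms
    return 0  # 0 = no read timeout on the socket

print(effective_socket_timeout(2000, 5000))  # → 2000: per-host setting wins
print(effective_socket_timeout(0, 5000))     # → 5000: JVM property is the fallback
print(effective_socket_timeout(0))           # → 0: nothing set, block indefinitely
```

The practical takeaway is that a SocketTimeoutException like the one above is raised by whichever of these values ends up non-zero, so that is the knob to raise (or lower) on the client side.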
Re: Request timeout and host marked down
0 0 0 0 HintedHandoff 0 0 2746 0 0

Message type      Dropped
RANGE_SLICE       0
READ_REPAIR       17931
BINARY            0
READ              5185149
MUTATION          232317
REQUEST_RESPONSE  1317

On Sun, Apr 8, 2012 at 2:15 PM, aaron morton aa...@thelastpickle.com wrote: You need to see if the timeout is from the client to the server, or between the server nodes. If it's server side, a TimedOutException will be thrown from thrift. Take a look at nodetool tpstats on the servers; you will probably see lots of Pending tasks. Basically the cluster is overloaded. Consider: * checking the IO, CPU, and GC state on the servers. * ensuring the data and requests are evenly spread around the cluster. * reducing the number of columns read in a select. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/04/2012, at 5:30 AM, Daning Wang wrote: Hi all, We are using Hector and often we see lots of timeout exceptions in the log. I know that Hector can fail over to another node, but I want to reduce the number of timeouts. Is there any Hector parameter I should change to reduce this error? Also, on the server side, is there any kind of tuning needed for the timeout? Thanks in advance. 
12/04/04 15:13:20 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.240.113.171(10.240.113.171):9160 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.240.113.171(10.240.113.171):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.123.83.114(10.123.83.114):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.123.83.114(10.123.83.114):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.6.115.239(10.6.115.239):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: 
ConcurrentCassandraClientPoolByHost:{10.6.115.239(10.6.115.239):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:49 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.120.205.48(10.120.205.48):9160 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.120.205.48(10.120.205.48):9160}; IsActive?: true; Active: 3; Blocked: 0; Idle: 3; NumBeforeExhausted: 17 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.20.200(10.28.20.200):9160 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.20.200(10.28.20.200):9160}; IsActive?: true; Active: 2; Blocked: 0; Idle: 4; NumBeforeExhausted: 18 12/04/04 15:13:51 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms
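When hunting client-side timeouts like the ones above, the "Message type / Dropped" tail of nodetool tpstats is the quickest overload signal. A hypothetical helper (not a Cassandra tool) that pulls the non-zero dropped counters out of that output:

```python
# Hypothetical helper that reads "MESSAGE_TYPE  count" lines like the
# nodetool tpstats dropped-message tail quoted above and keeps only the
# message types that were actually dropped (illustration only).

def dropped_messages(lines):
    drops = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            drops[parts[0]] = int(parts[1])
    return {k: v for k, v in drops.items() if v > 0}

sample = [
    "RANGE_SLICE 0",
    "READ_REPAIR 17931",
    "READ 5185149",
    "MUTATION 232317",
]
print(dropped_messages(sample))
# → {'READ_REPAIR': 17931, 'READ': 5185149, 'MUTATION': 232317}
```

Non-zero READ and MUTATION drops mean the server shed requests under load, which matches Aaron's diagnosis that the cluster, not just the Hector client settings, is the thing to tune.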
Request timeout and host marked down
Hi all, We are using Hector and often we see lots of timeout exceptions in the log. I know that Hector can fail over to another node, but I want to reduce the number of timeouts. Is there any Hector parameter I should change to reduce this error? Also, on the server side, is there any kind of tuning needed for the timeout? Thanks in advance. 12/04/04 15:13:20 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.240.113.171(10.240.113.171):9160 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.240.113.171(10.240.113.171):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.123.83.114(10.123.83.114):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.123.83.114(10.123.83.114):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; 
NumBeforeExhausted: 19 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.6.115.239(10.6.115.239):9160 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.6.115.239(10.6.115.239):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19 12/04/04 15:13:49 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.120.205.48(10.120.205.48):9160 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.120.205.48(10.120.205.48):9160}; IsActive?: true; Active: 3; Blocked: 0; Idle: 3; NumBeforeExhausted: 17 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.20.200(10.28.20.200):9160 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: ConcurrentCassandraClientPoolByHost:{10.28.20.200(10.28.20.200):9160}; IsActive?: true; Active: 2; Blocked: 0; Idle: 4; NumBeforeExhausted: 18 12/04/04 15:13:51 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 1 ms
Re: Cassandra Exception
We upgraded to 1.0.8, and it looks like the problem is gone. Thanks for your help, Daning On Sun, Mar 25, 2012 at 9:54 AM, aaron morton aa...@thelastpickle.com wrote: Can you go to those nodes and run describe cluster? Also check the logs on the machines that are marked as UNREACHABLE. A node will be marked as UNREACHABLE if it is DOWN or if it did not respond in time. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 23/03/2012, at 11:29 AM, Daning Wang wrote: Thanks Aaron. When I do describe cluster there are always UNREACHABLE nodes, but nodetool ring is fine. It is a pretty busy cluster, reading 3K/sec.

$ cassandra-cli -h localhost -u root -pw cassy
Connected to: Production Cluster on localhost/9160
Welcome to the Cassandra CLI. Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[root@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
UNREACHABLE: [10.218.17.208, 10.123.83.114, 10.120.205.48, 10.240.113.171]
e331e720-4844-11e1--d808570c0dfd: [10.28.78.123, 10.28.20.200, 10.6.115.239]
[root@unknown]

$ nodetool -h localhost ring
Address         DC          Rack   Status  State   Load     Owns    Token
                                                                    141784319550391026443072753096570088105
10.28.78.123    datacenter1 rack1  Up      Normal  5.46 GB  16.67%  0
10.120.205.48   datacenter1 rack1  Up      Normal  5.49 GB  16.67%  28356863910078205288614550619314017621
10.6.115.239    datacenter1 rack1  Up      Normal  5.53 GB  16.67%  56713727820156410577229101238628035242
10.28.20.200    datacenter1 rack1  Up      Normal  5.51 GB  16.67%  85070591730234615865843651857942052863
10.123.83.114   datacenter1 rack1  Up      Normal  5.49 GB  16.67%  113427455640312821154458202477256070484
10.240.113.171  datacenter1 rack1  Up      Normal  5.43 GB  16.67%  141784319550391026443072753096570088105

Daning On Thu, Mar 22, 2012 at 1:47 PM, aaron morton aa...@thelastpickle.com wrote: java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find 
cfId=-387130991 Schema may have diverged between nodes. use cassandra-cli and run describe cluster; to see how many schema versions you have. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/03/2012, at 6:27 AM, Daning Wang wrote: and we are on 0.8.6. On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang dan...@netseer.com wrote: Hi All, We got lots of Exception in the log, and later the server crashed. any idea what is happening and how to fix it? ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:4,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71) at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:125) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:49) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: java.io.EOFException at java.io.DataInputStream.readFully(DataInputStream.java:197) at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:104) at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:82) at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64) ... 
6 more ERROR [RequestResponseStage:2] 2012-03-21 04:16:30,480 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:2,5,main] java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=-387130991 at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71) at org.apache.cassandra.service.AsyncRepairCallback.response(AsyncRepairCallback.java:47) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:49) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=-387130991
How to find CF from cfId
Hi, How do I find the column family for a given cfId? I got a bunch of exceptions and want to find out which CF has the problem. java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1744830464 at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71) at org.apache.cassandra.service.AsyncRepairCallback.response(AsyncRepairCallback.java:47) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:49) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1744830464 at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:123) at org.apache.cassandra.db.RowSerializer.deserialize(Row.java:69) at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:113) at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:82) at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64) Daning
Re: Cassandra Exception
Thanks Aaron. When I do describe cluster there are always UNREACHABLE nodes, but nodetool ring is fine. It is a pretty busy cluster, reading 3K/sec.

$ cassandra-cli -h localhost -u root -pw cassy
Connected to: Production Cluster on localhost/9160
Welcome to the Cassandra CLI. Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[root@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
UNREACHABLE: [10.218.17.208, 10.123.83.114, 10.120.205.48, 10.240.113.171]
e331e720-4844-11e1--d808570c0dfd: [10.28.78.123, 10.28.20.200, 10.6.115.239]
[root@unknown]

$ nodetool -h localhost ring
Address         DC          Rack   Status  State   Load     Owns    Token
                                                                    141784319550391026443072753096570088105
10.28.78.123    datacenter1 rack1  Up      Normal  5.46 GB  16.67%  0
10.120.205.48   datacenter1 rack1  Up      Normal  5.49 GB  16.67%  28356863910078205288614550619314017621
10.6.115.239    datacenter1 rack1  Up      Normal  5.53 GB  16.67%  56713727820156410577229101238628035242
10.28.20.200    datacenter1 rack1  Up      Normal  5.51 GB  16.67%  85070591730234615865843651857942052863
10.123.83.114   datacenter1 rack1  Up      Normal  5.49 GB  16.67%  113427455640312821154458202477256070484
10.240.113.171  datacenter1 rack1  Up      Normal  5.43 GB  16.67%  141784319550391026443072753096570088105

Daning On Thu, Mar 22, 2012 at 1:47 PM, aaron morton aa...@thelastpickle.com wrote: java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=-387130991 Schema may have diverged between nodes. Use cassandra-cli and run describe cluster; to see how many schema versions you have. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/03/2012, at 6:27 AM, Daning Wang wrote: and we are on 0.8.6. On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang dan...@netseer.com wrote: Hi All, We got lots of exceptions in the log, and later the server crashed. Any idea what is happening and how to fix it? 
[quoted stack traces snipped; they repeat the "Cassandra Exception" post below in full]
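The schema-version check Aaron suggests can be automated by parsing the "Schema versions" section of `describe cluster` output. A minimal sketch, assuming the 0.8/1.x CLI output format shown above (the sample text and the short version id below are illustrative):

```python
import re

def parse_schema_versions(describe_output):
    """Map each schema version (or UNREACHABLE) from the 'Schema versions'
    section of `describe cluster` to its list of hosts."""
    versions = {}
    for match in re.finditer(r'(\S+):\s*\[([^\]]*)\]', describe_output):
        versions[match.group(1)] = [h.strip() for h in match.group(2).split(',') if h.strip()]
    return versions

sample = """Schema versions:
    UNREACHABLE: [10.218.17.208, 10.123.83.114]
    e331e720-4844-11e1: [10.28.78.123, 10.28.20.200, 10.6.115.239]"""

versions = parse_schema_versions(sample)
real = [v for v in versions if v != "UNREACHABLE"]
print(len(real) == 1)  # True when the reachable nodes agree on one schema version
```

More than one non-UNREACHABLE version would confirm the divergence Aaron describes.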
Cassandra Exception
Hi All,

We got lots of exceptions in the log, and later the server crashed. Any idea what is happening and how to fix it?

ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:4,5,main]
java.io.IOError: java.io.EOFException
    at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
    at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:125)
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:49)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:197)
    at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:104)
    at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:82)
    at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
    ... 6 more

ERROR [RequestResponseStage:2] 2012-03-21 04:16:30,480 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:2,5,main]
java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=-387130991
    at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
    at org.apache.cassandra.service.AsyncRepairCallback.response(AsyncRepairCallback.java:47)
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:49)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=-387130991
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:123)
    at org.apache.cassandra.db.RowSerializer.deserialize(Row.java:69)
    at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:113)
    at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:82)
    at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)

This is the exception before the server crashed:

ERROR [ReadRepairStage:299] 2012-03-21 05:02:53,808 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReadRepairStage:299,5,main]
java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:388)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:346)
    at org.apache.cassandra.service.RowRepairResolver.maybeScheduleRepairs(RowRepairResolver.java:121)
    at org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:85)
    at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more

Thank you in advance,
Daning
Re: Cassandra Exception
and we are on 0.8.6.

On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang dan...@netseer.com wrote:

[quoted message snipped; it repeats the "Cassandra Exception" post above in full]
hector connection pool
I just got this error in a few clients recently: "All host pools marked down. Retry burden pushed out to client." The clients could not recover, and we had to restart the client applications. We are using Hector 0.8.0.3. At the time we were running a compaction on a CF, which took several hours, and the server was busy. But I think the client should recover after the server load goes down. Is there any bug reported about this? I searched but could not find one.

Thanks,
Daning
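The recovery behavior the post expects, retrying until the hosts come back instead of failing permanently, can be sketched generically as retry with exponential backoff. This is a plain Python sketch, not Hector's API; `call_with_backoff` and `flaky` are illustrative names:

```python
import time

def call_with_backoff(operation, max_attempts=5, base_delay=0.01):
    """Retry `operation` with exponential backoff so a transient
    'all hosts down' condition heals instead of failing permanently."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s, ...

attempts = []
def flaky():
    """Simulated client call: fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("All host pools marked down")
    return "ok"

print(call_with_backoff(flaky))  # ok
```

Whether Hector's own pool actually re-probes downed hosts is version-dependent; the sketch only illustrates the client-side pattern.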
Re: Rebalance cluster
Thank you guys, very much appreciated. How about just pulling the slow machines out of the cluster? I think most of the reads already go to the fast machines because of the dynamic snitch, so removing two machines should not add much load on the remaining nodes. What do you think?

Thanks,
Daning

On Wed, Jan 11, 2012 at 1:34 PM, Antonio Martinez antyp...@gmail.com wrote:

There is another possible approach, which I reference from the original Dynamo paper. Instead of trying to manage a heterogeneous cluster at the Cassandra level, it might be possible to take the approach Amazon took: find the smallest common denominator of resources among your nodes (most likely your smallest node) and virtualize the others down to that level. For example, say you have 3 physical computers: one with one processor and 2 GB of memory, one with 2 processors and 4 GB, and one with 4 processors and 8 GB. You could make the smallest one your basic block, then put two one-processor 2 GB VMs on the second machine and four of them on the third, largest machine. Then instead of managing the three of them separately and worrying about them being different, you manage a ring of 7 equal nodes with equal portions of the ring. This allows you to give the smaller machines a lesser load compared to the more powerful ones. The Amazon paper on Dynamo has more information on how they did it and some of the tricks they used for reliability: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

Hope this helps somewhat.

On Wed, Jan 11, 2012 at 2:00 PM, aaron morton aa...@thelastpickle.com wrote:

I have good news and bad. The good news is I have a nice coffee. The bad news is it's pretty difficult to have some nodes with less load.

In a cluster with 5 nodes and RF 3, each node holds the following token ranges:

node 1: ranges of nodes 1, 5 and 4
node 2: ranges of nodes 2, 1 and 5
node 3: ranges of nodes 3, 2 and 1
node 4: ranges of nodes 4, 3 and 2
node 5: ranges of nodes 5, 4 and 3

The load on each node is its own token range plus those of the preceding RF-1 nodes. In a balanced ring of 5 nodes with RF 3, each node has 20% of the token ring and 60% of the total load. If you split the token ring as below, each node has the total load shown after the slash:

node 1: 12.5% / 50%
node 2: 25%   / 62.5%
node 3: 25%   / 62.5%
node 4: 12.5% / 62.5%
node 5: 25%   / 62.5%

Only node 1 gets a small amount less. Try a different approach:

node 1: 12.5% / 62.5%
node 2: 12.5% / 50%
node 3: 25%   / 50%
node 4: 25%   / 62.5%
node 5: 25%   / 75%

That's even worse. David is right: use nodetool move. It's a good idea to update the initial tokens in the yaml (or your ops config) after the fact, even though they are not used.

Hope that helps.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/01/2012, at 8:41 AM, David McNelis wrote:

Daning, you can see how to do this basic sort of thing on the wiki's operations page (http://wiki.apache.org/cassandra/Operations). In short, you'll want to run:

nodetool -h hostname move newtoken

Then, once you've updated each of the tokens that you want to move, you'll want to run:

nodetool -h hostname cleanup

That will remove the no-longer-necessary tokens from your smaller machines. Please note that someone else may have better insight than I into whether or not your strategy is going to be effective. On the surface I think what you are doing is logical, but I'm unsure of the actual performance gains you'll see.

David

On Wed, Jan 11, 2012 at 1:32 PM, Daning Wang dan...@netseer.com wrote:

Hi All,

We have a 5-node cluster (on 0.8.6), but two machines are slower and have less memory, so performance on those two machines is not good under large-volume traffic. I want to move some data from the slower machines to the faster machines to ease the load; the token ring will not be equally balanced. I am thinking of the following steps:

1. Modify cassandra.yaml to change the initial token.
2. Restart Cassandra (no need to auto-bootstrap, right?).
3. Then run nodetool repair (or nodetool move? not sure which one to use).

Is there any doc that has detailed steps about how to do this?

Thanks in advance,
Daning

--
Antonio Perez de Tejada Martinez
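Aaron's load arithmetic and the evenly spaced RandomPartitioner tokens can be reproduced with a short script. A sketch, assuming SimpleStrategy placement and the RandomPartitioner token space of 0 to 2**127 - 1; the host names in the printed commands are placeholders:

```python
def balanced_tokens(node_count):
    """Evenly spaced initial tokens for RandomPartitioner,
    whose token space is 0 .. 2**127 - 1."""
    return [i * (2 ** 127) // node_count for i in range(node_count)]

def load_fractions(ownership, rf):
    """Each node's share of total data: its own token range plus the ranges
    of the preceding rf-1 nodes (SimpleStrategy, as in Aaron's example)."""
    n = len(ownership)
    return [sum(ownership[(i - j) % n] for j in range(rf)) for i in range(n)]

# Aaron's first split: only node 1 ends up with less total load.
print(load_fractions([12.5, 25, 25, 12.5, 25], rf=3))  # [50.0, 62.5, 62.5, 62.5, 62.5]

# Rebalancing commands for a 5-node ring (placeholder host names):
for node, token in enumerate(balanced_tokens(5), start=1):
    print(f"nodetool -h node{node} move {token}")
```

`balanced_tokens(6)[1]` reproduces the second token in the 6-node ring shown in the thread above, which is a quick sanity check on the formula.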
Rebalance cluster
Hi All,

We have a 5-node cluster (on 0.8.6), but two machines are slower and have less memory, so performance on those two machines is not good under large-volume traffic. I want to move some data from the slower machines to the faster machines to ease the load; the token ring will not be equally balanced. I am thinking of the following steps:

1. Modify cassandra.yaml to change the initial token.
2. Restart Cassandra (no need to auto-bootstrap, right?).
3. Then run nodetool repair (or nodetool move? not sure which one to use).

Is there any doc that has detailed steps about how to do this?

Thanks in advance,
Daning
Pending on ReadStage
Hi all,

We have a 5-node cluster (0.8.6), but the performance of one node is way behind the others. I checked tpstats, and it always shows a non-zero pending ReadStage; I don't see this problem on the other nodes. What caused the problem? I/O? Memory? CPU usage is still low. How can we fix this?

~/bin/nodetool -h localhost tpstats
Pool Name                 Active  Pending  Completed  Blocked  All time blocked
ReadStage                 1115             56960      0        0
RequestResponseStage      0       0        606695     0        0
MutationStage             0       0        538634     0        0
ReadRepairStage           0       0        17         0        0
ReplicateOnWriteStage     0       0        0          0        0
GossipStage               0       0        5734       0        0
AntiEntropyStage          0       0        0          0        0
MigrationStage            0       0        0          0        0
MemtablePostFlusher       0       0        7          0        0
StreamStage               0       0        0          0        0
FlushWriter               0       0        8          0        0
MiscStage                 0       0        0          0        0
FlushSorter               0       0        0          0        0
InternalResponseStage     0       0        0          0        0
HintedHandoff             1       4        0          0        0

Message type      Dropped
RANGE_SLICE       0
READ_REPAIR       0
BINARY            0
READ              9082
MUTATION          0
REQUEST_RESPONSE  0

Thank you in advance.
Daning
Re: Pending on ReadStage
Thanks for your reply. Nodes are equally balanced, and it is RandomPartitioner. I think that machine is slower. Are you saying it is an I/O issue?

Daning

On Fri, Jan 6, 2012 at 10:25 AM, Mohit Anchlia mohitanch...@gmail.com wrote:

Are all your nodes equally balanced in terms of read requests? Are you using RandomPartitioner? Are you reading using indexes? The first thing you can do is compare iostat -x output between the two nodes to rule out any I/O issues, assuming your read requests are equally balanced.

On Fri, Jan 6, 2012 at 10:11 AM, Daning Wang dan...@netseer.com wrote:

[quoted message snipped; it repeats the "Pending on ReadStage" post above, including the tpstats output, in full]
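One way to spot this symptom across nodes is to scrape `nodetool tpstats` output for pools with a non-zero Pending count. A sketch, assuming the 0.8-era column layout; the sample text below is illustrative, not the exact output from the thread:

```python
def pending_stages(tpstats_text):
    """Return (pool, pending) pairs for pools with a non-zero Pending count
    from `nodetool tpstats` output (0.8-era column layout assumed)."""
    result = []
    for line in tpstats_text.splitlines():
        parts = line.split()
        # Pool rows look like: Name Active Pending Completed [Blocked ...]
        if len(parts) >= 4 and parts[1].isdigit() and parts[2].isdigit():
            if int(parts[2]) > 0:
                result.append((parts[0], int(parts[2])))
    return result

sample = """Pool Name        Active  Pending  Completed
ReadStage        11      15       56960
MutationStage    0       0        538634
HintedHandoff    1       4        0"""

print(pending_stages(sample))  # [('ReadStage', 15), ('HintedHandoff', 4)]
```

Running this against each node's tpstats makes the lagging node easy to pick out before digging into iostat.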
Cassandra memory usage
I have a Cassandra server with the JVM settings -Xms4G -Xmx4G, but why does top report 15 GB RES and 11 GB SHR memory usage? I understand that -Xmx4G is only the heap size, but it is strange that the OS reports 2.5 times that much memory usage. Is a lot of memory used by JNI? Please help explain this.

cassy 2549 39.7 66.1 163805536 16324648 ? Sl Jan02 338:48 /usr/local/cassy/java/current/bin/java -ea -javaagent:./../lib/jamm-0.2.2.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms4G -Xmx4G -Xmn1G -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=10 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dmx4jport=8085 -Djava.rmi.server.hostname=10.210.101.106 -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dpasswd.properties=./../conf/passwd.properties -cp ./../conf:./../build/classes/main:./../build/classes/thrift:./../lib/antlr-3.2.jar:./../lib/apache-cassandra-0.8.6.jar:./../lib/apache-cassandra-thrift-0.8.6.jar:./../lib/avro-1.4.0-fixes.jar:./../lib/avro-1.4.0-sources-fixes.jar:./../lib/commons-cli-1.1.jar:./../lib/commons-codec-1.2.jar:./../lib/commons-collections-3.2.1.jar:./../lib/commons-lang-2.4.jar:./../lib/concurrentlinkedhashmap-lru-1.1.jar:./../lib/guava-r08.jar:./../lib/high-scale-lib-1.1.2.jar:./../lib/jackson-core-asl-1.4.0.jar:./../lib/jackson-mapper-asl-1.4.0.jar:./../lib/jamm-0.2.2.jar:./../lib/jline-0.9.94.jar:./../lib/jna.jar:./../lib/json-simple-1.1.jar:./../lib/libthrift-0.6.jar:./../lib/log4j-1.2.16.jar:./../lib/mx4j-tools.jar:./../lib/servlet-api-2.5-20081211.jar:./../lib/slf4j-api-1.6.1.jar:./../lib/slf4j-log4j12-1.6.1.jar:./../lib/snakeyaml-1.6.jar org.apache.cassandra.thrift.CassandraDaemon

Top:
PID   USER   PR  NI  VIRT  RES  SHR  S  %CPU  %MEM  TIME+      COMMAND
2549  cassy  21  0   156g  15g  11g  S  66.9  65.5  338:02.72  java

Thank you in advance,
Daning
TimedOutException()
Hi All,

We are getting TimedOutException() when inserting data into Cassandra. It was working fine for a few months, but we suddenly got this problem. I have increased rpc_timeout_in_ms to 3, but it still timed out in 30 secs. I turned on debug and saw many of these errors in the log:

DEBUG [pool-2-thread-420] 2012-01-03 15:25:43,689 CustomTThreadPoolServer.java (line 197) Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 4 bytes, but only got 0 bytes. (This is often indicative of an internal error on the server side. Please check your server logs.)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

We are on 0.8.6. Any idea how to fix this? Your help is much appreciated.

Daning
Re: Weird problem with empty CF
Lots of SliceQueryFilter entries in the log. Is that tombstone handling?

DEBUG [ReadStage:49] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582939743663:true:4@1317582939933000
DEBUG [ReadStage:50] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317573253148778:true:4@1317573253354000
DEBUG [ReadStage:43] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317669552951428:true:4@1317669553018000
DEBUG [ReadStage:33] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581886709261:true:4@1317581886957000
DEBUG [ReadStage:52] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317568165152246:true:4@1317568165482000
DEBUG [ReadStage:36] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317567265089211:true:4@1317567265405000
DEBUG [ReadStage:53] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317674324843122:true:4@1317674324946000
DEBUG [ReadStage:38] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317571990078721:true:4@1317571990141000
DEBUG [ReadStage:57] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317671855234221:true:4@1317671855239000
DEBUG [ReadStage:54] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317558305262954:true:4@1317558305337000
DEBUG [RequestResponseStage:11] 2011-10-03 20:15:07,941 ResponseVerbHandler.java (line 48) Processing response on a callback from 12347@/10.210.101.104
DEBUG [RequestResponseStage:9] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed data response
DEBUG [RequestResponseStage:13] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed digest response
DEBUG [ReadStage:58] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581337972739:true:4@1317581338044000
DEBUG [ReadStage:64] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582656796332:true:4@131758265697
DEBUG [ReadStage:55] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317569432886284:true:4@1317569432984000
DEBUG [ReadStage:45] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317572658687019:true:4@1317572658718000
DEBUG [ReadStage:47] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582281617755:true:4@1317582281717000
DEBUG [ReadStage:48] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317549607869226:true:4@1317549608118000
DEBUG [ReadStage:34] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1:

On Thu, Sep 29, 2011 at 2:17 PM, aaron morton aa...@thelastpickle.com wrote:

As with any situation involving the un-dead, it really is the number of zombies, mummies, or vampires that is the concern. If you delete data there will always be tombstones. If you have a delete-heavy workload there will be more tombstones. This is why implementing a queue with Cassandra is a bad idea. gc_grace_seconds (and column TTL) are the *minimum* amount of time the tombstones will stay in the data files; there is no maximum. Your read performance also depends on the number of SSTables the row is spread over, see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/

If you really wanted to purge them then yes, a repair and then a major compaction would be the way to go. Also consider whether it's possible to design the data model around the problem, e.g. partitioning rows by date. IMHO I would look to make data model changes before implementing a compaction policy, or consider whether Cassandra is the right store if you have a delete-heavy workload.

Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 30/09/2011, at 3:27 AM, Daning Wang wrote:

Jonathan/Aaron,

Thank you guys for the replies. I will change GCGracePeriod to 1 day to see what happens. Is there a way to purge tombstones at any time? Because if tombstones affect performance, we want them purged right away, not after GCGracePeriod. We know all the nodes are up, and we can run repair first to ensure consistency before purging.

Thanks,
Daning

On Wed, Sep 28, 2011 at 5:22 PM, aaron morton aa...@thelastpickle.com wrote:

If I had to guess I would say it was spending time handling tombstones. If you see it happen again, and are interested, turn the logging up to DEBUG and look for messages from something starting with "Slice". Minor (automatic) compaction will, over time, purge the tombstones. Until then reads must read and discard the data deleted by the tombstones. If you perform a big (i.e. 100k's) delete this can reduce performance until compaction does its thing. My second guess would be read repair (or the simple consistency checks on read
Queue suggestion in Cassandra
We are trying to implement an ordered queue system in Cassandra (ver 0.8.5). In the initial design we use a row as the queue and a column for each item in the queue: we create a new column when inserting an item and delete the column when the top item is popped. Since columns are sorted in Cassandra, we get an ordered queue. It works fine until the queue size reaches 50K; then we get high CPU usage and constant GC, which makes the whole Cassandra server very slow and unresponsive, and we have to do a full compaction to fix the problem. Due to this performance issue the queue is not useful for us, and we are looking for other designs. I want to know if anybody has implemented a large ordered queue successfully. Let me know if you have a suggestion.

Thank you in advance.
Daning
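The slowdown at 50K items is consistent with tombstone buildup in a single ever-growing row. One common mitigation, offered here as an assumption-labeled sketch rather than anything from this thread, is to shard the queue across time-bucketed rows so no single row accumulates an unbounded tombstone trail; `queue_row_key` and the one-hour bucket size are hypothetical choices:

```python
import time

BUCKET_SECONDS = 3600  # one row per hour; a tunable assumption

def queue_row_key(queue_name, ts=None):
    """Row key for a time-bucketed queue: items land in the row for their
    hour, so readers scan only recent rows instead of one ever-growing,
    tombstone-laden row, and fully consumed old rows age out of reads."""
    ts = time.time() if ts is None else ts
    bucket = int(ts) // BUCKET_SECONDS
    return f"{queue_name}:{bucket}"

# Items 30 minutes apart share a bucket; items 2 hours apart do not.
k1 = queue_row_key("jobs", ts=1_000_800)
k2 = queue_row_key("jobs", ts=1_002_600)
k3 = queue_row_key("jobs", ts=1_008_000)
print(k1 == k2, k1 == k3)  # True False
```

A consumer walks buckets in order (oldest first, columns sorted within each row), which preserves the overall ordering while keeping per-row column and tombstone counts bounded.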
ByteOrderedPartitioner
How is the performance of ByteOrderedPartitioner compared to RandomPartitioner? For getting data by a single key, does it use the same algorithm? I have read that the downside of ByteOrderedPartitioner is that it creates hotspots. But if I have 4 nodes and set RF to 4, that will replicate data to all 4 nodes, which could avoid hotspots, right?

Thank you in advance,
Daning
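For single-key reads both partitioners do the same kind of token lookup; they differ in how the token is derived from the key. RandomPartitioner hashes the key with MD5, so adjacent keys scatter across the ring, while ByteOrderedPartitioner orders by the raw key bytes, which is what preserves range scans and invites hotspots. The sketch below only approximates RandomPartitioner's token derivation (Cassandra's implementation takes the absolute value of the MD5 value as a BigInteger; the modulo here is a simplification):

```python
import hashlib

def random_partitioner_token(key: bytes) -> int:
    """Approximate RandomPartitioner token: MD5 of the key, reduced to
    the 0 .. 2**127 - 1 token range (simplified vs Cassandra's abs())."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest, "big") % (2 ** 127)

# Adjacent keys land far apart on the ring under RandomPartitioner;
# under ByteOrderedPartitioner they would be neighbors.
t1 = random_partitioner_token(b"user:1000")
t2 = random_partitioner_token(b"user:1001")
print(t1 != t2)  # adjacent keys get unrelated tokens
```

With 4 nodes and RF 4 every node does hold all the data, but coordinator and cache hotspots for a popular key range can still occur, so replication alone does not fully neutralize the ordering downside.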