Problem on node join the ring

2013-01-28 Thread Daning Wang
I added a new node to the ring (version 1.1.6); after more than 30 hours it is still in the 'Joining' state. Address DC Rack Status State Load Effective-Ownership Token 141784319550391026443072753096570088105 10.28.78.123 datacenter1 rack1 Up

1.2 Authentication

2013-01-28 Thread Daning Wang
We were using SimpleAuthenticator on 1.1.x and it worked fine. While testing 1.2, I put the classes under example/simple_authentication in a jar and copied it to the lib directory; the class is loaded. However, when I try to connect with the correct user/password, it gives me an error: ./cqlsh s2.dsat103-e1a -u

Upgrade to Cassandra 1.2

2013-02-02 Thread Daning Wang
I'd like to upgrade from 1.1.6 to 1.2.1. One big feature in 1.2 is that a node can have multiple tokens, but there is only one token per node in 1.1.6. How can I upgrade to 1.2.1 and then split the token to take advantage of this feature? I went through this doc but it does not say how to change

Cassandra jmx stats ReadCount

2013-02-07 Thread Daning Wang
We have an 8-node cluster on Cassandra 1.1.0 with a replication factor of 3. We found that when you only insert data, not only does WriteCount increase, ReadCount increases as well. How could this happen? I was under the impression that ReadCount only counts reads from clients. Thanks, Daning

Re: Upgrade to Cassandra 1.2

2013-02-11 Thread Daning Wang
://www.thelastpickle.com On 3/02/2013, at 11:32 PM, Manu Zhang owenzhang1...@gmail.com wrote: On Sun 03 Feb 2013 05:45:56 AM CST, Daning Wang wrote: I'd like to upgrade from 1.1.6 to 1.2.1, one big feature in 1.2 is that it can have multiple tokens in one node. but there is only one token in 1.1.6. how

Re: Upgrade to Cassandra 1.2

2013-02-12 Thread Daning Wang
, then do the shuffle when things are stable. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 12/02/2013, at 2:55 PM, Daning Wang dan...@netseer.com wrote: Thanks Aaron. I tried to migrate existing cluster(ver

Re: Upgrade to Cassandra 1.2

2013-02-14 Thread Daning Wang
- Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 13/02/2013, at 8:02 AM, Daning Wang dan...@netseer.com wrote: No, I did not run shuffle since the upgrade was not successful. what do you mean reverting the changes to num_tokens

Re: Upgrade to Cassandra 1.2

2013-02-14 Thread Daning Wang
#num_tokens commented in the cassandra.yaml and would set the initial_token to the same value as in the pre-C*1.2.x-upgrade configuration. Alain 2013/2/14 Daning Wang dan...@netseer.com Thanks Aaron and Manu. Since we are using 1.1, there is no num_tokens parameter. when I upgrade to 1.2

Queue suggestion in Cassandra

2011-09-16 Thread Daning Wang
We are trying to implement an ordered queue system in Cassandra (ver 0.8.5). In the initial design we use a row as the queue, with a column for each item in the queue. That means creating a new column when inserting an item and deleting the column when the top item is popped. Since columns are sorted in Cassandra we got the ordered

ByteOrderedPartitioner

2011-09-16 Thread Daning Wang
How does the performance of ByteOrderedPartitioner compare to RandomPartitioner? When getting data by a single key, does it use the same algorithm? I have read that the downside of ByteOrderedPartitioner is that it creates hotspots. But if I have 4 nodes and I set RF to 4, that will replicate

Re: Weird problem with empty CF

2011-10-03 Thread Daning Wang
@aaronmorton http://www.thelastpickle.com On 30/09/2011, at 3:27 AM, Daning Wang wrote: Jonathan/Aaron, Thank you for your replies. I will change GCGracePeriod to 1 day to see what happens. Is there a way to purge tombstones at any time? Because if tombstones affect performance, we want them

Cassandra memory usage

2012-01-03 Thread Daning Wang
I have a Cassandra server with the JVM settings -Xms4G -Xmx4G, so why does top report 15G RES memory and 11G SHR memory usage? I understand that -Xmx4G only covers the heap size, but it is strange that the OS reports 2.5 times as much memory in use. Is a lot of memory used by JNI? Please help to explain

TimedOutException()

2012-01-03 Thread Daning Wang
Hi All, We are getting TimedOutException() when inserting data into Cassandra. It was working fine for a few months, but suddenly we got this problem. I have increased rpc_timeout_in_ms to 3, but it still timed out in 30 secs. I turned on debug and saw many of these errors in the log: DEBUG

Pending on ReadStage

2012-01-06 Thread Daning Wang
Hi all, We have a 5-node cluster (0.8.6), but the performance of one node is way behind the others. I checked tpstats, and it always shows non-zero pending ReadStage; I don't see this problem on other nodes. What causes the problem? I/O? Memory? CPU usage is still low. How can we fix this problem?

Re: Pending on ReadStage

2012-01-06 Thread Daning Wang
? Are you using RandomPartitioner? Are you reading using indexes? First thing you can do is compare iostat -x output between the 2 nodes to rule out any io issues assuming your read requests are equally balanced. On Fri, Jan 6, 2012 at 10:11 AM, Daning Wang dan...@netseer.com wrote: Hi all

Rebalance cluster

2012-01-11 Thread Daning Wang
Hi All, We have a 5-node cluster (on 0.8.6), but two machines are slower and have less memory, so performance was not good on those two machines under large-volume traffic. I want to move some data from the slower machines to the faster machines to ease some load; the token ring will not be equally
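The idea in this post (giving slower machines a smaller share of the ring) can be sketched as follows. This is a rough illustration only, assuming RandomPartitioner with a token space of 0 to 2**127 and assuming each node owns the range from the previous token (exclusive) up to its own token (inclusive). The function names and the weights are hypothetical and do not come from the thread, and whether an uneven ring actually improves performance is not established here.

```python
# Sketch: compute RandomPartitioner tokens for an even or a weighted ring.
# Assumption: token space is [0, 2**127) and a node with token t owns the
# range (previous token, t]. Names and weights below are illustrative only.

RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Evenly spaced tokens: node i gets i * 2**127 / node_count."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def weighted_tokens(weights):
    """Tokens such that node i owns a share of the ring proportional to
    weights[i]. The last node ends up with token 0 (the ring wraps)."""
    total = sum(weights)
    tokens = []
    cumulative = 0
    for weight in weights:
        cumulative += weight
        tokens.append((cumulative * RING_SIZE // total) % RING_SIZE)
    return tokens


if __name__ == "__main__":
    # Three faster machines (weight 2) and two slower ones (weight 1).
    for node, token in enumerate(weighted_tokens([2, 2, 2, 1, 1])):
        print(node, token)
```

Each computed token would still have to be applied to a node one at a time (for example with nodetool move), and data streams between nodes while that happens.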

Re: Rebalance cluster

2012-01-12 Thread Daning Wang
. Please note that someone else may have some better insights than I into whether or not your strategy is going to be effective. On the surface I think what you are doing is logical, but I'm unsure of the actual performance gains you'll see. David On Wed, Jan 11, 2012 at 1:32 PM, Daning Wang dan

hector connection pool

2012-03-05 Thread Daning Wang
I just got this error: "All host pools marked down. Retry burden pushed out to client." in a few clients recently. The clients could not recover, and we had to restart the client application. We are using Hector 0.8.0.3. At that time we were running a compaction for a CF; it took several hours and the server was busy. But

Cassandra Exception

2012-03-21 Thread Daning Wang
Hi All, We got lots of exceptions in the log, and later the server crashed. Any idea what is happening and how to fix it? ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:4,5,main]

Re: Cassandra Exception

2012-03-21 Thread Daning Wang
and we are on 0.8.6. On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang dan...@netseer.com wrote: Hi All, We got lots of exceptions in the log, and later the server crashed. Any idea what is happening and how to fix it? ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482

How to find CF from cfId

2012-03-22 Thread Daning Wang
Hi, How can I find a column family from a cfId? I got a bunch of exceptions and want to find out which CF has the problem. java.io.IOError: org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1744830464 at

Re: Cassandra Exception

2012-03-22 Thread Daning Wang
. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/03/2012, at 6:27 AM, Daning Wang wrote: and we are on 0.8.6. On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang dan...@netseer.com wrote: Hi All, We got lots of exceptions

Re: Cassandra Exception

2012-03-28 Thread Daning Wang
will be marked as UNREACHABLE if it is DOWN or if it did not respond in time. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 23/03/2012, at 11:29 AM, Daning Wang wrote: Thanks Aaron. when I do describe cluster, always

Request timeout and host marked down

2012-04-05 Thread Daning Wang
Hi all, We are using Hector and often we see lots of timeout exceptions in the log. I know that Hector can fail over to another node, but I want to reduce the number of timeouts. Is there any Hector parameter I should change to reduce this error? Also, on the server side, does any kind of tuning need to be done

Re: Request timeout and host marked down

2012-04-09 Thread Daning Wang
://www.thelastpickle.com On 6/04/2012, at 5:30 AM, Daning Wang wrote: Hi all, We are using Hector and often we see lots of timeout exceptions in the log. I know that Hector can fail over to another node, but I want to reduce the number of timeouts. Is there any Hector parameter I should change to reduce

Re: Request timeout and host marked down

2012-04-10 Thread Daning Wang
Developer @aaronmorton http://www.thelastpickle.com On 10/04/2012, at 8:08 AM, Daning Wang wrote: Thanks Aaron! Here is the exception, is that the timeout between nodes? any parameter I can change to reduce timeout? me.prettyprint.hector.api.exceptions.HectorTransportException

Couldn't find cfId

2012-05-15 Thread Daning Wang
We got the exception UnserializableColumnFamilyException: Couldn't find cfId=1075 in the log of one node; describe cluster showed all the nodes on the same schema version. How can we fix this problem? We ran repair but it does not seem to work; we haven't tried scrub yet. We are on v1.0.3. ERROR [HintedHandoff:1631]

Re: Couldn't find cfId

2012-05-16 Thread Daning Wang
for other CF's may have been dropped. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 16/05/2012, at 2:27 AM, Daning Wang wrote: We got exception UnserializableColumnFamilyException: Couldn't find cfId=1075 in the log of one node

Re: Replication factor

2012-05-23 Thread Daning Wang
- Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 23/05/2012, at 9:34 AM, Daning Wang wrote: Hello, What are the pros and cons of choosing different replication factors in term

How to change existing cluster to multi-center

2013-04-25 Thread Daning Wang
Hi All, We have an 8-node cluster (replication factor is 3) with about 50G of data on each node. We need to change the cluster to a multi-center environment (to EC2). The data needs to have one replica on EC2. Here is the plan: - Change cluster config to multi-center. - Add 2 or 3 nodes in another center,

Cassandra remote backup solution

2013-04-25 Thread Daning Wang
Hi Guys, What is the Cassandra solution for remote backup besides multi-center? I hope I can do incremental backups to a remote database center. Thanks, Daning

replication factor is zero

2013-06-06 Thread Daning Wang
We have a multi-center deployment. There is data in some tables that we don't want to sync to the other center. Could we set the replication factor to 0 on the other data center? What is the best way to avoid syncing some data in a cluster? Thanks in advance, Daning

Re: Multiple data center performance

2013-06-11 Thread Daning Wang
, Jun 7, 2013 at 11:49 PM, Daning Wang dan...@netseer.com wrote: We have deployed multi-center but have a performance issue. When the nodes in the other center are up, the read response time from clients is 4 or 5 times higher. When we take those nodes down, the response time becomes normal(compare

Re: Multiple data center performance

2013-06-12 Thread Daning Wang
how replica acknowledgement are waited for. -- Sylvain On Wed, Jun 12, 2013 at 4:56 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: counter will replicate to all replicas during write regardless the consistency level Is that the normal behavior or a bug ? 2013/6/11 Daning Wang dan

Dynamic Snitch and EC2MultiRegionSnitch

2013-07-01 Thread Daning Wang
How does dynamic snitch work with EC2MultiRegionSnitch? Can dynamic routing be limited to one data center? We don't want requests routed to another center, even if nodes are idle on the other side, since the network could be slow. Thanks in advance, Daning

Key cache size

2013-09-04 Thread Daning Wang
We noticed that the key cache could not be fully populated. We have set the key cache size to 1024M: key_cache_size_in_mb: 1024 But none of the nodes showed a cache capacity of 1G. We recently upgraded to 1.2.5; could this be an issue in that version? Token: (invoke with -T/--tokens to

ReadCount change rate is different across nodes

2013-10-29 Thread Daning Wang
We are running 1.2.5 on 8 nodes (256 tokens). All the nodes run on the same type of machine, and the db size is about the same. But recently we checked the ReadCount stats through JMX and found that some nodes had 3 times the change rate of others (we calculated the changes per minute). We are using
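The "changes per minute" figure mentioned in this post can be derived from two readings of a cumulative counter such as ReadCount. The following is a minimal sketch of that arithmetic, assuming each reading is a (unix_seconds, counter_value) pair that was collected by some other means. The function names and sample values are hypothetical, and how the poster actually sampled JMX is not shown in the thread.

```python
# Sketch: turn two samples of a cumulative counter into a per-minute rate,
# then express each node's rate relative to the slowest node.
# Assumption: a sample is (unix_seconds, counter_value); values are made up.


def rate_per_minute(first_sample, second_sample):
    """Change per minute between two (timestamp, counter) samples."""
    (t0, c0), (t1, c1) = first_sample, second_sample
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    if c1 < c0:
        # A cumulative counter normally only drops when the node restarts.
        raise ValueError("counter decreased between samples")
    return (c1 - c0) * 60.0 / elapsed


def relative_rates(samples_by_node):
    """Map node -> rate divided by the lowest rate seen on any node."""
    rates = {node: rate_per_minute(first, second)
             for node, (first, second) in samples_by_node.items()}
    slowest = min(rates.values())
    if slowest == 0:
        raise ValueError("at least one node saw no change")
    return {node: rate / slowest for node, rate in rates.items()}


if __name__ == "__main__":
    samples = {
        "node-a": ((0, 10000), (60, 10300)),
        "node-b": ((0, 50000), (60, 50900)),
    }
    print(relative_rates(samples))
```

Comparing ratios rather than raw counts avoids being misled by nodes that have simply been up longer.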

Re: ReadCount change rate is different across nodes

2013-10-30 Thread Daning Wang
Thanks. Actually, I forgot to mention that it is a multi-center environment and we have dynamic snitch disabled, because we saw some performance impact in the multi-center environment. On Wed, Oct 30, 2013 at 11:12 AM, Piavlo lolitus...@gmail.com wrote: On 10/30/2013 02:06 AM, Daning Wang wrote

Move token to another node on 1.2.x

2013-11-07 Thread Daning Wang
How do I move a token to another node on 1.2.x? I have tried the move command: [cassy@dsat103.e1a ~]$ nodetool move 168755834953206242653616795390304335559 Exception in thread "main" java.io.IOException: target token 168755834953206242653616795390304335559 is already owned by another node. at

Bulk writes and key cache

2014-02-03 Thread Daning Wang
Does Cassandra put keys in the key cache during the write path? If I have two tables, the key cache for the first table is warmed up nicely, and I want to insert millions of rows into the second table while there are no reads on the second table yet, will that affect the cache hit ratio for the first table?

unable to find sufficient sources for streaming range

2014-07-02 Thread Daning Wang
We are running Cassandra 1.2.5. We have an 8-node cluster; we removed one machine from the cluster and are trying to add it back (since we are using vnodes and some nodes have more tokens, we hope that by rejoining, this machine could take some load from the busy machines). But we got the following exception