Re: Rebooted cassandra node timing out all requests but recovers after a while

2015-01-08 Thread Anand Somani
I will keep an eye out for that if it happens again. Times at this point are synchronized. On Wed, Jan 7, 2015 at 10:37 PM, Duncan Sands duncan.sa...@gmail.com wrote: Hi Anand, On 08/01/15 02:02, Anand Somani wrote: Hi, We have a 3 node cluster (on VM). Eg. host1, host2, host3. One of the VM

Rebooted cassandra node timing out all requests but recovers after a while

2015-01-07 Thread Anand Somani
Hi, We have a 3 node cluster (on VM). Eg. host1, host2, host3. One of the VM rebooted (host1) and when host1 came up it would see the others as down and the others (host2 and host3) see it as down. So we restarted host2 and now the ring seems fine (everybody sees everybody as up). But now the

Multi-dc cassandra keyspace

2014-05-16 Thread Anand Somani
Hi, It seems like it should be possible to have a keyspace replicated only to a subset of DC's on a given cluster spanning across multiple DCs? Is there anything bad about this approach? Scenario Cluster spanning 4 DC's = CA, TX, NY, UT Has multiple keyspaces such that * keyspace_CA_TX -
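
A minimal sketch of the per-keyspace replication described above, assuming NetworkTopologyStrategy and CQL. The helper name `create_keyspace_cql` and the replication factor of 3 are illustrative assumptions, not values from the thread; the datacenter names must match whatever the cluster's snitch reports.

```python
def create_keyspace_cql(name, dc_replication):
    """Build a CREATE KEYSPACE statement that replicates only to the listed DCs.

    Datacenters left out of dc_replication hold no replicas of the keyspace.
    """
    if not dc_replication:
        raise ValueError("at least one datacenter is required")
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(dc_replication.items())
    )
    return (
        f"CREATE KEYSPACE {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# keyspace_CA_TX is the name used in the message; RF 3 per DC is assumed.
stmt = create_keyspace_cql("keyspace_CA_TX", {"CA": 3, "TX": 3})
print(stmt)
```

The NY and UT datacenters are simply omitted from the replication map, which is how a keyspace is restricted to a subset of the cluster's datacenters.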

Cassandra Client authentication and system table replication question

2014-04-29 Thread Anand Somani
Hi We have enabled cassandra client authentication and have set new user/pass per keyspace. As I understand user/pass is stored in the system table, do we need to change the replication factor of the system table so this data is replicated? The cluster is going to be multi-dc. Thanks Anand

Re: Cassandra Client authentication and system table replication question

2014-04-29 Thread Anand Somani
Correction: credentials are stored in the system_auth keyspace, so is it ok/recommended to change the replication factor of that keyspace? On Tue, Apr 29, 2014 at 10:41 PM, Anand Somani meatfor...@gmail.com wrote: Hi We have enabled cassandra client authentication and have set new user/pass per

Re: Drop in node replacements.

2014-04-05 Thread Anand Somani
Have you tried nodetool rebuild for that node? I have seen that work when repair failed. On Wed, Apr 2, 2014 at 11:44 AM, Redmumba redmu...@gmail.com wrote: Cassandra 1.2.15, using commodity hardware. On Tue, Apr 1, 2014 at 6:37 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Apr 1,

Best way to track backups/delays for cross DC replication

2013-09-04 Thread Anand Somani
Hi, Scenario is a cluster spanning across datacenters and we use Local_quorum and want to know when things are not getting replicated across data centers. What is the best way to track/alert on that? I was planning on using the HintedHandOffManager (JMX) =

Re: Linear scalability problems

2013-04-04 Thread Anand Somani
or processes? On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani meatfor...@gmail.com wrote: Hi, I am running some tests trying to scale out our application from using a 3 node cluster to 6 node cluster. The thing I observed is that when using a 3 node cluster I was able to handle about 41 req

Re: Linear scalability problems

2013-04-04 Thread Anand Somani
RF=3. On Thu, Apr 4, 2013 at 7:08 AM, Cem Cayiroglu cayiro...@gmail.com wrote: What was the RF before adding nodes? Sent from my iPhone On 04 Apr 2013, at 15:12, Anand Somani meatfor...@gmail.com wrote: We are using a single process with multiple threads, will look at client side delays

Linear scalability problems

2013-04-03 Thread Anand Somani
Hi, I am running some tests trying to scale out our application from using a 3 node cluster to 6 node cluster. The thing I observed is that when using a 3 node cluster I was able to handle about 41 req/second, so I added 3 more nodes thinking it should come close to double, but instead it only goes up to

Re: upgrade from 0.8.5 to 1.1.6, now it cannot find schema

2012-12-16 Thread Anand Somani
in /var/lib/cassandra/data/KS_NAME/CF_NAME/SSTable.data Is the data in the right place ? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 13/12/2012, at 6:54 AM, Anand Somani meatfor...@gmail.com wrote: Hi, We

Monitoring question for a multi DC (active/standby) configuration

2011-10-27 Thread Anand Somani
Hi, Have a requirement to do a multi dc low latency application. This will be in an active/standby setup. So I am planning on using LOCAL_QUORUM for writes. Now if there is a hard requirement of maximum loss of data (on a dc destruction) to some minutes, - In cassandra what is the recommended

Re: cassandra crashed while repairing, leave node size X3

2011-09-18 Thread Anand Somani
In my tests I have seen repair sometimes take a lot of space (2-3 times), cleanup did not clean it, the only way I could clean that was using major compaction. On Sun, Sep 18, 2011 at 6:51 PM, Yan Chunlu springri...@gmail.com wrote: while doing repair on node3, the Load keep increasing,

Re: Configuring multi DC cluster

2011-09-15 Thread Anand Somani
into consideration https://issues.apache.org/jira/browse/CASSANDRA-3047 Good luck. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 14/09/2011, at 8:41 AM, Anand Somani wrote: Hi, Just trying to setup a cluster of 4

Re: [BETA RELEASE] Apache Cassandra 1.0.0-beta1 released

2011-09-15 Thread Anand Somani
So I should be able to do rolling upgrade from 0.7 to 1.0 (not there in the release notes, but I assume that is work in progress). Thanks On Thu, Sep 15, 2011 at 1:36 PM, amulya rattan talk2amu...@gmail.com wrote: Isn't this levelDB implementation for Google's LevelDB?

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Anand Somani
On Tue, Sep 13, 2011 at 3:57 PM, Peter Schuller peter.schul...@infidyne.com wrote: I think it is a serious problem since I can not repair. I am using cassandra on production servers. is there some way to fix it without upgrade? I heard of that 0.8.x is still not quite ready in

StorageProxy Mbean not exposed in 0.7.8 anymore

2011-09-13 Thread Anand Somani
Hi, Upgraded from 0.7.4 to 0.7.8, noticed that StorageProxy (under cassandra.db) is no longer exposed, is that intentional? So the question is: are these covered somewhere else? Thanks Anand

Re: StorageProxy Mbean not exposed in 0.7.8 anymore

2011-09-13 Thread Anand Somani
. On Tue, Sep 13, 2011 at 11:53 AM, Anand Somani meatfor...@gmail.com wrote: Hi, Upgraded from 0.7.4 to 0.7.8, noticed that StorageProxy (under cassandra.db) is no longer exposed, is that intentional? So the question is: are these covered somewhere else? Thanks Anand

Configuring multi DC cluster

2011-09-13 Thread Anand Somani
Hi, Just trying to setup a cluster of 4 nodes for multiDC scenario - with 2 nodes in each DC. This is all on the same box just for testing the configuration aspect. I have configured things as - PropertyFile - 127.0.0.4=SC:rack1 127.0.0.5=SC:rack2 127.0.0.6=AT:rack1
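
As a rough illustration of the `address=DC:rack` property-file format quoted above, the sketch below parses the three entries visible in the message and groups the nodes by datacenter. The function name `parse_topology` is hypothetical, and the entries beyond those shown in the snippet are deliberately not guessed.

```python
def parse_topology(text):
    """Parse 'address=DC:rack' lines into {address: (dc, rack)}."""
    topology = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host, _, location = line.partition("=")
        dc, _, rack = location.partition(":")
        topology[host.strip()] = (dc.strip(), rack.strip())
    return topology


# Only the entries visible in the message are used here.
sample = """
127.0.0.4=SC:rack1
127.0.0.5=SC:rack2
127.0.0.6=AT:rack1
"""

topology = parse_topology(sample)
by_dc = {}
for host, (dc, _rack) in topology.items():
    by_dc.setdefault(dc, []).append(host)
print(by_dc)
```

Grouping by datacenter like this is a quick way to check that each DC really has the number of nodes the replication settings expect.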

Re: Question on using consistency level with NetworkTopologyStrategy

2011-09-09 Thread Anand Somani
. On Thu, Sep 8, 2011 at 3:14 PM, Anand Somani meatfor...@gmail.com wrote: Hi, Have a requirement, where data is spread across multiple DC for disaster recovery. So I would use the NTS, that is clear, but I have some questions with this scenario I have 2 Data Centers RF - 2 (active DC

Question on using consistency level with NetworkTopologyStrategy

2011-09-08 Thread Anand Somani
Hi, Have a requirement, where data is spread across multiple DC for disaster recovery. So I would use the NTS, that is clear, but I have some questions with this scenario - I have 2 Data Centers - RF - 2 (active DC) , 2 (passive DC) - with NTS - Consistency level options are -
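
The quorum arithmetic behind this scenario can be sketched as follows, using the usual rule that a quorum is a majority of the replicas involved (floor(RF / 2) + 1). The dictionary keys `active` and `passive` just mirror the labels in the message.

```python
def quorum(replicas):
    """Number of replicas that must respond: a strict majority."""
    return replicas // 2 + 1


# RF per datacenter as given in the message: 2 (active DC), 2 (passive DC).
replication = {"active": 2, "passive": 2}

# LOCAL_QUORUM counts only replicas in the coordinator's datacenter.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}

# QUORUM counts replicas across all datacenters.
cluster_quorum = quorum(sum(replication.values()))

print(local_quorum)
print(cluster_quorum)
```

With RF 2 in a datacenter, LOCAL_QUORUM needs both local replicas, so a single node outage in that DC blocks LOCAL_QUORUM requests for the affected ranges; this is one reason an RF of 3 per DC is often preferred.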

Anybody out there using 0.8 in production

2011-09-08 Thread Anand Somani
Hi Currently we are using 0.7.4 and I was wondering if I should upgrade to 0.7.8/9 or move to 0.8? Is anybody using 0.8 in production and what is their experience? Thanks

What are the things to watch out for with big nodes

2011-08-28 Thread Anand Somani
Hi, If I have a cluster with 15-20T nodes, some things that I know will be a potential problem are - Compactions taking longer - Higher read latencies - Long time for adding/removing nodes What are other things that can be problematic with big nodes? Regards Anand

Re: Commit log fills up in less than a minute

2011-08-26 Thread Anand Somani
http://www.thelastpickle.com On 25/08/2011, at 3:22 AM, Anand Somani wrote: So I have looked at the cluster from - Cassandra-client - describe cluster = shows correctly - 3 nodes - used the StorageService - JMX bean =UnreachableNodes - shows 0 If all these show the correct ring state

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
in log files not been deleted https://issues.apache.org/jira/browse/CASSANDRA-2829 Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 22/08/2011, at 4:56 AM, Anand Somani wrote: We have a lot of space on /data, and looks

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
about phantom nodes. On Wed, Aug 24, 2011 at 8:01 AM, Anand Somani meatfor...@gmail.com wrote: So, I restarted the cluster (not rolling), but it is still maintaining hints for the IP's that are no longer part of the ring. nodetool ring shows things correctly (as only 3 nodes). When I check thru

Commit log fills up in less than a minute

2011-08-21 Thread Anand Somani
Hi, 0.7.4, 3 node cluster, RF=3 Load has not changed much, on 2 of the 3 nodes the commit log filled up in less than a minute (did not give a chance to recover). Now have been running this cluster for about 2-3 months without any problem. At this point I do not see any unusual load (continue to

Re: Commit log fills up in less than a minute

2011-08-21 Thread Anand Somani
exercise I went thru on Saturday. Can this somehow get worse with cleanup, hinted handoff, etc.? When does the actual commit-data file get deleted? The flush interval on all my memtables is 60 minutes Thanks On Sun, Aug 21, 2011 at 8:43 AM, Anand Somani meatfor...@gmail.com wrote: Hi, 7.4

Re: Commit log fills up in less than a minute

2011-08-21 Thread Anand Somani
We have a lot of space on /data, and looks like it was flushing data fine from file timestamps. We did have a bit of goofup with IP's when bringing up a down node (and the commit files have been around since then). Wonder if that is what triggered it and we have a bunch of hinted handoff's being

Re: 0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-08-20 Thread Anand Somani
0.7.4/ 3 node cluster/ RF -3 /Quorum read/write After I re-introduced a corrupted node, followed the process as (thanks to folks on the mailing list for helping me) listed on the operations wiki to handle failures. Still doing a cleanup on one node at this point. But I noticed that I am seeing

Re: Re: Urgent:!! Re: Need to maintenance on a cassandra node, are there problems with this process

2011-08-20 Thread Anand Somani
% 113427455640312821154458202477256070485 What are my choices here, how do I clean up the ring? The other 2 nodes show the ring fine (not even aware of 189) Thanks Anand On Fri, Aug 19, 2011 at 11:53 AM, Anand Somani meatfor...@gmail.com wrote: ok I will go with the IP change strategy and keep you posted. Not going

Urgent:!! Re: Need to maintenance on a cassandra node, are there problems with this process

2011-08-19 Thread Anand Somani
Cassandra Developer @aaronmorton http://www.thelastpickle.com On 19/08/2011, at 11:57 AM, Anand Somani wrote: Hi, version - 0.7.4 cluster size = 3 RF = 3. data size on a node ~500G I want to do some disk maintenance on a cassandra node, so the process that I came up

Re: Urgent:!! Re: Need to maintenance on a cassandra node, are there problems with this process

2011-08-19 Thread Anand Somani
Let me be specific on lost data - lost a replica , the other 2 nodes have replicas I am running read/write at quorum. At this point I have turned off my clients from talking to this node. So if that is the case I can potentially just nodetool repair (without changing IP). But would it be better

Re: Re: Urgent:!! Re: Need to maintenance on a cassandra node, are there problems with this process

2011-08-19 Thread Anand Somani
ok I will go with the IP change strategy and keep you posted. Not going to manually copy any data, just bring up the node and let it bootstrap. Thanks On Fri, Aug 19, 2011 at 11:46 AM, Peter Schuller peter.schul...@infidyne.com wrote: (Yes, this should definitely be easier. Maybe the most

Need to maintenance on a cassandra node, are there problems with this process

2011-08-18 Thread Anand Somani
Hi, version - 0.7.4 cluster size = 3 RF = 3. data size on a node ~500G I want to do some disk maintenance on a cassandra node, so the process that I came up with is - drain this node - back up the system data space - rebuild the disk partition - copy data from another node - copy

Fatal exception in thread Thread[RequestResponseStage......

2011-08-18 Thread Anand Somani
Hi I am using 0.7.4 and am seeing this exception in my logs a few times a day, should I be worried? Or is this just an intermittent network disconnect ERROR [RequestResponseStage:257] 2011-08-19 03:05:30,706 AbstractCassandraDaemon.java (line 112) Fatal exception in thread

Problems Iterating over tokens in 0.7.5

2011-07-05 Thread Anand Somani
Hi, Using thrift and get_range_slices call with tokenrange. Using RandomPartitioner. Have only tried this on 0.7.5. Used to work in 0.6.4 or earlier versions for me, but I notice that it does not work for me anymore. The need is to iterate over a token range to do some bookkeeping. The logic is

Re: how to use indexed column for this case

2011-05-20 Thread Anand Somani
From what I know you cannot create secondary indexes on SCF. You should have gotten this = https://issues.apache.org/jira/browse/CASSANDRA-1813 on index creation. On Fri, May 20, 2011 at 6:56 AM, Monkey me monkey1024.pub...@gmail.com wrote: Hi, I have a SCF, Key is string, super column is

Re: Best way to detect/fix bitrot today?

2011-02-08 Thread Anand Somani
I should have clarified we have 3 copies, so in that case as long as 2 match we should be ok? Even if there were checksumming at the SStable level, I assume it has to check and report these errors on compaction (or node repair)? I have seen some JIRA open on these issues ( 47 and 1717), but if I

Best way to detect/fix bitrot today?

2011-02-07 Thread Anand Somani
Hi, Our application space is such that there is data that might not be read for a long time. The data is mostly immutable. How should I approach detecting/solving the bitrot problem? One approach is read data and let read repair do the detection, but given the size of data, that does not look

Re: Using Cassandra for storing large objects

2011-01-27 Thread Anand Somani
Using it for storing large immutable objects, like Aaron was suggesting we are splitting the blob across multiple columns. Also we are reading it a few columns at a time (for memory considerations). Currently we have only gone up to about 300-400KB size objects. We do have machines with 32Gb
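
The chunking approach mentioned above can be sketched roughly like this. The column naming scheme (`chunk_000000`, ...), the 64 KB chunk size, and the batch size of 4 are assumptions chosen for illustration; a plain dict stands in for the row, so no real client API is shown.

```python
def split_blob(blob, chunk_size):
    """Split a blob into ordered {column_name: chunk} entries.

    Column names are zero-padded so they sort in chunk order.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        f"chunk_{i // chunk_size:06d}": blob[i:i + chunk_size]
        for i in range(0, len(blob), chunk_size)
    }


def read_blob(columns, batch=4):
    """Yield chunks a few columns at a time to bound memory use."""
    names = sorted(columns)
    for start in range(0, len(names), batch):
        # In a real client, each batch would be one slice query on the row.
        for name in names[start:start + batch]:
            yield columns[name]


blob = bytes(range(256)) * 1400            # 358,400 bytes, roughly 350 KB
columns = split_blob(blob, 64 * 1024)      # dict stands in for one row
restored = b"".join(read_blob(columns))
print(len(columns), restored == blob)
```

Reading in small batches keeps only a few chunks in memory at once, which matches the memory consideration raised in the message.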

Re: Using Cassandra for storing large objects

2011-01-27 Thread Anand Somani
: - What is the size of nodes (in terms of data)? - How long have you been running? - How's compaction treating you? Thanks, Naren On Thu, Jan 27, 2011 at 12:13 PM, Anand Somani meatfor...@gmail.com wrote: Using it for storing large immutable objects, like Aaron was suggesting we

Re: Embedded Cassandra server startup question

2011-01-21 Thread Anand Somani
It is a little slow not to the point where it concerns me (only have few tests for now), but keeps things very clean so no surprise effects. On Thu, Jan 20, 2011 at 6:33 PM, Roshan Dawrani roshandawr...@gmail.com wrote: On Fri, Jan 21, 2011 at 5:14 AM, Anand Somani meatfor...@gmail.com wrote

Re: Embedded Cassandra server startup question

2011-01-20 Thread Anand Somani
Here is what worked for me, I use testNg, and initialize and createschema in the @BeforeClass for each test - In the @AfterClass, I had to drop schema, otherwise I was getting the same exception. - After this I started getting port conflict with the second test, so I added my own

Re: Running multiple instances on a single server --micrandra ??

2010-12-08 Thread Anand Somani
Interesting idea. If it is like dividing the entire load on the system by 6, so if the effective load is still the same and used SSD's for commit volume we could get away with 1 commitlog SSD. Even if these 6 instances can handle 80% of the load (compared to 1 on this machine), that might be

Getting Exception when doing a range query using token (worked in 6.5 not in 6.6)

2010-11-15 Thread Anand Somani
Hi Problem: Call - client.get_range_slices(). Using tokens (not keys), fails with TimedoutException which I think is misleading (Read on) Server : Works with 6.5 server, but not with 6.6 or 6.8 Client: have tried both 6.5 and 6.6 I am getting a TimedoutException when I do a

Re: Getting Exception when doing a range query using token (worked in 6.5 not in 6.6)

2010-11-15 Thread Anand Somani
; token-based queries have to be on non-wrapping ranges (left token < right token), or a wrapping range of (mintoken, mintoken). This was changed as part of the range scan fixes post-0.6.5. On Mon, Nov 15, 2010 at 6:32 PM, Anand Somani meatfor...@gmail.com wrote: Hi Problem: Call
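
The rule stated in this reply can be expressed as a small client-side check. This is only a sketch of the rule as worded here: the function name is hypothetical, tokens are modeled as integers, and the value passed as `min_token` is an assumption that depends on the partitioner in use.

```python
def is_valid_token_range(left, right, min_token):
    """Return True if (left, right) is acceptable for a token-based range query.

    Accepts a non-wrapping range (left < right) or the single
    full-ring wrapping range (min_token, min_token).
    """
    if left < right:
        return True
    return left == min_token and right == min_token


MIN_TOKEN = 0  # assumed placeholder; the real minimum depends on the partitioner

print(is_valid_token_range(10, 20, MIN_TOKEN))                # non-wrapping
print(is_valid_token_range(20, 10, MIN_TOKEN))                # wraps the ring
print(is_valid_token_range(MIN_TOKEN, MIN_TOKEN, MIN_TOKEN))  # full ring
```

Validating ranges before issuing the query gives a clear client-side error instead of the misleading TimedOutException described in the original report.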

Range queries using token instead of key

2010-11-10 Thread Anand Somani
Hi, I am trying to iterate over the entire dataset to calculate some information. Now the way I am trying to do this is by going directly to the node that has a data range, so here is the route I am following - get TokenRange using - describe_ring - then for each tokenRange pick a node and
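
One way to picture the route described above is to turn the ring description into a per-node work list. In this sketch the `TokenRange` dataclass is a stand-in that mirrors the `start_token`, `end_token`, and `endpoints` fields of the Thrift structure returned by describe_ring; the `plan_scan` helper, the sample tokens, and the addresses are made up for illustration, and always choosing the first endpoint is an assumption rather than a recommendation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TokenRange:
    """Stand-in for the Thrift TokenRange returned by describe_ring."""
    start_token: str
    end_token: str
    endpoints: list


def plan_scan(ring):
    """Map each node to the token ranges it will be asked to scan."""
    plan = defaultdict(list)
    for token_range in ring:
        if not token_range.endpoints:
            raise ValueError("token range has no endpoints")
        node = token_range.endpoints[0]  # assumed choice: first listed replica
        plan[node].append((token_range.start_token, token_range.end_token))
    return dict(plan)


# Hypothetical three-node ring; tokens and addresses are illustrative only.
ring = [
    TokenRange("0", "100", ["10.0.0.1", "10.0.0.2"]),
    TokenRange("100", "200", ["10.0.0.2", "10.0.0.3"]),
    TokenRange("200", "0", ["10.0.0.3", "10.0.0.1"]),
]

plan = plan_scan(ring)
for node, ranges in plan.items():
    print(node, ranges)
```

Each (start, end) pair would then be passed to a range query against the chosen node, paging through the results until the range is exhausted.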