Re: Using Cassandra as a BLOB store / web cache.

2016-01-20 Thread Mohit Anchlia
The answer to this question depends very much on the throughput, desired latency and access patterns (R/W or R/O). In general, what I have seen working for high-throughput environments is to either use a distributed file system like Ceph/Gluster or an object store like S3 and keep the pointer in

Re: Cannot query secondary index

2014-06-13 Thread Mohit Anchlia
Some other ways to track old records are: 1) Use external queues: one queue per week or month, for instance, and pile up data on the queue cluster. 2) Create one more table in C* to track the keys per week or month that you can scan to read the keys of the audit table. Make sure you delete the

Re: Cassandra blob storage

2014-03-18 Thread Mohit Anchlia
For large-volume big data scenarios we don't recommend using Cassandra as blob storage, simply because of the intensive IO involved during compaction, repair, etc. Cassandra is only well suited for metadata-type storage. However, if you are fairly low volume then it's a different story, but if

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
+1 I like the Hector client, which uses the Thrift interface and exposes APIs that are similar to how Cassandra physically stores the values. On Thu, Feb 20, 2014 at 9:26 AM, Peter Lin wool...@gmail.com wrote: I disagree with the sentiment that thrift is not worth the trouble. CQL and all SQL inspired

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
On Thu, Feb 20, 2014 at 4:37 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Recommendations in Cassandra have a shelf life of about 1 to 2 years. If you try to assert a recommendation from a year ago you stand a solid chance of someone telling you there is now a better way. Cassandra once loved

Re: Commit log on USB flash disk?

2013-11-16 Thread Mohit Anchlia
In our testing USB tends to be slower. Something more integrated internally would give you better performance. Sent from my iPhone On Nov 16, 2013, at 8:30 AM, Dan Simpson dan.simp...@gmail.com wrote: It doesn't seem like a great idea. The USB drives typically use dynamic wear

Re: Cass 1.1.11 out of memory during compaction ?

2013-11-03 Thread Mohit Anchlia
Post your gc logs Sent from my iPhone On Nov 3, 2013, at 6:54 AM, Oleg Dulin oleg.du...@gmail.com wrote: Cass 1.1.11 ran out of memory on me with this exception (see below). My parameters are 8gig heap, new gen is 1200M. ERROR [ReadStage:55887] 2013-11-02 23:35:18,419

Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread Mohit Anchlia
is 1.0.11, we are migrating to 1.2.X though. We had tuned bloom filters (0.1) and AFAIK making it lower than this won't matter. Thanks ! On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Which Cassandra version are you on? Essentially heap size is a function of the number

Re: Cassandra Heap Size for data more than 1 TB

2013-10-01 Thread Mohit Anchlia
Which Cassandra version are you on? Essentially heap size is a function of the number of keys/metadata. In Cassandra 1.2 a lot of the metadata, like bloom filters, was moved off heap. On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote: Does anyone know what would roughly be the heap size

Re: 答复: Frequent Full GC that take 30s

2013-09-23 Thread Mohit Anchlia
Your ParNew size is way too small. Generally 4GB ParNew (-Xmn) works out best for 16GB heap On Mon, Sep 23, 2013 at 9:05 PM, 谢良 xieli...@xiaomi.com wrote: it looks to me that MaxTenuringThreshold is too small, do you have any chance to try with a bigger one, like 4 or 8 or sth else?

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-20 Thread Mohit Anchlia
of nodetool ring here. On Thu, Sep 19, 2013 at 8:35 PM, Mohit Anchlia mohitanch...@gmail.com wrote: The other thing I noticed is that you are using multiple racks, and that might be a contributing factor. However, I am not sure. Can you paste the output of nodetool cfstats and ring? Is it possible

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-20 Thread Mohit Anchlia
Did you start out your cluster after wiping all the sstables and commit logs? On Fri, Sep 20, 2013 at 3:42 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: We have been trying to resolve this issue to find a stable configuration that can give us a balanced cluster with equally

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you check cfstats to see number of keys per node? On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
. Thanks, Suruchi On Thu, Sep 19, 2013 at 3:59 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Can you check cfstats to see number of keys per node? On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Thanks for your replies. I wiped out my data

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
at 5:18 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Can you run nodetool repair on all the nodes first and look at the keys? On Thu, Sep 19, 2013 at 1:22 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Yes, the key distribution does vary across the nodes. For example

Re: row cache

2013-09-07 Thread Mohit Anchlia
I agree. We've had a similar experience. Sent from my iPhone On Sep 7, 2013, at 6:05 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have found row cache to be more trouble than benefit. The term fool's gold comes to mind. Using key cache and leaving more free main memory seems stable and

Re: Cassandra 1.2.4 - Unflushed data lost on restart

2013-09-06 Thread Mohit Anchlia
Are you not using RF = 3 ? On Fri, Sep 6, 2013 at 10:14 AM, Thapar, Vishal (HP Networking) vtha...@hp.com wrote: My usage requirements are such that there should be least possible data loss even in case of a poweroff. When you say clean shutdown do you mean Cassandra service stop? I ran

Re: Temporarily slow nodes on Cassandra

2013-09-02 Thread Mohit Anchlia
In general with LOCAL_QUORUM you should not see such an issue when one node is slow. However, it could be because clients are still sending requests to that node. Depending on what client library you are using, you could try to take that node out of your connection pool. Not knowing exact issue

Re: Upgrade from 1.0.9 to 1.2.8

2013-08-30 Thread Mohit Anchlia
If you have multiple DCs you at least want to upgrade to 1.0.11. There is an issue where you might get errors during cross DC replication. On Fri, Aug 30, 2013 at 9:41 AM, Mike Neir m...@liquidweb.com wrote: In my testing, mixing 1.0.9 and 1.2.8 seems to work fine as long as there is no need

Re: Having 2 nodes with 100% Ownership ?

2013-08-12 Thread Mohit Anchlia
You need to get it to 50% on each to equally distribute the hash range. You need to 1) calculate the new tokens and 2) move the nodes to those tokens, or use vnodes. For the first option see: http://www.datastax.com/docs/0.8/install/cluster_init On Mon, Aug 12, 2013 at 12:06 PM, Morgan Segalis
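
A minimal sketch of the token calculation mentioned in the reply above, assuming the cluster uses RandomPartitioner (token space 0 to 2**127) with one token per node. The function name `balanced_tokens` is made up for illustration and is not part of any Cassandra tool.

```python
# Assumption: RandomPartitioner, whose token space runs from 0 to 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens, one per node (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the full precision of the 127-bit token space.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # For the two-node case in this thread, each node should own 50% of the ring.
    for index, token in enumerate(balanced_tokens(2)):
        print(f"node {index}: {token}")
```

Each node would then be moved to its computed token with `nodetool move <token>`, one node at a time.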

Re: Cassandra nodetool repair question

2013-08-08 Thread Mohit Anchlia
But the node might be streaming data as well; in that case the only option is to restart the node that started the streaming operation. Sent from my iPhone On Aug 8, 2013, at 5:56 PM, Andrey Ilinykh ailin...@gmail.com wrote: nodetool repair just triggers repair procedure. You can kill nodetool after start,

Re: cassandra GC cpu usage

2013-07-16 Thread Mohit Anchlia
What's your replication factor? Can you check tpstats and netstats to see if you are getting more mutations on these nodes? Sent from my iPhone On Jul 16, 2013, at 3:18 PM, Jure Koren jure.ko...@zemanta.com wrote: Hi C* user list, I have a curious recurring problem with Cassandra 1.2

Re: Logging Cassandra Reads/Writes

2013-07-09 Thread Mohit Anchlia
There is a new tracing feature in Cassandra 1.2 that might help you with this. On Tue, Jul 9, 2013 at 1:31 PM, Blair Zajac bl...@orcaware.com wrote: No idea on the logging, I'm pretty new to Cassandra. Regards, Blair On Jul 9, 2013, at 12:50 PM, hajjat haj...@purdue.edu wrote: Blair,

Re: Reduce Cassandra GC

2013-06-20 Thread Mohit Anchlia
of 4GB). 2013/6/19 Mohit Anchlia mohitanch...@gmail.com How much data do you have per node? How much RAM per node? How much CPU per node? What is the avg CPU and memory usage? On Wed, Jun 19, 2013 at 12:16 AM, Joel Samuelsson samuelsson.j...@gmail.com wrote: My Cassandra ps info

Re: Reduce Cassandra GC

2013-06-19 Thread Mohit Anchlia
How much data do you have per node? How much RAM per node? How much CPU per node? What is the avg CPU and memory usage? On Wed, Jun 19, 2013 at 12:16 AM, Joel Samuelsson samuelsson.j...@gmail.com wrote: My Cassandra ps info: root 26791 1 0 07:14 ?00:00:00 /usr/bin/jsvc

Re: Reduce Cassandra GC

2013-06-18 Thread Mohit Anchlia
Is your young generation size set to 4GB? Can you paste the output of ps -ef|grep cassandra ? On Tue, Jun 18, 2013 at 8:48 AM, Joel Samuelsson samuelsson.j...@gmail.com wrote: Yes, like I said, the only relevant output from that file was: 2013-06-17T08:11:22.300+: 2551.288: [GC

Re: Reduce Cassandra GC

2013-06-15 Thread Mohit Anchlia
Can you paste your GC config? Also, can you take a heap dump at 2 different points so that we can compare them? A quick thing to do would be to do a histo live at 2 points and compare. Sent from my iPhone On Jun 15, 2013, at 6:57 AM, Takenori Sato ts...@cloudian.com wrote: INFO [ScheduledTasks:1]

Re: very confused by jmap dump of cassandra

2013-02-21 Thread Mohit Anchlia
Roughly how much data do you have per node? Sent from my iPhone On Feb 20, 2013, at 10:49 AM, Hiller, Dean dean.hil...@nrel.gov wrote: I took this jmap dump of cassandra(in production). Before I restarted the whole production cluster, I had some nodes running compaction and it looked like

Re: How to replace a dead *seed* node while keeping quorum

2012-09-12 Thread Mohit Anchlia
How can this be resolved in this case? On Wed, Sep 12, 2012 at 3:53 PM, Rob Coli rc...@palominodb.com wrote: On Tue, Sep 11, 2012 at 4:21 PM, Edward Sargisson edward.sargis...@globalrelay.net wrote: If the downed node is a seed node then neither of the replace a dead node procedures work

Re: nodetool connection refused

2012-09-08 Thread Mohit Anchlia
Are both running on the same host? On Fri, Sep 7, 2012 at 11:53 PM, Manu Zhang owenzhang1...@gmail.com wrote: When I run Cassandra-trunk in Eclipse, nodetool fail to connect with the following error Failed to connect to '127.0.0.1:7199': Connection refused But if I run in terminal, all will

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
As far as I know Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC and after that it's up to the TCP/IP stack to deal with buffering. If requests start to time out Cassandra would use HH up to a certain time. For longer outage you would

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
enough indicator of my back log? Although we know when a network is flaky, we are interested in knowing how much data is piling up in local DC that needs to be transferred. Greatly appreciate your help. VR On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote: As far

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
: org.apache.cassandra.dht.RandomPartitioner Schema versions: 9511e292-f1b6-3f78-b781-4c90aeb6b0f6: [10.20.8.4, 10.20.8.5, 10.20.8.1, 10.20.8.2, 10.20.8.3] *From:* Mohit Anchlia [mailto:mohitanch...@gmail.com] *Sent:* Friday, August 24, 2012 1:55 PM *To:* user@cassandra.apache.org

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
for strategy_options I should be using the DC name from property file snitch right? Ours is “Fisher” and “TierPoint” so that’s what I used. *From:* Mohit Anchlia [mailto:mohitanch...@gmail.com] *Sent:* Monday, August 27, 2012 1:21 PM *To:* user@cassandra.apache.org *Subject:* Re

Re: Decreasing the number of nodes in the ring

2012-08-26 Thread Mohit Anchlia
use nodetool decommission and nodetool removetoken On Sun, Aug 26, 2012 at 5:31 PM, Senthilvel Rangaswamy senthil...@gmail.com wrote: We have a cluster of 9 nodes in the ring. We would like SSD backed boxes. But we may not need 9 nodes in that case. What is the best way to downscale the

Re: help required to resolve super column family problems

2012-08-24 Thread Mohit Anchlia
If you are starting out new use composite column names/values or you could also use JSON style doc as a column value. On Fri, Aug 24, 2012 at 2:31 PM, Rob Coli rc...@palominodb.com wrote: On Fri, Aug 24, 2012 at 4:33 AM, Amit Handa amithand...@gmail.com wrote: kindly help in resolving the

DSE solr HA

2012-08-12 Thread Mohit Anchlia
Going through this page and it looks like indexes are stored locally http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details . My question is what happens if one of the solr nodes crashes? Is the data indexed again on those nodes? Also, if RF 1 then is the same data being

Re: Decision Making- YCSB

2012-08-10 Thread Mohit Anchlia
I agree with Edward. We always develop our own stress tool that tests each use case of interest. Every use case is different in certain ways that can only be tested using a custom stress tool. On Fri, Aug 10, 2012 at 7:25 AM, Edward Capriolo edlinuxg...@gmail.com wrote: There are many YCSB forks

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:07 AM, Ertio Lew ertio...@gmail.com wrote: My major concern is that is it too bad retrieving 300-500 rows (each for a single column) in a single read query that I should store all these(around a hundred million) columns in a single row? You could create multiple

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:53 AM, Ertio Lew ertio...@gmail.com wrote: Actually these columns are 1 for each entity in my application I need to query at any time columns for a list of 300-500 entities in one go. Can you describe your situation with small example?

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:00 AM, Ertio Lew ertio...@gmail.com wrote: For each user in my application, I want to store a *value* that is queried by using the userId. So there is going to be one column for each user (userId as col Name *value* as col Value). Now I want to store these columns

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:16 AM, Ertio Lew ertio...@gmail.com wrote: I want to read columns for a randomly selected list of userIds(completely random). I fetch the data using userIds(which would be used as column names in case of single row or as rowkeys incase of 1 row for each user) for a

Re: Cassandra Authentication

2012-06-28 Thread Mohit Anchlia
Sent from my iPad On Jun 28, 2012, at 8:45 AM, Christof Bornhoevd cbornho...@gmail.com wrote: Hi, we are using Cassandra v1.0.8 with Hector v1.0-5 and would like to move our current system to an operational setting based on Amazon AWS. What are best practices for addressing security

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
On Tue, Jun 26, 2012 at 7:52 AM, Karthik N karthik@gmail.com wrote: My Cassandra ring spans two DCs. I use local quorum with replication factor=3. I do a write in DC1 with local quorum. Data gets written to multiple nodes in DC1. For the same write to propagate to DC2 only one copy is

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
question. In general I don't think you can selectively decide on HH. Besides HH should only be used when the outage is in mts, for longer outages using HH would only create memory pressure. On Tuesday, June 26, 2012, Mohit Anchlia wrote: On Tue, Jun 26, 2012 at 7:52 AM, Karthik N karthik

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-14 Thread Mohit Anchlia
That's right. Create a class that implements the required interface, then drop that jar in the lib directory and start the cluster. On Mon, May 14, 2012 at 11:41 AM, Kirk True k...@mustardgrain.com wrote: Disclaimer: I've never tried, but I'd imagine you can drop a JAR containing the class(es)

Updating CF to reversed type

2012-05-05 Thread Mohit Anchlia
Is it possible to update CF definition to use reversed type? If it's possible then what happens to the old values, do they still remain ordered in ascending order?

Re: Updating CF to reversed type

2012-05-05 Thread Mohit Anchlia
, Mohit Anchlia mohitanch...@gmail.com wrote: Is it possible to update CF definition to use reversed type? If it's possible then what happens to the old values, do they still remain ordered in ascending order?

Re: Question regarding major compaction.

2012-05-01 Thread Mohit Anchlia
+1 On Tue, May 1, 2012 at 12:06 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Also there are some tickets in JIRA to impose a max sstable size and some other related optimizations that I think got stuck behind levelDB in coolness factor. Not every use case is good for leveled so adding

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
On Thu, Mar 29, 2012 at 10:08 PM, Markus Wiesenbacher | Codefreun.de m...@codefreun.de wrote: Hi, yes you can insert data into cassandra with apollo, just try the demo center: http://www.codefreun.de/apolloUI/ You can login by just press the login-button (autologin) and play around with

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
with this API. I think, searching for a specific key is the most efficient way to get to your data, instead of paging through it. ** I was referring to columns, if a row-key has more than 100 columns then there is no way to look at columns that falls outside of it ** *Von:* Mohit

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
for any details on the upgrade path for these versions). The incompatibility here is only between 1.1.0-beta1 and 1.1.0-beta2. -- Sylvain On Thu, Mar 29, 2012 at 2:50 AM, Mohit Anchlia mohitanch...@gmail.com wrote: We are currently using 1.0.0-2 version. Do we still need to migrate

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
Any updates? On Thu, Mar 29, 2012 at 7:31 AM, Mohit Anchlia mohitanch...@gmail.com wrote: This is from NEWS.txt. So my question is if we are on 1.0.0-2 release do we still need to upgrade since this impacts releases between 1.0.3-1.0.5? - If you are running a multi datacenter setup, you

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
does not generate cross-dc forwarding message at all, so you're safe on that side. Is cross-dc forwarding different than replication? -- Sylvain On Thu, Mar 29, 2012 at 9:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Any updates? On Thu, Mar 29, 2012 at 7:31 AM, Mohit Anchlia

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-28 Thread Mohit Anchlia
We are currently using 1.0.0-2 version. Do we still need to migrate to the latest release of 1.0 before migrating to 1.1? Looks like incompatibility is only between 1.0.3-1.0.8. On Tue, Mar 27, 2012 at 6:42 AM, Benoit Perroud ben...@noisette.ch wrote: Thanks for the quick feedback. I will

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
, at 6:21 AM, Mohit Anchlia wrote: Thanks but if I do have to specify start and end columns then how much overhead roughly would that translate to since reading metadata should be constant overall? On Mon, Mar 26, 2012 at 10:18 AM, aaron morton aa...@thelastpickle.com wrote: Some information

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Mohit Anchlia
On Sun, Feb 26, 2012 at 12:18 PM, aaron morton aa...@thelastpickle.com wrote: Nathan Milford has a post about taking a node down http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ The only thing I would do differently would be turn off thrift first. Cheers Isn't decommission

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
In my opinion, if you run a busy site or application, keep blobs out of the database. On Wed, Feb 22, 2012 at 9:37 AM, Dan Retzlaff dretzl...@gmail.com wrote: Chunking is a good idea, but you'll have to do it yourself. A few of the columns in our application got quite large (maybe ~150MB) and the

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
Outside on the file system and a pointer to it in C* On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida almeida...@yahoo.com wrote: Keep them where? -- *From:* Mohit Anchlia mohitanch...@gmail.com *To:* user@cassandra.apache.org *Cc:* potek...@bnl.gov *Sent

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
PM, Mohit Anchlia wrote: Outside on the file system and a pointer to it in C* On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida almeida...@yahoo.com wrote: Keep them where? -- *From:* Mohit Anchlia mohitanch...@gmail.com *To:* user@cassandra.apache.org *Cc

Re: nodetool hangs and didn't print anything with firewall

2012-02-05 Thread Mohit Anchlia
Does it work with iptables disabled? You could add log to your firewall rules to see if firewall is dropping the packets. On Sun, Feb 5, 2012 at 5:35 PM, Roshan codeva...@gmail.com wrote: Hi I have 2 node Cassandra cluster and each linux box configured with a firewall. The ports 7000, 7199

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
and ERROR. But if there is nothing to do then it probably is just an INFO. On Tue, Jan 31, 2012 at 9:41 PM, Mohit Anchlia mohitanch...@gmail.com wrote: I guess this is not really a WARN in that case. On Tue, Jan 31, 2012 at 4:29 PM, aaron morton aa...@thelastpickle.com wrote: The ratio

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
write to and then read from. On Fri, Feb 3, 2012 at 10:31 AM, Mohit Anchlia mohitanch...@gmail.com wrote: On Fri, Feb 3, 2012 at 7:32 AM, Jonathan Ellis jbel...@gmail.com wrote: It's a warn because it's nonsense for the JVM to report that an column + overhead, takes less space than just

Re: WARN [Memtable] live ratio

2012-01-31 Thread Mohit Anchlia
I guess this is not really a WARN in that case. On Tue, Jan 31, 2012 at 4:29 PM, aaron morton aa...@thelastpickle.com wrote: The ratio is the ratio of serialised bytes for a memtable to actual JVM allocated memory. Using a ratio below 1 would imply the JVM is using less bytes to store the

Re: WARN [Memtable] live ratio

2012-01-30 Thread Mohit Anchlia
I have the same experience. Wondering what's causing this? One thing I noticed is that this happens if server is idle for some time and then load starts going high is when I start to see these messages. On Mon, Jan 30, 2012 at 4:54 PM, Roshan codeva...@gmail.com wrote: Hi All Time to time I am

Re: Cassandra to Oracle?

2012-01-20 Thread Mohit Anchlia
I think the problem stems when you have data in a column that you need to run adhoc query on which is not denormalized. In most cases it's difficult to predict the type of query that would be required. Another way of solving this could be to index the fields in search engine. On Fri, Jan 20,

Re: Garbage collection freezes cassandra node

2012-01-19 Thread Mohit Anchlia
What version of Java do you use? Can you try reducing NewSize and increasing the old generation? If you are on an old version of Java I also recommend upgrading it. On Thu, Jan 19, 2012 at 3:27 AM, Rene Kochen rene.koc...@emea.schange.com wrote: Thanks for your comments. The application

Re: Max records per node for a given secondary index value

2012-01-18 Thread Mohit Anchlia
You need to shard your rows On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur mailtoka...@gmail.com wrote: Anyone? On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur mailtoka...@gmail.com wrote: Hi All, It is great to know that Cassandra column family can accommodate 2 billion columns per row!
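
One way to read "shard your rows" is to spread a logical wide row over a fixed number of physical rows by adding a bucket suffix to the row key. The sketch below is only an illustration under that assumption; the names `shard_row_key` and `all_shard_keys`, the `:` separator, and the default of 16 shards are hypothetical and do not come from the thread.

```python
import hashlib


def shard_row_key(base_key, column_name, shard_count=16):
    """Pick the physical row key for a column (hypothetical helper).

    A stable hash of the column name selects the shard, so the same
    column always lands in the same physical row.
    """
    digest = hashlib.md5(column_name.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}:{shard}"


def all_shard_keys(base_key, shard_count=16):
    """List every physical row key that must be read to see the whole logical row."""
    return [f"{base_key}:{shard}" for shard in range(shard_count)]


if __name__ == "__main__":
    print(shard_row_key("status=active", "record-12345"))
    print(all_shard_keys("status=active", shard_count=4))
```

Writes go to the single key returned by `shard_row_key`; a full read of the logical row has to fetch all keys from `all_shard_keys`, which is the cost of keeping each physical row bounded.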

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Mohit Anchlia
Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: Hi, we're using RP and have each node assigned the same amount of the token space. The cluster looks like that:

Brisk with standard C* cluster

2012-01-16 Thread Mohit Anchlia
Is it possible to add Brisk only nodes to standard C* cluster? So if we have node A,B,C with standard C* then add Brisk node D,E,F for analytics?

Installing C* on EC2

2012-01-12 Thread Mohit Anchlia
What's the best way to install C*? Any good links? Is it better to just create instances and install rpms on it first, just like regular cluster and then create image from it? I am assuming it's possible. Are there any known issues when running C* on EC2? How do other C* users deal with instance

Re: Pending on ReadStage

2012-01-06 Thread Mohit Anchlia
Are all your nodes equally balanced in terms of read requests? Are you using RandomPartitioner? Are you reading using indexes? First thing you can do is compare iostat -x output between the 2 nodes to rule out any io issues assuming your read requests are equally balanced. On Fri, Jan 6, 2012 at

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian d...@venarc.com wrote: Hi Everyone, What's the best way to reliably have unique constraints like functionality with Cassandra? I have the following (which I think should be very common) use case. User CF Row Key: user email Columns:

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
, no? On Jan 6, 2012, at 10:38 AM, Mohit Anchlia wrote: On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian d...@venarc.com wrote: Hi Everyone, What's the best way to reliably have unique constraints like functionality with Cassandra? I have the following (which I think should be very common

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
This looks like the right way to do it. But remember this still doesn't guarantee uniqueness if your clocks drift way too much. But it's a trade-off with having to manage one additional component or use something internal to C*. It would be good to see similar functionality implemented in C* so that clients don't

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
like this has been tried before, and for various reasons was not added. It's definitely non-trivial to get right. On Fri, 6 Jan 2012 13:33:02 -0800 Mohit Anchlia mohitanch...@gmail.com wrote: This looks like right way to do it. But remember this still doesn't gurantee if your clocks drifts way

Re: cassandra data to hadoop.

2011-12-24 Thread Mohit Anchlia
You could read using Cassandra client and write to HDFS using Hadoop FS Api. On Fri, Dec 23, 2011 at 11:20 PM, ravikumar visweswara talk2had...@gmail.com wrote: Jeremy, We use cloudera distribution for our hadoop cluster and may not be possible to migrate to brisk quickly because of flume/hue

Re: Garbage collection freezes cassandra node

2011-12-19 Thread Mohit Anchlia
Increasing memory in this case may not solve the problem. Share some information about your workload: cluster configuration, cache sizes, etc. You can also try getting a Java heap histogram to get more info on what's on the heap. On Mon, Dec 19, 2011 at 7:35 AM, Rene Kochen

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Mohit Anchlia
bart@node1:~$ nodetool -h localhost getendpoints A UserDetails 4545027 192.168.81.5 192.168.81.2 192.168.81.3 Can you see what happens if you stop C* say on node .5 and write and read at quorum? On Wed, Dec 14, 2011 at 7:06 AM, Bart Swedrowski b...@timedout.org wrote: On 14 December 2011

Re: Efficiency of Cross Data Center Replication...?

2011-11-20 Thread Mohit Anchlia
On Sun, Nov 20, 2011 at 4:01 AM, Boris Yen yulin...@gmail.com wrote: A quick question, what if DC2 is down, and after a while it comes back on. how does the data get sync to DC2 in this case? (assume hint is disable) Thanks in advance. Manually, use nodetool repair in rolling fashion on all

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss bburr...@expedia.com wrote: I'm using cassandra 1.0.  Been doing some testing on using cass's cache.  When I turn it on (using the CLI) I see ParNew jump from 3-4ms to

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss bburr

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 9:42 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Nov 18, 2011 at 6:31 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia mohitanch

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 1:46 PM, Todd Burruss bburr...@expedia.com wrote: Ok, I figured something like that.  Switching to ConcurrentLinkedHashCacheProvider I see it is a lot better, but still instead of the 25-30ms response times I enjoyed with no caching, I'm seeing 500ms at 100% hit rate on

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
ParNew and other major phases recorded in the logs. Are there any significant writes, memtable flushes etc. occurring during this time? How many reads/sec and writes/sec? What's the size of your row and columns that you are trying to retrieve? On 11/18/11 2:40 PM, Mohit Anchlia mohitanch

Re: Second Cassandra users survey

2011-11-14 Thread Mohit Anchlia
On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani jak...@gmail.com wrote: Re  Simpler elasticity: Latest opscenter will now rebalance cluster optimally http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 /plug Does it cause any impact on reads and writes while re-balance is in progress?

Re: Help with Cassandra Row Caches

2011-11-11 Thread Mohit Anchlia
Can you temporarily increase the size of Heap and try? On Fri, Nov 11, 2011 at 5:21 PM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Hi everybody, We set row cache too high, 1 or so and now all our 6 nodes fail with OOM. I believe that high row cache causes OOMs. Now, we trying to change

Re: security

2011-11-09 Thread Mohit Anchlia
We lockdown ssh to root from any network. We also provide individual logins including sysadmin and they go through LDAP authentication. Anyone who does sudo su as root gets logged and alerted via trapsend. We use firewalls and also have a separate vlan for datastore servers. We then open only

Re: Second Cassandra users survey

2011-11-06 Thread Mohit Anchlia
Transparent on-disk encryption with a pluggable key provider will also be really helpful to secure sensitive information. On Sun, Nov 6, 2011 at 9:42 AM, Aaron Turner synfina...@gmail.com wrote: The intent was to have a lighter solution for common problems than having to go with Hadoop or

Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson slatem...@yahoo.co.uk wrote: I'm using Cassandra as a big graph database, loading large volumes of data live and linking on the fly. Not sure if Cassandra is the right fit to model complex vertices and edges. The number of edges grows geometrically

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Mohit Anchlia
On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet c...@chrisgoffinet.com wrote: On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean sorin.jul...@gmail.com wrote: Hey Chris, Thanks for sharing all the info. I have a few questions: 1. What are you doing with so much memory :) ? How much of it do

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
at 10:22 AM, Aditya Narayan ady...@gmail.com wrote: ..so that I can retrieve them through a single query. For reading cols from two CFs you need two queries, right ? On Sat, Oct 29, 2011 at 9:53 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Why not use 2 CFs? On Fri, Oct 28, 2011 at 9

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan ady...@gmail.com wrote: @Mohit: I have stated the example scenarios in my first post under this heading. Also, I have stated above why I want to split that data into two rows. As Ikeda stated below, I too am trying to prevent the frequently

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-25 Thread Mohit Anchlia
On Tue, Oct 25, 2011 at 11:18 AM, Dan Hendry dan.hendry.j...@gmail.com wrote: 2. ... So I am going to use a rotational disk for the commit log and an SSD for data. Does this make sense? Yes, but keep in mind that the primary characteristic of SSDs is lower seek times, which translates

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-25 Thread Mohit Anchlia
and in memory caching of columns that Cassandra offers? Cheers, Alex On Tue, Oct 25, 2011 at 9:06 PM, Todd Burruss bburr...@expedia.com wrote: This may help determining your data storage requirements ... http://btoddb-cass-storage.blogspot.com/ On 10/25/11 11:22 AM, Mohit Anchlia

Re: how to reduce disk read? (and bloom filter performance)

2011-10-17 Thread Mohit Anchlia
On Sun, Oct 16, 2011 at 2:20 AM, Radim Kolar h...@sendmail.cz wrote: On 10.10.2011 18:53, Mohit Anchlia wrote: Does it mean you are not updating or deleting rows? Yes. I have 350m rows and only about 100k of them are updated. Can you look at JMX values of BloomFilter* ? i

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Mohit Anchlia
Do you have the same seed node specified in cass-analysis-1 as in cass-1,2,3? I am thinking that changing the seed node in cass-analysis-2 and following the directions in http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve the problem. Someone please correct me. On Thu, Oct 13, 2011 at

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
You mentioned this happens only on one node? How many nodes do you have? Is it possible to turn off this node completely and run compactions on other nodes and see if this happens there too? Also, you mentioned this happens after compaction. Did you mean during compaction or right after it? What

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
, Oct 12, 2011 at 1:13 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Yes. If you have exhausted all the options, I think it will be good to see if this issue persists across other nodes after you decommission that node. If this is not production and the issue is easily reproducible you can also

Re: how to reduce disk read? (and bloom filter performance)

2011-10-10 Thread Mohit Anchlia
to compact more often. On Sun, Oct 9, 2011 at 7:09 AM, Radim Kolar h...@sendmail.cz wrote: On 7.10.2011 23:16, Mohit Anchlia wrote: You'll see output like:

Offset      SSTables
1           8021
2           783

Which means 783 read operations accessed 2 SSTables thank
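
The histogram reading quoted in this snippet can be turned into a single number. What follows is only a rough sketch, assuming the output is a header row followed by whitespace-separated "offset count" pairs, where the offset is the number of SSTables a read touched and the count is the number of reads. The function names and the sample string are hypothetical, and the snippet does not say which tool printed the output. The sketch computes the share of reads that had to consult more than one SSTable, one possible signal for the "compact more often" advice.

```python
def parse_sstable_histogram(text):
    """Parse 'offset count' lines into {sstables_per_read: read_count}."""
    histogram = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the header row, which is not numeric.
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue
        histogram[int(parts[0])] = int(parts[1])
    return histogram


def multi_sstable_fraction(histogram):
    """Return the fraction of reads that touched more than one SSTable."""
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    multi = sum(count for offset, count in histogram.items() if offset > 1)
    return multi / total


# Sample values copied from the quoted message; the layout is assumed.
SAMPLE = """Offset SSTables
1 8021
2 783
"""

hist = parse_sstable_histogram(SAMPLE)
print(hist)
print(round(multi_sstable_fraction(hist), 4))
```

With the two rows quoted here, 783 of 8804 reads touched two SSTables, so most reads were already served from a single SSTable.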
