Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
Ahoy the list. I am evaluating Cassandra in the context of using it as a storage back end for the Titan graph database. We’ll have several nodes in the cluster. However, one of our requirements is that data has to be loaded into and stored on a specific node and only on that node. Also, it

Re: Cassandra 1.1.6 - Disk usage and Load displayed in ring doesn't match

2013-12-18 Thread Julien Campan
Hi, When you increase the RF, you need to run a repair for the keyspace on each node (because data is not automatically streamed). After that you should run a cleanup on each node to remove obsolete SSTables. Good luck :) Julien Campan. 2013/12/18 Aaron Morton

Re: Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-18 Thread Alain RODRIGUEZ
Hi, this would indeed be much appreciated by a lot of people. There is an existing issue on this subject: https://issues.apache.org/jira/browse/CASSANDRA-4914 Maybe you could help the committers there. Hope this will be useful to you. Please let us know when you find a way to do these

Re: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Janne Jalkanen
This may be hard because the coordinator could store hinted handoff (HH) data on disk. You could turn HH off and have RF=1 to keep data on a single instance, but you would be likely to lose data if you had any problems with your instances… Also you would need to tweak the memtable flushing so

RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
-Original Message- From: Janne Jalkanen [mailto:janne.jalka...@ecyrd.com] Essentially you want to turn off all the features which make Cassandra a robust product ;-). Oh, I don't want to, but sadly those are the requirements that I have to work with. Again, the context is using it

Re: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Sylvain Lebresne
You seem to be well aware that you're not looking at using Cassandra for what it is designed for (which obviously implies you'll need to expect sub-optimal behavior), so I'm not going to insist on it. As to how you could achieve that, a relatively simple solution (that does not require writing your

Re: NullPointerException causing repair to hang

2013-12-18 Thread Russ Garrett
On 17 December 2013 19:47, Robert Coli rc...@eventbrite.com wrote: I would comment to that effect on CASSANDRA-6210, were I you. Will do. Are you using vnodes? Have you tried a rolling restart of all nodes? Yes, we're using vnodes, and all nodes have been restarted since this problem started

RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
-Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: 18 December 2013 10:45 You seem to be well aware that you're not looking at using Cassandra for what it is designed for (which obviously implies you'll need to expect sub-optimal behavior), so I'm not

Re: Cassandra 1.2 : OutOfMemoryError: unable to create new native thread

2013-12-18 Thread Oleg Dulin
I figured it out. Another process on that machine was leaking threads. All is well! Thanks guys! Oleg On 2013-12-16 13:48:39 +, Maciej Miklas said: cassandra-env.sh has the option JVM_OPTS="$JVM_OPTS -Xss180k"; it will give this error if you start Cassandra with Java 7. So increase the

FW: Commitlog replay makes dropped and recreated keyspace and column family rows reappear

2013-12-18 Thread Desimpel, Ignace
I did the test again to get the log information. There is a Drop keyspace message at the time I drop the keyspace. That actually must be working since after the drop, I do not get any records back. But starting from the time of restart, I do not get any Drop keyspace message in the log. I get

Re: Endless loop LCS compaction

2013-12-18 Thread Marcus Eriksson
this has been fixed: https://issues.apache.org/jira/browse/CASSANDRA-6496 On Wed, Dec 18, 2013 at 2:51 PM, Desimpel, Ignace ignace.desim...@nuance.com wrote: Hi, Would it not be possible that in some rare cases these 'small' files are created also and thus resulting in the same endless

RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
-Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: 18 December 2013 12:46 Google up NetworkTopologyStrategy. This is what you want to use and it's not configured in cassandra.yaml but when you create the keyspace. Basically, you define your topology in
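As a rough illustration of the point that the replication strategy is chosen when the keyspace is created, the sketch below builds a CQL CREATE KEYSPACE statement that uses NetworkTopologyStrategy. The keyspace name `titan`, the data centre name `storage_dc` and the helper `create_keyspace_cql` are assumptions made up for this example; the actual data centre names depend on how the cluster's snitch is configured, which the truncated message does not show.

```python
def create_keyspace_cql(keyspace, dc_replicas):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    dc_replicas maps a data centre name to the number of replicas kept there.
    Both the helper and the names passed to it are hypothetical.
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    # Sort the data centres so the generated statement is deterministic.
    for dc_name, replicas in sorted(dc_replicas.items()):
        options.append("'%s': %d" % (dc_name, replicas))
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (
        keyspace,
        ", ".join(options),
    )


if __name__ == "__main__":
    # One replica, kept only in the (hypothetical) data centre "storage_dc".
    print(create_keyspace_cql("titan", {"storage_dc": 1}))
```

Data centres that are not listed in the replication map receive no replicas for the keyspace, which is what makes this approach a candidate for pinning data to one part of the cluster.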

WRITETIME question

2013-12-18 Thread Christopher Wirt
Is there any reason to use the WRITETIME function on non-counter columns? I'm using CQL statements via the thrift protocol and get a Timestamp returned with each column. I'm pretty sure select a, writetime(a) from b where u = 1 is unnecessary for me. Unless a is a counter. I guess my

Re: WRITETIME question

2013-12-18 Thread Tyler Hobbs
On Wed, Dec 18, 2013 at 8:24 AM, Christopher Wirt chris.w...@struq.com wrote: What, if any, is the difference between selecting writetime(column) and just looking at the Timestamp of a selected column? There's no difference. The writetime() function is only really necessary for native protocol

how wide to make wide rows in practice?

2013-12-18 Thread Lee Mighdoll
I think the recommendation once upon a time was to keep wide storage engine internal rows from growing too large. e.g. for time series, it was recommended to partition samples by day or by hour to keep the size manageable. What's the current cassandra 2.0 advice on sizing for wide storage engine
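The day or hour partitioning mentioned above can be sketched as a row key that combines a series identifier with a time bucket, so each storage engine row only holds the samples for one bucket. The function name `bucketed_row_key`, the `series:bucket` key layout and the sample identifier are assumptions for illustration, not something taken from the thread.

```python
from datetime import datetime, timezone


def bucketed_row_key(series_id, epoch_seconds, bucket="day"):
    """Return a row key that groups samples of one series by day or by hour (UTC)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    if bucket == "day":
        suffix = moment.strftime("%Y%m%d")
    elif bucket == "hour":
        suffix = moment.strftime("%Y%m%d%H")
    else:
        raise ValueError("bucket must be 'day' or 'hour'")
    # Hypothetical key layout: "<series id>:<bucket>".
    return "%s:%s" % (series_id, suffix)


if __name__ == "__main__":
    sample_time = int(datetime(2013, 12, 18, 17, 26, tzinfo=timezone.utc).timestamp())
    print(bucketed_row_key("sensor-1", sample_time, "day"))
    print(bucketed_row_key("sensor-1", sample_time, "hour"))
```

A reader then has to know which buckets cover the requested time range and query each of them, which is the added complexity the question refers to.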

Re: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Robert Coli
On Wed, Dec 18, 2013 at 2:44 AM, Sylvain Lebresne sylv...@datastax.com wrote: As Janne said, you could still have hints being written by other nodes if the one storage node is dead, but you can set the system property cassandra.maxHintTTL to 0 to disable hints. If one uses a Token Aware client

Re: how wide to make wide rows in practice?

2013-12-18 Thread Robert Coli
On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll l...@underneath.ca wrote: What's the current cassandra 2.0 advice on sizing for wide storage engine rows? Can we drop the added complexity of managing day/hour partitioning for time series stores? A few hundred megs at very most is generally

Re: Cassandra 1.1.6 - Disk usage and Load displayed in ring doesn't match

2013-12-18 Thread Narendra Sharma
Thanks Aaron. No tmp files and not even a single exception in the system.log. If the file was last modified on 20-Nov then there must be an entry for that in the log (either completed streaming or compacted). On Tue, Dec 17, 2013 at 7:23 PM, Aaron Morton aa...@thelastpickle.com wrote: -tmp-

Re: Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-18 Thread Brian O'Neill
Thanks for the pointer Alain. At a quick glance, it looks like people are looking for query time filtering/aggregation, which will suffice for small data sets. Hopefully we might be able to extend that to perform pre-computations as well. (which would support much larger data sets / volumes)

Re: how wide to make wide rows in practice?

2013-12-18 Thread Lee Mighdoll
Hi Rob, thanks for the refresher, and the issue link (fixed today too, thanks Sylvain!). Cheers, Lee On Wed, Dec 18, 2013 at 10:47 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll l...@underneath.ca wrote: What's the current cassandra 2.0 advice

Re: Cassandra pytho pagination

2013-12-18 Thread Kumar Ranjan
I am using pycassa. So, here is how I solved this issue. Will discuss 2 approaches. First approach didn't work out for me. Thanks Aaron for your attention. First approach: - Say if column_count = 10 - collect first 11 rows, sort first 10, send it to user (front end) as JSON object and
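The first approach above resembles the common "fetch one extra item" pagination pattern: read page_size + 1 entries, return the first page_size, and keep the extra one as the starting point of the next page. The sketch below shows only that idea over an in-memory sorted list; it does not call pycassa, the name `fetch_page` is made up for this example, and the poster reports that this approach did not work out for them.

```python
import bisect


def fetch_page(sorted_keys, start_key=None, page_size=10):
    """Return (page, next_start) from a sorted list of keys.

    next_start is the key the following page should begin at,
    or None when there is nothing left to read.
    """
    begin = 0 if start_key is None else bisect.bisect_left(sorted_keys, start_key)
    # Read one entry more than the page size to learn whether another page exists.
    window = sorted_keys[begin:begin + page_size + 1]
    page = window[:page_size]
    next_start = window[page_size] if len(window) > page_size else None
    return page, next_start


if __name__ == "__main__":
    keys = ["k%02d" % i for i in range(25)]
    start = None
    while True:
        page, start = fetch_page(keys, start)
        print(page)
        if start is None:
            break
```

Against a real store, the slice would be replaced by a range read that starts at `start_key` and asks for `page_size + 1` entries.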

Re: Cassandra pytho pagination

2013-12-18 Thread Robert Coli
On Wed, Dec 18, 2013 at 1:28 PM, Kumar Ranjan winnerd...@gmail.com wrote: Second approach ( I used in production ): - fetch all super columns for a row key Stock response mentioning that super columns are anti-advised for use, especially in brand new code. =Rob

How to tune cassandra to avoid OOM

2013-12-18 Thread Shammi Jayasinghe
Hi, We are facing a problem with Cassandra tuning. We hit the following OOM scenario[1] after running the system for 6 days. We have tuned Cassandra with the following values, which were obtained by going through a huge number of testing cycles. But still it has gone

Re: How to tune cassandra to avoid OOM

2013-12-18 Thread Lee Mighdoll
I'd suggest setting some Cassandra JVM parameters so that you can analyze a heap dump and look through the GC logs. That will give you some clues, e.g. whether the memory problem is growing steadily or suddenly, and which objects are using the memory. -XX:+HeapDumpOnOutOfMemoryError

Occasional NPE using DataStax Java driver

2013-12-18 Thread David Tinker
We are using Cassandra 2.0.3-1 installed on Ubuntu 12.04 from the DataStax repo with the DataStax Java driver version 2.0.0-rc1. Every now and then we get the following exception: 2013-12-19 06:56:34,619 [sql-2-t15] ERROR core.RequestHandler - Unexpected error while querying /x.x.x.x

Re: Best way to measure write throughput...

2013-12-18 Thread Jason Wee
Hello, you could probably also do it in your application. Just sample at a fixed time interval and that should give some indication of throughput. HTH /Jason On Thu, Dec 19, 2013 at 12:11 AM, Krishna Chaitanya bnsk1990r...@gmail.com wrote: Hello, Could you please suggest to me the best way
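One way to read the suggestion above is a small counter in the application that is bumped on every successful write and read out at intervals. The following is a minimal sketch under that assumption; the class name `ThroughputSampler` and its methods are invented for this example, and the sampler is not thread-safe as written.

```python
import time


class ThroughputSampler:
    """Counts writes and reports writes per second since the previous sample."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the sampler can be tested
        self._count = 0              # total writes recorded so far
        self._last_count = 0         # total at the time of the previous sample
        self._last_time = clock()    # time of the previous sample

    def record(self, writes=1):
        """Call after each successful write (or batch of writes)."""
        self._count += writes

    def sample(self):
        """Return the write rate, in writes per second, since the last sample."""
        now = self._clock()
        elapsed = now - self._last_time
        delta = self._count - self._last_count
        self._last_time = now
        self._last_count = self._count
        return delta / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    sampler = ThroughputSampler()
    for _ in range(1000):
        sampler.record()             # stand-in for a real write call
    time.sleep(0.1)
    print("writes/sec: %.1f" % sampler.sample())
```

This measures the rate seen by one client process only; with several writers the individual rates would have to be added up.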