Re: Nodetool repair and Leveled Compaction

2012-09-25 Thread Sergey Tryuber
Hi Radim, Unfortunately, the number of compaction tasks is not overestimated. The number is decremented one by one, and this process takes several hours for our 40GB node :( Also, when a lot of compaction tasks appear, we see that the total disk space used (via JMX) is doubled and Cassandra really tries to

RE: Cassandra Counters

2012-09-25 Thread Roshni Rajagopal
Thanks for the reply, and sorry for being bull-headed. Once you're past the stage where you've decided it's distributed, and NoSQL, and Cassandra out of all the NoSQL options, now to count something, you can do it in different ways in Cassandra. In all the ways you want to use Cassandra's best

Re: Cassandra Counters

2012-09-25 Thread Robin Verlangen
From my point of view, another problem with using the standard column family for counting is transactions. Cassandra lacks them, so if you're updating counters from multiple threads, how will you keep track of that? Yes, I'm aware of software like ZooKeeper to do that, however I'm not sure whether

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Віталій Тимчишин
See my comments inline 2012/9/25 Aaron Turner synfina...@gmail.com On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин tiv...@gmail.com wrote: Why so? What are pluses and minuses? As for me, I am looking for number of files in directory. 700GB/512MB*5(files per SST) = 7000 files, that
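The file-count estimate quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes 1 GB = 1024 MB and takes the "5 files per SST" figure from the message as given; the variable names are illustrative and not part of any Cassandra API.

```python
# Rough check of the file-count estimate from the message above.
# Assumptions: 1 GB = 1024 MB; "5 files per SSTable" is taken from the message as stated.
data_per_node_mb = 700 * 1024   # ~700 GB of data per node
sstable_size_mb = 512           # the sstable_size_in_mb value mentioned in the thread
files_per_sstable = 5           # component files per SSTable, per the message

sstable_count = data_per_node_mb // sstable_size_mb
file_count = sstable_count * files_per_sstable

print(sstable_count)  # 1400
print(file_count)     # 7000
```

With a smaller SSTable size the same arithmetic gives proportionally more files, which is the concern raised in the thread.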

a node stays in joining

2012-09-25 Thread Satoshi Yamada
Hi, One node in my cluster stays in joining. I found a JIRA about this, which is fixed, but I still see a similar thing. This is a node whose token I removed first because it did not boot correctly, and it re-joined the cluster without any pre-set token (should I set the previous token?). As you see

Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
I've recently noticed several threads about Cassandra Counters inconsistencies and started to seriously think about possible workarounds, like storing realtime counters in Redis and dumping them daily to Cassandra. So general question, should I rely on Counters if I want 100% accuracy? Thanks, Ed On Tue,

Re: [problem with OOM in nodes]

2012-09-25 Thread Denis Gabaydulin
Thanks a lot for helping. We came to the same decision: clustering one report into multiple Cassandra rows (sorted buckets of report rows) and managing the clusters on the client side. On Tue, Sep 25, 2012 at 5:28 AM, aaron morton aa...@thelastpickle.com wrote: What exactly is the problem with big rows?

The compaction task cannot delete sstables which are used in a repair session

2012-09-25 Thread Rene Kochen
Is this a bug? I'm using Cassandra 1.0.11: INFO 13:45:43,750 Compacting [SSTableReader(path='d:\data\Traxis\Parameters-hd-47-Data.db'), SSTableReader(path='d:\data\Traxis\Parameters-hd-44-Data.db'), SSTableReader(path='d:\data\Traxis\Parameters-hd-46-Data.db'),

Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Edward, We use counters in production with Cassandra 1.0.5. Our application is sensitive to write latency and we are seeing problems with frequent young-generation garbage collections; also, we only do increments (decrements have caused problems for some people). We don't see inconsistencies

Re: Correct model

2012-09-25 Thread Hiller, Dean
If you need anything added/fixed, just let PlayOrm know. PlayOrm has been able to quickly add so far…that may change as more and more requests come but so far PlayOrm seems to have managed to keep up. We are using it live by the way already. It works out very well so far for us (We have 5000

Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
So general question, should I rely on Counters if I want 100% accuracy? No. Even setting aside potential bugs, counters are not idempotent: if you get a TimeoutException during a write (which can happen even in relatively normal conditions), you won't know if the increment went in or not
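The point about non-idempotent increments can be illustrated with a toy simulation. This is a minimal sketch, not Cassandra client code: `ToyCounterStore`, `LostAck`, and `increment_with_retry` are hypothetical names, and the "timeout" here is modelled as a write that was applied but whose acknowledgement was lost.

```python
class LostAck(Exception):
    """Stands in for a client-side timeout: the outcome of the write is unknown."""


class ToyCounterStore:
    """Hypothetical in-memory counter, used only to illustrate the retry problem."""

    def __init__(self, lose_ack_on_calls):
        self.value = 0
        self.calls = 0
        self.lose_ack_on_calls = set(lose_ack_on_calls)

    def increment(self, delta):
        self.calls += 1
        self.value += delta              # the increment is applied...
        if self.calls in self.lose_ack_on_calls:
            raise LostAck()              # ...but the client never hears back


def increment_with_retry(store, delta, max_attempts=3):
    """Blindly retry on timeout; each retry may apply the delta again."""
    for _ in range(max_attempts):
        try:
            store.increment(delta)
            return True
        except LostAck:
            continue
    return False


store = ToyCounterStore(lose_ack_on_calls=[1])
increment_with_retry(store, 1)
print(store.value)  # 2, although the client intended a single +1
```

The opposite failure is equally possible in a real cluster: if the timed-out write was never applied and the client does not retry, the counter undercounts. The client cannot tell the two cases apart, which is why an overwrite of a fixed value is safe to retry and an increment is not.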

Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Sylvain In a relatively untroubled cluster, even timed out writes go through, provided no messages are dropped, which you can monitor on the Cassandra nodes. We have 100% consistency on our production servers as we don't see messages being dropped on our servers. Though as you mention, there would

Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
@Sylvain and @Rohit: Thanks for your answers. On Tue, Sep 25, 2012 at 2:27 PM, Sylvain Lebresne sylv...@datastax.com wrote: So general question, should I rely on Counters if I want 100% accuracy? No. Even not considering potential bugs, counters being not idempotent, if you get a

Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
In a relatively untroubled cluster, even timed out writes go through, provided no messages are dropped. This all depends on your definition of an untroubled cluster, but to be clear, in a cluster where a node dies (which for Cassandra is not considered abnormal and will happen to everyone no

Re: Correct model

2012-09-25 Thread Marcelo Elias Del Valle
Dean, In the playOrm data modeling, if I understood it correctly, every CF has its own id, right? For instance, User would have its own ID, Activities would have its own id, etc. What if I have a trillion activities? Wouldn't it be a problem to have 1 row id for each activity? Cassandra

Re: Correct model

2012-09-25 Thread Hiller, Dean
Just fyi that some of these are cassandra questions… Dean, In the playOrm data modeling, if I understood it correctly, every CF has its own id, right? No, each entity has a field annotated with @NoSqlId. That tells playOrm this is the row key. Each INSTANCE of the entity is a row in

Running repair negatively impacts read performance?

2012-09-25 Thread Charles Brophy
Hey guys, I've begun to notice that read operations take a performance nose-dive after a standard (full) repair of a fairly large column family: ~11 million records. Interestingly, I've then noticed that read performance returns to normal after a full scrub of the column family. Is it possible

Re:

2012-09-25 Thread Charles Brophy
There are settings in cassandra.yaml that will _gradually_ reduce the available cache to zero if you are under constant memory pressure: # Set to 1.0 to disable. snip reduce_cache_sizes_at: * reduce_cache_capacity_to: * My experience is that the cache size will not return to the configured

Re: Correct model

2012-09-25 Thread Hiller, Dean
Oh, and if you really want to scale very easily, just use play framework 1.2.5 ;). We use that and since it is stateless, to scale up, you simply add more servers. Also, it's like coding in PHP or Ruby, etc. as far as development speed (no server restarts), so it's a pretty nice framework. We

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Aaron Turner
On Tue, Sep 25, 2012 at 10:36 AM, Віталій Тимчишин tiv...@gmail.com wrote: See my comments inline 2012/9/25 Aaron Turner synfina...@gmail.com On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин tiv...@gmail.com wrote: Why so? What are pluses and minuses? As for me, I am looking for

Re: unsubscribe

2012-09-25 Thread Eric Evans
On Tue, Sep 25, 2012 at 1:23 PM, puneet loya puneetl...@gmail.com wrote: http://goo.gl/JcMcr -- Eric Evans Acunu | http://www.acunu.com | @acunu

is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
This is cassandra 1.1.4 Describe shows DecimalType and I test setting comparator TO the DecimalType and it fails (Realize I have never touched this column family until now except for posting data which succeeded) [default@unknown] use databus; Authenticated to keyspace: databus

Re: is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
Hmmm, is rowkey validation asynchronous to the actual sending of the data to Cassandra? I seem to be able to put an invalid type and GET that invalid data back just fine even though my type was an int and the comparator was Decimal, BUT then in the logs I see a validation fail exception but I

Re: Cassandra failures while moving token

2012-09-25 Thread aaron morton
As per our understanding we expect that when we move token then that node will first sync up the data as per the new assigned token only after that it will receive the requests for new range. When you use nodetool move the node will receive write requests for the new range. As well as

Integrated cassandra

2012-09-25 Thread Robin Verlangen
Hi there, Is there a way to embed/package Cassandra with another Java application and maintain control over it? Has this been done before? Are there any best practices? Why do I want to do this? We want to offer as little configuration as possible to our customers, but only if it's possible without

Re: Can't change replication factor in Cassandra 1.1.2

2012-09-25 Thread Rob Coli
On Wed, Jul 18, 2012 at 10:27 AM, Douglas Muth doug.m...@gmail.com wrote: Even though keyspace test1 had a replication_factor of 1 to start with, each of the above UPDATE KEYSPACE commands caused a new UUID to be generated for the schema, which I assume is normal and expected. I believe the

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Rob Coli
On Sun, Sep 23, 2012 at 12:24 PM, Aaron Turner synfina...@gmail.com wrote: Leveled compaction has tamed space for us. Note that you should set sstable_size_in_mb to a reasonably high value (it is 512 for us with ~700GB per node) to prevent creating a lot of small files. 512MB per sstable? Wow,

Re: compression

2012-09-25 Thread aaron morton
Check the logs on nodes 2 and 3 to see if the scrub started. The logs on 1 will be a good help with that. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/09/2012, at 10:31 PM, Tamar Fraenkel ta...@tok-media.com wrote: Hi! I ran

Re: downgrade from 1.1.4 to 1.0.X

2012-09-25 Thread aaron morton
No. Versions are capable of reading previous file formats, but can only create files in the current format. File formats are listed here https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/Descriptor.java#L52 Looking for a way to make this work. I'd

Re:

2012-09-25 Thread Manu Zhang
I wonder now if get_range_slices call will ever look for data in row cache. I don't see it in the codebase. Only the get call will check row cache? On Wed, Sep 26, 2012 at 12:11 AM, Charles Brophy cbro...@zulily.com wrote: There are settings in cassandra.yaml that will _gradually_ reduce the

Re: Understanding Thread Pools

2012-09-25 Thread aaron morton
They are Thrift connection threads. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/09/2012, at 1:32 AM, rohit bhatia rohit2...@gmail.com wrote: Hi What are pool-2-thread-* threads? Someone mentioned Client Connection Threads.

Re: Prevent queries from OOM nodes

2012-09-25 Thread aaron morton
Can you provide some information on the queries and the size of the data they traversed? The default maximum size for a single Thrift message is 16MB; was it larger than that? https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L375 Cheers - Aaron Morton

Re: performance for different kinds of row keys

2012-09-25 Thread aaron morton
Which one will be faster to insert? In general Composite types have the same performance; the extra work is insignificant. (Assuming you don't create a type with 100 components.) And which one will be faster to read by incremental id? If you have to specify the full key to get a row by row

Re: Cassandra compression not working?

2012-09-25 Thread aaron morton
Nothing jumps out. Are you able to reproduce the fault on a test node? There were some schema change problems in the early 1.1.x releases. Did you enable compression via a schema change? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

Re:

2012-09-25 Thread Manu Zhang
The DEFAULT_CACHING_STRATEGY is Caching.KEYS_ONLY, but even configuring the row cache size to be greater than zero won't enable the row cache. Why? On Wed, Sep 26, 2012 at 9:44 AM, Manu Zhang owenzhang1...@gmail.com wrote: I wonder now if get_range_slices call will ever look for data in row cache. I don't

1.1.5 Missing Insert! Strange Problem

2012-09-25 Thread Arya Goudarzi
Hi All, I have a 4 node cluster set up in 2 zones with NetworkTopologyStrategy and strategy options for writing a copy to each zone, so the effective load on each machine is 50%. Symptom: I have a column family that has gc grace seconds of 10 days (the default). On the 17th there was an insert done
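The 50% figure in the message follows from simple arithmetic. The sketch below assumes evenly balanced tokens and one replica per zone, as the message describes; the zone names and variable names are hypothetical.

```python
# Effective ownership per node = total replicas across zones / number of nodes.
# Assumptions: evenly balanced tokens, one replica per zone (zone names are made up).
node_count = 4
replicas_per_zone = {"zone_a": 1, "zone_b": 1}

total_replicas = sum(replicas_per_zone.values())
effective_ownership = total_replicas / node_count

print(f"{effective_ownership:.0%}")  # 50%
```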