Re: how to implement a client with off-heap memory

2012-10-29 Thread aaron morton
The Thrift client is just auto-generated code; if you really wanted to, you may be able to change or override it to modify the SerDe when it pulls things off the wire. Not sure if this does what you are looking for: https://issues.apache.org/jira/browse/CASSANDRA-2478 Cheers -

Re: High bandwidth usage between datacenters for cluster

2012-10-29 Thread aaron morton
Outbound messages for other DCs are grouped and a single instance is sent to a single node in the remote DC. The remote node then forwards the message on to the other recipients in its DC. All remote DC nodes will however reply directly to the coordinator. Normally this isn’t an issue for
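
As a rough illustration of the forwarding scheme described in the message above, the sketch below counts the messages that cross datacenter boundaries for a single write. The function name and its input are hypothetical, and the model only encodes what the message states (one forwarded copy per remote DC, every remote replica replying directly to the coordinator); it is not Cassandra's actual code.

```python
def cross_dc_messages(remote_replicas_per_dc):
    """Count cross-DC messages for one write.

    remote_replicas_per_dc: number of replicas in each remote DC
    (hypothetical input, for illustration only).
    Returns (outbound_messages, replies_to_coordinator).
    """
    outbound = 0
    replies = 0
    for replicas in remote_replicas_per_dc:
        if replicas == 0:
            continue           # no replicas in that DC, nothing is sent
        outbound += 1          # one copy goes to a single node in the remote DC
        replies += replicas    # each remote replica replies directly to the coordinator
    return outbound, replies


# Two remote DCs with 3 replicas each (made-up topology)
print(cross_dc_messages([3, 3]))  # (2, 6)
```

Under these assumptions the outbound traffic stays at one message per remote DC, while the reply traffic grows with the number of remote replicas.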

Re: Roadmap/Changelog?

2012-10-29 Thread aaron morton
For committed changes: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt For interesting changes per release: https://github.com/apache/cassandra/blob/trunk/NEWS.txt For the road map

Re: compression

2012-10-29 Thread Tamar Fraenkel
Hi! Thanks Aaron! Today I restarted Cassandra on that node and ran scrub again, now it is fine. I am worried though that if I decide to change another CF to use compression I will have that issue again. Any clue how to avoid it? Thanks. *Tamar Fraenkel * Senior Software Engineer, TOK Media

Re: Hinted Handoff storage inflation

2012-10-29 Thread aaron morton
With both data centers functional, the test takes just a few minutes to run, with one data center down, 15x the amount of time. Could you provide the numbers? It's easier to get a feel for how the throughput is dropping. Does the latency reported by nodetool cfstats change? I'm also interested

Re: compression

2012-10-29 Thread Alain RODRIGUEZ
#event, data = counter (date format 20121029). 2 - Is it a good idea to compress this kind of data? I am looking at using composite columns. 3 - What are the benefits of using a column name like CompositeType(UTF8Type, UTF8Type) versus a simple UTF8 column with event and date separated by a hash sign (#) as I

CQL3: Unknown property 'comparator'?

2012-10-29 Thread Timmy Turner
Does CQL3 not allow dynamic columns (column names) any more?

Re: CQL3: Unknown property 'comparator'?

2012-10-29 Thread Sylvain Lebresne
CQL3 does absolutely allow dynamic column families, but does it differently from CQL2. See http://www.datastax.com/dev/blog/cql3-for-cassandra-experts. -- Sylvain On Mon, Oct 29, 2012 at 12:34 PM, Timmy Turner timm.t...@gmail.com wrote: Does CQL3 not allow dynamic columns (column names) any

Re: CQL3: Unknown property 'comparator'?

2012-10-29 Thread Timmy Turner
Thank you! That article helps clear up a lot of my confusion about the changes between CQL 2 and 3, since I was wondering how to access/manipulate CompositeType/DynamicCompositeType columns through CQL. So does this mean that in CQL 3 an explicit schema is absolutely mandatory? It's now

Re: ColumnFamilyInputFormat - error when column name is UUID

2012-10-29 Thread Marcelo Elias Del Valle
Answering myself: it seems we can't have any non-type-1 UUIDs in column names. I used the UTF8 comparator and saved my UUIDs as strings, and it worked. 2012/10/29 Marcelo Elias Del Valle mvall...@gmail.com Hello, I am using ColumnFamilyInputFormat the same way it's described in this example:

Re: ColumnFamilyInputFormat - error when column name is UUID

2012-10-29 Thread Andre Tavares
Marcelo, the times I ran into this problem it was usually because the UUID value being handed to Cassandra was not an exact UUID value. To deal with that I relied a lot on UUID.randomUUID() (to generate a valid UUID) and UUID.fromString("081f4500-047e-401c-8c0b-a41fefd099d7") - the latter to
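
Since this thread turns on type 1 versus other UUID types, a quick way to see which type a given UUID is may help. The sketch below uses Python's standard `uuid` module rather than the Java calls mentioned in the thread (`uuid.uuid4()` plays the role of Java's `UUID.randomUUID()`); the helper name `is_type1` is hypothetical.

```python
import uuid


def is_type1(u):
    """Return True if the UUID is time-based (version 1)."""
    return u.version == 1


random_id = uuid.uuid4()   # random UUID (type 4)
time_id = uuid.uuid1()     # time-based UUID (type 1)
# The UUID string quoted in the message above
parsed = uuid.UUID("081f4500-047e-401c-8c0b-a41fefd099d7")

print(random_id.version, time_id.version, parsed.version)  # 4 1 4
print(is_type1(parsed))  # False
```

Note that the UUID quoted in the thread is a valid UUID but a type 4 one, so a valid UUID is not necessarily a type 1 UUID.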

Re: ColumnFamilyInputFormat - error when column name is UUID

2012-10-29 Thread Hiller, Dean
Hmm, this raises the question of which UUID libraries others are using. I know this one generates type 1 UUIDs with two longs, so it is 16 bytes. http://johannburkard.de/software/uuid/ Thanks, Dean From: Marcelo Elias Del Valle mvall...@gmail.com Reply-To:

Re: Simulating a failed node

2012-10-29 Thread Andrew Bialecki
Thanks, extremely helpful. The key bit was I wasn't flushing the old Keyspace before re-running the stress test, so I was stuck at RF = 1 from a previous run despite passing RF = 2 to the stress tool. On Sun, Oct 28, 2012 at 2:49 AM, Peter Schuller peter.schul...@infidyne.com wrote: Operation

Re: ColumnFamilyInputFormat - error when column name is UUID

2012-10-29 Thread Marcelo Elias Del Valle
Dean, Are type 1 UUIDs the best ones to use if I want to avoid conflict? I saw this page: http://en.wikipedia.org/wiki/Universally_unique_identifier The only problem with type 1 UUIDs is they are not opaque? I know there is one kind of UUID that can generate two equal values if you

Re: ColumnFamilyInputFormat - error when column name is UUID

2012-10-29 Thread Marcelo Elias Del Valle
Err... Guess you replied in Portuguese to the list :D 2012/10/29 Andre Tavares andre...@gmail.com Marcelo, the times I ran into this problem it was usually because the UUID value being handed to Cassandra was not an exact UUID value. To deal with that I relied a lot on

Re: Benefits by adding nodes to the cluster

2012-10-29 Thread Andrey Ilinykh
This is how Cassandra scales. More nodes means better performance. Thank you, Andrey On Mon, Oct 29, 2012 at 2:57 PM, Roshan codeva...@gmail.com wrote: Hi All This may be a silly question, but what kind of benefits can we get by adding new nodes to the cluster? Some may be high

RE: Hinted Handoff runs every ten minutes

2012-10-29 Thread Stephen Pierce
I'm running 1.1.5; the bug says it's fixed in 1.0.9/1.1.0. How can I check to see why it keeps running HintedHandoff? Steve -Original Message- From: Brandon Williams [mailto:dri...@gmail.com] Sent: Wednesday, October 24, 2012 4:56 AM To: user@cassandra.apache.org Subject: Re: Hinted

Re: Hinted Handoff runs every ten minutes

2012-10-29 Thread Radim Kolar
On 29.10.2012 23:24, Stephen Pierce wrote: I'm running 1.1.5; the bug says it's fixed in 1.0.9/1.1.0. How can I check to see why it keeps running HintedHandoff? You have tombstones in system.HintsColumnFamily; use the list command in cassandra-cli to check

idea drive layout - 4 drives + RAID question

2012-10-29 Thread Ran User
For a server with 4 drive slots only, I'm thinking: either: - OS (1 drive) - Commit Log (1 drive) - Data (2 drives, software raid 0) vs - OS + Data (3 drives, software raid 0) - Commit Log (1 drive) or something else? also, if I can spare the wasted storage, would RAID 10 for cassandra data

Re: compression

2012-10-29 Thread aaron morton
Any clue how to avoid it? Not really sure what went wrong. Diagnosing that sort of problem usually takes access to the running node and time to poke around and see what it does in response to various things. Rebooting works for Windows 95 and Cassandra is not that different. Cheers

Re: idea drive layout - 4 drives + RAID question

2012-10-29 Thread Timmy Turner
I'm not sure whether the raid 0 gets you anything other than headaches should one of the drives fail. You can already distribute the individual Cassandra column families on different drives by just setting up symlinks to the individual folders. 2012/10/30 Ran User ranuse...@gmail.com: For a

Re: CQL3: Unknown property 'comparator'?

2012-10-29 Thread aaron morton
More background: http://www.datastax.com/dev/blog/thrift-to-cql3 So does this mean that in CQL 3 an explicit schema is absolutely mandatory? Not really, it sort of depends on your view. Let's say this is a schema-free CF definition in the CLI: create column family clicks with

Re: idea drive layout - 4 drives + RAID question

2012-10-29 Thread Ran User
I was hoping to achieve approx. 2x IO (write and read) performance via RAID 0 (by accepting a lower MTBF). Do you believe the performance gains of RAID 0 are much lower and/or are not worth it vs the increased server failure rate? From my understanding, RAID 10 would achieve the read performance
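
To put a rough number on the reliability side of that trade-off: a RAID 0 array is lost as soon as any one of its drives fails. A minimal sketch, assuming independent drives with a constant failure rate (a simplification) and a made-up per-drive MTBF of 1,000,000 hours; the function name is hypothetical.

```python
def raid0_mtbf(drive_mtbf_hours, drives):
    """Approximate MTBF of a RAID 0 array.

    Assumes independent drives with constant failure rates, so the
    failure rates add up and the array MTBF is the drive MTBF / n.
    """
    return drive_mtbf_hours / drives


DRIVE_MTBF = 1_000_000  # hypothetical figure, not taken from any datasheet

print(raid0_mtbf(DRIVE_MTBF, 2))         # 500000.0 (2-drive data array)
print(round(raid0_mtbf(DRIVE_MTBF, 3)))  # 333333   (3-drive OS + data array)
```

Under these assumptions the 3-drive layout from the original question fails about 1.5 times as often as the 2-drive layout, and it takes the OS down with it.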

Re: idea drive layout - 4 drives + RAID question

2012-10-29 Thread Ran User
Have you considered running RAID 10 for the data drives to improve MTBF? On one hand Cassandra is handling redundancy issues, on the other hand, reducing the frequency of dealing with failed nodes is attractive if cheap (switching RAID levels to 10). We have no experience with software RAID

Throughput decreases as latency increases with YCSB

2012-10-29 Thread Peter Bailis
Hi, I'm currently benchmarking Cassandra and have encountered some interesting behavior. As I increase the number of client threads (and connections), latency increases as expected but, at some point, throughput actually decreases. I've seen a few posts about this online, with no clear

Re: Throughput decreases as latency increases with YCSB

2012-10-29 Thread Peter Bailis
I'm using YCSB on EC2 with one m1.large instance to drive client load To add, I don't believe this is due to YCSB. I've done a fair bit of client-side profiling and neither client CPU nor NIC (nor server NIC) is a bottleneck. I'll also add that this dataset fits in memory. Thanks! Peter
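
One way to read such measurements: in a closed-loop benchmark, where each client thread waits for a response before sending its next request, Little's law gives throughput = threads / mean latency. Throughput therefore drops whenever latency grows faster than the thread count. The sketch below only illustrates that arithmetic; the numbers are invented, not taken from this thread, and it assumes one outstanding request per thread.

```python
def closed_loop_throughput(threads, mean_latency_s):
    """Little's law for a closed-loop client: ops/sec = threads / latency."""
    return threads / mean_latency_s


# Hypothetical measurements: (client threads, mean latency in seconds)
samples = [(16, 0.002), (64, 0.008), (128, 0.020)]

for threads, latency in samples:
    ops = closed_loop_throughput(threads, latency)
    print(f"{threads:4d} threads, {latency * 1000:.0f} ms -> {ops:.0f} ops/sec")
```

In the made-up data, going from 64 to 128 threads doubles the concurrency but raises latency 2.5x, so throughput falls even though no single resource looks saturated from the client side.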