Re: New web client future API

2011-06-15 Thread MW | Codefreun.de
Ok, many thanks. I remember a post (I think it was Jonathan) where they wanted to get away from Thrift because of its weak development. Markus ;) -Original Message- From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Wednesday, 15 June 2011 00:05 To:

Re: Cassandra Statistics and Metrics

2011-06-15 Thread Viktor Jevdokimov
http://www.kjkoster.org/zapcat/Zapcat_JMX_Zabbix_Bridge.html 2011/6/14 Marcos Ortiz mlor...@uci.cu Where can I find the source code? On 6/14/2011 10:13 AM, Viktor Jevdokimov wrote: We're using the open source monitoring solution Zabbix from http://www.zabbix.com/ using zapcat - not only

Re: possible 'coming back to life' bug with counters

2011-06-15 Thread Viktor Jevdokimov
What if it is OK for our case and we need counters with TTL? For us, counters and TTL are both important. After a column has expired, it is not important what value the counter will have. Scanning millions of rows just to delete expired ones is not a solution. 2011/6/14 Sylvain Lebresne sylv...@datastax.com

Re: one way to make counter delete work better

2011-06-15 Thread Yang
patch in https://issues.apache.org/jira/browse/CASSANDRA-2774 Some of the coding is messy and intended for demonstration only; we could refine it after we agree this is a feasible way to go. Thanks Yang On Tue, Jun 14, 2011 at 11:21 AM, Sylvain

last record rowId

2011-06-15 Thread karim abbouh
In my Java application, when we insert we always need to know the last rowId in order to insert the new record at rowId+1, so we have to save this rowId in a file. Is there another way to know the last record's rowId? Thanks, B.R

Re: Where is my data?

2011-06-15 Thread Sylvain Lebresne
You can use the thrift call describe_ring(). It returns a map that associates each range of the ring with its replicas. Once any range has all its endpoints unavailable, that range of the data is unavailable. -- Sylvain On Tue, Jun 14, 2011 at 11:33 PM, AJ a...@dude.podzone.net wrote: Is
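As a rough illustration of the rule described above, the sketch below works on plain Python data rather than a live Thrift connection. The ring layout, the addresses, and the helper name `unavailable_ranges` are made up for the example; a real client would have to build the same structure from whatever describe_ring() reports.

```python
def unavailable_ranges(ring, down_nodes):
    """Return the token ranges whose replicas are all in down_nodes.

    ring: list of (start_token, end_token, endpoints) tuples, a
          hypothetical stand-in for what describe_ring() reports.
    down_nodes: set of endpoint addresses currently unreachable.
    """
    down = set(down_nodes)
    return [
        (start, end)
        for start, end, endpoints in ring
        if all(endpoint in down for endpoint in endpoints)
    ]


# Hypothetical 3-node ring with two replicas per range.
ring = [
    ("0", "100", ["10.0.0.1", "10.0.0.2"]),
    ("100", "200", ["10.0.0.2", "10.0.0.3"]),
    ("200", "0", ["10.0.0.3", "10.0.0.1"]),
]

# One node down: every range still has a live replica.
print(unavailable_ranges(ring, {"10.0.0.2"}))
# Both replicas of the first range down: that range is unavailable.
print(unavailable_ranges(ring, {"10.0.0.1", "10.0.0.2"}))
```

With a single node down nothing is lost, because each range keeps one live replica; only when every endpoint of a range is down does that range drop out.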

Re: possible 'coming back to life' bug with counters

2011-06-15 Thread Sylvain Lebresne
Let me point out that the current thread is about counter removal, not about counter TTL. Counter expiration has other problems, so that even if you do not care about incrementing a counter again after it expires, it will still not work for you (please look at the discussion on

Cassandra DC Upcoming Meetup

2011-06-15 Thread Chris Burroughs
Cassandra DC's first meetup of the pizza and talks variety will be on July 6th. There will be an introductory sort of presentation and a totally cool one on Pig integration. If you are in the DC area it would be great to see you there. http://www.meetup.com/Cassandra-DC-Meetup/events/22145481/

Re: New web client future API

2011-06-15 Thread AJ
Nice interface... and someone has good taste in music. BTW, I'm new to web programming, what did you use for the web components? JSF, JavaScript, something else? On 6/14/2011 7:42 AM, Markus Wiesenbacher | Codefreun.de wrote: Hi, what is the future API for Cassandra? Thrift, Avro, CQL? I

Re: New web client future API

2011-06-15 Thread Holger Hoffstaette
On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote: Avro is dead. Just so that this is not misunderstood: for Cassandra. Avro itself (and -ipc) is far from dead. -h

Re: last record rowId

2011-06-15 Thread Utku Can Topçu
As far as I can tell, this functionality doesn't exist. However you can use such a method to insert the rowId into another column within a separate row, and request the latest column. I think this would work for you. However every insert would need a get request, which I think would be

Re: New web client future API

2011-06-15 Thread Jeremy Hanna
Yes - avro is alive and well. Avro as an RPC alternative for Cassandra is dead. See reasoning here: http://goo.gl/urENc On Jun 15, 2011, at 8:28 AM, Holger Hoffstaette wrote: On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote: Avro is dead. Just so that this is not misunderstood:

Re: Where is my data?

2011-06-15 Thread AJ
Thanks On 6/15/2011 3:20 AM, Sylvain Lebresne wrote: You can use the thrift call describe_ring(). It returns a map that associates each range of the ring with its replicas. Once any range has all its endpoints unavailable, that range of the data is unavailable. -- Sylvain

Re: New web client future API

2011-06-15 Thread Eric Evans
On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote: Actually from what I understood (please correct me if I am wrong) CQL is based on Thrift / Avro. In this project, we tend to use the word Thrift as a sort of shorthand for Cassandra's RPC interface, and not, The serialization and RPC

Re: New web client future API

2011-06-15 Thread Markus Wiesenbacher | Codefreun.de
I am using a JavaScript framework, Sencha ExtJS. The format between UI and servlets is JSON. Thanks for your response and for agreeing with my taste in music ;) On 15.06.2011 at 15:48, AJ a...@dude.podzone.net wrote: Nice interface... and someone has good taste in music. BTW, I'm new to web

Re: New web client future API

2011-06-15 Thread Victor Kabdebon
Ok thanks for the update. I thought the query string was translated to Thrift, then sent to a server. Victor Kabdebon 2011/6/15 Eric Evans eev...@rackspace.com On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote: Actually from what I understood (please correct me if I am wrong) CQL is

Atomicity of batch updates

2011-06-15 Thread Artem Orobets
Hi, the wiki says that a write operation is atomic within a ColumnFamily (http://wiki.apache.org/cassandra/ArchitectureOverview, chapter "write properties"). If I use a batch update for a single CF and get an exception in the last mutation operation, does it mean that all previous operations will be reverted? If

Re: New web client future API

2011-06-15 Thread Jeffrey Kesselman
Correct me if I'm wrong, but AFAIK Hector is the only higher level API I would consider 'complete' right now, with support for things like fail-over. I notice in the latest Hector build he is starting to add CQL support, so that's what I'm sticking with. When he has CQL support done I'll decide

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Shotaro Kamio
We've encountered a situation where compacted sstable files aren't deleted after node repair. Even when gc is triggered via jmx, it sometimes leaves compacted files. In one case, a lot of files were left. Some files have already stayed for more than 10 hours. There is no guarantee that gc will clean up all

sstable2json2sstable bug with json data stored

2011-06-15 Thread Timo Nentwig
Hi! Couldn't google anybody having yet experienced this, so I do (0.8): { foo:{ foo:{ foo:bar, foo:bar, foo:bar, foo:, foo:bar, foo:bar, id:123456 } }, foo:null } (json can likely be boiled down even more...) [default@foo] set

Re: cascading failures due to memory

2011-06-15 Thread AJ
Sasha, Did you ever nail down the cause of this problem? On 5/31/2011 4:01 AM, Sasha Dolgy wrote: hi everyone, the current nodes i have deployed (4) have all been working fine, with not a lot of data ... more reads than writes at the moment. as i had monitoring disabled, when one node's OS

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
Even if the gc call cleaned all files, it is not really acceptable on a decent sized cluster due to the impact full gc has on performance. Especially non-needed ones. The delay in file deletion can also at times make it hard to see how much spare disk you actually have. We easily see 100%

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Even if the gc call cleaned all files, it is not really acceptable on a decent sized cluster due to the impact full gc has on performance. Especially non-needed ones. Not acceptable as running GC on every

What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen
Hi, I was looking quickly at the source code tonight. As far as I could see from a quick code scan, hint delivery is only triggered by a state change, when a node goes from the down state to the up state? If this is indeed the case, it would potentially explain why we sometimes have hints on machines which

Re: cascading failures due to memory

2011-06-15 Thread Sasha Dolgy
No. Upgraded to 0.8 and monitor the systems more. we schedule a repair every 24hrs via cron and so far no problems.. On Jun 15, 2011 5:44 PM, AJ a...@dude.podzone.net wrote: Sasha, Did you ever nail down the cause of this problem? On 5/31/2011 4:01 AM, Sasha Dolgy wrote: hi everyone, the

Re: Forcing Cassandra to free up some space

2011-06-15 Thread AJ
With regard to cleaning up old sstable files, I posed this question before as I noticed after taking a snapshot, the older files (pre-compaction) shared no links with the snapshots. Therefore, (if the Cass snapshot functionality is working correctly) those older files can be manually deleted.

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open to address this: https://issues.apache.org/jira/browse/CASSANDRA-1974 -ryan On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Even if the gc call

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Peter Schuller
Even if the gc call cleaned all files, it is not really acceptable on a decent sized cluster due to the impact full gc has on performance. Especially non-needed ones. You can run with -XX:+ExplicitGCInvokesConcurrent to safely trigger CMS cycles. However that also means System.gc() semantics

Re: last record rowId

2011-06-15 Thread Jonathan Ellis
You're better served using UUIDs than numeric row IDs for surrogate keys. (Of course natural keys work fine too.) On Wed, Jun 15, 2011 at 9:16 AM, Utku Can Topçu u...@topcu.gen.tr wrote: As far as I can tell, this functionality doesn't exist. However you can use such a method to insert the
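To make the UUID suggestion concrete, here is a minimal sketch using Python's standard `uuid` module. The helper name `new_row_key` is invented for the example, and the choice of a random (version 4) UUID is an assumption; the point is only that no "last rowId" has to be stored or looked up.

```python
import uuid


def new_row_key():
    """Return a fresh row key without consulting any shared counter."""
    return str(uuid.uuid4())


# Two writers can each mint keys independently; no file holding the
# last rowId is needed.
key_a = new_row_key()
key_b = new_row_key()
print(key_a, key_b)
```

The trade-off is that such keys carry no insertion order, so "the last record" has to be tracked some other way if it is needed.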

Re: What triggers hint delivery?

2011-06-15 Thread Jonathan Ellis
On Wed, Jun 15, 2011 at 10:53 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: I was looking quickly at the source code tonight. As far as I could see from a quick code scan, hint delivery is only triggered by a state change, when a node goes from the down state to the up state? Right. If this is

useful little way to run locally with (pig|hive) cassandra

2011-06-15 Thread Jeremy Hanna
We started doing this recently and thought it might be useful to others. Pig (and Hive) have a sample function that allows you to sample data from your data store. In pig it looks something like this: mysample = SAMPLE myrelation 0.01; One possible use for this, with pig and cassandra is to

Force a node to form part of quorum

2011-06-15 Thread A J
Is there a way to favor a node to always participate (or never participate) towards fulfillment of read consistency as well as write consistency ? Thanks AJ

Re: Docs: Token Selection

2011-06-15 Thread Vijay
The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1 (which will cause uneven distribution of data across the nodes). It is easier to think of each DC as a ring, split it equally, and interleave them together DC1 Node 1 : token 0 DC1 Node 2 :

Re: Docs: Token Selection

2011-06-15 Thread Vijay
Correction: "The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1" should be "The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest which is 4-16, 0-0 (75%)" Regards, /VJ On Wed, Jun
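One possible reading of "split equally and interleave", sketched on the toy 0-16 ring used in this thread. The formula, the function name `interleaved_tokens`, and the assumption that every DC has the same number of nodes are this note's own and are not taken from the truncated message.

```python
def interleaved_tokens(ring_size, num_dcs, nodes_per_dc):
    """Assign tokens so each DC splits the ring evenly and DCs are offset.

    Returns {dc_index: [token, ...]} with dc_index starting at 1.
    Assumes every DC has the same number of nodes.
    """
    step = ring_size // nodes_per_dc                 # spacing inside one DC
    offset = ring_size // (nodes_per_dc * num_dcs)   # shift between DCs
    return {
        dc + 1: [(node * step + dc * offset) % ring_size
                 for node in range(nodes_per_dc)]
        for dc in range(num_dcs)
    }


# Toy ring of size 16, two DCs with two nodes each.
print(interleaved_tokens(ring_size=16, num_dcs=2, nodes_per_dc=2))
```

Within each DC the nodes sit half a ring apart, so each owns 50% of that DC's data, which avoids the 25%/75% split described in the correction. With RandomPartitioner the ring size would be 2**127 rather than 16.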

prep for cassandra storage from pig

2011-06-15 Thread William Oberman
I think I'm stuck on typing issues trying to store data in cassandra. To verify, cassandra wants (key, {tuples}) My pig script is fairly brief: raw = LOAD 'cassandra://test_in/test_cf' USING CassandraStorage() AS (key:chararray, columns:bag {column:tuple (name, value)}); --columns == timeUUID -

Re: Atomicity of batch updates

2011-06-15 Thread chovatia jaydeep
A Cassandra write operation is atomic across all the columns/super columns for a given row key in a Column Family. So in your case not all previous operations (assuming each operation was on a separate key) will be reverted. Thank you, Jaydeep From: Artem Orobets
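The answer above can be pictured with a small model in plain Python. This is not Cassandra code; the tuple layout and the helper name `atomic_units` are assumptions made for illustration. It groups a batch by row key to show which mutations succeed or fail together.

```python
from collections import defaultdict


def atomic_units(mutations):
    """Group (row_key, column, value) mutations by row key.

    Each group models a unit that is applied atomically; nothing ties
    different groups together.
    """
    groups = defaultdict(list)
    for row_key, column, value in mutations:
        groups[row_key].append((column, value))
    return dict(groups)


batch = [
    ("user1", "name", "Ann"),
    ("user1", "email", "ann@example.com"),
    ("user2", "name", "Bob"),
]
print(atomic_units(batch))
```

In this model a failure while writing `user2` says nothing about `user1`: the two columns for `user1` form their own unit and would not be rolled back.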

upgrading from cassandra 0.7.3 to 0.8.0

2011-06-15 Thread Anurag Gujral
Hi All, I had a cassandra node which was running on cassandra 0.7.3. Without changing the data directories I installed cassandra 0.8.0 but when I query data I get timeouts. Can someone please guide me on how to go about upgrading from cassandra 0.7.3 to cassandra 0.8.0? Thanks Anurag

Re: useful little way to run locally with (pig|hive) cassandra

2011-06-15 Thread Jeremy Hanna
Cool - thanks Dmitriy! On Jun 15, 2011, at 12:54 PM, Dmitriy Ryaboy wrote: Another tip: If you parametrize your load statements, it becomes easy to switch between loading from something like Cassandra, and reading from HDFS or local fs directly. Also: Try using Pig's illustrate command

Re: When does it make sense to use TimeUUID?

2011-06-15 Thread chovatia jaydeep
Hi Sameer, One example is to store all the tweets for a given user in a Column Family, where the row key is the user name/user id and the column name is of TimeUUID type and represents the tweet's arrival time. Users would generally like to see the tweets sorted by arrival time. So TimeUUID will help
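The tweet example can be sketched with Python's standard `uuid` module. The helper `time_uuid` is hypothetical and builds version-1 UUIDs from explicit timestamps so the result is deterministic; in practice `uuid.uuid1()` would generate them from the clock. The dictionary stands in for one user's row and is not a Cassandra API.

```python
import uuid


def time_uuid(ticks, clock_seq=0, node=0):
    """Build a version-1 UUID from a 60-bit timestamp (hypothetical helper).

    ticks counts 100 ns intervals, as in the version-1 UUID layout.
    """
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000      # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# One user's row: column name (time UUID) -> tweet text, inserted out of order.
tweets = {
    time_uuid(3000): "third",
    time_uuid(1000): "first",
    time_uuid(2000): "second",
}

# Ordering on the embedded timestamp recovers arrival order.
timeline = [tweets[name] for name in sorted(tweets, key=lambda u: u.time)]
print(timeline)
```

Note that the sort key is the timestamp embedded in the UUID, not its raw bytes; the low bits of the timestamp come first in the byte layout, so a plain byte comparison would not give time order.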

Re: upgrading from cassandra 0.7.3 to 0.8.0

2011-06-15 Thread Jonathan Ellis
Are there exceptions in the Cassandra log? On Wed, Jun 15, 2011 at 1:54 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Hi All, I had a cassandra node which was running on cassandra 0.7.3. Without changing the data directories I installed cassandra 0.8.0 but when I query data I get

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna
Hi Will, That's partly why I like to use FromCassandraBag and ToCassandraBag from pygmalion - it does the work for you to get it back into a form that cassandra understands. Others may know better how to massage the data into that form using just pig, but if all else fails, you could write a

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman
My problem is the column names are dynamic (a date), and pygmalion seems to want the column names to be fixed at compile time (the script). On Wed, Jun 15, 2011 at 3:04 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: Hi Will, That's partly why I like to use FromCassandraBag and

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman
I'll do a reply all, to keep this more consistent (sorry!). Rather than staying stuck, I wrote a custom function: TupleToBagOfTuple. I'm curious if I could have avoided it with proper pig scripting though. On Wed, Jun 15, 2011 at 3:08 PM, William Oberman ober...@civicscience.com wrote: My

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna
Yeah - for completely dynamic column names, then yeah - From/To Cassandra Bag doesn't handle that. It does handle prefixed names though - like link* will get a bag of all the columns that start with link. But sounds like you are doing what I would have to do if I got into a nested data

Re: Multi data center configuration - A question on read correction

2011-06-15 Thread Selva Kumar
Thanks Jonathan. Can we turn off RR by setting READ_REPAIR_CHANCE = 0? Please advise. Selva From: Jonathan Ellis jbel...@gmail.com To: user@cassandra.apache.org Sent: Tue, June 14, 2011 8:59:41 PM Subject: Re: Multi data center configuration - A question on read

Re: Docs: Token Selection

2011-06-15 Thread Vijay
All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for the given key... find the node in each region independently (If you use NTS and if you set the strategy options which says you want to replicate to the

Re: Docs: Token Selection

2011-06-15 Thread AJ
Vijay, thank you for your thoughtful reply. Will Cass complain if I don't set up my tokens like in the examples? On 6/15/2011 2:41 PM, Vijay wrote: All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for

Slowdowns during repair

2011-06-15 Thread Aurynn Shaw
Hey all; So, we have Cassandra running on a 5-server ring, with a RF of 3, and we're regularly seeing major slowdowns in read and write performance while running nodetool repair, as well as the occasional Cassandra crash during the repair window - slowdowns past 10 seconds to perform a single

Easy way to overload a single node on purpose?

2011-06-15 Thread Suan Aik Yeo
Here's a weird one... what's the best way to get a Cassandra node into a half-crashed state? We have a 3-node cluster running 0.7.5. A few days ago this happened organically to node1 - the partition the commitlog was on was 100% full and there was a "No space left on device" error, and after a

Re: Is there a way from a running Cassandra node to determine whether or not itself is up?

2011-06-15 Thread Suan Aik Yeo
Thanks, Aaron, but we determined that adding Java into the equation just brings in too much complexity for something that's called out of an Nginx Perl module. Right now I'm having trouble even replicating the above scenario and posted a question here:

Re: Docs: Token Selection

2011-06-15 Thread Vijay
No, it won't; it will assume you are doing the right thing... Regards, /VJ On Wed, Jun 15, 2011 at 2:34 PM, AJ a...@dude.podzone.net wrote: Vijay, thank you for your thoughtful reply. Will Cass complain if I don't set up my tokens like in the examples? On 6/15/2011 2:41 PM, Vijay

Re: What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen
I suspect a few possibilities: 1. I have not checked, but what happens (in terms of hint delivery) if a node tries to write something but the write times out even if the node is marked as up? 2. I would assume there can be ever so slight variations in how different nodes in the cluster think the

Re: What triggers hint delivery?

2011-06-15 Thread Jonathan Ellis
You're right, those could all cause what you are seeing. We used to have a re-check hourly scheduled task, but took it out because it was very very performance intensive -- at the time, hints were not stored by machine so asking "does machine X have any hints?" required scanning all hints. Should

downgrading from cassandra 0.8 to 0.7.3

2011-06-15 Thread Anurag Gujral
Hi All, I moved to cassandra 0.8.0 from cassandra-0.7.3. When I try to move back I get the following error: java.lang.RuntimeException: Can't open sstables from the future! Current version f, found file: /data/cassandra/data/system/Schema-g-9. Please suggest. Thanks Anurag

Re: downgrading from cassandra 0.8 to 0.7.3

2011-06-15 Thread Terje Marthinussen
Can't help you with that. You may have to go the json2sstable route and re-import into 0.7.3. But... why would you want to go back to 0.7.3? Terje On Thu, Jun 16, 2011 at 10:30 AM, Anurag Gujral anurag.guj...@gmail.com wrote: Hi All, I moved to cassandra 0.8.0 from cassandra-0.7.3

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
Watching this on a node here right now and it sort of shows how bad this can get. This node still has 109GB free disk by the way... INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space INFO [CompactionExecutor:5] 2011-06-16

Re: Docs: Token Selection

2011-06-15 Thread AJ
Ok. I understand the reasoning you laid out. But, I think it should be documented more thoroughly. I was trying to get an idea as to how flexible Cass lets you be with the various combinations of strategies, snitches, token ranges, etc.. It would be instructional to see what a graphical

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Jeffrey Kesselman
The GC cleanup approach, if depending on specific objects being GCd, is fundamentally flawed. I brought this up earlier, won't restart that thread. It should be in the archives. On Wed, Jun 15, 2011 at 10:17 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: Watching this on a node here

Re: Is there a way from a running Cassandra node to determine whether or not itself is up?

2011-06-15 Thread Jake Luciani
To force a node down you can use nodetool disablegossip On Wed, Jun 15, 2011 at 6:42 PM, Suan Aik Yeo yeosuan...@gmail.com wrote: Thanks, Aaron, but we determined that adding Java into the equation just brings in too much complexity for something that's called out of an Nginx Perl module.

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open for this: https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if you think it's important. -ryan On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman jef...@gmail.com wrote: The GC cleanup approach, if depending on specific objects being GCd, is fundamentally

Re: Docs: Token Selection

2011-06-15 Thread Vijay
+1 for more documentation (I guess contributions are always welcomed) I will try to write it down sometime when we have a bit more time... 0.8 nodetool ring command adds the DC and RAC information http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers

Re: What's the best approach to search in Cassandra

2011-06-15 Thread Mark Kerzner
Jake, *You need to maintain a huge number of distinct indexes.* Are we talking about secondary indexes? If yes, this sounds like exactly my problem. There is so little documentation! - but I think that if I read all there is on GitHub, I can probably start using it. Thank you, Mark On

Re: What's the best approach to search in Cassandra

2011-06-15 Thread Sasha Dolgy
Datastax has pretty sufficient documentation on their site for secondary indexes. On Jun 16, 2011 6:57 AM, Mark Kerzner markkerz...@gmail.com wrote: Jake, *You need to maintain a huge number of distinct indexes.* * * *Are we talking about secondary indexes? If yes, this sounds like exactly