Re: Map return for multiget_slice() query
I ran into this problem in Python because dicts aren't ordered in Python. Not sure if that applies here. --Joe

On Jan 12, 2010, at 2:22 AM, Richard Grossman wrote: Hi, I have a simple CF like this: <ColumnFamily CompareWith="BytesType" Name="channelShow" FlushPeriodInMinutes="150"/> When I make a query via multiget_slice() I expect to get the data back ordered by the key list that I pass, but the return doesn't follow any order, not even the natural key order. For example, I call it like this: COLUM_PARENT_CHANNEL_SHOW = new ColumnParent(channelShow, null); PREDICATE_CHANNEL_SHOW = new SlicePredicate(null, new SliceRange(new byte[0], new byte[0], false, 30)); client.multiget_slice(Keyspace1, keys, COLUM_PARENT_CHANNEL_SHOW, PREDICATE_CHANNEL_SHOW, ConsistencyLevel.QUORUM); So if keys = {5, 8, 12, 1, 3, 21}, how do I get the results in that order? Thanks, Richard
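A minimal sketch of the usual client-side workaround, using the Thrift-generated Java types from Richard's snippet: multiget_slice returns a Map keyed by row key, so re-walking the original key list restores the requested order. The helper name inRequestOrder is illustrative, not part of any API.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    // Restore the caller's key order after multiget_slice: the server returns
    // an unordered Map<rowKey, columns>, so re-walk the request's key list.
    public static List<List<ColumnOrSuperColumn>> inRequestOrder(
            List<String> keys,
            Map<String, List<ColumnOrSuperColumn>> unordered) {
        List<List<ColumnOrSuperColumn>> ordered =
                new ArrayList<List<ColumnOrSuperColumn>>(keys.size());
        for (String key : keys) {
            ordered.add(unordered.get(key)); // null entry if the row was missing
        }
        return ordered;
    }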
Re: Data Model Index Text
On the topic of Lucandra: apart from having it work with Cassandra 0.5, has any work been done to bring it up to date with Lucene 2.9/3.0? Also, I'm a bit concerned about its use of OrderPreservingPartitioner; is there a storage architecture that could be considered that would work with RandomPartitioner? Ryan

On Tue, Jan 12, 2010 at 12:20 PM, ML_Seda sonnyh...@gmail.com wrote: I do see the classes now, but all the way back in version .20. Is there a newer version of Lucandra? It would be nice for us to use the latest Cassandra (trunk).
Re: Cassandra and TTL
Just to speak up here, I think it's a more common use-case than you're imagining, even if maybe there's no reasonable way of implementing it. I for one have plenty of use for a TTL on a key, though in my case the TTL would be in days/weeks. Alternatively, I know it's considered wrong, but having a way of getting all unique keys + timestamps from a RandomPartitioner would allow me to do manual scavenging of my own. sstable2json is perhaps not appropriate because it includes replicated data.

On Tue, Jan 12, 2010 at 11:56 AM, Jonathan Ellis jbel...@gmail.com wrote: I'm skeptical that this is a common use-case... If truncating old sstables entirely (https://issues.apache.org/jira/browse/CASSANDRA-531) meets your needs, that is going to be less work and more performant. -Jonathan

On Tue, Jan 12, 2010 at 10:45 AM, Sylvain Lebresne sylv...@yakaz.com wrote: Hello, I have to deal with a lot of different data and Cassandra seems to be a good fit for my needs so far. However, some of this data is volatile by nature, and for it I would need something akin to a TTL. The TTLs could be long, but keeping the data forever would be useless. I could deal with this by hand, writing some daemon that runs regularly and removes what should be removed, but that is neither particularly efficient nor convenient, and I would find it really cool to be able to provide a TTL when inserting something and not have to care beyond that. Which leads me to my question: why doesn't Cassandra allow setting a TTL on data? Is it for a technical reason? A philosophical reason? Or has nobody needed it badly enough to write it? From what I understand of how Cassandra works, it seems to me that it could be done pretty efficiently (even though I agree it wouldn't be a minor change). That is, it would require adding a ttl to columns (and/or rows). When reading a column whose timestamp + ttl has expired, it would be ignored (as with tombstoned columns). Then during compaction, expired columns would be collected. Are there any major difficulties/obstacles I don't see? Or maybe is there some trick I don't know about that already allows such a thing? And if not, would this be something that interests the Cassandra community? Or does nobody ever need such a thing? (I personally believe it to be a desirable feature, but maybe I am the only one.) Thanks, Sylvain
Re: easy interface to Cassandra
On Wed, 13 Jan 2010 08:05:45 +1300, Michael Koziarski mich...@koziarski.com wrote:
> I see no value in pushing for ports of a Perl library to other languages instead of allowing each to grow its own idiomatic one.
MK> That's definitely the way to go; the Easy.pm magic strings look a little like line noise to me (a non-perler).

Thanks for the feedback. I'll keep EasyCassandra Perl-only. Ted
Re: Can fix corrupt file? (Compaction step)
What is your CF definition in your config file?

On Sun, Jan 10, 2010 at 7:59 PM, JKnight JKnight beukni...@gmail.com wrote: The attachment contains data that raises an error in the compaction step. Could you help me to detect the problem?
Re: Data Model Index Text
I'm assuming I have to run the Thrift gen-java from the Cassandra 0.4 release. Is there any documentation or tutorial on how to get that up and running? I've checked both cassandra and lucandra out into Eclipse, but the lucandra project is still unable to resolve some classes. Is this because I need to generate the Java client classes? Thanks.

Jake Luciani wrote: It should work, but not a ton has changed in 2.9/3.0 AFAIK. I'm going to work on updating Lucandra to work with the 0.5 branch; I can try to update this as well. BTW, if you want to see Lucandra in action, check out http://flocking.me (example: http://flocking.me/tjake ). You can use a random partitioner if you store the entire index under a supercolumn (how it was originally implemented), but then you need to accept that the entire index will be in memory for any operation on that index (bad for big indexes). -Jake

On Wed, Jan 13, 2010 at 9:14 AM, Ryan Daum r...@thimbleware.com wrote: On the topic of Lucandra: apart from having it work with Cassandra 0.5, has any work been done to bring it up to date with Lucene 2.9/3.0? Also, I'm a bit concerned about its use of OrderPreservingPartitioner; is there a storage architecture that could be considered that would work with RandomPartitioner? Ryan

On Tue, Jan 12, 2010 at 12:20 PM, ML_Seda sonnyh...@gmail.com wrote: I do see the classes now, but all the way back in version .20. Is there a newer version of Lucandra? It would be nice for us to use the latest Cassandra (trunk).
Re: How to UUID in .Net
Thanks Jonathan, I will try to port it to C#.

On Sat, Jan 9, 2010 at 7:47 AM, Jonathan Ellis jbel...@gmail.com wrote: I didn't see any C# libraries that generate type 1 UUIDs. You might have to port this one from Java: http://johannburkard.de/software/uuid/

2010/1/8 Nguyễn Minh Kha nminh...@gmail.com: Hi, I'm working with Cassandra in .Net (C#) but I have a problem generating a UUID for my project. I used Guid to generate a version 1 UUID, but when I add it to Cassandra it throws the exception "TimeUUID only makes sense with version 1 UUIDs". I used uuidgen.exe (Windows SDK) to generate this Guid. Please help me resolve this problem. Thanks -- Nguyen Minh Kha NCT Corporation Email : kh...@nct.vn Mobile : 090 696 1314 Y!M : iminhkha
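For anyone porting, a minimal sketch of the RFC 4122 version-1 bit layout, shown in Java (the language of the library Jonathan points to). The method name makeTimeUuid is illustrative; a real generator must also supply a stable node id (a MAC address, or a random value with the multicast bit set) and a persistent clock sequence, which this sketch simply takes as parameters.

    import java.util.UUID;

    // Assemble a version-1 (time-based) UUID from its raw parts.
    // timeMillis: wall clock; clockSeq: 14-bit sequence; node: 48-bit node id.
    public static UUID makeTimeUuid(long timeMillis, int clockSeq, long node) {
        // 100-nanosecond intervals since the UUID epoch, 1582-10-15 00:00 UTC.
        long ts = timeMillis * 10000L + 0x01B21DD213814000L;

        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi  = (ts >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 = version 1

        long lsb = 0x8000000000000000L                 // RFC 4122 variant bits
                 | ((long) (clockSeq & 0x3FFF) << 48)  // 14-bit clock sequence
                 | (node & 0xFFFFFFFFFFFFL);           // 48-bit node
        return new UUID(msb, lsb);
    }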
Re: Data Model Index Text
On Wed, Jan 13, 2010 at 10:04 AM, ML_Seda sonnyh...@gmail.com wrote: I'm assuming I have to run the thrift gen-java from cassandra .4 release. Is there any documentation or tutorial on how to get that up and running? No, cassandra includes a copy of the thrift Java classes. You don't need to mess w/ the thrift compiler. -Jonathan
Re: Data Model Index Text
You should be using the ant file to build Lucandra; see the README. For Eclipse you need to add lucandra/gen-java to the src path (this contains the Thrift stubs). -Jake

On Wed, Jan 13, 2010 at 11:04 AM, ML_Seda sonnyh...@gmail.com wrote: I'm assuming I have to run the Thrift gen-java from the Cassandra 0.4 release. Is there any documentation or tutorial on how to get that up and running? I've checked both cassandra and lucandra out into Eclipse, but the lucandra project is still unable to resolve some classes. Is this because I need to generate the Java client classes? Thanks.

Jake Luciani wrote: It should work, but not a ton has changed in 2.9/3.0 AFAIK. I'm going to work on updating Lucandra to work with the 0.5 branch; I can try to update this as well. BTW, if you want to see Lucandra in action, check out http://flocking.me (example: http://flocking.me/tjake ). You can use a random partitioner if you store the entire index under a supercolumn (how it was originally implemented), but then you need to accept that the entire index will be in memory for any operation on that index (bad for big indexes). -Jake

On Wed, Jan 13, 2010 at 9:14 AM, Ryan Daum r...@thimbleware.com wrote: On the topic of Lucandra: apart from having it work with Cassandra 0.5, has any work been done to bring it up to date with Lucene 2.9/3.0? Also, I'm a bit concerned about its use of OrderPreservingPartitioner; is there a storage architecture that could be considered that would work with RandomPartitioner? Ryan

On Tue, Jan 12, 2010 at 12:20 PM, ML_Seda sonnyh...@gmail.com wrote: I do see the classes now, but all the way back in version .20. Is there a newer version of Lucandra? It would be nice for us to use the latest Cassandra (trunk).
Re: Data Model Index Text
Ah, yes. I made this change locally once. Let me try to find it.

On Wed, Jan 13, 2010 at 10:43 AM, Ryan Daum r...@thimbleware.com wrote: The only tricky point I saw with the Lucene 3.0 switch was that the TokenStream API changed completely, and the IndexWriter in your code depended on the old API. I've ruled out OrderPreservingPartitioner for other jobs of mine because the distribution of keys is likely not ideal across my cluster. I'm curious whether with Lucandra the keys truly distribute well? R

On Wed, Jan 13, 2010 at 10:26 AM, Jake Luciani jak...@gmail.com wrote: It should work, but not a ton has changed in 2.9/3.0 AFAIK. I'm going to work on updating Lucandra to work with the 0.5 branch; I can try to update this as well. BTW, if you want to see Lucandra in action, check out http://flocking.me (example: http://flocking.me/tjake ). You can use a random partitioner if you store the entire index under a supercolumn (how it was originally implemented), but then you need to accept that the entire index will be in memory for any operation on that index (bad for big indexes). -Jake

On Wed, Jan 13, 2010 at 9:14 AM, Ryan Daum r...@thimbleware.com wrote: On the topic of Lucandra: apart from having it work with Cassandra 0.5, has any work been done to bring it up to date with Lucene 2.9/3.0? Also, I'm a bit concerned about its use of OrderPreservingPartitioner; is there a storage architecture that could be considered that would work with RandomPartitioner? Ryan

On Tue, Jan 12, 2010 at 12:20 PM, ML_Seda sonnyh...@gmail.com wrote: I do see the classes now, but all the way back in version .20. Is there a newer version of Lucandra? It would be nice for us to use the latest Cassandra (trunk).
Tuning and upgrades
Hi, So after several days of closer examination, I've discovered something: EC2 I/O performance is pretty bad. Well okay, we already all knew that, and I have no choice but to deal with it, as moving at this time is not an option. But what I've really discovered is that my data is unevenly distributed, which I believe is a result of using random partitioning without specifying tokens. So what I think I can do to solve this is upgrade to 0.5.0rc3, add more instances, and use the tools to modify token ranges. Toward that end I had a few questions on different topics.

Data gathering: When I run cfstats I get something like this: Keyspace: Read Count: 39287 Read Latency: 14.588 ms. Write Count: 13930 Write Latency: 0.062 ms. on a heavily loaded node, and Keyspace: Read Count: 8672 Read Latency: 1.072 ms. Write Count: 2126 Write Latency: 0.000 ms. on a lightly loaded node. My question is: what is the timeframe of the counts? Does a read count of 8K say that 8K reads are currently in progress, 8K since the last time I checked, or 8K over some interval?

Data striping: One option I have is to add additional EBS volumes, then either turn on raid0 across several of them or possibly just add additional DataFileDirectory elements to my config. If I were to add DataFileDirectory entries, can I just move sstables between directories? If so, I assume I want the Index, Filter and Data files to be in the same directory? Or is this data movement something Cassandra will do for me? Also, is this likely to help?

Upgrades: I understand that to upgrade from 0.4.x to 0.5.x I need to do something like: 1. turn off all writes to a node 2. call 'nodeprobe flush' on that node 3. restart the node with version 0.5.x Is this correct?

Data repartitioning: So it seems that if I first upgrade my current nodes to 0.5.0, then bring up some new nodes with AutoBootstrap on, they should take some data from the most loaded machines? But let's say I just want to first even out the load on existing nodes; would the process be something like: 1. calculate ideal key ranges (i.e., i * (2**127 / N) for i=1..N) (this seems like the ideal candidate for a new tool included with Cassandra; see the sketch after this message) 2. foreach node, 'nodeprobe move' to the ideal range 3. foreach node, 'nodeprobe clean' Alternatively, it looks like I might be able to use 'nodeprobe loadbalance' for step 2 and skip step 1?

Also, anyone else running in EC2 and have any sort of tuning tips? Thanks, -Anthony -- Anthony Molinaro antho...@alumni.caltech.edu
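A minimal sketch of the token calculation in step 1 above, assuming evenly spaced tokens over RandomPartitioner's 0..2**127 space; the class name IdealTokens is illustrative.

    import java.math.BigInteger;

    // Print evenly spaced RandomPartitioner tokens for an N-node ring:
    // token(i) = i * 2**127 / N, matching the formula in the message above.
    public class IdealTokens {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]); // number of nodes
            BigInteger ring = BigInteger.ONE.shiftLeft(127); // 2**127
            for (int i = 1; i <= n; i++) {
                BigInteger token = ring.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(n));
                System.out.println("node " + i + ": " + token);
            }
        }
    }

Each printed value could then be fed to 'nodeprobe move' as in step 2.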
Re: How to UUID in .Net
2010/1/13 Nguyễn Minh Kha nminh...@gmail.com: Thanks Jonathan, I will try to port it to C#.

If you need to port something, you could have a look at other uuid packages. JUG (Java Uuid Generator) is simple, jakarta-commons has one, and there was a third one as well that claimed to be slightly faster than the other two. But it would seem odd if C# did not already have a package (open source, if core M$ packages do not provide a type 1 variant) that does this. Have you searched for one? -+ Tatu +-

On Sat, Jan 9, 2010 at 7:47 AM, Jonathan Ellis jbel...@gmail.com wrote: I didn't see any C# libraries that generate type 1 UUIDs. You might have to port this one from Java: http://johannburkard.de/software/uuid/

2010/1/8 Nguyễn Minh Kha nminh...@gmail.com: Hi, I'm working with Cassandra in .Net (C#) but I have a problem generating a UUID for my project. I used Guid to generate a version 1 UUID, but when I add it to Cassandra it throws the exception "TimeUUID only makes sense with version 1 UUIDs". I used uuidgen.exe (Windows SDK) to generate this Guid. Please help me resolve this problem. Thanks -- Nguyen Minh Kha NCT Corporation Email : kh...@nct.vn Mobile : 090 696 1314 Y!M : iminhkha
Re: Cassandra and TTL
On Wed, Jan 13, 2010 at 6:18 AM, Ryan Daum r...@thimbleware.com wrote: Just to speak up here, I think it's a more common use-case than you're imagining, even if maybe there's no reasonable way of implementing it. I for one have plenty of use for a TTL on a key, though in my case the TTL would be in days/weeks.

I misunderstood the original question -- I think the use case of long-term reaping of obsolete entries is much more relevant than short-term cache expiration, so my comments were mostly off the mark. This is indeed often done with Oracle DBs too, with rolling weekly/monthly partitions and other constructs. So I actually agree that mechanisms for supporting this would be useful, now that I understand the request. :-) -+ Tatu +-
Re: How to UUID in .Net
I'm pretty sure that when I tested JUG it generated broken type 1 UUIDs.

On Wed, Jan 13, 2010 at 3:14 PM, Tatu Saloranta tsalora...@gmail.com wrote: 2010/1/13 Nguyễn Minh Kha nminh...@gmail.com: Thanks Jonathan, I will try to port it to C#.

If you need to port something, you could have a look at other uuid packages. JUG (Java Uuid Generator) is simple, jakarta-commons has one, and there was a third one as well that claimed to be slightly faster than the other two. But it would seem odd if C# did not already have a package (open source, if core M$ packages do not provide a type 1 variant) that does this. Have you searched for one? -+ Tatu +-

On Sat, Jan 9, 2010 at 7:47 AM, Jonathan Ellis jbel...@gmail.com wrote: I didn't see any C# libraries that generate type 1 UUIDs. You might have to port this one from Java: http://johannburkard.de/software/uuid/

2010/1/8 Nguyễn Minh Kha nminh...@gmail.com: Hi, I'm working with Cassandra in .Net (C#) but I have a problem generating a UUID for my project. I used Guid to generate a version 1 UUID, but when I add it to Cassandra it throws the exception "TimeUUID only makes sense with version 1 UUIDs". I used uuidgen.exe (Windows SDK) to generate this Guid. Please help me resolve this problem. Thanks -- Nguyen Minh Kha NCT Corporation Email : kh...@nct.vn Mobile : 090 696 1314 Y!M : iminhkha
Re: How to UUID in .Net
Actually (hitting Send jogs my memory :), it was that it does lexical compares, which is invalid for type 1. So be careful. :)

On Wed, Jan 13, 2010 at 3:36 PM, Jonathan Ellis jbel...@gmail.com wrote: I'm pretty sure that when I tested JUG it generated broken type 1 UUIDs.

On Wed, Jan 13, 2010 at 3:14 PM, Tatu Saloranta tsalora...@gmail.com wrote: 2010/1/13 Nguyễn Minh Kha nminh...@gmail.com: Thanks Jonathan, I will try to port it to C#.

If you need to port something, you could have a look at other uuid packages. JUG (Java Uuid Generator) is simple, jakarta-commons has one, and there was a third one as well that claimed to be slightly faster than the other two. But it would seem odd if C# did not already have a package (open source, if core M$ packages do not provide a type 1 variant) that does this. Have you searched for one? -+ Tatu +-

On Sat, Jan 9, 2010 at 7:47 AM, Jonathan Ellis jbel...@gmail.com wrote: I didn't see any C# libraries that generate type 1 UUIDs. You might have to port this one from Java: http://johannburkard.de/software/uuid/

2010/1/8 Nguyễn Minh Kha nminh...@gmail.com: Hi, I'm working with Cassandra in .Net (C#) but I have a problem generating a UUID for my project. I used Guid to generate a version 1 UUID, but when I add it to Cassandra it throws the exception "TimeUUID only makes sense with version 1 UUIDs". I used uuidgen.exe (Windows SDK) to generate this Guid. Please help me resolve this problem. Thanks -- Nguyen Minh Kha NCT Corporation Email : kh...@nct.vn Mobile : 090 696 1314 Y!M : iminhkha
Re: Tuning and upgrades
Hi Jonathan, Thanks for all the information. I just noticed one difference in the .thrift file between 0.4.1 and 0.4.2: the call to get_slice had an exception removed. Does this mean I have to have all my clients rebuilt? (I'm not exactly sure what sorts of things are backwards compatible with Thrift.) Also, when transitioning from 0.4.2 to 0.5.0rc3, do I need the clients to upgrade? Trying to figure out the details of how I'll manage the upgrade. -Anthony

On Wed, Jan 13, 2010 at 01:38:28PM -0600, Jonathan Ellis wrote:

On Wed, Jan 13, 2010 at 1:26 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: When I run cfstats I get something like ... on a lightly loaded node, but my question is what is the timeframe of the counts?

Operations in the last 60 seconds. So times will roll in and out of the average gradually, if that makes sense.

Data Striping: One option I have is to add additional EBS volumes, then either turn on raid0 across several of them or possibly just add additional DataFileDirectory elements to my config?

Right. You should see slightly better performance w/ raw volumes.

If I were to add DataFileDirectory entries, can I just move sstables between directories?

Yes. (But compaction, and flush, will rotate among your DFDs in round-robin manner, so don't rely on them staying there.)

If so I assume I want the Index, Filter and Data files to be in the same directory?

Yes.

Or is this data movement something Cassandra will do for me? Also, is this likely to help?

Depends where your bottleneck is, but probably. :)

Upgrades: I understand that to upgrade from 0.4.x to 0.5.x I need to do something like 1. turn off all writes to a node 2. call 'nodeprobe flush' on that node 3. restart node with version 0.5.x Is this correct?

Yes, remembering that 0.4 and 0.5 gossip are not compatible, so you need to upgrade the whole cluster at once.

Data Repartitioning: So it seems that if I first upgrade my current nodes to 0.5.0, then bring up some new nodes with AutoBootstrap on, they should take some data from the most loaded machines?

Yes.

But let's say I just want to first even out the load on existing nodes, would the process be something like 1. calculate ideal key ranges (ie, i * (2**127 / N) for i=1..N) 2. foreach node 'nodeprobe move' to ideal range 3. foreach node 'nodeprobe clean'

Yes.

Alternatively, it looks like I might be able to use 'nodeprobe loadbalance' for step 2, and not use step 1?

LB will move the target node to the middle of the most-loaded range, so it's not likely to achieve perfect ranges, but it should achieve good enough with relatively little effort.

Also, anyone else running in EC2 and have any sort of tuning tips?

The SimpleGeo guys are apparently pretty happy w/ EC2 I/O performance: http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html; maybe they will chime in here. -Jonathan

-- Anthony Molinaro antho...@alumni.caltech.edu
Re: Tuning and upgrades
On Wed, Jan 13, 2010 at 4:19 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Hi Jonathon, Thanks for all the information. I just noticed one difference in the .thrift file between 0.4.1 and 0.4.2, the call to get_slice had an exception removed. Does this mean I have to have all my clients rebuilt? (I'm not excactly sure of what sorts of things are backwards compatible with thrift). Not 100% sure -- python will be fine with it, that is the one I am most familiar with. Not sure about other clients. Should be easy to test. -Jonathan
Re: Cassandra and TTL
I also agree: some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, and this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
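A rough sketch of what Mark describes, with entirely hypothetical names (SSTableMeta, maxTimestampMicros); nothing like this exists in Cassandra today. The idea is to record the newest column timestamp when an sstable is written, so a periodic task can delete files whose newest data has aged past the retention window.

    import java.io.File;

    // Hypothetical per-sstable metadata: if even the newest column in the
    // file is past the TTL, nothing in the file is live, so the whole file
    // can be deleted without rewriting any data.
    class SSTableMeta {
        final File dataFile;
        final long maxTimestampMicros; // newest column timestamp in the file

        SSTableMeta(File dataFile, long maxTimestampMicros) {
            this.dataFile = dataFile;
            this.maxTimestampMicros = maxTimestampMicros;
        }

        boolean isExpired(long ttlMicros, long nowMicros) {
            return maxTimestampMicros + ttlMicros < nowMicros;
        }
    }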
Re: Cassandra guarantees reads and writes to be atomic within a single ColumnFamily.
It's correct, if understood correctly. We should probably just remove it since it's confusing as written. What it means is: if a write for a given row is acked, eventually _all_ the data updated _in that row_ will be available for reads. So no, it's not atomic at the batch_mutate level but at the List<ColumnOrSuperColumn> level. -Jonathan

On Mon, Jan 11, 2010 at 3:01 PM, Ran Tavory ran...@gmail.com wrote: The front page http://incubator.apache.org/cassandra/ states that Cassandra guarantees reads and writes to be atomic within a single ColumnFamily. What exactly does that mean, and where can I learn more about this? It sounds like it means that batch_insert() and batch_mutate() for two different rows but in the same CF are atomic. Is this correct?
Re: Cassandra and TTL
An alternative implementation that may be worth exploring would be to modify IColumn's isMarkedForDelete() method to check a TTL. It probably wouldn't be as performant as dropping SSTables outright, and you'd probably also need to periodically compact old tables to remove expired rows. However, on the surface it appears to be a more seamless and fine-grained approach to this problem. -Kelvin

A little more background: db.IColumn is the shared interface that db.Column and db.SuperColumn implement. db.Column's isMarkedForDelete() method currently only checks whether a flag has been set, so it would be relatively straightforward to slip some logic into that method to check whether its timestamp has expired beyond some TTL (see the sketch after this message). However, I suspect that other methods may need to be slightly modified as well, and the compaction code would have to be inspected to make sure that old tables are periodically compacted to remove expired rows.

On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, and this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
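A minimal sketch of the change Kelvin outlines, assuming a globally configured TTL; the field and method names approximate the description above, not the actual db.Column source.

    // Illustrative sketch of the proposed change, not Cassandra's real code.
    class Column {
        // Globally configured TTL in microseconds; 0 disables expiry.
        static final long TTL_MICROS = 0L;

        private final long timestamp; // client-supplied, microseconds here
        private boolean deleteFlag;   // the existing tombstone marker

        Column(long timestamp) { this.timestamp = timestamp; }

        void markForDelete() { deleteFlag = true; }

        // Proposed: treat a column past its TTL exactly like a tombstone,
        // so reads skip it and compaction eventually collects it.
        boolean isMarkedForDelete() {
            boolean expired = TTL_MICROS > 0
                && timestamp + TTL_MICROS < System.currentTimeMillis() * 1000L;
            return deleteFlag || expired;
        }
    }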
Re: Cassandra and TTL
I think that is more or less what Sylvain is proposing. The main downside is adding the extra 8 bytes for a long (or 4 for an int, which should actually be plenty of resolution for this use case) to each Column object. On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa kakug...@gmail.com wrote: An alternative implementation that may be worth exploring would be to modify IColumn's isMarkedForDelete() method to check TTL. It probably wouldn't be as performant as straight dropping SSTables. You'd probably also need to periodically compact old tables to remove expired rows. However, on the surface, it appears to be a more seamless and fine-grained approach to this problem. -Kelvin A little more background: db.IColumn is the shared interface that db.Column and db.SuperColumn implement. db.Column's isMarkedForDelete() method only checks if a flag has been set, right now. So, it would be relatively straightforward to slip some logic into that method to check if its timestamp has expired beyond some TTL. However, I suspect that there may be other methods that may need to be slightly modified, as well. And, the compaction code would have to be inspected to make sure that old tables are periodically compacted to remove expired rows. On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: Some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
Re: How to UUID in .Net
On Wed, Jan 13, 2010 at 1:37 PM, Jonathan Ellis jbel...@gmail.com wrote: Actually (hitting Send jogs my memory :), it was that it does lexical compares, which is invalid for type 1. So be careful. :)

Ah. I would be VERY surprised if it produced invalid ones (I wrote the thing years ago :) ). Curious about the lexical sorting though. I thought it sorted first by timestamp, then by MAC. But it sounds like it doesn't? (You could of course use a comparator anyway.) -+ Tatu +-
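For reference, a minimal sketch of such a comparator in plain Java, assuming java.util.UUID values; note that UUID.timestamp() is only defined for version-1 UUIDs and throws otherwise.

    import java.util.Comparator;
    import java.util.UUID;

    // Order version-1 (time-based) UUIDs chronologically instead of lexically.
    class TimeUuidComparator implements Comparator<UUID> {
        public int compare(UUID a, UUID b) {
            long ta = a.timestamp(); // 60-bit time field; throws if not version 1
            long tb = b.timestamp();
            if (ta != tb)
                return ta < tb ? -1 : 1;
            return a.compareTo(b);   // tie-break on clock sequence / node bits
        }
    }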
Re: How to UUID in .Net
Checked the source: yes, it does do timestamp first. Sorry for the misinformation; I must be thinking of something else entirely. It's been a while. :) -Jonathan

On Wed, Jan 13, 2010 at 5:09 PM, Tatu Saloranta tsalora...@gmail.com wrote: On Wed, Jan 13, 2010 at 1:37 PM, Jonathan Ellis jbel...@gmail.com wrote: Actually (hitting Send jogs my memory :), it was that it does lexical compares, which is invalid for type 1. So be careful. :)

Ah. I would be VERY surprised if it produced invalid ones (I wrote the thing years ago :) ). Curious about the lexical sorting though. I thought it sorted first by timestamp, then by MAC. But it sounds like it doesn't? (You could of course use a comparator anyway.) -+ Tatu +-
Re: Cassandra and TTL
Are you thinking about storing the expiration time explicitly? Or, would it be reasonable to calculate it dynamically? -Kelvin On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis jbel...@gmail.com wrote: I think that is more or less what Sylvain is proposing. The main downside is adding the extra 8 bytes for a long (or 4 for an int, which should actually be plenty of resolution for this use case) to each Column object. On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa kakug...@gmail.com wrote: An alternative implementation that may be worth exploring would be to modify IColumn's isMarkedForDelete() method to check TTL. It probably wouldn't be as performant as straight dropping SSTables. You'd probably also need to periodically compact old tables to remove expired rows. However, on the surface, it appears to be a more seamless and fine-grained approach to this problem. -Kelvin A little more background: db.IColumn is the shared interface that db.Column and db.SuperColumn implement. db.Column's isMarkedForDelete() method only checks if a flag has been set, right now. So, it would be relatively straightforward to slip some logic into that method to check if its timestamp has expired beyond some TTL. However, I suspect that there may be other methods that may need to be slightly modified, as well. And, the compaction code would have to be inspected to make sure that old tables are periodically compacted to remove expired rows. On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: Some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
Re: Cassandra and TTL
If he needs column-level granularity then I don't see any other option. If he needs CF-level granularity then truncate will work fine. :) On Wed, Jan 13, 2010 at 5:16 PM, Kelvin Kakugawa kakug...@gmail.com wrote: Are you thinking about storing the expiration time explicitly? Or, would it be reasonable to calculate it dynamically? -Kelvin On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis jbel...@gmail.com wrote: I think that is more or less what Sylvain is proposing. The main downside is adding the extra 8 bytes for a long (or 4 for an int, which should actually be plenty of resolution for this use case) to each Column object. On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa kakug...@gmail.com wrote: An alternative implementation that may be worth exploring would be to modify IColumn's isMarkedForDelete() method to check TTL. It probably wouldn't be as performant as straight dropping SSTables. You'd probably also need to periodically compact old tables to remove expired rows. However, on the surface, it appears to be a more seamless and fine-grained approach to this problem. -Kelvin A little more background: db.IColumn is the shared interface that db.Column and db.SuperColumn implement. db.Column's isMarkedForDelete() method only checks if a flag has been set, right now. So, it would be relatively straightforward to slip some logic into that method to check if its timestamp has expired beyond some TTL. However, I suspect that there may be other methods that may need to be slightly modified, as well. And, the compaction code would have to be inspected to make sure that old tables are periodically compacted to remove expired rows. On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: Some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
Re: Cassandra and TTL
You're right, if the TTL will be dynamically set, then we'd need to make room for it. Otherwise, if it's globally set, we could save that space. -Kelvin On Wed, Jan 13, 2010 at 1:16 PM, Kelvin Kakugawa kakug...@gmail.com wrote: Are you thinking about storing the expiration time explicitly? Or, would it be reasonable to calculate it dynamically? -Kelvin On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis jbel...@gmail.com wrote: I think that is more or less what Sylvain is proposing. The main downside is adding the extra 8 bytes for a long (or 4 for an int, which should actually be plenty of resolution for this use case) to each Column object. On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa kakug...@gmail.com wrote: An alternative implementation that may be worth exploring would be to modify IColumn's isMarkedForDelete() method to check TTL. It probably wouldn't be as performant as straight dropping SSTables. You'd probably also need to periodically compact old tables to remove expired rows. However, on the surface, it appears to be a more seamless and fine-grained approach to this problem. -Kelvin A little more background: db.IColumn is the shared interface that db.Column and db.SuperColumn implement. db.Column's isMarkedForDelete() method only checks if a flag has been set, right now. So, it would be relatively straightforward to slip some logic into that method to check if its timestamp has expired beyond some TTL. However, I suspect that there may be other methods that may need to be slightly modified, as well. And, the compaction code would have to be inspected to make sure that old tables are periodically compacted to remove expired rows. On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: Some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one? I'm guessing here from my limited understanding of how Cassandra works. Mark
Re: Tuning and upgrades
So the answer is: Java handles it fine. However, I unfortunately wasn't able to do a rolling restart; for whatever reason the first node caused all the other nodes to start throwing exceptions, so I had to take everything down for a little bit. On the plus side, 0.4.2 seems to start faster than 0.4.1, so that was cool. So is the Thrift interface for 0.5.0 compatible with that of 0.4.x, or do I need to upgrade clients for that upgrade? -Anthony

On Wed, Jan 13, 2010 at 04:27:32PM -0600, Jonathan Ellis wrote: On Wed, Jan 13, 2010 at 4:19 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Hi Jonathan, Thanks for all the information. I just noticed one difference in the .thrift file between 0.4.1 and 0.4.2: the call to get_slice had an exception removed. Does this mean I have to have all my clients rebuilt? (I'm not exactly sure what sorts of things are backwards compatible with Thrift.)

Not 100% sure -- python will be fine with it, that is the one I am most familiar with. Not sure about other clients. Should be easy to test. -Jonathan

-- Anthony Molinaro antho...@alumni.caltech.edu
Re: Tuning and upgrades
Also, I notice in 0.5.0 cassandra.in.sh you have -XX:SurvivorRatio=8 \ then further down in the file -XX:SurvivorRatio=128 \ Does the second end up winning? Or is there some magic here. -Anthony On Wed, Jan 13, 2010 at 04:02:48PM -0800, Anthony Molinaro wrote: So the answer is java handles it fine. However, I unfortunately wasn't able to do a rolling restart, for whatever reason the first node caused all the other nodes to start throwing exceptions, so I had to take everything down for a little bit. However, 0.4.2 seems to start faster than 0.4.1, so that was cool. So is the thrift interface for 0.5.0 compatible with that of 0.4.x or do I need to upgrade clients for that upgrade? -Anthony On Wed, Jan 13, 2010 at 04:27:32PM -0600, Jonathan Ellis wrote: On Wed, Jan 13, 2010 at 4:19 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Hi Jonathon, Thanks for all the information. I just noticed one difference in the .thrift file between 0.4.1 and 0.4.2, the call to get_slice had an exception removed. Does this mean I have to have all my clients rebuilt? (I'm not excactly sure of what sorts of things are backwards compatible with thrift). Not 100% sure -- python will be fine with it, that is the one I am most familiar with. Not sure about other clients. Should be easy to test. -Jonathan -- Anthony Molinaro antho...@alumni.caltech.edu -- Anthony Molinaro antho...@alumni.caltech.edu
Re: How to UUID in .Net
On Wed, Jan 13, 2010 at 3:13 PM, Jonathan Ellis jbel...@gmail.com wrote: Checked the source, yes, it does do timestamp first. Sorry for the misinformation, I must be thinking of something else entirely. It's been a while. :) Not at all, thanks for checking it. I might have mis-recalled it as well... was just trying to think of why I had done it some other way! :-) -+ Tatu +-
Re: Cassandra and TTL
On Wed, Jan 13, 2010 at 6:19 PM, Jonathan Ellis jbel...@gmail.com wrote: If he needs column-level granularity then I don't see any other option. If he needs CF-level granularity then truncate will work fine. :)

Are you saying the proposed truncate functionality will support 'truncate all keys with timestamp < X'? R
Re: Tuning and upgrades
On Wed, Jan 13, 2010 at 6:02 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: So is the thrift interface for 0.5.0 compatible with that of 0.4.x or do I need to upgrade clients for that upgrade? Just exceptions have changed. (And get_range_slice was added.) -Jonathan
Re: Tuning and upgrades
Good question. :) On Wed, Jan 13, 2010 at 6:19 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Also, I notice in 0.5.0 cassandra.in.sh you have -XX:SurvivorRatio=8 \ then further down in the file -XX:SurvivorRatio=128 \ Does the second end up winning? Or is there some magic here. -Anthony On Wed, Jan 13, 2010 at 04:02:48PM -0800, Anthony Molinaro wrote: So the answer is java handles it fine. However, I unfortunately wasn't able to do a rolling restart, for whatever reason the first node caused all the other nodes to start throwing exceptions, so I had to take everything down for a little bit. However, 0.4.2 seems to start faster than 0.4.1, so that was cool. So is the thrift interface for 0.5.0 compatible with that of 0.4.x or do I need to upgrade clients for that upgrade? -Anthony On Wed, Jan 13, 2010 at 04:27:32PM -0600, Jonathan Ellis wrote: On Wed, Jan 13, 2010 at 4:19 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Hi Jonathon, Thanks for all the information. I just noticed one difference in the .thrift file between 0.4.1 and 0.4.2, the call to get_slice had an exception removed. Does this mean I have to have all my clients rebuilt? (I'm not excactly sure of what sorts of things are backwards compatible with thrift). Not 100% sure -- python will be fine with it, that is the one I am most familiar with. Not sure about other clients. Should be easy to test. -Jonathan -- Anthony Molinaro antho...@alumni.caltech.edu -- Anthony Molinaro antho...@alumni.caltech.edu
Re: Cassandra guarantees reads and writes to be atomic within a single ColumnFamily.
Thanks. So maybe to rephrase: Cassandra guarantees reads and writes to be atomic within a single row. But this isn't saying much... so maybe just take it off...

On Thu, Jan 14, 2010 at 12:40 AM, Jonathan Ellis jbel...@gmail.com wrote: It's correct, if understood correctly. We should probably just remove it since it's confusing as written. What it means is: if a write for a given row is acked, eventually _all_ the data updated _in that row_ will be available for reads. So no, it's not atomic at the batch_mutate level but at the List<ColumnOrSuperColumn> level. -Jonathan

On Mon, Jan 11, 2010 at 3:01 PM, Ran Tavory ran...@gmail.com wrote: The front page http://incubator.apache.org/cassandra/ states that Cassandra guarantees reads and writes to be atomic within a single ColumnFamily. What exactly does that mean, and where can I learn more about this? It sounds like it means that batch_insert() and batch_mutate() for two different rows but in the same CF are atomic. Is this correct?
Re: Cassandra and TTL
On Wed, Jan 13, 2010 at 2:30 PM, Mark Robson mar...@gmail.com wrote: I also agree: some mechanism to expire rolling data would be really good if we can incorporate it. Using the existing client interface, deleting old data is very cumbersome. We want to store lots of audit data in Cassandra, and this will need to be expired eventually. Nodes should be able to do expiry locally without needing to talk to other nodes in the cluster. As we have a timestamp on everything anyway, can we not use that somehow? If we only ever append data rather than update it (or update it very rarely), can we somehow store timestamp ranges in each sstable file and then have the server know when it's time to expire one?

I personally like this last option of expiring entire sstables; it seems significantly more efficient than scrubbing data. The granularity might be a bit high, but per column family seems a reasonable trade-off in the short run for an easier solution. For apps that don't want to see the old data, reads could also ignore any column whose timestamp is older than the expire time on the ColumnFamily; then, once everything in an sstable is older than that, truncate it. Logs are a great example of this. - August
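A minimal sketch of the read-time filtering August describes, with hypothetical names throughout; the Column interface and expireBeforeMicros threshold stand in for whatever a real implementation would use.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical read-path filter: hide columns older than the column
    // family's expiry threshold, so clients never see expired data even
    // before the sstable holding it is truncated.
    interface Column {
        long timestamp(); // microseconds, as supplied at write time
    }

    class ExpiryFilter {
        static List<Column> filterExpired(List<Column> columns, long expireBeforeMicros) {
            List<Column> live = new ArrayList<Column>(columns.size());
            for (Column c : columns) {
                if (c.timestamp() >= expireBeforeMicros)
                    live.add(c); // keep only columns newer than the threshold
            }
            return live;
        }
    }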