Re: Storing pre-sorted data

2011-10-18 Thread David Jeske
On Mon, Oct 17, 2011 at 2:39 AM, Matthias Pfau p...@l3s.de wrote: We would be very happy if cassandra would give us an option to maintain the sort order on our own (application logic). That is why it would be interesting to hear from any of the developers if it would be easily possible to add

Re: Storing pre-sorted data

2011-10-18 Thread Matthias Pfau
Aaron, we want to sort completely on the client side (where the data is encrypted). But that requires an "insert at offset X" operation. We would always use CL QUORUM and client-side synchronisation. However, it seems not to be a good idea to add such a feature to Cassandra as everyone

Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.0.0-1

2011-10-18 Thread Stephen Connolly
Nobody objects, so I will publish the artifacts as Cassandra 1.0.0 is being released. On 12 October 2011 23:44, Stephen Connolly stephen.alan.conno...@gmail.com wrote: Hi, I'd like to release version 1.0.0-1 of Mojo's Cassandra Maven Plugin to sync up with the pending 1.0.0 release of Apache

[RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Sylvain Lebresne
The Cassandra team is very pleased to announce the release of Apache Cassandra version 1.0.0. Cassandra 1.0.0 is a new major release that builds upon the awesomeness of previous versions and adds numerous improvements[1,2], amongst which: - Compression of on-disk data files (SSTables), with

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Jonathan Ellis
Thanks for the help, everyone! This is a great milestone for Cassandra. On Tue, Oct 18, 2011 at 7:01 AM, Sylvain Lebresne sylv...@datastax.com wrote: The Cassandra team is very pleased to announce the release of Apache Cassandra version 1.0.0. Cassandra 1.0.0 is a new major release that builds

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Thibaut Britz
Great news! Especially the improved read performance and compactions are great! Thanks, Thibaut On Tue, Oct 18, 2011 at 2:11 PM, Jonathan Ellis jbel...@gmail.com wrote: Thanks for the help, everyone! This is a great milestone for Cassandra. On Tue, Oct 18, 2011 at 7:01 AM, Sylvain Lebresne

RE: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Viktor Jevdokimov
Congrats!!! Best regards/ Pagarbiai Viktor Jevdokimov

Re: Using elasticsearch on cassandra nodes

2011-10-18 Thread Brian O'Neill
Anthony, We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time). What are you using as your bridge between Cassandra and

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Jonathan Ellis
Short version: yes, 1.0 addresses the known repair problems. On Tue, Oct 18, 2011 at 7:29 AM, Maxim Potekhin potek...@bnl.gov wrote: There was a problem in early 0.8 where the repair was taking forever -- am I right to assume this was fixed in 1.0? Many thanks to you guys, Maxim On

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Dikang Gu
Congrats! In 0.8, schema disagreement sometimes occurs when I create keyspaces/column families dynamically. Is this also fixed? Regards. On Tue, Oct 18, 2011 at 9:20 PM, Jonathan Ellis jbel...@gmail.com wrote: Short version: yes, 1.0 addresses the known repair problems. On Tue, Oct 18,

[ANN] Mojo's Cassandra Maven Plugin 1.0.0-1 released

2011-10-18 Thread Stephen Connolly
The Mojo team is pleased to announce the release of Mojo's Cassandra Maven Plugin version 1.0.0-1. Mojo's Cassandra Plugin is used when you want to install and control a test instance of Apache Cassandra from within your Apache Maven build. The Cassandra Plugin has the following goals. *

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Jonathan Ellis
Multiple concurrent schema modifications are not supported in 1.0. https://issues.apache.org/jira/browse/CASSANDRA-1391 is open to add this in 1.1. On Tue, Oct 18, 2011 at 8:27 AM, Dikang Gu dikan...@gmail.com wrote: Congrats! In 0.8, the schema disagreement occurs sometimes when I create

Bulk Loading Recommendations: Files over 25GBs

2011-10-18 Thread Mike Rapuano
We are not currently live but testing with Cassandra. I'm looking for recommendations on the most efficient way to load text files over 25 GB in size into Cassandra (version 0.8.6). Our application may require us to load 2-3 text files of 25-40 GB each a few times a week into our 3-node cluster.

Column Family row keys

2011-10-18 Thread David Fischer(Gtalk)
Hello all. I'm new to Cassandra and I am using pycassa to access data. I was wondering if someone knows how to pull just the row keys instead of using get_range? This question may be more about pycassa, but I'm not sure. If someone has a Java snippet to do it, that would be OK too. Thanks

DELETE where colname == given_value ?

2011-10-18 Thread Yang
Is it possible to do this? I have a counter CF, and over time some rows are no longer useful, so I want to delete them in batches, like once a week. (Doing the delete by specifying all the row keys would probably be too slow.) I looked over the .thrift definition of the Cassandra interface; it seems

Re: show schema fails

2011-10-18 Thread aaron morton
Looks like the column metadata for the CF specifies a column name that is not a valid Long. I seem to remember a bug like this sometime in the past. You should be able to work around this by running ALTER COLUMN FAMILY and only specifying valid column metadata. Cheers - Aaron

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread aaron morton
Thanks all. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/10/2011, at 2:45 AM, Jonathan Ellis wrote: Multiple concurrent schema modifications are not supported in 1.0. https://issues.apache.org/jira/browse/CASSANDRA-1391 is open to add

Re: Bulk Loading Recommendations: Files over 25GBs

2011-10-18 Thread aaron morton
Given that scale of data, and the fact that it's a batch job, I would go with the bulk loading tool. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/10/2011, at 3:32 AM, Mike Rapuano wrote: We are not currently live but testing with

Re: Column Family row keys

2011-10-18 Thread aaron morton
There is no first-class support for just getting row keys; you will always want to get a column. You can fake it by requesting zero columns. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/10/2011, at 3:53 AM, David Fischer(Gtalk) wrote:

Re: Column Family row keys

2011-10-18 Thread Tyler Hobbs
A couple of notes: * You need to use the master branch (not the most recent release) if you want to use get_range() with column_count=0 and get any results. * You'll get range ghosts (http://wiki.apache.org/cassandra/FAQ#range_ghosts) with column_count=0. You can avoid them if you set

Re: Using elasticsearch on cassandra nodes

2011-10-18 Thread Anthony Ikeda
At the moment we are only prototyping so we haven't bridged the two at all. We had planned on creating a write-through operation that allowed us to filter the calls (AOP perhaps?) to manage the indexing as we stored it in Cassandra. We are still trying to work out if we go the elastic search

Re: Column Family row keys

2011-10-18 Thread Jonathan Ellis
On Tue, Oct 18, 2011 at 4:10 PM, Tyler Hobbs ty...@datastax.com wrote: * You'll get range ghosts (http://wiki.apache.org/cassandra/FAQ#range_ghosts) with column_count=0. You can avoid them if you set column_count=1. What heuristic do you use for skipping empty rows? -- Jonathan Ellis
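
For illustration only, one possible way to skip empty rows on the client: a minimal sketch that assumes the range scan yields (key, columns) pairs, as pycassa's get_range() does, and that it was run with column_count=1 so a live row carries at least one column. The function name live_row_keys is hypothetical, and this is not necessarily the heuristic pycassa itself uses.

```python
def live_row_keys(rows):
    """Yield the keys of rows that came back with at least one column.

    `rows` is any iterable of (key, columns) pairs. A range ghost shows up
    as a key whose columns mapping is empty, so it is skipped here.
    """
    for key, columns in rows:
        if columns:  # empty mapping: treated as a range ghost
            yield key
```

With pycassa the input would presumably be something like cf.get_range(column_count=1), where cf is a ColumnFamily. With column_count=0 every row looks empty, so this filter would drop everything.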

Re: Storing pre-sorted data

2011-10-18 Thread Matthias Pfau
Hi David, encrypting and decrypting data in Cassandra is not an option for us. However, I read your elaboration on best practices for adding a custom feature to Cassandra with a lot of interest. Thank you! Please see below for the answers to your questions. Kind regards Matthias On

Re: Storing pre-sorted data

2011-10-18 Thread Matthias Pfau
David, thanks for sharing these ideas. We basically implemented it by using longs and dividing the long namespace into multiple parts on each insert. As you already described, the biggest problem with this approach is that we are not able to simply invoke get(x) because the indexes of the
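
For readers following the thread, the general idea of dividing the long namespace on each insert can be sketched as below. This is an assumption-laden illustration, not Matthias's actual implementation: the names key_between and insert_at are hypothetical, keys are taken to be signed 64-bit longs, and each new key is simply placed at the midpoint of the gap between its two neighbours.

```python
LONG_MIN = -(2 ** 63)   # smallest signed 64-bit value
LONG_MAX = 2 ** 63 - 1  # largest signed 64-bit value


def key_between(lo, hi):
    """Return a long strictly between lo and hi, or None if no gap is left."""
    if hi - lo < 2:
        return None  # neighbours are adjacent: the caller has to renumber
    return lo + (hi - lo) // 2


def insert_at(keys, offset):
    """Allocate a key that sorts at position `offset` in the sorted list `keys`."""
    lo = keys[offset - 1] if offset > 0 else LONG_MIN
    hi = keys[offset] if offset < len(keys) else LONG_MAX
    new_key = key_between(lo, hi)
    if new_key is None:
        raise ValueError("no free key between neighbours; renumbering needed")
    keys.insert(offset, new_key)
    return new_key
```

Each insert halves the gap it lands in, so repeated inserts at the same position exhaust that gap after roughly 64 steps and force a renumbering. The stored keys are also not dense offsets, so position x has to be resolved by some other means than a direct lookup of key x.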

commitlog replay extremely slow?

2011-10-18 Thread Yang
I updated to 71ba6998b504966690e099c03e04f9876dc1060e (HEAD of the 1.0.0 branch on GitHub). The new code seems to be very slow in commitlog replay: it takes 20 minutes to recover about 500 MB of commit logs, while previously this took roughly 1-2 minutes. Is the official 1.0.0 release built on

Re: commitlog replay extremely slow?

2011-10-18 Thread Yang
jstack shows that all the mutation stages are blocked on a synchronization: MutationStage:1 prio=10 tid=0x2aaabc01b000 nid=0x27b8 waiting for monitor entry [0x42048000] java.lang.Thread.State: BLOCKED (on object monitor) at

Re: Column Family row keys

2011-10-18 Thread Tyler Hobbs
On Tue, Oct 18, 2011 at 4:30 PM, Jonathan Ellis jbel...@gmail.com wrote: On Tue, Oct 18, 2011 at 4:10 PM, Tyler Hobbs ty...@datastax.com wrote: * You'll get range ghosts (http://wiki.apache.org/cassandra/FAQ#range_ghosts) with column_count=0. You can avoid them if you set column_count=1.

Size calculations for off heap caching

2011-10-18 Thread Todd Nine
Hi guys, We've just built a K tree implementation in Cassandra. We're going for relatively wide nodes in our tree to minimize our tree depth and reduce our search times. Most of the links between parent/child nodes are longs. We're ready to start tuning the size of K so that our most

Re: Size calculations for off heap caching

2011-10-18 Thread Chris Goffinet
My best advice on this is to insert a bit of data into the tree, and then do a heap dump to calculate the extra overhead. From our testing, it's unfortunately more than you would like. On Tue, Oct 18, 2011 at 8:14 PM, Todd Nine t...@spidertracks.com wrote: Hi guys, We've just built a K tree