Re: Second Cassandra users survey

2011-12-06 Thread Matthias Pfau
It took some time to gather our requirements and to check what our most important needs are. However, here they are: * Column position range queries: We would like to access columns not by their name, but by their position in the row. Example: row(A:v1, B:v2, C:v3, D:v4); ; ordered by

Re: Second Cassandra users survey

2011-11-28 Thread Aditya
Ability to mix counter columns and normal columns in the same column family. On Thu, Nov 17, 2011 at 6:46 PM, Boris Yen yulin...@gmail.com wrote: I was wondering if it is possible to provide a function like delete from cf where column='value' I think this should be useful for people who use

Re: Second Cassandra users survey

2011-11-14 Thread Chris Burroughs
- It would be super cool if all of that counter work made it possible to support other atomic data types (sets? CAS? just pass an assoc/commun Function to apply). - Again with types, pluggable type-specific compression. - Wishy washy wish: Simpler elasticity I would like to go from 6--8--7

Re: Second Cassandra users survey

2011-11-14 Thread Jake Luciani
Re Simpler elasticity: Latest opscenter will now rebalance cluster optimally http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 /plug -Jake On Mon, Nov 14, 2011 at 7:27 PM, Chris Burroughs chris.burrou...@gmail.com wrote: - It would be super cool if all of that counter work made it

Re: Second Cassandra users survey

2011-11-14 Thread Mohit Anchlia
On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani jak...@gmail.com wrote: Re  Simpler elasticity: Latest opscenter will now rebalance cluster optimally http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 /plug Does it cause any impact on reads and writes while re-balance is in progress?

Re: Second Cassandra users survey

2011-11-14 Thread Dean Hiller
+1 on coprocessors On Mon, Nov 14, 2011 at 6:51 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani jak...@gmail.com wrote: Re Simpler elasticity: Latest opscenter will now rebalance cluster optimally

Re: Second Cassandra users survey

2011-11-14 Thread Dean Hiller
Oh yeah, one more BIG one: in-memory writes with async write-behind to disk, like Cassandra does for speed. So if you have atomic locking, it writes to the primary node (memory) and some other node (memory) and returns with success to the client. Async then writes to disk later. This prove to

Re: Second Cassandra users survey

2011-11-14 Thread Edward Ribeiro
+1 on co-processors. Edward

Re: Second Cassandra users survey

2011-11-11 Thread Aaron Turner
Lately I've been working on some data processing code in Cassandra and apparently I don't write bug-free code the very first time. :) Hence, while debugging, I often need to look at data in Cassandra to see what my code is doing/should be finding, etc. This turns out to be harder than it should

Re: Second Cassandra users survey

2011-11-11 Thread Edward Capriolo
It seems like you could use a composite key partitioner to accomplish this On Monday, November 7, 2011, Daniel Doubleday daniel.double...@gmx.net wrote: Allow for deterministic / manual sharding of rows. Right now it seems that there is no way to force rows with different row keys to be stored

Re: Second Cassandra users survey

2011-11-09 Thread Jake Luciani
Hi Todd, Entity Groups : https://issues.apache.org/jira/browse/CASSANDRA-1684 -Jake On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss bburr...@expedia.com wrote: I believe I heard someone talk at Cassandra SF conference about creating a partitioner that was a derivation of RandomPartitioner. It

Re: Second Cassandra users survey

2011-11-09 Thread Aaron Turner
I think this was already asked for, but you can add my vote for TTL support for Counters. On Tue, Nov 1, 2011 at 3:59 PM, Jonathan Ellis jbel...@gmail.com wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1]  The results [2] have been extremely useful in

Re: Second Cassandra users survey

2011-11-09 Thread Todd Burruss
@cassandra.apache.org Date: Wed, 9 Nov 2011 02:53:20 -0800 To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey Hi Todd

Re: Second Cassandra users survey

2011-11-09 Thread Jake Luciani
@cassandra.apache.org user@cassandra.apache.org Subject: Re: Second Cassandra users survey Hi Todd, Entity Groups : https://issues.apache.org/jira/browse/CASSANDRA-1684 -Jake On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss bburr...@expedia.com wrote: I believe I heard someone talk at Cassandra SF

Re: Second Cassandra users survey

2011-11-09 Thread Vijay
My wish list: 1) Conditional updates: if a column has a value then put column in the column family atomically else fail. 2) getAndSet: on counters: a separate API 3) Revert the count when client disconnects or receives an exception (so they can safely retry). 4) Something like a freeze API for

Re: Second Cassandra users survey

2011-11-08 Thread Daniel Doubleday
Ah cool - thanks for the pointer! On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote: This is basically what entity groups are about - https://issues.apache.org/jira/browse/CASSANDRA-1684 On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin wool...@gmail.com wrote: This feature interests me, so I thought I'd

Re: Second Cassandra users survey

2011-11-08 Thread Todd Burruss
A use case that could use this (but isn't in my top requests) is usage history for a given user. I use a single row to save history per user; each column is a user action whose name is a TimeUUID and whose value is a blob. I use the TimeUUID to sort the actions, but I don't really care about exact time.

Re: Second Cassandra users survey

2011-11-07 Thread Radim Kolar
Take a look at this: http://www.oracle.com/technetwork/database/nosqldb/overview/index.html I understand the limitation/advantages of the architecture. Read this http://en.wikipedia.org/wiki/CAP_theorem

Re: Second Cassandra users survey

2011-11-07 Thread Daniel Doubleday
Allow for deterministic / manual sharding of rows. Right now it seems that there is no way to force rows with different row keys to be stored on the same nodes in the ring. This is our number one reason why we get data inconsistencies when nodes fail. Sometimes a logical transaction requires

Re: Second Cassandra users survey

2011-11-07 Thread Peter Lin
This feature interests me, so I thought I'd add some comments. Having used partition features in existing databases like DB2, Oracle and manual partitioning, one of the biggest challenges is keeping the partitions balanced. What I've seen with manual partitioning is that often the partitions get

Re: Second Cassandra users survey

2011-11-07 Thread Flavio Baronti
We are using Cassandra for time series storage. Strong points: write performance. Pain points: dynamically adding column families as new time series come in. Caused a lot of headaches, mismatches between nodes, etc. In the end we just put everything together in a single (huge) column family.

Re: Second Cassandra users survey

2011-11-07 Thread Radim Kolar
So my question related to deterministic sharding is this: what rebalance feature(s) would be useful or needed once the partitions get unbalanced? In current Cassandra you can use nodetool move for rebalancing. It's a fast operation: a portion of the existing data is moved to the new server.

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
Actually, the data will be visible at QUORUM as well if you can see it with ONE. QUORUM actually gives you a higher chance of seeing the new value than ONE does. In the case of R=3 you have 2/3 chance of seeing the new value with QUORUM, with ONE you have 1/3... And this JIRA fixed an issue
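The 1/3 versus 2/3 figures can be checked by enumeration. The following is only a sketch under stated assumptions: "R=3" is read as three replicas, the failed QUORUM write is assumed to have landed on exactly one of them, and the coordinator is assumed to pick which replicas to read uniformly at random (a real cluster's replica selection is not uniform). The function name chance_of_seeing_write is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def chance_of_seeing_write(replicas: int, holders: int, read_size: int) -> Fraction:
    """Probability that a read touching `read_size` replicas includes at
    least one of the `holders` replicas that already have the new value.

    Assumes every set of `read_size` replicas is equally likely to be read.
    """
    has_new = set(range(holders))                # replicas holding the new value
    reads = list(combinations(range(replicas), read_size))
    hits = sum(1 for read in reads if has_new.intersection(read))
    return Fraction(hits, len(reads))


# Three replicas, new value present on one of them only.
one = chance_of_seeing_write(replicas=3, holders=1, read_size=1)     # read at ONE
quorum = chance_of_seeing_write(replicas=3, holders=1, read_size=2)  # read at QUORUM
print(one, quorum)
```

With one replica read, only one of the three possible choices holds the new value; with two replicas read, two of the three possible pairs include it, which matches the figures quoted in the message.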

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
- Batch read/slice from multiple column families. On 11/01/2011 05:59 PM, Jonathan Ellis wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been extremely useful in setting and prioritizing goals for Cassandra development. But with

Re: Second Cassandra users survey

2011-11-07 Thread Ed Anuff
This is basically what entity groups are about - https://issues.apache.org/jira/browse/CASSANDRA-1684 On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin wool...@gmail.com wrote: This feature interests me, so I thought I'd add some comments. Having used partition features in existing databases like DB2,

RE: Second Cassandra users survey

2011-11-07 Thread Deeter, Derek
Anchlia [mailto:mohitanch...@gmail.com] Sent: Sunday, November 06, 2011 10:58 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey Transparent on disk encryption with pluggable keyprovider will also be really helpful to secure sensitive information. On Sun, Nov 6, 2011 at 9:42

Re: Second Cassandra users survey

2011-11-07 Thread Ian Danforth
Wish list: A decent GUI to explore data kept in Cassandra would be very valuable. It should also be extendable to provide viewers for custom data. +1 to that. @jonathan - This is what Google Moderator is really good at. Perhaps start one and move the idea creation / voting there.

Re: Second Cassandra users survey

2011-11-07 Thread Daniel Doubleday
Well - given the example in our case the prefix that determines the endpoints where a token should be routed to could be something like a user-id so with
    key = userid + "." + userthingid;
instead of
    // this is happening right now
    getEndpoints(hash(key))
you would have
    getEndpoints(userid)
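The routing idea in this message can be illustrated with a toy token ring. This is a minimal sketch, not Cassandra's partitioner: the ring layout, the node names, and the helpers token, get_endpoint, endpoint_by_full_key and endpoint_by_prefix are all hypothetical, and hashing the user-id prefix with MD5 is an assumption (the message only says the prefix determines the endpoints).

```python
import hashlib
from bisect import bisect_left

# Hypothetical three-node ring with evenly spaced tokens in [0, 2**127).
STEP = 2**127 // 3
NODE_TOKENS = [(STEP, "node-a"), (2 * STEP, "node-b"), (3 * STEP, "node-c")]


def token(value: str) -> int:
    """Map a string onto the ring using an MD5 digest (assumption)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2**127)


def get_endpoint(tok: int) -> str:
    """Return the first node whose token is >= tok, wrapping around the ring."""
    tokens = [t for t, _ in NODE_TOKENS]
    idx = bisect_left(tokens, tok) % len(NODE_TOKENS)
    return NODE_TOKENS[idx][1]


def endpoint_by_full_key(userid: str, thingid: str) -> str:
    # Routing on the whole key: a user's rows may land on different nodes.
    return get_endpoint(token(userid + "." + thingid))


def endpoint_by_prefix(userid: str, thingid: str) -> str:
    # Routing on the prefix only: every row of one user lands on the same node.
    return get_endpoint(token(userid))


for thing in ("profile", "inbox", "settings"):
    print(thing, endpoint_by_full_key("user42", thing), endpoint_by_prefix("user42", thing))
```

Routing on the prefix trades balance for locality: all rows of one user share a node, so a very active user concentrates load there.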

Re: Second Cassandra users survey

2011-11-07 Thread Brian O'Neill
It should be dead-simple to build a slick GUI on the REST layer. (@Virgil: http://code.google.com/a/apache-extras.org/p/virgil/) I had planned to crank one out this week (using ExtJS) that mimicked the Squirrel/Toad look and feel. The UI would have a tree-panel of keyspaces and column families on

Re: Second Cassandra users survey

2011-11-07 Thread Colin Taylor
Decompression without compression (for lack of a better name). We store log batches in Cassandra that come in over HTTP as either uncompressed, deflate, or snappy. We just add 'magic', e.g. \0 \s \n \a \p \p \y, as a prefix to the column value so we can decode it when we serve it back up. Seems like
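The magic-prefix scheme described here could look roughly like the sketch below. Only the "\0snappy" prefix is taken from the message; the prefixes for deflate and uncompressed data, the encoding labels, and the function names are assumptions for illustration. Snappy decoding is left unimplemented because it is not in the Python standard library, and "deflate" bodies are assumed to be zlib-wrapped.

```python
import zlib

# Prefix per encoding. Only the snappy magic appears in the message;
# the other two are hypothetical.
MAGIC = {
    "snappy": b"\0snappy",
    "deflate": b"\0deflate",    # assumption
    "identity": b"\0raw",       # assumption, for uncompressed batches
}


def to_column_value(body: bytes, encoding: str) -> bytes:
    """Store the batch exactly as received, tagged with its encoding."""
    return MAGIC[encoding] + body


def from_column_value(value: bytes) -> bytes:
    """Strip the magic prefix and decode the batch for serving."""
    for encoding, magic in MAGIC.items():
        if value.startswith(magic):
            body = value[len(magic):]
            if encoding == "identity":
                return body
            if encoding == "deflate":
                return zlib.decompress(body)
            raise NotImplementedError("snappy needs a third-party library")
    raise ValueError("unknown magic prefix")


stored = to_column_value(zlib.compress(b"GET /index.html 200"), "deflate")
print(from_column_value(stored))
```

The write path never has to decompress: the batch is stored as received and only decoded on the way out.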

Re: Second Cassandra users survey

2011-11-06 Thread Aaron Turner
1. Basic SQL-like summary transforms for both CQL and Thrift API clients like: SUM AVG MIN MAX 2. Native 64bit UNsigned datatype 3. Add support for matching column names via LIKE (% and _ wildcards) for ascii type -- Aaron Turner http://synfin.net/ Twitter: @synfinatic

RE: Second Cassandra users survey

2011-11-06 Thread Sarah Baker
of intent? -Sarah -Original Message- From: Aaron Turner [mailto:synfina...@gmail.com] Sent: Sunday, November 06, 2011 8:25 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey 1. Basic SQL-like summary transforms for both CQL and Thrift API clients like: SUM AVG MIN MAX

Re: Second Cassandra users survey

2011-11-06 Thread Aaron Turner
external.  At its core was get and put. Did I miss something in my reading of intent? -Sarah -Original Message- From: Aaron Turner [mailto:synfina...@gmail.com] Sent: Sunday, November 06, 2011 8:25 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey 1. Basic SQL

Re: Second Cassandra users survey

2011-11-06 Thread Mohit Anchlia
-Original Message- From: Aaron Turner [mailto:synfina...@gmail.com] Sent: Sunday, November 06, 2011 8:25 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey 1. Basic SQL-like summary transforms for both CQL and Thrift API clients like: SUM AVG MIN MAX

RE: Second Cassandra users survey

2011-11-06 Thread Pierre Chalamet
- support for atomic operations or batches (if QUORUM fails, data should not be visible with ONE) zookeeper is solving that. Yeah, I can use HBase too. I might have screwed up a little bit since I didn't talk about isolation; let's reformulate: support for read committed (using DB

Re: Second Cassandra users survey

2011-11-06 Thread Ed Anuff
On Sun, Nov 6, 2011 at 12:52 AM, Radim Kolar h...@sendmail.cz wrote: - support for atomic operations or batches (if QUORUM fails, data should not be visible with ONE) zookeeper is solving that. I'd like to see official support for Zookeeper inside of Cassandra. I'd like it to be something that

Re: Second Cassandra users survey

2011-11-06 Thread Robert Jackson
On Nov 6, 2011, at 3:41 PM, Ed Anuff e...@anuff.com wrote: I'd like to see official support for Zookeeper inside of Cassandra. I'd like it to be something that can be optionally configured. I'd like to be able to make batch mutations atomic using it. Not sure how possible this is, but we are

Re: Second Cassandra users survey

2011-11-06 Thread Radim Kolar
Yeah, I can use HBase too. But why are you not using HBase if its feature set fits your needs better, and why do you want the same functionality in Cassandra? It's good that both projects are different in this area. From the rest of your post it looks like you want to have Cassandra ACID compliant, which

RE: Second Cassandra users survey

2011-11-06 Thread Pierre Chalamet
To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey Yeah, I can use HBase too. But why are you not using HBase if its feature set fits your needs better, and why do you want the same functionality in Cassandra? It's good that both projects are different in this area. From rest

RE: Second Cassandra users survey

2011-11-05 Thread Pierre Chalamet
Dear Santa, here is my wish list :) - support for atomic operations or batches (if QUORUM fails, data should not be visible with ONE) - TTL on CF, rows and counters - restart the TTL when a row, column or CF is touched - streamed data transfer (both send and receive). At least for receive

Re: Second Cassandra users survey

2011-11-05 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:50 PM, Jim Newsham jnews...@referentia.com wrote: Our use case is time-series data (such as sampled sensor data).  Each row describes a particular statistic over time, the column name is a time, and the column value is the sample.  So it makes perfect sense to want to

Re: Second Cassandra users survey

2011-11-04 Thread Jim Newsham
- Bulk column deletion by (column name) range. Without this feature, we are forced to perform a range query and iterate over all of the columns, deleting them one by one (we do this in a batch, but it's still a very slow approach). See CASSANDRA-494/3448. If anyone else has a need for
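The workaround described in this message (slice the range, then delete each column individually) can be sketched against an in-memory stand-in for a wide row. The Row class and delete_range_client_side are hypothetical and are not a Cassandra client API; they only show why the cost grows with the number of columns in the range.

```python
from bisect import bisect_left, bisect_right


class Row:
    """Toy wide row: column names kept sorted, values in a dict."""

    def __init__(self):
        self._names = []
        self._values = {}

    def insert(self, name, value):
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, end):
        """Return a copy of the column names in the inclusive range [start, end]."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, end)
        return self._names[lo:hi]

    def delete(self, name):
        if name in self._values:
            del self._values[name]
            self._names.pop(bisect_left(self._names, name))


def delete_range_client_side(row, start, end):
    """Range query first, then one delete per column, as in the workaround."""
    names = row.slice(start, end)
    for name in names:
        row.delete(name)
    return len(names)


row = Row()
for ts in range(10):            # column name = sample time, value = sample
    row.insert(ts, ts * 1.5)
removed = delete_range_client_side(row, 3, 6)
print(removed, row.slice(0, 9))
```

A server-side range delete, as requested in the message, would replace the read plus per-column deletes with a single operation.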

Re: Second Cassandra users survey

2011-11-04 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:19 PM, Jim Newsham jnews...@referentia.com wrote: - Bulk column deletion by (column name) range.  Without this feature, we are forced to perform a range query and iterate over all of the columns, deleting them one by one (we do this in a batch, but it's still a very

Re: Second Cassandra users survey

2011-11-04 Thread Jim Newsham
On 11/4/2011 4:32 PM, Brandon Williams wrote: On Fri, Nov 4, 2011 at 9:19 PM, Jim Newsham jnews...@referentia.com wrote: - Bulk column deletion by (column name) range. Without this feature, we are forced to perform a range query and iterate over all of the columns, deleting them one by one (we

Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
I'm using Cassandra as a big graph database, loading large volumes of data live and linking on the fly.  The number of edges grows geometrically with data added, and need to be read to continue linking the graph on the fly.  Consequently, my problem is constrained by:  * Predominantly read -

Re: Second Cassandra users survey

2011-11-03 Thread Radim Kolar
* Compaction is expensive Yes, it is. That's why I decided not to go with Hadoop HDFS backed by Cassandra.

Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson slatem...@yahoo.co.uk wrote: I'm using Cassandra as a big graph database, loading large volumes of data live and linking on the fly. Not sure if Cassandra is the right fit to model complex vertexes and edges. The number of edges grows geometrically

Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
; Peter Tillotson slatem...@yahoo.co.uk Sent: Thursday, 3 November 2011, 14:15 Subject: Re: Second Cassandra users survey On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson slatem...@yahoo.co.uk wrote: I'm using Cassandra as a big graph database, loading large volumes of data live and linking

Re: Second Cassandra users survey

2011-11-03 Thread Ertio Lew
Provide an option to sort columns by timestamp, i.e., in the order they have been added to the row, with the facility to use any column names. On Wed, Nov 2, 2011 at 4:29 AM, Jonathan Ellis jbel...@gmail.com wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1]

Re: Second Cassandra users survey

2011-11-03 Thread Konstantin Naryshkin
I realize that it is not realistic to expect it, but it would be good to have a Partitioner that supports both range slices and automatic load balancing. On Thu, Nov 3, 2011 at 13:57, Ertio Lew ertio...@gmail.com wrote: Provide an option to sort columns by timestamp, i.e., in the order they have

Re: Second Cassandra users survey

2011-11-03 Thread Todd Burruss
- Better performance when accessing random columns in a wide row - caching subsets of wide rows - possibly on the same boundaries as the index - some sort of notification architecture when data is inserted. This could be co-processors, triggers, plugins, etc - auto load balance when adding new nodes

Re: Second Cassandra users survey

2011-11-02 Thread Patrick Julien
- entity groups - co-processors - materialized views - CQL support directly in cassandra-cli On Tue, Nov 1, 2011 at 6:59 PM, Jonathan Ellis jbel...@gmail.com wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1]  The results [2] have been extremely useful in

Re: Second Cassandra users survey

2011-11-02 Thread Boris Yen
1. entity groups 2. cql support in cassandra-cli. 3. offset support in slice_range. 4. more sophisticated secondary index implementation. On Wed, Nov 2, 2011 at 8:38 PM, Patrick Julien pjul...@gmail.com wrote: - entity groups - co-processors - materialized views - CQL support directly in

Re: Second Cassandra users survey

2011-11-01 Thread Ramesh Natarajan
Here is my wish list - I would love Cassandra to - provide an efficient method to retrieve the count of columns for a given row without resorting to reading all columns and calculating the count for a given row key. - support auto increment column names - Column slice based query doesn't take