Re: MVCC

2009-08-03 Thread Jonathan Ellis
On Mon, Aug 3, 2009 at 3:39 PM, Ivan Chang wrote:
> Is this going to be an inherent limitation of Cassandra?

If someone writes a patch that adds multi-version support without compromising single-version performance then I don't see any reason to turn it down.

-Jonathan

Re: MVCC

2009-08-03 Thread Evan Weaver
You can support this at the domain level with custom comparators, I think. It doesn't need to be in Cassandra itself as a first-class operation. Evan On Mon, Aug 3, 2009 at 1:39 PM, Ivan Chang wrote: > Is this going to be an inherent limitation of Cassandra? > > There is no doubt many application
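
One way to read the suggestion above, as a rough sketch rather than Cassandra's actual API: keep every version as its own column, with the version number embedded in the column name, and let the column ordering group versions of the same field together. The class `VersionedRow` and its methods are hypothetical names made up for illustration; Python's tuple ordering stands in for a custom comparator.

```python
import bisect
from typing import List, Optional, Tuple


class VersionedRow:
    """Hypothetical in-memory row; NOT a Cassandra client API."""

    def __init__(self) -> None:
        # Composite column names (field, version), kept sorted.
        # Tuple ordering plays the role of the custom comparator.
        self._keys: List[Tuple[str, int]] = []
        self._values = {}

    def write(self, field: str, version: int, value: str) -> None:
        """Store a new version instead of overwriting the old one."""
        key = (field, version)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def history(self, field: str) -> List[Tuple[int, str]]:
        """Return all versions of a field, oldest first."""
        lo = bisect.bisect_left(self._keys, (field, float("-inf")))
        hi = bisect.bisect_right(self._keys, (field, float("inf")))
        return [(v, self._values[(f, v)]) for f, v in self._keys[lo:hi]]

    def latest(self, field: str) -> Optional[str]:
        """Return the newest version of a field, or None if absent."""
        versions = self.history(field)
        return versions[-1][1] if versions else None


row = VersionedRow()
row.write("balance", 2, "150")
row.write("balance", 1, "100")
row.write("owner", 1, "ivan")
print(row.history("balance"))
print(row.latest("balance"))
```

Because old versions are never overwritten, an audit can replay `history()` for each field; pruning old versions is left to the application in this sketch.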

Re: MVCC

2009-08-03 Thread Ivan Chang
Is this going to be an inherent limitation of Cassandra? There is no doubt many applications will benefit from a db with built-in support for multiple versions of the same data - features that allow reversal of operations, applications that require historical data maintained (e.g. credit/debit appli

Re: MVCC

2009-08-03 Thread Jonathan Ellis
It's not moderated (click the login link to get to a signup form). Changes are sent to the -commits list where anyone interested (like me :) can review them. -Jonathan P.S. sorry for the signup captcha questions -- someone apparently thought they were cute, but they typically take a bit of googli

Re: MVCC

2009-08-03 Thread Mark McBride
Cool. There are a few things I've found out recently that should probably go into the wiki (this, the fact that get_columns_since silently returns no results if your column family isn't ordered by time)... is it moderated at all? Should I run changes by the mailing list? On Mon, Aug 3, 2009 at 1

Re: MVCC

2009-08-03 Thread Jonathan Ellis
On Mon, Aug 3, 2009 at 12:12 PM, Mark McBride wrote:
> Thanks, that makes sense.  Is it an ok general rule that the
> timestamps should be set to
>
> 1) The time that the data to be mutated was generated
> 2) The current system time if the time the data was mutated isn't available

Yes.

> Looking

Re: MVCC

2009-08-03 Thread Jonathan Ellis
Strictly speaking, no; timestamp is client-provided. But in the sense that "you'd better use ntpd on your clients," yes.

On Mon, Aug 3, 2009 at 12:10 PM, Wilson Mar wrote:
> So if different servers are not synchronized in time (to a Tier 1 time
> server), then updates from slower server will not

Re: MVCC

2009-08-03 Thread Mark McBride
Thanks, that makes sense. Is it an ok general rule that the timestamps should be set to 1) The time that the data to be mutated was generated 2) The current system time if the time the data was mutated isn't available Looking around at code it seems like time 0 is used a lot, which seems pretty

Re: MVCC

2009-08-03 Thread Wilson Mar
So if different servers are not synchronized in time (to a Tier 1 time server), then updates from slower server will not be updated on faster servers?

Re: MVCC

2009-08-03 Thread Jonathan Ellis
It's there for the same reason as the other timestamps: it lets cassandra ignore obsolete operations. So if you do a delete at time X and an insert at time Y where X < Y, the insert will not be deleted by mistake even if a node is down temporarily and gets the delete later. -Jonathan On Mon, Aug
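
The delete-at-X, insert-at-Y case above can be illustrated with a minimal sketch. It assumes a simple "highest timestamp wins" rule per column and models a delete as a timestamped marker; the `Replica` and `Cell` classes are hypothetical and are not Cassandra's internals. In particular, keeping the existing cell on a timestamp tie is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str]  # None marks a delete


class Replica:
    """Hypothetical single-node store; illustration only."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}

    def _apply(self, column: str, timestamp: int, value: Optional[str]) -> None:
        current = self.cells.get(column)
        # Ignore the operation if something newer was already applied.
        if current is None or timestamp > current.timestamp:
            self.cells[column] = Cell(timestamp, value)

    def insert(self, column: str, value: str, timestamp: int) -> None:
        self._apply(column, timestamp, value)

    def remove(self, column: str, timestamp: int) -> None:
        self._apply(column, timestamp, None)

    def get(self, column: str) -> Optional[str]:
        cell = self.cells.get(column)
        return None if cell is None else cell.value


# A node that was down sees the insert (time Y=20) first and only
# later receives the older delete (time X=10).
late_node = Replica()
late_node.insert("name", "alice", timestamp=20)
late_node.remove("name", timestamp=10)
print(late_node.get("name"))
```

Since the timestamps are supplied by clients, the outcome depends on client clocks being reasonably in sync, which is the point of the ntpd advice elsewhere in this thread.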

Re: MVCC

2009-08-03 Thread Mark McBride
If this is the case, what does the timestamp passed in to the remove call do? I assumed you had to have it match up with a specific version... On Mon, Aug 3, 2009 at 9:53 AM, wrote: > I always thought cassandra had free multiple versions and we needed to > manually delete the older versions > >

Re: MVCC

2009-08-03 Thread mobiledreamers
I always thought cassandra had free multiple versions and we needed to manually delete the older versions On Mon, Aug 3, 2009 at 8:56 AM, Jonathan Ellis wrote: > On Mon, Aug 3, 2009 at 10:49 AM, Jun Rao wrote: > > Ivan, > > > > The original cassandra keeps multiple versions of the column data. >

Re: MVCC

2009-08-03 Thread Jonathan Ellis
On Mon, Aug 3, 2009 at 10:49 AM, Jun Rao wrote:
> Ivan,
>
> The original cassandra keeps multiple versions of the column data.

No, it didn't. (It had versioning-related bugs but multiple versions a la Bigtable was never part of the design.)

-Jonathan

Re: MVCC

2009-08-03 Thread Chris Goffinet
How was it used in the original? On Aug 3, 2009, at 8:49 AM, Jun Rao wrote: Ivan, The original cassandra keeps multiple versions of the column data. However, that support has been removed in the apache code. Right now, only the latest version is kept. In the future, we could add the vers

Re: MVCC

2009-08-03 Thread Jun Rao
Ivan, The original cassandra keeps multiple versions of the column data. However, that support has been removed in the apache code. Right now, only the latest version is kept. In the future, we could add the versioning support back. Jun IBM Almaden Research Center K55/B1, 650 Harry Road, San Jos

MVCC

2009-08-03 Thread Ivan Chang
Does Cassandra support MVCC? I am building an application with concurrent updates (add, update, delete) and one of the requirements is to be able to run audits that reproduce all the update histories and the data objects in different versions. What's the best way to go about this in Cassandra? A