from:"Jeffrey Kesselman"

Not quite, it's more limited and specific.

The order of operations is all within the Cassandra node server and looks
like this...

We have one row, A. That's the only row being operated on.

Client - submits A'
Server does the following:
(1) Validate function reads current A
(2) Validate function validates A' vs. A
(3) If validation succeeds, allows update to A'.

My fear/concern is that after (1) and before (3), a second update, A'', comes
in and changes the current value of A, therefore invalidating my
validation check, see?

If Cassandra does not guard against this then one possible solution would be
to make my own key-to-mutex map in memory, lock the mutex for A's key as a
precursor to (1) and release it in a post-update function. But I am always
very nervous about inserting locking into a process that wasn't designed
with it already in mind...
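
A minimal sketch of the key-to-mutex idea described above, assuming a single JVM and a toy in-memory store standing in for the row. The names KeyLocks, withLock and validatedUpdate, and the "must increase by exactly 1" rule, are all hypothetical; none of this is a Cassandra API, and it would not coordinate writers on other nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper: one lock per row key, held across read-validate-write.
// It only serializes callers inside this JVM.
final class KeyLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withLock(String rowKey, Supplier<T> action) {
        // computeIfAbsent guarantees all callers for a key share the same lock object.
        ReentrantLock lock = locks.computeIfAbsent(rowKey, k -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}

final class KeyLocksDemo {
    // Toy stand-in for the stored row value (assumption, not Cassandra storage).
    static final ConcurrentHashMap<String, Integer> store = new ConcurrentHashMap<>();
    static final KeyLocks keyLocks = new KeyLocks();

    // Made-up validation rule: accept only an update that is current + 1.
    static boolean validatedUpdate(String key, int proposed) {
        return keyLocks.withLock(key, () -> {
            int current = store.getOrDefault(key, 0);   // (1) read current A
            if (proposed != current + 1) return false;  // (2) validate A' vs. A
            store.put(key, proposed);                   // (3) allow update to A'
            return true;
        });
    }

    public static void main(String[] args) {
        System.out.println(validatedUpdate("A", 1));
        System.out.println(validatedUpdate("A", 1));
    }
}
```

Because the lock is held from step (1) through step (3), a second update to the same key has to wait and is then validated against the new value. The lock map in this sketch never shrinks, so a real version would need some eviction policy.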

On Fri, Jul 8, 2011 at 8:30 AM, William Oberman ober...@civicscience.com wrote:

Questions like this seem to come up a lot:

http://stackoverflow.com/questions/6033888/cassandra-atomicity-isolation-of-column-updates-on-a-single-row-on-on-single-no

http://stackoverflow.com/questions/2055037/cassandra-atomic-reads-writes-within-a-single-columnfamily
http://www.mail-archive.com/user@cassandra.apache.org/msg14701.html

Let's say you read state A (from one key in one CF), you change the data to
A' in your client, and you write A'. Are you worried that someone else
might have changed A to B during this process (making the new state a race
between A' and B)? It doesn't sound to me like you are... It sounds to me
like you're worried about a set of columns for the key being in a consistent
state before, during, and after a process. And A -> A' and A -> B will each
be atomic for the key (based on my understanding). But if A' and B are
changes to different sets of columns, I believe they would interleave,
which itself could be inconsistent from your application's point of view.

will

On Thu, Jul 7, 2011 at 11:41 PM, Jeffrey Kesselman jef...@gmail.com wrote:

Really, as I lay in the bath thinking about it, I concluded that what I am
looking for is a very limited form of consistency.

It's consistency over a single row on a single node, just for the period of
the update.

On Thu, Jul 7, 2011 at 10:34 PM, Jeffrey Kesselman jef...@gmail.com wrote:

It's not really isolation, btw, because we aren't talking about anyone
seeing an update mid-update. Rather, we are talking about when updates are
allowed to occur.

Atomicity means that all the updates happen together or they don't happen
at all.
Isolation means that no results of the update are visible until the
entire update operation is complete.

This really lies somewhere in the middle of the two concepts. It's part
of the results of the combined effects of ACID.

On Thu, Jul 7, 2011 at 10:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

Sounds to me like you're confusing atomicity with isolation.

On Thu, Jul 7, 2011 at 2:54 PM, Jeffrey Kesselman jef...@gmail.com wrote:
Yup, I'm even more confused. Let's talk about the model, not the
implementation.
AIUI updates to a row are atomic across all columns in that row at once,
true?
If true, then the next question is: does the validation happen inside or
outside of that guarantee, and is the row guaranteed not to change between
validation and update?
If that is *not* the case, then it makes a whole class of solutions to
synchronization problems fail and puts my larger project in serious
question.

On Thu, Jul 7, 2011 at 3:43 PM, Yang tedd...@gmail.com wrote:

No, the memtable is a ConcurrentSkipListMap.

Insertion can happen in parallel.
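
As a small illustration of that point, here is a sketch of several threads inserting into one java.util.concurrent.ConcurrentSkipListMap with no external lock. The class name ParallelInsertDemo, the key format and the thread counts are made up for the example; it says nothing about how Cassandra's memtable code is actually organized.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ParallelInsertDemo {
    // Fills a sorted concurrent map from several threads at once.
    static ConcurrentSkipListMap<String, String> fill(int threads, int perThread) {
        ConcurrentSkipListMap<String, String> memtable = new ConcurrentSkipListMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    // put() is thread-safe; no writer blocks the whole map.
                    memtable.put("row-" + id + "-" + i, "value");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return memtable;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> m = fill(4, 1000);
        System.out.println(m.size() + " entries, first key " + m.firstKey());
    }
}
```

Each put is safe on its own, but nothing here makes a read followed by a put on the same key atomic, which is the gap discussed in this thread.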

On Jul 7, 2011 9:24 AM, Jeffrey Kesselman jef...@gmail.com wrote:
This has me more confused.

Does this mean that ALL rows on a given node are only updated sequentially,
never in parallel?

On Thu, Jul 7, 2011 at 3:21 PM, Yang tedd...@gmail.com wrote:

Just to add onto what Jonathan said:

The columns are immutable. If you overwrite/reconcile, a new object is
created and shoved into the memtable.

There is a shared lock for all writes, though, which guards against an
exclusive lock on memtable switching/flushing.
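
A toy model of the two ideas above, assuming "reconcile" means the column with the higher timestamp wins (an assumption for this sketch, not a statement of Cassandra's exact rule). ToyMemtableSwitch, Column and switchMemtable are hypothetical names. Writers share the read side of a ReentrantReadWriteLock, and the memtable switch takes the write side.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ToyMemtableSwitch {
    // Immutable column: an overwrite never edits an existing object.
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        // Returns the winner; assumed rule: higher timestamp wins, ties keep the existing one.
        Column reconcile(Column other) {
            return other.timestamp > timestamp ? other : this;
        }
    }

    private final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
    private volatile ConcurrentSkipListMap<String, Column> memtable = new ConcurrentSkipListMap<>();

    void write(String name, Column incoming) {
        switchLock.readLock().lock();          // shared: many writers may hold it together
        try {
            memtable.merge(name, incoming, Column::reconcile);
        } finally {
            switchLock.readLock().unlock();
        }
    }

    ConcurrentSkipListMap<String, Column> switchMemtable() {
        switchLock.writeLock().lock();         // exclusive: waits for in-flight writers
        try {
            ConcurrentSkipListMap<String, Column> old = memtable;
            memtable = new ConcurrentSkipListMap<>();
            return old;                        // the old table would be flushed elsewhere
        } finally {
            switchLock.writeLock().unlock();
        }
    }

    // Reads take no lock in this toy model.
    String read(String name) {
        Column c = memtable.get(name);
        return c == null ? null : c.value;
    }

    public static void main(String[] args) {
        ToyMemtableSwitch m = new ToyMemtableSwitch();
        m.write("c", new Column("new", 2));
        m.write("c", new Column("old", 1));
        System.out.println(m.read("c"));
    }
}
```

The shared lock lets writes to different (or the same) columns proceed in parallel; it only keeps a write from landing in a memtable that is being swapped out.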
On Jul 7, 2011 7:51 AM, A J s5a...@gmail.com wrote:
Does a write lock:
1. Just the columns in question for the specific row in question?
2. The full row in question?
3. The full CF?

I doubt a read takes any locks.

Thanks.

--
It's always darkest just before you are eaten by a grue.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: What does a write lock ?

I am confused by what you mean by Cassandra client code. Is this part of
the Cassandra server?

My architecture is that my user talks Thrift to Cassandra.

Re: What does a write lock ?

Where does a custom validation method run?

Given that it is validating a row update, my assumption was that it ran on
the node that owns the row. That would make sense to me, as it would
fulfill the NoSQL philosophy of taking computation to data, rather than data
to computation.

I don't follow the relevance of the rest of your comment, sorry.

On Fri, Jul 8, 2011 at 10:34 AM, Jonathan Ellis jbel...@gmail.com wrote:

It doesn't look like that at all.

Row A exists.

Client submits mutation Am. This is not necessarily a full row.

Coordinator validates Am.

If validation succeeds, coordinator sends Am to the replica owners,
effectively creating A'.

Neither A nor A' is ever explicitly assembled on the write path.
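
The sequence above can be modelled very roughly as follows. This is a toy sketch, not Cassandra code: WritePathSketch, validate and submit are hypothetical names, the "no empty values" rule is invented, and a plain Map stands in for a replica's stored columns. The point it illustrates is that validation only ever sees the mutation Am, never the stored row.

```java
import java.util.HashMap;
import java.util.Map;

final class WritePathSketch {
    // Validation looks at the mutation alone (toy rule: no null or empty values).
    static boolean validate(Map<String, String> mutation) {
        for (String v : mutation.values()) {
            if (v == null || v.isEmpty()) return false;
        }
        return true;
    }

    // Applies a partial mutation column by column; no "full row" A or A' is built first.
    static boolean submit(Map<String, String> replicaColumns, Map<String, String> mutation) {
        if (!validate(mutation)) return false;   // rejected before it reaches storage
        replicaColumns.putAll(mutation);         // untouched columns keep their old values
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("a", "1");
        row.put("b", "2");

        Map<String, String> am = new HashMap<>();
        am.put("b", "3");                        // mutation touches only column b

        System.out.println(submit(row, am) + " " + row);
    }
}
```

In this model a validator that wanted to compare against the current row would have to read it separately, and nothing ties that read to the later apply step.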


Re: What does a write lock ?

Alright,

So are you saying the column validator, as specified by
conf/storage-conf.xml, is checked in the client interface library and not
on the server side? That seems odd to me on a number of levels, not the
least being I can't see how Thrift could autogenerate that for different
languages or how those other languages would use a Java class.

On Fri, Jul 8, 2011 at 11:13 AM, William Oberman ober...@civicscience.com wrote:

 I use a language specific wrapper around thrift as my client, but yes, I
 guess I fundamentally mean thrift == client, and the cassandra server ==
 server.

 will



Re: What does a write lock ?

Hmm.

Thanks Nate.

I need to think about this and our data store design some. In general I
dislike architectures with large numbers of independent servers; I think they
invite communication latencies and partial failures into the mix. But I'll
cogitate some.

On Fri, Jul 8, 2011 at 12:21 PM, Nate McCall n...@datastax.com wrote:

 Validation occurs at the API level, returning an
 InvalidRequestException to the caller of the API (a thrift client in
 this case). Specifically, a mutation will not be scheduled for the
 storage until it has been validated at the API level.

 If the intention is to do a read-before-write validation as an
 AbstractType extension, then yes, the underlying value could indeed
 change between validation and storage. If this is the goal, you need
 to implement locking externally (via ZooKeeper or similar, as previously
 mentioned).

 On Fri, Jul 8, 2011 at 10:21 AM, William Oberman ober...@civicscience.com wrote:
  I haven't ever written my own org.apache.cassandra.db.marshal.AbstractType
  (which is, I think, what you're talking about), so I have no idea.

  Looking up the JavaDoc for that class, validate says "validate that the byte
  array is a valid sequence for the type we are supposed to be comparing",
  which sounds like a local operation to me (e.g. it shouldn't fetch remote
  data, it's just saying "yep, this is a valid member of type T").
 
  will
 

Re: What does a write lock ?

I should add, Nate, that the intention is to do a read-before-write
validation and have that occur as close to the data as possible.

If there is a better hook to implement it on, I'd love a pointer to it.

JK


Re: What does a write lock ?

Hi Jonathan,

This brings up an important question. I have been assuming that the
validation check is part of the atomic update operation. Is this NOT the
case? Which is to say, can the row be changed between the time the
validation method is executed and the time the validated data is written?

The reason I ask is that I have been thinking of this as effectively a
write-lock on the row during the entire update process, including
validation, but your answer has caused some concern that this is wrong...

Thanks

JK

On Thu, Jul 7, 2011 at 2:01 PM, Jonathan Ellis jbel...@gmail.com wrote:

 None of the above, it uses a glorified CAS* at the column level

 *http://en.wikipedia.org/wiki/Compare-and-swap
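
 For readers unfamiliar with compare-and-swap, here is a generic sketch using java.util.concurrent.atomic.AtomicReference. It only illustrates the CAS retry pattern on an in-memory cell; CasValidatedUpdate, update and the "must increase" rule are hypothetical, and this is not an interface that Cassandra exposes to clients or validators.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiPredicate;

final class CasValidatedUpdate {
    // Validate against the value we read, then swap only if nobody changed it meanwhile.
    static <T> boolean update(AtomicReference<T> cell, T proposed, BiPredicate<T, T> isValid) {
        while (true) {
            T current = cell.get();
            if (!isValid.test(current, proposed)) {
                return false;                       // validation failed against the latest value
            }
            if (cell.compareAndSet(current, proposed)) {
                return true;                        // no concurrent change since our read
            }
            // Another writer won the race: loop to re-read and re-validate.
        }
    }

    public static void main(String[] args) {
        AtomicReference<Integer> cell = new AtomicReference<>(5);
        // Made-up rule: a new value is valid only if it is larger than the current one.
        System.out.println(update(cell, 7, (cur, next) -> next > cur));
        System.out.println(update(cell, 3, (cur, next) -> next > cur));
        System.out.println(cell.get());
    }
}
```

 No lock is held here; instead, a write that loses the race is re-validated against whatever value won.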


Re: What does a write lock ?

The old row is accessible and my validation requires a comparison of the
two.

JK

On Thu, Jul 7, 2011 at 3:23 PM, Yang tedd...@gmail.com wrote:

 Validation is on the new incoming column, not the old row, right?

Re: What does a write lock ?

Yup, I'm even more confused. Let's talk about the model, not the
implementation.

AIUI updates to a row are atomic across all columns in that row at once,
true?

If true, then the next question is: does the validation happen inside or
outside of that guarantee, and is the row guaranteed not to change between
validation and update?

If that is *not* the case, then it makes a whole class of solutions to
synchronization problems fail and puts my larger project in serious
question.

On Thu, Jul 7, 2011 at 3:43 PM, Yang tedd...@gmail.com wrote:

 No, the memtable is a ConcurrentSkipListMap.

 Insertion can happen in parallel.

Re: What does a write lock ?

Yup, I'm even more confused. Let's talk about the model, not the
implementation.

AIUI updates to a single row are atomic across all columns in that row at
once, true?

If true, then the next question is: does the validation happen inside or
outside of that guarantee, and is the row guaranteed not to change
underneath the validation call?


Re: What does a write lock ?

Not confusing, but assuming a few things.

I made a more detailed post in the DataStax forums.

On Thu, Jul 7, 2011 at 10:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Sounds to me like you're confusing atomicity with isolation.


Re: What does a write lock ?

It's not really isolation, btw, because we aren't talking about anyone seeing
an update mid-update. Rather, we are talking about when updates are
allowed to occur.

Atomicity means that all the updates happen together or they don't happen at
all.
Isolation means that no results of the update are visible until the entire
update operation is complete.

This really lies somewhere in the middle of the two concepts. It's part of
the results of the combined effects of ACID.


On Thu, Jul 7, 2011 at 10:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Sounds to me like you're confusing atomicity with isolation.


Re: What does a write lock ?

Really, as I lay in the bath thinking about it, I concluded that what I am
looking for is a very limited form of consistency.

It's consistency over a single row on a single node, just for the period of
the update.


Re: faster ByteBuffer comparison

2011-07-02 Thread Jeffrey Kesselman

getLong still has to fetch the value a byte at a time to support endianness.

I'd have to think about it, but what you really want is to get it all
into a byte array and then process it 64 bits at a time. AIR there are some
new array recasting things in Java 5+. I'll need to go look at them
more closely...
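
One way to sketch the "64 bits at a time" idea is below. It is an illustration under assumptions, not Cassandra's ByteBufferUtil: WideCompare is a hypothetical name, Long.compareUnsigned needs Java 8 or later, and whether this beats a byte loop would have to be measured (the report quoted below found getLong slower). It relies on big-endian reads, where comparing two longs as unsigned gives the same order as comparing their eight bytes one by one.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class WideCompare {
    // Unsigned lexicographic comparison, 8 bytes per step, then a byte-wise tail.
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        // Duplicates leave the callers' position, limit and byte order untouched.
        ByteBuffer x = a.duplicate().order(ByteOrder.BIG_ENDIAN);
        ByteBuffer y = b.duplicate().order(ByteOrder.BIG_ENDIAN);
        int common = Math.min(x.remaining(), y.remaining());
        int i = 0;
        for (; i + 8 <= common; i += 8) {
            long lx = x.getLong(x.position() + i);   // absolute read, 8 bytes
            long ly = y.getLong(y.position() + i);
            if (lx != ly) {
                return Long.compareUnsigned(lx, ly);
            }
        }
        for (; i < common; i++) {
            int bx = x.get(x.position() + i) & 0xFF; // mask to treat the byte as unsigned
            int by = y.get(y.position() + i) & 0xFF;
            if (bx != by) {
                return bx - by;
            }
        }
        // All shared bytes equal: the shorter buffer sorts first.
        return x.remaining() - y.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer p = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9});
        ByteBuffer q = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 10});
        System.out.println(compareUnsigned(p, q) < 0);
    }
}
```

Only the sign of the result is meaningful, as with Comparable.compareTo.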

On Fri, Jul 1, 2011 at 5:42 PM, Yang tedd...@gmail.com wrote:
 I can see from profiling that a lot of the time in both reading and writing
 is spent on ByteBuffer comparisons of the column names (for long rows with many
 columns).
 I looked at ByteBufferUtil.unsignedCompareByteBuffer(); it's basically
 the same structure as the standard JVM ByteBuffer.compareTo(),
 looping over each byte doing a ByteBuffer.get().
 Is there a faster (probably hardware-based) compare? I tried doing 8 bytes
 at a time with getLong() and it actually seems slower.
 Thanks
 Yang




Re: faster ByteBuffer comparison

2011-07-02 Thread Jeffrey Kesselman

I'd fetch it all at once into a single byte array and try Arrays.equals()
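
A short sketch of that suggestion, with hypothetical names (BulkEquals, toArray, sameBytes). Note the assumption built in: Arrays.equals answers only "same bytes or not", so it could stand in for an equality check but not for the ordered comparison used to sort column names, and the copies cost an allocation per call.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class BulkEquals {
    // Copies the remaining bytes out in one bulk get().
    static byte[] toArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);   // duplicate so the caller's position is not advanced
        return out;
    }

    static boolean sameBytes(ByteBuffer a, ByteBuffer b) {
        return Arrays.equals(toArray(a), toArray(b));
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.wrap(new byte[] {1, 2, 3});
        ByteBuffer b = ByteBuffer.wrap(new byte[] {1, 2, 3});
        System.out.println(sameBytes(a, b));
    }
}
```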


Re: Cassandra Clients for Java

2011-06-17 Thread Jeffrey Kesselman

I'm using Hector. AFAIK it's the only one that supports failover today.

On Fri, Jun 17, 2011 at 6:02 PM, Daniel Colchete d...@cloud3.tc wrote:
 Good day everyone!
 I'm getting started with a new project and I'm thinking about using
 Cassandra because of its distributed quality and because of its performance.
 I'm using Java on the back-end. There are many many things being said about
 the Java high level clients for Cassandra on the web. To be frank, I see
 problems with all of the java clients. For example, Hector and Scale7-pelops
 have new semantics on them that are neither Java's or Cassandra's, and I
 don't see much gain from it apart from the fact that it is more complex.
 Also, I was hoping to go with something that was annotation based so that it
 wouldn't be necessary to write boilerplate code (again, no gain).
 Demoiselle Cassandra seems to be one option but I couldn't find a download
 for it. I'm new to Java in the back-end and I find that maven is too much to
 learn just because of a client library. Also it seems to be hard to
 integrate with the other things I use on my project (GWT, GWT-platform,
 Google Eclipse Plugin).
 Kundera looks great but besides not having a download link (Google site link
 to Github, that links to Google site, but no download) its information is
 partitioned on many blog posts, some of them saying things I couldn't find
 on its website. One says it uses Lucandra for indexes but that is the only
 place talking about it, no documentation about using it. It doesn't seem to
 support Cassandra 0.8 also. Does it?
 I would like to hear from the users here what worked for you guys. Some real
 world project in production that was good to write in Java, where the client
 was stable and is maintained. What are the success stories of using
 Cassandra with Java. What would you recommend?
 Thank you very much!
 Best,
 --
 Dani
 Cloud3 Tech - http://cloud3.tc/
 Twitter: @DaniCloud3 @Cloud3Tech




-- 
It's always darkest just before you are eaten by a grue.

Re: New web client future API

2011-06-15 Thread Jeffrey Kesselman

Correct me if I'm wrong, but AFAIK Hector is the only higher-level
API I would consider 'complete' right now, with support for things
like fail-over.

I notice in the latest Hector build he is starting to add CQL support,
so that's what I'm sticking with.  When he has CQL support done I'll
decide if I want to use it or stick with the programmatic API.

On Wed, Jun 15, 2011 at 10:35 AM, Victor Kabdebon
victor.kabde...@gmail.com wrote:
 Ok, thanks for the update. I thought the query string was translated to
 Thrift, then sent to a server.

 Victor Kabdebon

 2011/6/15 Eric Evans eev...@rackspace.com

 On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote:
  Actually from what I understood (please correct me if I am wrong) CQL
  is based on Thrift / Avro.

 In this project, we tend to use the word "Thrift" as a sort of shorthand
 for Cassandra's RPC interface, and not "the serialization and RPC
 framework from the Apache Thrift project".

 CQL does not (yet) have its own networking protocol, so it uses Thrift
 as a means of delivering queries, and serializing the results, but it is
 *not* a wrapper around the existing RPC methods.  The query string you
 provide is parsed entirely on the server.

 --
 Eric Evans
 eev...@rackspace.com






-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Jeffrey Kesselman

/CASSANDRA-2521


 Regards,
 Shotaro


 On Fri, May 27, 2011 at 11:27 AM, Jeffrey Kesselman jef...@gmail.com
 wrote:
  I'm also not sure that will guarantee all space is cleaned up.  It
  really depends on what you are doing inside Cassandra.  If you have
  your own garbage collection that is just in some way tied to the gc run,
  then it will run when it runs.
 
  If, otoh, you are associating records in your storage with specific
  objects in memory and using one of the post-mortem hooks (finalize or
  PhantomReference) to tell you to clean up that particular record, then
  it's quite possible they won't all get cleaned up.  In general Hotspot
  does not find and clean every candidate object on every GC run.  It
  starts with the easiest/fastest to find and then sees what more it
  thinks it needs to do to create enough memory for anticipated near
  future needs.
 
  On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
  In summary, System.gc works fine unless you've deliberately done
  something like setting the -XX:+DisableExplicitGC flag.
 
  On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
  konstant...@a-bb.net wrote:
  So, in summary, there is no way to predictably and efficiently tell
  Cassandra to get rid of all of the extra space it is using on disk?
 
  - Original Message -
  From: Jeffrey Kesselman jef...@gmail.com
  To: user@cassandra.apache.org
  Sent: Thursday, May 26, 2011 8:57:49 PM
  Subject: Re: Forcing Cassandra to free up some space
 
  Which JVM?  Which collector?  There have been and continue to be many.
 
  Hotspot itself supports a number of different collectors with
  different behaviors.   Many of them do not collect every candidate on
  every gc, but merely the easiest ones to find.  This is why depending
  on finalizers is a *bad* idea in Java code.  They may well never get
  run.  (Finalizer is one of a few features the Sun Java team always
  regretted putting in Java to start with.  It has caused quite a few
  application problems over the years.)
 
  The really important thing is that NONE of these behaviors of the
  collectors are guaranteed by specification not to change from version
  to version.  Basing your code on non-specified behaviors is a good way
  to hit mysterious failures on updates.
 
  For instance, in the mid-90s, IBM had a mode of their VM called
  "infinite heap".  It *never* garbage collected, even if you called
  System.gc.  Instead it just threw away address space and counted on
  the total memory needs for the life of the program being less than the
  total addressable space of the processor.
 
  It was *very* fast for certain kinds of applications.
 
  Far from being pedantic, not depending on undocumented behavior is
  simply good engineering.
 
 
  On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
  I've read the relevant source. While you're pedantically correct re
  the spec, you're wrong as to what the JVM actually does.
 
  On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman jef...@gmail.com
  wrote:
  Some references...
 
  An object enters an unreachable state when no more strong references
  to it exist. When an object is unreachable, it is a candidate for
  collection. Note the wording: Just because an object is a candidate
  for collection doesn't mean it will be immediately collected. The JVM
  is free to delay collection until there is an immediate need for the
  memory being consumed by the object.
 
 
  http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
 
  and Calling the gc method suggests that the Java Virtual Machine
  expend effort toward recycling unused objects
 
 
  http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
 
  It goes on to say that the VM will make a best effort, but best
  effort is *deliberately* left up to the definition of the gc
  implementor.
 
  I guess you missed the many lectures I have given on this subject over
  the years at Java One Conferences
 
  On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
  It's a common misunderstanding that System.gc is only a suggestion; on
  any VM you're likely to run Cassandra on, System.gc will actually
  invoke a full collection.
 
  On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman
  jef...@gmail.com wrote:
  Actually this is no guarantee.  It's a common misunderstanding that
  System.gc forces gc.  It does not. It is a suggestion only. The VM always
  has the option as to when and how much it gcs.
 
  On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
 
 
 
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of DataStax, the source for professional Cassandra
  support
  http://www.datastax.com
 
 
 
 
  --
  It's always darkest just before you are eaten by a grue.
 
 
 
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of DataStax, the source for professional Cassandra

Re: nosql yes but yescql, no?

2011-06-08 Thread Jeffrey Kesselman

While I agree the Thrift API sucks, I'd love to see that solved on a
binary level, and CQL on top of that.

JK

On Wed, Jun 8, 2011 at 2:50 PM, Marcos Ortiz mlor...@uci.cu wrote:
 On 06/08/2011 01:23 PM, SriSatish Ambati wrote:

 Gotta love, Eric!
 http://www.slideshare.net/jericevans/nosql-yes-but-yescql-no

 --
 SriSatish Ambati
 Director of Engineering, DataStax
 @srisatish




 Good resource.
 Thanks for sharing it with us, SriSatish.

 Regards

 --
 Marcos Luís Ortíz Valmaseda
  Software Engineer (UCI)
  http://marcosluis2186.posterous.com
  http://twitter.com/marcosluis2186





-- 
It's always darkest just before you are eaten by a grue.

Re: nosql yes but yescql, no?

2011-06-08 Thread Jeffrey Kesselman

That makes sense :)

On Wed, Jun 8, 2011 at 2:37 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:
 I think that's partly the idea of it.  CQL could end up being a way forward 
 and it currently builds on thrift.  Then if it becomes the API/client of 
 record to build on, then it could move to something else underneath that's 
 more efficient and CQL itself wouldn't have to change at all.

 On Jun 8, 2011, at 1:29 PM, Jeffrey Kesselman wrote:

 While I agree the Thrift API sucks, I'd love to see that solved on a
 binary level, and CQL on top of that.

 JK

 On Wed, Jun 8, 2011 at 2:50 PM, Marcos Ortiz mlor...@uci.cu wrote:
 On 06/08/2011 01:23 PM, SriSatish Ambati wrote:

 Gotta love, Eric!
 http://www.slideshare.net/jericevans/nosql-yes-but-yescql-no

 --
 SriSatish Ambati
 Director of Engineering, DataStax
 @srisatish




 Good resource.
 Thanks for sharing it with us, SriSatish.

 Regards

 --
 Marcos Luís Ortíz Valmaseda
 Software Engineer (UCI)
 http://marcosluis2186.posterous.com
 http://twitter.com/marcosluis2186





 --
 It's always darkest just before you are eaten by a grue.





-- 
It's always darkest just before you are eaten by a grue.

Re: CQL How to do

2011-06-05 Thread Jeffrey Kesselman

Fair enough.

I do have to keep reminding myself that a REST interface requires text.
And it does make more sense, at least, when coming from a human as
opposed to when you make a computer spend cycles converting binary to
text just so another computer can spend cycles turning it back again.
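
To make the "binary to text and back again" point concrete, here is a
tiny sketch. The class and method names are invented for this example,
and neither encoding is the actual CQL or Thrift wire format; it only
shows the same long making the round trip both ways.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not how CQL or Thrift actually serialize values.
final class WireFormats {
    /** Text form: the decimal digits as ASCII, one byte per digit. */
    static byte[] toText(long value) {
        return Long.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    /** The receiver has to parse the digits back into a number. */
    static long fromText(byte[] wire) {
        return Long.parseLong(new String(wire, StandardCharsets.US_ASCII));
    }

    /** Binary form: always 8 bytes, big-endian (ByteBuffer's default order). */
    static byte[] toBinary(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** The receiver just reads the 8 bytes back. */
    static long fromBinary(byte[] wire) {
        return ByteBuffer.wrap(wire).getLong();
    }
}
```

A millisecond timestamp such as 1307318460000L takes 13 bytes as text
and 8 as binary, and the text path adds a format and a parse step. How
much that matters in practice is exactly what is being debated here.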

On Sun, Jun 5, 2011 at 8:01 PM, aaron morton aa...@thelastpickle.com wrote:
 From what I've seen of CQL there is no comparison between the potential 
 complexity of a CQL statement and that of a SQL statement. IMHO CQL is more 
 or less a human readable form of the current API, it does not add features. 
 SQL statements are arbitrarily complex and may generate many possible query 
 plans which need to be somehow compared and optimised.

 I'll add another pro to using CQL: it will be a lot easier for people to
 describe a query they have sent to the server. It will make helping
 people who use different languages a bit easier if they can grab a log
 record and post the query they sent.

 I'm keen to see how it goes.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6 Jun 2011, at 03:27, Eric Evans wrote:

 On Sun, 2011-06-05 at 00:51 -0400, Jeffrey Kesselman wrote:
 Is CQL really the path for the future for Cassandra?

 CQL is no more or less official than the Thrift interface, and TTBMK,
 there is no secret cabal that met to decide it would be The Way.  People
 will use what works best for them, and if a de facto standard emerges
 (it usually does), then so much the better.

 It seems to me by introducing a textual language that has to be parsed
 and understood, you are adding back in some of the inefficiency of
 SQL...

 I think this inefficiency remains to be proven.  Or, if it is less
 efficient, whether it is enough so to warrant a discussion.

 No matter what the technology, you have to have a client send a query to
 the server structured in some way.  Once received, the server has to
 parse that structure before it can act.  What is different, is that
 CQL structures the query in a human-readable string, and Thrift
 structures it as a hierarchy of records serialized to binary.

 It may be true that CQL parsing has higher overhead (Thrift does more
 object creation and is likely worse on gc), but Cassandra nodes are
 typically limited by disk IO and have loads of idle processor time.  I
 might be biased, but I think it is easy to justify considering how much
 easier CQL makes things.

 2011/6/4 aaron morton aa...@thelastpickle.com:
 May be wrong, but as far as I know Thrift is still the official API, for now.
 CQL is in its first release and still has a few things to be added to
 it: https://issues.apache.org/jira/browse/CASSANDRA-2472 . That said, jump in
 and try it out :)
 The best documentation I can point you to is
 https://github.com/apache/cassandra/blob/cassandra-0.8.0/doc/cql/CQL.textile
 There are Java,  Python and Twisted Python drivers in the source tree under
 the drivers/ directory.
 Hope that helps.
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 5 Jun 2011, at 04:16, Yonder wrote:

 Hi,

 In Cassandra 0.8, CQL became the primary client interface, but I don't know
 how to use it in a non-command-line environment. I could not find any how-to
 docs in the Wiki or on DataStax's website.

 --
 Eric Evans
 eev...@rackspace.com






-- 
It's always darkest just before you are eaten by a grue.

Re: JRockit

2011-06-01 Thread Jeffrey Kesselman

Well, my information is old...

But back in the heyday of VMs, JRockit really only had one specific
area of performance advantage, which was in message passing, and all
their benchmarks were tweaked to play to that.

I'd say it's no coincidence that Oracle made this free shortly
after they acquired Hotspot and the Hotspot team.  This looks like an
EOL to me.

On Wed, Jun 1, 2011 at 6:34 AM, Daniel Doubleday
daniel.double...@gmx.net wrote:
 Hi all

 now that JRockit is available for free and the claims are there that it has 
 better performance and gc I wanted to know if anybody out here has done any 
 testing / benchmarking yet.
 Also interested in deterministic gc ... maybe it's worth the 300 bucks?

 Cheers,
 Daniel



-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

Actually this is no guarantee.  It's a common misunderstanding that
System.gc forces gc.  It does not. It is a suggestion only. The VM always
has the option as to when and how much it gcs.
 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:

Re: Forcing Cassandra to free up some space

I'm sorry.  This was my business at Sun.  You are certainly wrong about
the Hotspot VM.

See this chapter of my book

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces gc.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it gcs.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

Some references...

An object enters an unreachable state when no more strong references
to it exist. When an object is unreachable, it is a candidate for
collection. Note the wording: Just because an object is a candidate
for collection doesn't mean it will be immediately collected. The JVM
is free to delay collection until there is an immediate need for the
memory being consumed by the object.

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

and Calling the gc method suggests that the Java Virtual Machine
expend effort toward recycling unused objects

http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

It goes on to say that the VM will make a best effort, but best
effort is *deliberately* left up to the definition of the gc
implementor.

I guess you missed the many lectures I have given on this subject over
the years at Java One Conferences

On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces gc.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it gcs.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

Which JVM?  Which collector?  There have been and continue to be many.

Hotspot itself supports a number of different collectors with
different behaviors.   Many of them do not collect every candidate on
every gc, but merely the easiest ones to find.  This is why depending
on finalizers is a *bad* idea in Java code.  They may well never get
run.  (Finalizer is one of a few features the Sun Java team always
regretted putting in Java to start with.  It has caused quite a few
application problems over the years.)

The really important thing is that NONE of these behaviors of the
collectors are guaranteed by specification not to change from version
to version.  Basing your code on non-specified behaviors is a good way
to hit mysterious failures on updates.

For instance, in the mid-90s, IBM had a mode of their VM called
"infinite heap".  It *never* garbage collected, even if you called
System.gc.  Instead it just threw away address space and counted on
the total memory needs for the life of the program being less than the
total addressable space of the processor.

It was *very* fast for certain kinds of applications.

Far from being pedantic, not depending on undocumented behavior is
simply good engineering.
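
As an illustration of the alternative to finalizers, here is a minimal
sketch of deterministic cleanup. The TrackedResource class and its OPEN
counter are invented for this example, and the try-with-resources
syntax assumes Java 7 or later; on older JVMs a plain try/finally does
the same job.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource; stands in for a file, a record, a socket, etc.
final class TrackedResource implements AutoCloseable {
    /** Number of resources currently open, so leaks are observable. */
    static final AtomicInteger OPEN = new AtomicInteger();

    // Single-threaded sketch: a real resource shared across threads
    // would need to guard this flag.
    private boolean closed;

    TrackedResource() {
        OPEN.incrementAndGet();
    }

    /** Explicit, idempotent release; it never waits for the collector. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }

    /** Opens a resource, uses it, and returns how many remain open. */
    static int useAndRelease() {
        try (TrackedResource resource = new TrackedResource()) {
            // ... work with the resource here ...
        }
        // close() has already run at this point, whatever the GC is doing.
        return OPEN.get();
    }
}
```

The cleanup here is tied to program flow rather than to when, or
whether, a given collector notices the object is unreachable.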


On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I've read the relevant source. While you're pedantically correct re
 the spec, you're wrong as to what the JVM actually does.

 On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Some references...

 An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object.

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a best effort, but best
 effort is *deliberately* left up to the definition of the gc
 implementor.

 I guess you missed the many lectures I have given on this subject over
 the years at Java One Conferences

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces gc.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it gcs.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

Not if it depends on a side effect of garbage collection such as finalizers.

It ought to publish its own JMX control to cause that to happen.
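
For what such a JMX control could look like, here is a minimal sketch
using the standard javax.management API. The MBean name, the
reclaimNow operation and the "example.sketch" domain are all
hypothetical; this is not a description of any MBean Cassandra actually
exposes, and the cleanup itself is only a placeholder counter.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch; names are illustrative only.
final class JmxReclaimSketch {
    /** Standard MBeans need a public interface named <Class>MBean. */
    public interface SpaceReclaimerMBean {
        int reclaimNow();
    }

    public static final class SpaceReclaimer implements SpaceReclaimerMBean {
        private final AtomicInteger runs = new AtomicInteger();

        @Override
        public int reclaimNow() {
            // Placeholder: a real implementation would delete obsolete
            // files here, directly, instead of waiting for a finalizer.
            return runs.incrementAndGet();
        }
    }

    /** Registers the bean, triggers it once through JMX, then unregisters. */
    static int registerAndInvoke() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example.sketch:type=SpaceReclaimer");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new SpaceReclaimer(), name);
            try {
                Object result =
                    server.invoke(name, "reclaimNow", new Object[0], new String[0]);
                return (Integer) result;
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX call failed", e);
        }
    }
}
```

An operator could call the same operation from jconsole or any other
JMX client, which makes the cleanup an explicit, documented action
rather than a side effect of a collection.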



On Thu, May 26, 2011 at 6:58 PM, Konstantin  Naryshkin
konstant...@a-bb.net wrote:
 So, in summary, there is no way to predictably and efficiently tell Cassandra 
 to get rid of all of the extra space it is using on disk?

 - Original Message -
 From: Jeffrey Kesselman jef...@gmail.com
 To: user@cassandra.apache.org
 Sent: Thursday, May 26, 2011 8:57:49 PM
 Subject: Re: Forcing Cassandra to free up some space

 Which JVM?  Which collector?  There have been and continue to be many.

 Hotspot itself supports a number of different collectors with
 different behaviors.   Many of them do not collect every candidate on
 every gc, but merely the easiest ones to find.  This is why depending
 on finalizers is a *bad* idea in Java code.  They may well never get
 run.  (Finalizer is one of a few features the Sun Java team always
 regretted putting in Java to start with.  It has caused quite a few
 application problems over the years.)

 The really important thing is that NONE of these behaviors of the
 collectors are guaranteed by specification not to change from version
 to version.  Basing your code on non-specified behaviors is a good way
 to hit mysterious failures on updates.

 For instance, in the mid-90s, IBM had a mode of their VM called
 "infinite heap".  It *never* garbage collected, even if you called
 System.gc.  Instead it just threw away address space and counted on
 the total memory needs for the life of the program being less than the
 total addressable space of the processor.

 It was *very* fast for certain kinds of applications.

 Far from being pedantic, not depending on undocumented behavior is
 simply good engineering.


 On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I've read the relevant source. While you're pedantically correct re
 the spec, you're wrong as to what the JVM actually does.

 On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Some references...

 An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object.

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a best effort, but best
 effort is *deliberately* left up to the definition of the gc
 implementor.

 I guess you missed the many lectures I have given on this subject over
 the years at Java One Conferences

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com 
 wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces gc.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it gcs.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.




-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

You really should qualify that with "on all currently known versions
of Hotspot".

Not trying to give you grief, really, but it's an important limitation
to understand.

On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis jbel...@gmail.com wrote:
 In summary, System.gc works fine unless you've deliberately done
 something like setting the -XX:+DisableExplicitGC flag.

 On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
 konstant...@a-bb.net wrote:
 So, in summary, there is no way to predictably and efficiently tell 
 Cassandra to get rid of all of the extra space it is using on disk?

 - Original Message -
 From: Jeffrey Kesselman jef...@gmail.com
 To: user@cassandra.apache.org
 Sent: Thursday, May 26, 2011 8:57:49 PM
 Subject: Re: Forcing Cassandra to free up some space

 Which JVM?  Which collector?  There have been and continue to be many.

 Hotspot itself supports a number of different collectors with
 different behaviors.   Many of them do not collect every candidate on
 every gc, but merely the easiest ones to find.  This is why depending
 on finalizers is a *bad* idea in Java code.  They may well never get
 run.  (Finalizer is one of a few features the Sun Java team always
 regretted putting in Java to start with.  It has caused quite a few
 application problems over the years.)

 The really important thing is that NONE of these behaviors of the
 collectors are guaranteed by specification not to change from version
 to version.  Basing your code on non-specified behaviors is a good way
 to hit mysterious failures on updates.

 For instance, in the mid-90s, IBM had a mode of their VM called
 "infinite heap".  It *never* garbage collected, even if you called
 System.gc.  Instead it just threw away address space and counted on
 the total memory needs for the life of the program being less than the
 total addressable space of the processor.

 It was *very* fast for certain kinds of applications.

 Far from being pedantic, not depending on undocumented behavior is
 simply good engineering.


 On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I've read the relevant source. While you're pedantically correct re
 the spec, you're wrong as to what the JVM actually does.

 On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Some references...

 An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object.

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a best effort, but best
 effort is *deliberately* left up to the definition of the gc
 implementor.

 I guess you missed the many lectures I have given on this subject over
 the years at Java One Conferences

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com 
 wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces gc.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it gcs.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

I'm also not sure that will guarantee all space is cleaned up.  It
really depends on what you are doing inside Cassandra.  If you have
your own garbage collection that is just in some way tied to the gc run,
then it will run when it runs.

If, otoh, you are associating records in your storage with specific
objects in memory and using one of the post-mortem hooks (finalize or
PhantomReference) to tell you to clean up that particular record, then
it's quite possible they won't all get cleaned up.  In general Hotspot
does not find and clean every candidate object on every GC run.  It
starts with the easiest/fastest to find and then sees what more it
thinks it needs to do to create enough memory for anticipated near
future needs.

On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis jbel...@gmail.com wrote:
 In summary, System.gc works fine unless you've deliberately done
 something like setting the -XX:+DisableExplicitGC flag.

 On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
 konstant...@a-bb.net wrote:
 So, in summary, there is no way to predictably and efficiently tell 
 Cassandra to get rid of all of the extra space it is using on disk?

 - Original Message -
 From: Jeffrey Kesselman jef...@gmail.com
 To: user@cassandra.apache.org
 Sent: Thursday, May 26, 2011 8:57:49 PM
 Subject: Re: Forcing Cassandra to free up some space

 Which JVM?  Which collector?  There have been and continue to be many.

 Hotspot itself supports a number of different collectors with
 different behaviors.   Many of them do not collect every candidate on
 every gc, but merely the easiest ones to find.  This is why depending
 on finalizers is a *bad* idea in Java code.  They may well never get
 run.  (Finalizer is one of a few features the Sun Java team always
 regretted putting in Java to start with.  It has caused quite a few
 application problems over the years.)

 The really important thing is that NONE of these behaviors of the
 collectors are guaranteed by specification not to change from version
 to version.  Basing your code on non-specified behaviors is a good way
 to hit mysterious failures on updates.

 For instance, in the mid-90s, IBM had a mode of their VM called
 "infinite heap".  It *never* garbage collected, even if you called
 System.gc.  Instead it just threw away address space and counted on
 the total memory needs for the life of the program being less than the
 total addressable space of the processor.

 It was *very* fast for certain kinds of applications.

 Far from being pedantic, not depending on undocumented behavior is
 simply good engineering.


 On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I've read the relevant source. While you're pedantically correct re
 the spec, you're wrong as to what the JVM actually does.

 On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Some references...

 An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object.

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a best effort, but best
 effort is *deliberately* left up to the definition of the gc
 implementor.
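
A small sketch of what the specification does and does not promise, assuming any standard JVM. The class and method names are hypothetical. The first method only reports what happened on this particular run; its result may differ between VMs, collectors, and flags, so no expected value is given. The second checks the one thing that is guaranteed: a strongly reachable object is never collected.

```java
import java.lang.ref.WeakReference;

public class GcHintDemo {
    // Reports whether a weakly referenced object was cleared after the hint.
    // The result depends on the VM, the collector, and its flags.
    static boolean clearedAfterHint() {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.gc(); // a request; the API docs only promise a best effort
        return ref.get() == null;
    }

    // A strongly reachable object must survive, whatever the collector does.
    static boolean strongSurvives() {
        Object strong = new Object();
        WeakReference<Object> ref = new WeakReference<>(strong);
        System.gc();
        // Comparing against 'strong' keeps it reachable until this point.
        return ref.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println("cleared after hint: " + clearedAfterHint());
        System.out.println("strong reference survived: " + strongSurvives());
    }
}
```

Code that needs the first method to return true is relying on implementation behavior; code that relies only on the second is relying on the specification.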

 I guess you missed the many lectures I have given on this subject over
 the years at JavaOne conferences.

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 It's a common misunderstanding that System.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman jef...@gmail.com wrote:
 Actually this is no guarantee.  It's a common misunderstanding that
 System.gc forces GC.  It does not.  It is a suggestion only.  The VM
 always has the option as to when and how much it collects.

 On May 26, 2011 2:51 PM, Jonathan Ellis jbel...@gmail.com wrote:




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --
 It's always darkest just before you are eaten by a grue.





Re: Cassandra Vs. Oracle Coherence

I believe Coherence is their name for the TimesTen technology they bought.

TT is an in-memory SQL database that can run as a cache for Oracle.

It's totally different from Cassandra.  On the one hand it supports
traditional SQL whereas Cassandra does not.  On the other hand Cassandra is
truly distributed and fault tolerant, whereas TT is not.

I suggest getting and reading the O'Reilly Cassandra book.

JK

On Tue, May 17, 2011 at 10:44 PM, Karamel, Raghu
raghu_kara...@intuit.com wrote:
 Hi,



 I am new to Cassandra and very excited about the technology. I am evaluating
 it and trying to understand the difference between Cassandra and Oracle
 Coherence. Specifically, I am looking for reasons why someone would select
 Cassandra over Oracle Coherence. Has anyone done the exercise of comparing
 them? I would appreciate it if you could share some information on that.



 Regards

 -RK



-- 
It's always darkest just before you are eaten by a grue.

Re: How to reduce the Read Latency.

What consistency are you asking for?
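
The question matters because the consistency level decides how many replicas a read has to wait for. A minimal sketch of that arithmetic follows. The replication factor is not stated in this thread, so the value 3 used in main is an assumption, and ReplicaMath and its Level enum are hypothetical names for illustration, not part of any Cassandra client API.

```java
public class ReplicaMath {
    // A subset of the consistency levels, enough for the illustration.
    enum Level { ONE, QUORUM, ALL }

    // Number of replicas that must answer before the request completes.
    static int required(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // integer division: a majority
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("unsupported level: " + level);
        }
    }

    public static void main(String[] args) {
        int replicationFactor = 3; // assumed; the thread does not give it
        for (Level level : Level.values()) {
            System.out.println(level + " waits for "
                    + required(level, replicationFactor) + " replica(s)");
        }
    }
}
```

A read at ALL is only as fast as the slowest replica, while a read at ONE can return as soon as a single replica answers, so the same cluster can show very different read latencies depending on the level requested.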

On Fri, May 20, 2011 at 7:42 AM, Dikang Gu dikan...@gmail.com wrote:
 Hi All,
 I'm running three Cassandra 0.7.4 nodes in a cluster, and I give 2G of memory
 to each node.
 Here are the cfstats:
 Keyspace: UserMap
 Read Count: 38411
 Read Latency: 123.54214613001484 ms.
 Write Count: 44155
 Write Latency: 0.02341093873853471 ms.
 Pending Tasks: 0
 Column Family: Map
 SSTable count: 3
 Space used (live): 32704387
 Space used (total): 32704387
 Memtable Columns Count: 49
 Memtable Data Size: 3348
 Memtable Switch Count: 56
 Read Count: 38411
 Read Latency: 123.542 ms.
 Write Count: 44155
 Write Latency: 0.023 ms.
 Pending Tasks: 0
 Key cache capacity: 20
 Key cache size: 611
 Key cache hit rate: 0.9294361241314483
 Row cache: disabled
 Compacted row minimum size: 125
 Compacted row maximum size: 17436917
 Compacted row mean size: 147647
 You can see that the read latency is really high here, so what can I do to
 reduce it?  Give more memory to the three nodes?  Any other options?
 Thanks.
 --
 Dikang Gu
 0086 - 18611140205




-- 
It's always darkest just before you are eaten by a grue.

Re: Inter node communication over UDP

TCP/IP byte overhead vs. UDP really isn't that much if your packets are
of any significant size (it's 30 bytes).

And as others have pointed out, you can easily get more overhead with
worse results trying to reinvent reliable transport on top of UDP.
Remember that TCP/IP has had 30 years of development and tuning.
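
To put a number on "not that much", here is a rough sketch that expresses a fixed per-packet header difference as a share of the payload. It takes the 30-byte figure from this message as given; the real difference depends on which TCP options are in use. The class name and the payload sizes are arbitrary choices for illustration.

```java
public class HeaderOverhead {
    // Extra header bytes expressed as a percentage of the payload size.
    static double percentOfPayload(int extraHeaderBytes, int payloadBytes) {
        return 100.0 * extraHeaderBytes / payloadBytes;
    }

    public static void main(String[] args) {
        int extraBytes = 30; // figure quoted in the message; treat as approximate
        int[] payloads = {60, 300, 1500}; // illustrative payload sizes in bytes
        for (int payload : payloads) {
            System.out.printf("payload %d bytes: %.1f%% extra%n",
                    payload, percentOfPayload(extraBytes, payload));
        }
    }
}
```

The cost is noticeable only for very small payloads and shrinks to a few percent once packets approach a typical Ethernet frame.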

On Fri, May 20, 2011 at 7:39 AM, pankajsoni0126
pankajsoni0...@gmail.com wrote:
 I am working on version 0.7.6 of Cassandra. I have been looking into the code
 to identify communication between nodes.

 It seems to me that both inter-node and server-node-to-client communication
 happen using the Thrift protocol. Is my understanding correct?

 And does the gossiper communication take place using TCP and a message queue?







-- 
It's always darkest just before you are eaten by a grue.

Re: Cassandra Vs. Oracle Coherence