Re: cassandra increment counters, Jira #1072

2010-08-13 Thread Ben Standefer
Interesting idea with the counter row approach.  I think it puts a dubious
responsibility on the Cassandra user.  Sure, Cassandra users are expected to
maintain the size of a row, but asking Cassandra users to constantly
aggregate counts of uuids in a situation where the rows are growing rapidly
to maintain a counter seems out of the realm of the average Cassandra end
user.

My napkin math may be slightly off, but if a counter row aggregator
stopped functioning, crashed, or didn't do its job correctly on a counter
receiving 2,000 increments per second, you end up with a single row at
2.57GB after 24 hours (2,000/sec x 86,400 seconds x 16 bytes per uuid).
This approaches the magnitude of memory on a single node and would seem
(to me?) to significantly impact load and load distribution.  Maybe there is
a way Cassandra could perform the counter row aggregation internally (with
read repair?) and offer it to end users as a clean, simple, intuitive
interface.
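
For reference, the napkin math above can be reproduced with a quick sketch. It counts only the raw 16 bytes of each uuid, which is an assumption: per-column storage overhead is ignored and would push the real figure higher. The constant names are illustrative only.

```python
# Napkin math for a counter row that is never aggregated.
INCREMENTS_PER_SEC = 2_000
SECONDS_PER_DAY = 86_400
BYTES_PER_UUID = 16  # assumption: raw uuid only, no per-column overhead

row_bytes = INCREMENTS_PER_SEC * SECONDS_PER_DAY * BYTES_PER_UUID
row_gib = row_bytes / 2**30  # binary gigabytes (GiB)
row_gb = row_bytes / 10**9   # decimal gigabytes (GB)

print(f"{row_bytes:,} bytes = {row_gib:.2f} GiB = {row_gb:.2f} GB")
# prints: 2,764,800,000 bytes = 2.57 GiB = 2.76 GB
```

The 2.57 figure therefore corresponds to binary gigabytes; in decimal units the same row is about 2.76 GB.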

I have never thought counters were something Cassandra handles well.  If
there is not a satisfactory way to integrate counters into the Cassandra
internals, I think it'd be great for somebody in-the-know to provide
in-depth and detailed documentation on best practices for how to implement
counters.  I think distributed and scalable counters can be a killer app for
Cassandra, and circumventing locking systems such as ZooKeeper is key.

Disclaimer: I'm not quite a Cassandra developer, more of an Ops guy and
user, just trying to add perspective.  I do not want a pony.

-Ben Standefer


On Thu, Aug 12, 2010 at 8:54 PM, Jonathan Ellis jbel...@gmail.com wrote:

 There are two concerns that give me pause.

 The first is that 1072 is tackling a use case that Cassandra already
 handles well: high volume of writes to a counter, with low volume
 reads.  (This can be done by inserting uuids into a counter row, and
 aggregating them either in the background or at read time or with some
 combination of these.  The counter rows can be sharded if necessary.)
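
A minimal sketch of the counter row idea described above, for illustration only. It is not code from this thread or from Cassandra: a plain dict stands in for the column family, and the names `increment`, `compact`, `read` and `NUM_SHARDS` are hypothetical. A real background aggregator would also have to cope with writes arriving while it runs, which this single-process sketch does not attempt.

```python
import uuid
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count

# In-memory stand-in for a column family: row key -> {column name: value}.
rows = defaultdict(dict)

def increment(counter, amount=1):
    """Record one increment as a new uuid-named column in a sharded counter row."""
    col = uuid.uuid4()
    shard = col.int % NUM_SHARDS
    rows[f"{counter}:{shard}"][col] = amount

def compact(counter):
    """Background aggregation: fold each shard's columns into one summary column."""
    for shard in range(NUM_SHARDS):
        row = rows[f"{counter}:{shard}"]
        total = sum(row.values())
        row.clear()  # not atomic with the write below; fine only in this sketch
        if total:
            row[uuid.uuid4()] = total

def read(counter):
    """Read-time aggregation: sum whatever columns remain across all shards."""
    return sum(sum(rows[f"{counter}:{s}"].values()) for s in range(NUM_SHARDS))

for _ in range(1000):
    increment("page_hits")
before = read("page_hits")
compact("page_hits")
after = read("page_hits")
print(before, after)  # the total is unchanged by compaction
```

Writes never read existing data, which is what makes the approach cheap for high write volume; the cost is moved to whoever runs the aggregation.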

 The second is that the approach in 1072 resembles an entirely separate
 system that happens to use part of Cassandra infrastructure -- the
 thrift API, the MessagingService, the sstable format -- but isn't
 really part of it.  ConsistencyLevel is not respected, and special
 cases abound to weld things in that don't fit, e.g. the AES/Streaming
 business.

 On Thu, Aug 12, 2010 at 1:28 AM, Robin Bowes robin-li...@robinbowes.com
 wrote:
  Hi Jonathan,
 
  I'm contacting you in your capacity as project lead for the cassandra
  project. I am wondering how close ticket #1072 is to implementation [1]
 
  We are about to do a proof of concept with cassandra to replace around
  20 MySQL partitions (1 partition = 4 machines: master/slave in DC A,
  master/slave in DC B).
 
  We're essentially just counting web hits - around 10k/second at peak
  times - so increment counters is pretty much essential functionality for
  us.
 
  How close is the patch in #1072 to being acceptable? What is blocking it?
 
  Thanks,
 
  R.
 
  [1] https://issues.apache.org/jira/browse/CASSANDRA-1072
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Re: cassandra increment counters, Jira #1072

2010-08-13 Thread Benjamin Black
On Fri, Aug 13, 2010 at 6:24 AM, Jonathan Ellis jbel...@gmail.com wrote:

 This is simply not an acceptable alternative and just can't be called
 handling it well.

 What part is it handling poorly, at a technical level?  This is almost
 exactly what 1072 does internally -- we are concerned here with the
 high write, low read volume case.


Requiring clients to directly manage the counter rows in order to
periodically compress or segment them.  Yes, you can emulate the
behavior.  No, that is not handling it well.

  It is equivalent to make the users do it, which
 is the case for almost anything.

 I strongly feel we should be in the business of providing building
 blocks, not special cases on top of that.  (But see below, I *do*
 think the 580 version vectors is the kind of building block we want!)


I agree, 580 is really valuable and should be in.  The problem for
high write rate, distributed counters is the requirement of read
before write inherent in such vector-based approaches.  Am I missing
some aspect of 580 that precludes that?
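
To make the read-before-write concern concrete, here is a hedged sketch of one common vector-style counter, in which each node tracks its own contribution. This is not the 580 design (whether 580 needs a local read is exactly the open question here), and the class and method names are hypothetical.

```python
class VectorCounter:
    """Increment-only counter: one entry per node, value is the sum of entries."""

    def __init__(self):
        self.counts = {}  # node id -> total contributed by that node

    def increment(self, node_id, amount=1):
        # Read-before-write: the node reads its current entry in order to
        # produce the next one.
        current = self.counts.get(node_id, 0)
        self.counts[node_id] = current + amount

    def merge(self, other):
        # Reconcile two replicas by taking the per-node maximum, which is
        # safe only because entries never decrease.
        for node_id, count in other.counts.items():
            self.counts[node_id] = max(self.counts.get(node_id, 0), count)

    def value(self):
        return sum(self.counts.values())

a, b = VectorCounter(), VectorCounter()
a.increment("n1")
a.increment("n1")
b.increment("n2", 3)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # both replicas converge on the same total
```

The read in `increment` is local to the node that owns the entry, but it is still a read on the write path, unlike an append-only scheme.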

  The reasons #1072 is so valuable:

 1) Does not require _any_ user action.

 This can be addressed at the library level.  Just as our first stab at
 ZK integration was a rather clunky patch; cages is better.


Certainly, but it would be hard to argue (and I am not) that the
tightly synchronized behavior of ZK is a good match for Cassandra
(mixing in Paxos could make for some neat options, but that's another
debate...).

 2) Does not change the EC-centric model of Cassandra.

 It does, though.  1072 is *not* a version vector-based approach --
 that would be 580.  Read the 1072 design doc, if you haven't.  (Thanks
 to Kelvin for writing that up!)


Nor is Cassandra right now.  I know 1072 isn't vector based, and I
think that is in its favor _for this application_.

 I'm referring in particular to reads requiring CL.ALL.  (My
 understanding is that in the previous design, a master replica was
 chosen and was always written to first.)  Both of these break the
 EC-centric model and that is precisely the objection I made when I
 said ConsistencyLevel is not respected.  I don't think this is
 fixable in the 1072 approach.  I would be thrilled to be wrong.


It is EC in that the total for a counter is unknown until resolved on
read.  Yes, it does not respect CL, but since it can only be used in 1
way, I don't see that as a disadvantage.

 The second is that the approach in 1072 resembles an entirely separate
 system that happens to use part of Cassandra infrastructure -- the
 thrift API, the MessagingService, the sstable format -- but isn't
 really part of it.  ConsistencyLevel is not respected, and special
 cases abound to weld things in that don't fit, e.g. the AES/Streaming
 business.

 Then let's find ways to make it as elegant as it can be.  Ultimately,
 this functionality needs to be in Cassandra or users will simply
 migrate someplace else for this extremely common use case.

 This is what I've been pushing for.  The version vector approach to
 counting (i.e. 580 as opposed to 1072) is exactly the more elegant,
 EC-centric approach that addresses a case that we *don't* currently
 handle well (counters with a higher read volume than 1072).


Perhaps I missed something: does counting with 580 require read before
counter update (local to the node, not a client read)?


b


Re: cassandra increment counters, Jira #1072

2010-08-13 Thread Lenin Gali
+1M. We need this too.

Lenin Gali
Dir, Infrastructure and BI

Cell:513.382.3371
le...@sharethis.com
1883 Landings Drive,
Mountain View CA 94043


On Thu, Aug 12, 2010 at 4:31 PM, Colin Taylor colin.tay...@gmail.com wrote:

 Would it help with prioritizing if the silent majority chimed in when
 keen on this functionality, which is so key to large-scale analytical
 apps?  In which case:

 +1

 Although perhaps I should encourage signing up on jira and voting there.

 https://issues.apache.org/jira/secure/Signup!default.jspa
 https://issues.apache.org/jira/browse/CASSANDRA-1072

 [We intend counting various attributes of the 100 million documents
 coming through our system a day]

 On Fri, Aug 13, 2010 at 11:15 AM, Benjamin Black b...@b3k.us wrote:
  On Thu, Aug 12, 2010 at 10:23 AM, Kelvin Kakugawa kakug...@gmail.com
 wrote:
 
  I think the underlying unanswered question is whether #1072 is a niche
  feature or whether it should be brought into trunk.
 
 
  This should not be an unanswered question!  #1072 should be considered
  essential, as it enables numerous use cases that currently require
  bolting something like memcache or redis onto the side to handle
  counters.
 
  +1 on getting this into trunk ASAP.
 
 
  b
 




-- 
twitter: leningali
skype: galilenin
Cell:513.382.3371


cassandra increment counters, Jira #1072

2010-08-12 Thread Robin Bowes
Hi Jonathan,

I'm contacting you in your capacity as project lead for the cassandra
project. I am wondering how close ticket #1072 is to implementation [1]

We are about to do a proof of concept with cassandra to replace around
20 MySQL partitions (1 partition = 4 machines: master/slave in DC A,
master/slave in DC B).

We're essentially just counting web hits - around 10k/second at peak
times - so increment counters is pretty much essential functionality for us.

How close is the patch in #1072 to being acceptable? What is blocking it?

Thanks,

R.

[1] https://issues.apache.org/jira/browse/CASSANDRA-1072



Re: cassandra increment counters, Jira #1072

2010-08-12 Thread Jesse McConnell
out of curiosity are you shooting for incrementing these counters 10k
times a second for sustained periods of time?

cheers,
jesse

--
jesse mcconnell
jesse.mcconn...@gmail.com



On Thu, Aug 12, 2010 at 03:28, Robin Bowes robin-li...@robinbowes.com wrote:
 Hi Jonathan,

 I'm contacting you in your capacity as project lead for the cassandra
 project. I am wondering how close ticket #1072 is to implementation [1]

 We are about to do a proof of concept with cassandra to replace around
 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A,
 master/slave in DC B).

 We're essentially just counting web hits - around 10k/second at peak
 times - so increment counters is pretty much essential functionality for us.

 How close is the patch in #1072 to being acceptable? What is blocking it?

 Thanks,

 R.

 [1] https://issues.apache.org/jira/browse/CASSANDRA-1072




Re: cassandra increment counters, Jira #1072

2010-08-12 Thread Robin Bowes
On 12/08/10 19:21, Jesse McConnell wrote:
 out of curiosity are you shooting for incrementing these counters 10k
 times a second for sustained periods of time?

Jesse,

Our traffic pattern varies between 5.5k and 10k connections/hits per
second. We currently process the hits and log to MySQL (partitioned
DBs). We're looking into the possibility of using cassandra. I don't
think we'll be sending each hit to the DB individually, ie. 10k hits/sec
won't correspond to 10k updates/sec, but I imagine the counter updates
will be fairly high volume. We'll bottom that out in our initial testing.
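
As a rough sketch of why 10k hits/sec need not mean 10k updates/sec, hits can be accumulated in the client and flushed as per-key deltas. The `HitBatcher` class, its `flush_fn` callback, and the batch size are hypothetical and stand in for whatever the real store API turns out to be.

```python
from collections import Counter

class HitBatcher:
    """Accumulate hits in memory and flush them as per-key deltas."""

    def __init__(self, flush_fn, max_pending=1000):
        self.flush_fn = flush_fn        # hypothetical: sends {key: delta} to the store
        self.max_pending = max_pending  # flush after this many buffered hits
        self.pending = Counter()
        self.pending_hits = 0

    def hit(self, key):
        self.pending[key] += 1
        self.pending_hits += 1
        if self.pending_hits >= self.max_pending:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(dict(self.pending))
            self.pending.clear()
            self.pending_hits = 0

updates = []  # stand-in for the store: just records each flushed batch
batcher = HitBatcher(updates.append, max_pending=1000)
for i in range(10_000):
    batcher.hit(f"page{i % 5}")
batcher.flush()  # flush any remainder

total = sum(sum(batch.values()) for batch in updates)
print(len(updates), total)  # 10 batches carrying all 10,000 hits
```

The trade-off is that buffered hits are lost if the client dies before a flush, so the batch size bounds how much can go missing.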

R.



Re: cassandra increment counters, Jira #1072

2010-08-12 Thread Colin Taylor
Would it help with prioritizing if the silent majority chimed in when
keen on this functionality, which is so key to large-scale analytical
apps?  In which case:

+1

Although perhaps I should encourage signing up on jira and voting there.

https://issues.apache.org/jira/secure/Signup!default.jspa
https://issues.apache.org/jira/browse/CASSANDRA-1072

[We intend counting various attributes of the 100 million documents
coming through our system a day]

On Fri, Aug 13, 2010 at 11:15 AM, Benjamin Black b...@b3k.us wrote:
 On Thu, Aug 12, 2010 at 10:23 AM, Kelvin Kakugawa kakug...@gmail.com wrote:

 I think the underlying unanswered question is whether #1072 is a niche
 feature or whether it should be brought into trunk.


 This should not be an unanswered question!  #1072 should be considered
 essential, as it enables numerous use cases that currently require
 bolting something like memcache or redis onto the side to handle
 counters.

 +1 on getting this into trunk ASAP.


 b