Re: cassandra increment counters, Jira #1072
Interesting idea with the counter row approach. I think it puts a dubious responsibility on the Cassandra user. Sure, Cassandra users are expected to maintain the size of a row, but asking Cassandra users to constantly aggregate counts of uuids in a situation where the rows are growing rapidly to maintain a counter seems out of the realm of the average Cassandra end user. My napkin math may be slightly off, but if a counter row aggregator stopped functioning, crashed, or didn't do its job correctly on a counter receiving 2,000 increments per second, you end up with a single row at 2.57GB after 24 hours (2,000/sec x 86,400 seconds x 16 bytes per uuid). This approaches the magnitude of memory on a single node and would seem (to me?) to significantly impact load and load distribution.

Maybe there is a way Cassandra could perform the counter row aggregation internally (with read repair?) and offer it to end users as a clean, simple, intuitive interface. I have never thought counters were something Cassandra handles well.

If there is not a satisfactory way to integrate counters into the Cassandra internals, I think it'd be great for somebody in-the-know to provide in-depth and detailed documentation on best practices for how to implement counters. I think distributed and scalable counters can be a killer app for Cassandra, and circumventing locking systems such as ZooKeeper is key.

Disclaimer: I'm not quite a Cassandra developer, more of an Ops guy and user, just trying to add perspective. I do not want a pony.

-Ben Standefer

On Thu, Aug 12, 2010 at 8:54 PM, Jonathan Ellis jbel...@gmail.com wrote:

There are two concerns that give me pause. The first is that 1072 is tackling a use case that Cassandra already handles well: high volume of writes to a counter, with low volume reads. (This can be done by inserting uuids into a counter row, and aggregating them either in the background or at read time or with some combination of these. The counter rows can be sharded if necessary.)

The second is that the approach in 1072 resembles an entirely separate system that happens to use part of Cassandra infrastructure -- the thrift API, the MessagingService, the sstable format -- but isn't really part of it. ConsistencyLevel is not respected, and special cases abound to weld things in that don't fit, e.g. the AES/Streaming business.

On Thu, Aug 12, 2010 at 1:28 AM, Robin Bowes robin-li...@robinbowes.com wrote:

Hi Jonathan, I'm contacting you in your capacity as project lead for the cassandra project. I am wondering how close ticket #1072 is to implementation [1]. We are about to do a proof of concept with cassandra to replace around 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A, master/slave in DC B). We're essentially just counting web hits - around 10k/second at peak times - so increment counters is pretty much essential functionality for us. How close is the patch in #1072 to being acceptable? What is blocking it? Thanks, R.

[1] https://issues.apache.org/jira/browse/CASSANDRA-1072

-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
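For illustration, the counter-row approach described in this message (insert a uuid per increment, aggregate at read time or in the background, shard the rows) and the napkin math on row growth can be sketched roughly as below. This is an in-memory toy, not Cassandra client code: `CounterRows`, `increment`, `read`, `compact`, `column_count`, `unaggregated_bytes` and the 4-shard default are all hypothetical names and values chosen for the example, and the sketch ignores replication and the race between compaction and concurrent writes.

```python
import uuid
from collections import defaultdict

UUID_BYTES = 16  # size of one uuid column name, as in the estimate above


class CounterRows:
    """In-memory stand-in for sharded counter rows (hypothetical, not a Cassandra API)."""

    def __init__(self, shards=4):
        self.shards = shards
        # row key -> {column name: increment value}
        self.rows = defaultdict(dict)

    def _row_key(self, counter, shard):
        return f"{counter}:{shard}"

    def increment(self, counter, amount=1):
        # A write is a blind insert of a fresh uuid column: no read is needed.
        column_id = uuid.uuid4()
        shard = column_id.int % self.shards  # spread one counter over several rows
        self.rows[self._row_key(counter, shard)][column_id] = amount

    def read(self, counter):
        # Aggregate at read time by summing every column in every shard.
        return sum(
            sum(self.rows.get(self._row_key(counter, s), {}).values())
            for s in range(self.shards)
        )

    def compact(self, counter):
        # Background aggregation: collapse each shard to a single summary column.
        # A real system would have to cope with writes arriving during this step.
        for s in range(self.shards):
            key = self._row_key(counter, s)
            row = self.rows.get(key)
            if row:
                self.rows[key] = {"total": sum(row.values())}

    def column_count(self, counter):
        return sum(
            len(self.rows.get(self._row_key(counter, s), {}))
            for s in range(self.shards)
        )


def unaggregated_bytes(increments_per_sec, seconds):
    # Column-name bytes accumulated if no aggregation ever runs.
    return increments_per_sec * seconds * UUID_BYTES
```

With these assumptions, `unaggregated_bytes(2000, 86400)` gives 2,764,800,000 bytes, which is about 2.57 GiB and matches the figure quoted in the message. The sketch also makes the operational concern concrete: the column count only shrinks when `compact` runs, so a stalled aggregator lets the rows grow without bound.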
Re: cassandra increment counters, Jira #1072
On Fri, Aug 13, 2010 at 6:24 AM, Jonathan Ellis jbel...@gmail.com wrote:

This is simply not an acceptable alternative and just can't be called handling it well.

What part is it handling poorly, at a technical level? This is almost exactly what 1072 does internally -- we are concerned here with the high write, low read volume case.

Requiring clients directly manage the counter rows in order to periodically compress or segment them. Yes, you can emulate the behavior. No, that is not handling it well.

It is equivalent to make the users do it, which is the case for almost anything. I strongly feel we should be in the business of providing building blocks, not special cases on top of that. (But see below, I *do* think the 580 version vectors is the kind of building block we want!)

I agree, 580 is really valuable and should be in. The problem for high write rate, distributed counters is the requirement of read before write inherent in such vector-based approaches. Am I missing some aspect of 580 that precludes that?

The reasons #1072 is so valuable: 1) Does not require _any_ user action.

This can be addressed at the library level. Just as our first stab at ZK integration was a rather clunky patch; cages is better.

Certainly, but it would be hard to argue (and I am not) that the tightly synchronized behavior of ZK is a good match for Cassandra (mixing in Paxos could make for some neat options, but that's another debate...).

2) Does not change the EC-centric model of Cassandra.

It does, though. 1072 is *not* a version vector-based approach -- that would be 580. Read the 1072 design doc, if you haven't. (Thanks to Kelvin for writing that up!)

Nor is Cassandra right now. I know 1072 isn't vector based, and I think that is in its favor _for this application_.

I'm referring in particular to reads requiring CL.ALL. (My understanding is that in the previous design, a master replica was chosen and was always written to first.) Both of these break the EC-centric model and that is precisely the objection I made when I said ConsistencyLevel is not respected. I don't think this is fixable in the 1072 approach. I would be thrilled to be wrong.

It is EC in that the total for a counter is unknown until resolved on read. Yes, it does not respect CL, but since it can only be used in 1 way, I don't see that as a disadvantage.

The second is that the approach in 1072 resembles an entirely separate system that happens to use part of Cassandra infrastructure -- the thrift API, the MessagingService, the sstable format -- but isn't really part of it. ConsistencyLevel is not respected, and special cases abound to weld things in that don't fit, e.g. the AES/Streaming business.

Then let's find ways to make it as elegant as it can be. Ultimately, this functionality needs to be in Cassandra or users will simply migrate someplace else for this extremely common use case.

This is what I've been pushing for. The version vector approach to counting (i.e. 580 as opposed to 1072) is exactly the more elegant, EC-centric approach that addresses a case that we *don't* currently handle well (counters with a higher read volume than 1072).

Perhaps I missed something: does counting 580 require read before counter update (local to the node, not a client read)?

b
Re: cassandra increment counters, Jira #1072
+1M, We need this too.

Lenin Gali Dir, Infrastructure and BI Cell:513.382.3371 le...@sharethis.com 1883 Landings Drive, Mountain View CA 94043

On Thu, Aug 12, 2010 at 4:31 PM, Colin Taylor colin.tay...@gmail.com wrote:

Would it help prioritizing if silent majority chimed in if keen on this functionality which is so key to large scale analytical apps? in which case : +1

Although perhaps I should encourage signing up on jira and vote there.
https://issues.apache.org/jira/secure/Signup!default.jspa
https://issues.apache.org/jira/browse/CASSANDRA-1072

[We intend counting various attributes of the 100 million documents coming through our system a day]

On Fri, Aug 13, 2010 at 11:15 AM, Benjamin Black b...@b3k.us wrote:

On Thu, Aug 12, 2010 at 10:23 AM, Kelvin Kakugawa kakug...@gmail.com wrote:

I think the underlying unanswered question is whether #1072 is a niche feature or whether it should be brought into trunk.

This should not be an unanswered question! #1072 should be considered essential, as it enables numerous use cases that currently require bolting something like memcache or redis onto the side to handle counters. +1 on getting this into trunk ASAP.

b

-- twitter: leningali skype: galilenin Cell:513.382.3371
cassandra increment counters, Jira #1072
Hi Jonathan,

I'm contacting you in your capacity as project lead for the cassandra project. I am wondering how close ticket #1072 is to implementation [1].

We are about to do a proof of concept with cassandra to replace around 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A, master/slave in DC B). We're essentially just counting web hits - around 10k/second at peak times - so increment counters is pretty much essential functionality for us.

How close is the patch in #1072 to being acceptable? What is blocking it?

Thanks,
R.

[1] https://issues.apache.org/jira/browse/CASSANDRA-1072
Re: cassandra increment counters, Jira #1072
out of curiosity are you shooting for incrementing these counters 10k times a second for sustained periods of time? cheers, jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Thu, Aug 12, 2010 at 03:28, Robin Bowes robin-li...@robinbowes.com wrote: Hi Jonathan, I'm contacting you in your capacity as project lead for the cassandra project. I am wondering how close ticket #1072 is to implementation [1] We are about to do a proof of concept with cassandra to replace around 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A, master/slave in DC B). We're essentially just counting web hits - around 10k/second at peak times - so increment counters is pretty much essential functionality for us. How close is the patch in #1072 to being acceptable? What is blocking it? Thanks, R. [1] https://issues.apache.org/jira/browse/CASSANDRA-1072
Re: cassandra increment counters, Jira #1072
On 12/08/10 19:21, Jesse McConnell wrote:

out of curiosity are you shooting for incrementing these counters 10k times a second for sustained periods of time?

Jesse,

Our traffic pattern varies between 5.5k and 10k connections/hits per second. We currently process the hits and log to MySQL (partitioned DBs). We're looking into the possibility of using cassandra. I don't think we'll be sending each hit to the DB individually, i.e. 10k hits/sec won't correspond to 10k updates/sec, but I imagine the counter updates will be fairly high volume. We'll bottom that out in our initial testing.

R.
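As a rough illustration of why 10k hits/sec need not mean 10k updates/sec: hits can be accumulated in memory and written out as one delta per key. The following is a minimal sketch under that assumption, not a description of the setup discussed in this thread. `HitBatcher`, `flush`, and `max_pending` are hypothetical names, and the `flush` callable stands in for whatever store ends up receiving the updates.

```python
from collections import Counter


class HitBatcher:
    """Buffers hits in memory and flushes one delta per key (hypothetical helper)."""

    def __init__(self, flush, max_pending=1000):
        self.flush = flush              # callable taking a dict {key: delta}
        self.max_pending = max_pending  # flush once this many hits are buffered
        self.pending = Counter()
        self.buffered = 0

    def hit(self, key):
        # Record one hit; no write to the store happens here.
        self.pending[key] += 1
        self.buffered += 1
        if self.buffered >= self.max_pending:
            self.drain()

    def drain(self):
        # Hand the accumulated deltas to the store and reset the buffer.
        if self.pending:
            self.flush(dict(self.pending))
            self.pending.clear()
            self.buffered = 0


if __name__ == "__main__":
    batches = []
    batcher = HitBatcher(batches.append, max_pending=100)
    for i in range(1000):
        batcher.hit(f"/page{i % 5}")
    batcher.drain()
    updates = sum(len(batch) for batch in batches)
    print(f"1000 hits became {updates} per-key updates in {len(batches)} flushes")
```

The trade-off is durability: hits that are still buffered are lost if the process dies before the next flush, so the batch size bounds how much can go missing.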
Re: cassandra increment counters, Jira #1072
Would it help prioritizing if silent majority chimed in if keen on this functionality which is so key to large scale analytical apps? in which case : +1

Although perhaps I should encourage signing up on jira and vote there.
https://issues.apache.org/jira/secure/Signup!default.jspa
https://issues.apache.org/jira/browse/CASSANDRA-1072

[We intend counting various attributes of the 100 million documents coming through our system a day]

On Fri, Aug 13, 2010 at 11:15 AM, Benjamin Black b...@b3k.us wrote:

On Thu, Aug 12, 2010 at 10:23 AM, Kelvin Kakugawa kakug...@gmail.com wrote:

I think the underlying unanswered question is whether #1072 is a niche feature or whether it should be brought into trunk.

This should not be an unanswered question! #1072 should be considered essential, as it enables numerous use cases that currently require bolting something like memcache or redis onto the side to handle counters. +1 on getting this into trunk ASAP.

b