Re: How is Cassandra being used?

2011-11-16 Thread Ryan King
On Wed, Nov 16, 2011 at 10:02 AM, Jonathan Ellis jbel...@gmail.com wrote:
 Sounds like the consensus is that if this is a good idea at all, it
 needs to be opt-in.  Like I said earlier, I can live with that.

In addition, if you want to get data from large companies that manage
their own datacenters, there needs to be a way to contribute data
without the software phoning home automatically. We aren't allowed to
make connections to the outside world from our datacenter. And I'm not
willing to ask for an exception for this.

A mode that dumps the data to a file which can be uploaded would be
preferable. People probably won't do it often, but imagine if your
periodic "how are you using cassandra?" email threads included data?

-ryan


Re: Cassandra & Pig with network topology and data centers.

2011-07-29 Thread Ryan King
It'd be great if we had different settings for inter- and intra-DC read repair.

-ryan

On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani jak...@gmail.com wrote:
 Yes, it's read repair; you can lower the read repair chance to tune this.



 On Jul 29, 2011, at 6:31 PM, Aaron Griffith aaron.c.griff...@gmail.com 
 wrote:

 I currently have a 9-node cassandra cluster set up as follows:

 DC1: Six nodes
 DC2: Three nodes

 The tokens alternate between the two datacenters.

 I have hadoop installed as tasktracker/datanodes on the
 three cassandra nodes in DC2.

 There is another non-Cassandra node that is used as the hadoop namenode / job
 tracker.

 When running pig scripts pointed to a node in DC2 using LOCAL_QUORUM as read
 consistency, I am seeing network and cpu spikes on the nodes in DC1.  I was
 not expecting any impact on those nodes when local quorum is used.

 Can read repair be causing the traffic/cpu spikes?

 The replication factor for DC1 is 5, and for DC2 it is 1.

 When looking at the map tasks I am seeing input splits for computers in
 both data centers.  I am not sure what this means.  My thought is
 that it should only be getting data from the nodes in DC2.

 Thanks

 Aaron




Re: set rpc_timeout_in_ms via jmx?

2011-07-18 Thread Ryan King
On Sat, Jul 16, 2011 at 12:30 PM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 I don't see a way in DatabaseDescriptor to set the rpc_timeout_in_ms via jmx.

 It doesn't seem possible right now.

 Is there any reason why that couldn't be set via jmx?  It seems like a 
 rolling restart to update that is pretty heavy.  It would be nice to set it 
 in the yaml but also be able to change it via jmx, so an update could take 
 effect immediately without a restart.

 Jeremy

 btw I'm trying to do that with my analytics nodes - hadoop jobs fail when 
 cycling a single cassandra node - might be 2388 I guess.

There's no way to do this currently, but we'd be interested in having
it. Ideally, the timeout would be configurable per request so that
mixed workloads can have different timeouts.

-ryan


Re: Cassandra 1.0

2011-06-16 Thread Ryan King
I think maybe 4 months was too short. Do we optimistically want to try
that again or plan on taking a bit more time?

Either way I'm happy to have a plan. :)

-ryan

On Thu, Jun 16, 2011 at 9:11 AM, Jonathan Ellis jbel...@gmail.com wrote:
 +1

 On Thu, Jun 16, 2011 at 7:36 AM, Sylvain Lebresne sylv...@datastax.com 
 wrote:
 Ladies and Gentlemen,

 Cassandra 0.8 is now out and we'll hopefully soon have the first minor 
 release
 on that branch out too. It is now time to think of the next iteration, aka
 Apache Cassandra 1.0 (sounds amazing...).

 The 0.8 release was our first on our new fixed 4-month release
 schedule. 0.7.0 was released January 9th and 0.8.0 was released just 4 months
 later, June 8th. Alright, alright, that's 5 months, but close enough for a
 first time.

 Sticking to that 4-month schedule, I propose the following deadlines:
  - September 8th: feature freeze
  - October 8th: release (tentative date)

 --
 Sylvain




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



dapper-style tracing in cassandra

2011-06-14 Thread Ryan King
I'll open a ticket on this soon, but I'd like to start a discussion first.

We're working on a distributed tracing system, whose design is
somewhat inspired by the Google Dapper paper [1]. We have instrumented
a bunch of our internal services through our custom networking stack
[2].

In a nutshell, the way it works is that each request is given a trace
id which gets passed through to each service involved in servicing
that request. Each hop in that tree is given a span id. Each node logs
its data to a local agent (we use scribe for this). An aggregator can
pull the pieces back together so you can do analysis.

I'd like to add the ability to plug tracers into cassandra. As with
many other parts of Cassandra, I think we should make this an
extension point with a good default implementation in place.

Here's what I propose:

1. Update the thrift server to allow clients to pass in tracing
details. I'll have docs soon on how we're doing this internally.
2. Add the necessary metadata to each message passed between cassandra
nodes. This should be easy to add to Message.java and thread through to
the places we need it.
3. Implement a universally useful version of this: one that's not
dependent on our system, since it may not ever get open-sourced.
Perhaps writing to local files?

Thoughts? Opinions?

-ryan

1. http://research.google.com/pubs/pub36356.html
2. https://github.com/twitter/finagle/tree/master/finagle-b3


Re: dapper-style tracing in cassandra

2011-06-14 Thread Ryan King
On Tue, Jun 14, 2011 at 2:02 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Sounds a lot like
 https://issues.apache.org/jira/browse/CASSANDRA-1123. The main change
 you'd want is to allow passing an external trace ID.

Yeah, that patch seems like a good start. In addition to passing an
external trace id, we need a way to plug in our own implementation of
what to do with the data (we want to publish thrift structs through
scribe).

-ryan


Re: Updating cassandra RubyGem for 0.8 & CQL

2011-04-28 Thread Ryan King
This is awesome. I'll work to get it merged.

-ryan

On Wed, Apr 27, 2011 at 8:36 PM, Robert Jackson
robe...@promedicalinc.com wrote:
 I have just finished these updates. The following changes/new features have 
 been made:

 * Update Rakefile to install 0.6.13, 0.7.4, 0.8.0-beta1 to 
 ~/cassandra/cassandra-VERSION
 * Add data:load task to Rakefile for creating the schema required for the 
 tests
 * Default the Rakefile to use 0.8-beta1
 * Setup test suite to work on 0.6.13, 0.7.4, and 0.8.0-beta1
 * All tests pass for all supported (0.6.13, 0.7.4, 0.8.0-beta1) versions.
 * Added Support for 0.8-beta1
 * Changed get_index_slices to return a hash of rows
 * Updated Cassandra::Mock to pass all tests for each Cassandra version

 Please review my changes at:

 https://github.com/rjackson/cassandra

 I have submitted a pull request to the main fauna/cassandra repo on Github.

 The next round of updates will be to add an additional version for CQL.

 Robert Jackson

 - Original Message -
 From: Robert Jackson robe...@promedicalinc.com
 To: client-dev@cassandra.apache.org
 Sent: Saturday, April 23, 2011 12:58:03 AM
 Subject: Updating cassandra RubyGem for 0.8 & CQL


 I have been working on a local fork of the fauna/cassandra rubygem to add 
 support for 0.8. I am a relative newcomer to Cassandra in general, and 
 working on the internals of the client has really helped.

 To make sure that I didn't lose any ground with other versions of Cassandra, 
 I updated the test suite so that it can run tests against 0.6, 0.7, and 0.8. 
 This works by setting a CASSANDRA_VERSION env variable before calling the 
 normal rake or bin/cassandra_helper scripts.

 Run the desired cassandra version with:

 CASSANDRA_VERSION=0.8 rake cassandra

 Then you can run the tests with:

 CASSANDRA_VERSION=0.8 rake

 I still have a ways to go with getting all the tests passing, but at this 
 point 0.8 and 0.6 have around 6 failures and 8 errors. (I am struggling with 
 a schema-loading issue with 0.7.4.)

 I hope to have the tests all passing in the next couple of days, and 
 hopefully we can get the changes pushed upstream. Then I am going to start 
 fleshing out the CQL version (which hopefully shouldn't be such a moving 
 target between Cassandra versions).

 I would certainly appreciate any feedback on my work so far.

 https://github.com/rjackson/cassandra/tree/cassandra_0.8

 Robert Jackson



Re: Maintenance releases

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 8:35 AM, Gary Dusbabek gdusba...@gmail.com wrote:
 I've been uncomfortable with the number of features I perceive are
 going into our maintenance releases for a while now.  I thought it
 would stop after we committed ourselves to having a more predictable
 major release schedule.  But getting 0.7.1 out feels like it's taken a
 lot more effort than it should have.  I wonder if part of the problem
 is that we've been committing destabilizing features into it?  IMO,
 maintenance releases (0.7.1, 0.7.2, etc.) should only contain bug
 fixes and *carefully* vetted features.

 I've scanned down the list of 0.7.1 changes in CHANGES.txt and about
 half of them are features that I think could have stayed in trunk.  I
 think we did this a lot with the early maintenance releases of 0.6 as
 well, probably in an effort to get features out *now* instead of
 waiting for an 0.7 that was not happening soon enough.  We've decided
 to pick up the pace of our major release schedule (sticking to four
 months).  I think maintaining this pace will be difficult if we
 continue to commit as many features into the minor releases as we have
 been.

 I'm willing to concede that I may have an abnormally conservative
 opinion about this.  But I wanted to voice my concern in hopes we can
 improve the quality and delivery of our maintenance releases.

I agree with you. We've tried both approaches, and I believe it's
clear that releasing features in maintenance releases leads to more
pain and unpredictability.

-ryan


Re: Does the Ruby library return the RowKey?

2011-02-10 Thread Ryan King
On Thu, Feb 10, 2011 at 2:17 AM, Joshua Partogi joshua.j...@gmail.com wrote:
 Hi,

 Does the Ruby library currently return the RowKey during a row get?
 From what I am seeing it seems like it is only returning an
 OrderedHash of the columns. Would it be possible to return the RowKey,
 or does it not make sense to do so?

Which method are you talking about? It doesn't make sense to return
the row key on a get or get_slice, but it does for multiget and company
(which it should already do).

-ryan


Re: Monitoring Cluster with JMX

2011-02-09 Thread Ryan King
If you're using 0.7, I'd skip jmx and use the mx4j http interface, then
write scripts that convert the data to the format you need.

-ryan

On Wed, Feb 9, 2011 at 2:47 AM, Roland Gude roland.g...@yoochoose.com wrote:
 Unfortunately not, as the Nagios JMX check expects a numeric return value and 
 only allows for defining thresholds for issuing warnings or errors depending 
 on that value. It does not allow for post-processing the return values.

 roland

 From: Aaron Morton [mailto:aa...@thelastpickle.com]
 Sent: Tuesday, February 8, 2011 21:32
 To: dev@cassandra.apache.org
 Subject: Re: Monitoring Cluster with JMX

 Can't you get the length of the list on the monitoring side of things?
 aaron
 On 08 Feb, 2011, at 10:25 PM, Roland Gude roland.g...@yoochoose.com wrote:
 Hello,

 we are trying to monitor our cassandra cluster with Nagios JMX checks. While 
 there are JMX attributes which expose the list of reachable/unreachable 
 hosts, it would be very helpful to have additional numeric attributes 
 exposing the size of these lists. This could be used to set thresholds in 
 Nagios monitoring, e.g. issue a warning unless at least 3 hosts are 
 reachable.
 This is probably not hard to do and we are willing to implement/supply 
 patches if someone could point us in the right direction on where to 
 implement it.

 Greetings,
 roland

 --
 YOOCHOOSE GmbH

 Roland Gude
 Software Engineer

 Im Mediapark 8, 50670 Köln

 +49 221 4544151 (Tel)
 +49 221 4544159 (Fax)
 +49 171 7894057 (Mobil)


 Email: roland.g...@yoochoose.com
 WWW: www.yoochoose.com

 YOOCHOOSE GmbH
 Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann
 Handelsregister: Amtsgericht Köln HRB 65275
 Ust-Ident-Nr: DE 264 773 520
 Sitz der Gesellschaft: Köln




-- 
-@rk


Re: Proposal: fixed release schedule

2011-01-20 Thread Ryan King
On Thu, Jan 20, 2011 at 8:39 AM, Eric Evans eev...@rackspace.com wrote:
 On Wed, 2011-01-19 at 10:29 -0600, Jonathan Ellis wrote:
 On Tue, Jan 18, 2011 at 12:36 PM, Eric Evans eev...@rackspace.com wrote:
  The discussion seems to be petering out and I wonder if that means folks
  are still trying to wrap their heads around everything, or if we have
  consensus.
 
  If we're in agreement on 4 months between releases, and feature-freezing
  branches in the run-up, then that would leave us with say 7 weeks (give
  or take) to land everything in trunk that we expect in the next release
  (and I would think that at this point we'd at least have a good idea
  what that'd be).

 Sounds good.

 I've assigned to the Riptano/Datastax team the issues we can get to in

 OK, then I'm going to assume we have consensus on this.

 So again, we released on Jan 9th; 4 months (nominally) would give us a
 release date of May 9th.  We need a few weeks for testing and bug
 fixing, say time enough for a couple beta iterations, so let's set a
 tentative date of April 9 to branch (just under 7 weeks from now).

 As of right now I see 71 issues marked 0.8.  I haven't been through all
 of them, some are trivial or have patches attached, but some are no
 doubt unrealistic considering the time-line.

 For any issues you're championing, please take some time over the next
 couple of weeks to make sure the ones marked fixfor-0.8 match what you
 can accomplish before we branch.

 Is that reasonable to everyone?

Seems reasonable to me, though I think the release date can be a bit
more flexible (while the freeze date shouldn't be). In other words, if
we feature freeze and branch on April 9th and are then ready to ship
before May 9th, we should just go ahead and ship.

I'm guessing that we'll have to cut a bunch of scope in order to make
this happen.

-ryan


Re: Time for 1.0

2011-01-14 Thread Ryan King
On Thu, Jan 13, 2011 at 7:32 PM, Jonathan Ellis jbel...@gmail.com wrote:
...
 In other words, at some point you have so many production users that
 it's silly to pretend it's ready for 1.0.  I'd say we've passed that
 point.

Did you mean to say "silly to pretend it's *not* ready for 1.0"?
Otherwise, I don't understand.

 I'm on board with this, to the point that Riptano is hiring a
 full-time QA engineer to contribute here.

Like I said at the outset, I don't care so much about what the version
is called as long as the quality continues to improve.

-ryan


Re: Time for 1.0

2011-01-13 Thread Ryan King
I'm a -1 on naming the next release 1.0 because I don't think it has
the quality that 1.0 implies, but to be honest I don't really care
that much. The version numbers don't really affect those of us who
are running production clusters. Calling it 1.0 won't make it any
more stable or faster.

Also, before we say that everything people want in 1.0 is done,
perhaps we need to do that survey again. A lot of people have joined
the community since 0.5 days and their needs should probably be
considered in this situation. Also, those of us who've been around
have new things we care about. Of course this will always be true, and
at some point we need to draw a line in the sand and put the 1.0 stamp
on it; I just feel that time has not come yet (but, like I said, I
don't really care that much because it won't affect me).

Regardless of what we call the next major release there's at least 2
things I'd like to see happen:

1. Make the distributed test suite more reliable (it's admittedly flaky
on ec2) and flesh it out to include all distributed functionality. We
shouldn't run a distributed system without distributed tests. We'll
work on the flakiness, but we need people to write tests (and
reviewers to require tests).
2. I think we should change how we plan releases. I'll send another
email about this soon.

-ryan

On Tue, Jan 11, 2011 at 5:35 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Way back in Nov 09, we did a users survey and asked what features
 people wanted to see.  Here was my summary of the responses:
 http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html

 Looking at that, we've done essentially all of them.  I think we can
 make a strong case that our next release should be 1.0; it's
 production ready, it's reasonably feature-complete, it's documented,
 and we know what our upgrade path story is.

 The list--

 Load balancing: basics done;
 https://issues.apache.org/jira/browse/CASSANDRA-1427 is open to
 improve it

 Decommission: done

 Map/reduce support: done

 ColumnFamily / Keyspace definitions w/o restart: done

 Design documentation: started at
 http://wiki.apache.org/cassandra/ArchitectureInternals

 Insert multiple rows at once: done

 Remove_slice_range / remove_key_range: turned out to be a *lot* harder
 than it looks at first.  Postponed indefinitely.

 Secondary indexing: done

 Caching: done (with some enhancements possible such as
 https://issues.apache.org/jira/browse/CASSANDRA-1969 and
 https://issues.apache.org/jira/browse/CASSANDRA-1956)

 Bulk delete (truncate): done

 I would add,

 User documentation: done (http://www.riptano.com/docs)

 Large row support: done

 Improved replication strategies and more sophisticated ConsistencyLevels: done

 Efficient bootstrap/streaming: done

 Flow control: done

 Network-level compatibility between releases: scheduled
 (https://issues.apache.org/jira/browse/CASSANDRA-1015)

 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Proposal: fixed release schedule

2011-01-13 Thread Ryan King
I think many believe that shipping 0.7 took longer than it should have.
Rather than going into why that happened, I'd like to propose a better
way to move forward that will hopefully allow us to ship on a more
predictable schedule. This proposal is heavily influenced by the
google chrome release process:

http://www.scribd.com/doc/46659928/Chrome-Release-Cycle-12-16-2010

...which is heavily influenced by how large websites deploy code
(everyone close to trunk, hide incomplete changes behind configuration
flags, etc.)

I'm not saying we should adopt this process as-is, but some aspects of
it seem like they would be valuable:

# Fixed schedule

We should set a fixed schedule and stick to it. Any features not
ready at branch time won't make it and will be disabled in the stable
branch.

# Trunk-first

Everyone on Chrome commits to trunk first. I think the important
change we could make is to keep everyone closer to trunk. We spend a
good deal of effort back-porting patches between major versions. I
think we should make the major versions less different. This would
mean letting them live for shorter amounts of time and possibly making
them bugfix only. Currently we add new features in stable branches,
but I think if we made the major release schedule more predictable
people would be more comfortable with letting their new feature wait
until the next major version.

We should be more liberal about committing things to trunk early and
iterating on them there (rather than iterating on them in patches). If
the features are unstable we can either hide them behind configuration
flags or remove them when we cut a stable branch.

# Automate all tests

I think the only way that we can keep people close to trunk and stay
stable is to build automated tests for *everything*. All code should
be exercised by thorough unit tests and distributed black-box tests.
Every regression should get a test.


Chrome has a 6-week cycle. I think ours would be more like 4 months
for major releases.

Whatever we do, I think the schedule needs to be more predictable,
which means that the contents of each release will be less predictable
(since it's whatever's ready at the appointed time). As the Chrome
presentation mentioned, the idea isn't raw speed but predictable
release schedules.

Feedback please.

-ryan


Re: Proposal: fixed release schedule

2011-01-13 Thread Ryan King
To be more clear, here's what I think is broken in the current release planning:

1. The dates are wildly unpredictable
2. People aren't allowed to work against trunk on features for
multiple iterations (see #1072)
3. Stable branches diverge too much, causing duplicated effort. (we
essentially implemented #1072 twice for 0.6 and 0.7)
4. Back-porting features is risky and causes bugs, especially with the
limited QA available.

-ryan

On Thu, Jan 13, 2011 at 2:32 PM, Ryan King r...@twitter.com wrote:
 I think many believe that shipping 0.7 took longer than it should have.
 Rather than going into why that happened, I'd like to propose a better
 way to move forward that will hopefully allow us to ship on a more
 predictable schedule. This proposal is heavily influenced by the
 google chrome release process:

 http://www.scribd.com/doc/46659928/Chrome-Release-Cycle-12-16-2010

 ...which is heavily influenced by how large websites deploy code
 (everyone close to trunk, hide incomplete changes behind configuration
 flags, etc.)

 I'm not saying we should adopt this process as-is, but some aspects of
 it seem like they would be valuable:

 # Fixed schedule

 We should set a fixed schedule and stick to it. Any features not
 ready at branch time won't make it and will be disabled in the stable
 branch.

 # Trunk-first

 Everyone on Chrome commits to trunk first. I think the important
 change we could make is to keep everyone closer to trunk. We spend a
 good deal of effort back-porting patches between major versions. I
 think we should make the major versions less different. This would
 mean letting them live for shorter amounts of time and possibly making
 them bugfix only. Currently we add new features in stable branches,
 but I think if we made the major release schedule more predictable
 people would be more comfortable with letting their new feature wait
 until the next major version.

 We should be more liberal about committing things to trunk early and
 iterating on them there (rather than iterating on them in patches). If
 the features are unstable we can either hide them behind configuration
 flags or remove them when we cut a stable branch.

 # Automate all tests

 I think the only way that we can keep people close to trunk and stay
 stable is to build automated tests for *everything*. All code should
 be exercised by thorough unit tests and distributed black-box tests.
 Every regression should get a test.


 Chrome has a 6-week cycle. I think ours would be more like 4 months
 for major releases.

 Whatever we do, I think the schedule needs to be more predictable,
 which means that the contents of each release will be less predictable
 (since it's whatever's ready at the appointed time). As the Chrome
 presentation mentioned, the idea isn't raw speed but predictable
 release schedules.

 Feedback please.

 -ryan



Re: Proposal: fixed release schedule

2011-01-13 Thread Ryan King
On Thu, Jan 13, 2011 at 4:04 PM, Jonathan Ellis jbel...@gmail.com wrote:
 On Thu, Jan 13, 2011 at 2:32 PM, Ryan King r...@twitter.com wrote:
 # Fixed schedule

 We should set a fixed schedule and stick to it. Anything features not
 ready at branch time won't make it and will be disabled in the stable
 branch.

 I like this idea, as long as we're willing to be flexible when
 warranted.  Sometimes it is less work to finish a feature than to rip
 it out.

Two things:

First, I think a key part of how you make this successful (both for
Chrome and for continuously deployed software like large services) is
that non-trivial changes almost always have to be hidden behind flags
until they're ready for wide use.

Second, I think this will only work well if we are somewhat strict about it.

 # Trunk-first

 Everyone on chrome commits to trunk first.

 I suppose that's fine if it works for them, but it's not The One True
 Way.  Changes that affect both stable and trunk branches should really
 be applied to stable first and merged forward.  Here is a good
 presentation explaining why:
 http://video.google.com/videoplay?docid=-577744660535947210.

 Another reason is that committing fixes to a stable branch and then
 using "svn merge branch" from trunk means svn tracks everything that
 has been committed to the branch and not yet to trunk, and merges it in.
 So it protects us somewhat against people committing a fix to trunk,
 then forgetting to commit to the stable branch.

I guess I don't care as much about the mechanics of this as the
intent, which is to keep stable and trunk closer together. And to keep
people working on a more common base.

 I think the important
 change we could make is to keep everyone closer to trunk. We spend a
 good deal of effort back-porting patches between major versions. I
 think we should make the major versions less different. This would
 mean letting them live for shorter amounts of time and possibly making
 them bugfix only.

 In theory I agree (see: policy for 0.4 and 0.5 stable releases).  In
 practice, users overwhelmingly wanted more than that in between major
 releases.  Not that users are always right, but this is an area where
 I think they are worth listening to. :)

Perhaps minor things are worth adding in a stable branch. I think this
is an area where judgement can come into play.

 I think if we made the major release schedule more predictable
 people would be more comfortable with letting their new feature wait
 until the next major version.

 In my experience it's not the unpredictability as much as "I'm feeling
 this pain Right Now and four months is too long to wait."

Perhaps waiting is the biggest pain for users, but for developers
unpredictability is just as big a problem. If I don't know when
release N+1 is going to happen I might try to hurry to get my feature
into release N. If I have confidence that release N+1 will come
promptly at a scheduled time I can set my expectation appropriately.

 We should be more liberal about committing things to trunk early and
 iterating on them there (rather than iterating on them in patches).

 I agree in the sense that we were too slow to branch 0.7 to have an
 open trunk to start work on.

 But I disagree in the sense that we shouldn't be committing
 works-in-progress to trunk because that becomes the baseline everyone
 else has to develop from.  (I know at least one team with a nontrivial
 patchset against trunk from the 0.7 beta1 timeframe, back when it had
 the Clock struct that we committed prematurely.)

So we made a mistake once. :) I think committing large changes in
smaller pieces will be a net positive, even if it occasionally trips
us up. For example, I think the work on counters between our team and
Sylvain improved dramatically once we committed 1072.

 IMO the right fix is to help the ASF make git an option; in the
 meantime the best workaround is a git-based workflow with
 git-jira-attacher and git-am as described in
 http://spyced.blogspot.com/2009/06/patch-oriented-development-made-sane.html
 and http://wiki.apache.org/cassandra/GitAndJIRA.

So you're proposing that we use git to keep long-running feature branches?

 # Automate all tests

 I think the only way that we can keep people close to trunk and stay
 stable is to build automated tests for *everything*. All code should
 be exercised by thorough unit tests and distributed black-box tests.
 Every regression should get a test.

 Agreed.

 Chrome has a 6 week cycle. I think ours would be more like 4 months
 for major releases.

 Four months feels about right to me, too, although for 0.7 + 1 I'd
 like to make it a bit shorter (beginning of April?) since we have
 several features (1072 being the most prominent) that just barely
 missed 0.7.

Like I said, we're going to have to figure out the right pace, but we
should try and stick to it.

 Whatever we do, I think the schedule needs to be more predictable,
 which means that the contents of each release will be less

Re: [VOTE] 0.7.0

2011-01-06 Thread Ryan King
+1 non-binding

-ryan

On Thu, Jan 6, 2011 at 10:24 AM, Jonathan Ellis jbel...@gmail.com wrote:
 +1 for reals
 On Jan 6, 2011 11:14 AM, Eric Evans eev...@rackspace.com wrote:

 RC 4 seems to be holding up OK, shall we? I propose the following for
 release as 0.7.0 (aka For Reals Yo).

 SVN:

 https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0@r1055934
 0.7.0 artifacts: http://people.apache.org/~eevans

 The vote will be open for at least 72 hours.

 P.S. Don't forget that there is still a vote open for 0.6.9

 [1]: http://goo.gl/uT89p (CHANGES.txt)
 [2]: http://goo.gl/Bi8LD (NEWS.txt)
 [3]: http://goo.gl/MHe1z (True Grit(s))

 --
 Eric Evans
 eev...@rackspace.com




Re: Coordinated testing for 0.7

2010-12-01 Thread Ryan King
I'd be happy to host a hackathon at Twitter HQ in SF for this. Anyone
interested in that?

-ryan

On Wed, Dec 1, 2010 at 7:18 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:
 Perhaps the time could be better spent trying to beef up the integration 
 tests and looking for ways to root out potential regressions...

 Back in September a handful of us in the Austin/San Antonio area did an Avro 
 hackathon to get functional parity between thrift and avro.  I wonder if 
 there could be a day set aside to do something to contribute to testing out 
 0.7 - unit/integration test additions would be beneficial long-term.

 Anyway, it could be coordinated by one or a small number of people so that 
 there isn't duplication - something like that.

 I know several have spent long hours already making it solid.  Just trying to 
 brainstorm ways to get some additional good contributions in the core, 
 making for a more solid 0.7.0 release.

 Again...  any thoughts?

 On Dec 1, 2010, at 3:24 PM, Jeremy Hanna wrote:

 I was wondering if there was a coordinated plan for testing the 0.7 release. 
  I realize that testing is ultimately up to the individual team.  However, 
 with 0.7 there are a _lot_ of significant changes and I wondered if there 
 was interest in coordinating efforts to do more extensive testing above and 
 beyond the integration tests and things built into the source tree currently.

 I think https://issues.apache.org/jira/browse/CASSANDRA-874 is also relevant 
 for furthering the integration tests.

 Any thoughts?




Distributed Counters Use Cases

2010-10-06 Thread Ryan King
In the spirit of making sure we have clear communication about our
work, I'd like to outline the use cases Twitter has for distributed
counters. I expect that many of you using Cassandra currently or in
the future will have similar use cases.

The first use case is pretty simple: high scale counters. Our Tweet
button [1] is powered by #1072 counters. We count every mention of
every url that comes through a public tweet. As you would expect,
there are a lot of urls and a lot of traffic to this widget (it's on
many high-traffic sites, though it is highly cached).

The second is a bit more complex: time series data. We have built
infrastructure that can process logs (in real time from scribe) or
other events and convert them into a series of keys to increment,
buffer the data for 1 minute, and increment those keys. For logs, each
aggregator would do its own increment (so per thing you're tracking you
get an increment for each aggregator), but for events it'll be one
increment per event. We plan to open source all of this soon.

We're hoping to soon start replacing our ganglia clusters with this.
For the ganglia use-case we end up with a large number of increments
for every read. For monitoring data, even a reasonably sized fleet
with a moderate number of metrics can generate a huge amount of data.
Imagine you have 500 machines (not how many we have) and measure 300
(a reasonable estimate based on our experience) metrics per machine.
Suppose you want to measure these things every minute and roll the
values up every hour, day, month and for all time. Suppose also that
you were tracking sum, count, min, max, and sum of squares (so that
you can do standard deviation).  You also want to track these metrics
across groups like web hosts, databases, datacenters, etc.

These basic assumptions would mean this kind of traffic:

(500 + 100) * 300 * 5 * 4 = 3,600,000 increments/minute
(machines + groups) * metrics * time granularities * aggregates

Read traffic, being employee-only, would be negligible compared to this.

One other use case is that for many of the metrics we track, we want
to track the usage across several facets.

For example [2], to build our local trends feature, you could store a
time series of terms per city. In this case supercolumns would be a
natural fit because the set of facets is unknown and open:

Imagine a CF that has data like this:

city0 => { hour0 => { term1 => 2, term2 => 1000, term3 => 1 },
           hour1 => { term5 => 2, term2 => 10 } }
city1 => { hour0 => { term12 => 3, term0 => 500, term3 => 1 },
           hour1 => { term5 => 2, term2 => 10 } }

Of course, there are some other ways to model this data: you could
collapse the subcolumn names into the column names and re-do how you
slice (you have to slice anyway). You have to have fixed-width terms
then, though:

city0 => { hour0+term1 => 2, hour0+term2 => 1000, hour0+term3 => 1,
           hour1+term5 => 2, hour1+term2 => 10 }
city1 => { hour0+term12 => 3, hour0+term0 => 500, hour0+term3 => 1,
           hour1+term5 => 2, hour1+term2 => 10 }

This is doable, but could be rough.

The other option is to have a separate row for each facet (with a
compound key of [city, term]), and build a custom comparator that only
looks at the first part for generating the token; then we have to do
range slices to get all the facets. Again, doable, but not pretty.


-ryan


1. http://twitter.com/goodies/tweetbutton
2. this is not how we actually do this, but it would be a reasonable approach.


Re: [DISCUSSION] High-volume counters in Cassandra

2010-09-28 Thread Ryan King
Sorry, been catching up on this.

From Twitter's perspective, 1546 is probably insufficient because it
doesn't allow one to do time-series data without supercolumns (which
might work ok, but would require a good deal of work). Additionally, one of
our deployed systems already does supercolumns of counters, which is
not feasible in this design at all.

-ryan

On Tue, Sep 28, 2010 at 10:12 AM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 Is there any feedback from Twitter and Digg and perhaps SimpleGeo people 
 about CASSANDRA-1546?  Would that work so that you wouldn't have to maintain 
 a fork?

 On Sep 27, 2010, at 5:25 AM, Sylvain Lebresne wrote:

 In CASSANDRA-1546, I propose an alternative to #1072. At its core,
 it rewrites #1072 without the clocks structure (by splitting the clock into
 individual columns, not unlike what Zhu Han proposed in his preceding
 mail, but in a row instead of a super column, for reasons explained in the
 issue).

 But it is also my belief that it improves on the actual patch of #1072 in
 the following ways:
  - it supports increments and decrements
  - it supports the usual consistency levels
  - it proposes an (optional) solution to the idempotency problem of
    increments (it's optional because it has a (fairly slight) performance 
 cost
    that some may want to remove if they understand the risk).

 When I say "I propose", I mean that I did write the patch (attached to
 the jira ticket). I've just written it, so it is really under-tested and
 has a few details here and there to fix, but it should already be fairly
 functional (it passes basic system tests).

 I welcome all comments on the patch. It was written with the goal of
 addressing most of the concerns that have been raised about those
 counters over the past few months (both in terms of performance and
 implementation). It is my belief that it reaches this goal; hopefully
 others will agree.

 --
 Sylvain

 On Mon, Sep 27, 2010 at 5:32 AM, Zhu Han schumi@gmail.com wrote:
  I propose a new way to solve the counter problem in cassandra-1502[1].
 Since I do not follow the jira updates very carefully, I paste it here and
 want to let more people comment on it and then see whether it's feasible.

 It seems we have not found a solution acceptable to everybody. I will try
 to propose a new approach. Let's see whether anybody can shed some light
 on it and make it a reality.

 1) We add a basic data structure, called a counter, which is a special
 type of super column.

 2) The name of each column in the counter super column is the host name
 of a cassandra node, and the value is the calculated result from that node.

 3) WRITE PATH: Once a node receives an add/dec request for a counter, it
 de-serializes its local counter super column and atomically updates the
 column named after itself. After that, it propagates the updated column
 value to other replicas, just as the mutation of a normal column is
 propagated. Different consistency levels can be supported as before.

 4) READ PATH: Depending on the consistency level, contact several replicas,
 read back the counter super column as a whole, and get the latest counter
 value by summing up all the columns in the counter. Read-repair logic can
 work as before.

 IMHO, the biggest advantage of this approach is re-using as many
 mechanisms already in the code as possible, so it might not be so
 disruptive. But adding a new thrift API is inevitable.
 NB: Even if it's feasible, I might not be the right man to work on it, as
 I have not touched the internals of cassandra for more than a year. I
 want to contribute something to help us reach consensus.

 [1]
 https://issues.apache.org/jira/browse/CASSANDRA-1502?focusedCommentId=12915103&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12915103

 best regards,
 hanzhu


 On Sun, Sep 26, 2010 at 9:49 PM, Jonathan Ellis jbel...@gmail.com wrote:

 you have misunderstood.  if we continue the 1072 approach of writing
 counter data to the clock field, this is necessarily incompatible with
 the right way of writing counter data to the value field.  it's no
 longer simply a matter of reversing 1070.

 On Sat, Sep 25, 2010 at 11:50 PM, Zhu Han schumi@gmail.com wrote:
 Jonathan,

 This is a personal email.

 On Sun, Sep 26, 2010 at 1:27 PM, Jonathan Ellis jbel...@gmail.com
 wrote:

 On Sat, Sep 25, 2010 at 8:57 PM, Zhu Han schumi@gmail.com wrote:
 Can we just let the patch be committed but mark it as alpha or
 experimental?

 I explained exactly why that is not a good approach here:
 http://www.mail-archive.com/dev@cassandra.apache.org/msg00917.html

 Yes, I see. But the clock structure is in truck since Cassandra-1070.  We
 still need to clean them
 out,  whatever. We need somebody to be volunteer to take this work.
 Considering the complexity
 of Cassandra-1070, the programmer who has the in depth knowledge of this
 patch is preferable. And it
 will take some time to do it.


Re: Locking in cassandra

2010-08-16 Thread Ryan King
On Mon, Aug 16, 2010 at 6:07 AM, Maifi Khan maifi.k...@gmail.com wrote:
 Hi
 How is the locking implemented in cassandra? Say, I have 10 nodes and
 I want to write to 6 nodes, which is (n+1)/2.

Not to be too pedantic, but you're misunderstanding how to use
cassandra. When we talk about 'n' we mean the number of replicas for a
given piece of data, not the total number of nodes. If you have 10
nodes, you shouldn't be writing a piece of data to 6 of them.

-ryan