1) Works with RandomPartitioner. This is huge and the only way almost everyone would be able to use it.
2) Ability to divide up the keys of a single node among more than one mapper. The prototype just slurped up everything on the node. It would probably be easiest not to make this configurable and to just let it be part of the InputSplit calculation.
3) Progress information should be calculated and displayed.

-- Jeff
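To make items 2 and 3 concrete, here is a rough sketch of how one node's token range might be cut into several sub-ranges (one per mapper) and how progress through a sub-range could be estimated. It is not taken from the prototype. The names `RING_SIZE`, `split_token_range`, and `progress` are hypothetical, and the sketch assumes RandomPartitioner tokens are integers on a ring of size 2**127 that are spread evenly enough for an even cut by token value to give roughly even work.

```python
# Assumption: RandomPartitioner tokens are integers in [0, 2**127) on a ring.
RING_SIZE = 2 ** 127


def _width(start, end):
    """Number of tokens in the ring range (start, end]; start == end means the full ring."""
    return (end - start) % RING_SIZE or RING_SIZE


def split_token_range(start, end, num_splits):
    """Cut the ring range (start, end] into contiguous sub-ranges, one per mapper.

    Returns a list of (lo, hi) pairs; the range may wrap past the top of the ring.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    width = _width(start, end)
    # Never produce more splits than there are tokens in the range.
    num_splits = min(num_splits, width)
    splits = []
    lo = start
    for i in range(1, num_splits + 1):
        hi = (start + width * i // num_splits) % RING_SIZE
        splits.append((lo, hi))
        lo = hi
    return splits


def progress(start, end, current):
    """Estimate the fraction of (start, end] already read, given the last token seen."""
    done = (current - start) % RING_SIZE
    return min(done / _width(start, end), 1.0)
```

For example, `split_token_range(0, 100, 4)` yields four equal sub-ranges, and a reader that has reached token 50 in the range (0, 100] would report `progress(0, 100, 50)` as one half. Estimating progress from token position only approximates the number of rows read, which is why it depends on the even-distribution assumption above.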
On Mon, Jan 25, 2010 at 5:43 AM, Phillip Michalak <phil.micha...@digitalreasoning.com> wrote:
> Multiple people have expressed an interest in 'hadoop integration' and
> 'map/reduce functionality' within Cassandra. I'd like to get a feel for what
> that means to different people.
>
> As a starting point for discussion, Jeff Hodges undertook a prototype effort
> last summer which was the subject of this thread:
> http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200907.mbox/%3cf5f3a6290907240123y22f065edp1649f7c5c1add...@mail.gmail.com%3e.
>
> Jeff explicitly mentions data locality as one of the things that was out of
> scope for the prototype. What other features or characteristics would you
> expect to see in an implementation?
>
> Thanks,
> Phil
>