Hi,

I've been following the Disco Project for a couple of years. The tricky part 
of using Disco with Riak would be making sure that each map phase is not 
executed multiple times over the same data*. Also, since each map phase would 
(preferably) run on the same host as its data (for data locality), you would 
also have to make sure that it only iterates over data associated with the 
vnodes on that physical host.

If you can easily extract the host-specific keys for a specific vnode, then 
this is doable. However, either the Disco master or the Disco job submitter 
will need to have all of this key-to-host information when a job is submitted.
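
A minimal sketch of what the job submitter would have to compute, assuming it 
can somehow obtain, for each key, the ordered list of hosts that hold a 
replica. The `replica_hosts` mapping and the host names are hypothetical, and 
nothing here calls the Riak or Disco APIs. Each key is assigned only to the 
first host in its list, so every key is mapped once, on a host that stores it.

```python
from collections import defaultdict


def assign_keys_to_hosts(replica_hosts):
    """Give each key to exactly one of the hosts that stores it.

    replica_hosts: dict mapping key -> ordered list of hosts holding a
    replica (hypothetical input; extracting it from Riak is the hard part).
    Returns a dict mapping host -> sorted list of keys to map on that host.
    """
    per_host = defaultdict(list)
    for key, hosts in replica_hosts.items():
        if not hosts:
            raise ValueError(f"no replica host known for key {key!r}")
        # Only the first listed host maps this key, so no key is mapped twice.
        per_host[hosts[0]].append(key)
    return {host: sorted(keys) for host, keys in per_host.items()}


if __name__ == "__main__":
    # Made-up example data: three keys replicated across three hosts.
    replicas = {
        "user:1": ["riak-a", "riak-b", "riak-c"],
        "user:2": ["riak-b", "riak-c", "riak-a"],
        "user:3": ["riak-a", "riak-c", "riak-b"],
    }
    for host, keys in sorted(assign_keys_to_hosts(replicas).items()):
        print(host, keys)
```

The resulting per-host key lists are what would be handed to Disco as the 
input for the map tasks scheduled on each host.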

Also, I'm not sure that it will help very much that both are written in Erlang.

Some ideas,
Jens

* Obviously, you could also chain your mapreduce jobs in Disco to remove 
duplicate maps, but this introduces overhead.
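
A rough sketch of such a chained de-duplication step, written as plain Python 
rather than against the Disco API. It assumes the map output is a stream of 
(key, value) pairs in which a repeated key means the same object was mapped 
more than once; the function name is made up for illustration.

```python
def drop_duplicate_maps(map_output):
    """Yield each (key, value) pair once, keeping the first seen per key."""
    seen = set()
    for key, value in map_output:
        if key in seen:
            continue  # this key was already mapped on another replica
        seen.add(key)
        yield key, value
```

The overhead is visible here: an extra pass over all map output, plus a set 
that grows with the number of distinct keys.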

From: riak-users [mailto:[email protected]] On Behalf Of Antonio Rohman 
Fernandez
Sent: 17 April 2013 13:15
To: [email protected]
Subject: Riak + Disco (MapReduce alternative)


Hello everybody,

Has anyone tried to use Riak with Disco? [ http://discoproject.org ] I was 
looking for Hadoop alternatives (as the RIAK-HADOOP connector project does not 
seem to be going anywhere) and I think Disco is quite interesting; moreover, it 
is written in Erlang, the same as Riak. It looks like it would be a good match!

From what I have seen on the mailing list, Riak's built-in MapReduce does not 
seem suitable for many of the queries I would be interested in running. My idea 
would be to offload the MapReduce work to a Hadoop (or Disco, or another) 
cluster that would do the GETs on the Riak cluster through an index (as 
suggested on this list: do multi-gets instead of MR) and reduce the data 
independently. Does anybody have suggestions about this?

Thanks,
Rohman




Antonio Rohman Fernandez
CEO, Founder & Lead Engineer
[email protected]<mailto:[email protected]>





