CoHadoop Papers

Gary Malouf Tue, 26 Aug 2014 04:44:02 -0700

One of my colleagues has been asking me why Spark/HDFS makes no attempt
to co-locate related data blocks.  He pointed to this 2011 paper on the
CoHadoop research and the performance improvements it yielded for
MapReduce jobs: http://www.vldb.org/pvldb/vol4/p575-eltabakh.pdf


Would it make sense, and be worthwhile, to apply these ideas when writing
data from Spark?
