In a recent thread on the list, I received several important answers to my questions about the Hadoop plugin. That thread may help you: https://www.spinics.net/lists/ceph-users/msg40790.html
One of the most important answers is about data locality. The last message led me to this article: https://www.bluedata.com/blog/2015/05/data-locality-is-irrelevant-for-hadoop/

Regards,
--
Aristeu

2017-12-22 2:04 GMT-02:00 Serkan Çoban <[email protected]>:

> > Also, are there any benchmark comparisons between hdfs and ceph
> > specifically around performance of apps benefiting from data locality?
> There will be no data locality in ceph, because all the data is
> accessed through the network.
>
> On Fri, Dec 22, 2017 at 4:52 AM, Traiano Welcome <[email protected]> wrote:
> > Hi List
> >
> > I'm researching the possibility of using ceph as a drop-in replacement
> > for hdfs for applications using spark and hadoop.
> >
> > I note that the jewel documentation states that it requires hadoop 1.1.x,
> > which seems a little dated and would be of concern for people:
> >
> > http://docs.ceph.com/docs/jewel/cephfs/hadoop/
> >
> > What about the 2.x series?
> >
> > Also, are there any benchmark comparisons between hdfs and ceph
> > specifically around performance of apps benefiting from data locality?
> >
> > Many thanks in advance for any feedback!
> >
> > Regards,
> > Traiano
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
