Re: deployment of spark on mesos and data locality in tachyon/hdfs
Response inline. On Tue, Mar 31, 2015 at 10:41 PM, Sean Bigdatafun sean.bigdata...@gmail.com wrote: (resending...) I was thinking the same setup… But the more I think of this problem, and the more interesting this could be. If we allocate 50% total memory to Tachyon statically, then the Mesos benefits of dynamically scheduling resources go away altogether. People can still benefits from Mesos' dynamically scheduling of the rest memory as well as compute resource. Can Tachyon be resource managed by Mesos (dynamically)? Any thought or comment? This requires some integration work. Best, Haoyuan Sean Hi Haoyuan, So on each mesos slave node I should allocate/section off some amount of memory for tachyon (let's say 50% of the total memory) and the rest for regular mesos tasks? This means, on each slave node I would have tachyon worker (+ hdfs configuration to talk to s3 or the hdfs datanode) and the mesos slave ?process. Is this correct? -- --Sean -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/
Re: deployment of spark on mesos and data locality in tachyon/hdfs
(resending...) I was thinking the same setup… But the more I think of this problem, and the more interesting this could be. If we allocate 50% total memory to Tachyon statically, then the Mesos benefits of dynamically scheduling resources go away altogether. Can Tachyon be resource managed by Mesos (dynamically)? Any thought or comment? Sean Hi Haoyuan, So on each mesos slave node I should allocate/section off some amount of memory for tachyon (let's say 50% of the total memory) and the rest for regular mesos tasks? This means, on each slave node I would have tachyon worker (+ hdfs configuration to talk to s3 or the hdfs datanode) and the mesos slave ?process. Is this correct? -- --Sean
Re: deployment of spark on mesos and data locality in tachyon/hdfs
Tachyon should be co-located with Spark in this case. Best, Haoyuan On Tue, Mar 31, 2015 at 4:30 PM, Ankur Chauhan achau...@brightcove.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, I am fairly new to the spark ecosystem and I have been trying to setup a spark on mesos deployment. I can't seem to figure out the best practices around HDFS and Tachyon. The documentation about Spark's data-locality section seems to point that each of my mesos slave nodes should also run a hdfs datanode. This seems fine but I can't seem to figure out how I would go about pushing tachyon into the mix. How should i organize my cluster? Should tachyon be colocated on my mesos worker nodes? or should all the spark jobs reach out to a separate hdfs/tachyon cluster. - -- Ankur Chauhan -BEGIN PGP SIGNATURE- iQEcBAEBAgAGBQJVGy4bAAoJEOSJAMhvLp3L5bkH/0MECyZkh3ptWzmsNnSNfGWp Oh93TUfD+foXO2ya9D+hxuyAxbjfXs/68aCWZsUT6qdlBQU9T1vX+CmPOnpY1KPN NJP3af+VK0osaFPo6k28OTql1iTnvb9Nq+WDlohxBC/hZtoYl4cVxu8JmRlou/nb /wfpp0ShmJnlxsoPa6mVdwzjUjVQAfEpuet3Ow5veXeA9X7S55k/h0ZQrZtO8eXL jJsKaT8ne9WZPhZwA4PkdzTxkXF3JNveCIKPzNttsJIaLlvd0nLA/wu6QWmxskp6 iliGSmEk5P1zZWPPnk+TPIqbA0Ttue7PeXpSrbA9+pYiNT4R/wAneMvmpTABuR4= =8ijP -END PGP SIGNATURE- - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/
Re: deployment of spark on mesos and data locality in tachyon/hdfs
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Haoyuan, So on each mesos slave node I should allocate/section off some amount of memory for tachyon (let's say 50% of the total memory) and the rest for regular mesos tasks? This means, on each slave node I would have tachyon worker (+ hdfs configuration to talk to s3 or the hdfs datanode) and the mesos slave process. Is this correct? On 31/03/2015 16:43, Haoyuan Li wrote: Tachyon should be co-located with Spark in this case. Best, Haoyuan On Tue, Mar 31, 2015 at 4:30 PM, Ankur Chauhan achau...@brightcove.com mailto:achau...@brightcove.com wrote: Hi, I am fairly new to the spark ecosystem and I have been trying to setup a spark on mesos deployment. I can't seem to figure out the best practices around HDFS and Tachyon. The documentation about Spark's data-locality section seems to point that each of my mesos slave nodes should also run a hdfs datanode. This seems fine but I can't seem to figure out how I would go about pushing tachyon into the mix. How should i organize my cluster? Should tachyon be colocated on my mesos worker nodes? or should all the spark jobs reach out to a separate hdfs/tachyon cluster. -- Ankur Chauhan - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org mailto:user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org mailto:user-h...@spark.apache.org -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/ - -- - -- Ankur Chauhan -BEGIN PGP SIGNATURE- iQEcBAEBAgAGBQJVGzKUAAoJEOSJAMhvLp3L3W4IAIVYiEKIZbC1a36/KWo94xYB dvE4VXxF7z5FWmpuaHBEa+U1XWrR4cLVsQhocusOFn+oC7bstdltt3cGNAuwFSv6 Oogs4Sl1J4YZm8omKVdCkwD6Hv71HSntM8llz3qTW+Ljk2aKhfvNtp5nioQAm3e+ bs4ZKlCBij/xV3LbYYIePSS3lL0d9m1qEDJvi6jFcfm3gnBYeNeL9x92B5ylyth0 BGHnPN4sV/yopgrqOimLb12gSexHGNP1y6JBYy8NrHRY8SxkZ4sWKuyDnGDCOPOc HC14Parf5Ly5FEz5g5WjF6HrXRdPlgr2ABxSLWOAB/siXsX9o/4yCy7NtDNcL6Y= =f2xI -END PGP SIGNATURE- - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org