Re: Mesos on Gentoo
Hi James,

Spark has support for HDFS; however, you don't have to use it, and there's no need to install the whole Hadoop stack. I've tested Mesos and Spark with the FhGFS distributed filesystem and it works just fine.

Tomas

On 8 September 2014 06:39, Vinod Kone vinodk...@gmail.com wrote:
> Hi James,
>
> Great to see a Gentoo package for Mesos! Regarding HDFS requirement, any shared storage (even just a http/ftp server works) that the Mesos slaves can pull the executor from is enough.
Re: Mesos on Gentoo
On 09/07/14 23:39, Vinod Kone wrote:
> Hi James,
>
> Great to see a Gentoo package for Mesos! Regarding HDFS requirement, any shared storage (even just a http/ftp server works) that the Mesos slaves can pull the executor from is enough.

Hello Vinod,

I'm looking for more specific advice, not only on what to choose for a distributed file system, but also some overarching guidance on why/how/where to look to figure out the gentoo_ish path to success. I think a big part of the problem with the distributed choices is that you either download binaries or things are written too generally to be of use. If it does not work, I'll work on option B, C, D...

Since I want a lightning-fast computation machine, where lots of cells perform the exact same complex calculations over and over again, I'm guessing I need a high-performance, open source file system.

Specific suggestions? Syntax (even if for another distro), pseudo_syntax, or a description of the steps (caveats?) would be most encouraging. "Mesos slaves can pull the executor from is enough" sounds very enticing, but I have no clue as to the choices or how to pursue any of them.

My background is EE/CS/math, so I have tendencies towards assembler and C. I find the whole OO paradigm very interesting, so some overbearing guidance would be keen?

James
Re: Mesos on Gentoo
On 09/08/14 02:55, Tomas Barton wrote:
> Spark has support for HDFS, however you don't have to use it and there's no need to install whole Hadoop stack. I've tested Mesos and Spark with FhGFS distributed filesystem and it works just fine.

Yes, from what I have read, since this is a new effort, skip Hadoop and HDFS altogether. I agree! Spark (RDD) is what we're after.

Initially my dev platform is (3) AMD FX-8350, each with 32 GB of RAM for the (8) cores. Little else (only essential/test codes) will run on the (3) box dev cluster. They have water coolers, so the frequency can go up to 6 or 7 GHz later on, just to make things interesting for CPU-intensive testing.

FhGFS(3)/ is my question [1]. Why not FhGFS/BTRFS? Many of the technically astute folks using gentoo have left XFS for BTRFS. So pick one for me? Argue against (C) if you can, without using stability as an argument. The goal is ultimate performance; stability will come with the passage of time, imho.

(A) fhgfs(3)/ext4
(B) fhgfs(3)/xfs
(C) fhgfs(3)/btrfs
(D) fhgfs(3)/

I'm not going to mess around with RAID tuning at this time. Besides, we hope to run 100% in RAM (RDD), using HDD writes only for long-term storage and analytics, which can be delayed without consequence. There is still some debate as to whether RAID will even be necessary on btrfs, but that debate can wait.

[1] http://moo.nac.uci.edu/~hjm/fhgfs_vs_gluster.html

> Tomas

James
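One hedged way to put numbers behind the (A) to (D) question is to measure the backing filesystems directly before layering FhGFS on top. The sketch below is only an illustration: the function name, file name, and default sizes are invented here, it measures sequential writes on a single local mount point, and it says nothing about FhGFS metadata performance or read patterns.

```python
import os
import time


def write_throughput_mb_s(directory, total_mb=256, block_kb=1024):
    """Write total_mb of data into `directory`, fsync it, and return MB/s."""
    # Random data, so a compressing filesystem cannot shrink the payload.
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    path = os.path.join(directory, "throughput_probe.bin")
    start = time.perf_counter()
    try:
        with open(path, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to reach the device
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the probe file, even if the write failed.
        if os.path.exists(path):
            os.remove(path)
    return (blocks * block_kb / 1024) / elapsed
```

Calling it once per candidate mount point (for example hypothetical paths such as /mnt/ext4, /mnt/xfs, and /mnt/btrfs) and repeating each run a few times gives a rough comparison; a real decision would also need read and small-file tests.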
Re: Mesos on Gentoo
Greetings James!

This is great to see. Also, if you're interested, feel free to compare notes: http://pkgs.fedoraproject.org/cgit/mesos.git/tree/mesos.spec

Inline comments below:

----- Original Message -----
> From: CCAAT cc...@tampabay.rr.com
> To: user@mesos.apache.org
> Cc: cc...@tampabay.rr.com
> Sent: Monday, September 8, 2014 8:45:59 AM
> Subject: Re: Mesos on Gentoo
>
> On 09/07/14 23:39, Vinod Kone wrote:
> > Hi James,
> >
> > Great to see a Gentoo package for Mesos! Regarding HDFS requirement, any shared storage (even just a http/ftp server works) that the Mesos slaves can pull the executor from is enough.
>
> Hello Vinod,
>
> I'm looking for more specific advice on not only what to choose for a distributed File System, but some overarching guidance on why/how/where to look to figure out the gentoo_ish path for success. I think a big part of the entire distributed choices is that you either download binaries or things are written too general to be of use. If it does not work, I'll work on option B, C, D..
>
> Since I want a lightning fast computation machine, where due to lots of cells performing the exact same complex calculations over and over again, I'm guessing I need a high performance, open source, file system.

NFS will work out of the gate for Spark, and is probably the easiest to set up without dragging in Hadoop. Otherwise Spark will need a redirector such as Tachyon in order to abstract the FS details away to support other distributed file systems (Ceph, Gluster, etc.).

> Specific suggestions? Syntax (even if another distro) or pseudo_syntax or description of the steps (caveats?) is most encouraging. Mesos slaves can pull the executor from is enough sounds very enticing, but I have no clue as to the choices or how to pursue any of those choices.

Feel free to ping on freenode #mesos if you have more questions.

> My background is EE/CS/math so I have tendencies towards assembler and C. I find the whole OO paradigm very interesting, so an overbearing guidance would be keen?
>
> James

--
Cheers,
Timothy St. Clair
Red Hat Inc.
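To make the NFS route a little more concrete, here is a small sketch that renders the two Spark settings usually involved when running on Mesos. The property names `spark.master` and `spark.executor.uri` are taken from Spark's Mesos documentation as I understand it; the helper name, the master host and port, and the path under the NFS mount are all hypothetical placeholders, so check them against the Spark version actually installed.

```python
def render_spark_defaults(mesos_master, executor_uri):
    """Return spark-defaults.conf text for a Spark-on-Mesos setup."""
    # Spark expects the master as a mesos:// URL; add the scheme if missing.
    if not mesos_master.startswith("mesos://"):
        mesos_master = "mesos://" + mesos_master
    settings = {
        "spark.master": mesos_master,
        # Location every slave can fetch the Spark distribution from,
        # e.g. a file on the shared NFS mount or an http:// URL.
        "spark.executor.uri": executor_uri,
    }
    width = max(len(key) for key in settings)
    return "".join(
        f"{key.ljust(width)}  {value}\n" for key, value in settings.items()
    )


# Hypothetical values for illustration only.
print(render_spark_defaults("master1:5050", "/mnt/nfs/spark/spark.tgz"))
```

The point of the sketch is that the shared storage only has to hold one archive that each slave can reach at the same location; nothing in it requires HDFS.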
Mesos on Gentoo
Hello Mesos,

I have hacked together an ebuild (gentoo package) to install mesos-0.20.0. It seems to be working, but I need some generic guidelines to fully test the mesos package. I also intend to install it on a small cluster of gentoo machines.

Do I need a distributed file system, such as HDFS, for a distributed version of mesos (in other words, a mesos cluster)? If so, what are my practical choices: CEPH, BTRFS, Gluster, or is HDFS required? If possible, I'm trying to skip over the entire Hadoop environment, as I have no legacy interests and it does not seem efficient for what's at the heart of my needs (computations and visual simulations using RDD).

On gentoo, use of binaries is only temporary (transient) until the sources can be properly assimilated for compiling and installation control via an ebuild [1]. So I have built the openjdk stack on gentoo, known as icedtea. icedtea actually uses open source software tools to build up the java-jdk from sources [5].

Eventually, I intend to install Spark, Cassandra, a distributed database (HyperTable?) and some analytics tools. I hope to use spark on mesos to solve some very large Finite Element problems [2,3,4]. So any other component/supporting codes for this sort of big science adventure, mesos+spark, are of the utmost interest to me. I'm an old unix/bsd/linux hack, but some of these newer codes take me a while to find, and to figure out the compile-time and run-time dependencies.

When I get all of these (ebuild) modules tested, I'll post the ebuilds (as Overlays) if anyone is interested. Your guidance and suggestions are most welcome!

James

[1] http://en.wikipedia.org/wiki/Ebuild
[2] http://en.wikipedia.org/wiki/Finite_element_method
[3] http://www.dune-project.org/
[4] http://www.mcs.anl.gov/petsc/
[5] http://icedtea.classpath.org/wiki/Main_Page
Re: Mesos on Gentoo
Hi James,

Great to see a Gentoo package for Mesos! Regarding the HDFS requirement, any shared storage that the Mesos slaves can pull the executor from is enough (even just an http/ftp server works).
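As a rough illustration of the "just an http server" option, the sketch below serves one directory over HTTP using only the Python standard library, so that each slave could fetch an executor archive by URL. The function name, directory, and port are assumptions made up for this example, and nothing in it is Mesos-specific; any web or ftp server that the slaves can reach would do the same job.

```python
import functools
import http.server
import threading


def serve_directory(directory, host="0.0.0.0", port=8000):
    """Serve `directory` read-only over HTTP in a background thread.

    Returns the running server; call .shutdown() and .server_close()
    to stop it. Passing port=0 lets the OS pick a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

For example, with a hypothetical /srv/mesos-dist directory holding the executor tarball, `serve_directory("/srv/mesos-dist")` on one node would make it reachable from the slaves at port 8000 of that node. This is meant for a trusted dev cluster only, since the built-in server has no authentication.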