Re: Mesos on Gentoo

2014-09-08 Thread Tomas Barton
Hi James,

Spark has support for HDFS, however you don't have to use it and there's no
need to install the whole Hadoop stack. I've tested Mesos and Spark with the
FhGFS distributed filesystem and it works just fine.

Tomas

On 8 September 2014 06:39, Vinod Kone vinodk...@gmail.com wrote:

 Hi James,

 Great to see a Gentoo package for Mesos!

 Regarding HDFS requirement, any shared storage (even just a http/ftp
 server works) that the Mesos slaves can pull the executor from is enough.



Re: Mesos on Gentoo

2014-09-08 Thread CCAAT

On 09/07/14 23:39, Vinod Kone wrote:

Hi James,

Great to see a Gentoo package for Mesos!



Regarding HDFS requirement, any shared storage (even just a http/ftp
server works) that the Mesos slaves can pull the executor from is enough.



Hello Vinod,

I'm looking for more specific advice, not only on what to choose for a
distributed file system, but also some overarching guidance on why/how/where
to look to figure out the gentoo_ish path for success. I think a big
part of the problem with the distributed choices is that you either download
binaries or things are written too generally to be of use. If it does not
work, I'll work on option B, C, D...



Since I want a lightning-fast computation machine, with lots
of cells performing the exact same complex calculations over and over
again, I'm guessing I need a high-performance, open source file system.


Specific suggestions? Syntax (even if for another distro), pseudo_syntax,
or a description of the steps (caveats?) would be most encouraging.

"Mesos slaves can pull the executor from is enough" sounds very
enticing, but I have no clue as to the choices or how to pursue any
of those choices. My background is EE/CS/math, so I have tendencies
towards assembler and C. I find the whole OO paradigm very interesting,
so some overarching guidance would be keen.


James


Re: Mesos on Gentoo

2014-09-08 Thread CCAAT

On 09/08/14 02:55, Tomas Barton wrote:


Spark has support for HDFS, however you don't have to use it and there's
no need to install the whole Hadoop stack. I've tested Mesos and Spark with
the FhGFS distributed filesystem and it works just fine.


Yes, from what I have read, since this is a new effort, skip Hadoop and
HDFS altogether. I agree! Spark (RDD) is what we're after.


Initially my dev platform is three AMD FX-8350 boxes, each with 32 GB of RAM
for its 8 cores. Little else (only essential/test codes) will run on the
3-box dev cluster. They have water coolers, so the frequency can go up to 6
or 7 GHz later on, just to make things interesting for CPU-intensive testing.


FhGFS(3)/ is my question [1].

Why not FhGFS/BTRFS? Many of the technically astute folks using gentoo
have left XFS for BTRFS.

So pick one for me? Argue against (C) if you can, without using stability
as an argument. The goal is ultimate performance; stability will come
with the passage of time, imho.



(A) fhgfs(3)/ext4
(B) fhgfs(3)/xfs
(C) fhgfs(3)/btrfs
(D) fhgfs(3)/

I'm not going to mess around with RAID tuning at this time. Besides,
we hope to run 100% in RAM (RDD), using HDD writes only for long-term
storage and analytics, which can be delayed without consequence.
There is still some debate as to whether RAID will even be necessary on
btrfs, but that debate can wait.



[1] http://moo.nac.uci.edu/~hjm/fhgfs_vs_gluster.html


Tomas


James


Re: Mesos on Gentoo

2014-09-08 Thread Tim St Clair
Greetings James! 

This is great to see. Also, if you're interested, feel free to compare notes:
http://pkgs.fedoraproject.org/cgit/mesos.git/tree/mesos.spec

inline comments below: 

- Original Message -
 From: CCAAT cc...@tampabay.rr.com
 To: user@mesos.apache.org
 Cc: cc...@tampabay.rr.com
 Sent: Monday, September 8, 2014 8:45:59 AM
 Subject: Re: Mesos on Gentoo
 
 On 09/07/14 23:39, Vinod Kone wrote:
  Hi James,
 
  Great to see a Gentoo package for Mesos!
 
  Regarding HDFS requirement, any shared storage (even just a http/ftp
  server works) that the Mesos slaves can pull the executor from is enough.
 
 
 Hello Vinod,
 
 I'm looking for more specific advice on not only what to choose for a
 distributed File System, but some overarching guidance on why/how/where
 to look to figure out the gentoo_ish path for success. I think a big
 part of the entire distributed choices is that you either download
 binaries or things are written too general to be of use. If it does not
 work, I'll work on option B, C, D..
 
 
 Since I want a lightning fast computation machine, where due to lots
 of cells performing the exact same complex calculations over and over
 again, I'm guessing I need a high performance, open source, file system.

NFS will work out of the gate for Spark, and is probably the easiest to set up
without dragging in Hadoop.

Otherwise Spark will need a redirector such as Tachyon in order to abstract the 
FS details away to support other distFS's (ceph, gluster, etc.).  
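
As a rough illustration of the NFS route, here is a minimal Python sketch that renders spark-defaults.conf lines for Spark on Mesos, assuming Spark is unpacked on an NFS export mounted at the same path on every node. The host name, mount point, and helper name are hypothetical. The property names (spark.master, spark.mesos.executor.home) and the default Mesos master port 5050 are taken from the Spark "Running on Mesos" documentation as I recall it, so check them against the Spark version in use.

```python
from pathlib import PurePosixPath


def spark_defaults(master_host, shared_mount, port=5050):
    """Render spark-defaults.conf lines for Spark on Mesos.

    Assumes (hypothetically) that Spark is installed under
    <shared_mount>/spark on an NFS mount visible at the same
    path on every Mesos slave.
    """
    spark_home = PurePosixPath(shared_mount) / "spark"
    lines = [
        # Point Spark at the Mesos master instead of a standalone master.
        f"spark.master mesos://{master_host}:{port}",
        # Tell executors where the shared Spark install lives.
        f"spark.mesos.executor.home {spark_home}",
    ]
    return "\n".join(lines)
```

Calling spark_defaults("master.example", "/mnt/shared") returns two lines that could be appended to conf/spark-defaults.conf on the driver. Input data on the same mount would then be addressed with ordinary file:// paths.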

 
 Specific suggestions? Syntax (even if another distro) or pseudo_syntax
 or description of the steps (caveats?) is most encouraging.
 
 Mesos slaves can pull the executor from is enough sounds very
 enticing, but I have no clue as to the choices or how to pursue any
 of those choices. 

Feel free to ping on freenode #mesos if you have more questions. 

 My background is EE/CS/math so I have tendencies
 towards assembler and C. I find the whole OO paradigm very interesting,
 so an overbearing guidance would be keen?
 
 
 James
 

-- 
Cheers,
Timothy St. Clair
Red Hat Inc.


Mesos on Gentoo

2014-09-07 Thread CCAAT

Hello Mesos,

I have hacked together an ebuild (gentoo package) to install
mesos-0.20.0. It seems to be working, but I need some generic guidelines to
fully test the mesos package.

I also intend to install it on a small cluster of gentoo machines. Do I
need a distributed file system, such as HDFS, for a distributed version
of mesos (in other words, a mesos cluster)? If so, what are my
practical choices: CEPH, BTRFS, Gluster, or is HDFS required? If
possible, I'm trying to skip over the entire Hadoop environment, as I
have no legacy interests and it seems not to be efficient for what's at
the heart of my needs (computations and visual simulations using RDD).


On gentoo, use of binaries is only temporary (transient) until the
sources can be properly assimilated for compiling and installation
control via an ebuild [1]. So I have built the openjdk stack on gentoo,
known as icedtea. icedtea actually uses open source software tools to
build up the java-jdk from sources [5].



Eventually, I intend to install Spark, Cassandra, a distributed database
(HyperTable?) and some analytics tools. I hope to use spark on mesos to
solve some very large Finite Element problems [2,3,4].


So any other component or supporting codes for this sort of big-science
adventure with mesos+spark are of the utmost interest to me.


I'm an old unix/bsd/linux hack, but with some of these newer codes
it takes me a while to find and figure out the compile-time and run-time
dependencies. When I get all of these (ebuild) modules tested, I'll post
the ebuilds (as Overlays) if anyone is interested.


Your guidance and suggestions are most welcome!


James



[1] http://en.wikipedia.org/wiki/Ebuild

[2] http://en.wikipedia.org/wiki/Finite_element_method

[3] http://www.dune-project.org/

[4] http://www.mcs.anl.gov/petsc/

[5] http://icedtea.classpath.org/wiki/Main_Page


Re: Mesos on Gentoo

2014-09-07 Thread Vinod Kone
Hi James,

Great to see a Gentoo package for Mesos!

Regarding HDFS requirement, any shared storage (even just a http/ftp server
works) that the Mesos slaves can pull the executor from is enough.
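
To make the "even just a http server" option concrete, here is a minimal sketch using only the Python standard library. It serves a directory holding an executor archive over HTTP so that slaves could download it. The function name, directory, and archive name are hypothetical, and this is a throwaway test setup under those assumptions, not a recommended production server.

```python
import functools
import http.server
import threading


def serve_directory(directory, host="0.0.0.0", port=0):
    """Serve `directory` over HTTP in a background thread.

    Returns the server object; server.server_address[1] holds the
    port actually bound (useful when port=0 picks a free one).
    """
    # Bind the handler to the chosen directory instead of the cwd.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    # Daemon thread so the server does not keep the process alive.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

With an archive such as executor.tar.gz placed in that directory (a hypothetical file name), each slave could fetch it from http://<host>:<port>/executor.tar.gz, and a framework would point its executor URI at that address. For Spark on Mesos, that is what the spark.executor.uri property is for.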