On 09/08/14 02:55, Tomas Barton wrote:

Spark has support for HDFS; however, you don't have to use it, and there's
no need to install the whole Hadoop stack. I've tested Mesos and Spark with
the FhGFS distributed filesystem and it works just fine.

Yes, from what I have read, since this is a new effort, skip Hadoop and HDFS altogether. I agree! Spark (RDD) is what we're after.

Initially, my dev platform is (3) AMD FX-8350 boxes, each with 32 GB of RAM for its (8) cores. Little else (only essential/test code) will run on the (3)-box dev cluster. They have water coolers, so the frequency can go up to 6 or 7 GHz later on, just to make things interesting for CPU-intensive testing.

My question is which filesystem to put underneath: FhGFS(3)/???? [1].

Why not FhGFS/BTRFS? Many of the technically astute folks using Gentoo
have left XFS for BTRFS.

So pick one for me. Argue against (C), if you can, without using stability as an argument. The goal is ultimate performance; stability will come with the passage of time, imho.


(A) fhgfs(3)/ext4
(B) fhgfs(3)/xfs
(C) fhgfs(3)/btrfs
(D) fhgfs(3)/????
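
For anyone who would rather answer with numbers than opinions, a rough way to compare the candidates above is a small write-throughput test run against each one. This is only a sketch under assumptions: the mount points are hypothetical names of my own, the function name is made up for illustration, and it measures nothing but sequential, fsync'd writes, which is a long way from a real Spark workload. Random data is used so that a compressing filesystem does not get an unfair advantage.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, block_kb=1024):
    """Write total_mb of random data to a temp file in `directory`,
    fsync it, and return the observed throughput in MB/s."""
    block = os.urandom(block_kb * 1024)      # incompressible payload
    blocks = (total_mb * 1024) // block_kb   # number of blocks to write
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())             # push data to the backing store
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)                      # always clean up the test file
    return (blocks * block_kb / 1024) / elapsed


if __name__ == "__main__":
    # Hypothetical mount points, one per candidate backing filesystem.
    candidates = {
        "ext4": "/mnt/fhgfs-ext4",
        "xfs": "/mnt/fhgfs-xfs",
        "btrfs": "/mnt/fhgfs-btrfs",
    }
    for name, mount in candidates.items():
        if os.path.isdir(mount):
            print(f"{name}: {write_throughput_mb_s(mount):.1f} MB/s")
        else:
            print(f"{name}: {mount} not mounted, skipped")
```

Several runs per candidate, with a file size well above the page cache, would be needed before reading anything into the results.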

I'm not going to mess around with RAID tuning at this time. Besides,
we hope to run 100% in RAM (RDD), using HDD writes only for long-term storage and analytics, which can be delayed without consequence. There is still some debate as to whether RAID will even be necessary on btrfs, but that debate can wait.


[1] http://moo.nac.uci.edu/~hjm/fhgfs_vs_gluster.html

Tomas

James
