+dev@

I think it makes a lot of sense to run distributed file systems on top of Mesos, whether that be HDFS, MapRFS, Lustre, BitTorrent, or whatever. HDFS is very popular with Mesos users, and is currently supported as an executor fetching protocol/source. I would love to see an HDFS framework for Mesos. Below are my thoughts.

** Advantages:
+ Fault-tolerance/HA: Automatically restart failed NameNodes, always have enough standbys.
+ Shared resources: Mesos can allocate/isolate resources for HDFS NN/DN processes alongside other frameworks.
+ Scaling: Easily scale up/down the number of DataNodes as the cluster grows.

** Challenges:
Launching NameNode (NN) tasks:
- Need multiple NNs, for HA-standbys and/or federated NNs.
- Do we have to manually configure federated namespaces?
- Could we use the Mesos replicated log for NN-HA's edit log, instead of NFS or the JournalNodes?

Launching DataNode (DN) tasks:
- DNs must be started with names of all(?) NNs, and register/update with each. Do we need a service-discovery tool, or can we just start all NNs first, then start DNs with the known NNs? How do we update DNs when NNs move? Use ZK to track?

The Bootstrap problem:
- Where to fetch the NN/DN executors/tasks from? Could use another HDFS cluster, S3/HTTP/FTP, or pre-install the binaries on each slave.

Migrating an existing HDFS cluster:
- Is it possible to do a migration from raw HDFS to HDFS-on-Mesos without moving the data?

Data Residency:
- Should we destroy the sandbox/hdfs-data when shutting down a DN?
- If starting a DN on a node that was previously running a DN, can/should we try to revive the existing data?

Topology constraints:
- Must guarantee only one DN (per fwk) per slave, only one NN (per fwk) per slave.
- Wouldn't want NNs (or replicated blocks?) to live on the same physical node/rack. Could use attributes to express topology.

Kerberos integration:
- How to ensure that the NN has access to the KDC and/or required keytabs/credentials?


On Wed, Jun 25, 2014 at 6:17 AM, Vladimir Vivien <[email protected]> wrote:

> +1 wondered about this. Would love to hear pros/cons.
>
> On Wed, Jun 25, 2014 at 8:00 AM, Maxime Brugidou <[email protected]> wrote:
>
>> Hi Mesos Community,
>>
>> I am a bit surprised to see that no one has done a framework to run HDFS
>> on top of Mesos directly since a lot of people seem to use HDFS in the
>> community. HDFS seems to be managed separately from Mesos (but will
>> probably run on the same machines). Is there any reason for that?
>>
>> My understanding is that using Mesos to manage all resources and having
>> HDFS on top of it makes much more sense (just like a FS runs inside an OS,
>> not on the side).
>>
>> Is it technical complexity? (we run HDFS and YARN with HA, journalnodes
>> and Kerberos Security and it is definitely a beast) Is it because no one
>> really feels the need for this since they are already running HDFS on the
>> side close to the hardware and don't want to waste time having it in Mesos?
>>
>> Best
>> Maxime
>
> --
> Vladimir Vivien
