Re: HDFS on Mesos

Benjamin Hindman Thu, 26 Jun 2014 12:23:33 -0700

Wanted to jump in here and provide some context on 'persistent resources'.
As Vinod mentioned, this is how we're thinking about enabling storage-like
frameworks on Mesos.

The idea originally came about because, even today, if we allocate some
file system space to a task/executor, and then that task/executor
terminates, we haven't officially "freed" those file system resources until
after we garbage collect the task/executor sandbox! (We keep the sandbox
around so a user/operator can get the stdout/stderr or anything else left
around from their task/executor.)

To solve this problem we wanted to be able to let a task/executor terminate
but not *give up* all of its resources, hence: persistent resources.
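
As a rough illustration of that accounting, here is a toy model in Python.
It is not Mesos code: ToyAllocator and its launch/terminate/release methods
are hypothetical names made up for this sketch, and letting a later launch by
the same framework draw on what it still holds is an assumption about how the
idea could work, not a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAllocator:
    """Toy model only; hypothetical names, not the Mesos allocator or its API."""

    free_disk_mb: int
    # framework id -> disk (MB) still held after its task/executor exited
    persistent: dict = field(default_factory=dict)
    # task id -> (framework id, disk MB) for currently running tasks
    running: dict = field(default_factory=dict)

    def launch(self, framework, task, disk_mb):
        # Reuse disk the framework already holds before touching the free pool.
        held = self.persistent.get(framework, 0)
        reused = min(held, disk_mb)
        needed = disk_mb - reused
        if needed > self.free_disk_mb:
            raise ValueError("not enough free disk")
        self.free_disk_mb -= needed
        if held - reused > 0:
            self.persistent[framework] = held - reused
        else:
            self.persistent.pop(framework, None)
        self.running[task] = (framework, disk_mb)

    def terminate(self, task):
        # The task is gone, but its disk stays accounted to the framework
        # instead of going back to the free pool.
        framework, disk_mb = self.running.pop(task)
        self.persistent[framework] = self.persistent.get(framework, 0) + disk_mb

    def release(self, framework):
        # Only an explicit release returns persistent disk to the free pool.
        self.free_disk_mb += self.persistent.pop(framework, 0)
```

With 100 MB free, launching a 60 MB task leaves 40 MB free; terminating it
still leaves 40 MB free, with 60 MB held for the framework until it either
launches again or releases the disk.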

Pushing this concept even further you could imagine always reallocating
resources to a framework that had already been allocated those resources
for a previous task/executor. Looked at from another perspective, these are
"late-binding", or "lazy", resource reservations.

At one point in time we had considered just doing 'right-of-first-refusal'
for allocations after a task/executor terminates. But this is really
insufficient for supporting storage-like frameworks well (and likely even
harder to reliably implement than 'persistent resources' IMHO).

There are a ton of things that need to get worked out in this model,
including (but not limited to): How should a file system (or disk) be
exposed in order to be made persistent? How should persistent resources be
returned to a master? How many persistent resources can a framework get
allocated?

The right place to capture this all is in an "Epic" ticket on JIRA. Nikita,
do you want to create a ticket? If not, no worries, I'm happy to create the
ticket. Really looking forward to seeing this develop!

Ben.

On Thu, Jun 26, 2014 at 11:33 AM, Vinod Kone <[email protected]> wrote:

> SGTM. Feel free to create the ticket!
>
>
> On Thu, Jun 26, 2014 at 11:20 AM, Vetoshkin Nikita <
> [email protected]> wrote:
>
> > Thanks, Vinod! I really like the "persistent resources" idea. Maybe there
> > should be a ticket for discussion and brainstorming?
> > On Jun 26, 2014 11:06 PM, "Vinod Kone" <[email protected]> wrote:
> >
> > > As Maxime mentioned, the long-term solution is for Mesos to support
> > > the notion of "persistent resources", i.e., resources that stay (and
> > > are accounted for) after the life cycle of a task/executor. The idea
> > > still needs fleshing out.
> > >
> > >
> > > On Thu, Jun 26, 2014 at 8:23 AM, Vetoshkin Nikita <
> > > [email protected]> wrote:
> > >
> > > > What about a long-term solution? Any ideas? Twitter's Manhattan
> > > > database claims to use Mesos for scaling up and down. Can you shed
> > > > some light on how they deal with a situation like this?
> > > > On Jun 26, 2014 5:01 AM, "Vinod Kone" <[email protected]> wrote:
> > > >
> > > > > Thanks for listing this out Adam.
> > > > >
> > > > > Data Residency:
> > > > > > > - Should we destroy the sandbox/hdfs-data when shutting down a DN?
> > > > > > > - If starting DN on node that was previously running a DN,
> > > > > > >   can/should we try to revive the existing data?
> > > > > >
> > > > >
> > > > > I think this is one of the key challenges for a production-quality
> > > > > HDFS on Mesos. Currently, since the sandbox is deleted after a task
> > > > > exits, if all the data nodes that hold a block (and its replicas)
> > > > > get lost/killed for whatever reason, there would be data loss. A
> > > > > short-term solution would be to write outside the sandbox and use
> > > > > slave attributes to track where to re-launch data node tasks.
> > > > >
> > > >
> > >
> >
>
