Hi Mike,

> Why does Dr. Elephant make sense as a separate project instead of
> contributing to Hadoop directly?

Here are a few reasons why I think Dr. Elephant is more likely to
succeed as a separate project:

* Dr. Elephant supports Hadoop *and* Spark, and may support other
  execution layers in the future. If we make Dr. Elephant a part of
  Hadoop, I expect that it will discourage contributions from people
  who are interested mainly in Spark support, and vice versa.

* If Dr. Elephant is added to Hadoop, the Hadoop project will have to
  declare a dependency on Spark. I doubt this change would get
  approved.

* We don't want to tie Dr. Elephant to a specific version of Hadoop or
  Spark, or tie the Dr. Elephant release cycle to the Hadoop or Spark
  release cycles.

* None of the current Dr. Elephant committers are Hadoop committers,
  and I doubt that the Hadoop PMC is going to give them a commit bit
  just to work on Dr. Elephant. As a result, the existing committers
  would be effectively forfeiting their right to continue maintaining
  their own project. I think this is one of the reasons why many
  Hadoop contrib projects are poorly maintained.

> What is the relationship between Dr. Elephant and the (now seemingly
> defunct) Hadoop Vaidya?

Vaidya was a command line tool for tuning Hadoop jobs. Dr. Elephant is
an always-on service for tuning Hadoop and Spark jobs. We were unaware
of Vaidya when we started working on Dr. Elephant.

- Carl
