On Sep 13, 2013, at 6:56 AM, Alexander Kuznetsov wrote:

> On Thu, Sep 12, 2013 at 7:30 PM, Michael Basnight <mbasni...@gmail.com> wrote:
> On Sep 12, 2013, at 2:39 AM, Thierry Carrez wrote:
>
> > Sergey Lukjanov wrote:
> >
> >> [...]
> >> As you can see, resource provisioning is just one of the features, and the
> >> implementation details are not critical to the overall architecture. It
> >> performs only the first step of the cluster setup. We've been considering
> >> Heat for a while, but ended up with direct API calls in favor of speed and
> >> simplicity. Going forward, Heat integration will be done by implementing the
> >> extension mechanism [3] and [4] as part of the Icehouse release.
> >>
> >> The next part, Hadoop cluster configuration, is already extensible, and we
> >> have several plugins - Vanilla, Hortonworks Data Platform, and a Cloudera
> >> plugin has been started too. This allows us to unify management of different
> >> Hadoop distributions under a single control plane. The plugins are
> >> responsible for correct Hadoop ecosystem configuration on already
> >> provisioned resources and use different Hadoop management tools, such as
> >> Ambari, to set up and configure all cluster services, so there are no
> >> actual provisioning configs on the Savanna side in this case. Savanna and
> >> its plugins encapsulate the knowledge of Hadoop internals and the default
> >> configuration for Hadoop services.
> >
> > My main gripe with Savanna is that it combines (in its upcoming release)
> > what sound to me like two very different services: a Hadoop cluster
> > provisioning service (like what Trove does for databases) and a
> > MapReduce+ data API service (like what Marconi does for queues).
> >
> > Making it part of the same project (rather than two separate projects,
> > potentially sharing the same program) makes discussions about shifting
> > some of its clustering ability to another library/project more complex
> > than they should be (see below).
> >
> > Could you explain the benefit of having them within the same service,
> > rather than two services with one consuming the other?
>
> And for the record, I don't think that Trove is the perfect fit for it today.
> We are still working on a clustering API. But when we create it, I would love
> the Savanna team's input, so we can try to make a pluggable API that's usable
> for people who want MySQL or Cassandra or even Hadoop. I'm less a fan of a
> clustering library, because in the end we will both have API calls like POST
> /clusters and GET /clusters, and there will be API duplication between the
> projects.
>
> I think that a Cluster API (if it were created) would be helpful not only
> for Trove and Savanna. NoSQL databases, RDBMSes, and Hadoop are not the only
> software that can be clustered. What about different kinds of messaging
> solutions like RabbitMQ and ActiveMQ, or J2EE containers like JBoss, WebLogic,
> and WebSphere, which are often installed in clustered mode? Messaging,
> databases, J2EE containers, and Hadoop each have their own management cycle.
> It would be confusing to make a Cluster API part of Trove, which has a
> different mission - database management and provisioning.
Are you suggesting a third program, cluster-as-a-service? Trove is trying to target a generic enough™ API to tackle different technologies with plugins or some sort of extension mechanism. This will include a scheduler to determine rack awareness. Even if we decide that both Savanna and Trove need their own APIs for building clusters, I still want to understand what makes the Savanna API and implementation different, and how Trove can build an API/system that can encompass multiple datastore technologies. So regardless of how this shakes out, I would urge you to go to the Trove clustering summit session [1] so we can share ideas.

[1] http://summit.openstack.org/cfp/details/54
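To make the "pluggable API" idea above concrete: the duplication both projects worry about (each growing its own POST /clusters and GET /clusters) could be avoided with one generic dispatch layer and per-datastore plugins. The sketch below is purely illustrative - the class names, dispatch shape, and plugin interface are my assumptions, not actual Trove or Savanna code:

```python
from abc import ABC, abstractmethod

class ClusterPlugin(ABC):
    """Hypothetical plugin interface: each datastore (MySQL, Cassandra,
    Hadoop, ...) encapsulates its own provisioning/configuration logic."""

    @abstractmethod
    def create_cluster(self, name, node_count):
        ...

class MySQLPlugin(ClusterPlugin):
    def create_cluster(self, name, node_count):
        # A real plugin would provision instances and set up replication.
        return {"name": name, "type": "mysql", "nodes": node_count}

class HadoopPlugin(ClusterPlugin):
    def create_cluster(self, name, node_count):
        # A Hadoop plugin would also assign roles within the cluster
        # (one namenode, the rest datanodes, in this toy example).
        roles = ["namenode"] + ["datanode"] * (node_count - 1)
        return {"name": name, "type": "hadoop",
                "nodes": node_count, "roles": roles}

# Registry mapping the datastore type in the request body to a plugin.
PLUGINS = {"mysql": MySQLPlugin(), "hadoop": HadoopPlugin()}

def post_clusters(body):
    """Rough equivalent of a shared POST /clusters handler:
    one endpoint, with datastore-specific behavior behind plugins."""
    plugin = PLUGINS[body["datastore"]]
    return plugin.create_cluster(body["name"], body["node_count"])
```

Under this shape, `post_clusters({"datastore": "hadoop", "name": "h1", "node_count": 3})` and the MySQL equivalent go through the same API surface, which is the argument against each project carrying its own clusters endpoint.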
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev