On Jan 16, 2012, at 10:08 PM, Eric Yang wrote:
> Hi all,
>
> The current hierarchical stack definition is confusing to me.
> Supposedly, the definition can be expanded and flattened into
> configuration key-value pairs. An example of defining the namenode URL
> and having it inherited in the hbase configuration would look like this:
>
> {
>   ...
>   "components": {
>     "hdfs": {
>       "roles": {
>         "namenode": { /* override one value on the namenode */
>           "hadoop/hdfs-site": {
>             "dfs.https.enable": "true",
>             "fs.default.name": "hdfs://${namenode}:${port}/"
>           }
>         }
>       }
>     },
>     "hbase": {
>       "roles": {
>         "region-server": {
>           "hbase/hbase-site": {
>             "hbase.rootdir":
>               "${components.hdfs.namenode.hadoop/hdfs-site.fs.default.name}/hbase"
>           }
>         }
>       }
>     }
>   }
> }
>
> hbase.rootdir is a key in hbase-site.xml, and its value should be the
> value of "fs.default.name" plus an additional path where hbase stores
> its data. In my interpretation, the macro would look like
> ${components.hdfs.namenode.hadoop/hdfs-site.fs.default.name}/hbase.
> This seems like an utterly awkward way of describing inheritance.
> Why don't we use a flat namespace to remove the additional overhead
> imposed by Ambari? I agree that the hierarchical syntax is accurate,
> but it is a bigger headache to maintain.
>
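For concreteness, a flat-namespace version of the same two settings, as a sketch of what Eric is suggesting, might look like the fragment below. The dotted key names are illustrative only; this is not an actual Ambari schema.

```json
{
  "hdfs.namenode.dfs.https.enable": "true",
  "hdfs.namenode.fs.default.name": "hdfs://${namenode}:${port}/",
  "hbase.region-server.hbase.rootdir": "${hdfs.namenode.fs.default.name}/hbase"
}
```

Every key is addressable by a single dotted path, so the cross-component reference shrinks to one macro with no component/role/file hierarchy to traverse.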
> The second problem is that the component plugin architecture sounds
> good in theory, but I see some scalability issues with this approach.
> Each component describes the components it depends on, which could
> interfere with introducing new components. For example, the Mapreduce
> component depends on HDFS. A new component named HBase is introduced,
> and now the Mapreduce component needs to update its dependencies to
> cover HDFS, HBase, and ZooKeeper. Introducing a new component thus
> requires many plugin updates to make the new version work. The plugin
> writer also needs to make conditional assumptions: if component X is
> installed, do Y, otherwise do Z. Conditional assumptions in plugins
> introduce uncertainty and corner cases into the deployment system.
> The number of permutations can greatly exceed the logic the plugin
> can reasonably be expected to handle.
>
I don't understand this. Every component knows what components it depends on,
right? The MapRed component developer would know that it depends on hdfs. If you
depend on hbase, you should ensure that the list of dependent components
includes hbase...
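The conditional pattern being debated could be sketched as a hypothetical plugin manifest; every field name and hook below is made up purely to illustrate the "if X installed do Y, else Z" branching, not drawn from any real Ambari plugin format.

```json
{
  "component": "mapreduce",
  "dependencies": ["hdfs", "zookeeper"],
  "configure-hooks": [
    { "if-installed": "hbase", "then": "enable-hbase-integration" },
    { "otherwise": "use-defaults" }
  ]
}
```

Each newly introduced component (here, hbase) adds another branch like this to every plugin that might interact with it, which is the combinatorial growth Eric is warning about.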
> Instead of using a plugin architecture to manage deployment, it would
> be safer to use a scripting approach that enables a power
> administrator to deploy a software stack by writing shell-like
> scripts to accomplish the deployment tasks. The recipe scripts can be
> shared by the community to automate software stack deployment. This
> will keep the scope of Ambari focused on cross-node orchestration,
> without building bells and whistles that do not scale well in the
> long term.
>
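A minimal sketch of such a recipe, assuming nothing about Ambari's actual interfaces: the hostname, port, and dry-run echo lines below are all illustrative placeholders.

```shell
#!/bin/sh
# Hypothetical deployment "recipe" in the style proposed above.
set -e

NAMENODE_HOST=${NAMENODE_HOST:-namenode.example.com}
NAMENODE_PORT=${NAMENODE_PORT:-8020}

# With a flat script there is no macro language: the shared value is
# computed once and reused directly where hbase needs it.
FS_DEFAULT_NAME="hdfs://${NAMENODE_HOST}:${NAMENODE_PORT}/"
HBASE_ROOTDIR="${FS_DEFAULT_NAME}hbase"

# Dry run: print what would be written instead of touching real configs.
echo "hdfs-site.xml: fs.default.name=${FS_DEFAULT_NAME}"
echo "hbase-site.xml: hbase.rootdir=${HBASE_ROOTDIR}"
```

The value inheritance that required a cross-component macro in the hierarchical JSON becomes ordinary variable reuse, and the recipe can be versioned and shared like any other script.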
Actually, Ambari could get out of the business of orchestration. Instead, the
components should be updated to tolerate their dependencies being down. I have
heard this from some people, and generally concur with this thinking. It makes
the ecosystem better (if you ignore Ambari for the moment).
> What do you guys think?
>
> regards,
> Eric