OK thx Kevin!

Let's close for now and we'll open a new one if it still has issues.

Thx!
Sam
Le 2 mars 2015 17:15, "Kevin W Monroe" <[email protected]> a écrit
:

> Hi Sam, this was fixed with a commit late last month:
>
> http://bazaar.launchpad.net/~bigdata-charmers/charms/trusty/hdp-
> hadoop/trunk/revision/30?remember=32
>
> This is the infamous jps problem that plagued big data charms since the
> openjdk 7u75 update.  In your case, namenode-relation-joined *should*
> have detected that namenode was running and returned before the call to
> start_namenode():
>
> http://bazaar.launchpad.net/~bigdata-charmers/charms/trusty/hdp-
> hadoop/trunk/view/head:/hooks/hdp-hadoop-common.py#L438
>
> However, prior to the above commit, jps running as root did not detect a
> running NameNode process, so it fell through to start_namenode() and
> caused this bug.  We no longer rely on jps, but rather pgrep to detect
> running processes, so this should be working correctly now.  The latest
> hdp-hadoop (currently REVNO32) contains this fix.
>
> ** Changed in: hdp-hadoop (Juju Charms Collection)
>        Status: New => Fix Released
>
> ** Changed in: hdp-pig (Juju Charms Collection)
>        Status: New => Fix Released
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1414080
>
> Title:
>   Race condition on relation when bundling hdp-hadoop
>
> Status in hdp-hadoop package in Juju Charms Collection:
>   Fix Released
> Status in hdp-pig package in Juju Charms Collection:
>   Fix Released
>
> Bug description:
>   I build a demo to run a Machine Learning workshop based on a blog post
>   made by Hortonworks.
>
>   There are 2 sides in the demo:
>   * Bundle: https://github.com/SaMnCo/bundle-flight-delay-demo
>   * Charm: https://github.com/SaMnCo/charm-flight-delay-demo
>
>   The bundle comprises:
>
>   * YARN Master
>   * 4x compute nodes
>   * PIG colocated on YARN
>   * ipython-notebook colocated on YARN
>
>   When I deploy manually using the 00-deploy file provided in the
>   bundle, everything goes well. However, when trying to deploy the
>   bundle, it fails at relation creation.
>
>   In the attached juju log collected at deployment, line 11347, we see
>   Hadoop crashing. Then 11402, the crash expands. 11418, we discover the
>   resource manager is not ready.
>
>   I can reproduce the same behavior with a simpler bundle comprising:
>
>   * YARN Master
>   * 4x compute nodes
>   * PIG colocated on YARN
>
>   So it seems to really be related to PIG/YARN/Compute nodes relation.
>
>   I can also reproduce the same behavior from juju-deployer, which fails
>   with :
>
>   2015-01-23 16:51:21 [INFO] deployer.import: Adding relations...
>   2015-01-23 16:51:23 [INFO] deployer.import:  Adding relation
> c00-yarn-master:namenode <-> c02-compute-node:datanode
>   2015-01-23 16:51:24 [INFO] deployer.import:  Adding relation
> c00-yarn-master:resourcemanager <-> c02-compute-node:nodemanager
>   2015-01-23 16:51:25 [INFO] deployer.import:  Adding relation
> c04-hdp-pig:namenode <-> c00-yarn-master:namenode
>   2015-01-23 16:51:25 [INFO] deployer.import:  Adding relation
> c04-hdp-pig:resourcemanager <-> c00-yarn-master:resourcemanager
>   2015-01-23 16:51:26 [INFO] deployer.import:  Adding relation
> c03-flight-delay-demo:notebook <-> c01-ipython-notebook:notebook
>   2015-01-23 16:51:26 [DEBUG] deployer.import: Waiting for relation
> convergence 60s
>   2015-01-23 16:52:29 [ERROR] deployer.env: The following units had errors:
>      unit: c00-yarn-master/0: machine: 1 agent-state: error details: hook
> failed: "namenode-relation-joined"
>   2015-01-23 16:52:29 [INFO] deployer.cli: Deployment stopped. run time:
> 583.23
>
>   Let me know if you need anything else!
>   Thanks!
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/charms/+source/hdp-hadoop/+bug/1414080/+subscriptions
>

-- 
You received this bug notification because you are a member of Juju Big
Data Development, which is a bug assignee.
https://bugs.launchpad.net/bugs/1414080

Title:
  Race condition on relation when bundling hdp-hadoop

Status in hdp-hadoop package in Juju Charms Collection:
  Fix Released
Status in hdp-pig package in Juju Charms Collection:
  Fix Released

Bug description:
  I build a demo to run a Machine Learning workshop based on a blog post
  made by Hortonworks.

  There are 2 sides in the demo: 
  * Bundle: https://github.com/SaMnCo/bundle-flight-delay-demo
  * Charm: https://github.com/SaMnCo/charm-flight-delay-demo

  The bundle comprises:

  * YARN Master
  * 4x compute nodes
  * PIG colocated on YARN
  * ipython-notebook colocated on YARN

  When I deploy manually using the 00-deploy file provided in the
  bundle, everything goes well. However, when trying to deploy the
  bundle, it fails at relation creation.

  In the attached juju log collected at deployment, line 11347, we see
  Hadoop crashing. Then 11402, the crash expands. 11418, we discover the
  resource manager is not ready.

  I can reproduce the same behavior with a simpler bundle comprising:

  * YARN Master
  * 4x compute nodes
  * PIG colocated on YARN

  So it seems to really be related to PIG/YARN/Compute nodes relation.

  I can also reproduce the same behavior from juju-deployer, which fails
  with :

  2015-01-23 16:51:21 [INFO] deployer.import: Adding relations...
  2015-01-23 16:51:23 [INFO] deployer.import:  Adding relation 
c00-yarn-master:namenode <-> c02-compute-node:datanode
  2015-01-23 16:51:24 [INFO] deployer.import:  Adding relation 
c00-yarn-master:resourcemanager <-> c02-compute-node:nodemanager
  2015-01-23 16:51:25 [INFO] deployer.import:  Adding relation 
c04-hdp-pig:namenode <-> c00-yarn-master:namenode
  2015-01-23 16:51:25 [INFO] deployer.import:  Adding relation 
c04-hdp-pig:resourcemanager <-> c00-yarn-master:resourcemanager
  2015-01-23 16:51:26 [INFO] deployer.import:  Adding relation 
c03-flight-delay-demo:notebook <-> c01-ipython-notebook:notebook
  2015-01-23 16:51:26 [DEBUG] deployer.import: Waiting for relation convergence 
60s
  2015-01-23 16:52:29 [ERROR] deployer.env: The following units had errors:
     unit: c00-yarn-master/0: machine: 1 agent-state: error details: hook 
failed: "namenode-relation-joined"
  2015-01-23 16:52:29 [INFO] deployer.cli: Deployment stopped. run time: 583.23

  Let me know if you need anything else! 
  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/charms/+source/hdp-hadoop/+bug/1414080/+subscriptions

-- 
Mailing list: https://launchpad.net/~bigdata-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~bigdata-dev
More help   : https://help.launchpad.net/ListHelp

Reply via email to