[
https://issues.apache.org/jira/browse/HAMA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113079#comment-13113079
]
Edward J. Yoon commented on HAMA-387:
-------------------------------------
{code}
2011-09-23 09:42:34,466 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 11/09/23 09:42:34 DEBUG bsp.BSPPeer:
enterBarrier() znode size within /bsp/job_201109230911_0002/71 is 45. Znodes
include [attempt_201109230911_0002_000046_0,
attempt_201109230911_0002_000006_0, attempt_201109230911_0002_000005_0,
attempt_201109230911_0002_000030_0, attempt_201109230911_0002_000000_0,
attempt_201109230911_0002_000026_0, attempt_201109230911_0002_000025_0,
attempt_201109230911_0002_000007_0, attempt_201109230911_0002_000024_0,
attempt_201109230911_0002_000014_0, attempt_201109230911_0002_000021_0,
attempt_201109230911_0002_000045_0, attempt_201109230911_0002_000015_0,
attempt_201109230911_0002_000035_0, attempt_201109230911_0002_000020_0,
attempt_201109230911_0002_000016_0, attempt_201109230911_0002_000044_0,
attempt_201109230911_0002_000009_0, attempt_201109230911_0002_000017_0,
attempt_201109230911_0002_000008_0, attempt_201109230911_0002_000011_0,
attempt_201109230911_0002_000037_0, attempt_201109230911_0002_000004_0,
attempt_201109230911_0002_000043_0, attempt_201109230911_0002_000022_0,
attempt_201109230911_0002_000012_0, attempt_201109230911_0002_000019_0,
attempt_201109230911_0002_000039_0, attempt_201109230911_0002_000034_0,
attempt_201109230911_0002_000036_0, attempt_201109230911_0002_000027_0,
attempt_201109230911_0002_000018_0, attempt_201109230911_0002_000033_0,
attempt_201109230911_0002_000023_0, attempt_201109230911_0002_000029_0,
attempt_201109230911_0002_000013_0, attempt_201109230911_0002_000003_0,
attempt_201109230911_0002_000031_0, attempt_201109230911_0002_000028_0,
attempt_201109230911_0002_000040_0, attempt_201109230911_0002_000001_0,
attempt_201109230911_0002_000042_0, attempt_201109230911_0002_000047_0,
attempt_201109230911_0002_000002_0, attempt_201109230911_0002_000032_0]
2011-09-23 09:42:34,482 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 11/09/23 09:42:34 DEBUG bsp.BSPPeer:
enterBarrier() znode size within /bsp/job_201109230911_0002/71 is 47. Znodes
include [attempt_201109230911_0002_000010_0,
attempt_201109230911_0002_000046_0, attempt_201109230911_0002_000006_0,
attempt_201109230911_0002_000005_0, attempt_201109230911_0002_000030_0,
attempt_201109230911_0002_000000_0, attempt_201109230911_0002_000026_0,
attempt_201109230911_0002_000025_0, attempt_201109230911_0002_000007_0,
attempt_201109230911_0002_000024_0, attempt_201109230911_0002_000014_0,
attempt_201109230911_0002_000021_0, attempt_201109230911_0002_000045_0,
attempt_201109230911_0002_000015_0, attempt_201109230911_0002_000035_0,
attempt_201109230911_0002_000020_0, attempt_201109230911_0002_000016_0,
attempt_201109230911_0002_000044_0, attempt_201109230911_0002_000009_0,
attempt_201109230911_0002_000017_0, attempt_201109230911_0002_000008_0,
attempt_201109230911_0002_000011_0, attempt_201109230911_0002_000037_0,
attempt_201109230911_0002_000004_0, attempt_201109230911_0002_000043_0,
attempt_201109230911_0002_000022_0, attempt_201109230911_0002_000012_0,
attempt_201109230911_0002_000019_0, attempt_201109230911_0002_000039_0,
attempt_201109230911_0002_000034_0, attempt_201109230911_0002_000036_0,
attempt_201109230911_0002_000027_0, attempt_201109230911_0002_000018_0,
attempt_201109230911_0002_000033_0, attempt_201109230911_0002_000023_0,
attempt_201109230911_0002_000029_0, attempt_201109230911_0002_000013_0,
attempt_201109230911_0002_000003_0, attempt_201109230911_0002_000031_0,
attempt_201109230911_0002_000028_0, attempt_201109230911_0002_000040_0,
attempt_201109230911_0002_000001_0, attempt_201109230911_0002_000041_0,
attempt_201109230911_0002_000042_0, attempt_201109230911_0002_000047_0,
attempt_201109230911_0002_000002_0, attempt_201109230911_0002_000032_0]
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 11/09/23 09:42:34 WARN bsp.BSPPeer: Ignore
because znode may be deleted.
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
for /bsp/job_201109230911_0002/71/ready
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 11/09/23 09:42:34 WARN bsp.BSPPeer: Ignore
because znode may be deleted.
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
for /bsp/job_201109230911_0002/71/ready
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 at
org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 at
org.apache.hama.bsp.BSPPeer$1.process(BSPPeer.java:397)
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000045_0 at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
2011-09-23 09:42:34,507 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
2011-09-23 09:42:34,508 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
2011-09-23 09:42:34,508 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 at
org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
2011-09-23 09:42:34,508 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 at
org.apache.hama.bsp.BSPPeer$1.process(BSPPeer.java:397)
2011-09-23 09:42:34,508 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000046_0 at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 11/09/23 09:42:34 WARN bsp.BSPPeer: Ignore
because znode may be deleted.
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
for /bsp/job_201109230911_0002/71/ready
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 at
org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 at
org.apache.hama.bsp.BSPPeer$1.process(BSPPeer.java:397)
2011-09-23 09:42:34,516 INFO org.apache.hama.bsp.TaskRunner:
attempt_201109230911_0002_000047_0 at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
{code}
The problem occurred on my testbed, too.
> Advanced Barrier Synchronization
> --------------------------------
>
> Key: HAMA-387
> URL: https://issues.apache.org/jira/browse/HAMA-387
> Project: Hama
> Issue Type: Improvement
> Components: bsp
> Affects Versions: 0.3.0
> Reporter: Edward J. Yoon
> Assignee: ChiaHung Lin
> Fix For: 0.4.0
>
> Attachments: HAMA-387.patch, HAMA-387_v02.patch, HAMA-387_v03.patch,
> HAMA-387_v04.patch, doublebarrier.patch, new.patch, sleepless.patch, x.PNG,
> x.patch
>
>
> I think the lock file must include:
> * the job ID
> * the task ID of the lock file owner
> * the current superstep count
> so that ownership and validity can be checked.
> Currently lock files are named by hostname, but multiple tasks may run on a
> single groomserver in the future.
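
As a rough illustration of the naming scheme proposed above, the sketch below builds a barrier znode path from the job ID, the superstep count, and the task attempt ID, following the /bsp/<job>/<superstep>/<attempt> layout visible in the log. The class BarrierZnodeName and its methods pathFor and isOwnedBy are hypothetical names chosen for this example; they are not part of Hama or ZooKeeper, and the actual patch may do this differently.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of Hama): derives a barrier znode path
 * from the job ID, superstep count, and task attempt ID instead of the hostname.
 */
final class BarrierZnodeName {
    // Root path as it appears in the log; assumed fixed for this sketch.
    private static final String ROOT = "/bsp";

    private BarrierZnodeName() {}

    /** Builds /bsp/<jobId>/<superstep>/<taskAttemptId>. */
    static String pathFor(String jobId, long superstep, String taskAttemptId) {
        Objects.requireNonNull(jobId, "jobId");
        Objects.requireNonNull(taskAttemptId, "taskAttemptId");
        if (superstep < 0) {
            throw new IllegalArgumentException("superstep must be >= 0: " + superstep);
        }
        return ROOT + "/" + jobId + "/" + superstep + "/" + taskAttemptId;
    }

    /** Returns true only if the path matches this job, superstep, and task exactly. */
    static boolean isOwnedBy(String path, String jobId, long superstep, String taskAttemptId) {
        return pathFor(jobId, superstep, taskAttemptId).equals(path);
    }
}
```

Because the task attempt ID is unique per task, two tasks on the same groomserver get distinct znodes, and a stale znode from an earlier superstep fails the ownership check.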