[jira] [Commented] (TEZ-2342) TestFaultTolerance.testRandomFailingTasks fails due to timeout
[ https://issues.apache.org/jira/browse/TEZ-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513676#comment-14513676 ]

Jeff Zhang commented on TEZ-2342:
---------------------------------

[~bikassaha] No other issues after running it many times. I checked the logs on the Windows Jenkins server, and it failed due to a timeout.

TestFaultTolerance.testRandomFailingTasks fails due to timeout
--------------------------------------------------------------

Key: TEZ-2342
URL: https://issues.apache.org/jira/browse/TEZ-2342
Project: Apache Tez
Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
Priority: Minor
Attachments: TEZ-2342-1.patch, syslog_dag_1429582868137_0001_1

{code}
Error Message
test timed out after 12 milliseconds

Stacktrace
java.lang.Exception: test timed out after 12 milliseconds
    at java.lang.Thread.sleep(Native Method)
    at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:126)
    at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:114)
    at org.apache.tez.test.TestFaultTolerance.testRandomFailingTasks(TestFaultTolerance.java:723)

Standard Output
2015-04-17 07:46:10,952 INFO [main] test.TestFaultTolerance (TestFaultTolerance.java:setup(65)) - Starting mini clusters
2015-04-17 07:46:11,508 INFO [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:init(446)) - starting cluster: numNameNodes=1, numDataNodes=1
Formatting using clusterid: testClusterID
2015-04-17 07:46:12,919 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(716)) - No KeyProvider found.
2015-04-17 07:46:12,920 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(726)) - fsLock is fair:true
2015-04-17 07:46:13,021 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - hadoop.configured.node.mapping is deprecated. Instead, use net.topology.configured.node.mapping
2015-04-17 07:46:13,021 INFO [main] blockmanagement.DatanodeManager (DatanodeManager.java:init(239)) - dfs.block.invalidate.limit=1000
2015-04-17 07:46:13,022 INFO [main] blockmanagement.DatanodeManager (DatanodeManager.java:init(245)) - dfs.namenode.datanode.registration.ip-hostname-check=true
2015-04-17 07:46:13,022 INFO [main] blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(71)) - dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2015-04-17 07:46:13,025 INFO [main] blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(76)) - The block deletion will start around 2015 Apr 17 07:46:13
2015-04-17 07:46:13,029 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map BlocksMap
2015-04-17 07:46:13,030 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2015-04-17 07:46:13,032 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 2.0% max memory 910.3 MB = 18.2 MB
2015-04-17 07:46:13,033 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^21 = 2097152 entries
2015-04-17 07:46:13,079 INFO [main] blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(365)) - dfs.block.access.token.enable=false
2015-04-17 07:46:13,080 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(350)) - defaultReplication = 1
2015-04-17 07:46:13,080 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(351)) - maxReplication = 512
2015-04-17 07:46:13,083 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(352)) - minReplication = 1
2015-04-17 07:46:13,083 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(353)) - maxReplicationStreams = 2
2015-04-17 07:46:13,083 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(354)) - shouldCheckForEnoughRacks = false
2015-04-17 07:46:13,084 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(355)) - replicationRecheckInterval = 3000
2015-04-17 07:46:13,084 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(356)) - encryptDataTransfer= false
2015-04-17 07:46:13,084 INFO [main] blockmanagement.BlockManager (BlockManager.java:init(357)) - maxNumBlocksToLog = 1000
2015-04-17 07:46:13,115 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(746)) - fsOwner = jenkins (auth:SIMPLE)
2015-04-17 07:46:13,116 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(747)) - supergroup = supergroup
2015-04-17 07:46:13,116 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(748)) - isPermissionEnabled = true
2015-04-17 07:46:13,116 INFO [main] namenode.FSNamesystem (FSNamesystem.java:init(759)) - HA Enabled: false
2015-04-17 07:46:13,120 INFO [main] namenode.FSNamesystem
[jira] [Commented] (TEZ-2342) TestFaultTolerance.testRandomFailingTasks fails due to timeout
[ https://issues.apache.org/jira/browse/TEZ-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514427#comment-14514427 ]

Bikas Saha commented on TEZ-2342:
---------------------------------

Sounds good. +1
[jira] [Commented] (TEZ-2342) TestFaultTolerance.testRandomFailingTasks fails due to timeout
[ https://issues.apache.org/jira/browse/TEZ-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506515#comment-14506515 ]

Jeff Zhang commented on TEZ-2342:
---------------------------------

[~hitesh] [~bikassaha] Please help review it.
[jira] [Commented] (TEZ-2342) TestFaultTolerance.testRandomFailingTasks fails due to timeout
[ https://issues.apache.org/jira/browse/TEZ-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507543#comment-14507543 ]

Bikas Saha commented on TEZ-2342:
---------------------------------

If this passes with the increased timeout (instead of hanging permanently), then the change looks good. Could you please run this in a loop 10-20 times and see if there are any further issues? If there are none, then let's commit this. Otherwise, let's look for a code/test bug.
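
As a rough illustration of the "run this in a loop 10-20 times" check requested in the comment above, a minimal Java sketch follows. The names `RepeatWithTimeout`, `repeat`, and `Summary` are hypothetical and are not part of Tez or its test suite; the `check` callable is a stand-in for whatever actually launches the test. The sketch only tallies passes, failures, and timeouts, so that a run that hangs shows up as a timeout instead of blocking the loop.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Tez: repeats a check and tallies the outcomes.
class RepeatWithTimeout {

    /** Outcome counts for a batch of repeated runs. */
    static final class Summary {
        int passed;
        int failed;
        int timedOut;
    }

    /** Runs the check the given number of times, each with its own time limit. */
    static Summary repeat(Callable<Boolean> check, int runs, long timeoutMillis)
            throws InterruptedException {
        Summary summary = new Summary();
        for (int i = 0; i < runs; i++) {
            // A fresh executor per run, so a stuck run cannot delay the next one.
            ExecutorService executor = Executors.newSingleThreadExecutor();
            Future<Boolean> future = executor.submit(check);
            try {
                if (Boolean.TRUE.equals(future.get(timeoutMillis, TimeUnit.MILLISECONDS))) {
                    summary.passed++;
                } else {
                    summary.failed++;
                }
            } catch (TimeoutException e) {
                summary.timedOut++;      // the run did not finish in time
            } catch (ExecutionException e) {
                summary.failed++;        // the check itself threw an exception
            } finally {
                executor.shutdownNow();  // interrupts the run if it is still going
            }
        }
        return summary;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder check; a real one would launch the test under investigation.
        Summary s = repeat(() -> true, 20, 1_000L);
        System.out.println("passed=" + s.passed + " failed=" + s.failed
                + " timedOut=" + s.timedOut);
    }
}
```

Counting timeouts separately from failures mirrors the distinction drawn in the comment: a run that is merely slow would pass under a larger limit, whereas a run that times out at any limit points to a hang worth investigating as a code or test bug.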