[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488526#comment-15488526 ]

Hive QA commented on HIVE-14739:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828311/HIVE-14739.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10545 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1167/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1167/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1167/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12828311 - PreCommit-HIVE-MASTER-Build

> Replace runnables directly added to runtime shutdown hooks to avoid deadlock
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-14739
>                 URL: https://issues.apache.org/jira/browse/HIVE-14739
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Deepesh Khandelwal
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-14739.1.patch, HIVE-14739.2.patch, HIVE-14739.3.patch
>
>
> [~deepesh] reported that a deadlock can occur when running queries through
> the Hive CLI. [~cnauroth] analyzed it and reported that Hive adds shutdown
> hooks directly to the Java Runtime, which may execute them in a
> non-deterministic order, causing deadlocks with Hadoop's shutdown hooks. In
> one case, the Hadoop shutdown hook locked FileSystem#Cache and then
> FileSystem.close, whereas the Hive shutdown hook locked FileSystem.close and
> then FileSystem#Cache, causing a deadlock.
> Hive and Hadoop each have a ShutdownHookManager that runs shutdown hooks in
> a deterministic order based on priority. We should use that throughout the
> code to avoid deadlocks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
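The priority-based ordering mentioned in the issue description can be illustrated with a small, JDK-only sketch. The class name `OrderedShutdownHooks` and its methods are hypothetical and chosen for illustration; this is not the Hadoop or Hive `ShutdownHookManager` code, only an assumption about the general shape of such a manager: one JVM-level hook that runs all registered Runnables one after another, highest priority first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a priority-ordered hook registry (not Hadoop's or Hive's class). */
final class OrderedShutdownHooks {

    /** A registered hook together with its priority. */
    private static final class Entry {
        final Runnable hook;
        final int priority;

        Entry(Runnable hook, int priority) {
            this.hook = hook;
            this.priority = priority;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Registers a hook; hooks with a higher priority run earlier. */
    synchronized void add(Runnable hook, int priority) {
        entries.add(new Entry(hook, priority));
    }

    /** Runs every hook sequentially on the calling thread, highest priority first. */
    void runAll() {
        List<Entry> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(entries);
        }
        // List.sort is stable, so hooks with equal priority keep registration order.
        snapshot.sort(Comparator.comparingInt((Entry e) -> e.priority).reversed());
        for (Entry e : snapshot) {
            try {
                e.hook.run();
            } catch (RuntimeException ex) {
                // One failing hook should not stop the remaining hooks.
                System.err.println("Shutdown hook failed: " + ex);
            }
        }
    }

    /** Installs exactly one JVM shutdown hook that drives all registered hooks. */
    void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::runAll, "ordered-shutdown-hooks"));
    }
}
```

Because every hook runs on the same thread and in a fixed order, two hooks can no longer take the same pair of locks concurrently in opposite orders, which is the situation described in the issue.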
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488409#comment-15488409 ]

Siddharth Seth commented on HIVE-14739:
---------------------------------------

+1
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486315#comment-15486315 ]

Chris Nauroth commented on HIVE-14739:
--------------------------------------

[~prasanth_j], thank you for the updated patch. +1 (non-binding) from me.
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485889#comment-15485889 ]

Hive QA commented on HIVE-14739:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828143/HIVE-14739.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1159/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1159/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1159/

Messages:
{noformat}
This message was trimmed, see log for full details
[INFO] Including org.jodd:jodd-core:jar:3.5.2 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Excluding org.datanucleus:datanucleus-core:jar:4.1.6 from the shaded jar.
[INFO] Excluding org.apache.calcite:calcite-core:jar:1.6.0 from the shaded jar.
[INFO] Excluding org.apache.calcite:calcite-linq4j:jar:1.6.0 from the shaded jar.
[INFO] Excluding net.hydromatic:eigenbase-properties:jar:1.1.5 from the shaded jar.
[INFO] Excluding org.codehaus.janino:janino:jar:2.7.6 from the shaded jar.
[INFO] Excluding org.codehaus.janino:commons-compiler:jar:2.7.6 from the shaded jar.
[INFO] Excluding org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde from the shaded jar.
[INFO] Excluding org.apache.calcite:calcite-avatica:jar:1.6.0 from the shaded jar.
[INFO] Including com.google.guava:guava:jar:14.0.1 in the shaded jar.
[INFO] Including com.googlecode.javaewah:JavaEWAH:jar:0.3.2 in the shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.2.4 from the shaded jar.
[INFO] Including org.json:json:jar:20090211 in the shaded jar.
[INFO] Excluding stax:stax-api:jar:1.0.1 from the shaded jar.
[INFO] Including net.sf.opencsv:opencsv:jar:2.3 in the shaded jar.
[INFO] Excluding jline:jline:jar:2.12 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-api:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.codehaus.jettison:jettison:jar:1.3.4 from the shaded jar.
[INFO] Excluding org.apache.commons:commons-collections4:jar:4.1 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-library:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.roaringbitmap:RoaringBitmap:jar:0.4.9 from the shaded jar.
[INFO] Excluding com.ning:async-http-client:jar:1.8.16 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-common:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-internals:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.apache.tez:hadoop-shim:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-mapreduce:jar:0.8.4 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:2.6.0 from the shaded jar.
[INFO] Excluding javax.servlet:servlet-api:jar:2.5 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-core_2.10:jar:1.6.0 from the shaded jar.
[INFO] Excluding com.twitter:chill_2.10:jar:0.5.0 from the shaded jar.
[INFO] Excluding com.twitter:chill-java:jar:0.5.0 from the shaded jar.
[INFO] Excluding org.apache.xbean:xbean-asm5-shaded:jar:4.4 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-client:jar:2.7.2 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.7.2 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.7.2 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.7.2 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-launcher_2.10:jar:1.6.0 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-network-common_2.10:jar:1.6.0 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-network-shuffle_2.10:jar:1.6.0 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-unsafe_2.10:jar:1.6.0 from the shaded jar.
[INFO] Excluding org.slf4j:jul-to-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding org.slf4j:jcl-over-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding com.ning:compress-lzf:jar:1.0.3 from the shaded jar.
[INFO] Excluding net.jpountz.lz4:lz4:jar:1.3.0 from the shaded jar.
[INFO] Excluding com.typesafe.akka:akka-remote_2.10:jar:2.3.11 from the shaded jar.
[INFO] Excluding com.typesafe.akka:akka-actor_2.10:jar:2.3.11 from the shaded jar.
[INFO] Excluding com.typesafe:config:jar:1.2.1 from the shaded jar.
[INFO] Excluding org.uncommons.maths:uncommons-maths:jar:1.2.2a from the shaded jar.
[INFO] Excluding com.typesafe.akka:akka-slf4j_2.10:jar:2.3.11 from the shaded jar.
[INFO] Excluding org.scala-lang:scala-library:jar:2.10.4 from the shaded jar.
[INFO] Excluding org.json4s:json4s-jackson_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.json4s:json4s-core_2.10:jar:3.2.10 from the
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485740#comment-15485740 ]

Prasanth Jayachandran commented on HIVE-14739:
----------------------------------------------

Thanks [~cnauroth] for reviewing the patch! I am not sure why ShutdownHookManager was forked initially (perhaps it was private in Hadoop at the time; I am not sure). I looked at the history, and the initial commit showed no intent to fork it. Later, HIVE-11768 added a delete-on-exit hook that tracks the temp files that are created so that they are cleared on shutdown.

I refactored Hive's ShutdownHookManager to retain the DELETE_ON_EXIT hook and to keep some methods that just delegate to Hadoop's ShutdownHookManager. I also replaced all Threads with Runnables.
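The delete-on-exit hook described in the comment above could look roughly like the following JDK-only sketch. The class name `DeleteOnExitHook` and its methods are hypothetical and are not taken from the Hive patch. The idea shown is that temp-file cleanup is an ordinary `Runnable` that is registered once with the shared hook manager instead of being added to the Java `Runtime` as its own `Thread`. With Hadoop on the classpath, the registration would presumably go through `org.apache.hadoop.util.ShutdownHookManager.get().addShutdownHook(Runnable, int)`; that call is left out here so the example stays self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a delete-on-exit hook (not the code from the Hive patch). */
final class DeleteOnExitHook implements Runnable {

    // Paths that should be removed when the hook runs; safe for concurrent registration.
    private final Set<Path> paths = ConcurrentHashMap.newKeySet();

    /** Marks a path for deletion at shutdown. */
    void deleteOnExit(Path path) {
        paths.add(path);
    }

    /** Removes a path from the pending set, for example after it was cleaned up normally. */
    void cancelDeleteOnExit(Path path) {
        paths.remove(path);
    }

    /** Deletes every pending path; a failure on one path does not stop the others. */
    @Override
    public void run() {
        for (Path path : paths) {
            try {
                Files.deleteIfExists(path);
            } catch (IOException e) {
                System.err.println("Could not delete " + path + ": " + e);
            }
        }
        paths.clear();
    }
}
```

Because the hook is a plain `Runnable`, the manager that owns the single JVM-level shutdown thread decides when it runs relative to other hooks.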
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485485#comment-15485485 ]

Chris Nauroth commented on HIVE-14739:
--------------------------------------

[~prasanth_j], thank you for sharing this patch.

It's interesting for me to see that Hive appears to have forked its own copy of {{ShutdownHookManager}} from Hadoop. I don't know the background on this. The code is similar, but not identical, between the two codebases. Perhaps that's because the Hive version was not updated to match recent changes in Hadoop, like HADOOP-12950.

In order to fully prevent deadlocks between different shutdown hooks, there really needs to be a single {{ShutdownHookManager}} in the process. If Hadoop and Hive each have their own implementation, and a Hive process instantiates one of each and registers different shutdown hooks with each one, then there will be two threads executing different shutdown hooks concurrently, which could still cause a deadlock. Would it make sense to eliminate the forked {{ShutdownHookManager}} class in Hive and instead rely completely on using the one from Hadoop?

Also, a minor nit: maybe all calls to {{new Thread()}} could be converted to {{new Runnable()}}. The {{Runnable}} interface is sufficient, and it won't make use of any additional functionality provided by the {{Thread}} implementation.
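The point about having a single manager per process can be illustrated with a JDK-only sketch. `SharedHooks` is a hypothetical name and this is not Hadoop's `ShutdownHookManager`; the sketch only assumes that all components register plain `Runnable` hooks with one process-wide instance, and that this instance adds exactly one `Thread` to the Java `Runtime`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical process-wide hook registry; an illustration, not Hadoop's class. */
final class SharedHooks {
    private static final SharedHooks INSTANCE = new SharedHooks();

    private final List<Runnable> hooks = new CopyOnWriteArrayList<>();
    private final AtomicBoolean ran = new AtomicBoolean(false);

    private SharedHooks() {
        // The only Thread handed to the JVM; every registered hook runs inside it.
        Runtime.getRuntime().addShutdownHook(new Thread(this::runHooks, "shared-hooks"));
    }

    /** Returns the single instance shared by all components in the process. */
    static SharedHooks get() {
        return INSTANCE;
    }

    /** Callers pass a Runnable; no caller creates its own shutdown Thread. */
    void register(Runnable hook) {
        hooks.add(hook);
    }

    /** Runs all hooks one after another on the calling thread, at most once. */
    void runHooks() {
        if (!ran.compareAndSet(false, true)) {
            return;
        }
        for (Runnable hook : hooks) {
            hook.run();
        }
    }
}
```

With two independent managers, each would contribute its own JVM shutdown thread and their hooks could interleave. With one instance, hooks from different components run sequentially on the same thread, so they cannot block each other on locks.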
[jira] [Commented] (HIVE-14739) Replace runnables directly added to runtime shutdown hooks to avoid deadlock
[ https://issues.apache.org/jira/browse/HIVE-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485392#comment-15485392 ]

Prasanth Jayachandran commented on HIVE-14739:
----------------------------------------------

[~sseth]/[~hagleitn] Can someone please review this patch?