[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508419#comment-15508419 ] Rui Li commented on HIVE-14714: --- +1 Thanks for the update [~gszadovszky]. I'll commit this shortly if no one has any other comments. > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.3.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506672#comment-15506672 ] Hive QA commented on HIVE-14714: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12829385/HIVE-14714.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10556 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1241/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1241/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1241/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12829385 - PreCommit-HIVE-Build > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.3.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506460#comment-15506460 ] Gabor Szadovszky commented on HIVE-14714: - Review board link is removed as it is not relevant anymore. For this little change, I would not create another one. > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.3.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505363#comment-15505363 ] Rui Li commented on HIVE-14714: --- Hi [~gszadovszky], in that case, how about logging some brief message in DEBUG level? I'm just wary to swallow any exceptions. [~xuefuz], do you have any thoughts on this? > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503484#comment-15503484 ] Gabor Szadovszky commented on HIVE-14714: - Thanks a lot for the hint, [~lirui]. The fix of [HIVE-13895] should solve the waiting problem. However, in case of child.waitFor() is interrupted and the related process still generates some output the IOException in the redirector threads would be logged. (It might occur if the related spark configs are modified.) I think, these exceptions might be misleading. So, I would do a minimal modification to swallow these IOExceptions in case we are about to stop the remote driver (isAlive is false). What do you think? > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503061#comment-15503061 ] Rui Li commented on HIVE-14714: --- bq. These threads are running in HS2 therefore, they won't be terminated in case of beeline is closed. Yeah, but if we use CLI, these threads run in the CLI. Then we may lose some output from spark-submit after CLI exits. Thinking more about this, I think the problem is more specific to yarn-cluster mode right? Because in yarn-client mode, RemoteDriver runs in spark-submit so it should shut down properly. For yarn-cluster mode, spark-submit is just a monitor for the spark app. It may be acceptable to lose some output from it. But on the other hand, user can set {{spark.yarn.submit.waitAppCompletion=false}} so that spark-submit exits after the app is submitted in order to avoid this hanging issue. HIVE-13895 actually made this default. I wonder if that should be enough for the issue. > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502706#comment-15502706 ] Gabor Szadovszky commented on HIVE-14714: - Hi [~lirui], # The root cause of the spark submit hang is that the refresh interval of the checking of the process might set as large as it won't get the new state of the remote driver in time. This value can be modified by the user therefore, I would like to handle this situation. # These threads are running in HS2 therefore, they won't be terminated in case of beeline is closed. The only effect on the beeline is that it don't have to wait for the timeout as the method stop() will return immediately. (In case of HS2 is running in embedded mode, then these threads will be terminated but it was the original behaviour which I haven't changed.) > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502177#comment-15502177 ] Rui Li commented on HIVE-14714: --- Hi [~gszadovszky], I have several questions about the change: # If the root cause is that spark-submit hangs after RemoteDriver exits, can we figure out the reason and fix it? Both the delay and exception log on Hive's side seem to be expected in that case by design, although we can reconsider whether this is appropriate. # I guess the change also affects CLI right? Since both driverThread and redirector are daemon threads, will they just be terminated when the CLI exits? > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495722#comment-15495722 ] Gabor Szadovszky commented on HIVE-14714: - Failing unit tests are not relevant. Existing unit tests are covering the modified functionality or it is related to logging which would quite hard to unit test. Manually tested (see review board for details.) > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493957#comment-15493957 ] Hive QA commented on HIVE-14714: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12828659/HIVE-14714.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10561 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1200/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1200/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1200/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12828659 - PreCommit-HIVE-MASTER-Build > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493760#comment-15493760 ] Gabor Szadovszky commented on HIVE-14714: - The original problem was the listed exception and that beeline exited only after 10s. The root cause of the 10s delay was that in many cases the spark-submit process does not end even in the case of the RemoteDriver has ended on the other side. Therefore, the driverThread.join(1) really waits for 10s and then we are interrupting it. Here comes the root cause of the logged exception. If we are interrupting child.waitFor() the redirector threads gets IOExceptions in the next readLine() as the related streams got closed. I've redesigned the Redirector class therefore, it does not use any IO which might hang the thread in case of interruption (e.g. BufferedReader.readLine() cannot be interrupted, it waits for infinity if the related stream is open but no input appears). After this redesign we are able to simply interrupt the driver thread and let it keep working in the background until we have some outputs to be gathered or the related timeout occurs. We do not have to hang the client side to wait for all the threads to be finished. Then came the unit test failure. The root cause was that protocol.endSession() only sends a job via rpc asynchronously to close the session on the other side. As there is no 10s delay anymore the unit tests executed each after another run into the issue that the previous session is not closed properly. Therefore I've implemented some trick make the end session synchronous. Hope it describes my change properly and with my code comments makes it understandable. Any comments here or on the review board are more than welcome. :) > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493696#comment-15493696 ] Xuefu Zhang commented on HIVE-14714: I'm not sure if this is obvious from the code changes, but could we get some diagnosis and explanation on the fix? It doesn't seem trivial. > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.2.patch, HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491084#comment-15491084 ] Hive QA commented on HIVE-14714: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12828490/HIVE-14714.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10545 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats0] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1188/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1188/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1188/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12828490 - PreCommit-HIVE-MASTER-Build > Finishing Hive on Spark causes "java.io.IOException: Stream closed" > --- > > Key: HIVE-14714 > URL: https://issues.apache.org/jira/browse/HIVE-14714 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Gabor Szadovszky >Assignee: Gabor Szadovszky > Attachments: HIVE-14714.patch > > > After execute hive command with Spark, finishing the beeline session or > even switch the engine causes IOException. The following executed Ctrl-D to > finish the session but "!quit" or even "set hive.execution.engine=mr;" causes > the issue. > From HS2 log: > {code} > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote > driver, interrupting... > 2016-09-06 16:15:12,291 WARN org.apache.hive.spark.client.SparkClientImpl: > [Driver]: Waiting thread interrupted, killing child process. > 2016-09-06 16:15:12,296 WARN org.apache.hive.spark.client.SparkClientImpl: > [stderr-redir-1]: Error in redirector thread. > java.io.IOException: Stream closed > at > java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:272) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:154) > at java.io.BufferedReader.readLine(BufferedReader.java:317) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)