[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906116#comment-15906116 ] Hive QA commented on HIVE-16180: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857456/HIVE-16180.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=153) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4086/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4086/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4086/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857456 - PreCommit-HIVE-Build > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, > Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, HIVE-16180.2.patch, > Native-mem-spike.png > > > Observed this in an internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause the YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out, but their native memory is not > guaranteed to be freed until the next Full GC. 
To demonstrate this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. The first buffer > is null'ed out, but its native memory remains in use. The second buffer uses a Cleaner to > free its native allocation. The third buffer is also null'ed, but this time > System.gc() is invoked, which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
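The attached DirectCleaner.java is not reproduced here, but the workaround the description implies, invoking a direct buffer's Cleaner explicitly instead of waiting for a Full GC, can be sketched roughly as below. This is an illustrative sketch only: the class and method structure are assumptions, and the reflective access path differs between JDK 8 (`sun.nio.ch.DirectBuffer.cleaner()`) and JDK 9+ (`sun.misc.Unsafe.invokeCleaner`), which is exactly the portability concern HADOOP-12760 addresses.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

/** Sketch: free a direct buffer's native memory immediately instead of waiting for a Full GC. */
public class DirectCleaner {

    /** Attempts to release buf's native memory; returns false if this JVM blocks the access. */
    public static boolean tryClean(ByteBuffer buf) {
        if (buf == null || !buf.isDirect()) {
            return false; // heap buffers have no native allocation to release
        }
        try {
            // JDK 9+ path: sun.misc.Unsafe.invokeCleaner(ByteBuffer)
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buf);
            return true;
        } catch (Throwable ignored) {
            // invokeCleaner absent (JDK 8) or inaccessible; fall through to the legacy path
        }
        try {
            // JDK 8 path: ((sun.nio.ch.DirectBuffer) buf).cleaner().clean(), done reflectively
            Method cleanerMethod = buf.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buf);
            if (cleaner != null) {
                cleaner.getClass().getMethod("clean").invoke(cleaner);
                return true;
            }
        } catch (Throwable ignored) {
            // inaccessible on this JVM; the memory will only be freed by a later GC
        }
        return false;
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.allocateDirect(1 << 20); // 1 MB
        System.out.println("cleaned=" + tryClean(data));
        // data must never be touched after cleaning: its backing native memory is gone.
    }
}
```

Note the hard constraint this creates: once cleaned, any further access to the buffer is undefined behavior, which is why the ZCR discussion below matters (buffers handed to or obtained from a zero-copy reader must not be cleaned by this code path).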
[jira] [Updated] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16156: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Patch committed to master. Thanks to Sergey for the review. > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) >
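The fix the issue title describes, deleting an existing output target before renaming during commit, can be illustrated with plain java.nio file operations. The actual patch works against the Hadoop FileSystem API, whose rename likewise fails when the destination already exists; this local-filesystem sketch is only an analogy, and the path names are taken from the stack trace above for flavor, not from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Sketch of an idempotent commit step: a retried attempt must tolerate output left by a killed attempt. */
public class CommitRename {

    /**
     * Moves tmp (e.g. _task_tmp.-ext-10001/_tmp.000306_0) to target (e.g. 000306_0),
     * first removing any target left behind by a previous attempt that was killed
     * after its rename succeeded -- the rare failure mode described in HIVE-16156.
     */
    public static void commit(Path tmp, Path target) throws IOException {
        Files.deleteIfExists(target);                         // discard stale output from a prior attempt
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE); // then publish this attempt's output
    }
}
```

The design point is idempotence: since any committed attempt wrote identical output for the same task, overwriting a leftover target is safe, whereas failing the whole job on a leftover file is not.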
[jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906108#comment-15906108 ] Hive QA commented on HIVE-16156: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857437/HIVE-16156.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10339 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4085/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4085/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4085/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857437 - PreCommit-HIVE-Build > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. 
> {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. > java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at >
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906089#comment-15906089 ] Hive QA commented on HIVE-16133: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857431/HIVE-16133.04.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10339 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4084/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4084/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4084/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857431 - PreCommit-HIVE-Build > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.04.patch, > HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906063#comment-15906063 ] Hive QA commented on HIVE-16156: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857437/HIVE-16156.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=217) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4083/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4083/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4083/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857437 - PreCommit-HIVE-Build > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. 
> {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. > java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906042#comment-15906042 ] Hive QA commented on HIVE-16104: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857429/HIVE-16104.04.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10341 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4082/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4082/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4082/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857429 - PreCommit-HIVE-Build > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.03.patch, HIVE-16104.04.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906017#comment-15906017 ] Hive QA commented on HIVE-16156: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857437/HIVE-16156.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10339 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4081/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4081/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4081/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857437 - PreCommit-HIVE-Build > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. 
> {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. > java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at >
[jira] [Commented] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905986#comment-15905986 ] Hive QA commented on HIVE-15947: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857414/HIVE-15947.7.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10354 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin2] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_udf_udaf] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ptf_decimal] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_19] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union23] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union31] (batchId=97) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union] (batchId=97) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=187) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4080/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4080/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4080/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12857414 - PreCommit-HIVE-Build > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug >Reporter: Subramanyam Pattipaka >Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.3.patch, > HIVE-15947.4.patch, HIVE-15947.6.patch, HIVE-15947.7.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation > requests. It simply accepts and tries to run all operations. If a large number > of concurrent job submission requests arrives, the time to submit job > operations can increase significantly. Templeton uses HDFS to store staging > files for jobs. If HDFS can't respond to the large number of requests and > throttles, job submission can take very long, on the order of > minutes. > This behavior may not be suitable for all applications; client > applications may expect predictable, low response times for successful > requests, or a throttle response telling them to wait for some time before > re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new Job > 2) Get Job Status > 3) List jobs > These three operations have different complexity due to variance in their use of > cluster resources like YARN/HDFS. > The idea is to introduce a new config templeton.job.submit.exec.max-procs > which controls the maximum number of concurrent active job submissions within > Templeton, and use this config to achieve better response times. If a new job > submission request sees that there are already > templeton.job.submit.exec.max-procs jobs getting submitted concurrently, then > the request will fail with HTTP error 503 with reason >“Too many concurrent job submission requests received. Please wait for > some time before retrying.” > > The client is expected to catch this response and retry after waiting for > some time. 
The default value for the config > templeton.job.submit.exec.max-procs is set to ‘0’. This means that by default job > submission requests are always accepted. The behavior needs to be enabled > based on requirements. > We can have similar behavior for the Status and List operations with the configs > templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs > respectively. > Once a job operation is started, the operation can take a long time. The > client which requested the job operation may not be willing to wait an > indefinite amount of time. This work introduces the configurations > templeton.exec.job.submit.timeout > templeton.exec.job.status.timeout > templeton.exec.job.list.timeout > to specify the maximum amount of time a job operation can execute. If a timeout > happens, then list and status job requests return to the client with the message > "List job request got timed out. Please retry the operation after waiting for > some time." > If a submit job request gets timed out then > i) The job submit request
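The max-procs gate proposed above can be sketched with a plain java.util.concurrent.Semaphore. The class and method names below are illustrative only, not Templeton's actual implementation: a slot is held per in-flight submission, a request that finds no free slot is rejected immediately with HTTP 503 so the client can back off and retry, and max-procs of 0 leaves throttling disabled as the proposed default does. (The proposed timeout configs would additionally bound how long an admitted operation may run; that part is omitted here.)

```java
import java.util.concurrent.Semaphore;

/** Illustrative sketch of a templeton.job.submit.exec.max-procs style concurrency gate. */
public class SubmitThrottle {
    public static final int HTTP_OK = 200;
    public static final int HTTP_BUSY = 503; // "Too many concurrent job submission requests received..."

    private final Semaphore slots; // null means throttling is disabled

    /** maxProcs == 0 disables throttling, matching the proposed default of '0'. */
    public SubmitThrottle(int maxProcs) {
        this.slots = maxProcs > 0 ? new Semaphore(maxProcs) : null;
    }

    /** Runs op if a concurrency slot is free; otherwise rejects with 503 so the client retries later. */
    public int submit(Runnable op) {
        if (slots == null) {       // throttling disabled: always accept
            op.run();
            return HTTP_OK;
        }
        if (!slots.tryAcquire()) {
            return HTTP_BUSY;      // reject immediately rather than queueing the request
        }
        try {
            op.run();
            return HTTP_OK;
        } finally {
            slots.release();       // free the slot even if the operation throws
        }
    }
}
```

The key design choice mirrored here is rejecting rather than queueing: a bounded semaphore with `tryAcquire()` keeps the response time of an overloaded server predictable, at the cost of pushing retry logic to the client.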
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905976#comment-15905976 ] Prasanth Jayachandran commented on HIVE-16180: -- Agreed. Assuming ZCR already takes care of it. > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, > Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, Native-mem-spike.png > > > Observed this in an internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause the YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out, but their native memory is not > guaranteed to be freed until the next Full GC. To demonstrate this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. The first buffer > is null'ed out, but its native memory remains in use. The second buffer uses a Cleaner to > free its native allocation. The third buffer is also null'ed, but this time > System.gc() is invoked, which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905972#comment-15905972 ] Sergey Shelukhin edited comment on HIVE-16180 at 3/11/17 1:13 AM: -- What I mean for ZCR is that we should not call this when zcr is on, because we (may?) give buffers for ZCR reader/get them from it. was (Author: sershe): What I mean for ZCR is that we should not call this when zcr is on. > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, > Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, Native-mem-spike.png > > > Observed this in an internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause the YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out, but their native memory is not > guaranteed to be freed until the next Full GC. To demonstrate this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. The first buffer > is null'ed out, but its native memory remains in use. The second buffer uses a Cleaner to > free its native allocation. The third buffer is also null'ed, but this time > System.gc() is invoked, which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. 
> Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905972#comment-15905972 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

What I mean for ZCR is that we should not call this when zcr is on.
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905966#comment-15905966 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

Well.. if Full GC is not triggered, this can really be problematic. Worst case: no Full GC (very low heap occupancy) combined with lots of off-heap byte buffer allocations can take down the system. By null'ing the reference we lose the chance to invoke the cleanup ourselves.
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905965#comment-15905965 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

Yes. Orc side changes for ZCR might also be required. I haven't looked into it yet. Will do in a follow-up.
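The issue description's longer-term improvement of pooling direct buffers (Hadoop's DirectBufferPool or Netty's PooledByteBufAllocator) can be sketched in a few lines of plain Java. This is a hypothetical minimal pool for illustration only, not the Hadoop or Netty implementation; the class name and sizing policy are assumptions.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical minimal direct-buffer pool illustrating the pooling idea from
// the issue description; real deployments would use Hadoop's DirectBufferPool
// or Netty's PooledByteBufAllocator instead.
public class SimpleDirectBufferPool {
    private final int bufferSize;
    private final ArrayBlockingQueue<ByteBuffer> free;

    public SimpleDirectBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.free = new ArrayBlockingQueue<>(maxPooled);
    }

    // Reuses a pooled buffer when available, avoiding the expensive
    // allocateDirect path (which may trigger System.gc() + thread sleep
    // when direct memory is under pressure).
    public ByteBuffer take() {
        ByteBuffer buf = free.poll();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(bufferSize);
        }
        buf.clear();
        return buf;
    }

    // Returns a buffer to the pool; if the pool is full, the buffer is dropped
    // and left to the GC/Cleaner, which is exactly the slow path being avoided.
    public void release(ByteBuffer buf) {
        if (buf.isDirect() && buf.capacity() == bufferSize) {
            free.offer(buf);
        }
    }
}
```

The design choice here mirrors why pooling helps: releasing a buffer back to the pool is a queue operation, while discarding it defers native reclamation to a Full GC.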
[jira] [Updated] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Jaiswal updated HIVE-16132:
----------------------------------

    Attachment: HIVE-16132.5.patch

Updated results for dynamic_semijoin_reduction

> DataSize stats don't seem correct in semijoin opt branch
> --------------------------------------------------------
>
>                 Key: HIVE-16132
>                 URL: https://issues.apache.org/jira/browse/HIVE-16132
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Deepak Jaiswal
>            Assignee: Deepak Jaiswal
>         Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, HIVE-16132.3.patch, HIVE-16132.4.patch, HIVE-16132.5.patch
>
> For the following operator tree snippet, the second Select is the start of a semijoin optimization branch. Take a look at the Data size - it is the same as the data size for its parent Select, even though the second Select has only a single bigint column in its projection (the parent has 2 columns). I would expect the size to be 533328 (16 bytes * 33333).
> Fixing this estimate may become important if we need to estimate the cost of generating the min/max/bloomfilter.
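The expected figure in the report is simple per-column arithmetic: data size = row count x per-row column width. The sketch below is a hypothetical illustration (Hive's actual estimates come from its column statistics and JavaDataModel, not this code); the ~16-byte bigint width and 33,333-row count are taken from the reported 533328 figure.

```java
// Hypothetical illustration of the data-size recomputation the report asks for:
// after the semijoin branch prunes to a single bigint column, the estimate
// should shrink proportionally instead of inheriting the parent's size.
public class DataSizeEstimate {
    // Assumed per-row width of one bigint column in the stats model (~16 bytes).
    static final long BIGINT_WIDTH = 16;

    static long estimate(long numRows, long bytesPerRow) {
        return numRows * bytesPerRow;
    }

    public static void main(String[] args) {
        long numRows = 33333;
        // Parent Select projects 2 columns; the semijoin branch projects only 1,
        // so its estimate should be roughly half the parent's.
        System.out.println("child estimate: " + estimate(numRows, BIGINT_WIDTH));
    }
}
```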
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905958#comment-15905958 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

Also it's not really a leak, it's just a delayed cleanup.
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905955#comment-15905955 ]

Gunther Hagleitner commented on HIVE-16132:
-------------------------------------------

+1
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905957#comment-15905957 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

[~sershe] can you please take a look?
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905956#comment-15905956 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

+1 pending tests... should this also not be done when zero-copy reader is enabled?
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------

    Attachment: Native-mem-spike.png
                FullGC-15GB-cleanup.png
                Full-gc-native-mem-cleanup.png
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------

    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------

    Attachment: (was: Full-gc-native-mem-cleanup.png)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------

    Attachment: (was: Native-mem-spike.png)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------

    Attachment: (was: FullGC-15GB-cleanup.png)
[jira] [Comment Edited] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905946#comment-15905946 ]

Prasanth Jayachandran edited comment on HIVE-16180 at 3/11/17 12:53 AM:
------------------------------------------------------------------------

Attaching snapshots from the analysis.
1) Native memory spike happened at around 11:00:00PM (configured off-heap cache size is 48GB). At this point memory used by direct byte buffers spiked to 51.7GB. !Native-mem-spike.png!
2) At around the same time a Full GC triggered and reclaimed 15GB of memory (12GB heap + 3GB off-heap). !FullGC-15GB-cleanup.png! !Full-gc-native-mem-cleanup.png!

was (Author: prasanth_j):
Attaching snapshots from the analysis.
1) Native memory spike happened at around 11:00:00PM (configured offheap cache size is 48GB). At this point memory used by direct byte buffer spiked to 51.7GB. !Native-mem-spike.png|thumbnail!
2) At around same time Full GC triggered and reclaimed 15GB of memory (12GB heap + 3 GB offheap) !FullGC-15GB-cleanup.png|thumbnail! !Full-gc-native-mem-cleanup.png|thumbnail!
[jira] [Comment Edited] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905946#comment-15905946 ]

Prasanth Jayachandran edited comment on HIVE-16180 at 3/11/17 12:51 AM:
------------------------------------------------------------------------

Attaching snapshots from the analysis.
1) Native memory spike happened at around 11:00:00PM (configured offheap cache size is 48GB). At this point memory used by direct byte buffer spiked to 51.7GB. !Native-mem-spike.png|thumbnail!
2) At around same time Full GC triggered and reclaimed 15GB of memory (12GB heap + 3 GB offheap) !FullGC-15GB-cleanup.png|thumbnail! !Full-gc-native-mem-cleanup.png|thumbnail!

was (Author: prasanth_j):
Attaching snapshots from the analysis.
1) Native memory spike happened at around 11:00:00PM (configured offheap cache size is 48GB). At this point memory used by direct byte buffer spiked to 51.7GB. !Native-mem-spike.png!
2) At around same time Full GC triggered and reclaimed 15GB of memory (12GB heap + 3 GB offheap) !FullGC-15GB-cleanup.png! !Full-gc-native-mem-cleanup.png!
[jira] [Commented] (HIVE-16158) Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
[ https://issues.apache.org/jira/browse/HIVE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905947#comment-15905947 ]

Lefty Leverenz commented on HIVE-16158:
---------------------------------------

> Should I resolve the ticket?

Yes please. Thanks for bringing this up.

> Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-16158
>                 URL: https://issues.apache.org/jira/browse/HIVE-16158
>             Project: Hive
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.0.0
>            Reporter: Illya Yalovyy
>            Assignee: Lefty Leverenz
>
> Current documentation says that the keyword CASCADE was introduced in the Hive 0.15 release. That information is incorrect and confuses users. The feature was actually released in Hive 1.1.0 (HIVE-8839).
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16180: - Description: Observed this in internal test run. There is a native memory leak in Orc EncodedReaderImpl that can cause YARN pmem monitor to kill the container running the daemon. Direct byte buffers are null'ed out which is not guaranteed to be cleaned until next Full GC. To show this issue, attaching a small test program that allocations 3x256MB direct byte buffers. First buffer is null'ed out but still native memory is used. Second buffer user Cleaner to clean up native allocation. Third buffer is also null'ed but this time invoking a System.gc() which cleans up all native memory. Output from the test program is below {code} Allocating 3x256MB direct memory.. Native memory used: 786432000 Native memory used after data1=null: 786432000 Native memory used after data2.clean(): 524288000 Native memory used after data3=null: 524288000 Native memory used without gc: 524288000 Native memory used after gc: 0 {code} Longer term improvements/solutions: 1) Use DirectBufferPool from hadoop or netty's https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as direct byte buffer allocations are expensive (System.gc() + 100ms thread sleep). 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 was: Observed this in internal test run. There is a native memory leak in Orc EncodedReaderImpl that can cause YARN pmem monitor to kill the container running the daemon. Direct byte buffers are null'ed out which is not guaranteed to be cleaned until next Full GC. To show this take issue, attaching a small test program that allocations 3x256MB direct byte buffers. First buffer is null'ed out but still native memory is used. Second buffer user Cleaner to clean up native allocation. Third buffer is also null'ed but this time invoking a System.gc() which cleans up all native memory. 
Output from the test program is below {code} Allocating 3x256MB direct memory.. Native memory used: 786432000 Native memory used after data1=null: 786432000 Native memory used after data2.clean(): 524288000 Native memory used after data3=null: 524288000 Native memory used without gc: 524288000 Native memory used after gc: 0 {code} Longer term improvements/solutions: 1) Use DirectBufferPool from hadoop or netty's https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as direct byte buffer allocations are expensive (System.gc() + 100ms thread sleep). 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, HIVE-16180.1.patch > > > Observed this in internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out which is not > guaranteed to be cleaned until next Full GC. To show this issue, attaching a > small test program that allocations 3x256MB direct byte buffers. First buffer > is null'ed out but still native memory is used. Second buffer user Cleaner to > clean up native allocation. Third buffer is also null'ed but this time > invoking a System.gc() which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. 
> Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16180: - Description: Observed this in internal test run. There is a native memory leak in Orc EncodedReaderImpl that can cause YARN pmem monitor to kill the container running the daemon. Direct byte buffers are null'ed out which is not guaranteed to be cleaned until next Full GC. To show this issue, attaching a small test program that allocates 3x256MB direct byte buffers. First buffer is null'ed out but still native memory is used. Second buffer user Cleaner to clean up native allocation. Third buffer is also null'ed but this time invoking a System.gc() which cleans up all native memory. Output from the test program is below {code} Allocating 3x256MB direct memory.. Native memory used: 786432000 Native memory used after data1=null: 786432000 Native memory used after data2.clean(): 524288000 Native memory used after data3=null: 524288000 Native memory used without gc: 524288000 Native memory used after gc: 0 {code} Longer term improvements/solutions: 1) Use DirectBufferPool from hadoop or netty's https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as direct byte buffer allocations are expensive (System.gc() + 100ms thread sleep). 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 was: Observed this in internal test run. There is a native memory leak in Orc EncodedReaderImpl that can cause YARN pmem monitor to kill the container running the daemon. Direct byte buffers are null'ed out which is not guaranteed to be cleaned until next Full GC. To show this issue, attaching a small test program that allocations 3x256MB direct byte buffers. First buffer is null'ed out but still native memory is used. Second buffer user Cleaner to clean up native allocation. Third buffer is also null'ed but this time invoking a System.gc() which cleans up all native memory. 
Output from the test program is below {code} Allocating 3x256MB direct memory.. Native memory used: 786432000 Native memory used after data1=null: 786432000 Native memory used after data2.clean(): 524288000 Native memory used after data3=null: 524288000 Native memory used without gc: 524288000 Native memory used after gc: 0 {code} Longer term improvements/solutions: 1) Use DirectBufferPool from hadoop or netty's https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as direct byte buffer allocations are expensive (System.gc() + 100ms thread sleep). 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, HIVE-16180.1.patch > > > Observed this in internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out which is not > guaranteed to be cleaned until next Full GC. To show this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. First buffer > is null'ed out but still native memory is used. Second buffer user Cleaner to > clean up native allocation. Third buffer is also null'ed but this time > invoking a System.gc() which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. 
> Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16180: - Attachment: DirectCleaner.java > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, HIVE-16180.1.patch > > > Observed this in internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out which is not > guaranteed to be cleaned until next Full GC. To show this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. First buffer > is null'ed out but still native memory is used. Second buffer user Cleaner to > clean up native allocation. Third buffer is also null'ed but this time > invoking a System.gc() which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16180: - Attachment: HIVE-16180.1.patch > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-16180.1.patch > > > Observed this in an internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause the YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out, which is not > guaranteed to be cleaned until the next Full GC. To show this issue, > attaching a small test program that allocates 3x256MB direct byte buffers. > The first buffer is null'ed out but native memory is still used. The second buffer > uses a Cleaner to clean up its native allocation. The third buffer is also null'ed, but > this time a System.gc() is invoked, which cleans up all native memory. Output > from the test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer-term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-16180: > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > > Observed this in an internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause the YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out, which is not > guaranteed to be cleaned until the next Full GC. To show this issue, > attaching a small test program that allocates 3x256MB direct byte buffers. > The first buffer is null'ed out but native memory is still used. The second buffer > uses a Cleaner to clean up its native allocation. The third buffer is also null'ed, but > this time a System.gc() is invoked, which cleans up all native memory. Output > from the test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer-term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905918#comment-15905918 ] Eugene Koifman edited comment on HIVE-16177 at 3/11/17 12:37 AM: - We currently only allow converting a bucketed ORC table to Acid. Some possibilities: Check how splits are done for "isOriginal" files. If RecordReader.getRowNumber() is not smart enough to produce an ordinal from the beginning of the file, then we must be creating 1 split per bucket. If getRowNumber() is smart enough to produce a number from the beginning of the file, we may be splitting each file. Either way, OriginalReaderPair could look at which copy_N it has and look for all copy_M files with M < N. > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > 
{noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
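The bucket-numbering problem above comes down to the file naming scheme: a base bucket file and its _copy_N duplicates belong to the same bucket, so OriginalReaderPair would need to offset row ids by the rows in all earlier copies. A hypothetical helper for the first step, extracting the bucket and copy number from such names, might look like the sketch below. The class and method names are illustrative, not the actual OrcRawRecordMerger code, and the pattern assumes the usual zero-padded names such as 000001_0 and 000001_0_copy_1:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketFileName {
    // Matches names like 000001_0 and 000001_0_copy_2; group 1 is the
    // bucket id, optional group 3 is the copy number.
    private static final Pattern ORIGINAL =
        Pattern.compile("^(\\d+)_(\\d+)(?:_copy_(\\d+))?$");

    public final int bucket;
    public final int copyNumber; // 0 for the base file, N for _copy_N

    private BucketFileName(int bucket, int copyNumber) {
        this.bucket = bucket;
        this.copyNumber = copyNumber;
    }

    // Returns null for names that do not follow the pre-acid
    // ("original") bucket file naming scheme.
    public static BucketFileName parse(String fileName) {
        Matcher m = ORIGINAL.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        int bucket = Integer.parseInt(m.group(1));
        int copy = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new BucketFileName(bucket, copy);
    }
}
```

With this, "look for all copy_M files with M < N" becomes a filter on parse results sharing the same bucket id with a smaller copyNumber.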
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905932#comment-15905932 ] Hive QA commented on HIVE-16104: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857429/HIVE-16104.04.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10341 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4079/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4079/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4079/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857429 - PreCommit-HIVE-Build > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.03.patch, HIVE-16104.04.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16155) No need for ConditionalTask if no conditional map join is created
[ https://issues.apache.org/jira/browse/HIVE-16155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905923#comment-15905923 ] Xuefu Zhang commented on HIVE-16155: +1 > No need for ConditionalTask if no conditional map join is created > - > > Key: HIVE-16155 > URL: https://issues.apache.org/jira/browse/HIVE-16155 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Attachments: HIVE-16155.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16156: --- Attachment: HIVE-16156.2.patch > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133) > at >
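As the issue title says, the fix is to make the commit rename idempotent: if a previously killed attempt already renamed its temp output to the final path, a retrying attempt deletes the stale target before renaming its own output. A minimal sketch of that pattern using java.nio.file is below; the actual patch operates on Hadoop's FileSystem API against HDFS, which is not shown here, and the selfCheck staging names are illustrative:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class IdempotentCommit {

    // Moves tmp to target, first removing any output left behind by a
    // previously killed task attempt that had already committed.
    public static void commit(Path tmp, Path target) {
        try {
            Files.deleteIfExists(target); // clear stale output, if any
            Files.move(tmp, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-check: stages a temp dir, simulates a retry over a stale
    // target, and returns the final content of the target file.
    public static String selfCheck() {
        try {
            Path dir = Files.createTempDirectory("hive16156");
            Path tmp = dir.resolve("_tmp.000306_0");
            Path target = dir.resolve("000306_0");
            Files.write(tmp, "attempt2".getBytes());
            Files.write(target, "attempt1-stale".getBytes());
            commit(tmp, target);
            return new String(Files.readAllBytes(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The delete-then-rename is safe here because at most one attempt is allowed to commit a given task's output; the stale file can only be a leftover from an attempt that is already dead.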
[jira] [Commented] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905918#comment-15905918 ] Eugene Koifman commented on HIVE-16177: --- We currently only allow converting a bucketed ORC table to Acid. Some possibilities: Check how splits are done for "isOriginal" files. If RecordReader.getRowNumber() is not smart enough to produce an ordinal from the beginning of the file, then we must be creating 1 split per bucket. If getRowNumber() is smart enough to produce a number from the beginning of the file, we may be splitting each file. Either way, OriginalReaderPair could look at which copy_N it has and look for all copy_M files with M < N. > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > 
[~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15407) add distcp to classpath by default, because hive depends on it.
[ https://issues.apache.org/jira/browse/HIVE-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15407: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the patch! > add distcp to classpath by default, because hive depends on it. > > > Key: HIVE-15407 > URL: https://issues.apache.org/jira/browse/HIVE-15407 > Project: Hive > Issue Type: Bug > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Fei Hui >Assignee: Fei Hui > Fix For: 2.2.0 > > Attachments: HIVE-15407.1.patch, HIVE-15407.2.patch > > > When I run hive queries, I get errors as follows: > java.lang.NoClassDefFoundError: org/apache/hadoop/tools/DistCpOptions > ... > I dug into the code and found that hive depends on distcp, but distcp is not on the > classpath by default. > I considered adding distcp to the hadoop classpath by default in the hadoop project, > but the hadoop committers will not do that (see the discussion in HADOOP-13865). They > propose resolving this problem in Hive, > so this patch adds distcp to the classpath in Hive. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905909#comment-15905909 ] Sergey Shelukhin commented on HIVE-16177: - Bucket handling in Hive in general is completely screwed, and inconsistent in different places (e.g. sample and IIRC some other code would just take files in order, regardless of names, and if there are fewer or more files than needed). Maybe there needs to be some work to enforce it better via some central utility or manager class that would get all files for a bucket and validate buckets more strictly. > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? 
> attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
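The "central utility or manager class" suggested above would gather all files for a table and validate buckets strictly, instead of each call site taking files in order. A hypothetical sketch of such a validator (names and the exact checks are assumptions, not existing Hive code) could be as simple as grouping file names by bucket id and reporting anomalies:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketValidator {
    // Pre-acid bucket file names: <bucket>_<suffix>, optionally _copy_<n>.
    private static final Pattern NAME =
        Pattern.compile("^(\\d+)_\\d+(?:_copy_(\\d+))?$");

    // Groups file names by bucket id and reports buckets with no file
    // at all, plus names that don't follow the scheme. An empty result
    // means the layout passed validation.
    public static List<String> validate(List<String> fileNames, int numBuckets) {
        Map<Integer, Integer> filesPerBucket = new HashMap<>();
        List<String> problems = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                problems.add("unrecognized file name: " + name);
                continue;
            }
            int bucket = Integer.parseInt(m.group(1));
            filesPerBucket.merge(bucket, 1, Integer::sum);
        }
        for (int b = 0; b < numBuckets; b++) {
            if (!filesPerBucket.containsKey(b)) {
                problems.add("no file for bucket " + b);
            }
        }
        return problems;
    }
}
```

Routing sampling, acid conversion, and split generation through one such check would at least make the inconsistencies Sergey describes fail loudly instead of silently mis-assigning files to buckets.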
[jira] [Commented] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905902#comment-15905902 ] Kiran Kumar Kolli commented on HIVE-15947: -- +1 CR comments can be found in the review board > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug >Reporter: Subramanyam Pattipaka >Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.3.patch, > HIVE-15947.4.patch, HIVE-15947.6.patch, HIVE-15947.7.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation > requests. It simply accepts and tries to run all operations. If a large number > of concurrent job submit requests comes in, the time to submit job > operations can increase significantly. Templeton uses HDFS to store staging > files for jobs. If HDFS can't respond to a large number of requests and > throttles, then job submission can take a very long time, on the order of > minutes. > This behavior may not be suitable for all applications; client > applications may be looking for predictable, low response times for successful > requests, or a throttle response telling the client to wait for some time before > re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new Job > 2) Get Job Status > 3) List jobs > These three operations have different complexity due to variance in their use of > cluster resources like YARN/HDFS. > The idea is to introduce a new config templeton.job.submit.exec.max-procs > which controls the maximum number of concurrent active job submissions within > Templeton, and use this config to provide better response times. 
If a new job > submission request sees that there are already > templeton.job.submit.exec.max-procs jobs getting submitted concurrently, then > the request will fail with HTTP error 503 and the reason >“Too many concurrent job submission requests received. Please wait for > some time before retrying.” > > The client is expected to catch this response and retry after waiting for > some time. The default value for the config > templeton.job.submit.exec.max-procs is ‘0’. This means that by default job > submission requests are always accepted; the behavior needs to be enabled > based on requirements. > We can have similar behavior for Status and List operations with the configs > templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs > respectively. > Once a job operation is started, the operation can take a long time, and the > client which requested the job operation may not be willing to wait for an > indefinite amount of time. This work introduces the configurations > templeton.exec.job.submit.timeout > templeton.exec.job.status.timeout > templeton.exec.job.list.timeout > to specify the maximum amount of time a job operation can execute. If a timeout > happens, then list and status requests return to the client with the message > "List job request got timed out. Please retry the operation after waiting for > some time." > If a submit job request times out, then > i) the thread handling the request checks whether a valid job id has been > generated for the request; > ii) if it has, a kill job request is issued on the cancel thread > pool. The operation is not waited on; the client receives a timeout > message. > Side effects of enabling timeouts for submit operations: > 1) The job may stay active for some time after the client > gets the response, and a list operation from the client could potentially show the newly > created job before it gets killed. > 2) Killing the job is best effort, with no guarantees. 
This means there is a > possibility of duplicate job created. One possible reason for this could be a > case where job is created and then operation timed out but kill request > failed due to resource manager unavailability. When resource manager > restarts, it will restarts the job which got created. > Fixing this scenario is not part of the scope of this JIRA. The job operation > functionality can be enabled only if above side effects are acceptable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
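The max-procs throttling scheme described in HIVE-15947 above can be sketched with a counting semaphore. This is a hypothetical illustration, not the actual Templeton code (the class and method names are invented): a bounded pool of submission slots, an immediate HTTP 503 when no slot is free, and 0 meaning unlimited.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the templeton.job.submit.exec.max-procs idea:
// cap concurrent submissions with a semaphore; reject the excess with 503.
public class SubmitThrottle {
    static final int HTTP_OK = 200;
    static final int HTTP_BUSY = 503; // "Too many concurrent job submission requests received."

    private final Semaphore slots;   // null when maxProcs == 0 (throttling disabled, the default)

    public SubmitThrottle(int maxProcs) {
        this.slots = maxProcs > 0 ? new Semaphore(maxProcs) : null;
    }

    /** Runs the submission if a slot is free; otherwise returns 503 immediately. */
    public int submit(Runnable jobSubmission) {
        if (slots == null) {         // unlimited: always accept
            jobSubmission.run();
            return HTTP_OK;
        }
        if (!slots.tryAcquire()) {   // non-blocking: no waiting in the request thread
            return HTTP_BUSY;        // client should wait and retry
        }
        try {
            jobSubmission.run();
            return HTTP_OK;
        } finally {
            slots.release();
        }
    }
}
```

A client receiving 503 is expected to back off and retry, as the issue description says.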
[jira] [Updated] (HIVE-16178) corr/covar_samp UDAF standard compliance
[ https://issues.apache.org/jira/browse/HIVE-16178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-16178: Description: h3. corr the standard defines corner cases when it should return null - but the current result is NaN. If N * SUMX2 equals SUMX * SUMX , then the result is the null value. and If N * SUMY2 equals SUMY * SUMY , then the result is the null value. h3. covar_samp returns 0 instead of the null value: `If N is 1 (one), then the result is the null value.` h3. check (x,y) vs (y,x) args in docs the standard uses (y,x) order; and some of the function names also contain X and Y... so the order does matter. Currently at least corr uses (x,y) order, which is okay because it's symmetric; but it would be great to have the same order everywhere (check the others) was: h3. corr the standard defines corner cases when it should return null - but the current result is NaN. If N * SUMX2 equals SUMX * SUMX , then the result is the null value. and If N * SUMY2 equals SUMY * SUMY , then the result is the null value. h3. covar_samp returns 0 instead of the null value: `If N is 1 (one), then the result is the null value.` > corr/covar_samp UDAF standard compliance > > > Key: HIVE-16178 > URL: https://issues.apache.org/jira/browse/HIVE-16178 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Zoltan Haindrich >Priority: Minor > > h3. corr > the standard defines corner cases when it should return null - but the > current result is NaN. > If N * SUMX2 equals SUMX * SUMX , then the result is the null value. > and > If N * SUMY2 equals SUMY * SUMY , then the result is the null value. > h3. covar_samp > returns 0 instead of the null value: > `If N is 1 (one), then the result is the null value.` > h3. 
check (x,y) vs (y,x) args in docs > the standard uses (y,x) order; and some of the function names also > contain X and Y... so the order does matter. Currently at least corr uses > (x,y) order, which is okay because it's symmetric; but it would be great to > have the same order everywhere (check the others) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
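The corr corner case in HIVE-16178 comes down to floating-point arithmetic: when N * SUMX2 equals SUMX * SUMX, the variance term is zero, the denominator of the correlation becomes 0, and IEEE double division yields NaN, whereas the standard asks for null. A minimal sketch of both behaviors (not Hive's actual UDAF code; the boxed Double stands in for SQL null):

```java
// Naive sample correlation vs. a standard-compliant variant that returns
// null (boxed Double) when either variance term is zero.
public class CorrSketch {
    /** Naive correlation: zero-variance input gives 0/0 == NaN. */
    public static double corrNaive(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
        }
        return (n * sxy - sx * sy)
            / (Math.sqrt(n * sxx - sx * sx) * Math.sqrt(n * syy - sy * sy));
    }

    /** Standard-compliant variant: null when N*SUMX2 == SUMX*SUMX (or the Y analogue). */
    public static Double corrStandard(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; syy += y[i] * y[i];
        }
        if (n * sxx == sx * sx || n * syy == sy * sy) {
            return null; // the corner case the standard defines
        }
        return corrNaive(x, y);
    }
}
```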
[jira] [Updated] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16156: --- Attachment: HIVE-16156.2.patch > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming > the temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, however rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133) > at >
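The failure mode in HIVE-16156 - a retried task attempt finding the output target left behind by a killed attempt - can be reproduced in miniature on a local filesystem with java.nio.file, where a plain move likewise fails if the destination exists (HDFS rename behaves the same way). A sketch of the proposed fix, deleting/replacing the existing target before renaming; this is illustrative, not the actual FileSinkOperator code:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Models the commit-time retry race: a stale target exists, a plain move
// fails with FileAlreadyExistsException, and replacing the target succeeds.
public class CommitRename {
    public static String demo() {
        try {
            Path dir = Files.createTempDirectory("commit");
            Path tmp = Files.writeString(dir.resolve("_tmp.000306_0"), "attempt-2");
            Path fin = Files.writeString(dir.resolve("000306_0"), "attempt-1"); // left by a killed attempt

            boolean plainMoveFailed = false;
            try {
                Files.move(tmp, fin); // no REPLACE_EXISTING: same failure mode as the report
            } catch (FileAlreadyExistsException e) {
                plainMoveFailed = true;
            }

            // The fix: replace (equivalently, delete then rename) the stale target.
            Files.move(tmp, fin, StandardCopyOption.REPLACE_EXISTING);
            return plainMoveFailed && "attempt-2".equals(Files.readString(fin)) ? "ok" : "unexpected";
        } catch (IOException e) {
            return "io-error";
        }
    }
}
```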
[jira] [Commented] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905884#comment-15905884 ] Hive QA commented on HIVE-15947: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857414/HIVE-15947.7.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10353 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery (batchId=219) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4078/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4078/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4078/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857414 - PreCommit-HIVE-Build > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug >Reporter: Subramanyam Pattipaka >Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.3.patch, > HIVE-15947.4.patch, HIVE-15947.6.patch, HIVE-15947.7.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation > requests. It simply accepts and tries to run all operations. If many > concurrent job submit requests arrive, the time to submit job > operations can increase significantly. Templeton uses HDFS to store the staging > file for a job. 
If HDFS storage can't respond to a large number of requests and > throttles, then job submission can take a very long time, on the order of > minutes. > This behavior may not be suitable for all applications. Client > applications may be looking for predictable and low response times for a successful > request, or for a throttle response telling the client to wait for some time before > re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new Job > 2) Get Job Status > 3) List jobs > These three operations have different complexity due to variance in their use of > cluster resources like YARN/HDFS. > The idea is to introduce a new config, templeton.job.submit.exec.max-procs, > which controls the maximum number of concurrent active job submissions within > Templeton, and use this config to provide better response times. If a new job > submission request sees that there are already > templeton.job.submit.exec.max-procs jobs being submitted concurrently, then > the request will fail with HTTP error 503 with reason >“Too many concurrent job submission requests received. Please wait for > some time before retrying.” > > The client is expected to catch this response and retry after waiting for > some time. The default value for the config > templeton.job.submit.exec.max-procs is ‘0’, which means that by default job > submission requests are always accepted. The behavior needs to be enabled > based on requirements. > We can have similar behavior for the Status and List operations with the configs > templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs > respectively. > Once a job operation is started, the operation can take a long time, and the > client which requested the job operation may not wait an > indefinite amount of time. This work introduces the configurations > templeton.exec.job.submit.timeout > templeton.exec.job.status.timeout > templeton.exec.job.list.timeout > to specify the maximum amount of time a job operation can execute. 
If a timeout > happens, then list and status job requests return to the client with the message > "List job request got timed out. Please retry the operation after waiting for > some time." > If a submit job request gets timed out, then > i) The job submit request thread which receives the timeout will check whether a > valid job id was generated for the job request. > ii) If it was generated, it issues a kill job request on the cancel thread > pool, does not wait for the operation to complete, and returns to the client with the > timeout message. > Side effects of enabling timeouts for submit operations: > 1) There is a possibility of having an active job for some time before the client > gets the response, and a list operation from the client could potentially show the newly > created job before it gets killed. > 2) We make a best effort to kill the job, with no guarantees. This means there is a >
[jira] [Updated] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16133: Attachment: HIVE-16133.04.patch Apparently both can happen; adding ByteBuffer duplicate() calls (duplicating the buffer object, not the underlying data). It is possible, although unlikely, that the footer ByteBuffer is manipulated after we cache it; and on get, the footer deserialization definitely manipulates the buffer state, causing issues for other users. > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.04.patch, > HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
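The corruption suspected in HIVE-16133 is consistent with ByteBuffer's mutable read state: position and limit live on the buffer object itself, so readers sharing one cached buffer clobber each other, while duplicate() gives each reader independent position/limit over the same bytes without copying the data. A small sketch with invented names:

```java
import java.nio.ByteBuffer;

// Two ways to read a cached footer buffer: sharing the cached object
// consumes its position (breaking the next reader); duplicate() does not.
public class FooterCacheSketch {
    static int readAllSharing(ByteBuffer cached) {
        int n = 0;
        while (cached.hasRemaining()) { cached.get(); n++; }
        return n; // advanced the shared buffer's position!
    }

    static int readAllSafely(ByteBuffer cached) {
        ByteBuffer mine = cached.duplicate(); // independent position/limit, same data
        int n = 0;
        while (mine.hasRemaining()) { mine.get(); n++; }
        return n;
    }
}
```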
[jira] [Commented] (HIVE-14514) OrcRecordUpdater should clone writerOptions when creating delete event writers
[ https://issues.apache.org/jira/browse/HIVE-14514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905874#comment-15905874 ] Eugene Koifman commented on HIVE-14514: --- look for HIVE-14514 in OrcRecordUpdater > OrcRecordUpdater should clone writerOptions when creating delete event writers > -- > > Key: HIVE-14514 > URL: https://issues.apache.org/jira/browse/HIVE-14514 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Saket Saurabh >Assignee: Eugene Koifman >Priority: Minor > > When split-update is enabled for ACID, OrcRecordUpdater creates two sets of > writers: one for the insert deltas and one for the delete deltas. The > deleteEventWriter is initialized with similar writerOptions to the normal > writer, except that it has a different callback handler. Due to the lack of a > copy constructor/clone() method in writerOptions, the same writerOptions > object is mutated to specify a different callback for the delete case. > Although this is harmless for now, it may become a source of confusion > and possible errors in the future. The ideal way to fix this would be to create a > clone() method for writerOptions; however, this requires that the parent class, > OrcFile.WriterOptions, implement Cloneable or > provide a copy constructor. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
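The aliasing hazard described in HIVE-14514 is easy to demonstrate with a minimal builder-style options class (invented for illustration; not the real OrcFile.WriterOptions API): mutating the shared object to set the delete writer's callback silently changes the insert writer's callback too, while a copy constructor keeps the two independent.

```java
// Demonstrates why mutating one shared options object for two writers is
// hazardous, and how a copy constructor fixes it. All names are hypothetical.
public class OptionsAliasing {
    static class Options {
        String callback;
        Options(String callback) { this.callback = callback; }
        Options(Options other) { this.callback = other.callback; } // the missing copy/clone
        Options callback(String cb) { this.callback = cb; return this; } // mutates in place
    }

    /** The current pattern: both "writers" end up holding the same object. */
    public static String[] sharedMutation() {
        Options insertOpts = new Options("insertCallback");
        Options deleteOpts = insertOpts.callback("deleteCallback"); // same object!
        return new String[] { insertOpts.callback, deleteOpts.callback };
    }

    /** The proposed fix: copy before customizing the delete writer. */
    public static String[] copiedOptions() {
        Options insertOpts = new Options("insertCallback");
        Options deleteOpts = new Options(insertOpts).callback("deleteCallback");
        return new String[] { insertOpts.callback, deleteOpts.callback };
    }
}
```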
[jira] [Commented] (HIVE-16158) Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
[ https://issues.apache.org/jira/browse/HIVE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905881#comment-15905881 ] Illya Yalovyy commented on HIVE-16158: -- Thank you [~leftylev]! All changes look good. Should I resolve the ticket? > Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE > -- > > Key: HIVE-16158 > URL: https://issues.apache.org/jira/browse/HIVE-16158 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Illya Yalovyy >Assignee: Lefty Leverenz > > Current documentation says that the keyword CASCADE was introduced in the Hive 0.15 > release. That information is incorrect and confuses users. The feature was > actually released in Hive 1.1.0. (HIVE-8839) > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15981: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Zoltan! > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15981.1.patch, HIVE-15981.2.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15979) Support character_length and octet_length
[ https://issues.apache.org/jira/browse/HIVE-15979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15979: Status: Open (was: Patch Available) Test failures look related. > Support character_length and octet_length > - > > Key: HIVE-15979 > URL: https://issues.apache.org/jira/browse/HIVE-15979 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Teddy Choi > Attachments: HIVE-15979.1.patch, HIVE-15979.2.patch, > HIVE-15979.3.patch, HIVE-15979.4.patch, HIVE-15979.5.patch > > > SQL defines standard ways to get number of characters and octets. SQL > reference: section 6.28. Example: > vagrant=# select character_length('欲速则不达'); > character_length > -- > 5 > (1 row) > vagrant=# select octet_length('欲速则不达'); > octet_length > -- >15 > (1 row) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
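The two lengths in the HIVE-15979 example can be computed directly in Java: character_length corresponds to counting Unicode code points, and octet_length to the byte length of the UTF-8 encoding. This is a sketch of the semantics only, not Hive's UDF implementation:

```java
import java.nio.charset.StandardCharsets;

// character_length: Unicode code points; octet_length: UTF-8 bytes.
public class LengthFunctions {
    public static int characterLength(String s) {
        return s.codePointCount(0, s.length()); // counts code points, not UTF-16 units
    }

    public static int octetLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For the string in the issue, each of the five CJK characters occupies three bytes in UTF-8, giving 5 and 15 respectively.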
[jira] [Commented] (HIVE-16179) HoS tasks may fail due to ArrayIndexOutOfBoundException in BinarySortableSerDe
[ https://issues.apache.org/jira/browse/HIVE-16179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905872#comment-15905872 ] Xuefu Zhang commented on HIVE-16179: Actually the issue is fixed in HIVE-12768. > HoS tasks may fail due to ArrayIndexOutOfBoundException in BinarySortableSerDe > -- > > Key: HIVE-16179 > URL: https://issues.apache.org/jira/browse/HIVE-16179 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > Stacktrace: > {code} > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:339) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:54) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > 
org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:311) > ... 16 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:413) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:190) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:309) > ... 16 more > {code} > It seems to be a synchronization issue in BinarySortableSerDe. -- This message was sent by Atlassian JIRA
[jira] [Resolved] (HIVE-16179) HoS tasks may fail due to ArrayIndexOutOfBoundException in BinarySortableSerDe
[ https://issues.apache.org/jira/browse/HIVE-16179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-16179. Resolution: Duplicate > HoS tasks may fail due to ArrayIndexOutOfBoundException in BinarySortableSerDe > -- > > Key: HIVE-16179 > URL: https://issues.apache.org/jira/browse/HIVE-16179 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > Stacktrace: > {code} > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:339) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:54) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > 
org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:311) > ... 16 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:413) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:190) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:309) > ... 16 more > {code} > It seems to be a synchronization issue in BinarySortableSerDe. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905869#comment-15905869 ] Sergey Shelukhin commented on HIVE-16133: - Still can't repro, but I can see errors in the logs of the last run. Apparently sometimes the footer in the cache gets corrupted when it's stored as a buffer. Not sure how it happens; probably the buffer object, or data, is reused somewhere and needs to be copied. > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905865#comment-15905865 ] Sergey Shelukhin commented on HIVE-16156: - +1 pending tests; file status can be retrieved only if rename fails; can be changed on commit > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.patch > > > If a task get killed (for whatever a reason) after it completes the renaming > the temp output to final output during commit, subsequent task attempts will > fail when renaming because of the existence of the target output. This can > happen, however rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) > at >
[jira] [Updated] (HIVE-16178) corr/covar_samp UDAF standard compliance
[ https://issues.apache.org/jira/browse/HIVE-16178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-16178: Summary: corr/covar_samp UDAF standard compliance (was: corr UDAF standard compliance) > corr/covar_samp UDAF standard compliance > > > Key: HIVE-16178 > URL: https://issues.apache.org/jira/browse/HIVE-16178 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Zoltan Haindrich >Priority: Minor > > h3. corr > the standard defines corner cases in which it should return null, but the > current result is NaN. > If N * SUMX2 equals SUMX * SUMX , then the result is the null value. > and > If N * SUMY2 equals SUMY * SUMY , then the result is the null value. > h3. covar_samp > returns 0 instead of null when N is 1 > `If N is 1 (one), then the result is the null value.` -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16178) corr UDAF standard compliance
[ https://issues.apache.org/jira/browse/HIVE-16178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-16178: Description: h3. corr the standard defines corner cases in which it should return null, but the current result is NaN. If N * SUMX2 equals SUMX * SUMX , then the result is the null value. and If N * SUMY2 equals SUMY * SUMY , then the result is the null value. h3. covar_samp returns 0 instead of null when N is 1 `If N is 1 (one), then the result is the null value.` was: the standard defines corner cases when it should return null - but the current result is NaN. If N * SUMX2 equals SUMX * SUMX , then the result is the null value. and If N * SUMY2 equals SUMY * SUMY , then the result is the null value. > corr UDAF standard compliance > - > > Key: HIVE-16178 > URL: https://issues.apache.org/jira/browse/HIVE-16178 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Zoltan Haindrich >Priority: Minor > > h3. corr > the standard defines corner cases in which it should return null, but the > current result is NaN. > If N * SUMX2 equals SUMX * SUMX , then the result is the null value. > and > If N * SUMY2 equals SUMY * SUMY , then the result is the null value. > h3. covar_samp > returns 0 instead of null when N is 1 > `If N is 1 (one), then the result is the null value.` -- This message was sent by Atlassian JIRA (v6.3.15#6346)
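The corr corner case above can be sketched as follows; the class and method names are illustrative only and do not mirror Hive's actual GenericUDAFCorrelation implementation.

```java
// Illustrative sketch of the standard's corner cases for corr(); names are
// hypothetical, not Hive's actual GenericUDAFCorrelation implementation.
public class CorrSketch {
    /**
     * Sample correlation from running sums. Per the SQL standard, the result
     * is null when N * SUMX2 = SUMX * SUMX or N * SUMY2 = SUMY * SUMY --
     * exactly the inputs where the naive formula divides by zero and
     * yields NaN.
     */
    public static Double corr(long n, double sumX, double sumY,
                              double sumX2, double sumY2, double sumXY) {
        double varTermX = n * sumX2 - sumX * sumX;
        double varTermY = n * sumY2 - sumY * sumY;
        if (varTermX == 0.0 || varTermY == 0.0) {
            return null; // standard-mandated null, not NaN
        }
        return (n * sumXY - sumX * sumY) / Math.sqrt(varTermX * varTermY);
    }
}
```

With x constant (zero variance) the naive formula divides by zero and produces NaN; the standard instead calls for the null value.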
[jira] [Assigned] (HIVE-16179) HoS tasks may fail due to ArrayIndexOutOfBoundsException in BinarySortableSerDe
[ https://issues.apache.org/jira/browse/HIVE-16179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-16179: -- > HoS tasks may fail due to ArrayIndexOutOfBoundException in BinarySortableSerDe > -- > > Key: HIVE-16179 > URL: https://issues.apache.org/jira/browse/HIVE-16179 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > Stacktrace: > {code} > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:339) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:54) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at 
scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2004) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from > x1x100x101x97x51x49x50x97x102x45x97x98x56x52x45x52x102x52x53x45x56x49x101x99x45x49x99x100x98x55x97x51x52x100x49x49x55x0x1x128x0x0x0x0x0x0x19x1x128x0x0x0x0x0x0x3x1x128x0x66x179x1x192x244x45x90x1x85x98x101x114x0x1x76x111x115x32x65x110x103x101x108x101x115x0x1x2x128x0x0x2x50x51x57x51x0x1x192x55x238x20x122x225x71x174x1x128x0x0x0x87x240x169x195x1x50x48x49x54x45x49x48x45x48x49x32x50x51x58x51x49x58x51x49x0x1x117x98x101x114x88x0x255 > with properties > {columns=_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11, > > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=, > columns.types=string,bigint,bigint,date,int,varchar(50),varchar(255),decimal(12,2),double,bigint,string,varchar(255)} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:311) > ... 
16 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:413) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:190) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:309) > ... 16 more > {code} > It seems to be a synchronization issue in BinarySortableSerDe. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
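Since the closing comment points at a synchronization issue, one common defensive pattern for a stateful, non-thread-safe deserializer is one instance per thread via ThreadLocal. The Deserializer class below is a toy stand-in, not Hive's BinarySortableSerDe, and this is not necessarily the fix HIVE-16179 will take.

```java
import java.util.ArrayList;
import java.util.List;

// A stateful deserializer shared by several tasks can see its internal
// cursor mutated concurrently, which is one way to end up with an
// ArrayIndexOutOfBoundsException like the one above. Deserializer is a toy
// stand-in for a non-thread-safe serde, not Hive's BinarySortableSerDe.
public class PerThreadDeserializer {
    static class Deserializer {
        private int position; // mutable state: unsafe to share across threads

        List<Byte> deserialize(byte[] bytes) {
            List<Byte> fields = new ArrayList<>();
            for (position = 0; position < bytes.length; position++) {
                fields.add(bytes[position]);
            }
            return fields;
        }
    }

    // One private instance per thread: no shared mutable state, no locking.
    private static final ThreadLocal<Deserializer> SERDE =
            ThreadLocal.withInitial(Deserializer::new);

    public static List<Byte> deserialize(byte[] bytes) {
        return SERDE.get().deserialize(bytes);
    }
}
```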
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905856#comment-15905856 ] Sergey Shelukhin commented on HIVE-16104: - Updated the loop, the lock and tests > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.03.patch, HIVE-16104.04.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16104: Attachment: HIVE-16104.04.patch > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.03.patch, HIVE-16104.04.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905853#comment-15905853 ] Xuefu Zhang commented on HIVE-16156: Patch #1 incorporated [~sershe]'s comment above. > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming the > temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, though rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133) > at >
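The behavior the HIVE-16156 summary asks for (delete the already-committed target before renaming, so a retried attempt can succeed) can be sketched as follows, using java.nio.file as a stand-in for Hadoop's FileSystem API; the real change belongs in FileSinkOperator$FSPaths.commit().

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch: if an earlier, killed attempt already committed the target, remove
// it before renaming so a retry does not fail. java.nio.file stands in for
// Hadoop's FileSystem API here.
public class CommitSketch {
    public static void commit(Path tmpOutput, Path finalOutput) throws IOException {
        // A previous attempt may have left finalOutput behind.
        Files.deleteIfExists(finalOutput);
        Files.move(tmpOutput, finalOutput, StandardCopyOption.ATOMIC_MOVE);
    }
}
```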
[jira] [Updated] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
[ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16156: --- Attachment: HIVE-16156.1.patch > FileSinkOperator should delete existing output target when renaming > --- > > Key: HIVE-16156 > URL: https://issues.apache.org/jira/browse/HIVE-16156 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-16156.1.patch, HIVE-16156.patch > > > If a task gets killed (for whatever reason) after it completes renaming the > temp output to the final output during commit, subsequent task attempts will > fail when renaming because the target output already exists. This can > happen, though rarely. > {code} > Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to > rename output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in > stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): > java.lang.IllegalStateException: Hit error while closing operators - failing > tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at > org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 > to: > hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133) > at >
[jira] [Commented] (HIVE-16178) corr UDAF standard compliance
[ https://issues.apache.org/jira/browse/HIVE-16178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905852#comment-15905852 ] Zoltan Haindrich commented on HIVE-16178: - HIVE-15978 will contain a disabled testcase for this problem > corr UDAF standard compliance > - > > Key: HIVE-16178 > URL: https://issues.apache.org/jira/browse/HIVE-16178 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Zoltan Haindrich >Priority: Minor > > the standard defines corner cases in which it should return null, but the > current result is NaN. > If N * SUMX2 equals SUMX * SUMX , then the result is the null value. > and > If N * SUMY2 equals SUMY * SUMY , then the result is the null value. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16175) Possible race condition in InstanceCache
[ https://issues.apache.org/jira/browse/HIVE-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-16175: Fix Version/s: 2.2.0 > Possible race condition in InstanceCache > > > Key: HIVE-16175 > URL: https://issues.apache.org/jira/browse/HIVE-16175 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Chao Sun >Assignee: Chao Sun > Fix For: 2.2.0 > > Attachments: HIVE-16175.1.patch > > > Currently the [InstanceCache | > https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java] > class is not thread-safe, but it is sometimes used as a static variable, for > instance > [here|https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java#L114]. > This is an issue for HoS, where it can be accessed by > multiple threads at the same time. We found this sometimes causes an NPE: > {code} > ERROR : FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at > org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 20 in stage 0.0 failed 4 times, most recent failure: Lost task 20.3 in > stage 0.0 (TID 33, hadoopworker992-sjc1.prod.uber.internal): > java.lang.RuntimeException: Map operator initialization failed: > org.apache.hadoop.hive.ql.metadata.Hive > Exception: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:325) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:388) > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:92) > ... 16 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.supportedCategories(AvroObjectInspectorGenerator.java:142) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:91) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104) > at >
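One minimal thread-safe shape for a cache like the one described in HIVE-16175 is ConcurrentHashMap.computeIfAbsent. This is a sketch, not necessarily the approach the HIVE-16175 patch takes, and the class name is illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Minimal thread-safe cache sketch; illustrative, not Hive's InstanceCache
// or the actual HIVE-16175 patch.
public class SafeInstanceCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> maker;

    public SafeInstanceCache(Function<K, V> maker) {
        this.maker = maker;
    }

    // computeIfAbsent runs the maker at most once per key even under
    // concurrent calls, so no caller can observe a missing or half-built
    // entry -- a guarantee an unsynchronized get-then-put cannot make.
    public V retrieve(K key) {
        return cache.computeIfAbsent(key, maker);
    }
}
```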
[jira] [Updated] (HIVE-16175) Possible race condition in InstanceCache
[ https://issues.apache.org/jira/browse/HIVE-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-16175: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master. Thanks [~xuefuz] for the review! > Possible race condition in InstanceCache > > > Key: HIVE-16175 > URL: https://issues.apache.org/jira/browse/HIVE-16175 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-16175.1.patch > > > Currently the [InstanceCache | > https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java] > class is not thread-safe, but it is sometimes used as a static variable, for > instance > [here|https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java#L114]. > This is an issue for HoS, where it can be accessed by > multiple threads at the same time. We found this sometimes causes an NPE: > {code} > ERROR : FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at > org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 20 in stage 0.0 failed 4 times, most recent failure: Lost task 20.3 in > stage 0.0 (TID 33, hadoopworker992-sjc1.prod.uber.internal): > java.lang.RuntimeException: Map operator initialization failed: > org.apache.hadoop.hive.ql.metadata.Hive > Exception: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:325) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:388) > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:92) > ... 16 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.supportedCategories(AvroObjectInspectorGenerator.java:142) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:91) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104) > at >
[jira] [Commented] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905822#comment-15905822 ] Hive QA commented on HIVE-15981: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857390/HIVE-15981.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10337 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4077/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4077/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4077/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857390 - PreCommit-HIVE-Build > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15981.1.patch, HIVE-15981.2.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
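The equivalence HIVE-15981 asks for can be shown directly in SQL; `t` is a hypothetical table used only for illustration:

```sql
-- With HIVE-15981, an empty grouping set parses and behaves like no GROUP BY:
SELECT count(*) FROM t GROUP BY ();  -- previously a parse error in Hive
SELECT count(*) FROM t;              -- the equivalent query
```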
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16161: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Patch pushed to master. Thanks Tao, Vaibhav! > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
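The knob behind HIVE-16161 is the Maven Shade Plugin's `minimizeJar` switch. A sketch of the changed setting in jdbc/pom.xml follows; the surrounding plugin declaration is elided, so this is a fragment rather than the exact patch:

```xml
<!-- maven-shade-plugin configuration (fragment). minimizeJar=true strips
     classes that appear unused, which drops reflectively loaded ones such as
     org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl. -->
<configuration>
  <minimizeJar>false</minimizeJar>
</configuration>
```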
[jira] [Updated] (HIVE-14864) Distcp is not called from MoveTask when src is a directory
[ https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-14864: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Thanks [~stakiar]. I committed the patch to upstream. > Distcp is not called from MoveTask when src is a directory > -- > > Key: HIVE-14864 > URL: https://issues.apache.org/jira/browse/HIVE-14864 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Sahil Takiar > Fix For: 2.2.0 > > Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, > HIVE-14864.3.patch, HIVE-14864.4.patch, HIVE-14864.patch > > > In FileUtils.java the following code does not get executed even when src > directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because > srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We > should use srcFS.getContentSummary(src).getLength() instead. > {noformat} > /* Run distcp if source file/dir is too big */ > if (srcFS.getUri().getScheme().equals("hdfs") && > srcFS.getFileStatus(src).getLen() > > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) { > LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. > (MAX: " + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + > ")"); > LOG.info("Launch distributed copy (distcp) job."); > HiveConfUtil.updateJobCredentialProviders(conf); > copied = shims.runDistCp(src, dst, conf); > if (copied && deleteSource) { > srcFS.delete(src, true); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
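The description's point is that FileStatus#getLen() returns 0 for a directory, so the size check never triggers distcp for directory sources; the recursive content size (getContentSummary(src).getLength() in HDFS) is what should be compared. A sketch, with java.nio standing in for the Hadoop API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Sketch of the HIVE-14864 size check with java.nio standing in for HDFS.
public class CopySizeSketch {
    // Analogue of srcFS.getContentSummary(src).getLength(): total bytes in
    // the file or directory tree rooted at src. A plain per-entry length,
    // like getFileStatus(src).getLen(), would report 0 for a directory.
    public static long contentLength(Path src) throws IOException {
        try (Stream<Path> walk = Files.walk(src)) {
            return walk.filter(Files::isRegularFile)
                       .mapToLong(p -> p.toFile().length())
                       .sum();
        }
    }

    // Analogue of the HIVE_EXEC_COPYFILE_MAXSIZE comparison in FileUtils.
    public static boolean shouldDistcp(Path src, long maxSize) throws IOException {
        return contentLength(src) > maxSize;
    }
}
```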
[jira] [Commented] (HIVE-16175) Possible race condition in InstanceCache
[ https://issues.apache.org/jira/browse/HIVE-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905784#comment-15905784 ] Xuefu Zhang commented on HIVE-16175: +1 > Possible race condition in InstanceCache > > > Key: HIVE-16175 > URL: https://issues.apache.org/jira/browse/HIVE-16175 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers > Reporter: Chao Sun > Assignee: Chao Sun > Attachments: HIVE-16175.1.patch > > > Currently the [InstanceCache | https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java] class is not thread-safe, but it is sometimes used as a static variable, for instance [here|https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java#L114]. This is an issue for Hive on Spark (HoS), where it can be accessed by multiple threads at the same time. We found that this sometimes causes an NPE: > {code} > ERROR : FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at > org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 20 in stage 0.0 failed 4 times, most recent failure: Lost task 20.3 in > stage 0.0 (TID 33, hadoopworker992-sjc1.prod.uber.internal): > java.lang.RuntimeException: Map operator initialization failed: > org.apache.hadoop.hive.ql.metadata.Hive > Exception: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:325) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:388) > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:92) > ... 16 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.supportedCategories(AvroObjectInspectorGenerator.java:142) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:91) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104) > at > org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104) > at >
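For reference, the unsynchronized check-then-put pattern behind this race can be made safe with an atomic computeIfAbsent. A minimal sketch, with hypothetical class and method names rather than Hive's actual InstanceCache API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class SafeInstanceCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> maker;

    public SafeInstanceCache(Function<K, V> maker) {
        this.maker = maker;
    }

    public V retrieve(K key) {
        // computeIfAbsent is atomic per key: the factory runs at most once,
        // and no concurrent caller ever observes a half-published entry --
        // unlike an unsynchronized HashMap get()/put() pair.
        return cache.computeIfAbsent(key, maker);
    }

    public static void main(String[] args) {
        SafeInstanceCache<String, String> c =
            new SafeInstanceCache<>(k -> "instance-for-" + k);
        System.out.println(c.retrieve("record"));                   // instance-for-record
        System.out.println(c.retrieve("record") == c.retrieve("record")); // true: same instance
    }
}
```

Whether a concurrent map or simple synchronization on retrieve() is preferable depends on how expensive instance creation is; the key point is that the lookup and the insert must be one atomic step when the cache is shared statically across threads.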
[jira] [Updated] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramanyam Pattipaka updated HIVE-15947: - Attachment: HIVE-15947.7.patch New patch with minor changes. > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug > Reporter: Subramanyam Pattipaka > Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.3.patch, HIVE-15947.4.patch, HIVE-15947.6.patch, HIVE-15947.7.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation requests; it simply accepts and tries to run all operations. If a large number of concurrent job-submission requests arrives, the time to submit job operations can increase significantly. Templeton uses HDFS to store the staging files for jobs, and if HDFS can't keep up with the request volume and throttles, job submission can take a very long time, on the order of minutes. > This behavior may not be suitable for all applications: client applications may expect predictable, low response times for successful requests, or a throttle response telling them to wait for some time before re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new job > 2) Get job status > 3) List jobs > These three operations have different complexity due to variance in their use of cluster resources like YARN/HDFS. > The idea is to introduce a new config, templeton.job.submit.exec.max-procs, which controls the maximum number of concurrent active job submissions within Templeton, and to use this config to achieve better response times. 
If a new job submission request sees that there are already templeton.job.submit.exec.max-procs jobs being submitted concurrently, the request will fail with HTTP error 503 and the reason > “Too many concurrent job submission requests received. Please wait for some time before retrying.” > > The client is expected to catch this response and retry after waiting for some time. The default value of templeton.job.submit.exec.max-procs is ‘0’, which means that by default job submission requests are always accepted; the behavior needs to be enabled explicitly where required. > We can have similar behavior for the status and list operations with the configs templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs respectively. > Once a job operation has started it can take a long time, and the client which requested it may not be willing to wait an indefinite amount of time. This work introduces the configurations > templeton.exec.job.submit.timeout > templeton.exec.job.status.timeout > templeton.exec.job.list.timeout > to specify the maximum amount of time a job operation may execute. If a timeout happens, list and status requests return to the client with the message > "List job request got timed out. Please retry the operation after waiting for some time." > If a submit request times out, then > i) the submit-request thread that receives the timeout checks whether a valid job id was generated for the request; > ii) if it was, it issues a kill-job request on the cancel thread pool, does not wait for that operation to complete, and returns to the client with a timeout message. > Side effects of enabling timeouts for submit operations: > 1) The job may remain active for some time after the client gets its response, and a list operation from the client could potentially show the newly created job before it gets killed. > 2) We make a best effort to kill the job, with no guarantees. 
This means there is a possibility of a duplicate job being created. One possible cause is a case where the job is created and the operation times out, but the kill request fails due to resource manager unavailability; when the resource manager restarts, it will restart the job that was created. > Fixing this scenario is not in the scope of this JIRA. The timeout functionality should be enabled only if the above side effects are acceptable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
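The max-procs idea described above — reject rather than queue once N submissions are in flight — can be sketched with a semaphore. This is a hypothetical illustration, not Templeton's implementation; trySubmit stands in for the servlet handler and returns HTTP-style status codes:

```java
import java.util.concurrent.Semaphore;

public class SubmitThrottle {
    // Stand-in for templeton.job.submit.exec.max-procs (a real handler would
    // treat 0 as "unlimited" and skip the semaphore entirely).
    private final Semaphore slots;

    public SubmitThrottle(int maxProcs) {
        slots = new Semaphore(maxProcs);
    }

    // 200 if the submission may proceed, 503 if the concurrency cap is reached.
    public int trySubmit(Runnable job) {
        if (!slots.tryAcquire()) {
            return 503; // "Too many concurrent job submission requests received."
        }
        try {
            job.run();
            return 200;
        } finally {
            slots.release();
        }
    }

    public static void main(String[] args) {
        SubmitThrottle t = new SubmitThrottle(1);
        // A nested submission attempted while the only slot is held is rejected.
        int[] inner = new int[1];
        int outer = t.trySubmit(() -> inner[0] = t.trySubmit(() -> {}));
        System.out.println(outer);     // 200
        System.out.println(inner[0]);  // 503
    }
}
```

The timeout half of the proposal would wrap each job.run() in a bounded wait (for example, a Future with a get(timeout)) and, for submits, fire the kill request on a separate cancel pool when the wait expires.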
[jira] [Updated] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16177: -- Attachment: (was: HIVE-16177.02.patch) > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-16177: - Assignee: Eugene Koifman > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16177: -- Attachment: HIVE-16177.02.patch > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16177: -- Attachment: HIVE-16177.02.patch > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905772#comment-15905772 ] Deepak Jaiswal commented on HIVE-16132: --- [~hagleitn] Can you please review? https://reviews.apache.org/r/57391/ > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 > Reporter: Deepak Jaiswal > Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch, HIVE-16132.4.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second Select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 33333 rows). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905754#comment-15905754 ] Hive QA commented on HIVE-16133: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857383/HIVE-16133.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10336 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nullsafe_join] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join1] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join4] (batchId=155) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=95) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4076/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4076/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4076/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12857383 - PreCommit-HIVE-Build > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16177: -- Description: {noformat} create table T(a int, b int) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES('transactional'='false') insert into T(a,b) values(1,2) insert into T(a,b) values(1,3) alter table T SET TBLPROPERTIES ('transactional'='true') {noformat} //we should now have bucket files 01_0 and 01_0_copy_1 but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can be copy_N files and numbers rows in each bucket from 0 thus generating duplicate IDs {noformat} select ROW__ID, INPUT__FILE__NAME, a, b from T {noformat} produces {noformat} {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 {noformat} [~owen.omalley], do you have any thoughts on a good way to handle this? attached patch has a few changes to make Acid even recognize copy_N but this is just a pre-requisite. The new UT demonstrates the issue. was: insert into T(a,b) values(1,2) insert into T(a,b) values(1,3) //we should now have bucket files 01_0 and 01_0_copy_1 but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can be copy_N files and numbers rows in each bucket from 0 thus generating duplicate IDs [~owen.omalley], do you have any thoughts on a good way to handle this? attached patch has a few changes to make Acid even recognize copy_N but this is just a pre-requisite. The new UT demonstrates the issue. 
> non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch > > > {noformat} > create table T(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES('transactional'='false') > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > {noformat} > select ROW__ID, INPUT__FILE__NAME, a, b from T > {noformat} > produces > {noformat} > {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2 > {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3 > {noformat} > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
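One plausible direction for the merger — purely illustrative, and not the patch attached here — is to keep a single running rowid counter per bucket across all of its files, so that a bucket file and its _copy_N siblings can never both produce rowid 0. The file names below are illustrative bucket-file names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowIdAssign {
    // Assign rowids that keep increasing across all files of one bucket,
    // rather than restarting at 0 per file -- avoiding the duplicate
    // (transactionid=0, bucketid, rowid=0) keys described above.
    // Hypothetical sketch, not OrcRawRecordMerger's actual logic.
    static Map<String, long[]> assign(Map<String, Integer> fileRowCounts) {
        Map<String, long[]> ids = new LinkedHashMap<>();
        long next = 0; // one counter shared by every file in the bucket
        for (Map.Entry<String, Integer> e : fileRowCounts.entrySet()) {
            long[] rows = new long[e.getValue()];
            for (int i = 0; i < rows.length; i++) {
                rows[i] = next++;
            }
            ids.put(e.getKey(), rows);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two one-row files of the same bucket, as in the reproduction above.
        Map<String, Integer> files = new LinkedHashMap<>();
        files.put("000001_0", 1);
        files.put("000001_0_copy_1", 1);
        Map<String, long[]> ids = assign(files);
        System.out.println(ids.get("000001_0")[0]);        // 0
        System.out.println(ids.get("000001_0_copy_1")[0]); // 1
    }
}
```

This only works if the merger visits the copy_N files of a bucket in a deterministic order, which is presumably part of why recognizing copy_N files at all is a prerequisite.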
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905697#comment-15905697 ] Tao Li commented on HIVE-16161: --- [~vgumashta] Test result looks good. > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug > Reporter: Tao Li > Assignee: Tao Li > Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true in jdbc/pom.xml, which causes the standalone JDBC jar to be missing some necessary classes, such as "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to set it to false so that those classes are shaded into the JDBC jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905692#comment-15905692 ] Ashutosh Chauhan commented on HIVE-15981: - +1 pending tests > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15981.1.patch, HIVE-15981.2.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15979) Support character_length and octet_length
[ https://issues.apache.org/jira/browse/HIVE-15979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905686#comment-15905686 ] Hive QA commented on HIVE-15979: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12855530/HIVE-15979.5.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10340 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table_udfs] (batchId=23) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_binary_data] (batchId=46) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_character_length] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_octet_length] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_character_length] (batchId=75) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_octet_length] (batchId=2) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=141) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4075/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4075/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4075/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12855530 - PreCommit-HIVE-Build > Support character_length and octet_length > - > > Key: HIVE-15979 > URL: https://issues.apache.org/jira/browse/HIVE-15979 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Teddy Choi > Attachments: HIVE-15979.1.patch, HIVE-15979.2.patch, > HIVE-15979.3.patch, HIVE-15979.4.patch, HIVE-15979.5.patch > > > SQL defines standard ways to get number of characters and octets. SQL > reference: section 6.28. Example: > vagrant=# select character_length('欲速则不达'); > character_length > -- > 5 > (1 row) > vagrant=# select octet_length('欲速则不达'); > octet_length > -- >15 > (1 row) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
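The two SQL functions in the description map directly onto code-point count versus UTF-8 byte count. A small Java check of the values in the example — each of the five CJK characters encodes to three UTF-8 bytes:

```java
import java.nio.charset.StandardCharsets;

public class Lengths {
    public static void main(String[] args) {
        String s = "欲速则不达";
        // character_length: number of code points (5 BMP characters here)
        System.out.println(s.codePointCount(0, s.length()));            // 5
        // octet_length: bytes of the UTF-8 encoding (3 bytes per CJK char)
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);  // 15
    }
}
```

Note that String.length() would also return 5 here, but only because these are BMP characters; codePointCount is the correct analogue of character_length once supplementary characters (surrogate pairs) are involved.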
[jira] [Updated] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramanyam Pattipaka updated HIVE-15947: - Attachment: HIVE-15947.6.patch Latest patch after fixing all review comments. > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug > Reporter: Subramanyam Pattipaka > Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.3.patch, HIVE-15947.4.patch, HIVE-15947.6.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation requests; it simply accepts and tries to run all operations. If a large number of concurrent job-submission requests arrives, the time to submit job operations can increase significantly. Templeton uses HDFS to store the staging files for jobs, and if HDFS can't keep up with the request volume and throttles, job submission can take a very long time, on the order of minutes. > This behavior may not be suitable for all applications: client applications may expect predictable, low response times for successful requests, or a throttle response telling them to wait for some time before re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new job > 2) Get job status > 3) List jobs > These three operations have different complexity due to variance in their use of cluster resources like YARN/HDFS. > The idea is to introduce a new config, templeton.job.submit.exec.max-procs, which controls the maximum number of concurrent active job submissions within Templeton, and to use this config to achieve better response times. 
If a new job submission request sees that there are already templeton.job.submit.exec.max-procs jobs being submitted concurrently, the request will fail with HTTP error 503 and the reason > “Too many concurrent job submission requests received. Please wait for some time before retrying.” > > The client is expected to catch this response and retry after waiting for some time. The default value of templeton.job.submit.exec.max-procs is ‘0’, which means that by default job submission requests are always accepted; the behavior needs to be enabled explicitly where required. > We can have similar behavior for the status and list operations with the configs templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs respectively. > Once a job operation has started it can take a long time, and the client which requested it may not be willing to wait an indefinite amount of time. This work introduces the configurations > templeton.exec.job.submit.timeout > templeton.exec.job.status.timeout > templeton.exec.job.list.timeout > to specify the maximum amount of time a job operation may execute. If a timeout happens, list and status requests return to the client with the message > "List job request got timed out. Please retry the operation after waiting for some time." > If a submit request times out, then > i) the submit-request thread that receives the timeout checks whether a valid job id was generated for the request; > ii) if it was, it issues a kill-job request on the cancel thread pool, does not wait for that operation to complete, and returns to the client with a timeout message. > Side effects of enabling timeouts for submit operations: > 1) The job may remain active for some time after the client gets its response, and a list operation from the client could potentially show the newly created job before it gets killed. > 2) We make a best effort to kill the job, with no guarantees. 
This means there is a possibility of a duplicate job being created. One possible cause is a case where the job is created and the operation times out, but the kill request fails due to resource manager unavailability; when the resource manager restarts, it will restart the job that was created. > Fixing this scenario is not in the scope of this JIRA. The timeout functionality should be enabled only if the above side effects are acceptable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files
[ https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16177: -- Attachment: HIVE-16177.01.patch > non Acid to acid conversion doesn't handle _copy_N files > > > Key: HIVE-16177 > URL: https://issues.apache.org/jira/browse/HIVE-16177 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Priority: Critical > Attachments: HIVE-16177.01.patch > > > insert into T(a,b) values(1,2) > insert into T(a,b) values(1,3) > //we should now have bucket files 01_0 and 01_0_copy_1 > but OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can > be copy_N files and numbers rows in each bucket from 0 thus generating > duplicate IDs > [~owen.omalley], do you have any thoughts on a good way to handle this? > attached patch has a few changes to make Acid even recognize copy_N but this > is just a pre-requisite. The new UT demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16104: Attachment: HIVE-16104.03.patch > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.03.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15981: Attachment: HIVE-15981.2.patch [~ashutoshc] it seemed like a good idea to add it...but of course, the standard only specifies that; i've updated the patch > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15981.1.patch, HIVE-15981.2.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905627#comment-15905627 ] Hive QA commented on HIVE-16161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857129/HIVE-16161.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10336 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4074/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4074/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4074/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857129 - PreCommit-HIVE-Build > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16133: Attachment: HIVE-16133.03.patch Trying again, cannot repro > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905596#comment-15905596 ] Sergey Shelukhin commented on HIVE-16164: - Sorry, not familiar at all with the notification listener :( > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS backend DB so that external applications (such as backup apps) can request incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, there are times when an application must be in sync with the latest HMS event in order to process an action. Such applications provide a listener implementation that is called by the HMS after an HMS transaction has happened. > The problem is that the listener running after the transaction (or during the non-transactional context) may need the DB event ID in order to sync all events that happened prior to that event ID, but this ID is never passed to the non-transactional listeners. > We can pass this event information through the EnvironmentContext found on each ListenerEvent implementation (such as CreateTableEvent), and send the EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificationListener already knows the event ID after calling ObjectStore.addNotificationEvent(). We just need to set this event ID on the EnvironmentContext for each of the event notifications and make sure that this EnvironmentContext is sent to the non-transactional listeners. 
> Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
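The handoff proposed in the snippet above can be mocked with stand-in classes: the transactional listener writes the generated event ID into the shared context, and the non-transactional listener reads it back. EnvCtx and the listener stubs are simplified stand-ins for this sketch, not Hive's EnvironmentContext/MetaStoreEventListener API.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for an EnvironmentContext: a string-keyed property bag.
class EnvCtx {
    static final String DB_NOTIFICATION_EVENT_ID = "DB_NOTIFICATION_EVENT_ID";
    private final Map<String, String> props = new HashMap<>();
    void put(String k, String v) { props.put(k, v); }
    String get(String k) { return props.get(k); }
}

// Stand-in for the transactional DbNotificationListener, which learns the
// event id when it persists the notification inside the HMS transaction.
class TransactionalListenerStub {
    void onCreateTable(EnvCtx ctx, long generatedEventId) {
        ctx.put(EnvCtx.DB_NOTIFICATION_EVENT_ID, Long.toString(generatedEventId));
    }
}

// Stand-in for a non-transactional listener, which only sees the id if the
// same EnvironmentContext is forwarded to it after the commit.
class NonTransactionalListenerStub {
    Long lastSeenEventId;
    void onCreateTable(EnvCtx ctx) {
        String id = ctx.get(EnvCtx.DB_NOTIFICATION_EVENT_ID);
        lastSeenEventId = (id == null) ? null : Long.valueOf(id);
    }
}
```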
[jira] [Assigned] (HIVE-16163) Remove unnecessary parentheses in HiveParser
[ https://issues.apache.org/jira/browse/HIVE-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-16163: -- Assignee: Pengcheng Xiong > Remove unnecessary parentheses in HiveParser > > > Key: HIVE-16163 > URL: https://issues.apache.org/jira/browse/HIVE-16163 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > in HiveParser.g > L2145: > {code} > columnParenthesesList > @init { pushMsg("column parentheses list", state); } > @after { popMsg(state); } > : LPAREN columnNameList RPAREN > ; > {code} > should be changed to > {code} > columnParenthesesList > @init { pushMsg("column parentheses list", state); } > @after { popMsg(state); } > : LPAREN! columnNameList RPAREN! > ; > {code} > However, we also need to refactor our code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15001) Remove showConnectedUrl from command line help
[ https://issues.apache.org/jira/browse/HIVE-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-15001: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Fix has been committed to the master for Hive 2.2.0. Thank you for your contribution [~pvary] > Remove showConnectedUrl from command line help > -- > > Key: HIVE-15001 > URL: https://issues.apache.org/jira/browse/HIVE-15001 > Project: Hive > Issue Type: Sub-task > Components: Beeline >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Trivial > Fix For: 2.2.0 > > Attachments: HIVE-15001.2.patch, HIVE-15001.3.patch, HIVE-15001.patch > > > As discussed with [~nemon], the showConnectedUrl commandline parameter is not > working since an erroneous merge. Instead, beeline always prints the currently > connected url. Since it is good for everyone, no extra parameter is needed to > turn this feature on. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16175) Possible race condition in InstanceCache
[ https://issues.apache.org/jira/browse/HIVE-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905565#comment-15905565 ] Hive QA commented on HIVE-16175: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857352/HIVE-16175.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10336 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4073/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4073/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4073/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857352 - PreCommit-HIVE-Build > Possible race condition in InstanceCache > > > Key: HIVE-16175 > URL: https://issues.apache.org/jira/browse/HIVE-16175 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-16175.1.patch > > > Currently the [InstanceCache | > https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java] > class is not thread-safe, but it is sometimes used as a static variable, for > instance > [here|https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java#L114]. > This is an issue for HoS, where it can be accessed by > multiple threads at the same time. We found this sometimes causes an NPE: > {code} > ERROR : FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
> java.util.concurrent.ExecutionException: Exception thrown by job > at > org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311) > at > org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382) > at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 20 in stage 0.0 failed 4 times, most recent failure: Lost task 20.3 in > stage 0.0 (TID 33, hadoopworker992-sjc1.prod.uber.internal): > java.lang.RuntimeException: Map operator initialization failed: > org.apache.hadoop.hive.ql.metadata.Hive > Exception: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:325) > at >
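The race behind the stack trace above is the classic unsynchronized check-then-put on a shared map. A minimal sketch of the shape of the fix follows; the names are illustrative and the real InstanceCache API differs (this is not Hive's code).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative get-or-create cache like the one described above.
// Synchronizing retrieve() makes the check-then-put atomic, so two threads
// can no longer interleave and create (or observe) inconsistent entries.
class SafeInstanceCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> maker;

    SafeInstanceCache(Function<K, V> maker) {
        this.maker = maker;
    }

    synchronized V retrieve(K key) {
        V v = cache.get(key);
        if (v == null) {          // without the lock, two threads can both
            v = maker.apply(key); // see null here and race on the put
            cache.put(key, v);
        }
        return v;
    }
}
```

For a static, shared cache this kind of coarse lock (or a ConcurrentHashMap.computeIfAbsent) is the usual remedy; which one the actual patch uses is not stated here.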
[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905522#comment-15905522 ] Eugene Koifman commented on HIVE-15691: --- I don't think the 2.x version should have deprecated form of the c'tors. Since it's a new class no one could be using them yet. Otherwise I think this is OK for 2.x. Please update the patch and I can commit it. > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in hive. > Dependency is there in flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, Please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16091) Support subqueries in project/select
[ https://issues.apache.org/jira/browse/HIVE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905517#comment-15905517 ] Vineet Garg commented on HIVE-16091: Looking at failures (for whatever reason the apache build is now showing failure logs). I'll also add more tests. Meanwhile I have created a review board request: https://reviews.apache.org/r/57518/ > Support subqueries in project/select > > > Key: HIVE-16091 > URL: https://issues.apache.org/jira/browse/HIVE-16091 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16091.1.patch, HIVE-16091.2.patch > > > Currently scalar subqueries are supported in filter only (WHERE/HAVING). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905509#comment-15905509 ] Tao Li commented on HIVE-16161: --- Resubmission did not help. [~daijy] is kicking off one manually. Thanks. > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905505#comment-15905505 ] Alexander Kolbasov commented on HIVE-16164: --- I see. IMO a better way would be to just pass the success status as a parameter to hooks, or set the status on the existing event. > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS backend DB so that external applications (such as backup apps) can request incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, there are times when an application must be in sync with the latest HMS event in order to process an action. Such applications provide a listener implementation that is called by the HMS after an HMS transaction has happened. > The problem is that the listener running after the transaction (or during the non-transactional context) may need the DB event ID in order to sync all events that happened prior to that event ID, but this ID is never passed to the non-transactional listeners. > We can pass this event information through the EnvironmentContext found on each ListenerEvent implementation (such as CreateTableEvent), and send the EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificationListener already knows the event ID after calling ObjectStore.addNotificationEvent(). We just need to set this event ID on the EnvironmentContext for each of the event notifications and make sure that this EnvironmentContext is sent to the non-transactional listeners. 
> Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905496#comment-15905496 ] Hive QA commented on HIVE-16132: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857348/HIVE-16132.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10336 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=148) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4072/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4072/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4072/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857348 - PreCommit-HIVE-Build > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch, HIVE-16132.4.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). 
> Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16167: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16176) SchemaTool should exit with non-zero exit code when one or more validators fail.
[ https://issues.apache.org/jira/browse/HIVE-16176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-16176: > SchemaTool should exit with non-zero exit code when one or more validators > fail. > - > > Key: HIVE-16176 > URL: https://issues.apache.org/jira/browse/HIVE-16176 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.2.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Minor > > Currently schematool exits with a code of 0 when one or more schema tool > validations fail. Ideally, it should return a non-zero exit code when any of > the validators fail. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16161: -- Status: Patch Available (was: Open) > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16161: -- Status: Open (was: Patch Available) > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905470#comment-15905470 ] Tao Li commented on HIVE-16161: --- [~vgumashta] Sounds good. I don't see a build from https://builds.apache.org/job/PreCommit-HIVE-Build/ though. Should I re-submit? > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905452#comment-15905452 ] Gunther Hagleitner commented on HIVE-16167: --- +1 > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905439#comment-15905439 ] Ashutosh Chauhan commented on HIVE-16167: - [~hagleitn] Can you take a look ? > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905430#comment-15905430 ] Hive QA commented on HIVE-16167: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857147/HIVE-16167.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10336 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=94) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4071/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4071/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4071/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857147 - PreCommit-HIVE-Build > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905429#comment-15905429 ] Vaibhav Gumashta commented on HIVE-16161: - [~taoli-hwx] Waiting for precommit run. Will commit if after the run. > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)