[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability
[ https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877707#comment-15877707 ] Prasanth Jayachandran commented on HIVE-16005: -- For constructUniqueQueryId, can we use the same format as the filenames generated by the query-routing logger (queryId-dagId)? That way it is easier to locate the corresponding log file. Is appending a suffix to the thread name primarily to get some context from jstack output? Stack traces that get logged will already have this info via NDC. > miscellaneous small fixes to help with llap debuggability > - > > Key: HIVE-16005 > URL: https://issues.apache.org/jira/browse/HIVE-16005 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16005.01.patch > > > - Include proc_ in cli, beeline, metastore, hs2 process args > - LLAP history logger - log QueryId instead of dagName (dag name is free-flowing text) > - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / queued > - Include thread name in TaskRunnerCallable so that it shows up in stack traces (will cause extra output in logs) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
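A minimal sketch of the naming convention the comment suggests. The helper name and the sample query id below are hypothetical, not Hive's actual code; the point is only that concatenating queryId and dagId the same way the query-routing logger names its files lets a reader grep from a history line or thread name straight to the matching log file.

```java
public class QueryIdDemo {
    // Hypothetical helper mirroring the query-routing logger's file naming,
    // "<queryId>-<dagId>": the same string then appears in logs, thread names,
    // and log file names, so one grep finds all of them.
    static String constructUniqueQueryId(String queryId, int dagId) {
        return queryId + "-" + dagId;
    }

    public static void main(String[] args) {
        System.out.println(constructUniqueQueryId("hive_20170222_q1", 1)); // hive_20170222_q1-1
    }
}
```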
[jira] [Updated] (HIVE-1626) stop using java.util.Stack
[ https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-1626: - Attachment: HIVE-1626.2.patch Re-uploading the patch. > stop using java.util.Stack > -- > > Key: HIVE-1626 > URL: https://issues.apache.org/jira/browse/HIVE-1626 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 0.7.0 >Reporter: John Sichi >Assignee: Teddy Choi > Attachments: HIVE-1626.2.patch, HIVE-1626.2.patch > > > We currently use Stack as part of the generic node walking library. Stack > should not be used for this since its inheritance from Vector incurs > superfluous synchronization overhead. > Most projects end up adding an ArrayStack implementation and using that > instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
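A sketch of the replacement the issue describes (the demo class and method names are illustrative, not Hive's node-walker code): java.util.Stack extends Vector, so every push/pop takes the Vector monitor lock, while ArrayDeque gives the same LIFO behavior with no synchronization.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackDemo {
    // Pushes all items onto an ArrayDeque used as a LIFO, then pops them and
    // concatenates the results: insertion order is reversed exactly as with
    // java.util.Stack, minus the per-operation Vector synchronization.
    static String popOrder(String... items) {
        Deque<String> stack = new ArrayDeque<>(); // unsynchronized LIFO
        for (String s : items) {
            stack.push(s);
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(popOrder("a", "b", "c")); // cba
    }
}
```

Deque is also what the JDK's own javadoc recommends over Stack, which is why most projects do not need a custom ArrayStack anymore.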
[jira] [Commented] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF
[ https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877678#comment-15877678 ] Hive QA commented on HIVE-16002: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853851/HIVE-16002.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=151) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_in] (batchId=122) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3685/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3685/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3685/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12853851 - PreCommit-HIVE-Build > Correlated IN subquery with aggregate asserts in sq_count_check UDF > --- > > Key: HIVE-16002 > URL: https://issues.apache.org/jira/browse/HIVE-16002 > Project: Hive > Issue Type: Bug >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16002.1.patch > > > Reproducer > {code:SQL} > create table t(i int, j int); > insert into t values(0,1), (0,2); > create table tt(i int, j int); > insert into tt values(0,3); > select * from t where i IN (select count(i) from tt where tt.j = t.j); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.
[ https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reassigned HIVE-16006: --- > Incremental REPL LOAD doesn't operate on the target database if name differs > from source database. > -- > > Key: HIVE-16006 > URL: https://issues.apache.org/jira/browse/HIVE-16006 > Project: Hive > Issue Type: Bug > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > > During "Incremental Load", it does not consider the database name given on the command line, so the load doesn't happen on the target database. At the same time, the database with the original name is getting modified. > Steps: > 1. REPL DUMP default FROM 52; > 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621'; > – This step modifies the default Db instead of replDb. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-1626) stop using java.util.Stack
[ https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877679#comment-15877679 ] Hive QA commented on HIVE-1626: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853867/HIVE-1626.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3686/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3686/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3686/ Messages: {noformat} This message was trimmed, see log for full details Apply anyway? [n] Skipping patch. 3 out of 3 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 12 out of 12 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 
3 out of 3 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinHintOptimizer.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinHintOptimizer.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkReduceSinkMapJoinProc.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? 
[n] Skipping patch. 6 out of 6 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkReduceSinkMapJoinProc.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSMBJoinHintOptimizer.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSMBJoinHintOptimizer.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinProcFactory.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinProcFactory.java.rej patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinResolver.java Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] Skipping patch. 2 out of 2 hunks ignored -- saving rejects to file ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinResolver.java.rej patching file
[jira] [Updated] (HIVE-15993) Hive REPL STATUS is not returning last event ID
[ https://issues.apache.org/jira/browse/HIVE-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-15993: Affects Version/s: (was: 2.1.0) > Hive REPL STATUS is not returning last event ID > --- > > Key: HIVE-15993 > URL: https://issues.apache.org/jira/browse/HIVE-15993 > Project: Hive > Issue Type: Bug > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Attachments: HIVE-15993.01.patch > > > While running "REPL STATUS" on target to get last event ID for DB, it returns > zero rows. > 0: jdbc:hive2://localhost:10001/repl> REPL status repl; > No rows affected (932.167 seconds) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15570) LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode
[ https://issues.apache.org/jira/browse/HIVE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877676#comment-15877676 ] Lefty Leverenz commented on HIVE-15570: --- Doc note: The new description and behavior of *hive.llap.client.consistent.splits* need to be documented in the wiki for release 2.2.0: * [Configuration Properties -- hive.llap.client.consistent.splits | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.llap.client.consistent.splits] Added a TODOC2.2 label. (By the way, the parameter description should have included newlines (\n) as shown for *hive.llap.validate.acls* right after it, to avoid overlong lines in the generated template file hive-default.xml.template.) > LLAP: Exception in HostAffinitySplitLocationProvider when running in > container mode > --- > > Key: HIVE-15570 > URL: https://issues.apache.org/jira/browse/HIVE-15570 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Zhiyuan Yang >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-15570.1.patch, HIVE-15570.2.patch, > HIVE-15570.3.patch > > > Sometimes user might prefer to run with "hive.execution.mode=container" mode > when LLAP is stopped. If hive config for LLAP had > "hive.llap.client.consistent.splits=true" in client side, it would end up > throwing the following exception in {{Utils.java}}. > {noformat} > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68) > ... 
25 more > Caused by: java.lang.IllegalStateException: > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider needs at > least 1 location to function > at > com.google.common.base.Preconditions.checkState(Preconditions.java:149) > at > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider.(HostAffinitySplitLocationProvider.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:54) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.(HiveSplitGenerator.java:121) > ... 30 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
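A hypothetical sketch of the defensive check this failure calls for (the method and return strings are illustrative, not the HIVE-15570 patch itself): only route splits by host affinity when LLAP daemon locations are actually known, and fall back otherwise instead of tripping the Preconditions.checkState("needs at least 1 location") seen in the trace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SplitLocationDemo {
    // Choose a split-location strategy: host-affinity routing needs at least
    // one known LLAP daemon location; with none (container mode, LLAP stopped),
    // fall back rather than fail during split generation.
    static String chooseProvider(List<String> llapLocations, boolean consistentSplits) {
        if (consistentSplits && llapLocations != null && !llapLocations.isEmpty()) {
            return "HostAffinitySplitLocationProvider";
        }
        return "DefaultSplitLocationProvider"; // no daemons to pin splits to
    }

    public static void main(String[] args) {
        // hive.llap.client.consistent.splits=true but LLAP is stopped:
        System.out.println(chooseProvider(Collections.emptyList(), true));
        System.out.println(chooseProvider(Arrays.asList("llap-host-1"), true));
    }
}
```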
[jira] [Updated] (HIVE-15993) Hive REPL STATUS is not returning last event ID
[ https://issues.apache.org/jira/browse/HIVE-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-15993: Component/s: (was: Parser) > Hive REPL STATUS is not returning last event ID > --- > > Key: HIVE-15993 > URL: https://issues.apache.org/jira/browse/HIVE-15993 > Project: Hive > Issue Type: Bug > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Attachments: HIVE-15993.01.patch > > > While running "REPL STATUS" on target to get last event ID for DB, it returns > zero rows. > 0: jdbc:hive2://localhost:10001/repl> REPL status repl; > No rows affected (932.167 seconds) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15993) Hive REPL STATUS is not returning last event ID
[ https://issues.apache.org/jira/browse/HIVE-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-15993: Component/s: repl Parser > Hive REPL STATUS is not returning last event ID > --- > > Key: HIVE-15993 > URL: https://issues.apache.org/jira/browse/HIVE-15993 > Project: Hive > Issue Type: Bug > Components: Parser, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Attachments: HIVE-15993.01.patch > > > While running "REPL STATUS" on target to get last event ID for DB, it returns > zero rows. > 0: jdbc:hive2://localhost:10001/repl> REPL status repl; > No rows affected (932.167 seconds) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15993) Hive REPL STATUS is not returning last event ID
[ https://issues.apache.org/jira/browse/HIVE-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-15993: Affects Version/s: 2.1.0 > Hive REPL STATUS is not returning last event ID > --- > > Key: HIVE-15993 > URL: https://issues.apache.org/jira/browse/HIVE-15993 > Project: Hive > Issue Type: Bug > Components: Parser, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Attachments: HIVE-15993.01.patch > > > While running "REPL STATUS" on target to get last event ID for DB, it returns > zero rows. > 0: jdbc:hive2://localhost:10001/repl> REPL status repl; > No rows affected (932.167 seconds) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877652#comment-15877652 ] Ferdinand Xu commented on HIVE-16004: - LGTM, [~xuefuz], do you have any further comments? > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch > > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think the DataOutputBuffer isn't cleared in time, which causes this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
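A minimal sketch of the leak pattern the reporter suspects. ByteArrayOutputStream stands in for Hadoop's DataOutputBuffer (which exposes a similar reset()) so the demo stays dependency-free; the numbers are illustrative. Without a per-row reset, a reusable serialization buffer keeps every byte ever written and grows until the JVM runs out of memory, which matches the ByteArrayOutputStream.grow frames in the trace above.

```java
import java.io.ByteArrayOutputStream;

public class BufferResetDemo {
    // Simulates serializing `rows` rows of `bytesPerRow` bytes into one
    // reusable buffer and reports the largest size the buffer ever reached.
    static int maxBufferSize(int rows, int bytesPerRow, boolean resetEachRow) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int max = 0;
        for (int i = 0; i < rows; i++) {
            buffer.write(new byte[bytesPerRow], 0, bytesPerRow); // one row's bytes
            max = Math.max(max, buffer.size());
            if (resetEachRow) {
                buffer.reset(); // discard contents, keep the backing array
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxBufferSize(1000, 64, true));  // bounded at 64
        System.out.println(maxBufferSize(1000, 64, false)); // grows to 64000
    }
}
```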
[jira] [Updated] (HIVE-15570) LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode
[ https://issues.apache.org/jira/browse/HIVE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-15570: -- Labels: TODOC2.2 (was: ) > LLAP: Exception in HostAffinitySplitLocationProvider when running in > container mode > --- > > Key: HIVE-15570 > URL: https://issues.apache.org/jira/browse/HIVE-15570 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Zhiyuan Yang >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-15570.1.patch, HIVE-15570.2.patch, > HIVE-15570.3.patch > > > Sometimes user might prefer to run with "hive.execution.mode=container" mode > when LLAP is stopped. If hive config for LLAP had > "hive.llap.client.consistent.splits=true" in client side, it would end up > throwing the following exception in {{Utils.java}}. > {noformat} > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68) > ... 25 more > Caused by: java.lang.IllegalStateException: > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider needs at > least 1 location to function > at > com.google.common.base.Preconditions.checkState(Preconditions.java:149) > at > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider.(HostAffinitySplitLocationProvider.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:54) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.(HiveSplitGenerator.java:121) > ... 30 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15989) Incorrect rounding in decimal data types
[ https://issues.apache.org/jira/browse/HIVE-15989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877629#comment-15877629 ] Prasanth Jayachandran commented on HIVE-15989: -- [~sershe] the orcfiledump -d option will dump records > Incorrect rounding in decimal data types > > > Key: HIVE-15989 > URL: https://issues.apache.org/jira/browse/HIVE-15989 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Reporter: Nikesh >Priority: Critical > Attachments: ANA_AUTO_E.csv > > > I have a numeric field in a file in my data lake and created a Hive external table pointing to this field. The field value is 0. but when I fetch this record using the query it displays only 0.. I tried the DECIMAL and DOUBLE data types but nothing worked. Is this a bug, or am I not using the right data type for this? > Thanks, > Nikesh -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16005) miscellaneous small fixes to help with llap debuggability
[ https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16005: -- Status: Patch Available (was: Open) > miscellaneous small fixes to help with llap debuggability > - > > Key: HIVE-16005 > URL: https://issues.apache.org/jira/browse/HIVE-16005 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16005.01.patch > > > - Include proc_ in cli, beeline, metastore, hs2 process args > - LLAP history logger - log QueryId instead of dagName (dag name is free-flowing text) > - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / queued > - Include thread name in TaskRunnerCallable so that it shows up in stack traces (will cause extra output in logs) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16005) miscellaneous small fixes to help with llap debuggability
[ https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth reassigned HIVE-16005: - Assignee: Siddharth Seth > miscellaneous small fixes to help with llap debuggability > - > > Key: HIVE-16005 > URL: https://issues.apache.org/jira/browse/HIVE-16005 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16005.01.patch > > > - Include proc_ in cli, beeline, metastore, hs2 process args > - LLAP history logger - log QueryId instead of dagName (dag name is free-flowing text) > - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / queued > - Include thread name in TaskRunnerCallable so that it shows up in stack traces (will cause extra output in logs) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16005) miscellaneous small fixes to help with llap debuggability
[ https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16005: -- Summary: miscellaneous small fixes to help with llap debuggability (was: miscellaneous small fixes to help with debuggability) > miscellaneous small fixes to help with llap debuggability > - > > Key: HIVE-16005 > URL: https://issues.apache.org/jira/browse/HIVE-16005 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > Attachments: HIVE-16005.01.patch > > > - Include proc_ in cli, beeline, metastore, hs2 process args > - LLAP history logger - log QueryId instead of dagName (dag name is free-flowing text) > - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / queued > - Include thread name in TaskRunnerCallable so that it shows up in stack traces (will cause extra output in logs) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16005) miscellaneous small fixes to help with debuggability
[ https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16005: -- Attachment: HIVE-16005.01.patch cc [~prasanth_j], [~sershe] for review. > miscellaneous small fixes to help with debuggability > > > Key: HIVE-16005 > URL: https://issues.apache.org/jira/browse/HIVE-16005 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > Attachments: HIVE-16005.01.patch > > > - Include proc_ in cli, beeline, metastore, hs2 process args > - LLAP history logger - log QueryId instead of dagName (dag name is free-flowing text) > - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / queued > - Include thread name in TaskRunnerCallable so that it shows up in stack traces (will cause extra output in logs) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877617#comment-15877617 ] Wei Zheng commented on HIVE-15999: -- I cannot find a good answer. As we discussed, this might be a Derby bug. One thing I noticed that is different about TestDbTxnManager2: all other tests call "TxnDbUtil.setConfValues(conf);" before calling "TxnDbUtil.prepDb();", but TestDbTxnManager2 only calls "TxnDbUtil.setConfValues(conf);" once in a @BeforeClass method. By changing the @BeforeClass method into a constructor, it's guaranteed to run for every UT, which is consistent with all other tests. I ran the ptest several times and didn't see such a failure anymore with the fix. > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. 
> at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
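A minimal sketch of why the fix described in the comment works (the class and the conf values are illustrative stand-ins, not the real HiveConf or JUnit wiring): setup run once per class, @BeforeClass-style, leaves a test exposed to config pollution from the previous test, while constructor-style setup re-applies the config before every test.

```java
public class SetupDemo {
    static String conf; // stands in for the shared HiveConf entry HIVE_TXN_MANAGER

    static void setConfValues() { conf = "DbTxnManager"; }

    // Runs one "test": optionally re-applies setup first (constructor-style),
    // records the conf value it observed, then pollutes the conf without
    // restoring it, as testDummyTxnManagerOnAcidTable does in the report above.
    static String runTest(boolean perTestSetup) {
        if (perTestSetup) {
            setConfValues(); // constructor: executed before every test
        }
        String seen = conf;
        conf = "DummyTxnManager"; // the test mutates shared state
        return seen;
    }

    public static void main(String[] args) {
        setConfValues(); // @BeforeClass: executed once for the whole class
        runTest(false);
        System.out.println(runTest(false)); // next test observes the pollution
        System.out.println(runTest(true));  // per-test setup observes a clean conf
    }
}
```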
[jira] [Updated] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16004: Attachment: HIVE-16004.002.patch [~Ferd], thanks for your review, the patch is updated. > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch > > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think the DataOutputBuffer isn't cleared in time, which causes this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15570) LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode
[ https://issues.apache.org/jira/browse/HIVE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-15570: -- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) > LLAP: Exception in HostAffinitySplitLocationProvider when running in > container mode > --- > > Key: HIVE-15570 > URL: https://issues.apache.org/jira/browse/HIVE-15570 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Zhiyuan Yang >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15570.1.patch, HIVE-15570.2.patch, > HIVE-15570.3.patch > > > Sometimes user might prefer to run with "hive.execution.mode=container" mode > when LLAP is stopped. If hive config for LLAP had > "hive.llap.client.consistent.splits=true" in client side, it would end up > throwing the following exception in {{Utils.java}}. > {noformat} > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68) > ... 25 more > Caused by: java.lang.IllegalStateException: > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider needs at > least 1 location to function > at > com.google.common.base.Preconditions.checkState(Preconditions.java:149) > at > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider.(HostAffinitySplitLocationProvider.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:54) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.(HiveSplitGenerator.java:121) > ... 
30 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15570) LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode
[ https://issues.apache.org/jira/browse/HIVE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877601#comment-15877601 ] Siddharth Seth commented on HIVE-15570: --- Test failures are unrelated. Committing. Thanks [~aplusplus] > LLAP: Exception in HostAffinitySplitLocationProvider when running in > container mode > --- > > Key: HIVE-15570 > URL: https://issues.apache.org/jira/browse/HIVE-15570 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Zhiyuan Yang >Priority: Minor > Attachments: HIVE-15570.1.patch, HIVE-15570.2.patch, > HIVE-15570.3.patch > > > Sometimes user might prefer to run with "hive.execution.mode=container" mode > when LLAP is stopped. If hive config for LLAP had > "hive.llap.client.consistent.splits=true" in client side, it would end up > throwing the following exception in {{Utils.java}}. > {noformat} > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68) > ... 25 more > Caused by: java.lang.IllegalStateException: > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider needs at > least 1 location to function > at > com.google.common.base.Preconditions.checkState(Preconditions.java:149) > at > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider.(HostAffinitySplitLocationProvider.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:54) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.(HiveSplitGenerator.java:121) > ... 
30 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
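The IllegalStateException above is a constructor precondition firing when the consistent-splits provider is handed an empty location list. The shape of the fix can be sketched as a guard around provider selection; this is a hypothetical reconstruction (class and method names illustrative), not the committed patch:

```java
import java.util.List;

public class SplitLocationProviderChooser {
    interface SplitLocationProvider {}

    static class RandomProvider implements SplitLocationProvider {}

    static class HostAffinityProvider implements SplitLocationProvider {
        HostAffinityProvider(List<String> locations) {
            if (locations == null || locations.isEmpty()) {
                // The precondition that fires in the stack trace above.
                throw new IllegalStateException(
                    "HostAffinitySplitLocationProvider needs at least 1 location to function");
            }
        }
    }

    // Hypothetical fix: in container mode there are no LLAP daemons, so the
    // consistent-splits setting is ignored rather than being passed an empty
    // host list that would trip the constructor check.
    static SplitLocationProvider choose(boolean llapMode, boolean consistentSplits,
                                        List<String> llapHosts) {
        if (llapMode && consistentSplits && llapHosts != null && !llapHosts.isEmpty()) {
            return new HostAffinityProvider(llapHosts);
        }
        return new RandomProvider(); // default provider for container mode
    }
}
```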
[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877594#comment-15877594 ] Ferdinand Xu commented on HIVE-16004: - Thanks [~colin_mjj] for the patch. Can we reset this buffer instead of allocate a new one? {noformat} buffer = new DataOutputBuffer(); {noformat} > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > Attachments: HIVE-16004.001.patch > > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think DataOutputBuffer isn't cleared on time cause this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
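The reuse the comment above asks about can be sketched with the JDK's ByteArrayOutputStream (which Hadoop's DataOutputBuffer wraps); this illustrates the reset-versus-reallocate pattern only, not the actual SparkReduceRecordHandler code:

```java
import java.io.ByteArrayOutputStream;

public class BufferReuse {
    // One reusable buffer: reset() rewinds the write position but keeps the
    // backing array, so repeated rows don't allocate a fresh buffer each time.
    private static final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    static byte[] serialize(byte[] row) {
        buffer.reset();                    // discard the previous row's bytes
        buffer.write(row, 0, row.length);
        return buffer.toByteArray();       // copy of the current row only
    }
}
```

One caveat with reset(): the grown backing array stays allocated, so a single very large row keeps that memory pinned for the lifetime of the buffer -- presumably the trade-off the patch has to weigh against per-row allocation.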
[jira] [Updated] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16004: Status: Patch Available (was: Open) Initial patch updated. > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > Attachments: HIVE-16004.001.patch > > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think DataOutputBuffer isn't cleared on time cause this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877579#comment-15877579 ] Hive QA commented on HIVE-1555: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853848/HIVE-1555.6.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10278 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[jdbc_handler] (batchId=52) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224) org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[1] (batchId=173) org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[2] (batchId=173) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteSmallint (batchId=173) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3684/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3684/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3684/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12853848 - PreCommit-HIVE-Build > JDBC Storage Handler > > > Key: HIVE-1555 > URL: https://issues.apache.org/jira/browse/HIVE-1555 > Project: Hive > Issue Type: New Feature > Components: JDBC >Reporter: Bob Robertson >Assignee: Gunther Hagleitner > Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, > HIVE-1555.6.patch, JDBCStorageHandler Design Doc.pdf > > Original Estimate: 24h > Remaining Estimate: 24h > > With the Cassandra and HBase Storage Handlers, I thought it would make sense > to include a generic JDBC RDBMS Storage Handler so that you could import a > standard DB table into Hive. Many people likely want to perform HiveQL joins > and similar queries against tables in other systems. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16004: Attachment: HIVE-16004.001.patch > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > Attachments: HIVE-16004.001.patch > > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at 
scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think DataOutputBuffer isn't cleared on time cause this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma reassigned HIVE-16004: --- > OutOfMemory in SparkReduceRecordHandler with vectorization mode > --- > > Key: HIVE-16004 > URL: https://issues.apache.org/jira/browse/HIVE-16004 > Project: Hive > Issue Type: Bug >Reporter: Colin Ma >Assignee: Colin Ma > > For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. > Get the following exception: > java.lang.OutOfMemoryError > at > java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123) > at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117) > at > java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) > at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467) > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at 
scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > I think DataOutputBuffer isn't cleared on time cause this problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header
[ https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877560#comment-15877560 ] KaiXu commented on HIVE-15859: -- Thanks all for the efforts, I will try the patch. > Hive client side shows Spark Driver disconnected while Spark Driver side > could not get RPC header > -- > > Key: HIVE-15859 > URL: https://issues.apache.org/jira/browse/HIVE-15859 > Project: Hive > Issue Type: Bug > Components: Hive, Spark >Affects Versions: 2.2.0 > Environment: hadoop2.7.1 > spark1.6.2 > hive2.2 >Reporter: KaiXu >Assignee: Rui Li > Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch > > > Hive on Spark, failed with error: > {noformat} > 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 > Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1 > 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 > Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1 > 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: > 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1 > Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC > channel is closed.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > {noformat} > application log shows the driver commanded a shutdown with some unknown > reason, but hive's log shows Driver could not get RPC header( Expected RPC > header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead). 
> {noformat} > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in > stage 3.0 (TID 2519) > 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver > commanded a shutdown > 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared > 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown > (hsx-node1:42777) driver disconnected. > 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver > 192.168.1.1:42777 disassociated! Shutting down. > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in > stage 3.0 (TID 2511) > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Shutting down remote daemon. 
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remote daemon shut down; proceeding with flushing remote transports. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remoting shut down. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3 > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in > stage 3.0 (TID 2515) > 17/02/08 09:51:04 INFO
[jira] [Updated] (HIVE-16003) Blobstores should use fs.listFiles(path, recursive=true) rather than FileUtils.listStatusRecursively
[ https://issues.apache.org/jira/browse/HIVE-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-16003: Description: {{FileUtils.listStatusRecursively}} can be slow on blobstores because {{listStatus}} calls are applied recursively to a given directory. This can be especially bad on tables with multiple levels of partitioning. The {{FileSystem}} API provides an optimized API called {{listFiles(path, recursive)}} that can be used to invoke an optimized recursive directory listing. The problem is that the {{listFiles(path, recursive)}} API doesn't provide a option to pass in a {{PathFilter}}, while {{FileUtils.listStatusRecursively}} uses a custom HIDDEN_FILES_PATH_FILTER. To fix this we could either: 1: Modify the FileSystem API to provide a {{listFiles(path, recursive, PathFilter)}} method (probably the cleanest solution) 2: Add conditional logic so that blobstores invoke {{listFiles(path, recursive)}} and the rest of the code uses the current implementation of {{FileUtils.listStatusRecursively}} 3: Replace the implementation of {{FileUtils.listStatusRecursively}} with {{listFiles(path, recursive)}} and apply the {{PathFilter}} on the results (not sure what optimizations can be made if {{PathFilter}} objects are passed into {{FileSystem}} methods - maybe {{PathFilter}} objects are pushed to the NameNode?) was: {{FileUtils.listStatusRecursively}} can be slow on blobstores because {{listStatus}} calls are applied recursively to a given directory. This can be especially bad on tables with multiple levels of partitioning. The {{FileSystem}} API provides an optimized API called {{listFiles(path, recursive)}} that can be used to invoke an optimized recursive directory listing. The problem is that the {{listFiles(path, recursive)}} API doesn't provide a option to pass in a {{PathFilter}}, while {{FileUtils.listStatusRecursively}} uses a custom HIDDEN_FILES_PATH_FILTER. 
To fix this we could either: 1: Modify the FileSystem API to provide a {{listFiles(path, recursive, PathFilter)}} method 2: Add conditional logic so that blobstores invoke {{listFiles(path, recursive)}} and the rest of the code uses the current implementation of {{FileUtils.listStatusRecursively}} 3: Replace the implementation of {{FileUtils.listStatusRecursively}} with {{listFiles(path, recursive)}} and apply the {{PathFilter}} on the results > Blobstores should use fs.listFiles(path, recursive=true) rather than > FileUtils.listStatusRecursively > > > Key: HIVE-16003 > URL: https://issues.apache.org/jira/browse/HIVE-16003 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > {{FileUtils.listStatusRecursively}} can be slow on blobstores because > {{listStatus}} calls are applied recursively to a given directory. This can > be especially bad on tables with multiple levels of partitioning. > The {{FileSystem}} API provides an optimized API called {{listFiles(path, > recursive)}} that can be used to invoke an optimized recursive directory > listing. > The problem is that the {{listFiles(path, recursive)}} API doesn't provide a > option to pass in a {{PathFilter}}, while {{FileUtils.listStatusRecursively}} > uses a custom HIDDEN_FILES_PATH_FILTER. > To fix this we could either: > 1: Modify the FileSystem API to provide a {{listFiles(path, recursive, > PathFilter)}} method (probably the cleanest solution) > 2: Add conditional logic so that blobstores invoke {{listFiles(path, > recursive)}} and the rest of the code uses the current implementation of > {{FileUtils.listStatusRecursively}} > 3: Replace the implementation of {{FileUtils.listStatusRecursively}} with > {{listFiles(path, recursive)}} and apply the {{PathFilter}} on the results > (not sure what optimizations can be made if {{PathFilter}} objects are passed > into {{FileSystem}} methods - maybe {{PathFilter}} objects are pushed to the > NameNode?) 
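Option 3 above -- one recursive listing, then the hidden-file filter applied client-side -- can be sketched in miniature with the JDK's own file API (java.nio.file stands in for Hadoop's FileSystem.listFiles(path, true) here; names are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveListing {
    // Stand-in for Hive's HIDDEN_FILES_PATH_FILTER: skip names starting
    // with '.' or '_' (e.g. _SUCCESS, .hidden).
    static boolean isVisible(Path p) {
        String name = p.getFileName().toString();
        return !name.startsWith(".") && !name.startsWith("_");
    }

    // One recursive traversal, with the filter applied to the results --
    // the analogue of calling fs.listFiles(path, true) and filtering the
    // returned iterator.
    static List<Path> listVisibleFiles(Path root) throws IOException {
        try (Stream<Path> s = Files.walk(root)) {
            return s.filter(Files::isRegularFile)
                    .filter(RecursiveListing::isVisible)
                    .collect(Collectors.toList());
        }
    }
}
```

One semantic difference worth noting for option 3: filtering the flat result list only inspects the leaf name, whereas the recursive listStatus version can prune hidden directories during traversal, so files under a hidden directory would need every path component checked.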
[jira] [Assigned] (HIVE-16003) Blobstores should use fs.listFiles(path, recursive=true) rather than FileUtils.listStatusRecursively
[ https://issues.apache.org/jira/browse/HIVE-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-16003: --- > Blobstores should use fs.listFiles(path, recursive=true) rather than > FileUtils.listStatusRecursively > > > Key: HIVE-16003 > URL: https://issues.apache.org/jira/browse/HIVE-16003 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > {{FileUtils.listStatusRecursively}} can be slow on blobstores because > {{listStatus}} calls are applied recursively to a given directory. This can > be especially bad on tables with multiple levels of partitioning. > The {{FileSystem}} API provides an optimized API called {{listFiles(path, > recursive)}} that can be used to invoke an optimized recursive directory > listing. > The problem is that the {{listFiles(path, recursive)}} API doesn't provide a > option to pass in a {{PathFilter}}, while {{FileUtils.listStatusRecursively}} > uses a custom HIDDEN_FILES_PATH_FILTER. > To fix this we could either: > 1: Modify the FileSystem API to provide a {{listFiles(path, recursive, > PathFilter)}} method > 2: Add conditional logic so that blobstores invoke {{listFiles(path, > recursive)}} and the rest of the code uses the current implementation of > {{FileUtils.listStatusRecursively}} > 3: Replace the implementation of {{FileUtils.listStatusRecursively}} with > {{listFiles(path, recursive)}} and apply the {{PathFilter}} on the results -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15954) LLAP: some Tez INFO logs are too noisy
[ https://issues.apache.org/jira/browse/HIVE-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877531#comment-15877531 ] Siddharth Seth commented on HIVE-15954: --- Did this disable all logging from the classes mentioned? That's a little too much. The annoying lines are under a separate logger, and just those can be disabled. > LLAP: some Tez INFO logs are too noisy > -- > > Key: HIVE-15954 > URL: https://issues.apache.org/jira/browse/HIVE-15954 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-15954.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15880) Allow insert overwrite query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877528#comment-15877528 ] Hive QA commented on HIVE-15880: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853833/HIVE-15880.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=211) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3683/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3683/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3683/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12853833 - PreCommit-HIVE-Build > Allow insert overwrite query to use auto.purge table property > - > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-15880.01.patch > > > It seems inconsistent that the auto.purge property is not considered when we do an > INSERT OVERWRITE while it is when we do a DROP TABLE. > DROP TABLE doesn't move table data to Trash when auto.purge is set to true: > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE moves the table data to Trash even when auto.purge is > set to true: > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > 
/user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS, they can be a significant > overhead on slow FileSystems like S3. Honoring auto.purge could improve the performance of > {{INSERT OVERWRITE TABLE}} queries, especially when there are a large number of > partitions on tables located on S3, should the user wish to set the auto.purge > property to true. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15989) Incorrect rounding in decimal data types
[ https://issues.apache.org/jira/browse/HIVE-15989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877519#comment-15877519 ] Nikesh commented on HIVE-15989: --- I do not have an ORC file right now; I have attached the CSV file which contains the sample record. > Incorrect rounding in decimal data types > > > Key: HIVE-15989 > URL: https://issues.apache.org/jira/browse/HIVE-15989 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Reporter: Nikesh >Priority: Critical > Attachments: ANA_AUTO_E.csv > > > I have a numeric field in a file in my data lake and created a Hive external > table pointing to this field. The field value is > 0. but when I fetch this record > using the query it displays only 0.. I tried the DECIMAL and > DOUBLE data types but nothing worked. Is this a bug, or am I not using the > right data type for this? > Thanks, > Nikesh -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15989) Incorrect rounding in decimal data types
[ https://issues.apache.org/jira/browse/HIVE-15989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikesh updated HIVE-15989: -- Attachment: ANA_AUTO_E.csv > Incorrect rounding in decimal data types > > > Key: HIVE-15989 > URL: https://issues.apache.org/jira/browse/HIVE-15989 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Reporter: Nikesh >Priority: Critical > Attachments: ANA_AUTO_E.csv > > > I have a numeric field in a file in my data lake and created a Hive external > table pointing to this field. The field value is > 0. but when I fetch this record > using the query it displays only 0.. I tried the DECIMAL and > DOUBLE data types but nothing worked. Is this a bug, or am I not using the > right data type for this? > Thanks, > Nikesh -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877500#comment-15877500 ] Subramanyam Pattipaka commented on HIVE-15947: -- [~kiran.kolli], I have addressed the comments you provided. [~thejas], can you please share any comments you have? I have added unit tests for threads getting killed and interrupted, and tried to simulate killed threads using shutdownNow(). Is there a better way to simulate kill-thread behavior for a WebHCat request and verify it? > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug >Reporter: Subramanyam Pattipaka >Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation > requests. It simply accepts and tries to run all operations. If many concurrent > job submit requests arrive, the time to submit job > operations can increase significantly. Templeton uses HDFS to store the staging > file for a job. If HDFS can't respond to the large number of requests and > throttles, job submission can take a very long time, on the order of > minutes. > This behavior may not be suitable for all applications; client > applications may expect predictable, low response times for successful > requests, or a throttle response telling the client to wait some time before > re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new job > 2) Get job status > 3) List jobs > These three operations have different complexity due to variance in their use of > cluster resources like YARN/HDFS. 
> The idea is to introduce a new config templeton.job.submit.exec.max-procs > which controls the maximum number of concurrent active job submissions within > Templeton, and use this config to achieve better response times. If a new job > submission request sees that templeton.job.submit.exec.max-procs jobs are already > being submitted concurrently, the request will fail with HTTP error 503 with the reason >“Too many concurrent job submission requests received. Please wait for > some time before retrying.” > > The client is expected to catch this response and retry after waiting for > some time. The default value for the config > templeton.job.submit.exec.max-procs is ‘0’, which means job > submission requests are always accepted by default; the behavior needs to be enabled > based on requirements. > We can have similar behavior for the Status and List operations with the configs > templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs > respectively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
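The admission rule described above — allow at most templeton.job.submit.exec.max-procs concurrent submissions, reject the rest with HTTP 503, and treat 0 as unlimited — can be sketched with a semaphore. This is an illustrative sketch only, not the actual Templeton implementation; the class and method names are hypothetical.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a max-procs admission gate for job submissions.
public class JobThrottle {
    public static final int HTTP_OK = 200;
    public static final int HTTP_TOO_BUSY = 503;

    private final Semaphore slots;
    private final boolean unlimited;

    public JobThrottle(int maxProcs) {
        // maxProcs == 0 mirrors the proposed default: no throttling at all.
        this.unlimited = maxProcs <= 0;
        this.slots = new Semaphore(Math.max(maxProcs, 1));
    }

    /** Returns 200 if the submission may proceed, 503 if over the limit. */
    public int tryBegin() {
        if (unlimited) {
            return HTTP_OK;
        }
        // Non-blocking: never queue the request, fail fast instead.
        return slots.tryAcquire() ? HTTP_OK : HTTP_TOO_BUSY;
    }

    /** Must be called when a throttled submission finishes. */
    public void end() {
        if (!unlimited) {
            slots.release();
        }
    }

    public static void main(String[] args) {
        JobThrottle t = new JobThrottle(1);
        System.out.println(t.tryBegin()); // 200: first request admitted
        System.out.println(t.tryBegin()); // 503: second rejected while first runs
        t.end();
        System.out.println(t.tryBegin()); // 200: slot is free again
    }
}
```

A client seeing 503 would back off and retry, matching the retry guidance in the description.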
[jira] [Updated] (HIVE-15947) Enhance Templeton service job operations reliability
[ https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramanyam Pattipaka updated HIVE-15947: - Attachment: HIVE-15947.2.patch Incorporated review comments. > Enhance Templeton service job operations reliability > > > Key: HIVE-15947 > URL: https://issues.apache.org/jira/browse/HIVE-15947 > Project: Hive > Issue Type: Bug >Reporter: Subramanyam Pattipaka >Assignee: Subramanyam Pattipaka > Attachments: HIVE-15947.2.patch, HIVE-15947.patch > > > Currently the Templeton service doesn't restrict the number of job operation > requests. It simply accepts and tries to run all operations. If many concurrent > job submit requests arrive, the time to submit job > operations can increase significantly. Templeton uses HDFS to store the staging > file for a job. If HDFS can't respond to the large number of requests and > throttles, job submission can take a very long time, on the order of > minutes. > This behavior may not be suitable for all applications; client > applications may expect predictable, low response times for successful > requests, or a throttle response telling the client to wait some time before > re-requesting the job operation. > In this JIRA, I am trying to address the following job operations: > 1) Submit new job > 2) Get job status > 3) List jobs > These three operations have different complexity due to variance in their use of > cluster resources like YARN/HDFS. > The idea is to introduce a new config templeton.job.submit.exec.max-procs > which controls the maximum number of concurrent active job submissions within > Templeton, and use this config to achieve better response times. If a new job > submission request sees that templeton.job.submit.exec.max-procs jobs are already > being submitted concurrently, the request will fail with HTTP error 503 with the reason >“Too many concurrent job submission requests received. 
Please wait for > some time before retrying.” > > The client is expected to catch this response and retry after waiting for > some time. The default value for the config > templeton.job.submit.exec.max-procs is set to ‘0’. This means by default job > submission requests are always accepted. The behavior needs to be enabled > based on requirements. > We can have similar behavior for Status and List operations with configs > templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs > respectively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877475#comment-15877475 ] Hive QA commented on HIVE-15999: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853815/HIVE-15999.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3682/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3682/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3682/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12853815 - PreCommit-HIVE-Build > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. 
The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
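The suspected root cause above — a shared HiveConf mutated by one test (e.g. setting HIVE_TXN_MANAGER to DummyTxnManager) and never restored — is a common source of ordering-dependent flakiness. A minimal, illustrative snapshot/restore guard is sketched below; it uses a plain Map to stand in for the configuration object, which is an assumption for illustration, not Hive's actual test code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative pattern: snapshot shared config before each test and
// restore it afterwards, so an override cannot leak into later tests.
public class ConfGuard {
    private final Map<String, String> conf;   // stand-in for the real conf object
    private Map<String, String> snapshot;

    public ConfGuard(Map<String, String> conf) {
        this.conf = conf;
    }

    /** Call from a @Before method: remember the clean state. */
    public void before() {
        snapshot = new HashMap<>(conf);
    }

    /** Call from an @After method: undo anything the test changed. */
    public void after() {
        conf.clear();
        conf.putAll(snapshot);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
        ConfGuard guard = new ConfGuard(conf);
        guard.before();
        // A test overrides the txn manager but forgets to switch it back...
        conf.put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager");
        guard.after();
        // ...yet the next test still sees the original value.
        System.out.println(conf.get("hive.txn.manager"));
    }
}
```

With this pattern, a forgotten reset in testDummyTxnManagerOnAcidTable()-style tests would no longer poison subsequent setUp() runs.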
[jira] [Commented] (HIVE-15847) In Progress update refreshes seem slow
[ https://issues.apache.org/jira/browse/HIVE-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877464#comment-15877464 ] anishek commented on HIVE-15847: Thanks for the review [~thejas] > In Progress update refreshes seem slow > -- > > Key: HIVE-15847 > URL: https://issues.apache.org/jira/browse/HIVE-15847 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.2.0 >Reporter: anishek >Assignee: anishek > Fix For: 2.2.0 > > Attachments: current_response.mov, HIVE-15847.1.patch, > HIVE-15847.2.patch > > > After HIVE-15473, the refresh rates for in place progress bar seems to be > slow on hive cli. > As pointed out by [~prasanth_j] > {quote} > The refresh rate is slow. Following video will show it > before patch: https://asciinema.org/a/2fgcncxg5gjavcpxt6lfb8jg9 > after patch: https://asciinema.org/a/2tht5jf6l9b2dc3ylt5gtztqg > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Status: Open (was: Patch Available) > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, > HIVE-15955.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Attachment: HIVE-15955.03.patch > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, > HIVE-15955.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Status: Patch Available (was: Open) > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, > HIVE-15955.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
[ https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877436#comment-15877436 ] Hive QA commented on HIVE-15991: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853813/HIVE-15991.txt {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join16] (batchId=111) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join21] (batchId=111) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_8] (batchId=111) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3681/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3681/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3681/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12853813 - PreCommit-HIVE-Build > Flaky Test: TestEncryptedHDFSCliDriver > encryption_join_with_different_encryption_keys > - > > Key: HIVE-15991 > URL: https://issues.apache.org/jira/browse/HIVE-15991 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15991.txt > > > I ran a git-bisect and it seems HIVE-15703 started causing this failure. Not > entirely sure why, but I updated the .out file and the diff is pretty > straightforward, so I think it's safe to just update it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol
[ https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15958: - Attachment: HIVE-15958.2.patch [~sseth] also added clearing AMNodeInfo on query completion. Can you please take another look? > LLAP: IPC connections are not being reused for umbilical protocol > - > > Key: HIVE-15958 > URL: https://issues.apache.org/jira/browse/HIVE-15958 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Prasanth Jayachandran > Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch > > > During concurrency testing, observed 1000s of ipc thread creations. Ideally, > the connections to same hosts should be reused. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-1626) stop using java.util.Stack
[ https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-1626: - Status: Patch Available (was: Open) > stop using java.util.Stack > -- > > Key: HIVE-1626 > URL: https://issues.apache.org/jira/browse/HIVE-1626 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 0.7.0 >Reporter: John Sichi >Assignee: Teddy Choi > Attachments: HIVE-1626.2.patch > > > We currently use Stack as part of the generic node walking library. Stack > should not be used for this since its inheritance from Vector incurs > superfluous synchronization overhead. > Most projects end up adding an ArrayStack implementation and using that > instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-1626) stop using java.util.Stack
[ https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-1626: - Attachment: HIVE-1626.2.patch 125 files are changed. Most of the files are subclasses of NodeProcessor and Dispatcher; they now use Deque instead of Stack. However, there were dozens of Stack.get(int) calls, which ArrayDeque does not support. I implemented Utils.get(Deque, int) for them using Deque.descendingIterator(), which impacts GC. > stop using java.util.Stack > -- > > Key: HIVE-1626 > URL: https://issues.apache.org/jira/browse/HIVE-1626 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 0.7.0 >Reporter: John Sichi >Assignee: Teddy Choi > Attachments: HIVE-1626.2.patch > > > We currently use Stack as part of the generic node walking library. Stack > should not be used for this since its inheritance from Vector incurs > superfluous synchronization overhead. > Most projects end up adding an ArrayStack implementation and using that > instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
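The helper described in the comment above — emulating Stack.get(int), which indexes from the bottom of the stack, on a Deque used via push()/pop() — can be sketched as follows. This is illustrative only; the actual Utils class in the patch may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Sketch of a Stack.get(int) replacement for a Deque used as a stack.
public final class DequeUtils {
    private DequeUtils() {}

    public static <T> T get(Deque<T> deque, int index) {
        if (index < 0 || index >= deque.size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        // Deque.push() adds at the head, so the bottom of the stack is the
        // tail; descendingIterator() walks tail-to-head, i.e. bottom-to-top,
        // matching Stack's Vector-based indexing (get(0) == bottom element).
        Iterator<T> it = deque.descendingIterator();
        T item = it.next();
        for (int i = 0; i < index; i++) {
            item = it.next();
        }
        return item;
    }

    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("bottom");
        stack.push("middle");
        stack.push("top");
        System.out.println(get(stack, 0)); // bottom — same as Stack.get(0)
    }
}
```

The GC note in the comment follows from this shape: each lookup allocates an iterator and walks from the bottom, whereas Vector-backed Stack.get(int) was a direct array access.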
[jira] [Commented] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks
[ https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877374#comment-15877374 ] Hive QA commented on HIVE-14901: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853818/HIVE-14901.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10253 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hive.jdbc.authorization.TestJdbcMetadataApiAuth.org.apache.hive.jdbc.authorization.TestJdbcMetadataApiAuth (batchId=218) org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthUDFBlacklist.testBlackListedUdfUsage (batchId=217) org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAllowedCommands (batchId=218) org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAuthorization1 (batchId=218) org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage (batchId=218) org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testConfigWhiteList (batchId=218) org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary.testAuthorization1 (batchId=229) org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthHttp.testAuthorization1 (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3680/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3680/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3680/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12853818 - PreCommit-HIVE-Build > HiveServer2: Use user supplied fetch size to determine #rows serialized in > tasks > > > Key: HIVE-14901 > URL: https://issues.apache.org/jira/browse/HIVE-14901 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Norris Lee > Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, > HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch > > > Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide > the max number of rows that we write in tasks. However, we should ideally use > the user supplied value (which can be extracted from the > ThriftCLIService.FetchResults' request parameter) to decide how many rows to > serialize in a blob in the tasks. We should however use > {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on > it, so that we don't go OOM in tasks and HS2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
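The sizing rule in the description above — honor the client-supplied fetch size from the FetchResults request, but keep hive.server2.thrift.resultset.max.fetch.size as an upper bound so tasks and HS2 don't go OOM — reduces to a clamp. A minimal sketch with illustrative names (not the actual HiveServer2 code):

```java
// Hypothetical sketch of the fetch-size clamp discussed above.
public class FetchSize {
    /**
     * requested: the client-supplied fetch size (<= 0 meaning "not set");
     * serverMax: hive.server2.thrift.resultset.max.fetch.size.
     */
    public static int effectiveFetchSize(int requested, int serverMax) {
        if (requested <= 0) {
            return serverMax;               // fall back to the server limit
        }
        return Math.min(requested, serverMax); // never exceed the upper bound
    }

    public static void main(String[] args) {
        System.out.println(effectiveFetchSize(500, 10000));   // 500
        System.out.println(effectiveFetchSize(50000, 10000)); // 10000
    }
}
```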
[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877304#comment-15877304 ] Hive QA commented on HIVE-15955: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853824/HIVE-15955.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 111 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input4] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join0] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel_join0] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[plan_json] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] (batchId=38) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_dpp] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_semijoin] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_5] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_1] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_2] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_3] (batchId=143) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_4] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_5] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[empty_join] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_cache] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_without_gby] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_parquet_types] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_timestamp] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[windowing_gby] (batchId=146) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=94) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=94) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[multi_count_distinct] (batchId=93) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query12] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query13] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) 
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query15] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query17] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query18] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query19] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query1] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query20] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query21] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query22] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query25] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query26]
[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877258#comment-15877258 ] Eugene Koifman commented on HIVE-15999: --- could you explain why this is causing a derby error? > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-15946) Failing test : TestCliDriver cbo_rp_auto_join1
[ https://issues.apache.org/jira/browse/HIVE-15946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li resolved HIVE-15946. --- Resolution: Resolved Fix Version/s: 2.2.0 Fixed in HIVE-15948 > Failing test : TestCliDriver cbo_rp_auto_join1 > -- > > Key: HIVE-15946 > URL: https://issues.apache.org/jira/browse/HIVE-15946 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.2.0 >Reporter: Thejas M Nair > Fix For: 2.2.0 > > > Started failing in master around Feb 14 2017. > {code} > at org.junit.Assert.fail(Assert.java:88) > at org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2204) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:186) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF
[ https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16002: Status: Patch Available (was: Open) > Correlated IN subquery with aggregate asserts in sq_count_check UDF > --- > > Key: HIVE-16002 > URL: https://issues.apache.org/jira/browse/HIVE-16002 > Project: Hive > Issue Type: Bug >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16002.1.patch > > > Reproducer > {code:SQL} > create table t(i int, j int); > insert into t values(0,1), (0,2); > create table tt(i int, j int); > insert into tt values(0,3); > select * from t where i IN (select count(i) from tt where tt.j = t.j); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF
[ https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16002: --- Attachment: HIVE-16002.1.patch > Correlated IN subquery with aggregate asserts in sq_count_check UDF > --- > > Key: HIVE-16002 > URL: https://issues.apache.org/jira/browse/HIVE-16002 > Project: Hive > Issue Type: Bug >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16002.1.patch > > > Reproducer > {code:SQL} > create table t(i int, j int); > insert into t values(0,1), (0,2); > create table tt(i int, j int); > insert into tt values(0,3); > select * from t where i IN (select count(i) from tt where tt.j = t.j); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF
[ https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-16002: -- > Correlated IN subquery with aggregate asserts in sq_count_check UDF > --- > > Key: HIVE-16002 > URL: https://issues.apache.org/jira/browse/HIVE-16002 > Project: Hive > Issue Type: Bug >Reporter: Vineet Garg >Assignee: Vineet Garg > > ==Reproducer== > {code:SQL} > create table t(i int, j int); > insert into t values(0,1), (0,2); > create table tt(i int, j int); > insert into tt values(0,3); > select * from t where i IN (select count(i) from tt where tt.j = t.j); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF
[ https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16002: --- Description: Reproducer {code:SQL} create table t(i int, j int); insert into t values(0,1), (0,2); create table tt(i int, j int); insert into tt values(0,3); select * from t where i IN (select count(i) from tt where tt.j = t.j); {code} was: ==Reproducer== {code:SQL} create table t(i int, j int); insert into t values(0,1), (0,2); create table tt(i int, j int); insert into tt values(0,3); select * from t where i IN (select count(i) from tt where tt.j = t.j); {code} > Correlated IN subquery with aggregate asserts in sq_count_check UDF > --- > > Key: HIVE-16002 > URL: https://issues.apache.org/jira/browse/HIVE-16002 > Project: Hive > Issue Type: Bug >Reporter: Vineet Garg >Assignee: Vineet Garg > > Reproducer > {code:SQL} > create table t(i int, j int); > insert into t values(0,1), (0,2); > create table tt(i int, j int); > insert into tt values(0,3); > select * from t where i IN (select count(i) from tt where tt.j = t.j); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
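As a cross-check on what the reproducer above should return (this models only the query's semantics, not Hive's execution path or the asserting UDF), the correlated IN subquery can be evaluated by hand:

```python
# Rows from the reproducer: t(i, j) and tt(i, j).
t = [(0, 1), (0, 2)]
tt = [(0, 3)]

def correlated_in(t_rows, tt_rows):
    """For each row of t, evaluate
         i IN (select count(i) from tt where tt.j = t.j).
    count(i) counts non-NULL i values among the matching tt rows; the
    subquery always yields exactly one value, so IN reduces to
    equality against that count."""
    result = []
    for (i, j) in t_rows:
        cnt = sum(1 for (ti, tj) in tt_rows if tj == j and ti is not None)
        if i == cnt:
            result.append((i, j))
    return result
```

For the reproducer's data, no tt row has j equal to 1 or 2, so the count is 0 for both rows of t, and i = 0 matches — both rows should come back rather than triggering an assertion.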
[jira] [Updated] (HIVE-15938) position alias in order by fails for union queries
[ https://issues.apache.org/jira/browse/HIVE-15938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15938: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master after removing the test-time logging. Thanks for the reviews! After discussing with [~ashutoshc], it doesn't seem like the other patch will make this significantly easier. Perhaps when that is also committed we can simplify this if there's some obvious solution. > position alias in order by fails for union queries > -- > > Key: HIVE-15938 > URL: https://issues.apache.org/jira/browse/HIVE-15938 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-15938.01.patch, HIVE-15938.02.patch, > HIVE-15938.03.patch, HIVE-15938.04.patch, HIVE-15938.patch > > > The problem is that the query introduces a spurious ALLCOLREF -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877225#comment-15877225 ] Wei Zheng commented on HIVE-15999: -- [~ekoifman] Can you take a look please? > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877220#comment-15877220 ] Hive QA commented on HIVE-15999: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853815/HIVE-15999.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[0] (batchId=173) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalXY (batchId=173) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTimestamp (batchId=173) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3678/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3678/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3678/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12853815 - PreCommit-HIVE-Build > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-1555: - Attachment: HIVE-1555.6.patch Thanks [~brocknoland]. Fixed cleanupResources to close all 3 even in the case of exception. > JDBC Storage Handler > > > Key: HIVE-1555 > URL: https://issues.apache.org/jira/browse/HIVE-1555 > Project: Hive > Issue Type: New Feature > Components: JDBC >Reporter: Bob Robertson >Assignee: Gunther Hagleitner > Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, > HIVE-1555.6.patch, JDBCStorageHandler Design Doc.pdf > > Original Estimate: 24h > Remaining Estimate: 24h > > With the Cassandra and HBase Storage Handlers I thought it would make sense > to include a generic JDBC RDBMS Storage Handler so that you could import a > standard DB table into Hive. Many people must want to perform HiveQL joins, > etc against tables in other systems etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-1555: - Status: Patch Available (was: Open) > JDBC Storage Handler > > > Key: HIVE-1555 > URL: https://issues.apache.org/jira/browse/HIVE-1555 > Project: Hive > Issue Type: New Feature > Components: JDBC >Reporter: Bob Robertson >Assignee: Gunther Hagleitner > Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, > HIVE-1555.6.patch, JDBCStorageHandler Design Doc.pdf > > Original Estimate: 24h > Remaining Estimate: 24h > > With the Cassandra and HBase Storage Handlers I thought it would make sense > to include a generic JDBC RDBMS Storage Handler so that you could import a > standard DB table into Hive. Many people must want to perform HiveQL joins, > etc against tables in other systems etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau
[ https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877214#comment-15877214 ] Ashutosh Chauhan commented on HIVE-12492: - I am not sure if patch is working as intended. For 2-way join, A DHJ is selected with 2 CUSTOM_SIMPLE_EDGE going into Reducer where join operator is of type MapJoin. None of the test cases in attached patch has plans of that shape. See: ql/src/test/results/clientpositive/llap/tez_dynpart_hashjoin_*.q.out > MapJoin: 4 million unique integers seems to be a probe plateau > -- > > Key: HIVE-12492 > URL: https://issues.apache.org/jira/browse/HIVE-12492 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12492.patch > > > After 4 million keys, the map-join implementation seems to suffer from a > performance degradation. > The hashtable build & probe time makes this very inefficient, even if the > data is very compact (i.e 2 ints). > Falling back onto the shuffle join or bucket map-join is useful after 2^22 > items. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-1555: - Status: Open (was: Patch Available) > JDBC Storage Handler > > > Key: HIVE-1555 > URL: https://issues.apache.org/jira/browse/HIVE-1555 > Project: Hive > Issue Type: New Feature > Components: JDBC >Reporter: Bob Robertson >Assignee: Gunther Hagleitner > Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, > JDBCStorageHandler Design Doc.pdf > > Original Estimate: 24h > Remaining Estimate: 24h > > With the Cassandra and HBase Storage Handlers I thought it would make sense > to include a generic JDBC RDBMS Storage Handler so that you could import a > standard DB table into Hive. Many people must want to perform HiveQL joins, > etc against tables in other systems etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15626) beeline should not exit after canceling the query on ctrl-c
[ https://issues.apache.org/jira/browse/HIVE-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877166#comment-15877166 ] Vihang Karajgaonkar commented on HIVE-15626: Hi [~leftylev] I updated the BeeLine wiki which documents this behavior https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Cancellingthequery Please let me know if you need any changes in the text. Thanks! > beeline should not exit after canceling the query on ctrl-c > --- > > Key: HIVE-15626 > URL: https://issues.apache.org/jira/browse/HIVE-15626 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Sergey Shelukhin >Assignee: Vihang Karajgaonkar > Fix For: 2.2.0 > > Attachments: HIVE-15626.01.patch > > > I am seeing this in 1.2 -- This message was sent by Atlassian JIRA (v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877127#comment-15877127 ] Ashutosh Chauhan commented on HIVE-15955: - Latest patch looks good. Some minor comments on RB. > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max
[ https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877112#comment-15877112 ] Hive QA commented on HIVE-15881: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853804/HIVE-15881.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10252 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_14] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_15] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin8] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby5_map_skew] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_reorder] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[sample1] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[skewjoinopt4] (batchId=106) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union10] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union34] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=106) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=211) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211) {noformat} Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/3677/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3677/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3677/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12853804 - PreCommit-HIVE-Build > Use new thread count variable name instead of mapred.dfsclient.parallelism.max > -- > > Key: HIVE-15881 > URL: https://issues.apache.org/jira/browse/HIVE-15881 > Project: Hive > Issue Type: Task > Components: Query Planning >Reporter: Sergio Peña >Assignee: Sergio Peña >Priority: Minor > Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch > > > The Utilities class has two methods, {{getInputSummary}} and > {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} > to get the summary of a list of input locations in parallel. These methods > are Hive related, but the variable name does not look it is specific for Hive. > Also, the above variable is not on HiveConf nor used anywhere else. I just > found a reference on the Hadoop MR1 code. > I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, > and use a different variable name, such as > {{hive.get.input.listing.num.threads}}, that reflects the intention of the > variable. The removal of the old variable might happen on Hive 3.x -- This message was sent by Atlassian JIRA (v6.3.15#6346)
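The deprecation proposed above usually follows a read-new-key-first, fall-back-to-old-key pattern. A minimal sketch of that lookup, using the two key names from the proposal (the helper function and default value are illustrative, not Hive's actual code):

```python
DEFAULT_LISTING_THREADS = 0  # illustrative default, meaning "no override"

def get_input_listing_threads(conf):
    """Prefer the proposed hive.get.input.listing.num.threads, but fall
    back to the deprecated mapred.dfsclient.parallelism.max so existing
    deployments keep working until the old key is removed in 3.x."""
    new_val = conf.get("hive.get.input.listing.num.threads")
    if new_val is not None:
        return int(new_val)
    old_val = conf.get("mapred.dfsclient.parallelism.max")
    if old_val is not None:
        # A real implementation would also log a deprecation warning here.
        return int(old_val)
    return DEFAULT_LISTING_THREADS
```

When both keys are set, the new name wins, which lets users migrate a config file one property at a time.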
[jira] [Updated] (HIVE-16001) add test for merge + runtime filtering
[ https://issues.apache.org/jira/browse/HIVE-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16001: -- Description: make sure merge works with HIVE-15802 and HIVE-15269 add to sqlmerge.q cc [~jdere] was:make sure merge works with HIVE-15802 and HIVE-15269 > add test for merge + runtime filtering > -- > > Key: HIVE-16001 > URL: https://issues.apache.org/jira/browse/HIVE-16001 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > make sure merge works with HIVE-15802 and HIVE-15269 > add to sqlmerge.q > cc [~jdere] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16001) add test for merge + runtime filtering
[ https://issues.apache.org/jira/browse/HIVE-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-16001: - > add test for merge + runtime filtering > -- > > Key: HIVE-16001 > URL: https://issues.apache.org/jira/browse/HIVE-16001 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > make sure merge works with HIVE-15802 and HIVE-15269 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15880) Allow insert overwrite query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-15880: --- Status: Patch Available (was: Open) > Allow insert overwrite query to use auto.purge table property > - > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-15880.01.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > 
/user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15880) Allow insert overwrite query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-15880: --- Attachment: HIVE-15880.01.patch Adding the first version of the patch to trigger the pre-commit. Will be submitting second version with additional test cases > Allow insert overwrite query to use auto.purge table property > - > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-15880.01.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs 
dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true -- This message was sent by Atlassian JIRA (v6.3.15#6346)
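The behavior change proposed above — skip the move-to-Trash step during INSERT OVERWRITE when the table sets auto.purge=true — can be sketched as follows. The in-memory file system and method names are illustrative stand-ins, not Hive's MoveTask or Hadoop's FileSystem API:

```python
class FakeFS:
    """Minimal in-memory stand-in for a file system with a Trash."""
    def __init__(self, files):
        self.files = {d: list(fs) for d, fs in files.items()}
        self.trash = []
    def list(self, d): return list(self.files.get(d, []))
    def delete(self, d, f): self.files[d].remove(f)
    def move_to_trash(self, d, f):
        self.files[d].remove(f)
        self.trash.append(f)
    def put(self, d, f): self.files.setdefault(d, []).append(f)

def overwrite_table_dir(fs, table_dir, tbl_props, new_files):
    """Replace a table directory's contents. With auto.purge=true the
    old files are deleted outright; otherwise they first move to Trash,
    which is the extra (and on S3, costly) rename being avoided."""
    skip_trash = tbl_props.get("auto.purge", "false").lower() == "true"
    for f in fs.list(table_dir):
        if skip_trash:
            fs.delete(table_dir, f)         # permanent delete, no Trash hop
        else:
            fs.move_to_trash(table_dir, f)  # default behavior today
    for f in new_files:
        fs.put(table_dir, f)
```

This mirrors the DROP TABLE semantics quoted in the description: with auto.purge set, nothing lands under .Trash.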
[jira] [Updated] (HIVE-15934) Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
[ https://issues.apache.org/jira/browse/HIVE-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15934: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks Zoltan for the review! > Downgrade Maven surefire plugin from 2.19.1 to 2.18.1 > - > > Key: HIVE-15934 > URL: https://issues.apache.org/jira/browse/HIVE-15934 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 2.2.0 > > Attachments: HIVE-15934.1.patch > > > Surefire 2.19.1 has some issue > (https://issues.apache.org/jira/browse/SUREFIRE-1255) which caused debugging > session to abort after a short period of time. Many IntelliJ users have seen > this, although it looks fine for Eclipse users. Version 2.18.1 works fine. > We'd better make the change to not impact the development for IntelliJ guys. > We can upgrade again once the root cause is figured out. > cc [~kgyrtkirk] [~ashutoshc] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15931) JDBC: Improve logging when using ZooKeeper
[ https://issues.apache.org/jira/browse/HIVE-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877034#comment-15877034 ] Hive QA commented on HIVE-15931: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853800/HIVE-15931.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211) org.apache.hive.jdbc.TestJdbcDriver2.testBadURL (batchId=215) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3676/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3676/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3676/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12853800 - PreCommit-HIVE-Build > JDBC: Improve logging when using ZooKeeper > -- > > Key: HIVE-15931 > URL: https://issues.apache.org/jira/browse/HIVE-15931 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 2.2.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-15931.1.patch, HIVE-15931.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15934) Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
[ https://issues.apache.org/jira/browse/HIVE-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877012#comment-15877012 ] ASF GitHub Bot commented on HIVE-15934: --- GitHub user weiatwork opened a pull request: https://github.com/apache/hive/pull/152 HIVE-15934 : Downgrade Maven surefire plugin from 2.19.1 to 2.18.1 (W… …ei Zheng, reviewed by Zoltan Haindrich) You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/hive HIVE-15934 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/152.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #152 commit cc9085617b8749b8eb0a69fb893133ac04915eb8 Author: Wei Zheng Date: 2017-02-21T23:31:51Z HIVE-15934 : Downgrade Maven surefire plugin from 2.19.1 to 2.18.1 (Wei Zheng, reviewed by Zoltan Haindrich) > Downgrade Maven surefire plugin from 2.19.1 to 2.18.1 > - > > Key: HIVE-15934 > URL: https://issues.apache.org/jira/browse/HIVE-15934 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15934.1.patch > > > Surefire 2.19.1 has some issue > (https://issues.apache.org/jira/browse/SUREFIRE-1255) which caused debugging > session to abort after a short period of time. Many IntelliJ users have seen > this, although it looks fine for Eclipse users. Version 2.18.1 works fine. > We'd better make the change to not impact the development for IntelliJ guys. > We can upgrade again once the root cause is figured out. > cc [~kgyrtkirk] [~ashutoshc] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Status: Open (was: Patch Available) > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Status: Patch Available (was: Open) > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Attachment: HIVE-15955.02.patch > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc
[ https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15955: --- Summary: make explain formatted to include opId and etc (was: make explain formatted to dump JSONObject when user level explain is on) > make explain formatted to include opId and etc > -- > > Key: HIVE-15955 > URL: https://issues.apache.org/jira/browse/HIVE-15955 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15955.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function
[ https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876956#comment-15876956 ] Carter Shanklin commented on HIVE-15996: [~julianhyde] I don't think so; GROUP_ID seems to depend on the input data (I'm no Oracle expert to be fair here) while this is converting the bitmasks to numbers. So I could select all of grouping(c1, c2, c3); grouping(c1, c3) and grouping(c2, c3) with different numbers per row, whereas GROUP_ID doesn't seem to take any arguments. > Implement multiargument GROUPING function > - > > Key: HIVE-15996 > URL: https://issues.apache.org/jira/browse/HIVE-15996 > Project: Hive > Issue Type: New Feature >Affects Versions: 2.2.0 >Reporter: Carter Shanklin >Assignee: Jesus Camacho Rodriguez > > Per the SQL standard section 6.9: > GROUPING ( CR1, ..., CRN-1, CRN ) > is equivalent to: > CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT ) > So for example: > select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, > c3); > Should be allowed and equivalent to: > select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from > e011_02 group by rollup(c1, c2, c3); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
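[Editor's note: the recursive definition above is just binary positional weighting. A small illustrative sketch (plain Python, not Hive code) showing how the multi-argument GROUPING value is composed from the single-argument bits:]

```python
# Sketch: compose a multi-argument GROUPING value from per-column bits.
# grouping_bit(c) is 1 when column c is aggregated away in the current
# grouping set and 0 when it is present; here the bits are supplied
# directly as ints for illustration.

def grouping(*bits):
    # GROUPING(c1, ..., cn) = 2 * GROUPING(c1, ..., cn-1) + GROUPING(cn),
    # i.e. the bits read as a binary number with c1 most significant.
    value = 0
    for b in bits:
        value = 2 * value + b
    return value

# For rollup(c1, c2, c3), the grouping set that keeps only c1 has
# c2 and c3 aggregated away:
print(grouping(0, 1, 1))  # -> 3, same as 4*0 + 2*1 + 1*1
```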
[jira] [Updated] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
[ https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-15991: Description: I ran a git-bisect and it seems HIVE-15703 started causing this failure. Not entirely sure why, but I updated the .out file and the diff is pretty straightforward, so I think it's safe to just update it. > Flaky Test: TestEncryptedHDFSCliDriver > encryption_join_with_different_encryption_keys > - > > Key: HIVE-15991 > URL: https://issues.apache.org/jira/browse/HIVE-15991 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15991.txt > > > I ran a git-bisect and it seems HIVE-15703 started causing this failure. Not > entirely sure why, but I updated the .out file and the diff is pretty > straightforward, so I think it's safe to just update it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15570) LLAP: Exception in HostAffinitySplitLocationProvider when running in container mode
[ https://issues.apache.org/jira/browse/HIVE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876933#comment-15876933 ] Hive QA commented on HIVE-15570: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853219/HIVE-15570.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3675/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3675/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3675/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12853219 - PreCommit-HIVE-Build > LLAP: Exception in HostAffinitySplitLocationProvider when running in > container mode > --- > > Key: HIVE-15570 > URL: https://issues.apache.org/jira/browse/HIVE-15570 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Zhiyuan Yang >Priority: Minor > Attachments: HIVE-15570.1.patch, HIVE-15570.2.patch, > HIVE-15570.3.patch > > > Sometimes user might prefer to run with "hive.execution.mode=container" mode > when LLAP is stopped. 
If the Hive config for LLAP had > "hive.llap.client.consistent.splits=true" on the client side, it would end up > throwing the following exception in {{Utils.java}}. > {noformat} > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68) > ... 25 more > Caused by: java.lang.IllegalStateException: > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider needs at > least 1 location to function > at > com.google.common.base.Preconditions.checkState(Preconditions.java:149) > at > org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider.<init>(HostAffinitySplitLocationProvider.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:54) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:121) > ... 30 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
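[Editor's note: the exception arises because the host-affinity provider requires at least one registered LLAP location, but in container mode none exist. A hypothetical sketch of the failure mode and the fall-back a caller might apply; names and the fall-back behavior are assumptions, not the actual Hive fix:]

```python
# Hypothetical sketch: a host-affinity provider needs at least one
# known location, so callers must fall back when running in container
# mode (no LLAP daemons registered in the service registry).

class HostAffinityProvider:
    def __init__(self, locations):
        # Mirrors the Preconditions.checkState guard in the stack trace.
        if not locations:
            raise ValueError("needs at least 1 location to function")
        self.locations = locations

def get_split_location_provider(consistent_splits, llap_locations):
    # Fall back to the default (no affinity) rather than blow up when
    # consistent splits are requested but no daemons are available.
    if consistent_splits and llap_locations:
        return HostAffinityProvider(llap_locations)
    return None  # stands in for the default locality-based provider

print(get_split_location_provider(True, []))  # -> None, no exception
```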
[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max
[ https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876900#comment-15876900 ] Thomas Poepping commented on HIVE-15881: Hey [~spena], updated the RB. Just one question, otherwise non-binding +1 > Use new thread count variable name instead of mapred.dfsclient.parallelism.max > -- > > Key: HIVE-15881 > URL: https://issues.apache.org/jira/browse/HIVE-15881 > Project: Hive > Issue Type: Task > Components: Query Planning >Reporter: Sergio Peña >Assignee: Sergio Peña >Priority: Minor > Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch > > > The Utilities class has two methods, {{getInputSummary}} and > {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} > to get the summary of a list of input locations in parallel. These methods > are Hive related, but the variable name does not look like it is specific to Hive. > Also, the above variable is not in HiveConf nor used anywhere else. I just > found a reference in the Hadoop MR1 code. > I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, > and use a different variable name, such as > {{hive.get.input.listing.num.threads}}, that reflects the intention of the > variable. The removal of the old variable might happen in Hive 3.x -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max
[ https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876900#comment-15876900 ] Thomas Poepping edited comment on HIVE-15881 at 2/21/17 10:42 PM: -- Hey [~spena], updated the RB. Just one question, otherwise non-binding +1 pending QA was (Author: poeppt): Hey [~spena], updated the RB. Just one question, otherwise non-binding +1 > Use new thread count variable name instead of mapred.dfsclient.parallelism.max > -- > > Key: HIVE-15881 > URL: https://issues.apache.org/jira/browse/HIVE-15881 > Project: Hive > Issue Type: Task > Components: Query Planning >Reporter: Sergio Peña >Assignee: Sergio Peña >Priority: Minor > Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch > > > The Utilities class has two methods, {{getInputSummary}} and > {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} > to get the summary of a list of input locations in parallel. These methods > are Hive related, but the variable name does not look like it is specific to Hive. > Also, the above variable is not in HiveConf nor used anywhere else. I just > found a reference in the Hadoop MR1 code. > I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, > and use a different variable name, such as > {{hive.get.input.listing.num.threads}}, that reflects the intention of the > variable. The removal of the old variable might happen in Hive 3.x -- This message was sent by Atlassian JIRA (v6.3.15#6346)
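[Editor's note: the usual pattern for this kind of rename is to read the new key and fall back to the deprecated one with a warning. An illustrative sketch in plain Python; the key name is the one proposed above, and the default of 10 is an assumption, not Hive's actual default:]

```python
# Sketch of a deprecated-config-key fallback. The new key name is the
# one proposed in the issue; the default value of 10 is an assumption.
DEPRECATED_KEY = "mapred.dfsclient.parallelism.max"
NEW_KEY = "hive.get.input.listing.num.threads"

def get_listing_threads(conf):
    if NEW_KEY in conf:
        return int(conf[NEW_KEY])
    if DEPRECATED_KEY in conf:
        # Honor the old key for now, but warn so users migrate
        # before it is removed (e.g. in Hive 3.x).
        print("WARN: %s is deprecated, use %s" % (DEPRECATED_KEY, NEW_KEY))
        return int(conf[DEPRECATED_KEY])
    return 10  # assumed default

print(get_listing_threads({NEW_KEY: "4"}))  # -> 4
```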
[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks
[ https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Norris Lee updated HIVE-14901: -- Status: In Progress (was: Patch Available) > HiveServer2: Use user supplied fetch size to determine #rows serialized in > tasks > > > Key: HIVE-14901 > URL: https://issues.apache.org/jira/browse/HIVE-14901 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Norris Lee > Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, > HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch > > > Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide > the max number of rows that we write in tasks. However, we should ideally use > the user supplied value (which can be extracted from the > ThriftCLIService.FetchResults' request parameter) to decide how many rows to > serialize in a blob in the tasks. We should however use > {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on > it, so that we don't go OOM in tasks and HS2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks
[ https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Norris Lee updated HIVE-14901: -- Attachment: HIVE-14901.4.patch > HiveServer2: Use user supplied fetch size to determine #rows serialized in > tasks > > > Key: HIVE-14901 > URL: https://issues.apache.org/jira/browse/HIVE-14901 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Norris Lee > Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, > HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch > > > Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide > the max number of rows that we write in tasks. However, we should ideally use > the user supplied value (which can be extracted from the > ThriftCLIService.FetchResults' request parameter) to decide how many rows to > serialize in a blob in the tasks. We should however use > {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on > it, so that we don't go OOM in tasks and HS2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks
[ https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Norris Lee updated HIVE-14901: -- Status: Patch Available (was: In Progress) > HiveServer2: Use user supplied fetch size to determine #rows serialized in > tasks > > > Key: HIVE-14901 > URL: https://issues.apache.org/jira/browse/HIVE-14901 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Norris Lee > Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, > HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch > > > Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide > the max number of rows that we write in tasks. However, we should ideally use > the user supplied value (which can be extracted from the > ThriftCLIService.FetchResults' request parameter) to decide how many rows to > serialize in a blob in the tasks. We should however use > {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on > it, so that we don't go OOM in tasks and HS2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
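[Editor's note: the behavior proposed above amounts to serializing min(user-requested size, server-side cap) rows per blob. A minimal sketch of that arithmetic; the function name and the fall-back for missing/invalid requests are assumptions:]

```python
# Sketch: cap the user-supplied fetch size by the server-side maximum
# (hive.server2.thrift.resultset.max.fetch.size) so a large client
# request cannot OOM tasks or HS2.
def effective_fetch_size(user_requested, server_max):
    if user_requested is None or user_requested <= 0:
        return server_max  # assumption: fall back to the server cap
    return min(user_requested, server_max)

print(effective_fetch_size(5000, 10000))   # -> 5000
print(effective_fetch_size(50000, 10000))  # -> 10000
```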
[jira] [Updated] (HIVE-15971) LLAP: logs urls should use daemon container id instead of fake container id
[ https://issues.apache.org/jira/browse/HIVE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15971: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master > LLAP: logs urls should use daemon container id instead of fake container id > --- > > Key: HIVE-15971 > URL: https://issues.apache.org/jira/browse/HIVE-15971 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15971.1.patch, HIVE-15971.2.patch, > HIVE-15971.3.patch, HIVE-15971.4.patch > > > The containerId used for log url generation is fake. It should be replaced by > the container id of the llap daemon. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15971) LLAP: logs urls should use daemon container id instead of fake container id
[ https://issues.apache.org/jira/browse/HIVE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876869#comment-15876869 ] Prasanth Jayachandran commented on HIVE-15971: -- The test failures are not related to this patch and have been failing already. > LLAP: logs urls should use daemon container id instead of fake container id > --- > > Key: HIVE-15971 > URL: https://issues.apache.org/jira/browse/HIVE-15971 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15971.1.patch, HIVE-15971.2.patch, > HIVE-15971.3.patch, HIVE-15971.4.patch > > > The containerId used for log url generation is fake. It should be replaced by > the container id of the llap daemon. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15971) LLAP: logs urls should use daemon container id instead of fake container id
[ https://issues.apache.org/jira/browse/HIVE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876851#comment-15876851 ] Prasanth Jayachandran commented on HIVE-15971: -- [~sseth] Thanks for the review. Created HIVE-16000 for follow up. Will check if the test failures are related before commit. > LLAP: logs urls should use daemon container id instead of fake container id > --- > > Key: HIVE-15971 > URL: https://issues.apache.org/jira/browse/HIVE-15971 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15971.1.patch, HIVE-15971.2.patch, > HIVE-15971.3.patch, HIVE-15971.4.patch > > > The containerId used for log url generation is fake. It should be replaced by > the container id of the llap daemon. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16000) LLAP: LLAP log urls improvements
[ https://issues.apache.org/jira/browse/HIVE-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16000: - Description: Follow up for HIVE-15971 (based on https://issues.apache.org/jira/browse/HIVE-15971?focusedCommentId=15876814=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15876814) 1) Make NodeManager web address port available via ServiceInstance or something better (other than reading from configuration) 2) When llap node goes down log URL cannot be constructed since we rely on information from service registry. Instead YARN NodeId can be extended to provide the necessary information (container id) for constructing the log url. was: Follow up for HIVE-15971 1) Make NodeManager web address port available via ServiceInstance or something better (other than reading from configuration) 2) When llap node goes down log URL cannot be constructed since we rely on information from service registry. Instead YARN NodeId can be extended to provide the necessary information (container id) for constructing the log url. > LLAP: LLAP log urls improvements > > > Key: HIVE-16000 > URL: https://issues.apache.org/jira/browse/HIVE-16000 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran > > Follow up for HIVE-15971 (based on > https://issues.apache.org/jira/browse/HIVE-15971?focusedCommentId=15876814=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15876814) > 1) Make NodeManager web address port available via ServiceInstance or > something better (other than reading from configuration) > 2) When llap node goes down log URL cannot be constructed since we rely on > information from service registry. Instead YARN NodeId can be extended to > provide the necessary information (container id) for constructing the log url. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15999: - Description: Right now there is test flakiness wrt. TestDbTxnManager2. The error is like this: {code} java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) at org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) at org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) {code} The failure is due to HiveConf used in the test being polluted by some test, e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. was: Right now there is test flakiness wrt. TestDbTxnManager2. The error is like this: {code} org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks Error Details Table/View 'TXNS' already exists in Schema 'APP'. {code} The failure is due to HiveConf used in the test being polluted by some test, e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. 
> Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75) > at > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90) > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
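[Editor's note: a common fix for this kind of shared-config pollution is to snapshot the config in setUp and restore it in tearDown so one test's override cannot leak into the next. An illustrative sketch in plain Python, not the actual Hive patch; the dict stands in for HiveConf:]

```python
# Sketch of the fix pattern: snapshot a shared config in setUp and
# restore it in tearDown so one test's override (e.g. switching the
# txn manager to DummyTxnManager) cannot leak into later tests.
import unittest

SHARED_CONF = {"hive.txn.manager": "DbTxnManager"}  # stands in for HiveConf

class TxnManagerTest(unittest.TestCase):
    def setUp(self):
        self._saved = dict(SHARED_CONF)  # snapshot before each test

    def tearDown(self):
        SHARED_CONF.clear()
        SHARED_CONF.update(self._saved)  # restore, undoing any pollution

    def test_dummy_txn_manager(self):
        SHARED_CONF["hive.txn.manager"] = "DummyTxnManager"
        self.assertEqual(SHARED_CONF["hive.txn.manager"], "DummyTxnManager")
```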
[jira] [Updated] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15999: - Attachment: HIVE-15999.1.patch > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks > Error Details > Table/View 'TXNS' already exists in Schema 'APP'. > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15999: - Status: Patch Available (was: Open) > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15999.1.patch > > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks > Error Details > Table/View 'TXNS' already exists in Schema 'APP'. > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
[ https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876837#comment-15876837 ] Ashutosh Chauhan commented on HIVE-15991: - +1 > Flaky Test: TestEncryptedHDFSCliDriver > encryption_join_with_different_encryption_keys > - > > Key: HIVE-15991 > URL: https://issues.apache.org/jira/browse/HIVE-15991 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15991.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
[ https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15991: Status: Patch Available (was: Open) > Flaky Test: TestEncryptedHDFSCliDriver > encryption_join_with_different_encryption_keys > - > > Key: HIVE-15991 > URL: https://issues.apache.org/jira/browse/HIVE-15991 > Project: Hive > Issue Type: Sub-task >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15991.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15999) Fix flakiness in TestDbTxnManager2
[ https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng reassigned HIVE-15999: > Fix flakiness in TestDbTxnManager2 > -- > > Key: HIVE-15999 > URL: https://issues.apache.org/jira/browse/HIVE-15999 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > > Right now there is test flakiness wrt. TestDbTxnManager2. The error is like > this: > {code} > org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks > Error Details > Table/View 'TXNS' already exists in Schema 'APP'. > {code} > The failure is due to HiveConf used in the test being polluted by some test, > e.g. in testDummyTxnManagerOnAcidTable(), conf entry HIVE_TXN_MANAGER is set > to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876829#comment-15876829 ] Hive QA commented on HIVE-15844: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12853791/HIVE-15844.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_non_partitioned] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_partitioned] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_tmp_table] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid2] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_update_delete] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts_special_characters] (batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_non_partitioned] (batchId=7) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_partitioned] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_two_cols] (batchId=19) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_non_partitioned] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_update_delete] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_table_update] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_update] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_table_update] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_after_multiple_inserts] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_non_partitioned] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned] (batchId=148) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_two_cols] (batchId=142) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=93) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.ql.TestTxnCommands2.testMerge2 (batchId=258) org.apache.hadoop.hive.ql.TestTxnCommands2.testMerge3 (batchId=258) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge2 (batchId=268) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge3 (batchId=268) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2 (batchId=266) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3 (batchId=266) org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate (batchId=205) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel
[jira] [Updated] (HIVE-15959) LLAP: fix headroom calculation and move it to daemon
[ https://issues.apache.org/jira/browse/HIVE-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-15959:
------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> LLAP: fix headroom calculation and move it to daemon
> ----------------------------------------------------
>
>                 Key: HIVE-15959
>                 URL: https://issues.apache.org/jira/browse/HIVE-15959
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-15959.01.patch, HIVE-15959.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
[ https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated HIVE-15991:
--------------------------------
    Attachment: HIVE-15991.txt

> Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15991
>                 URL: https://issues.apache.org/jira/browse/HIVE-15991
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15991.txt
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15971) LLAP: logs urls should use daemon container id instead of fake container id
[ https://issues.apache.org/jira/browse/HIVE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876814#comment-15876814 ]

Siddharth Seth commented on HIVE-15971:
---------------------------------------
+1. Looks good.

There are a couple of follow-up items. Reading the port from YarnConfiguration is not great: a port value of 0 means a dynamic port, in which case this completely breaks. We need a follow-up jira to figure out a good way to make the port available, likely published from within the LLAPDaemon container itself.

Other than this, if the llap instance is not found (e.g. the task timed out because llap went down), we won't be able to construct the log URL. We probably need to handle this by retaining the information for some time.

A bunch of this could be simplified if NodeId could be extended in YARN: a LlapNodeId could include information about the container, NM web address, etc. at allocation time.

> LLAP: logs urls should use daemon container id instead of fake container id
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-15971
>                 URL: https://issues.apache.org/jira/browse/HIVE-15971
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15971.1.patch, HIVE-15971.2.patch, HIVE-15971.3.patch, HIVE-15971.4.patch
>
> The containerId used for log url generation is fake. It should be replaced by the container id of the llap daemon.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
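The dynamic-port concern in the comment above can be sketched as follows. This is a minimal, hypothetical illustration (the helper name, URL layout, and container id are assumptions, not Hive's actual code): a configured port of 0 tells the server to bind to a dynamic port, so any log URL built from the configured value would point nowhere, and the caller has to bail out until the daemon publishes its actual bound port.

```java
public class LlapLogUrlSketch {
    /**
     * Hypothetical helper: builds a container-log URL from a node manager web
     * address and a port taken from configuration. A port of 0 (dynamic) or a
     * negative port cannot produce a usable URL, so we return null rather
     * than emit a bogus link; the real fix would be publishing the actual
     * bound port from within the daemon itself.
     */
    static String buildLogUrl(String nmWebAddress, int port, String containerId) {
        if (port <= 0) {
            // Dynamic or invalid port: the configured value is meaningless here.
            return null;
        }
        return "http://" + nmWebAddress + ":" + port + "/node/containerlogs/" + containerId;
    }

    public static void main(String[] args) {
        // A concrete port yields a URL; a dynamic (0) port yields null.
        System.out.println(buildLogUrl("nm-host", 8042, "container_e01_01_000002"));
        System.out.println(buildLogUrl("nm-host", 0, "container_e01_01_000002"));
    }
}
```

Returning null mirrors the "won't be able to construct the log URL" case discussed above; a caller would then fall back to retained information or omit the link.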
[jira] [Commented] (HIVE-15867) Add blobstore tests for import/export
[ https://issues.apache.org/jira/browse/HIVE-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876820#comment-15876820 ]

Sahil Takiar commented on HIVE-15867:
-------------------------------------
No worries, just checking in. Thanks for the update!

> Add blobstore tests for import/export
> -------------------------------------
>
>                 Key: HIVE-15867
>                 URL: https://issues.apache.org/jira/browse/HIVE-15867
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Thomas Poepping
>            Assignee: Thomas Poepping
>
> This patch covers ten separate tests of import and export operations running against blobstore filesystems:
> * Import addpartition
> ** blobstore -> file
> ** file -> blobstore
> ** blobstore -> blobstore
> ** blobstore -> hdfs
> * import/export
> ** blobstore -> file
> ** file -> blobstore
> ** blobstore -> blobstore (partitioned and non-partitioned)
> ** blobstore -> HDFS (partitioned and non-partitioned)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15934) Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
[ https://issues.apache.org/jira/browse/HIVE-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876819#comment-15876819 ]

Zoltan Haindrich commented on HIVE-15934:
-----------------------------------------
[~wzheng] I totally agree; we should downgrade surefire, since currently there is no better alternative. +1

> Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
> -----------------------------------------------------
>
>                 Key: HIVE-15934
>                 URL: https://issues.apache.org/jira/browse/HIVE-15934
>             Project: Hive
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.2.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-15934.1.patch
>
> Surefire 2.19.1 has an issue (https://issues.apache.org/jira/browse/SUREFIRE-1255) which causes debugging sessions to abort after a short period of time. Many IntelliJ users have seen this, although it looks fine for Eclipse users. Version 2.18.1 works fine. We'd better make the change so as not to impact development for the IntelliJ users. We can upgrade again once the root cause is figured out.
> cc [~kgyrtkirk] [~ashutoshc]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
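A downgrade like the one proposed above is a one-line version change in the build's plugin configuration. A sketch of the relevant pom fragment (the surrounding layout is illustrative; the actual location in Hive's pom may differ):

```xml
<!-- Pin surefire to 2.18.1 until SUREFIRE-1255 is fixed upstream;
     2.19.1 aborts IntelliJ debugging sessions. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.18.1</version>
</plugin>
```

Pinning in `pluginManagement` (rather than per-module) keeps all modules on the same surefire version, which makes the later re-upgrade a single edit as well.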
[jira] [Commented] (HIVE-15934) Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
[ https://issues.apache.org/jira/browse/HIVE-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876791#comment-15876791 ]

Wei Zheng commented on HIVE-15934:
----------------------------------
Ping [~ashutoshc]..

> Downgrade Maven surefire plugin from 2.19.1 to 2.18.1
> -----------------------------------------------------
>
>                 Key: HIVE-15934
>                 URL: https://issues.apache.org/jira/browse/HIVE-15934
>             Project: Hive
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.2.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-15934.1.patch
>
> Surefire 2.19.1 has an issue (https://issues.apache.org/jira/browse/SUREFIRE-1255) which causes debugging sessions to abort after a short period of time. Many IntelliJ users have seen this, although it looks fine for Eclipse users. Version 2.18.1 works fine. We'd better make the change so as not to impact development for the IntelliJ users. We can upgrade again once the root cause is figured out.
> cc [~kgyrtkirk] [~ashutoshc]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15867) Add blobstore tests for import/export
[ https://issues.apache.org/jira/browse/HIVE-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876786#comment-15876786 ]

Thomas Poepping commented on HIVE-15867:
----------------------------------------
Hi [~stakiar], yes, still on the radar. My colleague is working on a patch now; once it passes our internal review he'll attach it here. Sorry for the wait.

> Add blobstore tests for import/export
> -------------------------------------
>
>                 Key: HIVE-15867
>                 URL: https://issues.apache.org/jira/browse/HIVE-15867
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Thomas Poepping
>            Assignee: Thomas Poepping
>
> This patch covers ten separate tests of import and export operations running against blobstore filesystems:
> * Import addpartition
> ** blobstore -> file
> ** file -> blobstore
> ** blobstore -> blobstore
> ** blobstore -> hdfs
> * import/export
> ** blobstore -> file
> ** file -> blobstore
> ** blobstore -> blobstore (partitioned and non-partitioned)
> ** blobstore -> HDFS (partitioned and non-partitioned)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15971) LLAP: logs urls should use daemon container id instead of fake container id
[ https://issues.apache.org/jira/browse/HIVE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876733#comment-15876733 ]

Hive QA commented on HIVE-15971:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853789/HIVE-15971.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10251 tests executed

*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] (batchId=133)
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure (batchId=210)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[0] (batchId=173)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalXY (batchId=173)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3673/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3673/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3673/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853789 - PreCommit-HIVE-Build

> LLAP: logs urls should use daemon container id instead of fake container id
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-15971
>                 URL: https://issues.apache.org/jira/browse/HIVE-15971
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15971.1.patch, HIVE-15971.2.patch, HIVE-15971.3.patch, HIVE-15971.4.patch
>
> The containerId used for log url generation is fake. It should be replaced by the container id of the llap daemon.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)