[jira] [Updated] (HIVE-16573) In-place update for HoS can't be disabled
[ https://issues.apache.org/jira/browse/HIVE-16573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-16573: --- Attachment: HIVE-16573-branch2.3.patch > In-place update for HoS can't be disabled > - > > Key: HIVE-16573 > URL: https://issues.apache.org/jira/browse/HIVE-16573 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Bing Li >Priority: Minor > Attachments: HIVE-16573-branch2.3.patch > > > {{hive.spark.exec.inplace.progress}} has no effect -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16573) In-place update for HoS can't be disabled
[ https://issues.apache.org/jira/browse/HIVE-16573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036565#comment-16036565 ]

Bing Li edited comment on HIVE-16573 at 6/5/17 5:53 AM:

Hi, [~ruili] and [~anishek]
It seems that we can't import the class SessionState into InPlaceUpdate.java: doing so causes a module cycle error at compile time (hive-common->hive-exec->hive-common).
I changed it as below:
{quote}
String engine = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE);
boolean inPlaceUpdates = false;
if (engine.equals("tez"))
  inPlaceUpdates = HiveConf.getBoolVar(conf, HiveConf.ConfVars.TEZ_EXEC_INPLACE_PROGRESS);
if (engine.equals("spark"))
  inPlaceUpdates = HiveConf.getBoolVar(conf, HiveConf.ConfVars.SPARK_EXEC_INPLACE_PROGRESS);
{quote}
Do you think this is OK?

> In-place update for HoS can't be disabled
> -
>
> Key: HIVE-16573
> URL: https://issues.apache.org/jira/browse/HIVE-16573
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Rui Li
> Assignee: Bing Li
> Priority: Minor
>
>
> {{hive.spark.exec.inplace.progress}} has no effect

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
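For readers without a Hive checkout, the engine-dependent lookup proposed in the comment can be sketched without HiveConf. This is only an illustration under stated assumptions: a plain Map stands in for the configuration object, the class and method names are hypothetical, and the "true" defaults for the two progress properties are an assumption rather than something stated in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class InPlaceProgressSketch {
    /**
     * Picks the in-place progress flag that belongs to the active engine.
     * The Map is a stand-in for HiveConf; defaults below are assumptions.
     */
    static boolean inPlaceUpdatesEnabled(Map<String, String> conf) {
        String engine = conf.getOrDefault("hive.execution.engine", "mr");
        switch (engine) {
            case "tez":
                return Boolean.parseBoolean(
                        conf.getOrDefault("hive.tez.exec.inplace.progress", "true"));
            case "spark":
                return Boolean.parseBoolean(
                        conf.getOrDefault("hive.spark.exec.inplace.progress", "true"));
            default:
                // Any other engine: no in-place updates, as in the quoted snippet.
                return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.execution.engine", "spark");
        conf.put("hive.spark.exec.inplace.progress", "false");
        // With the Spark flag consulted, disabling it takes effect.
        System.out.println(inPlaceUpdatesEnabled(conf));
    }
}
```

The point of the sketch is that the Spark property is read only when the engine is Spark, which is the behavior the issue reports as missing.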
[jira] [Work started] (HIVE-16573) In-place update for HoS can't be disabled
[ https://issues.apache.org/jira/browse/HIVE-16573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-16573 started by Bing Li. -- > In-place update for HoS can't be disabled > - > > Key: HIVE-16573 > URL: https://issues.apache.org/jira/browse/HIVE-16573 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Bing Li >Priority: Minor > > {{hive.spark.exec.inplace.progress}} has no effect -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16573) In-place update for HoS can't be disabled
[ https://issues.apache.org/jira/browse/HIVE-16573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036565#comment-16036565 ] Bing Li commented on HIVE-16573: Hi, [~ruili] and [~anishek] Seems that we can't import class SessionState into InPlaceUpdate.java, it will cause module cycles error during compiling, which is hive-common->hive-exec->hive-common. I changed it as below: String engine = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE); boolean inPlaceUpdates = false; if (engine.equals("tez")) inPlaceUpdates = HiveConf.getBoolVar(conf, HiveConf.ConfVars.TEZ_EXEC_INPLACE_PROGRESS); if (engine.equals("spark")) inPlaceUpdates = HiveConf.getBoolVar(conf, HiveConf.ConfVars.SPARK_EXEC_INPLACE_PROGRESS); Do you think is ok? > In-place update for HoS can't be disabled > - > > Key: HIVE-16573 > URL: https://issues.apache.org/jira/browse/HIVE-16573 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Bing Li >Priority: Minor > > {{hive.spark.exec.inplace.progress}} has no effect -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036554#comment-16036554 ]

liyunzhang_intel commented on HIVE-11297:
-
[~csun]: thanks for the review. I have replied to you on the review board.
bq. Seems this removes the extra map work after it was generated. Is there a way to avoid generating the map work in the first place?

The physical operator tree is split at each Spark partition pruning sink.
Original tree:
{noformat}
TS[1]-FIL[17]-RS[4]-JOIN[5]
             -SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
             -SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}
After the split by Spark partition pruning sink:
{noformat}
TS[1]-FIL[17]-RS[4]-JOIN[5]
TS[1]-FIL[17]-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
TS[1]-FIL[17]-SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}
If we want to avoid generating multiple map works (
{noformat}
TS[1]-FIL[17]-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20],
TS[1]-FIL[17]-SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}
), we need to remove the rule for Spark dynamic partition pruning. If we remove that rule, an exception will be thrown because the remaining tree will not be in a MapWork (
{noformat}
-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
-SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}
). The rule in question is:
{code}
opRules.put(new RuleRegExp("Split Work - SparkPartitionPruningSink",
    SparkPartitionPruningSinkOperator.getOperatorName() + "%"), genSparkWork);
{code}
If you have any ideas about this, please give me your suggestions.
> Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition columns, multiple operator trees > are created, which all start from the same table scan op, but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees and so don't have to do > table scan multiple times. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
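The split described in the comment above can be illustrated with a toy tree. This is a minimal sketch, not Hive's planner code: the Op class, splitIntoChains, and sampleTree are hypothetical names introduced here, and operators are modeled as plain labeled nodes. Enumerating every root-to-leaf chain reproduces the three trees listed after the split, each repeating the shared TS[1]-FIL[17] prefix, which is the repeated table scan the issue wants to avoid.

```java
import java.util.ArrayList;
import java.util.List;

public class OpTreeSplitSketch {
    /** Minimal stand-in for an operator node; not Hive's Operator class. */
    static final class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        Op add(Op... kids) {
            for (Op k : kids) children.add(k);
            return this;
        }
    }

    /** Returns one linear chain per leaf; each chain repeats the shared prefix. */
    static List<String> splitIntoChains(Op root) {
        List<String> out = new ArrayList<>();
        walk(root, "", out);
        return out;
    }

    private static void walk(Op op, String prefix, List<String> out) {
        String path = prefix.isEmpty() ? op.name : prefix + "-" + op.name;
        if (op.children.isEmpty()) {
            out.add(path);
            return;
        }
        for (Op child : op.children) walk(child, path, out);
    }

    /** Builds the original tree from the comment: one join branch, two sink branches. */
    static Op sampleTree() {
        Op join = new Op("RS[4]").add(new Op("JOIN[5]"));
        Op sink20 = new Op("SEL[18]").add(new Op("GBY[19]").add(new Op("SPARKPRUNINGSINK[20]")));
        Op sink23 = new Op("SEL[21]").add(new Op("GBY[22]").add(new Op("SPARKPRUNINGSINK[23]")));
        return new Op("TS[1]").add(new Op("FIL[17]").add(join, sink20, sink23));
    }

    public static void main(String[] args) {
        for (String chain : splitIntoChains(sampleTree())) {
            System.out.println(chain);
        }
    }
}
```

In this toy model TS[1] appears in three chains; combining the two sink branches under one shared scan would reduce that to two.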
[jira] [Issue Comment Deleted] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianguo Tian updated HIVE-11297: Comment: was deleted (was: [~csun]: thanks for review, reply you on review board.) > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition columns, multiple operator trees > are created, which all start from the same table scan op, but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees and so don't have to do > table scan multiple times. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036540#comment-16036540 ] Jianguo Tian commented on HIVE-11297: - [~csun]: thanks for review, reply you on review board. > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition columns, multiple operator trees > are created, which all start from the same table scan op, but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees and so don't have to do > table scan multiple times. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16780) Case "multiple sources, single key" in spark_dynamic_pruning.q fails
[ https://issues.apache.org/jira/browse/HIVE-16780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036520#comment-16036520 ]

Chao Sun commented on HIVE-16780:
-
Thanks for the findings, [~kellyzly]! I wonder if the issue still happens when {{hive.tez.dynamic.semijoin.reduction}} is set to false. It seems this config affects the Spark branch too, which should not happen. Maybe we should first disable this optimization for Spark in {{DynamicPartitionPruningOptimization}}, which is shared by both engines. In the future we can investigate how to enable this optimization for Spark.

> Case "multiple sources, single key" in spark_dynamic_pruning.q fails
> -
>
> Key: HIVE-16780
> URL: https://issues.apache.org/jira/browse/HIVE-16780
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
>
>
> script.q
> {code}
> set hive.optimize.ppd=true;
> set hive.ppd.remove.duplicatefilters=true;
> set hive.spark.dynamic.partition.pruning=true;
> set hive.optimize.metadataonly=false;
> set hive.optimize.index.filter=true;
> set hive.strict.checks.cartesian.product=false;
> set hive.spark.dynamic.partition.pruning=true;
> -- multiple sources, single key
> select count(*) from srcpart join srcpart_date on (srcpart.ds =
> srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
> {code}
> If "hive.optimize.index.filter" is disabled, the case passes; otherwise it always
> hangs in the first job.
Exception > {code} > 17/05/27 23:39:45 DEBUG Executor task launch worker-0 PerfLogger: method=SparkInitializeOperators start=1495899585574 end=1495899585933 > duration=359 from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler> > 17/05/27 23:39:45 INFO Executor task launch worker-0 Utilities: PLAN PATH = > hdfs://bdpe41:8020/tmp/hive/root/029a2d8a-c6e5-4ea9-adea-ef8fbea3cde2/hive_2017-05-27_23-39-06_464_5915518562441677640-1/-mr-10007/617d9dd6-9f9a-4786-8131-a7b98e8abc3e/map.xml > 17/05/27 23:39:45 DEBUG Executor task launch worker-0 Utilities: Found plan > in cache for name: map.xml > 17/05/27 23:39:45 DEBUG Executor task launch worker-0 DFSClient: Connecting > to datanode 10.239.47.162:50010 > 17/05/27 23:39:45 DEBUG Executor task launch worker-0 MapOperator: Processing > alias(es) srcpart_hour for file > hdfs://bdpe41:8020/user/hive/warehouse/srcpart_hour/08_0 > 17/05/27 23:39:45 DEBUG Executor task launch worker-0 ObjectCache: Creating > root_20170527233906_ac2934e1-2e58-4116-9f0d-35dee302d689_DynamicValueRegistry > 17/05/27 23:39:45 ERROR Executor task launch worker-0 SparkMapRecordHandler: > Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: Hive > Runtime Error while processing row {"hr":"11","hour":"11"} > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row {"hr":"11","hour":"11"} > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:562) > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:136) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85) > at > 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at > org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) > at org.apache.spark.scheduler.Task.run(Task.scala:85) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalStateException: Failed to retrieve dynamic value > for RS_7_srcpart__col3_min > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:126) > at >
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036514#comment-16036514 ] Chao Sun commented on HIVE-11297: - Thanks for working on this, [~kellyzly]! Sorry for the delay, but I added some comments in RB. > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition columns, multiple operator trees > are created, which all start from the same table scan op, but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees and so don't have to do > table scan multiple times. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036446#comment-16036446 ] liyunzhang_intel commented on HIVE-11297: - [~csun],[~Ferd]: can you help review HIVE-11297.1.patch if you have time? > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition columns, multiple operator trees > are created, which all start from the same table scan op, but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees and so don't have to do > table scan multiple times. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE
[ https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036375#comment-16036375 ] Hive QA commented on HIVE-13745: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803597/HIVE-13745.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5528/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5528/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5528/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-06-04 19:51:37.770 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5528/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-06-04 19:51:37.773 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 04bb1ca HIVE-16809 : Improve filter condition for correlated subqueries (Vineet Garg via Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 04bb1ca HIVE-16809 : Improve filter condition for correlated subqueries (Vineet Garg via Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-06-04 19:51:42.271 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUnixTimeStamp.java: No such file or directory The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12803597 - PreCommit-HIVE-Build > UDF current_date、current_timestamp、unix_timestamp NPE > - > > Key: HIVE-13745 > URL: https://issues.apache.org/jira/browse/HIVE-13745 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Biao Wu >Assignee: Biao Wu > Attachments: HIVE-13745.patch > > > NullPointerException when current_date is used in mapreduce -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE
[ https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036369#comment-16036369 ] Vincent Tran commented on HIVE-13745: - [~aihuaxu], Was it ever determined as to why SessionState.get() is NULL here? > UDF current_date、current_timestamp、unix_timestamp NPE > - > > Key: HIVE-13745 > URL: https://issues.apache.org/jira/browse/HIVE-13745 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Biao Wu >Assignee: Biao Wu > Attachments: HIVE-13745.patch > > > NullPointerException when current_date is used in mapreduce -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-3229) null values being loaded as non-null values into Hive
[ https://issues.apache.org/jira/browse/HIVE-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dudu Markovitz resolved HIVE-3229.
--
Resolution: Not A Bug

> null values being loaded as non-null values into Hive
> -
>
> Key: HIVE-3229
> URL: https://issues.apache.org/jira/browse/HIVE-3229
> Project: Hive
> Issue Type: Bug
> Reporter: N Campbell
> Attachments: CERT.TSET1.txt
>
>
> Various tab-delimited input files contain one or more columns that represent
> null values in rows. The data appears to load (without an error such as in
> JIRA 3228); however, the resulting values are now non-null values, which is
> incorrect.
> create table if not exists CERT.TSET1_E ( RNUM int , C1 int, C2 string)
> row format delimited
> fields terminated by '\t'
> stored as textfile;
> create table if not exists CERT.TSET1 ( RNUM int , C1 int, C2 string)
> stored as sequencefile;
> load data local inpath 'CERT.TSET1.txt'
> overwrite into table CERT.TSET1_E;
> insert overwrite table CERT.TSET1 select * from CERT.TSET1_E;

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16768) NOT operator returns NULL from result of <=>
[ https://issues.apache.org/jira/browse/HIVE-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036358#comment-16036358 ] Pengcheng Xiong commented on HIVE-16768: Sounds like the same problem. Could you try 2.1.1 and see if it reappears? Thanks. > NOT operator returns NULL from result of <=> > > > Key: HIVE-16768 > URL: https://issues.apache.org/jira/browse/HIVE-16768 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Alexander Sterligov >Assignee: Fei Hui > > {{SELECT "foo" <=> null;}} > returns {{false}} as expected. > {{SELECT NOT("foo" <=> null);}} > returns NULL, but should return {{true}}. > Workaround is > {{SELECT NOT(COALESCE("foo" <=> null));}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
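The semantics the reporter expects can be modeled in a few lines. This is a sketch of SQL three-valued logic using a nullable Boolean, not Hive's implementation; the class and method names are hypothetical. It shows why NOT applied to a null-safe comparison should never yield NULL, while NOT applied to an ordinary comparison against NULL does.

```java
import java.util.Objects;

public class NullSafeEqualsSketch {
    /** Models SQL <=>: never returns NULL, two NULLs compare equal. */
    static boolean nullSafeEquals(Object a, Object b) {
        return Objects.equals(a, b);
    }

    /** Models ordinary SQL =: NULL (Java null) if either operand is NULL. */
    static Boolean sqlEquals(Object a, Object b) {
        return (a == null || b == null) ? null : Boolean.valueOf(a.equals(b));
    }

    /** Models SQL NOT under three-valued logic: NOT NULL stays NULL. */
    static Boolean not(Boolean v) {
        return v == null ? null : Boolean.valueOf(!v);
    }

    public static void main(String[] args) {
        System.out.println(nullSafeEquals("foo", null));      // "foo" <=> null
        System.out.println(not(nullSafeEquals("foo", null))); // NOT("foo" <=> null)
        System.out.println(not(sqlEquals("foo", null)));      // NOT("foo" = null)
    }
}
```

Because the null-safe comparison always produces a definite boolean, its negation is definite too; only the ordinary comparison propagates NULL through NOT.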
[jira] [Commented] (HIVE-16768) NOT operator returns NULL from result of <=>
[ https://issues.apache.org/jira/browse/HIVE-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036209#comment-16036209 ] Fei Hui commented on HIVE-16768: It is the same as HIVE-15517, [~sterligovak]. Is that right, [~pxiong]? > NOT operator returns NULL from result of <=> > > > Key: HIVE-16768 > URL: https://issues.apache.org/jira/browse/HIVE-16768 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Alexander Sterligov > > {{SELECT "foo" <=> null;}} > returns {{false}} as expected. > {{SELECT NOT("foo" <=> null);}} > returns NULL, but should return {{true}}. > Workaround is > {{SELECT NOT(COALESCE("foo" <=> null));}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16768) NOT operator returns NULL from result of <=>
[ https://issues.apache.org/jira/browse/HIVE-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui reassigned HIVE-16768: -- Assignee: Fei Hui > NOT operator returns NULL from result of <=> > > > Key: HIVE-16768 > URL: https://issues.apache.org/jira/browse/HIVE-16768 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Alexander Sterligov >Assignee: Fei Hui > > {{SELECT "foo" <=> null;}} > returns {{false}} as expected. > {{SELECT NOT("foo" <=> null);}} > returns NULL, but should return {{true}}. > Workaround is > {{SELECT NOT(COALESCE("foo" <=> null));}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16760) Update errata.txt for HIVE-16743
[ https://issues.apache.org/jira/browse/HIVE-16760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036203#comment-16036203 ] Lefty Leverenz commented on HIVE-16760: --- bq. comment in the original jira is sufficient for any errata.txt updates Ah, good point. Okay, I've opened HIVE-16822 to document errata.txt. Thanks [~thejas]. ([~wzheng], sorry for hijacking your jira with this discussion.) > Update errata.txt for HIVE-16743 > > > Key: HIVE-16760 > URL: https://issues.apache.org/jira/browse/HIVE-16760 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-16760.patch > > > Refer to: > https://issues.apache.org/jira/browse/HIVE-16743?focusedCommentId=16024139=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16024139 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036201#comment-16036201 ] Lefty Leverenz commented on HIVE-16764: --- Doc note: NUMERIC is documented in the wiki with version notes in the sections of Hive Data Types that describe DECIMAL. * [Hive Data Types -- Numeric Types | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes] * [Hive Data Types -- Decimals | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-DecimalsdecimalDecimals] * [Hive Data Types -- Decimal Types | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-DecimalTypes] It might be nice to include an example or two (mentioning the version) in Using Decimal Types: * [Hive Data Types -- Using Decimal Types | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-UsingDecimalTypes] > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0, 2.3.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: incompatibleChange > Fix For: 3.0.0 > > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) > This will make Numeric reserved keyword -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036193#comment-16036193 ] Lefty Leverenz commented on HIVE-16654: --- Wow, you're quick. Thanks for the doc, especially the additional explanation. Here's the direct link: * [Configuration Properties -- hive.optimize.countdistinct | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.countdistinct] Removing the TODOC3.0 label. > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
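The rewrite quoted in the issue description replaces avg, count, and count(distinct) over a column with aggregates over a pre-grouped inner query. A small in-memory check of that equivalence is sketched below. It is an illustration only: the class and method names are hypothetical, the input is assumed to contain no NULLs (the query filters on ss_list_price is not null), and integer prices are used so the sums are exact.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class CountDistinctRewriteSketch {
    /** Direct evaluation: {avg(x), count(x), count(distinct x)}. */
    static double[] direct(long[] prices) {
        long sum = 0;
        Set<Long> distinct = new HashSet<>();
        for (long p : prices) {
            sum += p;
            distinct.add(p);
        }
        return new double[] {(double) sum / prices.length, prices.length, distinct.size()};
    }

    /** Rewritten form: group by x first, then aggregate the per-group rows. */
    static double[] rewritten(long[] prices) {
        // Inner query: per distinct price, xc0 = sum(x) and xc1 = count(x).
        Map<Long, long[]> groups = new LinkedHashMap<>();
        for (long p : prices) {
            long[] g = groups.computeIfAbsent(p, k -> new long[2]);
            g[0] += p;
            g[1] += 1;
        }
        // Outer query: sum(xc0) / sum(xc1), sum(xc1), count(1).
        long sumXc0 = 0;
        long sumXc1 = 0;
        long groupRows = 0;
        for (long[] g : groups.values()) {
            sumXc0 += g[0];
            sumXc1 += g[1];
            groupRows++;
        }
        return new double[] {(double) sumXc0 / sumXc1, sumXc1, groupRows};
    }

    public static void main(String[] args) {
        long[] prices = {11, 11, 15, 21, 15, 11};
        System.out.println(Arrays.toString(direct(prices)));
        System.out.println(Arrays.toString(rewritten(prices)));
    }
}
```

Both methods produce the same triple for this input: the distinct count falls out of the number of groups, so no separate distinct pass is needed.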
[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-16654: -- Labels: (was: TODOC3.0) > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036192#comment-16036192 ] Pengcheng Xiong commented on HIVE-16654: [~leftylev], thanks for your attention. I have updated the wiki accordingly. > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16774) Support position in ORDER BY when using SELECT *
[ https://issues.apache.org/jira/browse/HIVE-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036190#comment-16036190 ] Lefty Leverenz commented on HIVE-16774: --- Thanks, good to know. > Support position in ORDER BY when using SELECT * > > > Key: HIVE-16774 > URL: https://issues.apache.org/jira/browse/HIVE-16774 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16774.01.patch, HIVE-16774.02.patch > > > query47.q query57.q -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16372) Enable DDL statement for non-native tables (add/remove table properties)
[ https://issues.apache.org/jira/browse/HIVE-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036188#comment-16036188 ] Lefty Leverenz commented on HIVE-16372: --- Okay, thanks. > Enable DDL statement for non-native tables (add/remove table properties) > > > Key: HIVE-16372 > URL: https://issues.apache.org/jira/browse/HIVE-16372 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16372.01.patch, HIVE-16372.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16774) Support position in ORDER BY when using SELECT *
[ https://issues.apache.org/jira/browse/HIVE-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036187#comment-16036187 ] Pengcheng Xiong commented on HIVE-16774: [~leftylev], we already supported ORDER BY position before this change; this issue is only an improvement, so I think we are OK. > Support position in ORDER BY when using SELECT * > > > Key: HIVE-16774 > URL: https://issues.apache.org/jira/browse/HIVE-16774 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16774.01.patch, HIVE-16774.02.patch > > > query47.q query57.q -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036185#comment-16036185 ] Lefty Leverenz commented on HIVE-16654: --- Doc note: This adds *hive.optimize.countdistinct* to HiveConf.java, so it needs to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC3.0 label. > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16372) Enable DDL statement for non-native tables (add/remove table properties)
[ https://issues.apache.org/jira/browse/HIVE-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036186#comment-16036186 ] Pengcheng Xiong commented on HIVE-16372: [~leftylev], the current version of the wiki does not say that this cannot be done for non-native tables, so I think we are OK. > Enable DDL statement for non-native tables (add/remove table properties) > > > Key: HIVE-16372 > URL: https://issues.apache.org/jira/browse/HIVE-16372 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16372.01.patch, HIVE-16372.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-16654: -- Labels: TODOC3.0 (was: ) > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16774) Support position in ORDER BY when using SELECT *
[ https://issues.apache.org/jira/browse/HIVE-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036182#comment-16036182 ] Lefty Leverenz commented on HIVE-16774: --- Does this need to be documented in the wiki? > Support position in ORDER BY when using SELECT * > > > Key: HIVE-16774 > URL: https://issues.apache.org/jira/browse/HIVE-16774 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16774.01.patch, HIVE-16774.02.patch > > > query47.q query57.q -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16591) DR for function Binaries on HDFS
[ https://issues.apache.org/jira/browse/HIVE-16591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036180#comment-16036180 ] Lefty Leverenz commented on HIVE-16591: --- Doc note: This adds *hive.repl.replica.functions.root.dir* to HiveConf.java, and it's already documented in the wiki -- thanks, [~anishek]! * [Configuration Properties -- hive.repl.replica.functions.root.dir | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.repl.replica.functions.root.dir] > DR for function Binaries on HDFS > - > > Key: HIVE-16591 > URL: https://issues.apache.org/jira/browse/HIVE-16591 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-16591.1.patch, HIVE-16591.2.patch, > HIVE-16591.3.patch > > > # We have to make sure that during incremental dump we don't allow functions > to be copied if they have local filesystem "file://" resources. This depends > on how much system-side work we want to do. We are going to explicitly provide a > caveat for replicating functions, wherein only functions created with the "using" > clause will be replicated, and the "using" clause prohibits creating functions > with the local "file://" resources; hence additional checks when > doing repl dump might not be required. > # We have to make sure that during the bootstrap / incremental dump we append > the namenode host + port if functions are created without the fully > qualified location of the URI on HDFS; it is not clear how this would play for S3 or > WASB filesystems. > # We have to copy the binaries of a function resource list on CREATE / DROP > FUNCTION. The change management file system has to keep a copy of the binary > when DROP FUNCTION is called, to provide the capability of updating the binary > definition for existing functions along with DR. 
An example list of steps > is given in the doc (ReplicateFunctions.pdf) attached to the parent issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)