[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218092#comment-16218092 ] Harish Jaiprakash commented on HIVE-17884: -- Thanks [~prasanth_j]. For show/describe, the current thought process is to just expose the raw data via the information schema for now, and eventually enhance show/describe to display it in some formatted fashion. I'll fix the typos and add the getTriggersForResourcePlan API to IMetaStoreClient. It makes sense to support more data types in the expression parser; I'll add bytes and interval to it.

> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
> Issue Type: Sub-task
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
> Implement triggers for workload management. The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression (variable operator value) with 'AND' and 'OR' support.
> action is currently KILL or MOVE TO pool.
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to the information schema.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
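The trigger condition grammar described above (variable operator value, combined with 'AND' and 'OR') can be sketched as a tiny evaluator. This is an illustrative model only; `evaluate_condition`, the clause tuples, and the counter names are hypothetical, not Hive's implementation:

```python
# Hypothetical sketch of evaluating a WM trigger condition of the form
# "variable operator value [AND|OR variable operator value ...]".
# Names (evaluate_condition, the counters dict, ELAPSED_TIME, SHUFFLE_BYTES)
# are illustrative, not Hive's actual API.
import operator

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}

def evaluate_clause(clause, counters):
    """Evaluate a single 'variable op value' clause against live counters."""
    var, op, value = clause
    return OPS[op](counters[var], value)

def evaluate_condition(clauses, connectors, counters):
    """Fold the clauses left-to-right with the AND/OR connectors."""
    result = evaluate_clause(clauses[0], counters)
    for connector, clause in zip(connectors, clauses[1:]):
        rhs = evaluate_clause(clause, counters)
        result = (result and rhs) if connector == "AND" else (result or rhs)
    return result

# e.g. WHEN ELAPSED_TIME > 1000 AND SHUFFLE_BYTES > 500 DO KILL
counters = {"ELAPSED_TIME": 2000, "SHUFFLE_BYTES": 600}
print(evaluate_condition(
    [("ELAPSED_TIME", ">", 1000), ("SHUFFLE_BYTES", ">", 500)],
    ["AND"], counters))  # True
```

A real parser would also need the byte-suffix and interval literals mentioned in the comment; this sketch only shows the boolean-folding shape.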
[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive
[ https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218084#comment-16218084 ] liyunzhang commented on HIVE-17879: --- [~gopalv]:
{quote}
the test was using build artifacts from JDK8 - running LLAP (only) with JDK9 by overriding the --javaHome param during package builds.
{quote}
Doesn't LLAP need the hadoop dependency? If it does, do you mean it can run successfully with a hadoop package (built with JDK8) in a JDK9 env? I tried this in my env, but it failed.

> Can not find java.sql.date in JDK9 when building hive
> -
>
> Key: HIVE-17879
> URL: https://issues.apache.org/jira/browse/HIVE-17879
> Project: Hive
> Issue Type: Sub-task
> Reporter: liyunzhang
>
> When building hive with JDK9, I got the following error:
> {code}
> [ERROR] Failed to execute goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on project hive-standalone-metastore: Error executing DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer: InvocationTargetException: java/sql/Date: java.sql.Date -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on project hive-standalone-metastore: Error executing DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
> at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
> at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Error executing DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer
> at org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:350)
> at org.datanucleus.maven.AbstractEnhancerMojo.enhance(AbstractEnhancerMojo.java:266)
> at org.datanucleus.maven.AbstractEnhancerMojo.executeDataNucleusTool(AbstractEnhancerMojo.java:72)
> at org.datanucleus.maven.AbstractDataNucleusMojo.execute(AbstractDataNucleusMojo.java:126)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
> ... 20 more
> Caused by: java.lang.reflect.InvocationTargetException
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> at org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:333)
> ... 25 more
> Caused by: java.lang.NoClassDefFoundError: java/sql/Date
> at org.datanucleus.ClassConstants.(ClassConstants.java:66)
> at >
[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218067#comment-16218067 ] Hive QA commented on HIVE-17884: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893737/HIVE-17884.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11322 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=155)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=270)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7460/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893737 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive
[ https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218051#comment-16218051 ] Gopal V commented on HIVE-17879: [~kellyzly]: the test was using build artifacts from JDK8 - running LLAP (only) with JDK9 by overriding the --javaHome param during package builds. The only thing that didn't work there was the misc.Cleaner reference in the LLAP off-heap cache (the cache was therefore disabled for both tests).
[jira] [Comment Edited] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive
[ https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217983#comment-16217983 ] liyunzhang edited comment on HIVE-17879 at 10/25/17 3:31 AM: - [~kgyrtkirk]: thanks for your suggestion, will try. Actually I found that I need to build the hadoop package first. If I use a hadoop pkg built with JDK8 together with a hive pkg built with JDK9, the following exception is thrown:
{code}
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hadoop.util.StringUtils.(StringUtils.java:80)
at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1437)
at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4064)
at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4091)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4294)
at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:4200)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:99)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:83)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:708)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 1
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3116)
at java.base/java.lang.String.substring(String.java:1885)
at org.apache.hadoop.util.Shell.(Shell.java:52)
... 16 more
{code}
[~gopalv]: I saw the performance test data on HIVE-17573. Was that tested with a hive package (built with JDK9) and a hadoop package (built with JDK9) on a JDK9 runtime env?

was (Author: kellyzly): [~kgyrtkirk]: thanks for your suggestion. will try.
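The `StringIndexOutOfBoundsException: begin 0, end 3, length 1` raised from Hadoop's Shell class in the trace above is consistent with version-parsing code that assumes `java.version` always looks like `1.8.0_144`; under JDK9 the property is just `9`, which is too short for a `substring(0, 3)`. A hedged sketch of the failure mode (the function names below are illustrative, not Hadoop's):

```python
# Sketch of the failure mode behind the StringIndexOutOfBoundsException above:
# code that assumes java.version looks like "1.8.0_144" breaks when JDK 9
# reports just "9". Illustrative only; the real check lives in Hadoop's
# org.apache.hadoop.util.Shell.
def major_version_fragile(java_version: str) -> str:
    # mimics Java's String.substring(0, 3): raises if the string is shorter
    if len(java_version) < 3:
        raise IndexError(f"begin 0, end 3, length {len(java_version)}")
    return java_version[:3]

def major_version_robust(java_version: str) -> int:
    # tolerant parse: "1.8.0_144" -> 8, "9" -> 9, "9.0.1" -> 9
    parts = java_version.split(".")
    return int(parts[1]) if parts[0] == "1" else int(parts[0].split("_")[0])

print(major_version_robust("1.8.0_144"))  # 8
print(major_version_robust("9"))          # 9
```

This is why a hadoop package built for JDK8 can blow up at class-initialization time on a JDK9 runtime even though the bytecode itself loads fine.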
[jira] [Comment Edited] (HIVE-17899) Provide an option to disable tez split grouping
[ https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218046#comment-16218046 ] Gopal V edited comment on HIVE-17899 at 10/25/17 3:29 AM: -- A hive config option for - tez.grouping.by-length=false?
{code}
if (!(groupByLength || groupByCount)) {
  throw new TezUncheckedException(
      "None of the grouping parameters are true: "
          + TEZ_GROUPING_SPLIT_BY_LENGTH + ", " + TEZ_GROUPING_SPLIT_BY_COUNT);
}
{code}
That part might need to go into Tez as well.

was (Author: gopalv): A hive config option for - tez.grouping.by-length=false?

> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
>
> The only way to disable split grouping in tez is to change the input format to CombineHiveInputFormat. Provide a config option to disable split grouping regardless of the IF.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
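The Tez check quoted above insists that at least one grouping strategy stays enabled, which is why simply flipping tez.grouping.by-length to false from the Hive side would trip the validation, and why the comment notes Tez itself may need a change. A rough model of that interaction (all names here are illustrative, not actual Hive or Tez code):

```python
# Illustrative model (not Hive/Tez code) of the validation quoted above:
# Tez requires at least one grouping parameter to remain true, so a Hive-side
# "disable split grouping" flag that turns both off would fail this check.
def validate_grouping(by_length: bool, by_count: bool) -> None:
    # mirrors the TezUncheckedException check in the quoted snippet
    if not (by_length or by_count):
        raise ValueError(
            "None of the grouping parameters are true: "
            "tez.grouping.split-by-length, tez.grouping.split-by-count")

def effective_tez_conf(disable_split_grouping: bool) -> dict:
    # hypothetical translation of a Hive flag into Tez grouping settings
    conf = {
        "tez.grouping.by-length": not disable_split_grouping,
        "tez.grouping.by-count": False,
    }
    validate_grouping(conf["tez.grouping.by-length"],
                      conf["tez.grouping.by-count"])
    return conf

print(effective_tez_conf(False)["tez.grouping.by-length"])  # True
# effective_tez_conf(True) would raise, illustrating why a Tez-side
# "no grouping" mode is needed in addition to the Hive config option.
```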
[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks Xuefu for the review!

> Hive on Spark generate more shuffle data than hive on mr
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Affects Versions: 1.2.1
> Reporter: wangwenli
> Assignee: Rui Li
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
> The same sql, running on the spark and mr engines, will generate different sizes of shuffle data.
> I think it is because hive on mr serializes only part of the HiveKey, while hive on spark, which uses kryo, serializes the full HiveKey object.
> What is your opinion?
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
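The reporter's hypothesis, that serializing the full HiveKey object ships more bytes per record than writing only the key payload, can be illustrated with generic Python serialization. This is not Hive or Kryo code, just a size comparison under the same idea: a full object graph carries field names and class metadata that the shuffle does not need.

```python
# Illustration (not Hive code) of the hypothesis in the issue: serializing a
# whole key object, as a general-purpose serializer does by default, ships
# more bytes than writing only the payload the reducer needs.
import pickle

class HiveKeyLike:
    """Stand-in for a shuffle key; only key_bytes must actually be shipped."""
    def __init__(self, key_bytes: bytes):
        self.key_bytes = key_bytes
        self.hash_code = hash(key_bytes)       # derivable on the other side
        self.dist_key_length = len(key_bytes)  # derivable on the other side

k = HiveKeyLike(b"customer_42")
full = pickle.dumps(k)   # full object graph, class and field metadata included
partial = k.key_bytes    # only the payload
print(len(full) > len(partial))  # True
```

The fix direction implied by the issue is analogous: register a custom serializer for the key type that writes only the necessary fields instead of the whole object.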
[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218011#comment-16218011 ] Hive QA commented on HIVE-17433: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893727/HIVE-17433.05.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7459/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7459/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7459/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-25 02:48:56.379 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-7459/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-25 02:48:56.382 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 84950cf HIVE-17764 : alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 84950cf HIVE-17764 : alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-25 02:48:56.906 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:2898 error: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_between_columns.q.out:91 error: ql/src/test/results/clientpositive/llap/vector_between_columns.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_complex_all.q.out:678 error: ql/src/test/results/clientpositive/llap/vector_complex_all.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out:39 error: ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_include_no_sel.q.out:224 error: 
ql/src/test/results/clientpositive/llap/vector_include_no_sel.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out:5955 error: ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12893727 - PreCommit-HIVE-Build > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small >
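The Decimal64 idea behind HIVE-17433, representing decimals of limited precision as scaled 64-bit longs so that common operations become cheap long arithmetic, can be sketched as follows. This is an illustrative model with hypothetical names, not Hive's vectorization code:

```python
# Sketch of the Decimal64 idea: a decimal with precision <= 18 fits in a
# signed 64-bit long once scaled, so add/subtract/compare become plain long
# ops with no BigDecimal-style allocation. Illustrative names only.
SCALE = 2  # both operands must share a scale for this to work

def to_decimal64(s: str) -> int:
    """Parse a decimal string into a scaled long, e.g. '12.34' -> 1234."""
    units, _, frac = s.partition(".")
    frac = (frac + "0" * SCALE)[:SCALE]       # pad/truncate to the scale
    sign = -1 if units.startswith("-") else 1
    return int(units) * 10**SCALE + sign * int(frac)

def d64_add(a: int, b: int) -> int:
    return a + b  # plain long addition at a shared scale

a, b = to_decimal64("12.34"), to_decimal64("0.66")
print(d64_add(a, b))  # 1300, i.e. 13.00 at scale 2
```

The "partial" caveat in the issue description fits this model: operations like rounding, division, or results exceeding 18 digits of precision would still need the full decimal path.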
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.08.patch > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
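The offset calculation described in the issue, deriving a split's starting row number by looking at the other 'original' files in the same logical tranche/bucket, reduces to a cumulative row count over the files that precede the split's file. A minimal sketch with hypothetical names (this is not the VectorizedOrcAcidRowBatchReader implementation):

```python
# Sketch of the offset idea described above: to synthesize row ids for
# 'original' (pre-acid) files, the reader of file N in a bucket needs the
# total row count of files 0..N-1 in that bucket. Names are illustrative.
def row_offset_for(file_index: int, row_counts: list) -> int:
    """Rows contributed by earlier files in the same logical bucket."""
    return sum(row_counts[:file_index])

# a bucket with three original files of 100, 250, and 40 rows
counts = [100, 250, 40]
print(row_offset_for(0, counts))  # 0
print(row_offset_for(2, counts))  # 350
```

Combined with the per-file row number from getRecordReader().getRowNumber(), this yields a globally consistent row number across the bucket, which is what makes OrcSplit.isOriginal() unnecessary per the description.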
[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive
[ https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217983#comment-16217983 ] liyunzhang commented on HIVE-17879: --- [~kgyrtkirk]: thanks for your suggestion. will try.
[jira] [Commented] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true
[ https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217982#comment-16217982 ] Vihang Karajgaonkar commented on HIVE-17764: Patch merged to master. Hi [~janulatha] Can you provide the patch for branch-2 as well? The qfile from the patch fails on branch-2. > alter view fails when hive.metastore.disallow.incompatible.col.type.changes > set to true > --- > > Key: HIVE-17764 > URL: https://issues.apache.org/jira/browse/HIVE-17764 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0 > > Attachments: HIVE17764.1.patch, HIVE17764.2.patch > > > A view is a virtual structure that derives its type information from the > table(s) the view is based on. If the view definition is altered, the > corresponding column types should be updated. Whether the change is > compatible with the previous structure of the view is irrelevant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12369) Native Vector GroupBy
[ https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217981#comment-16217981 ] Hive QA commented on HIVE-12369: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879115/HIVE-12369.06.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7458/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7458/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7458/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-25 01:49:44.492 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-7458/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-25 01:49:44.495 + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 42e70a3..84950cf master -> origin/master 74034f1..84e107b branch-2 -> origin/branch-2 + git reset --hard HEAD HEAD is now at 42e70a3 HIVE-17471 : Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default (Sergey Shelukhin, reviewed by Matt McCline) + git clean -f -d Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig Removing itests/src/test/resources/testconfiguration.properties.orig Removing ql/src/test/queries/clientpositive/acid_vectorization_original.q Removing ql/src/test/results/clientpositive/llap/acid_vectorization_original.q.out Removing ql/src/test/results/clientpositive/tez/acid_vectorization_original.q.out Removing standalone-metastore/src/gen/org/ + git checkout master Already on 'master' Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at 84950cf HIVE-17764 : alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar) + git merge --ff-only origin/master Already up-to-date. 
+ date '+%Y-%m-%d %T.%3N' 2017-10-25 01:49:49.788 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java:20 error: ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/sysdb.q.out:3346 error: ql/src/test/results/clientpositive/llap/sysdb.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_aggregate_9.q.out:147 error: ql/src/test/results/clientpositive/llap/vector_aggregate_9.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_between_in.q.out:157 error: ql/src/test/results/clientpositive/llap/vector_between_in.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_count_distinct.q.out:1327 error: ql/src/test/results/clientpositive/llap/vector_count_distinct.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/llap/vector_decimal_precision.q.out:589 error: ql/src/test/results/clientpositive/llap/vector_decimal_precision.q.out: patch does not apply error: ql/src/test/results/clientpositive/llap/vector_empty_where.q.out: No such file or directory error: patch failed: ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out:609 error: ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out: patch does not apply error: patch failed:
[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-17887: - Reporter: Santhosh B Gowda (was: Sankar Hariappan) > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Santhosh B Gowda >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-17887.01.patch > > > Replicating a drop partition event on a table with a partition on a > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2. bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at 
com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
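The lexer error in HIVE-17887 above arises because the drop-partition path builds a metastore filter expression in which the timestamp literal is not quoted, so the filter lexer stops at the first ':' in "00:00:00.0". A minimal sketch of the quoting idea (hypothetical helper, not Hive's actual filter-building API):

```python
def build_partition_filter(key, value):
    """Build a metastore-style partition filter, quoting the literal.

    Hypothetical sketch: an unquoted timestamp such as
    2001-11-09 00:00:00.0 contains ':' and ' ' characters that an
    identifier/number lexer cannot tokenize, which is what produces
    the "no viable alternative at character ':'" error above.
    Emitting the value as a quoted string literal avoids that.
    """
    escaped = value.replace('"', '\\"')  # escape embedded double quotes
    return '%s = "%s"' % (key, escaped)

quoted = build_partition_filter("p", "2001-11-09 00:00:00.0")
```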
[jira] [Updated] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore
[ https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17832: --- Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) Committed in branch-2 and master. Thanks [~janulatha] for your contribution. > Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in > metastore > -- > > Key: HIVE-17832 > URL: https://issues.apache.org/jira/browse/HIVE-17832 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE17832.1.patch, HIVE17832.2.patch > > > hive.metastore.disallow.incompatible.col.type.changes, when set to true, will > disallow incompatible column type changes through alter table. But this > parameter is not modifiable in HMS. If HMS is not embedded into HS2, the > value cannot be changed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217965#comment-16217965 ] Hive QA commented on HIVE-17458: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893811/HIVE-17458.07.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11320 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_original] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=270) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7457/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7457/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7457/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12893811 - PreCommit-HIVE-Build > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
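The offset computation HIVE-17458 describes (look at the other files in the logical tranche/bucket and calculate the row offset for the split's RowBatch) can be sketched as follows. This is a hypothetical illustration that assumes each original file's row count is known and that files are read in a fixed order within the bucket:

```python
def row_offset_for_split(files_in_bucket, split_file):
    """Sum row counts of files that precede split_file in the bucket.

    files_in_bucket: ordered list of (file_name, row_count) pairs for
    one logical bucket. The returned offset is the starting synthetic
    row number for the split, mirroring the calculation the comment
    says VectorizedOrcAcidRowBatchReader could do like OrcRawRecordMerger.
    """
    offset = 0
    for name, rows in files_in_bucket:
        if name == split_file:
            return offset
        offset += rows
    raise ValueError("split file not in bucket: " + split_file)

# Example bucket with three 'original' files (names are illustrative).
bucket = [("000000_0", 1000), ("000000_0_copy_1", 500), ("000000_0_copy_2", 250)]
```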
[jira] [Updated] (HIVE-17834) Fix flaky triggers test
[ https://issues.apache.org/jira/browse/HIVE-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17834: - Attachment: HIVE-17834.3.patch TestTriggersTezSessionPoolManager wasn't performing validation as frequently as TestTriggersWorkloadManager, so the test failed before SHUFFLE_BYTES was published and validated (the query completed in the meantime). > Fix flaky triggers test > --- > > Key: HIVE-17834 > URL: https://issues.apache.org/jira/browse/HIVE-17834 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17834.1.patch, HIVE-17834.2.patch, > HIVE-17834.3.patch > > > https://issues.apache.org/jira/browse/HIVE-12631?focusedCommentId=16209803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16209803 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping
[ https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-17899: Assignee: Prasanth Jayachandran > Provide an option to disable tez split grouping > --- > > Key: HIVE-17899 > URL: https://issues.apache.org/jira/browse/HIVE-17899 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Only way to disable split grouping in tez is to change input format to > CombineHiveInputFormat. Provide a config option to disable split grouping > regardless of the IF. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping
[ https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-17899: Assignee: (was: Prasanth Jayachandran) > Provide an option to disable tez split grouping > --- > > Key: HIVE-17899 > URL: https://issues.apache.org/jira/browse/HIVE-17899 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran > > Only way to disable split grouping in tez is to change input format to > CombineHiveInputFormat. Provide a config option to disable split grouping > regardless of the IF. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping
[ https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-17899: > Provide an option to disable tez split grouping > --- > > Key: HIVE-17899 > URL: https://issues.apache.org/jira/browse/HIVE-17899 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Only way to disable split grouping in tez is to change input format to > CombineHiveInputFormat. Provide a config option to disable split grouping > regardless of the IF. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
[ https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217948#comment-16217948 ] Sergey Shelukhin commented on HIVE-15670: - Beats me... the current implementation is as such. > column_stats_accurate may not fit in PARTITION_PARAMS.VALUE > --- > > Key: HIVE-15670 > URL: https://issues.apache.org/jira/browse/HIVE-15670 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > The JSON can be too big with many columns (see setColumnStatsState method). > We can make JSON more compact by only storing the list of columns with true > values. Or we can even store a bitmask in a dedicated column, and adjust it > when altering table (rare enough). Or we can just change the VALUE column to > text blob (might be a painful change wrt upgrade scripts, and supporting all > the DBs' varied blob implementations, esp. in directsql). > Storing denormalized flags in a separate table will probably be slow, > comparatively. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
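The bitmask alternative floated in HIVE-15670 above (store one bit per column instead of a JSON map of booleans) can be sketched like this. Hypothetical encoding, not the actual HMS schema; it assumes the table's column list gives a stable ordering:

```python
def encode_flags(columns, accurate):
    """Pack per-column 'stats accurate' booleans into an int bitmask.

    columns: fixed, ordered list of column names; accurate: set of
    column names whose stats are accurate. One bit per column keeps
    the stored value tiny regardless of how many columns there are,
    which is the problem the JSON blob has with wide tables.
    """
    mask = 0
    for i, col in enumerate(columns):
        if col in accurate:
            mask |= 1 << i
    return mask

def decode_flags(columns, mask):
    """Recover the set of accurate columns from the bitmask."""
    return {col for i, col in enumerate(columns) if mask & (1 << i)}

cols = ["id", "name", "ts"]
```

Altering the table (the rare case the comment mentions) would mean re-mapping bits to the new column order.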
[jira] [Commented] (HIVE-17885) Fix TestTriggersWorkloadManager.testTriggerHighShuffleBytes runtime fluctation
[ https://issues.apache.org/jira/browse/HIVE-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217946#comment-16217946 ] Prasanth Jayachandran commented on HIVE-17885: -- HIVE-17834 should fix this. > Fix TestTriggersWorkloadManager.testTriggerHighShuffleBytes runtime fluctation > -- > > Key: HIVE-17885 > URL: https://issues.apache.org/jira/browse/HIVE-17885 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich > > The following testcase's execution time is fluctuating between 30sec to 90sec > https://builds.apache.org/job/PreCommit-HIVE-Build/7450/testReport/org.apache.hive.jdbc/TestTriggersWorkloadManager/testTriggerHighShuffleBytes/history/ > in case it reaches 90sec; it times out and fails.. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217943#comment-16217943 ] Prasanth Jayachandran commented on HIVE-17884: -- Would also recommend one other minor change (either in this patch or a followup jira): while creating a resource plan, it will be better to remove '_' from query parallelism (QUERY_PARALLELISM -> QUERY PARALLELISM); we can use '_' for trigger expression counter names. Also, trigger expression counters do not accept values other than integers. Something like the below will throw an error: WHEN HDFS_BYTES_READ > 10GB DO KILL. From a usability perspective, it will be easier to specify 10GB vs values in bytes. Similarly for time based counters: WHEN EXECUTION_TIME > 2 hours DO KILL. ExpressionFactory.java can parse such counters ("10 GB"). > Implement create, alter and drop workload management triggers. > -- > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17884.01.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
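The counter-literal suggestion above ("10GB", "2 hours" instead of raw bytes or seconds) amounts to a small unit-suffix parser. A hedged sketch of what such parsing could look like (hypothetical, not the actual ExpressionFactory.java code):

```python
# Size suffixes map to bytes, time suffixes to seconds (illustrative set).
SIZE_UNITS = {"KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}
TIME_UNITS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600, "DAYS": 86400}

def parse_counter_value(text):
    """Parse a counter literal like '10GB' or '2 hours' to a base value.

    Hypothetical sketch of the literal parsing suggested for trigger
    expressions: strip an optional unit suffix (case-insensitive,
    whitespace tolerated) and scale the number; a bare number is
    returned unchanged.
    """
    s = text.strip().upper()
    for units in (SIZE_UNITS, TIME_UNITS):
        for suffix, factor in units.items():
            if s.endswith(suffix):
                return int(s[: -len(suffix)].strip()) * factor
    return int(s)
```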
[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217941#comment-16217941 ] Prasanth Jayachandran commented on HIVE-17884: -- Found some issues when trying out this patch and the other RP patch: Hive.geAllResourcePlans() -> Hive.getAllResourcePlans(). Will this patch also support show triggers, or will it be in a separate patch? IMetaStoreClient.java is missing the getTriggersForResourcePlan API (assuming the client will be allowed to retrieve Triggers independently without going via getResourcePlan). Hive.java is missing a similar getTriggersForResourcePlan. Looks good otherwise. > Implement create, alter and drop workload management triggers. > -- > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17884.01.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17898) Explain plan output enhancement
[ https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17898: --- Attachment: HIVE-17898.1.patch > Explain plan output enhancement > --- > > Key: HIVE-17898 > URL: https://issues.apache.org/jira/browse/HIVE-17898 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17898.1.patch > > > We would like to enhance the explain plan output to display additional > information e.g.: > TableScan operator should have following additional info > * Actual table name (currently only alias name is displayed) > * Database name > * Column names being scanned -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17898) Explain plan output enhancement
[ https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17898: --- Status: Patch Available (was: Open) > Explain plan output enhancement > --- > > Key: HIVE-17898 > URL: https://issues.apache.org/jira/browse/HIVE-17898 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17898.1.patch > > > We would like to enhance the explain plan output to display additional > information e.g.: > TableScan operator should have following additional info > * Actual table name (currently only alias name is displayed) > * Database name > * Column names being scanned -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17898) Explain plan output enhancement
[ https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17898: -- > Explain plan output enhancement > --- > > Key: HIVE-17898 > URL: https://issues.apache.org/jira/browse/HIVE-17898 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > > We would like to enhance the explain plan output to display additional > information e.g.: > TableScan operator should have following additional info > * Actual table name (currently only alias name is displayed) > * Database name > * Column names being scanned -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths
[ https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217937#comment-16217937 ] Ferdinand Xu commented on HIVE-17696: - Two changes here: * DataWritableReadSupport did two things in its init method: 1) create the request schema, 2) create the metadata. Vectorized Reader only needs part one. * DataWritableReadSupport supported a nested pruning filter, while the vectorization path still has some issues which caused qtest failures. So I disabled it in the 2nd patch. > Vectorized reader does not seem to be pushing down projection columns in > certain code paths > --- > > Key: HIVE-17696 > URL: https://issues.apache.org/jira/browse/HIVE-17696 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu > Attachments: HIVE-17696.2.patch, HIVE-17696.patch > > > This is the code snippet from {{VectorizedParquetRecordReader.java}} > {noformat} > MessageType tableSchema; > if (indexAccess) { > List<Integer> indexSequence = new ArrayList<>(); > // Generates a sequence list of indexes > for(int i = 0; i < columnNamesList.size(); i++) { > indexSequence.add(i); > } > tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, > columnNamesList, > indexSequence); > } else { > tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, > columnNamesList, > columnTypesList); > } > indexColumnsWanted = > ColumnProjectionUtils.getReadColumnIDs(configuration); > if (!ColumnProjectionUtils.isReadAllColumns(configuration) && > !indexColumnsWanted.isEmpty()) { > requestedSchema = > DataWritableReadSupport.getSchemaByIndex(tableSchema, > columnNamesList, indexColumnsWanted); > } else { > requestedSchema = fileSchema; > } > this.reader = new ParquetFileReader( > configuration, footer.getFileMetaData(), file, blocks, > requestedSchema.getColumns()); > {noformat} > A couple of things to notice here: > Most of this code is duplicated from {{DataWritableReadSupport.init()}} > method. 
> the else condition passes in fileSchema instead of using tableSchema like we > do in DataWritableReadSupport.init() method. Does this cause projection > columns to be missed when we read parquet files? We should probably just > reuse ReadContext returned from {{DataWritableReadSupport.init()}} method > here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
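The projection logic under discussion in HIVE-17696 (build a requested schema from the read-column indexes, else fall back to a full schema) can be reduced to the following sketch. This is a hypothetical Python illustration, not the Parquet API; `file_schema` stands in for an ordered list of column names:

```python
def requested_schema(file_schema, wanted_indexes, read_all):
    """Pick only the wanted columns from the file schema.

    Mirrors the branch in the snippet above: if everything is read
    (or nothing is explicitly projected), fall back to the full
    schema. The bug report questions exactly this fallback, since
    returning fileSchema bypasses the pruned table schema and can
    defeat projection pushdown.
    """
    if read_all or not wanted_indexes:
        return list(file_schema)
    return [file_schema[i] for i in wanted_indexes]

schema = ["a", "b", "c", "d"]
```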
[jira] [Resolved] (HIVE-17719) Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth whitelist
[ https://issues.apache.org/jira/browse/HIVE-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair resolved HIVE-17719. -- Resolution: Won't Fix This can lead to potential security issues. So instead the repl load command has been enhanced to take parameters. > Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth > whitelist > --- > > Key: HIVE-17719 > URL: https://issues.apache.org/jira/browse/HIVE-17719 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf can be needed to > access a remote cluster in HA config for hive replication v2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace
[ https://issues.apache.org/jira/browse/HIVE-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-17897: - Reporter: Sankar Hariappan (was: Thejas M Nair) > "repl load" in bootstrap phase fails when partitions have whitespace > > > Key: HIVE-17897 > URL: https://issues.apache.org/jira/browse/HIVE-17897 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Sankar Hariappan >Assignee: Thejas M Nair >Priority: Critical > Fix For: 3.0.0 > > > The issue is that Path.toURI().toString() is being used to serialize the > location, while new Path(String) is used to deserialize it. URI escapes chars > such as space, so the deserialized location doesn't point to the correct file > location. > Following exception is seen - > {code} > 2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] > exec.Task: Failed with exception null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211) > at > org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71) > at > org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) > at 
org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
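The serialization mismatch HIVE-17897 describes (the location is URI-escaped by Path.toURI().toString() on write, then treated as a literal path by new Path(String) on read) can be demonstrated with Python's urllib as an analogy; Hadoop's Path/URI escaping behaves similarly for whitespace:

```python
from urllib.parse import quote, unquote

def serialize_location(path):
    """URI-escape a filesystem path, as Path.toURI().toString() does."""
    return quote(path)

location = "/warehouse/tbl/part=2001-11-09 00:00:00.0"
escaped = serialize_location(location)
# Reading the escaped string back as a literal path points at the
# wrong file: the space is now '%20'. The round trip only works if
# the reader un-escapes the string first (or the writer never escapes).
```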
[jira] [Comment Edited] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217913#comment-16217913 ] Andrew Sherman edited comment on HIVE-17826 at 10/25/17 12:35 AM: -- I looked into an alternative solution which is to use an [IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. This can be inserted into log4j when the RoutingAppender Is created in LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a configurable interval. When the thread runs it checks if any of the RoutingAppender’s sub-Appenders have been idle for more than a configurable time. Any that are found are stopped and the AppenderControl is removed. I was able to use this instead of LogUtils.stopQueryAppender() to cause OperationLogs to close, providing an alternative mechanism for avoiding the file descriptor leak fixed in [HIVE-17128]. The problem I see is that an IdlePurgePolicy may prematurely close the log for a long running operation if the operation is not logging. I experimented to see what happens when logging with a particular key restarts after being closed by IdlePurgePolicy. The good thing is that the logging does succeed but the bad thing is that the second log file appears to overwrite the original log file. So I think that the original fix I proposed may be simpler and safer. Edited to add: we may also need [HIVE-17373] to be completed in order to get bug fixes to log4j was (Author: asherman): I looked into an alternative solution which is to use an [IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. This can be inserted into log4j when the RoutingAppender Is created in LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a configurable interval. When the thread runs it checks if any of the RoutingAppender’s sub-Appenders have been idle for more than a configurable time. Any that are found are stopped and the AppenderControl is removed. 
I was able to use this instead of LogUtils.stopQueryAppender() to cause OperationLogs to close, providing an alternative mechanism for avoiding the file descriptor leak fixed in [HIVE-17128]. The problem I see is that an IdlePurgePolicy may prematurely close the log for a long running operation if the operation is not logging. I experimented to see what happens when logging with a particular key restarts after being closed by IdlePurgePolicy. The good thing is that the logging does succeed but the bad thing is that the second log file appears to overwrite the original log file. So I think that the original fix I proposed may be simpler and safer. > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-17826.1.patch > > > We are seeing the error from HS2 process stdout. > {noformat} > 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > for appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing > Appender query-file-appender > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error > writing to RandomAccessFile > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114) > at > 
org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103) > at > org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136) > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105) > at > org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at >
[jira] [Commented] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true
[ https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217927#comment-16217927 ] Vihang Karajgaonkar commented on HIVE-17764: +1 > alter view fails when hive.metastore.disallow.incompatible.col.type.changes > set to true > --- > > Key: HIVE-17764 > URL: https://issues.apache.org/jira/browse/HIVE-17764 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0 > > Attachments: HIVE17764.1.patch, HIVE17764.2.patch > > > A view is a virtual structure that derives the type information from the > table(s) the view is based on. If the view definition is altered, the > corresponding column types should be updated. Whether the change is compatible with the previous structure of the view is irrelevant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace
[ https://issues.apache.org/jira/browse/HIVE-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-17897: > "repl load" in bootstrap phase fails when partitions have whitespace > > > Key: HIVE-17897 > URL: https://issues.apache.org/jira/browse/HIVE-17897 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Critical > Fix For: 3.0.0 > > > The issue is that Path.toURI().toString() is being used to serialize the > location, while new Path(String) is used to deserialize it. URI escapes chars > such as space, so the deserialized location doesn't point to the correct file > location. > Following exception is seen - > {code} > 2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] > exec.Task: Failed with exception null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211) > at > org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71) > at > org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) > at 
org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
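The escaping mismatch described above can be reproduced with plain java.net.URI (an illustrative sketch; Hadoop's Path wraps a URI in much the same way, and the helper names here are hypothetical, not Hive code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathEscapeDemo {
    // Serialize the way Path.toUri().toString() effectively does:
    // the multi-argument URI constructor percent-escapes illegal
    // characters such as space in the path component.
    static String serialize(String rawPath) {
        try {
            return new URI(null, null, rawPath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deserialize the way new Path(String) effectively does: the
    // string is taken as-is, so "%20" is never unescaped back to a
    // space and the result no longer names the original location.
    static String deserialize(String serialized) {
        return serialized;
    }
}
```

Because the serialized form keeps the literal "%20", round-tripping through these two calls yields a path that differs from the original whenever a partition value contains whitespace.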
[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217913#comment-16217913 ] Andrew Sherman commented on HIVE-17826: --- I looked into an alternative solution which is to use an [IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. This can be inserted into log4j when the RoutingAppender is created in LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a configurable interval. When the thread runs it checks if any of the RoutingAppender’s sub-Appenders have been idle for more than a configurable time. Any that are found are stopped and the AppenderControl is removed. I was able to use this instead of LogUtils.stopQueryAppender() to cause OperationLogs to close, providing an alternative mechanism for avoiding the file descriptor leak fixed in [HIVE-17128]. The problem I see is that an IdlePurgePolicy may prematurely close the log for a long running operation if the operation is not logging. I experimented to see what happens when logging with a particular key restarts after being closed by IdlePurgePolicy. The good thing is that the logging does succeed but the bad thing is that the second log file appears to overwrite the original log file. So I think that the original fix I proposed may be simpler and safer. > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-17826.1.patch > > > We are seeing the error from HS2 process stdout. 

> {noformat} > 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > for appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing > Appender query-file-appender > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error > writing to RandomAccessFile > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103) > at > org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136) > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105) > at > org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > 
org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362) > at > org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79) > at > org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385) > at > org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103) > at >
[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
[ https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217912#comment-16217912 ] Alexander Behm commented on HIVE-15670: --- May I ask what's the purpose of storing this JSON in the tableproperties? Seems pretty expensive to me. If you want to keep track of the accuracy of column stats, why not populate a "last updated" timestamp in the appropriate column statistic? > column_stats_accurate may not fit in PARTITION_PARAMS.VALUE > --- > > Key: HIVE-15670 > URL: https://issues.apache.org/jira/browse/HIVE-15670 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > The JSON can be too big with many columns (see setColumnStatsState method). > We can make JSON more compact by only storing the list of columns with true > values. Or we can even store a bitmask in a dedicated column, and adjust it > when altering table (rare enough). Or we can just change the VALUE column to > text blob (might be a painful change wrt upgrade scripts, and supporting all > the DBs' varied blob implementations, esp. in directsql). > Storing denormalized flags in a separate table will probably be slow, > comparatively. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
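The bitmask alternative mentioned in the description could be sketched as follows (illustrative only, assuming the table's column order is stable and has at most 64 columns; the names are hypothetical, not metastore code):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ColStatsMask {
    // Encode the set of columns whose stats are accurate as a bitmask,
    // positioned by the table's fixed column order. Much more compact
    // than a JSON map of column name -> boolean.
    static long encode(List<String> allCols, Set<String> accurateCols) {
        long mask = 0L;
        for (int i = 0; i < allCols.size(); i++) {
            if (accurateCols.contains(allCols.get(i))) {
                mask |= 1L << i;
            }
        }
        return mask;
    }

    // Recover the accurate-column set from the mask and the same order.
    static Set<String> decode(List<String> allCols, long mask) {
        Set<String> accurate = new HashSet<>();
        for (int i = 0; i < allCols.size(); i++) {
            if ((mask & (1L << i)) != 0) {
                accurate.add(allCols.get(i));
            }
        }
        return accurate;
    }
}
```

The mask would have to be adjusted when the table is altered (columns added, dropped, or reordered), which the comment above notes is rare enough to be acceptable.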
[jira] [Updated] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17841: Attachment: HIVE-17841.02.patch Adding many more tests > implement applying the resource plan > > > Key: HIVE-17841 > URL: https://issues.apache.org/jira/browse/HIVE-17841 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, > HIVE-17841.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
[ https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217881#comment-16217881 ] Ashutosh Chauhan commented on HIVE-16970: - +1 > General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils > - > > Key: HIVE-16970 > URL: https://issues.apache.org/jira/browse/HIVE-16970 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch > > > # Simplify > # Do not initiate empty collections > # Parsing is incorrect: > {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils} > public static String buildKey(String dbName, String tableName, List > partVals) { > String key = buildKey(dbName, tableName); > if (partVals == null || partVals.size() == 0) { > return key; > } > // missing a delimiter between the "tableName" and the first "partVal" > for (int i = 0; i < partVals.size(); i++) { > key += partVals.get(i); > if (i != partVals.size() - 1) { > key += delimit; > } > } > return key; > } > public static Object[] splitPartitionColStats(String key) { > // ... > } > {code} > The result of passing the key to the "split" method is: > {code} > buildKey("db","Table",["Part1","Part2","Part3"], "col"); > [db, tablePart1, [Part2, Part3], col] > // "table" and "Part1" is mistakenly concatenated > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
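For illustration, a corrected buildKey that adds the delimiter missing between the table name and the first partition value could look like this (a sketch, not the actual patch; the delimiter value "#" is an assumption standing in for the CacheUtils `delimit` constant):

```java
import java.util.List;

public class CacheKeys {
    static final String DELIMIT = "#"; // assumption: placeholder for CacheUtils.delimit

    static String buildKey(String dbName, String tableName) {
        return dbName + DELIMIT + tableName;
    }

    static String buildKey(String dbName, String tableName, List<String> partVals) {
        String key = buildKey(dbName, tableName);
        if (partVals == null || partVals.isEmpty()) {
            return key;
        }
        // The delimiter before the first partition value is what the
        // original loop omitted, causing "table" and "Part1" to fuse.
        return key + DELIMIT + String.join(DELIMIT, partVals);
    }
}
```

With the delimiter in place, the split method can recover the table name and the first partition value as separate fields.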
[jira] [Commented] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217877#comment-16217877 ] Ashutosh Chauhan commented on HIVE-16663: - +1 pending tests > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, > HIVE-16663.6.patch, HIVE-16663.7.patch > > > It is very common that there are many repeated values in the result set of a > query, especially when JOINs are present in the query. As it currently > stands, beeline does not attempt to cache any of these values and therefore > it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17896) TopN: Create a standalone vectorizable TopN operator
[ https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-17896: --- Description: For TPC-DS Query27, the TopN operation is delayed by the group-by - the group-by operator buffers up all the rows before discarding the 99% of the rows in the TopN Hash within the ReduceSink Operator. The RS TopN operator is very restrictive as it only supports doing the filtering on the shuffle keys, but it is better to do this before breaking the vectors into rows and losing the isRepeating properties. Adding a TopN operator in the physical operator tree allows the following to happen. GBY->RS(Top=1) can become TopN(1)->GBY->RS(Top=1) So that, the TopN can remove rows before they are buffered into the GBY and consume memory. Here's the equivalent implementation in Presto https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35 Adding this as a sub-feature of GroupBy prevents further optimizations if the GBY is on keys "a,b,c" and the TopN is on just "a". was: For TPC-DS Query27, the TopN operation is delayed by the group-by - the group-by operator buffers up all the rows before discarding the 99% of the rows in the TopN Hash within the ReduceSink Operator. The RS TopN operator is very restrictive as it only supports doing the filtering on the shuffle keys, but it is better to do this before breaking the vectors into rows and losing the isRepeating properties. Adding a TopN operator in the physical operator tree allows the following to happen. GBY->RS(Top=1) can become TopN(1)->GBY->RS(Top=1) So that, the TopN can remove rows before they are buffered into the GBY and consume memory. 
Here's the equivalent implementation in Presto https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35 > TopN: Create a standalone vectorizable TopN operator > > > Key: HIVE-17896 > URL: https://issues.apache.org/jira/browse/HIVE-17896 > Project: Hive > Issue Type: New Feature > Components: Operators >Affects Versions: 3.0.0 >Reporter: Gopal V > > For TPC-DS Query27, the TopN operation is delayed by the group-by - the > group-by operator buffers up all the rows before discarding the 99% of the > rows in the TopN Hash within the ReduceSink Operator. > The RS TopN operator is very restrictive as it only supports doing the > filtering on the shuffle keys, but it is better to do this before breaking > the vectors into rows and losing the isRepeating properties. > Adding a TopN operator in the physical operator tree allows the following to > happen. > GBY->RS(Top=1) > can become > TopN(1)->GBY->RS(Top=1) > So that, the TopN can remove rows before they are buffered into the GBY and > consume memory. > Here's the equivalent implementation in Presto > https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35 > Adding this as a sub-feature of GroupBy prevents further optimizations if the > GBY is on keys "a,b,c" and the TopN is on just "a". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
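The row-pruning step described above, where TopN(1)->GBY->RS discards rows before the group-by buffers them, can be sketched with a bounded heap that decides per row whether it can still affect the final top N (illustrative only; a real Hive operator would be vectorized and work on row batches, and the Presto operator linked above works on pages):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class TopNSketch {
    private final int n;
    // Max-heap holding the N smallest keys seen so far; the root is the
    // current cutoff value for admitting new rows.
    private final PriorityQueue<Integer> heap;

    public TopNSketch(int n) {
        this.n = n;
        this.heap = new PriorityQueue<>(Comparator.reverseOrder());
    }

    // Returns true if the row can still be in the top N and should be
    // forwarded downstream; false means it can be dropped immediately,
    // before the group-by ever buffers it.
    public boolean offer(int key) {
        if (heap.size() < n) {
            heap.add(key);
            return true;
        }
        if (key < heap.peek()) {
            heap.poll();
            heap.add(key);
            return true;
        }
        return false;
    }
}
```

Each offer is O(log N), and the operator's memory is bounded by N regardless of input size, which is the point of pushing it ahead of the group-by.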
[jira] [Updated] (HIVE-17831) HiveSemanticAnalyzerHookContext does not update the HiveOperation after sem.analyze() is called
[ https://issues.apache.org/jira/browse/HIVE-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-17831: Resolution: Fixed Status: Resolved (was: Patch Available) > HiveSemanticAnalyzerHookContext does not update the HiveOperation after > sem.analyze() is called > --- > > Key: HIVE-17831 > URL: https://issues.apache.org/jira/browse/HIVE-17831 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0, 2.4.0, 2.2.1, 2.3.1 >Reporter: Sergio Peña >Assignee: Aihua Xu > Fix For: 2.1.2, 3.0.0, 2.4.0, 2.2.1, 2.3.2 > > Attachments: HIVE-17831.1.patch > > > The SemanticAnalyzer.analyze() called on the Driver.compile() method updates > the HiveOperation based on the analysis this does. However, the patch done on > HIVE-17048 does not update such operation and is send an invalid operation to > the postAnalyze() call. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217849#comment-16217849 ] Hive QA commented on HIVE-15104: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893666/HIVE-15104.10.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=270) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7456/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7456/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7456/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12893666 - PreCommit-HIVE-Build > Hive on Spark generate more shuffle data than hive on mr > > > Key: HIVE-15104 > URL: https://issues.apache.org/jira/browse/HIVE-15104 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: wangwenli >Assignee: Rui Li > Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, > HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, > HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, > HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx > > > the same sql, running on spark and mr engine, will generate different size > of shuffle data. > i think it is because of hive on mr just serialize part of HiveKey, but hive > on spark which using kryo will serialize full of Hivekey object. > what is your opionion? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217803#comment-16217803 ] Vihang Karajgaonkar edited comment on HIVE-17895 at 10/24/17 10:27 PM: --- Thats a good idea. Not sure if we can automate this for our pre-commit. Seems like a good idea to make sure that the results match with or without vectorization. I will meanwhile give it a shot on branch-2 to see if these tests are failing on branch-2 as well. Did you run with all the **vector**.q files? or there were specific qfiles which you targetted. Thanks! was (Author: vihangk1): Thats a good idea. Not sure if we can automate this for our pre-commit. Seems like a good idea to make sure that the results match with or without vectorization. I will meanwhile give it a shot on branch-2 to see if these tests are failing on branch-2 as well. Did you run with all the *vector*.q files? or there were specific qfiles which you targetted. Thanks! > Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) > > > Key: HIVE-17895 > URL: https://issues.apache.org/jira/browse/HIVE-17895 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 103 NULL0.0 NULLoriginal > Vec: 103 NULLNULLNULLoriginal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217803#comment-16217803 ] Vihang Karajgaonkar commented on HIVE-17895: Thats a good idea. Not sure if we can automate this for our pre-commit. Seems like a good idea to make sure that the results match with or without vectorization. I will meanwhile give it a shot on branch-2 to see if these tests are failing on branch-2 as well. Did you run with all the *vector*.q files? or there were specific qfiles which you targetted. Thanks! > Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) > > > Key: HIVE-17895 > URL: https://issues.apache.org/jira/browse/HIVE-17895 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 103 NULL0.0 NULLoriginal > Vec: 103 NULLNULLNULLoriginal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17831) HiveSemanticAnalyzerHookContext does not update the HiveOperation after sem.analyze() is called
[ https://issues.apache.org/jira/browse/HIVE-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-17831: --- Fix Version/s: 2.3.2 > HiveSemanticAnalyzerHookContext does not update the HiveOperation after > sem.analyze() is called > --- > > Key: HIVE-17831 > URL: https://issues.apache.org/jira/browse/HIVE-17831 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0, 2.4.0, 2.2.1, 2.3.1 >Reporter: Sergio Peña >Assignee: Aihua Xu > Fix For: 2.1.2, 3.0.0, 2.4.0, 2.2.1, 2.3.2 > > Attachments: HIVE-17831.1.patch > > > The SemanticAnalyzer.analyze() called on the Driver.compile() method updates > the HiveOperation based on the analysis this does. However, the patch done on > HIVE-17048 does not update such operation and is send an invalid operation to > the postAnalyze() call. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217760#comment-16217760 ] Matt McCline commented on HIVE-17895: - [~vihangk1] No, I have not checked on branch-2. The trick I use in checking is to modify the "if (!vectorPath) {" in the Vectorizer source to "if (!vectorPath || true) {" to temporarily disable vectorization and then execute the Q file tests. I look for query result differences in diffs. Occasionally, differences are due to a lack of "-- SORT_QUERY_RESULTS" in the Q file. But usually it is a different result but not necessarily the fault of vectorization. Sometimes it is row-mode that is in error. > Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) > > > Key: HIVE-17895 > URL: https://issues.apache.org/jira/browse/HIVE-17895 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 103 NULL0.0 NULLoriginal > Vec: 103 NULLNULLNULLoriginal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.07.patch > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17877) HoS: combine equivalent DPP sink works
[ https://issues.apache.org/jira/browse/HIVE-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217684#comment-16217684 ] Hive QA commented on HIVE-17877: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893661/HIVE-17877.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11316 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2] (batchId=173) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_3] (batchId=173) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=172) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=171) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=229) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=242) 
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=242) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7455/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7455/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7455/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12893661 - PreCommit-HIVE-Build > HoS: combine equivalent DPP sink works > -- > > Key: HIVE-17877 > URL: https://issues.apache.org/jira/browse/HIVE-17877 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-17877.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17471: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the review! > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-17471.01.patch, HIVE-17471.patch > > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217643#comment-16217643 ] Vihang Karajgaonkar commented on HIVE-17895: Hi [~mmccline] I see you created some of the wrong-results JIRAs related to vectorization. Do you know if they apply to branch-2 as well? > Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) > > > Key: HIVE-17895 > URL: https://issues.apache.org/jira/browse/HIVE-17895 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 103 NULL0.0 NULLoriginal > Vec: 103 NULLNULLNULLoriginal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217637#comment-16217637 ] Thejas M Nair commented on HIVE-17887: -- +1 > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-17887.01.patch > > > When try to replicate the drop partition event on a table with partition on > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2.bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > 
at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
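The lexer failure reported above ("line 1:18 no viable alternative at character ':'") comes from the ':' characters that a timestamp-valued partition key introduces into the metastore partition filter string. A minimal sketch, assuming a filter string of the form shown below (the exact string Hive generates is an assumption here, chosen because it places the first ':' at position 18, matching the reported error):

```java
// Sketch only: shows where the offending ':' lands in a hypothetical
// partition filter string for p="2001-11-09 00:00:00.0".
public class FilterColonSketch {
    public static void main(String[] args) {
        // Assumed filter shape; the real string Hive builds may differ.
        String filter = "p = \"2001-11-09 00:00:00.0\"";
        int firstColon = filter.indexOf(':');
        // Position 18 is consistent with the lexer error "line 1:18".
        System.out.println(firstColon); // prints "18"
    }
}
```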
[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217635#comment-16217635 ] Sergey Shelukhin commented on HIVE-17884: - One small comment, overall looks good. cc [~prasanth_j] > Implement create, alter and drop workload management triggers. > -- > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17884.01.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
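The trigger DDL proposed in the issue description can be sketched by assembling the command strings from their parts. The counter name ELAPSED_TIME and the pool name etl_pool are illustrative assumptions, not taken from the patch:

```java
// Sketch only: builds WM trigger DDL strings in the shape proposed by HIVE-17884.
// ELAPSED_TIME and etl_pool are hypothetical names used for illustration.
public class WmTriggerDdl {
    static String createTrigger(String plan, String name, String condition, String action) {
        return String.format("CREATE TRIGGER `%s`.`%s` WHEN %s DO %s",
                plan, name, condition, action);
    }

    public static void main(String[] args) {
        // KILL queries exceeding an elapsed-time threshold
        // (conditions support AND/OR per the description).
        System.out.println(createTrigger("global", "slow_query",
                "ELAPSED_TIME > 100000", "KILL"));
        // Alternatively, MOVE the query to another pool instead of killing it.
        System.out.println(createTrigger("global", "slow_query",
                "ELAPSED_TIME > 100000", "MOVE TO etl_pool"));
    }
}
```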
[jira] [Commented] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)
[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217616#comment-16217616 ] Gunther Hagleitner commented on HIVE-14731: --- Ran relevant tests locally. All pass. Committed to master. > Use Tez cartesian product edge in Hive (unpartitioned case only) > > > Key: HIVE-14731 > URL: https://issues.apache.org/jira/browse/HIVE-14731 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, > HIVE-14731.11.patch, HIVE-14731.12.patch, HIVE-14731.13.patch, > HIVE-14731.14.patch, HIVE-14731.15.patch, HIVE-14731.16.patch, > HIVE-14731.17.patch, HIVE-14731.18.patch, HIVE-14731.19.patch, > HIVE-14731.2.patch, HIVE-14731.20.patch, HIVE-14731.21.patch, > HIVE-14731.22.patch, HIVE-14731.23.patch, HIVE-14731.3.patch, > HIVE-14731.4.patch, HIVE-14731.5.patch, HIVE-14731.6.patch, > HIVE-14731.7.patch, HIVE-14731.8.patch, HIVE-14731.9.patch > > > Given cartesian product edge is available in Tez now (see TEZ-3230), let's > integrate it into Hive on Tez. This allows us to have more than one reducer > in cross product queries. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17882) Resource plan retrieval looks incorrect
[ https://issues.apache.org/jira/browse/HIVE-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217608#comment-16217608 ] Sergey Shelukhin commented on HIVE-17882: - +1 > Resource plan retrieval looks incorrect > --- > > Key: HIVE-17882 > URL: https://issues.apache.org/jira/browse/HIVE-17882 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Harish Jaiprakash > Attachments: HIVE-17882.01.patch > > > {code} > 0: jdbc:hive2://localhost:1> show resource plan global; > +--+-++ > | rp_name | status | query_parallelism | > +--+-++ > | global | 1 | NULL | > +--+-++ > {code} > looks like status and query_parallelism got swapped. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217539#comment-16217539 ] Matt McCline edited comment on HIVE-17433 at 10/24/17 8:11 PM: --- Known Wrong Vectorization Results on Master: HIVE-17893: Vectorization: Wrong results for vector_udf3.q HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q HIVE-17890: Vectorization: Wrong results for vectorized_case.q HIVE-17889: Vectorization: Wrong results for vectorization_15.q HIVE-17863: Vectorization: Two Q files produce wrong PTF query results HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q HIVE-16919: Vectorization: vectorization_short_regress.q has query result differences with non-vectorized run. Vectorized unary function broken? HIVE-17895: Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) HIVE-17894: Vectorization: Wrong results for dynpart_sort_opt_vectorization.q (LLAP) was (Author: mmccline): Known Wrong Vectorization Results on Master: HIVE-17893: Vectorization: Wrong results for vector_udf3.q HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q HIVE-17890: Vectorization: Wrong results for vectorized_case.q HIVE-17889: Vectorization: Wrong results for vectorization_15.q HIVE-17863: Vectorization: Two Q files produce wrong PTF query results HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch > > > Provide partial support for Decimal64 within Hive. 
By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number has been identified as being performance hotspots. > Those are small precision decimals with precision <= 18 that fit within a > 64-bit long we are calling Decimal64 ​. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full featured one and add additional > Decimal64 optimization where query benchmarks really show it help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimal with Hive for Vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work those other formats such as ORC, PARQUET, etc can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector. Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... 
> Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
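The Decimal64 idea described above can be sketched in a few lines: a decimal with precision <= 18 fits in a signed 64-bit long as a scaled integer, so a predicate like ((key - 100) < 200) reduces to plain long arithmetic. This is illustrative only, not Hive's actual Decimal64ColumnVector code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative sketch of the Decimal64 representation: keep the unscaled value
// of a small-precision decimal in a long and do comparison/subtraction on longs.
public class Decimal64Sketch {
    static final int SCALE = 5; // e.g. decimal(11,5), as in the example plan

    // "250.5" -> 25050000 at scale 5; throws if the value would need rounding.
    static long toDecimal64(String s) {
        BigDecimal bd = new BigDecimal(s).setScale(SCALE, RoundingMode.UNNECESSARY);
        return bd.unscaledValue().longValueExact();
    }

    public static void main(String[] args) {
        long key = toDecimal64("250.5");
        long c100 = toDecimal64("100");
        long c200 = toDecimal64("200");
        // The plan's predicate ((key - 100) < 200) evaluated entirely on longs:
        System.out.println((key - c100) < c200); // prints "true" (150.5 < 200)
        // The largest 18-digit unscaled value still fits in a signed long:
        System.out.println(999_999_999_999_999_999L < Long.MAX_VALUE); // prints "true"
    }
}
```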
[jira] [Assigned] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17895: --- > Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP) > > > Key: HIVE-17895 > URL: https://issues.apache.org/jira/browse/HIVE-17895 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 103 NULL0.0 NULLoriginal > Vec: 103 NULLNULLNULLoriginal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17894) Vectorization: Wrong results for dynpart_sort_opt_vectorization.q (LLAP)
[ https://issues.apache.org/jira/browse/HIVE-17894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17894: --- > Vectorization: Wrong results for dynpart_sort_opt_vectorization.q (LLAP) > > > Key: HIVE-17894 > URL: https://issues.apache.org/jira/browse/HIVE-17894 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 34 > Vec: 38 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore
[ https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217574#comment-16217574 ] Vihang Karajgaonkar commented on HIVE-17832: +1. I will be committing this patch EOD unless someone has any other objections. > Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in > metastore > -- > > Key: HIVE-17832 > URL: https://issues.apache.org/jira/browse/HIVE-17832 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0 > > Attachments: HIVE17832.1.patch, HIVE17832.2.patch > > > hive.metastore.disallow.incompatible.col.type.changes when set to true, will > disallow incompatible column type changes through alter table. But, this > parameter is not modifiable in HMS. If HMS is not embedded into HS2, the > value cannot be changed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore
[ https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217574#comment-16217574 ] Vihang Karajgaonkar edited comment on HIVE-17832 at 10/24/17 7:56 PM: -- +1. I will be committing this patch EOD unless someone has any other objections. was (Author: vihangk1): +1. I will be committin this patch EOD unless someone has any other objections. > Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in > metastore > -- > > Key: HIVE-17832 > URL: https://issues.apache.org/jira/browse/HIVE-17832 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0 > > Attachments: HIVE17832.1.patch, HIVE17832.2.patch > > > hive.metastore.disallow.incompatible.col.type.changes when set to true, will > disallow incompatible column type changes through alter table. But, this > parameter is not modifiable in HMS. If HMS is not embedded into HS2, the > value cannot be changed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-17887: Status: Patch Available (was: Open) > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-17887.01.patch > > > When try to replicate the drop partition event on a table with partition on > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2.bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > 
at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-17887: Attachment: HIVE-17887.01.patch Attached 01.patch with drop partition replicated for timestamp column partition. Request [~thejas] to please review the same! > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-17887.01.patch > > > When try to replicate the drop partition event on a table with partition on > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2.bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. 
incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work stopped] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17887 stopped by Sankar Hariappan. --- > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > > When try to replicate the drop partition event on a table with partition on > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2.bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > at > 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.07.patch > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217539#comment-16217539 ] Matt McCline commented on HIVE-17433: - Known Wrong Vectorization Results on Master: HIVE-17893: Vectorization: Wrong results for vector_udf3.q HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q HIVE-17890: Vectorization: Wrong results for vectorized_case.q HIVE-17889: Vectorization: Wrong results for vectorization_15.q HIVE-17863: Vectorization: Two Q files produce wrong PTF query results HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number have been identified as performance hotspots. > Those are small precision decimals with precision <= 18 that fit within a > 64-bit long, which we are calling Decimal64. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full featured one and add additional > Decimal64 optimization where query benchmarks really show it helps. > This change creates a Decimal64ColumnVector. > This change currently detects small decimals within Hive for the vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. 
> The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work those other formats such as ORC, PARQUET, etc can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector. Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
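As a rough illustration of the Decimal64 representation (class and method names here are hypothetical, not from the patch): a decimal with precision <= 18 can be carried as its unscaled value in a signed 64-bit long, so same-scale comparison and addition reduce to plain long operations.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch only: names are illustrative, not Hive's actual classes.
class Decimal64Sketch {
    // Largest unscaled magnitude that precision 18 allows: 10^18 - 1.
    static final long MAX_ABS = 999_999_999_999_999_999L;

    // "123.45" at scale 2 becomes the long 12345.
    static long toDecimal64(String s, int scale) {
        BigDecimal bd = new BigDecimal(s).setScale(scale, RoundingMode.UNNECESSARY);
        long unscaled = bd.unscaledValue().longValueExact();
        if (Math.abs(unscaled) > MAX_ABS) {
            throw new ArithmeticException("value does not fit in Decimal64");
        }
        return unscaled;
    }

    // Addition of two values with the same scale is plain long addition;
    // this is what makes the vectorized inner loops cheap.
    static long add(long a, long b) {
        return Math.addExact(a, b);
    }
}
```

The filter in the explain plan above follows the same pattern: the scalar 100BD becomes the scaled long 1000 at scale 1, and the subtraction and comparison run on longs.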
[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17891: --- Attachment: HIVE-17891.01.patch > HIVE-13076 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17891.01.patch > > > HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF NOT > EXISTS}} clause is only available from postgres 9.1 onwards. So the script > will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
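One portable workaround for servers older than 9.1 (a sketch under invented helper names, not the actual HIVE-17891 fix) is to probe the catalog first and then issue a plain {{CREATE TABLE}}:

```java
// Hypothetical helpers: on Postgres older than 9.1 the IF NOT EXISTS
// clause is a syntax error, so check existence separately and strip the
// clause from the DDL before executing it.
class PgCompat {
    // Portable existence probe; information_schema works on old servers too.
    static String existsQuery(String tableName) {
        return "SELECT 1 FROM information_schema.tables WHERE table_name = '"
                + tableName.toLowerCase() + "'";
    }

    // Strip the 9.1-only clause so the DDL parses on older servers.
    static String withoutIfNotExists(String ddl) {
        return ddl.replaceFirst("(?i)IF NOT EXISTS\\s+", "");
    }
}
```

The caller would run the probe via JDBC and execute the stripped DDL only when the probe returns no rows.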
[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17891: --- Status: Patch Available (was: Open) > HIVE-13076 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17891.01.patch > > > HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF NOT > EXISTS}} clause is only available from postgres 9.1 onwards. So the script > will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17891: --- Description: HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the script will fail for older versions of postgres. (was: HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the script will fail for older versions of postgres.) > HIVE-13076 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17891.01.patch > > > HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF > NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the > script will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17893) Vectorization: Wrong results for vector_udf3.q
[ https://issues.apache.org/jira/browse/HIVE-17893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17893: --- > Vectorization: Wrong results for vector_udf3.q > -- > > Key: HIVE-17893 > URL: https://issues.apache.org/jira/browse/HIVE-17893 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: > yy2GiGM ll2TvTZ > yxN0212hM17E8J8bJj8D7blkA0212uZ17R8W8oWw8Q7o > ywA68u76Jv06axCv451avL4 ljN68h76Wi06nkPi451niY4 > yvNv1qliAi1d > yv3gnG4a33hD7bIm7oxE5rw li3taT4n33uQ7oVz7bkR5ej > yv1js li1wf > yujO07KWj lhwB07XJw > ytpx1RL8F2I lgck1EY8S2V > ytj7g5W lgw7t5J > ytgaJW1Gvrkv5wFUJU2y1SlgtnWJ1Tiexi5jSHWH2l1F > Vec: > yy2GiGM Unvectorized > yxN0212hM17E8J8bJj8D7bUnvectorized > ywA68u76Jv06axCv451avL4 Unvectorized > yvNv1qUnvectorized > yv3gnG4a33hD7bIm7oxE5rw Unvectorized > yv1js Unvectorized > yujO07KWj Unvectorized > ytpx1RL8F2I Unvectorized > ytj7g5W Unvectorized > ytgaJW1Gvrkv5wFUJU2y1SUnvectorized -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17891: --- Description: HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the script will fail for older versions of postgres. (was: HIVE-13354 adds a new table to the schema but the patch script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the script will fail for older versions of postgres.) > HIVE-13076 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT > EXISTS}} clause is only available from postgres 9.1 onwards. So the script > will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17891: --- Summary: HIVE-13076 uses create table if not exists for the postgres script (was: HIVE-13354 uses create table if not exists for the postgres script) > HIVE-13076 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > HIVE-13354 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT > EXISTS}} clause is only available from postgres 9.1 onwards. So the script > will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17886) Fix failure of TestReplicationScenarios.testConstraints
[ https://issues.apache.org/jira/browse/HIVE-17886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17886: Attachment: repl-tc.hive.log attached hive.log > Fix failure of TestReplicationScenarios.testConstraints > --- > > Key: HIVE-17886 > URL: https://issues.apache.org/jira/browse/HIVE-17886 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich > Attachments: repl-tc.hive.log > > > after HIVE-16603 this test started failing > {code} > 2017-10-24T10:52:17,024 DEBUG [main] metastore.HiveMetaStoreClient: Unable to > shutdown metastore client. Will try closing transport directly. > org.apache.thrift.transport.TTransportException: Cannot write to null > outputStream > at > org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:178) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:106) > ~[libthrift-0.9.3.jar:0.9.3] > at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) > ~[libthrift-0.9.3.jar:0.9.3] > at > com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436) > ~[libfb303-0.9.3.jar:?] > at > com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) > ~[libfb303-0.9.3.jar:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:569) > [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) ~[?:?] 
> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) > [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at com.sun.proxy.$Proxy38.close(Unknown Source) [?:?] > at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) ~[?:?] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2413) > [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at com.sun.proxy.$Proxy38.close(Unknown Source) [?:?] > at > org.apache.hadoop.hive.metastore.SynchronizedMetaStoreClient.close(SynchronizedMetaStoreClient.java:112) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:425) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.access$000(Hive.java:181) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive$1.remove(Hive.java:202) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Hive.closeCurrent(Hive.java:388) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:339) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:324) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Hive.getWithFastCheck(Hive.java:316) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Hive.getWithFastCheck(Hive.java:308) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at 
org.apache.hadoop.hive.ql.exec.Task.getHive(Task.java:186) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.repl.bootstrap.ReplLoadTask.execute(ReplLoadTask.java:73) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) > [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at
[jira] [Assigned] (HIVE-17892) Vectorization: Wrong results for vectorized_timestamp_funcs.q
[ https://issues.apache.org/jira/browse/HIVE-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17892: --- > Vectorization: Wrong results for vectorized_timestamp_funcs.q > - > > Key: HIVE-17892 > URL: https://issues.apache.org/jira/browse/HIVE-17892 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: > NULL NULLNULLNULLNULLNULLNULLNULLNULL > NULL NULLNULLNULLNULLNULLNULLNULLNULL > NULL NULLNULLNULLNULLNULLNULLNULLNULL > Vec: > NULL NULLNULLNULLNULLNULL8 1 1 > NULL NULLNULLNULLNULLNULLNULLNULLNULL > -62169765561 2 11 30 30 48 4 40 39 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16663: --- Description: It is very common that there are many repeated values in the result set of a query, especially when JOINs are present in the query. As it currently stands, beeline does not attempt to cache any of these values and therefore it consumes a lot of memory. Adding a string cache may save a lot of memory. There are organizations that use beeline to perform ETL processing of result sets into CSV. This will better support those organizations. was: It is very common that there are many repeated values in the result set of a query. As it currently stands, beeline does not attempt to cache any of these values and therefore it consumes a lot of memory. Adding a string cache may save a lot of memory. There are organizations that use beeline to perform ETL processing of result sets into CSV. This will better support those organizations. > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, > HIVE-16663.6.patch, HIVE-16663.7.patch > > > It is very common that there are many repeated values in the result set of a > query, especially when JOINs are present in the query. As it currently > stands, beeline does not attempt to cache any of these values and therefore > it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
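A minimal sketch of the caching idea (the class below is illustrative, not the attached patch): return a canonical instance for each distinct cell value, so repeated values in a result set share one String object instead of many.

```java
import java.util.HashMap;
import java.util.Map;

// Canonicalize row cell values: identical strings are deduplicated to a
// single cached instance, which is what saves memory on JOIN-heavy results.
class RowStringCache {
    private final Map<String, String> cache = new HashMap<>();

    // Return the canonical instance, caching it on first sight;
    // SQL NULLs pass through uncached.
    String intern(String value) {
        if (value == null) {
            return null;
        }
        String prior = cache.putIfAbsent(value, value);
        return prior != null ? prior : value;
    }
}
```

A bounded map (or a weak-reference interner) would be the natural refinement so the cache itself cannot grow without limit on high-cardinality columns.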
[jira] [Assigned] (HIVE-17891) HIVE-13354 uses create table if not exists for the postgres script
[ https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-17891: -- > HIVE-13354 uses create table if not exists for the postgres script > -- > > Key: HIVE-17891 > URL: https://issues.apache.org/jira/browse/HIVE-17891 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > HIVE-13354 adds a new table to the schema but the patch script uses {{CREATE > TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT > EXISTS}} clause is only available from postgres 9.1 onwards. So the script > will fail for older versions of postgres. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17890) Vectorization: Wrong results for vectorized_case.q
[ https://issues.apache.org/jira/browse/HIVE-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17890: --- > Vectorization: Wrong results for vectorized_case.q > -- > > Key: HIVE-17890 > URL: https://issues.apache.org/jira/browse/HIVE-17890 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 5110 4607 > Vec: 4086 3583 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17889) Vectorization: Wrong results for vectorization_15.q
[ https://issues.apache.org/jira/browse/HIVE-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17889: --- > Vectorization: Wrong results for vectorization_15.q > --- > > Key: HIVE-17889 > URL: https://issues.apache.org/jira/browse/HIVE-17889 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > NonVec: 15:59:56.527 > Vec: 16:00:09.889 > ctimestamp1 (column 7) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217499#comment-16217499 ] BELUGA BEHR commented on HIVE-16663: Latest changes to fix merge issues... Also, removed call to {{rs.wasNull()}} because [ResultSet|https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getObject(int)] will already handle SQL NULL values appropriately when calling {{rs.getObject()}} > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, > HIVE-16663.6.patch, HIVE-16663.7.patch > > > It is very common that there are many repeated values in the result set of a > query. As it currently stands, beeline does not attempt to cache any of > these values and therefore it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16663: --- Status: Patch Available (was: Open) > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, > HIVE-16663.6.patch, HIVE-16663.7.patch > > > It is very common that there are many repeated values in the result set of a > query. As it currently stands, beeline does not attempt to cache any of > these values and therefore it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16663: --- Attachment: HIVE-16663.7.patch > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, > HIVE-16663.6.patch, HIVE-16663.7.patch > > > It is very common that there are many repeated values in the result set of a > query. As it currently stands, beeline does not attempt to cache any of > these values and therefore it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16663) String Caching For Rows
[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16663: --- Status: Open (was: Patch Available) > String Caching For Rows > --- > > Key: HIVE-16663 > URL: https://issues.apache.org/jira/browse/HIVE-16663 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, > HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, HIVE-16663.6.patch > > > It is very common that there are many repeated values in the result set of a > query. As it currently stands, beeline does not attempt to cache any of > these values and therefore it consumes a lot of memory. > Adding a string cache may save a lot of memory. There are organizations that > use beeline to perform ETL processing of result sets into CSV. This will > better support those organizations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS
[ https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-15305: --- Attachment: (was: HIVE-15305.1.patch) > Add tests for METASTORE_EVENT_LISTENERS > --- > > Key: HIVE-15305 > URL: https://issues.apache.org/jira/browse/HIVE-15305 > Project: Hive > Issue Type: Bug >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, > HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch > > > HIVE-15232 reused TestDbNotificationListener to test > METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of > METASTORE_EVENT_LISTENERS config. We should test both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS
[ https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-15305: --- Attachment: HIVE-15305.1.patch > Add tests for METASTORE_EVENT_LISTENERS > --- > > Key: HIVE-15305 > URL: https://issues.apache.org/jira/browse/HIVE-15305 > Project: Hive > Issue Type: Bug >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, > HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch > > > HIVE-15232 reused TestDbNotificationListener to test > METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of > METASTORE_EVENT_LISTENERS config. We should test both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17887 started by Sankar Hariappan. --- > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > > Replicating a drop partition event on a table with a partition on a > timestamp type column fails in REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2. bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > at > 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS
[ https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-15305: --- Attachment: HIVE-15305.1.patch > Add tests for METASTORE_EVENT_LISTENERS > --- > > Key: HIVE-15305 > URL: https://issues.apache.org/jira/browse/HIVE-15305 > Project: Hive > Issue Type: Bug >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, > HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch > > > HIVE-15232 reused TestDbNotificationListener to test > METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of > METASTORE_EVENT_LISTENERS config. We should test both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory
[ https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217477#comment-16217477 ] Eugene Koifman commented on HIVE-17232: --- Some comments: 1. Table.java is a generated class based on hive_metastore.thrift so anything you add to it manually will be lost next time it is regenerated 2. Instead of just "No match found" the error msg should include the file name that it was trying to process so that we can debug this if it happens again. 3. If you want the Work to check for table level compaction request for partitioned tables it should put the compaction request into failed state (markFailed()) - this way it is visible to the end user in SHOW COMPACTIONS. > "No match found" Compactor finds a bucket file thinking it's a directory > -- > > Key: HIVE-17232 > URL: https://issues.apache.org/jira/browse/HIVE-17232 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Steve Yeom > Attachments: HIVE-17232.01.patch > > > {noformat} > 2017-08-02T12:38:11,996 WARN [main] compactor.CompactorMR: Found a > non-bucket file that we thought matched the bucket pattern! > file:/Users/ekoifman/dev/hiv\ > erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1 > Matcher=java\ > .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=] > 2017-08-02T12:38:11,996 INFO [main] mapreduce.JobSubmitter: Cleaning up the > staging area > file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\ > cal1723152463_0183 > 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while > trying to compact > id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\ > e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0. 
> Marking failed to avoid repeated failures, java.lang.IllegalStateException: > \ > No match found > at java.util.regex.Matcher.group(Matcher.java:536) > at java.util.regex.Matcher.group(Matcher.java:496) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) > at > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166) > at > org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138) > at > 
org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894) > {noformat} > the stack trace points to 1st runWorker() in updateDeletePartitioned() though > the test run was TestTxnCommands2WithSplitUpdateAndVectorization -- This message was sent by Atlassian JIRA (v6.4.14#64029)
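The IllegalStateException in the stack trace above comes from calling Matcher.group() when the pattern never matched. A minimal sketch of the defensive parse suggested in comment 2 (the class and method names here are illustrative, not Hive's actual CompactorMR code): check the matcher before calling group(), and include the offending file name in the error.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketNameParser {
    // Same pattern shown in the log above: six leading digits.
    private static final Pattern BUCKET_PATTERN = Pattern.compile("^[0-9]{6}");

    // Returns the six-digit bucket id prefix, or throws an exception that
    // names the file that failed to match, so the failure is debuggable.
    public static String parseBucketId(String fileName) {
        Matcher m = BUCKET_PATTERN.matcher(fileName);
        if (!m.find()) {
            // Calling m.group() here without a successful find() would throw
            // IllegalStateException("No match found") with no context at all.
            throw new IllegalStateException(
                "File name does not match bucket pattern " + BUCKET_PATTERN
                + ": " + fileName);
        }
        return m.group();
    }
}
```

With this shape, a stray file like the "bucket_00001" in the log produces an error message that identifies the file instead of a bare "No match found".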
[jira] [Commented] (HIVE-17881) LLAP: Text cache NPE
[ https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217472#comment-16217472 ] Prasanth Jayachandran commented on HIVE-17881: -- Yes, I don't want to use the cache :) There is no purging of the cache yet without a restart. I wanted to use LLAP IO since it prints FS counters, but without the cache, as I want to always perform an HDFS read for some unit tests. Regardless, it shouldn't throw an NPE if someone wants to disable the cache but use only async IO. > LLAP: Text cache NPE > > > Key: HIVE-17881 > URL: https://issues.apache.org/jira/browse/HIVE-17881 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran > > With LLAP IO enabled and hive.llap.io.memory.mode set to false. Text cache > throws NPE for following query > {code} > select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k; > {code} > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17888) Display the reason for query cancellation
[ https://issues.apache.org/jira/browse/HIVE-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-17888: > Display the reason for query cancellation > - > > Key: HIVE-17888 > URL: https://issues.apache.org/jira/browse/HIVE-17888 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > For user convenience and easy debugging, if a trigger kills a query, return > the reason for killing the query. Currently the query kill only > displays the following, which is not very useful > {code} > Error: Query was cancelled (state=01000,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
[ https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217444#comment-16217444 ] BELUGA BEHR commented on HIVE-16855: [~ngangam] Please consider this simple improvement. > org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements > -- > > Key: HIVE-16855 > URL: https://issues.apache.org/jira/browse/HIVE-16855 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.1, 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16855.1.patch > > > # Improve (Simplify) Logging > # Remove custom buffer size for {{BufferedInputStream}} and instead rely on > JVM default which is often larger these days (8192) > # Simplify looping logic -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
[ https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reassigned HIVE-17887: --- > Incremental REPL LOAD with Drop partition event on timestamp type partition > column fails. > - > > Key: HIVE-17887 > URL: https://issues.apache.org/jira/browse/HIVE-17887 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > > Replicating a drop partition event on a table partitioned on a timestamp-type > column fails during REPL LOAD. > *Scenario:* > 1. create table with partition on timestamp column. > 2. bootstrap dump/load. > 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). > 4. drop the same partition(p="2001-11-09 00:00:00.0"). > 5. incremental dump/load > -- REPL LOAD throws below exception > {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: > Thread-36769]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error > parsing partition filter; lexer error: line 1:18 no viable alternative at > character ':'; exception MismatchedTokenException(12!=23)) > at > org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) > at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) > at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) > at > 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) > at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) > at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
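The lexer error above ("no viable alternative at character ':'") suggests the drop-partition filter string was built with the timestamp value unquoted, so the filter lexer hits the ':' inside "2001-11-09 00:00:00.0". A hypothetical sketch of the quoting idea when constructing such a filter (the class and method names are illustrative, not the actual replication code, and the real metastore filter grammar may have additional escaping rules):

```java
public class PartitionFilters {
    // Quote string-like partition values (including timestamps) so that
    // characters such as ':' and '-' inside the value are lexed as part of a
    // string literal rather than as filter-grammar tokens.
    public static String equalsFilter(String partCol, String partVal) {
        return partCol + " = \"" + partVal.replace("\"", "\\\"") + "\"";
    }
}
```

Under this assumption, the filter for the scenario above would be rendered as p = "2001-11-09 00:00:00.0" instead of an unquoted value that the lexer rejects.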
[jira] [Commented] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper
[ https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217440#comment-16217440 ] BELUGA BEHR commented on HIVE-16890: [~ngangam] Please consider this simple improvement. > org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous > Wrapper > --- > > Key: HIVE-16890 > URL: https://issues.apache.org/jira/browse/HIVE-16890 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16890.1.patch > > > Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a > superfluous wrapper and then immediately unwraps it. Don't bother wrapping > in this scenario. > {code} > public void set(HiveVarchar val, int len) { > set(val.getValue(), len); > } > public void set(String val, int maxLength) { > value.set(HiveBaseChar.enforceMaxLength(val, maxLength)); > } > public HiveVarchar getHiveVarchar() { > return new HiveVarchar(value.toString(), -1); > } > // Here calls getHiveVarchar() which creates a new HiveVarchar object with > a string in it > // The object is passed to set(HiveVarchar val, int len) > // The string is pulled out > public void enforceMaxLength(int maxLength) { > // Might be possible to truncate the existing Text value, for now just do > something simple. > if (value.getLength()>maxLength && getCharacterLength()>maxLength) > set(getHiveVarchar(), maxLength); > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
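The round trip described above (wrap the stored value in a new HiveVarchar via getHiveVarchar(), pass it back to set(), pull the String out again) can be avoided by truncating the stored value directly. A simplified, self-contained sketch of the idea (this toy class is a stand-in for HiveVarcharWritable, not Hive's actual implementation):

```java
public class VarcharHolder {
    private final StringBuilder value = new StringBuilder();

    public void set(String val, int maxLength) {
        value.setLength(0);
        value.append(truncate(val, maxLength));
    }

    // Before the change: enforceMaxLength allocated a wrapper object via
    // getHiveVarchar() and immediately unwrapped it. Operating on the stored
    // value directly avoids the superfluous allocation.
    public void enforceMaxLength(int maxLength) {
        if (value.length() > maxLength) {
            value.setLength(maxLength);
        }
    }

    private static String truncate(String val, int maxLength) {
        return val.length() > maxLength ? val.substring(0, maxLength) : val;
    }

    public String get() {
        return value.toString();
    }
}
```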
[jira] [Commented] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
[ https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217441#comment-16217441 ] BELUGA BEHR commented on HIVE-16970: [~ngangam] Please consider this simple improvement. > General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils > - > > Key: HIVE-16970 > URL: https://issues.apache.org/jira/browse/HIVE-16970 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch > > > # Simplify > # Do not initiate empty collections > # Parsing is incorrect: > {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils} > public static String buildKey(String dbName, String tableName, List > partVals) { > String key = buildKey(dbName, tableName); > if (partVals == null || partVals.size() == 0) { > return key; > } > // missing a delimiter between the "tableName" and the first "partVal" > for (int i = 0; i < partVals.size(); i++) { > key += partVals.get(i); > if (i != partVals.size() - 1) { > key += delimit; > } > } > return key; > } > public static Object[] splitPartitionColStats(String key) { > // ... > } > {code} > The result of passing the key to the "split" method is: > {code} > buildKey("db","Table",["Part1","Part2","Part3"], "col"); > [db, tablePart1, [Part2, Part3], col] > // "table" and "Part1" is mistakenly concatenated > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
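The parsing bug described above comes from omitting the delimiter between the table name and the first partition value, so the table name and first value fuse when the key is split back apart. A hedged sketch of the fix (the class name and the "#" delimiter here are illustrative; the real CacheUtils uses its own delimiter constant):

```java
import java.util.List;

public class CacheKeys {
    private static final String DELIM = "#";

    // Join with a delimiter before *every* partition value, including the
    // first, so splitting on DELIM recovers the original components.
    public static String buildKey(String dbName, String tableName,
                                  List<String> partVals) {
        StringBuilder key = new StringBuilder(dbName).append(DELIM).append(tableName);
        if (partVals != null) {
            for (String partVal : partVals) {
                key.append(DELIM).append(partVal);
            }
        }
        return key.toString();
    }
}
```

With the delimiter in place, the example from the description splits cleanly into [db, Table, Part1, Part2, Part3] instead of fusing "Table" with "Part1".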
[jira] [Commented] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217432#comment-16217432 ] Hive QA commented on HIVE-17841: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893627/HIVE-17841.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11315 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_map_skew] (batchId=82) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers1 (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerTotalTasks (batchId=228) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7454/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7454/console Test logs: 
http://104.198.109.242/logs/PreCommit-HIVE-Build-7454/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12893627 - PreCommit-HIVE-Build > implement applying the resource plan > > > Key: HIVE-17841 > URL: https://issues.apache.org/jira/browse/HIVE-17841 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17841.01.patch, HIVE-17841.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
[ https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16970: --- Attachment: HIVE-16970.2.patch > General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils > - > > Key: HIVE-16970 > URL: https://issues.apache.org/jira/browse/HIVE-16970 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch > > > # Simplify > # Do not initiate empty collections > # Parsing is incorrect: > {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils} > public static String buildKey(String dbName, String tableName, List > partVals) { > String key = buildKey(dbName, tableName); > if (partVals == null || partVals.size() == 0) { > return key; > } > // missing a delimiter between the "tableName" and the first "partVal" > for (int i = 0; i < partVals.size(); i++) { > key += partVals.get(i); > if (i != partVals.size() - 1) { > key += delimit; > } > } > return key; > } > public static Object[] splitPartitionColStats(String key) { > // ... > } > {code} > The result of passing the key to the "split" method is: > {code} > buildKey("db","Table",["Part1","Part2","Part3"], "col"); > [db, tablePart1, [Part2, Part3], col] > // "table" and "Part1" is mistakenly concatenated > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
[ https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16970: --- Status: Patch Available (was: Open) > General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils > - > > Key: HIVE-16970 > URL: https://issues.apache.org/jira/browse/HIVE-16970 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch > > > # Simplify > # Do not initiate empty collections > # Parsing is incorrect: > {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils} > public static String buildKey(String dbName, String tableName, List > partVals) { > String key = buildKey(dbName, tableName); > if (partVals == null || partVals.size() == 0) { > return key; > } > // missing a delimiter between the "tableName" and the first "partVal" > for (int i = 0; i < partVals.size(); i++) { > key += partVals.get(i); > if (i != partVals.size() - 1) { > key += delimit; > } > } > return key; > } > public static Object[] splitPartitionColStats(String key) { > // ... > } > {code} > The result of passing the key to the "split" method is: > {code} > buildKey("db","Table",["Part1","Part2","Part3"], "col"); > [db, tablePart1, [Part2, Part3], col] > // "table" and "Part1" is mistakenly concatenated > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
[ https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-16970: --- Status: Open (was: Patch Available) > General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils > - > > Key: HIVE-16970 > URL: https://issues.apache.org/jira/browse/HIVE-16970 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch > > > # Simplify > # Do not initiate empty collections > # Parsing is incorrect: > {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils} > public static String buildKey(String dbName, String tableName, List > partVals) { > String key = buildKey(dbName, tableName); > if (partVals == null || partVals.size() == 0) { > return key; > } > // missing a delimiter between the "tableName" and the first "partVal" > for (int i = 0; i < partVals.size(); i++) { > key += partVals.get(i); > if (i != partVals.size() - 1) { > key += delimit; > } > } > return key; > } > public static Object[] splitPartitionColStats(String key) { > // ... > } > {code} > The result of passing the key to the "split" method is: > {code} > buildKey("db","Table",["Part1","Part2","Part3"], "col"); > [db, tablePart1, [Part2, Part3], col] > // "table" and "Part1" is mistakenly concatenated > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217431#comment-16217431 ] Xuefu Zhang commented on HIVE-15104: +1 > Hive on Spark generate more shuffle data than hive on mr > > > Key: HIVE-15104 > URL: https://issues.apache.org/jira/browse/HIVE-15104 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: wangwenli >Assignee: Rui Li > Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, > HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, > HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, > HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx > > > The same SQL, running on the Spark and MR engines, will generate different sizes > of shuffle data. > I think it is because Hive on MR serializes only part of the HiveKey, while Hive > on Spark, which uses Kryo, serializes the full HiveKey object. > What is your opinion? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
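The size difference the reporter describes can be illustrated with a toy buffer-backed key (this is a stand-in for HiveKey, not Hive's actual class, and the "naive" size below is only a rough model of whole-object field serialization): a Writable-style write() emits just the valid prefix of the backing array, while serializing the object field by field would emit the entire over-allocated buffer plus extra fields such as a cached hash code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class KeyBuffer {
    private final byte[] buffer = new byte[1024]; // over-allocated backing array
    private int length;                           // number of valid bytes
    private int hashCode;                         // extra cached field

    public void set(byte[] data) {
        System.arraycopy(data, 0, buffer, 0, data.length);
        length = data.length;
        hashCode = java.util.Arrays.hashCode(data);
    }

    // MR/Writable-style serialization: a length prefix plus only the valid bytes.
    public byte[] writableBytes() {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(length);
            out.write(buffer, 0, length);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for an in-memory stream
        }
    }

    // Rough model of naive whole-object serialization: the full backing array
    // plus the int fields, regardless of how many bytes are actually valid.
    public int naiveSerializedSize() {
        return buffer.length + Integer.BYTES /* length */ + Integer.BYTES /* hashCode */;
    }
}
```

For a 3-byte key, the Writable form is 7 bytes while the naive form is over a kilobyte, which is the kind of inflation that shows up as larger shuffle data.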
[jira] [Commented] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory
[ https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217418#comment-16217418 ] Steve Yeom commented on HIVE-17232: --- [~ekoifman] please review the patch 01. Thanks, Steve. > "No match found" Compactor finds a bucket file thinking it's a directory > -- > > Key: HIVE-17232 > URL: https://issues.apache.org/jira/browse/HIVE-17232 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Steve Yeom > Attachments: HIVE-17232.01.patch > > > {noformat} > 2017-08-02T12:38:11,996 WARN [main] compactor.CompactorMR: Found a > non-bucket file that we thought matched the bucket pattern! > file:/Users/ekoifman/dev/hiv\ > erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1 > Matcher=java\ > .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=] > 2017-08-02T12:38:11,996 INFO [main] mapreduce.JobSubmitter: Cleaning up the > staging area > file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\ > cal1723152463_0183 > 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while > trying to compact > id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\ > e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0. 
> Marking failed to avoid repeated failures, java.lang.IllegalStateException: > \ > No match found > at java.util.regex.Matcher.group(Matcher.java:536) > at java.util.regex.Matcher.group(Matcher.java:496) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) > at > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166) > at > org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138) > at > 
org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894) > {noformat} > the stack trace points to 1st runWorker() in updateDeletePartitioned() though > the test run was TestTxnCommands2WithSplitUpdateAndVectorization -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17881) LLAP: Text cache NPE
[ https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217413#comment-16217413 ] Sergey Shelukhin commented on HIVE-17881: - The solution would be to remove hive.llap.io.memory.mode. Were you using it for some legitimate reason? :) > LLAP: Text cache NPE > > > Key: HIVE-17881 > URL: https://issues.apache.org/jira/browse/HIVE-17881 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran > > With LLAP IO enabled and hive.llap.io.memory.mode set to false. Text cache > throws NPE for following query > {code} > select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k; > {code} > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256) > at > org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore
[ https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217405#comment-16217405 ] Sergey Shelukhin commented on HIVE-17832: - Hmm, sure. As for the embedded metastore, it implies that the user is the admin because they have full control over metastore (and direct access to the database). But yeah, I guess it's similar to strict check parameters and we should allow users to shoot themselves in the foot :P +1 > Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in > metastore > -- > > Key: HIVE-17832 > URL: https://issues.apache.org/jira/browse/HIVE-17832 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0 > > Attachments: HIVE17832.1.patch, HIVE17832.2.patch > > > hive.metastore.disallow.incompatible.col.type.changes when set to true, will > disallow incompatible column type changes through alter table. But, this > parameter is not modifiable in HMS. If HMS in not embedded into HS2, the > value cannot be changed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-16603) Enforce foreign keys to refer to primary keys or unique keys
[ https://issues.apache.org/jira/browse/HIVE-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-16603. - Resolution: Fixed this failure seems to need more than just a quick look - opened HIVE-17886 > Enforce foreign keys to refer to primary keys or unique keys > > > Key: HIVE-16603 > URL: https://issues.apache.org/jira/browse/HIVE-16603 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16603.patch > > > Follow-up on HIVE-16575. > Currently we do not enforce foreign keys to refer to primary keys or unique > keys (as opposed to PostgreSQL and others); we should do that. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17778) Add support for custom counters in trigger expression
[ https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217321#comment-16217321 ] Hive QA commented on HIVE-17778: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893642/HIVE-17778.5.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11315 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=102) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=270) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedFiles (batchId=229) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7453/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7453/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7453/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12893642 - PreCommit-HIVE-Build > Add support for custom counters in trigger expression > - > > Key: HIVE-17778 > URL: https://issues.apache.org/jira/browse/HIVE-17778 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, > HIVE-17778.3.patch, HIVE-17778.4.patch, HIVE-17778.5.patch > > > HIVE-17508 only supports limited counters. This ticket is to extend it to > support custom counters (counters that are not supported by execution engine > will be dropped). -- This message was sent by Atlassian JIRA (v6.4.14#64029)