[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.

2017-10-24 Thread Harish Jaiprakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218092#comment-16218092
 ] 

Harish Jaiprakash commented on HIVE-17884:
--

Thanks [~prasanth_j].

For show/describe, the current thought process is to just expose the raw data 
via the information schema for now, and later enhance show/describe to display 
it in some formatted fashion.

I'll fix the typos and add the getTriggersForResourcePlan API to 
IMetaStoreClient.

It makes sense to support more data types in the expression parser; I'll add 
bytes and interval to it.
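As a rough illustration (not the actual Hive grammar), supporting byte-size and interval literals presumably means normalizing suffixed values to a base unit; the suffix set and base units below are assumptions:

```python
# Hypothetical sketch: normalize byte-size literals ("10kb", "1gb") and
# interval literals ("30s", "5min") to base units (bytes / milliseconds).
# The actual suffix set in the Hive trigger grammar may differ.
BYTE_SUFFIXES = {"kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
INTERVAL_SUFFIXES = {"ms": 1, "s": 1000, "min": 60_000, "hour": 3_600_000}

def parse_literal(text, suffixes):
    text = text.strip().lower()
    # Try longest suffixes first so "s" does not shadow "ms" or "min".
    for suffix in sorted(suffixes, key=len, reverse=True):
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * suffixes[suffix]
    return int(text)  # no suffix: value is already in base units

print(parse_literal("10kb", BYTE_SUFFIXES))     # 10240
print(parse_literal("30s", INTERVAL_SUFFIXES))  # 30000
```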

> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> The condition is a boolean expression of the form variable operator value, 
> with 'AND' and 'OR' support.
> The action is currently KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.
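As a rough sketch of how a condition of this shape (variable operator value, combined with AND/OR) could be evaluated against query counters — the counter names and evaluation model here are illustrative assumptions, not the Hive implementation:

```python
# Toy evaluator for a WM trigger condition such as
# "ELAPSED_TIME > 60000 AND SHUFFLE_BYTES > 1000000".
# Counter names and grammar details are hypothetical.
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "=": operator.eq}

def eval_condition(expr, counters):
    # OR binds loosest, so split on it first, then split each clause on AND.
    return any(
        all(_eval_comparison(term, counters) for term in clause.split(" AND "))
        for clause in expr.split(" OR ")
    )

def _eval_comparison(term, counters):
    var, op, value = term.split()
    return OPS[op](counters[var], int(value))

counters = {"ELAPSED_TIME": 90000, "SHUFFLE_BYTES": 500}
print(eval_condition("ELAPSED_TIME > 60000 AND SHUFFLE_BYTES > 1000000", counters))  # False
print(eval_condition("ELAPSED_TIME > 60000 OR SHUFFLE_BYTES > 1000000", counters))   # True
```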



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive

2017-10-24 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218084#comment-16218084
 ] 

liyunzhang commented on HIVE-17879:
---

[~gopalv]: {quote}
the test was using build artifacts from JDK8 - running LLAP (only) with JDK9 by 
overriding the --javaHome param during package builds.

{quote}
Doesn't LLAP need the Hadoop dependency? If it does, do you mean it can run 
successfully with a Hadoop package (built with JDK8) in a JDK9 environment? I 
tried this in my environment, but it failed.


> Can not find java.sql.date in JDK9 when building hive
> -
>
> Key: HIVE-17879
> URL: https://issues.apache.org/jira/browse/HIVE-17879
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang
>
> When building Hive with JDK9, the following error occurs:
> {code}
> [ERROR] Failed to execute goal 
> org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on 
> project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer: InvocationTargetException: 
> java/sql/Date: java.sql.Date -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) 
> on project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Error executing 
> DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:350)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.enhance(AbstractEnhancerMojo.java:266)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.executeDataNucleusTool(AbstractEnhancerMojo.java:72)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.execute(AbstractDataNucleusMojo.java:126)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
>   ... 20 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:333)
>   ... 25 more
> Caused by: java.lang.NoClassDefFoundError: java/sql/Date
>   at org.datanucleus.ClassConstants.(ClassConstants.java:66)
>   at 
> 

[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218067#comment-16218067
 ] 

Hive QA commented on HIVE-17884:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893737/HIVE-17884.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11322 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=155)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=270)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7460/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893737 - PreCommit-HIVE-Build

> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> The condition is a boolean expression of the form variable operator value, 
> with 'AND' and 'OR' support.
> The action is currently KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.





[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive

2017-10-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218051#comment-16218051
 ] 

Gopal V commented on HIVE-17879:


[~kellyzly]: the test was using build artifacts from JDK8 - running LLAP (only) 
with JDK9 by overriding the --javaHome param during package builds. 

The only thing that didn't work there was the misc.Cleaner reference in the 
LLAP off-heap cache (the cache was therefore disabled for both tests).

> Can not find java.sql.date in JDK9 when building hive
> -
>
> Key: HIVE-17879
> URL: https://issues.apache.org/jira/browse/HIVE-17879
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang
>
> When building Hive with JDK9, the following error occurs:
> {code}
> [ERROR] Failed to execute goal 
> org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on 
> project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer: InvocationTargetException: 
> java/sql/Date: java.sql.Date -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) 
> on project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Error executing 
> DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:350)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.enhance(AbstractEnhancerMojo.java:266)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.executeDataNucleusTool(AbstractEnhancerMojo.java:72)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.execute(AbstractDataNucleusMojo.java:126)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
>   ... 20 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:333)
>   ... 25 more
> Caused by: java.lang.NoClassDefFoundError: java/sql/Date
>   at org.datanucleus.ClassConstants.(ClassConstants.java:66)
>   at 
> 

[jira] [Comment Edited] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive

2017-10-24 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217983#comment-16217983
 ] 

liyunzhang edited comment on HIVE-17879 at 10/25/17 3:31 AM:
-

[~kgyrtkirk]: thanks for your suggestion, I will try it. Actually, I found that 
I need to build the Hadoop package first. If the Hadoop package is built with 
JDK8 and the Hive package is built with JDK9, the following exception is thrown:
{code}
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hadoop.util.StringUtils.(StringUtils.java:80)
at 
org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1437)
at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4064)
at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4091)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4294)
at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:4200)
at 
org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:99)
at 
org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:83)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:708)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 1
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3116)
at java.base/java.lang.String.substring(String.java:1885)
at org.apache.hadoop.util.Shell.(Shell.java:52)
... 16 more

{code}
[~gopalv]: I saw the performance test data on HIVE-17573. Was this tested with 
a Hive package (built with JDK9) and a Hadoop package (built with JDK9) in a 
JDK9 runtime environment?


was (Author: kellyzly):
[~kgyrtkirk]: thanks for your suggestion. will try.

> Can not find java.sql.date in JDK9 when building hive
> -
>
> Key: HIVE-17879
> URL: https://issues.apache.org/jira/browse/HIVE-17879
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang
>
> When building Hive with JDK9, the following error occurs:
> {code}
> [ERROR] Failed to execute goal 
> org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on 
> project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer: InvocationTargetException: 
> java/sql/Date: java.sql.Date -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) 
> on project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> 

[jira] [Comment Edited] (HIVE-17899) Provide an option to disable tez split grouping

2017-10-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218046#comment-16218046
 ] 

Gopal V edited comment on HIVE-17899 at 10/25/17 3:29 AM:
--

A Hive config option for tez.grouping.by-length=false?

{code}
if (!(groupByLength || groupByCount)) {
  throw new TezUncheckedException(
  "None of the grouping parameters are true: "
  + TEZ_GROUPING_SPLIT_BY_LENGTH + ", "
  + TEZ_GROUPING_SPLIT_BY_COUNT);
}
{code}

That part might need to go into Tez as well.


was (Author: gopalv):
A hive config option for - tez.grouping.by-length=false?

> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Only way to disable split grouping in tez is to change input format to 
> CombineHiveInputFormat. Provide a config option to disable split grouping 
> regardless of the IF. 





[jira] [Commented] (HIVE-17899) Provide an option to disable tez split grouping

2017-10-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218046#comment-16218046
 ] 

Gopal V commented on HIVE-17899:


A Hive config option for tez.grouping.by-length=false?

> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Only way to disable split grouping in tez is to change input format to 
> CombineHiveInputFormat. Provide a config option to disable split grouping 
> regardless of the IF. 





[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-24 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15104:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Xuefu for the review!

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, generates different sizes 
> of shuffle data.
> I think this is because Hive on MR serializes only part of the HiveKey, while 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?
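The reasoning above — that serializing the whole key object carries more bytes than writing just the key's payload — can be illustrated with a toy stand-in; the class shape and field names are hypothetical, and Python's pickle stands in for Kryo:

```python
# Toy comparison: writing only the raw key bytes (MR-style) vs. serializing
# the whole object graph (Kryo-style, approximated here with pickle).
import pickle

class HiveKeyLike:
    """Hypothetical stand-in for HiveKey: raw key bytes plus extra fields."""
    def __init__(self, key_bytes, hash_code, dist_key_len):
        self.key_bytes = key_bytes
        self.hash_code = hash_code
        self.dist_key_len = dist_key_len

key = HiveKeyLike(b"row-key-0001", hash_code=12345, dist_key_len=8)

# MR-style: only the key bytes plus a 4-byte length prefix reach the wire.
mr_payload = len(key.key_bytes) + 4

# Kryo/pickle-style: the full object, including class metadata and all fields.
kryo_payload = len(pickle.dumps(key))

print(mr_payload, kryo_payload)  # the full-object payload is larger
```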





[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218011#comment-16218011
 ] 

Hive QA commented on HIVE-17433:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893727/HIVE-17433.05.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7459/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7459/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7459/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-25 02:48:56.379
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7459/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 02:48:56.382
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 84950cf HIVE-17764 : alter view fails when 
hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki 
Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 84950cf HIVE-17764 : alter view fails when 
hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki 
Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 02:48:56.906
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:2898
error: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: patch does 
not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_between_columns.q.out:91
error: ql/src/test/results/clientpositive/llap/vector_between_columns.q.out: 
patch does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_complex_all.q.out:678
error: ql/src/test/results/clientpositive/llap/vector_complex_all.q.out: patch 
does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out:39
error: ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out: 
patch does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_include_no_sel.q.out:224
error: ql/src/test/results/clientpositive/llap/vector_include_no_sel.q.out: 
patch does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out:5955
error: 
ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out:
 patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893727 - PreCommit-HIVE-Build

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> 

[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.08.patch

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()





[jira] [Commented] (HIVE-17879) Can not find java.sql.date in JDK9 when building hive

2017-10-24 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217983#comment-16217983
 ] 

liyunzhang commented on HIVE-17879:
---

[~kgyrtkirk]: thanks for your suggestion, I will try it.

> Can not find java.sql.date in JDK9 when building hive
> -
>
> Key: HIVE-17879
> URL: https://issues.apache.org/jira/browse/HIVE-17879
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang
>
> When building Hive with JDK9, the following error occurs:
> {code}
> [ERROR] Failed to execute goal 
> org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) on 
> project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer: InvocationTargetException: 
> java/sql/Date: java.sql.Date -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.datanucleus:datanucleus-maven-plugin:3.3.0-release:enhance (default) 
> on project hive-standalone-metastore: Error executing DataNucleus tool 
> org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Error executing 
> DataNucleus tool org.datanucleus.enhancer.DataNucleusEnhancer
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:350)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.enhance(AbstractEnhancerMojo.java:266)
>   at 
> org.datanucleus.maven.AbstractEnhancerMojo.executeDataNucleusTool(AbstractEnhancerMojo.java:72)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.execute(AbstractDataNucleusMojo.java:126)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
>   ... 20 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.datanucleus.maven.AbstractDataNucleusMojo.executeInJvm(AbstractDataNucleusMojo.java:333)
>   ... 25 more
> Caused by: java.lang.NoClassDefFoundError: java/sql/Date
>   at org.datanucleus.ClassConstants.(ClassConstants.java:66)
>   at 
> org.datanucleus.plugin.NonManagedPluginRegistry.registerExtensions(NonManagedPluginRegistry.java:206)
>   at 
> org.datanucleus.plugin.NonManagedPluginRegistry.registerExtensionPoints(NonManagedPluginRegistry.java:155)
>   at org.datanucleus.plugin.PluginManager.(PluginManager.java:63)
>   at 
> 

[jira] [Commented] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217982#comment-16217982
 ] 

Vihang Karajgaonkar commented on HIVE-17764:


Patch merged to master. Hi [~janulatha], can you provide the patch for branch-2 
as well? The qfile test from the patch fails on branch-2.

> alter view fails when hive.metastore.disallow.incompatible.col.type.changes 
> set to true
> ---
>
> Key: HIVE-17764
> URL: https://issues.apache.org/jira/browse/HIVE-17764
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE17764.1.patch, HIVE17764.2.patch
>
>
> A view is a virtual structure that derives its type information from the 
> table(s) the view is based on. If the view definition is altered, the 
> corresponding column types should be updated. Whether the new types are 
> compatible with the view's previous structure is irrelevant.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12369) Native Vector GroupBy

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217981#comment-16217981
 ] 

Hive QA commented on HIVE-12369:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879115/HIVE-12369.06.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7458/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7458/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7458/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-25 01:49:44.492
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7458/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 01:49:44.495
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   42e70a3..84950cf  master -> origin/master
   74034f1..84e107b  branch-2   -> origin/branch-2
+ git reset --hard HEAD
HEAD is now at 42e70a3 HIVE-17471 : Vectorization: Enable 
hive.vectorized.row.identifier.enabled to true by default (Sergey Shelukhin, 
reviewed by Matt McCline)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig
Removing itests/src/test/resources/testconfiguration.properties.orig
Removing ql/src/test/queries/clientpositive/acid_vectorization_original.q
Removing 
ql/src/test/results/clientpositive/llap/acid_vectorization_original.q.out
Removing 
ql/src/test/results/clientpositive/tez/acid_vectorization_original.q.out
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 84950cf HIVE-17764 : alter view fails when 
hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki 
Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 01:49:49.788
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java:20
error: ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java: patch does 
not apply
error: patch failed: ql/src/test/results/clientpositive/llap/sysdb.q.out:3346
error: ql/src/test/results/clientpositive/llap/sysdb.q.out: patch does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_aggregate_9.q.out:147
error: ql/src/test/results/clientpositive/llap/vector_aggregate_9.q.out: patch 
does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_between_in.q.out:157
error: ql/src/test/results/clientpositive/llap/vector_between_in.q.out: patch 
does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_count_distinct.q.out:1327
error: ql/src/test/results/clientpositive/llap/vector_count_distinct.q.out: 
patch does not apply
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_decimal_precision.q.out:589
error: ql/src/test/results/clientpositive/llap/vector_decimal_precision.q.out: 
patch does not apply
error: ql/src/test/results/clientpositive/llap/vector_empty_where.q.out: No 
such file or directory
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out:609
error: 
ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out: 
patch does not apply
error: patch failed: 

[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17887:
-
Reporter: Santhosh B Gowda  (was: Sankar Hariappan)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Replicating a drop partition event on a table partitioned on a timestamp type 
> column fails in REPL LOAD.
> *Scenario:*
> 1. Create a table with a partition on a timestamp column.
> 2. Bootstrap dump/load.
> 3. Insert a record to create partition (p="2001-11-09 00:00:00.0").
> 4. Drop the same partition (p="2001-11-09 00:00:00.0").
> 5. Incremental dump/load.
> -- REPL LOAD throws the exception below:
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17832:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

Committed in branch-2 and master. Thanks [~janulatha] for your contribution.

> Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in 
> metastore
> --
>
> Key: HIVE-17832
> URL: https://issues.apache.org/jira/browse/HIVE-17832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE17832.1.patch, HIVE17832.2.patch
>
>
> hive.metastore.disallow.incompatible.col.type.changes, when set to true, 
> disallows incompatible column type changes through ALTER TABLE.  But this 
> parameter is not modifiable in HMS.  If HMS is not embedded into HS2, the 
> value cannot be changed.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217965#comment-16217965
 ] 

Hive QA commented on HIVE-17458:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893811/HIVE-17458.07.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11320 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_original]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=156)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=270)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7457/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7457/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7457/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893811 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17834) Fix flaky triggers test

2017-10-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17834:
-
Attachment: HIVE-17834.3.patch

TestTriggersTezSessionPoolManager wasn't performing validation as frequently as 
TestTriggersWorkloadManager, so the test failed before SHUFFLE_BYTES was 
published and validated (the query completed in the meantime).

> Fix flaky triggers test
> ---
>
> Key: HIVE-17834
> URL: https://issues.apache.org/jira/browse/HIVE-17834
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17834.1.patch, HIVE-17834.2.patch, 
> HIVE-17834.3.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-12631?focusedCommentId=16209803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16209803



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping

2017-10-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17899:


Assignee: Prasanth Jayachandran

> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> The only way to disable split grouping in Tez is to change the input format to 
> CombineHiveInputFormat. Provide a config option to disable split grouping 
> regardless of the IF. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping

2017-10-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17899:


Assignee: (was: Prasanth Jayachandran)

> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> The only way to disable split grouping in Tez is to change the input format to 
> CombineHiveInputFormat. Provide a config option to disable split grouping 
> regardless of the IF. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17899) Provide an option to disable tez split grouping

2017-10-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17899:



> Provide an option to disable tez split grouping
> ---
>
> Key: HIVE-17899
> URL: https://issues.apache.org/jira/browse/HIVE-17899
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> The only way to disable split grouping in Tez is to change the input format to 
> CombineHiveInputFormat. Provide a config option to disable split grouping 
> regardless of the IF. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217948#comment-16217948
 ] 

Sergey Shelukhin commented on HIVE-15670:
-

Beats me... the current implementation is as such.

> column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
> ---
>
> Key: HIVE-15670
> URL: https://issues.apache.org/jira/browse/HIVE-15670
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> The JSON can be too big with many columns (see setColumnStatsState method).
> We can make JSON more compact by only storing the list of columns with true 
> values. Or we can even store a bitmask in a dedicated column, and adjust it 
> when altering table (rare enough). Or we can just change the VALUE column to 
> text blob (might be a painful change wrt upgrade scripts, and supporting all 
> the DBs' varied blob implementations, esp. in directsql).
> Storing denormalized flags in a separate table will probably be slow, 
> comparatively.
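
The size difference between the options above can be sketched as follows (a 
hypothetical illustration; the JSON keys and column names are made up for the 
example, not Hive's actual serialization):

```python
import json

# Hypothetical stats map: every column flagged accurate (worst case for size)
cols = ["col%d" % i for i in range(100)]
accurate = {c: True for c in cols}

# Current style: a JSON object with one entry per column
full = json.dumps({"BASIC_STATS": "true", "COLUMN_STATS": accurate})

# Compact style: store only the columns whose flag is true
compact = json.dumps(
    {"BASIC_STATS": "true",
     "COLUMN_STATS": [c for c, v in accurate.items() if v]})

# Bitmask style: one bit per column, ordered by the table's column list
bitmask = sum(1 << i for i, c in enumerate(cols) if accurate[c])
```

The compact form drops the repeated boolean values, and the bitmask needs only 
one bit per column, at the cost of depending on a stable column ordering that 
must be adjusted on ALTER TABLE.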



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17885) Fix TestTriggersWorkloadManager.testTriggerHighShuffleBytes runtime fluctation

2017-10-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217946#comment-16217946
 ] 

Prasanth Jayachandran commented on HIVE-17885:
--

HIVE-17834 should fix this.

> Fix TestTriggersWorkloadManager.testTriggerHighShuffleBytes runtime fluctation
> --
>
> Key: HIVE-17885
> URL: https://issues.apache.org/jira/browse/HIVE-17885
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> The following testcase's execution time fluctuates between 30sec and 90sec:
> https://builds.apache.org/job/PreCommit-HIVE-Build/7450/testReport/org.apache.hive.jdbc/TestTriggersWorkloadManager/testTriggerHighShuffleBytes/history/
> If it reaches 90sec, it times out and fails.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.

2017-10-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217943#comment-16217943
 ] 

Prasanth Jayachandran commented on HIVE-17884:
--

Also would recommend one other minor change (either in this patch or a followup 
jira): while creating a resource plan, it will be better to remove '_' from 
query parallelism:

QUERY_PARALLELISM -> QUERY PARALLELISM

We can use '_' for trigger expression counter names.

Also, trigger expression counters do not accept values other than integers. 
Something like the following will throw an error:

WHEN HDFS_BYTES_READ > 10GB DO KILL

From a usability perspective, it is easier to specify 10GB than a value in 
bytes. Similarly for time-based counters: WHEN EXECUTION_TIME > 2 hours DO 
KILL. ExpressionFactory.java could parse such counters ("10 GB"). 
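
A parser for such human-readable counter values could look roughly like this 
(a sketch only; the suffix table and function name are illustrative assumptions, 
not ExpressionFactory.java's actual API):

```python
import re

# Hypothetical suffix table; the real grammar may support different units
_SUFFIXES = {"": 1, "B": 1, "KB": 1 << 10, "MB": 1 << 20,
             "GB": 1 << 30, "TB": 1 << 40}

def parse_counter_value(text):
    """Parse values like '10GB' or '10 GB' into a byte count."""
    m = re.fullmatch(r"\s*(\d+)\s*([KMGT]?B)?\s*", text, re.IGNORECASE)
    if not m:
        raise ValueError("unparseable counter value: %r" % text)
    number, suffix = m.group(1), (m.group(2) or "").upper()
    return int(number) * _SUFFIXES[suffix]
```

A plain integer like "1024" still parses as raw bytes, so existing trigger 
definitions would keep working under this scheme.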


> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression: variable operator value types with 'AND' 
> and 'OR' support.
> action is currently: KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.
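
The condition grammar above ('variable operator value' terms joined by AND/OR) 
can be sketched with a toy evaluator (a hedged illustration only; the operator 
set and the SQL-style precedence of OR over AND are assumptions, not the 
patch's actual parser):

```python
import operator

_OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
        "<=": operator.le, "=": operator.eq}

def _clause(text, counters):
    # One 'variable operator value' term, e.g. "ELAPSED_TIME > 100"
    var, op, value = text.split()
    return _OPS[op](counters.get(var, 0), int(value))

def evaluate(condition, counters):
    # Assume OR binds looser than AND, as in SQL
    return any(
        all(_clause(c.strip(), counters) for c in part.split(" AND "))
        for part in condition.split(" OR "))
```

For example, with counters {"ELAPSED_TIME": 200, "HDFS_BYTES_READ": 10}, the 
condition "ELAPSED_TIME > 100 AND HDFS_BYTES_READ > 50" evaluates to false, 
while the OR form of the same condition evaluates to true.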



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.

2017-10-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217941#comment-16217941
 ] 

Prasanth Jayachandran commented on HIVE-17884:
--

Found some issues when trying out this patch and the other RP patch:

Hive.geAllResourcePlans() -> Hive.getAllResourcePlans()

Will this patch also support SHOW TRIGGERS, or will that be in a separate patch?

IMetaStoreClient.java is missing a getTriggersForResourcePlan API (assuming the 
client will be allowed to retrieve triggers independently without going via 
getResourcePlan).

Hive.java is missing a similar getTriggersForResourcePlan.

Looks good otherwise.


> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression: variable operator value types with 'AND' 
> and 'OR' support.
> action is currently: KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-24 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Attachment: HIVE-17898.1.patch

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information e.g.:
> TableScan operator should have the following additional info
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-24 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Status: Patch Available  (was: Open)

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information e.g.:
> TableScan operator should have the following additional info
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17898) Explain plan output enhancement

2017-10-24 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17898:
--


> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> We would like to enhance the explain plan output to display additional 
> information e.g.:
> TableScan operator should have the following additional info
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths

2017-10-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217937#comment-16217937
 ] 

Ferdinand Xu commented on HIVE-17696:
-

Two changes here:
* DataWritableReadSupport did two things in its init method: 1) create the 
request schema, and 2) create the metadata. The vectorized reader only needs 
part one.
* DataWritableReadSupport supported a nested pruning filter, but the 
vectorization path still has some issues that cause qtest failures, so I 
disabled it in the 2nd patch.

> Vectorized reader does not seem to be pushing down projection columns in 
> certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17696.2.patch, HIVE-17696.patch
>
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for(int i = 0; i < columnNamesList.size(); i++) {
> indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, 
> columnNamesList,
> indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, 
> columnNamesList,
> columnTypesList);
> }
> indexColumnsWanted = 
> ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && 
> !indexColumnsWanted.isEmpty()) {
>   requestedSchema =
> DataWritableReadSupport.getSchemaByIndex(tableSchema, 
> columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>   configuration, footer.getFileMetaData(), file, blocks, 
> requestedSchema.getColumns());
> {noformat}
> Couple of things to notice here:
> Most of this code is duplicated from {{DataWritableReadSupport.init()}} 
> method. 
> the else condition passes in fileSchema instead of using tableSchema like we 
> do in DataWritableReadSupport.init() method. Does this cause projection 
> columns to be missed when we read parquet files? We should probably just 
> reuse ReadContext returned from {{DataWritableReadSupport.init()}} method 
> here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17719) Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth whitelist

2017-10-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-17719.
--
Resolution: Won't Fix

This can lead to potential security issues. So instead the repl load command 
has been enhanced to take parameters.


> Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth 
> whitelist
> ---
>
> Key: HIVE-17719
> URL: https://issues.apache.org/jira/browse/HIVE-17719
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>
> mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf can be needed to 
> access a remote cluster in HA config for hive replication v2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace

2017-10-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17897:
-
Reporter: Sankar Hariappan  (was: Thejas M Nair)

> "repl load" in bootstrap phase fails when partitions have whitespace
> 
>
> Key: HIVE-17897
> URL: https://issues.apache.org/jira/browse/HIVE-17897
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Thejas M Nair
>Priority: Critical
> Fix For: 3.0.0
>
>
> The issue is that Path.toURI().toString() is being used to serialize the 
> location, while new Path(String) is used to deserialize it. URI escapes chars 
> such as space, so the deserialized location doesn't point to the correct file 
> location.
> The following exception is seen:
> {code}
> 2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] 
> exec.Task: Failed with exception null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211)
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71)
> at 
> org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {code}
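The escaping mismatch described above can be reproduced with plain JDK classes; this is an illustrative sketch, not Hive's code — `java.io.File.toURI()` stands in for Hadoop's `Path.toURI()`, and the path is made up:

```java
import java.io.File;

// Illustrative sketch of the bug pattern: serializing a location through a
// URI percent-escapes the space, so treating the URI string as a raw path
// afterwards points at a directory named "a%20b" instead of "a b".
public class PathEscapeDemo {
    static String serialize(String rawPath) {
        // Mirrors Path.toURI().toString(): the space becomes %20.
        return new File(rawPath).toURI().toString();
    }

    public static void main(String[] args) {
        String rawPath = "/warehouse/tbl/part=a b"; // partition dir with a space
        String serialized = serialize(rawPath);
        System.out.println(serialized);                   // ...part=a%20b
        // A naive new Path(String)-style deserialization keeps the %20,
        // so it no longer matches the original location.
        System.out.println(serialized.contains("a b"));   // false
        System.out.println(serialized.contains("a%20b")); // true
    }
}
```

This is why serializing with `Path.toString()` (or deserializing with a URI-aware constructor) avoids the mismatch.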





[jira] [Comment Edited] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed

2017-10-24 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217913#comment-16217913
 ] 

Andrew Sherman edited comment on HIVE-17826 at 10/25/17 12:35 AM:
--

I looked into an alternative solution, which is to use an 
[IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. 
This can be inserted into log4j when the RoutingAppender is created in 
LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a 
configurable interval. When the thread runs, it checks whether any of the 
RoutingAppender's sub-Appenders have been idle for more than a configurable 
time. Any that are found are stopped, and their AppenderControl is removed. 
I was able to use this instead of LogUtils.stopQueryAppender() to cause 
OperationLogs to close, providing an alternative mechanism for avoiding the 
file descriptor leak fixed in [HIVE-17128].
 
The problem I see is that an IdlePurgePolicy may prematurely close the log for 
a long-running operation if the operation is not logging. I experimented to see 
what happens when logging with a particular key restarts after being closed by 
the IdlePurgePolicy. The good news is that the logging does succeed, but the 
bad news is that the second log file appears to overwrite the original one.

So I think that the original fix I proposed may be simpler and safer.

Edited to add: we may also need [HIVE-17373] to be completed in order to get 
bug fixes to log4j.
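For reference, an IdlePurgePolicy is normally attached declaratively in a log4j2 configuration. A hedged sketch of what that would look like — the appender names and the file path pattern here are illustrative, and in Hive the Routing appender is actually built programmatically in LogDivertAppender rather than from XML:

```xml
<Routing name="query-routing">
  <Routes pattern="$${ctx:queryId}">
    <Route>
      <RandomAccessFile name="query-file-appender"
                        fileName="${sys:hive.log.dir}/${ctx:queryId}.log">
        <PatternLayout pattern="%d{ISO8601} %5p %c{2}: %m%n"/>
      </RandomAccessFile>
    </Route>
  </Routes>
  <!-- Stop sub-appenders that have logged nothing for 15 minutes. -->
  <IdlePurgePolicy timeToLive="15" timeUnit="minutes"/>
</Routing>
```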



was (Author: asherman):
I looked into an alternative solution, which is to use an 
[IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. 
This can be inserted into log4j when the RoutingAppender is created in 
LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a 
configurable interval. When the thread runs, it checks whether any of the 
RoutingAppender's sub-Appenders have been idle for more than a configurable 
time. Any that are found are stopped, and their AppenderControl is removed. 
I was able to use this instead of LogUtils.stopQueryAppender() to cause 
OperationLogs to close, providing an alternative mechanism for avoiding the 
file descriptor leak fixed in [HIVE-17128].
 
The problem I see is that an IdlePurgePolicy may prematurely close the log for 
a long-running operation if the operation is not logging. I experimented to see 
what happens when logging with a particular key restarts after being closed by 
the IdlePurgePolicy. The good news is that the logging does succeed, but the 
bad news is that the second log file appears to overwrite the original one.

So I think that the original fix I proposed may be simpler and safer.


> Error writing to RandomAccessFile after operation log is closed
> ---
>
> Key: HIVE-17826
> URL: https://issues.apache.org/jira/browse/HIVE-17826
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17826.1.patch
>
>
> We are seeing the error from HS2 process stdout.
> {noformat}
> 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>  for appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing 
> Appender query-file-appender 
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error 
> writing to RandomAccessFile 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103)
>   at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136)
>   at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>   at 
> 

[jira] [Commented] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217927#comment-16217927
 ] 

Vihang Karajgaonkar commented on HIVE-17764:


+1

> alter view fails when hive.metastore.disallow.incompatible.col.type.changes 
> set to true
> ---
>
> Key: HIVE-17764
> URL: https://issues.apache.org/jira/browse/HIVE-17764
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE17764.1.patch, HIVE17764.2.patch
>
>
> A view is a virtual structure that derives its type information from the 
> table(s) the view is based on. If the view definition is altered, the 
> corresponding column types should be updated. Whether the change is 
> compatible with the view's previous structure is irrelevant.





[jira] [Assigned] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace

2017-10-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-17897:



> "repl load" in bootstrap phase fails when partitions have whitespace
> 
>
> Key: HIVE-17897
> URL: https://issues.apache.org/jira/browse/HIVE-17897
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Critical
> Fix For: 3.0.0
>
>
> The issue is that Path.toURI().toString() is being used to serialize the 
> location, while new Path(String) is used to deserialize it. URI escapes chars 
> such as space, so the deserialized location doesn't point to the correct file 
> location.
> The following exception is seen:
> {code}
> 2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] 
> exec.Task: Failed with exception null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211)
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71)
> at 
> org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {code}





[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed

2017-10-24 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217913#comment-16217913
 ] 

Andrew Sherman commented on HIVE-17826:
---

I looked into an alternative solution, which is to use an 
[IdlePurgePolicy|https://logging.apache.org/log4j/2.x/manual/appenders.html]. 
This can be inserted into log4j when the RoutingAppender is created in 
LogDivertAppender. The IdlePurgePolicy works by scheduling a thread to run at a 
configurable interval. When the thread runs, it checks whether any of the 
RoutingAppender's sub-Appenders have been idle for more than a configurable 
time. Any that are found are stopped, and their AppenderControl is removed. 
I was able to use this instead of LogUtils.stopQueryAppender() to cause 
OperationLogs to close, providing an alternative mechanism for avoiding the 
file descriptor leak fixed in [HIVE-17128].
 
The problem I see is that an IdlePurgePolicy may prematurely close the log for 
a long-running operation if the operation is not logging. I experimented to see 
what happens when logging with a particular key restarts after being closed by 
the IdlePurgePolicy. The good news is that the logging does succeed, but the 
bad news is that the second log file appears to overwrite the original one.

So I think that the original fix I proposed may be simpler and safer.


> Error writing to RandomAccessFile after operation log is closed
> ---
>
> Key: HIVE-17826
> URL: https://issues.apache.org/jira/browse/HIVE-17826
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17826.1.patch
>
>
> We are seeing the error from HS2 process stdout.
> {noformat}
> 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>  for appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing 
> Appender query-file-appender 
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error 
> writing to RandomAccessFile 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103)
>   at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136)
>   at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>   at 
> org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362)
>   at 
> org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79)
>   at 
> org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385)
>   at 
> org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103)
>   at 
> 

[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-24 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217912#comment-16217912
 ] 

Alexander Behm commented on HIVE-15670:
---

May I ask what's the purpose of storing this JSON in the table properties? It 
seems pretty expensive to me. If you want to keep track of the accuracy of 
column stats, why not populate a "last updated" timestamp in the appropriate 
column statistic?

> column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
> ---
>
> Key: HIVE-15670
> URL: https://issues.apache.org/jira/browse/HIVE-15670
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> The JSON can be too big with many columns (see setColumnStatsState method).
> We can make JSON more compact by only storing the list of columns with true 
> values. Or we can even store a bitmask in a dedicated column, and adjust it 
> when altering table (rare enough). Or we can just change the VALUE column to 
> text blob (might be a painful change wrt upgrade scripts, and supporting all 
> the DBs' varied blob implementations, esp. in directsql).
> Storing denormalized flags in a separate table will probably be slow, 
> comparatively.
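The bitmask alternative mentioned in the description can be sketched as follows. This is illustrative only — the class name, the assumption that columns are identified by their position in the table schema, and the 64-column limit of a single long are all mine, not the ticket's:

```java
import java.util.List;
import java.util.Set;

// Sketch of the bitmask idea: instead of a JSON map of column name -> true,
// store one bit per column, in table-schema order. A single long covers up
// to 64 columns; wider tables would need a byte array.
public class ColumnStatsBitmask {
    public static long encode(List<String> allCols, Set<String> accurateCols) {
        long mask = 0L;
        for (int i = 0; i < allCols.size() && i < 64; i++) {
            if (accurateCols.contains(allCols.get(i))) {
                mask |= 1L << i;   // set the bit for an accurate column
            }
        }
        return mask;
    }

    public static boolean isAccurate(long mask, int colIndex) {
        return (mask & (1L << colIndex)) != 0;
    }
}
```

A fixed-width mask also sidesteps the VALUE-column length limit entirely, at the cost of having to adjust it when the table schema changes.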





[jira] [Updated] (HIVE-17841) implement applying the resource plan

2017-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17841:

Attachment: HIVE-17841.02.patch

Adding many more tests

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.patch
>
>






[jira] [Commented] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils

2017-10-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217881#comment-16217881
 ] 

Ashutosh Chauhan commented on HIVE-16970:
-

+1

> General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
> -
>
> Key: HIVE-16970
> URL: https://issues.apache.org/jira/browse/HIVE-16970
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch
>
>
> # Simplify
> # Do not instantiate empty collections
> # Parsing is incorrect:
> {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils}
>   public static String buildKey(String dbName, String tableName, List<String> partVals) {
> String key = buildKey(dbName, tableName);
> if (partVals == null || partVals.size() == 0) {
>   return key;
> }
> // missing a delimiter between the "tableName" and the first "partVal"
> for (int i = 0; i < partVals.size(); i++) {
>   key += partVals.get(i);
>   if (i != partVals.size() - 1) {
> key += delimit;
>   }
> }
> return key;
>   }
> public static Object[] splitPartitionColStats(String key) {
> // ...
> }
> {code}
> The result of passing the key to the "split" method is:
> {code}
> buildKey("db","Table",["Part1","Part2","Part3"], "col");
> [db, tablePart1, [Part2, Part3], col]
> // "table" and "Part1" are mistakenly concatenated
> {code}
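A sketch of the fix the report implies — always emit the delimiter between the table-level key and the first partition value. The class name and the `DELIMIT` constant are stand-ins for the real code in CacheUtils, not Hive's actual identifiers:

```java
import java.util.List;

// Hypothetical corrected buildKey: the table-level key and the first
// partition value are now separated by the delimiter, so the companion
// split method can recover [db, table, partVals...] unambiguously.
public class CacheKeyFix {
    static final String DELIMIT = "#";   // stand-in for CacheUtils' delimiter

    public static String buildKey(String dbName, String tableName,
                                  List<String> partVals) {
        String key = dbName + DELIMIT + tableName;
        if (partVals == null || partVals.isEmpty()) {
            return key;
        }
        // String.join inserts the delimiter between every element,
        // and we add one more between the table key and the first value.
        return key + DELIMIT + String.join(DELIMIT, partVals);
    }
}
```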





[jira] [Commented] (HIVE-16663) String Caching For Rows

2017-10-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217877#comment-16217877
 ] 

Ashutosh Chauhan commented on HIVE-16663:
-

+1 pending tests

> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, 
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query, especially when JOINs are present in the query.  As it currently 
> stands, beeline does not attempt to cache any of these values and therefore 
> it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.
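A minimal sketch of such a cache (illustrative, not the attached patch): repeated cell values resolve to one shared String instance, so a result set with many duplicate values holds far fewer distinct strings on the heap.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal row-value string cache: the first occurrence of a value is
// stored; every later equal value returns the same shared instance.
public class RowStringCache {
    private final Map<String, String> cache = new HashMap<>();

    public String intern(String value) {
        if (value == null) {
            return null;
        }
        // computeIfAbsent maps the value to itself, so equal strings
        // collapse to a single instance.
        return cache.computeIfAbsent(value, v -> v);
    }
}
```

In practice the cache would need a bound (e.g. an LRU map) so a high-cardinality column cannot make it grow without limit.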





[jira] [Updated] (HIVE-17896) TopN: Create a standalone vectorizable TopN operator

2017-10-24 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-17896:
---
Description: 
For TPC-DS Query27, the TopN operation is delayed by the group-by: the 
group-by operator buffers up all the rows before discarding 99% of them 
in the TopN hash within the ReduceSink operator.

The RS TopN operator is very restrictive, as it only supports filtering on the 
shuffle keys; it is better to do this before breaking the vectors into rows 
and losing the isRepeating properties.

Adding a TopN operator in the physical operator tree allows the following to 
happen:

GBY->RS(Top=1)

can become 

TopN(1)->GBY->RS(Top=1)

That way, the TopN can remove rows before they are buffered into the GBY and 
consume memory.

Here's the equivalent implementation in Presto:

https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35

Adding this as a sub-feature of GroupBy prevents further optimizations if the 
GBY is on keys "a,b,c" and the TopN is on just "a".

  was:
For TPC-DS Query27, the TopN operation is delayed by the group-by: the 
group-by operator buffers up all the rows before discarding 99% of them 
in the TopN hash within the ReduceSink operator.

The RS TopN operator is very restrictive, as it only supports filtering on the 
shuffle keys; it is better to do this before breaking the vectors into rows 
and losing the isRepeating properties.

Adding a TopN operator in the physical operator tree allows the following to 
happen:

GBY->RS(Top=1)

can become 

TopN(1)->GBY->RS(Top=1)

That way, the TopN can remove rows before they are buffered into the GBY and 
consume memory.

Here's the equivalent implementation in Presto:

https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35


> TopN: Create a standalone vectorizable TopN operator
> 
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Affects Versions: 3.0.0
>Reporter: Gopal V
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by: the 
> group-by operator buffers up all the rows before discarding 99% of them 
> in the TopN hash within the ReduceSink operator.
> The RS TopN operator is very restrictive, as it only supports filtering on 
> the shuffle keys; it is better to do this before breaking the vectors into 
> rows and losing the isRepeating properties.
> Adding a TopN operator in the physical operator tree allows the following to 
> happen:
> GBY->RS(Top=1)
> can become 
> TopN(1)->GBY->RS(Top=1)
> That way, the TopN can remove rows before they are buffered into the GBY and 
> consume memory.
> Here's the equivalent implementation in Presto:
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the 
> GBY is on keys "a,b,c" and the TopN is on just "a".
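The early-discard idea can be illustrated with a standalone filter — a sketch, not Hive's operator, and row-at-a-time rather than vectorized: keep a bounded heap of the best N keys seen so far and drop any row that cannot make the top N before it ever reaches the group-by.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Standalone TopN filter sketch: retains the N smallest keys seen so far.
// offer() returns false for rows that can be dropped before the group-by
// buffers them.
public class TopNFilter {
    private final int n;
    // Max-heap of the kept keys; the root is the current "worst" kept key.
    private final PriorityQueue<Integer> kept =
        new PriorityQueue<>(Comparator.reverseOrder());

    public TopNFilter(int n) { this.n = n; }

    public boolean offer(int key) {
        if (kept.size() < n) {
            kept.add(key);
            return true;
        }
        if (key < kept.peek()) {   // beats the worst kept key
            kept.poll();
            kept.add(key);
            return true;
        }
        return false;              // cannot be in the top N: drop early
    }
}
```

A vectorized version would apply the same bound to whole column batches, preserving the isRepeating shortcut the description mentions.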





[jira] [Updated] (HIVE-17831) HiveSemanticAnalyzerHookContext does not update the HiveOperation after sem.analyze() is called

2017-10-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17831:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> HiveSemanticAnalyzerHookContext does not update the HiveOperation after 
> sem.analyze() is called
> ---
>
> Key: HIVE-17831
> URL: https://issues.apache.org/jira/browse/HIVE-17831
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0, 2.2.1, 2.3.1
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Fix For: 2.1.2, 3.0.0, 2.4.0, 2.2.1, 2.3.2
>
> Attachments: HIVE-17831.1.patch
>
>
> The SemanticAnalyzer.analyze() call in the Driver.compile() method updates 
> the HiveOperation based on the analysis it performs. However, the patch from 
> HIVE-17048 does not update that operation, so an invalid operation is sent 
> to the postAnalyze() call.





[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217849#comment-16217849
 ] 

Hive QA commented on HIVE-15104:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893666/HIVE-15104.10.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11319 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=270)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7456/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7456/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7456/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893666 - PreCommit-HIVE-Build

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, will generate different 
> sizes of shuffle data.
> I think this is because Hive on MR serializes only part of the HiveKey, 
> while Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?





[jira] [Comment Edited] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217803#comment-16217803
 ] 

Vihang Karajgaonkar edited comment on HIVE-17895 at 10/24/17 10:27 PM:
---

That's a good idea. I'm not sure if we can automate this for our pre-commit, 
but it would be good to make sure that the results match with and without 
vectorization. Meanwhile, I will give it a shot on branch-2 to see if these 
tests are failing there as well. Did you run with all the **vector**.q 
files, or were there specific qfiles that you targeted? Thanks!


was (Author: vihangk1):
That's a good idea. I'm not sure if we can automate this for our pre-commit, 
but it would be good to make sure that the results match with and without 
vectorization. Meanwhile, I will give it a shot on branch-2 to see if these 
tests are failing there as well. Did you run with all the *vector*.q 
files, or were there specific qfiles that you targeted? Thanks!

> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103  NULL  0.0   NULL  original
> Vec:    103  NULL  NULL  NULL  original





[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217803#comment-16217803
 ] 

Vihang Karajgaonkar commented on HIVE-17895:


That's a good idea. I'm not sure if we can automate this for our pre-commit, 
but it would be good to make sure that the results match with and without 
vectorization. Meanwhile, I will give it a shot on branch-2 to see if these 
tests are failing there as well. Did you run with all the *vector*.q 
files, or were there specific qfiles that you targeted? Thanks!

> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103  NULL  0.0   NULL  original
> Vec:    103  NULL  NULL  NULL  original





[jira] [Updated] (HIVE-17831) HiveSemanticAnalyzerHookContext does not update the HiveOperation after sem.analyze() is called

2017-10-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-17831:
---
Fix Version/s: 2.3.2

> HiveSemanticAnalyzerHookContext does not update the HiveOperation after 
> sem.analyze() is called
> ---
>
> Key: HIVE-17831
> URL: https://issues.apache.org/jira/browse/HIVE-17831
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0, 2.2.1, 2.3.1
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Fix For: 2.1.2, 3.0.0, 2.4.0, 2.2.1, 2.3.2
>
> Attachments: HIVE-17831.1.patch
>
>
> The SemanticAnalyzer.analyze() call in the Driver.compile() method updates 
> the HiveOperation based on the analysis it performs. However, the patch from 
> HIVE-17048 does not update that operation, so an invalid operation is sent 
> to the postAnalyze() call.





[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217760#comment-16217760
 ] 

Matt McCline commented on HIVE-17895:
-

[~vihangk1] No, I have not checked on branch-2.

The trick I use in checking is to modify the "if (!vectorPath) {" in the 
Vectorizer source to "if (!vectorPath || true) {" to temporarily disable 
vectorization and then execute the Q file tests.  I look for query result 
differences in the diffs.  Occasionally, differences are due to a lack of "-- 
SORT_QUERY_RESULTS" in the Q file.  But usually it is a genuinely different 
result, and not necessarily the fault of vectorization.  Sometimes it is 
row-mode that is in error.
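The check-by-disabling trick above can be sketched as follows. `VectorizerOverrideSketch` and `fallBackToRowMode` are hypothetical stand-ins for the real guard in the Vectorizer source, shown only to illustrate the effect of the temporary `|| true` override:

```java
// Illustrative stand-in only: the real check lives inside Hive's
// physical optimizer (the Vectorizer), not in this class.
public class VectorizerOverrideSketch {

    // Hypothetical planner decision: fall back to row mode when the
    // operator tree cannot be vectorized.
    static boolean fallBackToRowMode(boolean vectorPath) {
        // Original condition:      if (!vectorPath) { ... }
        // Temporary debug change:  if (!vectorPath || true) { ... }
        // The appended "|| true" forces every query down the row-mode
        // path, so Q file output diffs expose vectorized-vs-row-mode
        // result differences.
        if (!vectorPath || true) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // With the override in place, even a vectorizable plan falls back.
        System.out.println(fallBackToRowMode(true));  // prints "true"
    }
}
```

Running the Q file tests with this override and diffing the results against a normal run isolates cases where row mode and vector mode disagree.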

> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103   NULL   0.0    NULL   original
> Vec:    103   NULL   NULL   NULL   original



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.07.patch

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17877) HoS: combine equivalent DPP sink works

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217684#comment-16217684
 ] 

Hive QA commented on HIVE-17877:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893661/HIVE-17877.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11316 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_3]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=171)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=269)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=229)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth 
(batchId=242)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7455/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7455/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7455/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893661 - PreCommit-HIVE-Build

> HoS: combine equivalent DPP sink works
> --
>
> Key: HIVE-17877
> URL: https://issues.apache.org/jira/browse/HIVE-17877
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-17877.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17471:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17471.01.patch, HIVE-17471.patch
>
>
> We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217643#comment-16217643
 ] 

Vihang Karajgaonkar commented on HIVE-17895:


Hi [~mmccline], I see you created some of the wrong-results JIRAs related to 
vectorization. Do you know if they apply to branch-2 as well?

> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103   NULL   0.0    NULL   original
> Vec:    103   NULL   NULL   NULL   original



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217637#comment-16217637
 ] 

Thejas M Nair commented on HIVE-17887:
--

+1

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Trying to replicate a drop partition event on a table with a partition on a 
> timestamp type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217635#comment-16217635
 ] 

Sergey Shelukhin commented on HIVE-17884:
-

One small comment, overall looks good. cc [~prasanth_j]

> Implement create, alter and drop workload management triggers.
> --
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17884.01.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression: variable operator value types with 'AND' 
> and 'OR' support.
> action is currently: KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)

2017-10-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217616#comment-16217616
 ] 

Gunther Hagleitner commented on HIVE-14731:
---

Ran relevant tests locally. All pass. Committed to master.

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> 
>
> Key: HIVE-14731
> URL: https://issues.apache.org/jira/browse/HIVE-14731
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, 
> HIVE-14731.11.patch, HIVE-14731.12.patch, HIVE-14731.13.patch, 
> HIVE-14731.14.patch, HIVE-14731.15.patch, HIVE-14731.16.patch, 
> HIVE-14731.17.patch, HIVE-14731.18.patch, HIVE-14731.19.patch, 
> HIVE-14731.2.patch, HIVE-14731.20.patch, HIVE-14731.21.patch, 
> HIVE-14731.22.patch, HIVE-14731.23.patch, HIVE-14731.3.patch, 
> HIVE-14731.4.patch, HIVE-14731.5.patch, HIVE-14731.6.patch, 
> HIVE-14731.7.patch, HIVE-14731.8.patch, HIVE-14731.9.patch
>
>
> Given cartesian product edge is available in Tez now (see TEZ-3230), let's 
> integrate it into Hive on Tez. This allows us to have more than one reducer 
> in cross product queries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17882) Resource plan retrieval looks incorrect

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217608#comment-16217608
 ] 

Sergey Shelukhin commented on HIVE-17882:
-

+1

> Resource plan retrieval looks incorrect
> ---
>
> Key: HIVE-17882
> URL: https://issues.apache.org/jira/browse/HIVE-17882
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17882.01.patch
>
>
> {code}
> 0: jdbc:hive2://localhost:1> show resource plan global;
> +--+-++
> | rp_name  | status  | query_parallelism  |
> +--+-++
> | global   | 1   | NULL   |
> +--+-++
> {code}
> looks like status and query_parallelism got swapped.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217539#comment-16217539
 ] 

Matt McCline edited comment on HIVE-17433 at 10/24/17 8:11 PM:
---

Known Wrong Vectorization Results on Master:

HIVE-17893: Vectorization: Wrong results for vector_udf3.q
HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q
HIVE-17890: Vectorization: Wrong results for vectorized_case.q
HIVE-17889: Vectorization: Wrong results for vectorization_15.q
HIVE-17863: Vectorization: Two Q files produce wrong PTF query results
HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q

HIVE-16919: Vectorization: vectorization_short_regress.q has query result 
differences with non-vectorized run. Vectorized unary function broken?
HIVE-17895: Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
HIVE-17894: Vectorization: Wrong results for dynpart_sort_opt_vectorization.q 
(LLAP)


was (Author: mmccline):
Known Wrong Vectorization Results on Master:

HIVE-17893: Vectorization: Wrong results for vector_udf3.q
HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q
HIVE-17890: Vectorization: Wrong results for vectorized_case.q
HIVE-17889: Vectorization: Wrong results for vectorization_15.q
HIVE-17863: Vectorization: Two Q files produce wrong PTF query results
HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long, which we are calling Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals in Hive for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive environment variable 
> hive.vectorized.input.format.supports.enabled that has a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY 
> key, value
> Will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...
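The annotation mechanism described in the issue above can be sketched as a small, self-contained example. The annotation name and the DECIMAL_64 value follow the JIRA text, but the `Support` enum, the example input-format class, and the `supportsDecimal64` helper are assumptions made purely for illustration; the real Hive types differ:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Assumed enum of input-format capabilities; DECIMAL_64 mirrors the JIRA text.
enum Support { DECIMAL_64 }

// Annotation shape as described: a vectorized input format declares
// which optimizations it supports.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface VectorizedInputFormatSupports {
    Support[] supports();
}

// Example input format opting in to Decimal64.
@VectorizedInputFormatSupports(supports = {Support.DECIMAL_64})
class TextVectorizedInputFormatSketch { }

class SupportCheck {
    // Hypothetical check the Vectorizer could perform to decide whether to
    // plan Decimal64ColumnVector instead of DecimalColumnVector.
    static boolean supportsDecimal64(Class<?> inputFormat) {
        VectorizedInputFormatSupports ann =
            inputFormat.getAnnotation(VectorizedInputFormatSupports.class);
        if (ann == null) {
            return false;
        }
        for (Support s : ann.supports()) {
            if (s == Support.DECIMAL_64) {
                return true;
            }
        }
        return false;
    }
}
```

An input format without the annotation (or without DECIMAL_64 in its supports list) would keep receiving the full-featured DecimalColumnVector, which matches the opt-in design the issue describes.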



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17895:
---


> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103   NULL   0.0    NULL   original
> Vec:    103   NULL   NULL   NULL   original



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17894) Vectorization: Wrong results for dynpart_sort_opt_vectorization.q (LLAP)

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17894:
---


> Vectorization: Wrong results for dynpart_sort_opt_vectorization.q (LLAP)
> 
>
> Key: HIVE-17894
> URL: https://issues.apache.org/jira/browse/HIVE-17894
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 34
> Vec: 38



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217574#comment-16217574
 ] 

Vihang Karajgaonkar commented on HIVE-17832:


+1. I will be committin this patch EOD unless someone has any other objections.

> Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in 
> metastore
> --
>
> Key: HIVE-17832
> URL: https://issues.apache.org/jira/browse/HIVE-17832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE17832.1.patch, HIVE17832.2.patch
>
>
> hive.metastore.disallow.incompatible.col.type.changes when set to true, will 
> disallow incompatible column type changes through alter table.  But, this 
> parameter is not modifiable in HMS.  If HMS is not embedded into HS2, the 
> value cannot be changed.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217574#comment-16217574
 ] 

Vihang Karajgaonkar edited comment on HIVE-17832 at 10/24/17 7:56 PM:
--

+1. I will be committing this patch EOD unless someone has any other objections.


was (Author: vihangk1):
+1. I will be committin this patch EOD unless someone has any other objections.

> Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in 
> metastore
> --
>
> Key: HIVE-17832
> URL: https://issues.apache.org/jira/browse/HIVE-17832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE17832.1.patch, HIVE17832.2.patch
>
>
> hive.metastore.disallow.incompatible.col.type.changes when set to true, will 
> disallow incompatible column type changes through alter table.  But, this 
> parameter is not modifiable in HMS.  If HMS is not embedded into HS2, the 
> value cannot be changed.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Status: Patch Available  (was: Open)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Trying to replicate a drop partition event on a table with a partition on a 
> timestamp type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Attachment: HIVE-17887.01.patch

Attached 01.patch with drop partition replicated for timestamp column partition.
Request [~thejas] to please review the same!

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Trying to replicate a drop partition event on a table with a partition on a 
> timestamp type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work stopped] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17887 stopped by Sankar Hariappan.
---
> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> Replicating a drop partition event on a table partitioned on a timestamp 
> type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition (p="2001-11-09 00:00:00.0").
> 4. drop the same partition (p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws the below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}
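The trace points at the drop-partition filter being built with the raw timestamp value, whose ':' characters the metastore filter lexer cannot tokenize. A minimal sketch of the quoting idea (`build_partition_filter` is a hypothetical helper for illustration, not Hive's actual code):

```python
def build_partition_filter(col, value):
    """Build a metastore-style partition filter string with a quoted value.

    An unquoted timestamp such as 2001-11-09 00:00:00.0 contains ':'
    characters the filter lexer cannot tokenize ("no viable alternative
    at character ':'"); quoting the value sidesteps that.
    """
    escaped = str(value).replace('"', '\\"')
    return '%s = "%s"' % (col, escaped)

print(build_partition_filter("p", "2001-11-09 00:00:00.0"))
# p = "2001-11-09 00:00:00.0"
```

The same helper escapes embedded quotes so the generated filter stays lexable for arbitrary string partition values.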





[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.07.patch

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()
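The offset calculation described above can be sketched as summing the row counts of the files that precede the split's file in the logical bucket; file names and row counts below are hypothetical, and this only illustrates the idea, mirroring what OrcRawRecordMerger does:

```python
def original_file_row_offset(files_in_bucket, split_file):
    """Return the starting synthetic row id for split_file.

    Assumes the 'original' files of one logical bucket are listed in
    order and that row ids are assigned consecutively across them
    (illustrative sketch only).
    """
    offset = 0
    for name, row_count in files_in_bucket:
        if name == split_file:
            return offset
        offset += row_count
    raise ValueError("split file not in bucket: %s" % split_file)

# Hypothetical bucket contents: (file name, row count)
bucket = [("000000_0", 1000), ("000000_0_copy_1", 500), ("000000_0_copy_2", 250)]
print(original_file_row_offset(bucket, "000000_0_copy_2"))  # 1500
```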





[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217539#comment-16217539
 ] 

Matt McCline commented on HIVE-17433:
-

Known Wrong Vectorization Results on Master:

HIVE-17893: Vectorization: Wrong results for vector_udf3.q
HIVE-17892: Vectorization: Wrong results for vectorized_timestamp_funcs.q
HIVE-17890: Vectorization: Wrong results for vectorized_case.q
HIVE-17889: Vectorization: Wrong results for vectorization_15.q
HIVE-17863: Vectorization: Two Q files produce wrong PTF query results
HIVE-17123: Vectorization: Wrong results for vector_groupby_cube1.q

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long, which we are calling Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals with Hive for the vectorized 
> text input format and uses some new Decimal64 vectorized classes for 
> comparison and addition, and later perhaps a few GroupBy aggregations like 
> sum, avg, min, and max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive environment variable 
> hive.vectorized.input.format.supports.enabled that has a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY 
> key, value
> Will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...
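A sketch of the scaled-long representation behind Decimal64 (illustrative Python, not Hive's implementation): a decimal(11,5) value is stored as value * 10^5 in a signed 64-bit long, so same-scale subtraction and comparison, as in the predicate above, reduce to plain long arithmetic:

```python
from decimal import Decimal

SCALE = 5  # decimal(11,5): store value * 10**SCALE in a signed 64-bit long

def to_decimal64(literal, scale=SCALE):
    """Parse a decimal literal into a scaled long (sketch: no overflow checks)."""
    return int(Decimal(literal).scaleb(scale))

# Predicate (key - 100) < 200 evaluated purely in long arithmetic
key = to_decimal64("150.25")          # 15025000
result = (key - to_decimal64("100")) < to_decimal64("200")
print(result)  # True
```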





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Attachment: HIVE-17891.01.patch

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch
>
>
> HIVE-13076 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the script 
> will fail for older versions of postgres.





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Status: Patch Available  (was: Open)

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch
>
>
> HIVE-13076 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the script 
> will fail for older versions of postgres.





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Description: HIVE-13076 addes a new table to the schema but the patch 
script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The 
issue is that the {{IF NOT EXISTS}} clause is only available from postgres 9.1 
onwards. So the script will fail for older versions of postgres.  (was: 
HIVE-13076 addes a new table to the schema but the patch script uses {{CREATE 
TABLE IF NOT EXISTS}} syntax to add the new table. The issue the {{IF NOT 
EXISTS}} clause is only available from postgres 9.1 onwards. So the script will 
fail for older versions of postgres.)

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch
>
>
> HIVE-13076 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the 
> script will fail for older versions of postgres.
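One portable pattern is to choose the DDL based on the server version. This is a sketch, not the actual upgrade-script fix; the table name and column below are placeholders:

```python
def create_table_ddl(table, body, pg_version):
    """Emit CREATE TABLE DDL that is safe for the given postgres version.

    'IF NOT EXISTS' is only available from 9.1 onwards; for older
    servers fall back to plain CREATE TABLE (the upgrade harness must
    then ensure the table does not already exist).
    """
    if pg_version >= (9, 1):
        return "CREATE TABLE IF NOT EXISTS %s (%s);" % (table, body)
    return "CREATE TABLE %s (%s);" % (table, body)

# "NEW_TABLE" and its column are placeholders, not the actual schema change
print(create_table_ddl("NEW_TABLE", '"KEY" varchar(255)', (9, 0)))
# CREATE TABLE NEW_TABLE ("KEY" varchar(255));
```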





[jira] [Assigned] (HIVE-17893) Vectorization: Wrong results for vector_udf3.q

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17893:
---


> Vectorization: Wrong results for vector_udf3.q
> --
>
> Key: HIVE-17893
> URL: https://issues.apache.org/jira/browse/HIVE-17893
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec:
> yy2GiGM   ll2TvTZ
> yxN0212hM17E8J8bJj8D7blkA0212uZ17R8W8oWw8Q7o
> ywA68u76Jv06axCv451avL4   ljN68h76Wi06nkPi451niY4
> yvNv1qliAi1d
> yv3gnG4a33hD7bIm7oxE5rw   li3taT4n33uQ7oVz7bkR5ej
> yv1js li1wf
> yujO07KWj lhwB07XJw
> ytpx1RL8F2I   lgck1EY8S2V
> ytj7g5W   lgw7t5J
> ytgaJW1Gvrkv5wFUJU2y1SlgtnWJ1Tiexi5jSHWH2l1F
> Vec:
> yy2GiGM   Unvectorized
> yxN0212hM17E8J8bJj8D7bUnvectorized
> ywA68u76Jv06axCv451avL4   Unvectorized
> yvNv1qUnvectorized
> yv3gnG4a33hD7bIm7oxE5rw   Unvectorized
> yv1js Unvectorized
> yujO07KWj Unvectorized
> ytpx1RL8F2I   Unvectorized
> ytj7g5W   Unvectorized
> ytgaJW1Gvrkv5wFUJU2y1SUnvectorized





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Description: HIVE-13076 addes a new table to the schema but the patch 
script uses {{CREATE TABLE IF NOT EXISTS}} syntax to add the new table. The 
issue the {{IF NOT EXISTS}} clause is only available from postgres 9.1 onwards. 
So the script will fail for older versions of postgres.  (was: HIVE-13354 addes 
a new table to the schema but the patch script uses {{CREATE TABLE IF NOT 
EXISTS}} syntax to add the new table. The issue the {{IF NOT EXISTS}} clause is 
only available from postgres 9.1 onwards. So the script will fail for older 
versions of postgres.)

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> HIVE-13076 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the script 
> will fail for older versions of postgres.





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Summary: HIVE-13076 uses create table if not exists for the postgres script 
 (was: HIVE-13354 uses create table if not exists for the postgres script)

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> HIVE-13354 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the script 
> will fail for older versions of postgres.





[jira] [Updated] (HIVE-17886) Fix failure of TestReplicationScenarios.testConstraints

2017-10-24 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17886:

Attachment: repl-tc.hive.log

attached hive.log

> Fix failure of TestReplicationScenarios.testConstraints
> ---
>
> Key: HIVE-17886
> URL: https://issues.apache.org/jira/browse/HIVE-17886
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
> Attachments: repl-tc.hive.log
>
>
> After HIVE-16603 this test started failing:
> {code}
> 2017-10-24T10:52:17,024 DEBUG [main] metastore.HiveMetaStoreClient: Unable to 
> shutdown metastore client. Will try closing transport directly.
> org.apache.thrift.transport.TTransportException: Cannot write to null 
> outputStream
> at 
> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:178) 
> ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:106)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70) 
> ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) 
> ~[libthrift-0.9.3.jar:0.9.3]
> at 
> com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436)
>  ~[libfb303-0.9.3.jar:?]
> at 
> com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) 
> ~[libfb303-0.9.3.jar:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:569)
>  [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_131]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
>  [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at com.sun.proxy.$Proxy38.close(Unknown Source) [?:?]
> at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_131]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2413)
>  [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at com.sun.proxy.$Proxy38.close(Unknown Source) [?:?]
> at 
> org.apache.hadoop.hive.metastore.SynchronizedMetaStoreClient.close(SynchronizedMetaStoreClient.java:112)
>  [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:425) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.access$000(Hive.java:181) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive$1.remove(Hive.java:202) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.closeCurrent(Hive.java:388) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:339) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:324) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getWithFastCheck(Hive.java:316) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getWithFastCheck(Hive.java:308) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.getHive(Task.java:186) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.repl.bootstrap.ReplLoadTask.execute(ReplLoadTask.java:73)
>  [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 

[jira] [Assigned] (HIVE-17892) Vectorization: Wrong results for vectorized_timestamp_funcs.q

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17892:
---


> Vectorization: Wrong results for vectorized_timestamp_funcs.q
> -
>
> Key: HIVE-17892
> URL: https://issues.apache.org/jira/browse/HIVE-17892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec:
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> Vec:
> NULL  NULLNULLNULLNULLNULL8   1   1
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> -62169765561  2   11  30  30  48  4   40  39





[jira] [Updated] (HIVE-16663) String Caching For Rows

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16663:
---
Description: 
It is very common that there are many repeated values in the result set of a 
query, especially when JOINs are present in the query.  As it currently stands, 
beeline does not attempt to cache any of these values and therefore it consumes 
a lot of memory.

Adding a string cache may save a lot of memory.  There are organizations that 
use beeline to perform ETL processing of result sets into CSV.  This will 
better support those organizations.

  was:
It is very common that there are many repeated values in the result set of a 
query.  As it currently stands, beeline does not attempt to cache any of these 
values and therefore it consumes a lot of memory.

Adding a string cache may save a lot of memory.  There are organizations that 
use beeline to perform ETL processing of result sets into CSV.  This will 
better support those organizations.


> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, 
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query, especially when JOINs are present in the query.  As it currently 
> stands, beeline does not attempt to cache any of these values and therefore 
> it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.
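The caching idea can be sketched as a small intern table keyed by the string itself (illustrative Python; the actual beeline patch is in Java and may differ in its eviction and sizing details):

```python
class RowStringCache:
    """Intern repeated cell values so identical strings share one object."""

    def __init__(self, max_size=10000):
        self.cache = {}
        self.max_size = max_size

    def intern(self, value):
        if not isinstance(value, str):
            return value          # only cache strings
        cached = self.cache.get(value)
        if cached is not None:
            return cached         # reuse the first copy seen
        if len(self.cache) < self.max_size:
            self.cache[value] = value
        return value

cache = RowStringCache()
a = cache.intern("US-" + str("WEST"))  # built at runtime: a fresh object
b = cache.intern("US-" + str("WEST"))  # same content, another fresh object
print(a is b)  # True: the second call returns the cached first copy
```

With one shared object per distinct value, a result set dominated by repeated JOIN keys holds each distinct string once instead of once per row.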





[jira] [Assigned] (HIVE-17891) HIVE-13354 uses create table if not exists for the postgres script

2017-10-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-17891:
--


> HIVE-13354 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> HIVE-13354 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the script 
> will fail for older versions of postgres.





[jira] [Assigned] (HIVE-17890) Vectorization: Wrong results for vectorized_case.q

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17890:
---


> Vectorization: Wrong results for vectorized_case.q
> --
>
> Key: HIVE-17890
> URL: https://issues.apache.org/jira/browse/HIVE-17890
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 5110  4607
> Vec: 4086 3583





[jira] [Assigned] (HIVE-17889) Vectorization: Wrong results for vectorization_15.q

2017-10-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17889:
---


> Vectorization: Wrong results for vectorization_15.q
> ---
>
> Key: HIVE-17889
> URL: https://issues.apache.org/jira/browse/HIVE-17889
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 15:59:56.527  
> Vec: 16:00:09.889
> ctimestamp1 (column 7)





[jira] [Commented] (HIVE-16663) String Caching For Rows

2017-10-24 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217499#comment-16217499
 ] 

BELUGA BEHR commented on HIVE-16663:


Latest changes to fix merge issues...

Also, removed the call to {{rs.wasNull()}} because 
[ResultSet|https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getObject(int)]
 will already handle SQL NULL values appropriately when calling 
{{rs.getObject()}}.
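As an analogy (using Python's sqlite3 rather than JDBC): DB-API drivers likewise map SQL NULL to a null object on fetch, so no separate "was it null?" check is needed after reading the value:

```python
import sqlite3

# The driver maps SQL NULL to None on fetch, so no separate
# "was it null?" check is needed after reading the value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v TEXT)")
conn.execute("INSERT INTO t VALUES (NULL), ('x')")
rows = [r[0] for r in conn.execute("SELECT v FROM t ORDER BY v IS NULL")]
print(rows)  # ['x', None]
```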

> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, 
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query.  As it currently stands, beeline does not attempt to cache any of 
> these values and therefore it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.





[jira] [Updated] (HIVE-16663) String Caching For Rows

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16663:
---
Status: Patch Available  (was: Open)

> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, 
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query.  As it currently stands, beeline does not attempt to cache any of 
> these values and therefore it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.





[jira] [Updated] (HIVE-16663) String Caching For Rows

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16663:
---
Attachment: HIVE-16663.7.patch

> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, 
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query.  As it currently stands, beeline does not attempt to cache any of 
> these values and therefore it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.





[jira] [Updated] (HIVE-16663) String Caching For Rows

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16663:
---
Status: Open  (was: Patch Available)

> String Caching For Rows
> ---
>
> Key: HIVE-16663
> URL: https://issues.apache.org/jira/browse/HIVE-16663
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch, 
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch, HIVE-16663.6.patch
>
>
> It is very common that there are many repeated values in the result set of a 
> query.  As it currently stands, beeline does not attempt to cache any of 
> these values and therefore it consumes a lot of memory.
> Adding a string cache may save a lot of memory.  There are organizations that 
> use beeline to perform ETL processing of result sets into CSV.  This will 
> better support those organizations.





[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-10-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Attachment: (was: HIVE-15305.1.patch)

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, 
> HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 





[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-10-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Attachment: HIVE-15305.1.patch

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, 
> HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 





[jira] [Work started] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17887 started by Sankar Hariappan.
---
> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> Replicating a drop partition event on a table partitioned on a timestamp 
> type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}





[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-10-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Attachment: HIVE-15305.1.patch

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, 
> HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 





[jira] [Commented] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory

2017-10-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217477#comment-16217477
 ] 

Eugene Koifman commented on HIVE-17232:
---

Some comments:
1. Table.java is a generated class based on hive_metastore.thrift, so anything 
you add to it manually will be lost the next time it is regenerated.
2. Instead of just "No match found", the error message should include the file 
name that it was trying to process, so that we can debug this if it happens 
again.
3. If you want the Worker to check for table-level compaction requests on 
partitioned tables, it should put the compaction request into a failed state 
(markFailed()) - this way it is visible to the end user in SHOW COMPACTIONS.
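The IllegalStateException("No match found") in the stack trace below is what java.util.regex.Matcher.group() throws when it is called without a prior successful match attempt. A minimal sketch of the guard (class, method, and file names here are illustrative, not Hive's actual compactor code):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrates the failure mode: calling Matcher.group() before a successful
 *  find()/matches() throws IllegalStateException("No match found"). */
public class BucketNameCheck {
  // Same pattern shape as in the log: six leading digits.
  private static final Pattern BUCKET = Pattern.compile("^[0-9]{6}");

  /** Returns the six-digit bucket prefix, or null (instead of throwing)
   *  when the name does not look like a bucket file. */
  public static String bucketPrefix(String fileName) {
    Matcher m = BUCKET.matcher(fileName);
    return m.find() ? m.group() : null; // guard before calling group()
  }
}
```

Checking find() first (and including the offending file name in any error message, per comment 2) turns an opaque crash into a debuggable log line.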


>  "No match found"  Compactor finds a bucket file thinking it's a directory
> --
>
> Key: HIVE-17232
> URL: https://issues.apache.org/jira/browse/HIVE-17232
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Steve Yeom
> Attachments: HIVE-17232.01.patch
>
>
> {noformat}
> 2017-08-02T12:38:11,996  WARN [main] compactor.CompactorMR: Found a 
> non-bucket file that we thought matched the bucket pattern! 
> file:/Users/ekoifman/dev/hiv\
> erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1
>  Matcher=java\
> .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=]
> 2017-08-02T12:38:11,996  INFO [main] mapreduce.JobSubmitter: Cleaning up the 
> staging area 
> file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\
> cal1723152463_0183
> 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while 
> trying to compact 
> id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\
> e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.lang.IllegalStateException: 
> \
> No match found
> at java.util.regex.Matcher.group(Matcher.java:536)
> at java.util.regex.Matcher.group(Matcher.java:496)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894)
> {noformat}
> the stack trace points to 1st runWorker() in updateDeletePartitioned() though 
> the test run was TestTxnCommands2WithSplitUpdateAndVectorization





[jira] [Commented] (HIVE-17881) LLAP: Text cache NPE

2017-10-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217472#comment-16217472
 ] 

Prasanth Jayachandran commented on HIVE-17881:
--

Yes. I don't want to use the cache :) There is no purging of the cache yet 
without a restart. I wanted to use LLAP IO since it prints FS counters, but 
without the cache, as I want to always perform an HDFS read for some unit 
tests. Regardless, it shouldn't throw an NPE if someone disables the cache but 
uses only async IO.

> LLAP: Text cache NPE
> 
>
> Key: HIVE-17881
> URL: https://issues.apache.org/jira/browse/HIVE-17881
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> With LLAP IO enabled and hive.llap.io.memory.mode set to false. Text cache 
> throws NPE for following query
> {code}
> select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k;
> {code}
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107)
> {code}





[jira] [Assigned] (HIVE-17888) Display the reason for query cancellation

2017-10-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17888:



> Display the reason for query cancellation
> -
>
> Key: HIVE-17888
> URL: https://issues.apache.org/jira/browse/HIVE-17888
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> For user convenience and easy debugging, if a trigger kills a query, return 
> the reason for killing the query. Currently the query kill only displays the 
> following, which is not very useful:
> {code}
> Error: Query was cancelled (state=01000,code=0)
> {code}





[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-10-24 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217444#comment-16217444
 ] 

BELUGA BEHR commented on HIVE-16855:


[~ngangam] Please consider this simple improvement.

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic
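The second item above can be sketched as follows; a minimal illustration, not Hive's actual HashTableLoader code, showing reliance on BufferedInputStream's default buffer size (8192 bytes in current JDKs) instead of passing a hand-picked one.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

/** Hypothetical sketch (not the real loader): prefer the JDK default buffer. */
public class StreamWrapping {
  public static BufferedInputStream buffered(InputStream raw) {
    // Before: new BufferedInputStream(raw, someCustomSize);
    // After: let the JVM default (currently 8192 bytes) apply.
    return new BufferedInputStream(raw);
  }
}
```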





[jira] [Assigned] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-17887:
---


> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> Replicating a drop partition event on a table partitioned on a timestamp 
> type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws below exception
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}





[jira] [Commented] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper

2017-10-24 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217440#comment-16217440
 ] 

BELUGA BEHR commented on HIVE-16890:


[~ngangam] Please consider this simple improvement.

> org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous 
> Wrapper
> ---
>
> Key: HIVE-16890
> URL: https://issues.apache.org/jira/browse/HIVE-16890
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16890.1.patch
>
>
> Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a 
> superfluous wrapper and then immediately unwraps it.  Don't bother wrapping 
> in this scenario.
> {code}
>   public void set(HiveVarchar val, int len) {
> set(val.getValue(), len);
>   }
>   public void set(String val, int maxLength) {
> value.set(HiveBaseChar.enforceMaxLength(val, maxLength));
>   }
>   public HiveVarchar getHiveVarchar() {
> return new HiveVarchar(value.toString(), -1);
>   }
>   // Here calls getHiveVarchar() which creates a new HiveVarchar object with 
> a string in it
>   // The object is passed to set(HiveVarchar val, int len)
>   //  The string is pulled out
>   public void enforceMaxLength(int maxLength) {
> // Might be possible to truncate the existing Text value, for now just do 
> something simple.
> if (value.getLength()>maxLength && getCharacterLength()>maxLength)
>   set(getHiveVarchar(), maxLength);
>   }
> {code}
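The fix suggested by the description can be sketched as follows. This is a self-contained simplification — the real class wraps an org.apache.hadoop.io.Text, whereas this sketch holds a plain String — showing enforceMaxLength() truncating the stored value directly instead of wrapping it in a new HiveVarchar only to immediately unwrap it.

```java
/** Simplified stand-in for HiveVarcharWritable (hypothetical, String-backed):
 *  enforceMaxLength() operates on the value it already holds, no wrapper. */
public class VarcharValue {
  private String value = "";

  public void set(String val, int maxLength) {
    // Enforce the limit on the raw string; no intermediate object needed.
    value = val.length() > maxLength ? val.substring(0, maxLength) : val;
  }

  public void enforceMaxLength(int maxLength) {
    if (value.length() > maxLength) {
      set(value, maxLength); // reuse the string we already hold
    }
  }

  public String get() {
    return value;
  }
}
```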





[jira] [Commented] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils

2017-10-24 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217441#comment-16217441
 ] 

BELUGA BEHR commented on HIVE-16970:


[~ngangam] Please consider this simple improvement.

> General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
> -
>
> Key: HIVE-16970
> URL: https://issues.apache.org/jira/browse/HIVE-16970
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch
>
>
> # Simplify
> # Do not instantiate empty collections
> # Parsing is incorrect:
> {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils}
>   public static String buildKey(String dbName, String tableName, List 
> partVals) {
> String key = buildKey(dbName, tableName);
> if (partVals == null || partVals.size() == 0) {
>   return key;
> }
> // missing a delimiter between the "tableName" and the first "partVal"
> for (int i = 0; i < partVals.size(); i++) {
>   key += partVals.get(i);
>   if (i != partVals.size() - 1) {
> key += delimit;
>   }
> }
> return key;
>   }
> public static Object[] splitPartitionColStats(String key) {
> // ...
> }
> {code}
> The result of passing the key to the "split" method is:
> {code}
> buildKey("db","Table",["Part1","Part2","Part3"], "col");
> [db, tablePart1, [Part2, Part3], col]
> // "table" and "Part1" is mistakenly concatenated
> {code}
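The missing-delimiter bug above can be sketched as fixed like this, under an assumed delimiter "_" (the real class defines its own delimit constant): append the delimiter before every partition value, including the first, so the key splits back unambiguously.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the corrected buildKey (delimiter is assumed "_"). */
public class CacheKeys {
  private static final String DELIMIT = "_";

  public static String buildKey(String dbName, String tableName,
                                List<String> partVals) {
    StringBuilder key = new StringBuilder(dbName).append(DELIMIT).append(tableName);
    if (partVals != null) {
      for (String v : partVals) {
        // Delimiter before EVERY value, so tableName and the first
        // partition value are no longer concatenated.
        key.append(DELIMIT).append(v);
      }
    }
    return key.toString();
  }
}
```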





[jira] [Commented] (HIVE-17841) implement applying the resource plan

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217432#comment-16217432
 ] 

Hive QA commented on HIVE-17841:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893627/HIVE-17841.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11315 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_map_skew] 
(batchId=82)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=269)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers1 
(batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerTotalTasks 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7454/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7454/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7454/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893627 - PreCommit-HIVE-Build

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.patch
>
>






[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16970:
---
Attachment: HIVE-16970.2.patch

> General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
> -
>
> Key: HIVE-16970
> URL: https://issues.apache.org/jira/browse/HIVE-16970
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch
>
>
> # Simplify
> # Do not instantiate empty collections
> # Parsing is incorrect:
> {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils}
>   public static String buildKey(String dbName, String tableName, List 
> partVals) {
> String key = buildKey(dbName, tableName);
> if (partVals == null || partVals.size() == 0) {
>   return key;
> }
> // missing a delimiter between the "tableName" and the first "partVal"
> for (int i = 0; i < partVals.size(); i++) {
>   key += partVals.get(i);
>   if (i != partVals.size() - 1) {
> key += delimit;
>   }
> }
> return key;
>   }
> public static Object[] splitPartitionColStats(String key) {
> // ...
> }
> {code}
> The result of passing the key to the "split" method is:
> {code}
> buildKey("db","Table",["Part1","Part2","Part3"], "col");
> [db, tablePart1, [Part2, Part3], col]
> // "table" and "Part1" is mistakenly concatenated
> {code}





[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16970:
---
Status: Patch Available  (was: Open)

> General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
> -
>
> Key: HIVE-16970
> URL: https://issues.apache.org/jira/browse/HIVE-16970
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch
>
>
> # Simplify
> # Do not instantiate empty collections
> # Parsing is incorrect:
> {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils}
>   public static String buildKey(String dbName, String tableName, List 
> partVals) {
> String key = buildKey(dbName, tableName);
> if (partVals == null || partVals.size() == 0) {
>   return key;
> }
> // missing a delimiter between the "tableName" and the first "partVal"
> for (int i = 0; i < partVals.size(); i++) {
>   key += partVals.get(i);
>   if (i != partVals.size() - 1) {
> key += delimit;
>   }
> }
> return key;
>   }
> public static Object[] splitPartitionColStats(String key) {
> // ...
> }
> {code}
> The result of passing the key to the "split" method is:
> {code}
> buildKey("db","Table",["Part1","Part2","Part3"], "col");
> [db, tablePart1, [Part2, Part3], col]
> // "table" and "Part1" is mistakenly concatenated
> {code}





[jira] [Updated] (HIVE-16970) General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils

2017-10-24 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16970:
---
Status: Open  (was: Patch Available)

> General Improvements To org.apache.hadoop.hive.metastore.cache.CacheUtils
> -
>
> Key: HIVE-16970
> URL: https://issues.apache.org/jira/browse/HIVE-16970
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16970.1.patch, HIVE-16970.2.patch
>
>
> # Simplify
> # Do not instantiate empty collections
> # Parsing is incorrect:
> {code:title=org.apache.hadoop.hive.metastore.cache.CacheUtils}
>   public static String buildKey(String dbName, String tableName, List 
> partVals) {
> String key = buildKey(dbName, tableName);
> if (partVals == null || partVals.size() == 0) {
>   return key;
> }
> // missing a delimiter between the "tableName" and the first "partVal"
> for (int i = 0; i < partVals.size(); i++) {
>   key += partVals.get(i);
>   if (i != partVals.size() - 1) {
> key += delimit;
>   }
> }
> return key;
>   }
> public static Object[] splitPartitionColStats(String key) {
> // ...
> }
> {code}
> The result of passing the key to the "split" method is:
> {code}
> buildKey("db","Table",["Part1","Part2","Part3"], "col");
> [db, tablePart1, [Part2, Part3], col]
> // "table" and "Part1" is mistakenly concatenated
> {code}





[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217431#comment-16217431
 ] 

Xuefu Zhang commented on HIVE-15104:


+1

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, generates different sizes 
> of shuffle data.
> I think this is because Hive on MR serializes only part of the HiveKey, while 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?
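> The difference can be illustrated with a toy sketch (class and field names
> are assumed, not Hive's actual HiveKey): a Writable-style writer ships only
> the key bytes, whereas a default Kryo setup would also serialize derived
> fields and class metadata, inflating the shuffle.
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PartialSerDemo {
  // Toy stand-in for HiveKey: a byte payload plus a derived field that the
  // receiver can recompute, so it need not be shipped over the wire.
  static class Key {
    final byte[] bytes;
    final int hash;
    Key(byte[] b) { bytes = b; hash = Arrays.hashCode(b); }
  }

  // Writable-style serialization, as on the MR path: write only length + bytes.
  // Serializing the whole object graph would additionally carry the hash
  // field and per-object metadata, producing more shuffle bytes per record.
  static byte[] writePartial(Key k) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeInt(k.bytes.length);
    out.write(k.bytes);
    out.flush();
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    Key k = new Key("somekey".getBytes(StandardCharsets.UTF_8));
    System.out.println(writePartial(k).length); // 4-byte length + 7 payload bytes = 11
  }
}
{code}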



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory

2017-10-24 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217418#comment-16217418
 ] 

Steve Yeom commented on HIVE-17232:
---

[~ekoifman] please review the patch 01. 
Thanks, 
Steve. 

>  "No match found"  Compactor finds a bucket file thinking it's a directory
> --
>
> Key: HIVE-17232
> URL: https://issues.apache.org/jira/browse/HIVE-17232
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Steve Yeom
> Attachments: HIVE-17232.01.patch
>
>
> {noformat}
> 2017-08-02T12:38:11,996  WARN [main] compactor.CompactorMR: Found a 
> non-bucket file that we thought matched the bucket pattern! 
> file:/Users/ekoifman/dev/hiv\
> erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1
>  Matcher=java\
> .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=]
> 2017-08-02T12:38:11,996  INFO [main] mapreduce.JobSubmitter: Cleaning up the 
> staging area 
> file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\
> cal1723152463_0183
> 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while 
> trying to compact 
> id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\
> e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.lang.IllegalStateException: 
> \
> No match found
> at java.util.regex.Matcher.group(Matcher.java:536)
> at java.util.regex.Matcher.group(Matcher.java:496)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894)
> {noformat}
> the stack trace points to 1st runWorker() in updateDeletePartitioned() though 
> the test run was TestTxnCommands2WithSplitUpdateAndVectorization
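> The "No match found" IllegalStateException in the trace comes from calling
> Matcher.group() after a match attempt that did not succeed. A minimal sketch
> of the guard (not the actual compactor fix; the helper name is assumed):
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketMatchDemo {
  // pattern from the log: six leading digits, as in bucket files like "000001_0"
  static final Pattern BUCKET = Pattern.compile("^[0-9]{6}");

  static Integer bucketId(String fileName) {
    Matcher m = BUCKET.matcher(fileName);
    // Guard before group(): invoking group() when the match failed throws
    // IllegalStateException: No match found, as seen in the stack trace.
    return m.find() ? Integer.valueOf(m.group()) : null;
  }

  public static void main(String[] args) {
    System.out.println(bucketId("000001_0")); // 1
    System.out.println(bucketId("bucket_1")); // null, no exception
  }
}
{code}
> A file such as bucket_1 simply yields no bucket id instead of failing
> the whole compaction job.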



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17881) LLAP: Text cache NPE

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217413#comment-16217413
 ] 

Sergey Shelukhin commented on HIVE-17881:
-

The solution would be to remove hive.llap.io.memory.mode. Were you using it for 
some legitimate reason? :)

> LLAP: Text cache NPE
> 
>
> Key: HIVE-17881
> URL: https://issues.apache.org/jira/browse/HIVE-17881
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> With LLAP IO enabled and hive.llap.io.memory.mode set to false. Text cache 
> throws NPE for following query
> {code}
> select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k;
> {code}
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217405#comment-16217405
 ] 

Sergey Shelukhin commented on HIVE-17832:
-

Hmm, sure. 
As for the embedded metastore, it implies that the user is the admin because 
they have full control over metastore (and direct access to the database).
But yeah, I guess it's similar to strict check parameters and we should allow 
users to shoot themselves in the foot :P
+1

> Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in 
> metastore
> --
>
> Key: HIVE-17832
> URL: https://issues.apache.org/jira/browse/HIVE-17832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE17832.1.patch, HIVE17832.2.patch
>
>
> hive.metastore.disallow.incompatible.col.type.changes, when set to true, 
> disallows incompatible column type changes through ALTER TABLE. But this 
> parameter is not modifiable in HMS. If HMS is not embedded into HS2, the 
> value cannot be changed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-16603) Enforce foreign keys to refer to primary keys or unique keys

2017-10-24 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-16603.
-
Resolution: Fixed

this failure seems to need more than just a quick look - opened HIVE-17886

> Enforce foreign keys to refer to primary keys or unique keys
> 
>
> Key: HIVE-16603
> URL: https://issues.apache.org/jira/browse/HIVE-16603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16603.patch
>
>
> Follow-up on HIVE-16575.
> Currently we do not enforce foreign keys to refer to primary keys or unique 
> keys (as opposed to PostgreSQL and others); we should do that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17778) Add support for custom counters in trigger expression

2017-10-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217321#comment-16217321
 ] 

Hive QA commented on HIVE-17778:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893642/HIVE-17778.5.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11315 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=102)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=270)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedFiles
 (batchId=229)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7453/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7453/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7453/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893642 - PreCommit-HIVE-Build

> Add support for custom counters in trigger expression
> -
>
> Key: HIVE-17778
> URL: https://issues.apache.org/jira/browse/HIVE-17778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, 
> HIVE-17778.3.patch, HIVE-17778.4.patch, HIVE-17778.5.patch
>
>
> HIVE-17508 only supports limited counters. This ticket is to extend it to 
> support custom counters (counters that are not supported by the execution 
> engine will be dropped).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

