[jira] [Updated] (HIVE-17208) Repl dump should pass in db/table information to authorization API

2017-07-31 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17208:
--
Attachment: HIVE-17208.2.patch

> Repl dump should pass in db/table information to authorization API
> --
>
> Key: HIVE-17208
> URL: https://issues.apache.org/jira/browse/HIVE-17208
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17208.1.patch, HIVE-17208.2.patch
>
>
> "repl dump" does not pass db/table information to the authorization API. 
> That information is necessary for authorizing replication in Ranger.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-16896:
---
Status: Patch Available  (was: In Progress)

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory during the analysis phase 
> while loading data. Currently we load all the files in the bootstrap dump 
> location as {{FileStatus[]}} and then iterate over it to load objects. We 
> should instead move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, 
> boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase 
> to the execution phase, we are going to move the whole repl load 
> functionality to the execution phase so we can better control 
> creation/execution of tasks (not related to Hive {{Task}}; we may get rid of 
> ReplCopyTask).
> An additional consideration to take into account at the end of this JIRA is 
> whether we want to specifically do a multi-threaded load of the bootstrap dump.
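
To illustrate the memory argument in the description, here is a minimal sketch of a listing that is fetched in batches and consumed one entry at a time. It is not Hadoop's RemoteIterator and uses no Hadoop classes; PagedFileIterator, the synthetic file names, and the batch size are all hypothetical, chosen only so the example runs on a bare JDK.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a remote directory listing served in pages.
// It is NOT Hadoop's RemoteIterator; it only illustrates the memory argument.
class BatchedListingSketch {

    /** Iterates over 'total' synthetic file names, fetching 'batchSize' at a time. */
    static final class PagedFileIterator implements Iterator<String> {
        private final int total;
        private final int batchSize;
        private int fetched = 0;      // entries requested from the "remote" side so far
        private int maxBuffered = 0;  // largest batch ever held in memory
        private List<String> batch = new ArrayList<>();
        private int pos = 0;

        PagedFileIterator(int total, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.total = total;
            this.batchSize = batchSize;
        }

        @Override
        public boolean hasNext() {
            if (pos < batch.size()) {
                return true;
            }
            if (fetched >= total) {
                return false;
            }
            // Simulate one round trip that returns the next page of entries.
            batch = new ArrayList<>();
            int end = Math.min(total, fetched + batchSize);
            for (int i = fetched; i < end; i++) {
                batch.add("file-" + i);
            }
            fetched = end;
            pos = 0;
            maxBuffered = Math.max(maxBuffered, batch.size());
            return true;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return batch.get(pos++);
        }

        int maxBuffered() {
            return maxBuffered;
        }
    }

    /** Consumes the iterator one entry at a time and returns how many were seen. */
    static int countAll(Iterator<String> files) {
        int n = 0;
        while (files.hasNext()) {
            files.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        PagedFileIterator it = new PagedFileIterator(1000, 100);
        System.out.println("entries: " + countAll(it) + ", max buffered: " + it.maxBuffered());
    }
}
```

The point of the sketch is that the consumer never holds more than one batch, whereas an array-based listing holds every entry before the first one is processed.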





[jira] [Work started] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16896 started by anishek.
--
> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory during the analysis phase 
> while loading data. Currently we load all the files in the bootstrap dump 
> location as {{FileStatus[]}} and then iterate over it to load objects. We 
> should instead move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, 
> boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase 
> to the execution phase, we are going to move the whole repl load 
> functionality to the execution phase so we can better control 
> creation/execution of tasks (not related to Hive {{Task}}; we may get rid of 
> ReplCopyTask).
> An additional consideration to take into account at the end of this JIRA is 
> whether we want to specifically do a multi-threaded load of the bootstrap dump.





[jira] [Updated] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-16896:
---
Attachment: HIVE-16896.1.patch

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory during the analysis phase 
> while loading data. Currently we load all the files in the bootstrap dump 
> location as {{FileStatus[]}} and then iterate over it to load objects. We 
> should instead move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, 
> boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase 
> to the execution phase, we are going to move the whole repl load 
> functionality to the execution phase so we can better control 
> creation/execution of tasks (not related to Hive {{Task}}; we may get rid of 
> ReplCopyTask).
> An additional consideration to take into account at the end of this JIRA is 
> whether we want to specifically do a multi-threaded load of the bootstrap dump.





[jira] [Updated] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17144:
---
Status: Patch Available  (was: In Progress)

> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use filesystem copy and not distcp 
> to do the job.





[jira] [Updated] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17144:
---
Attachment: HIVE-17144.1.patch

> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use filesystem copy and not distcp 
> to do the job.





[jira] [Work started] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-07-31 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17144 started by anishek.
--
> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use filesystem copy and not distcp 
> to do the job.





[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-07-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108393#comment-16108393
 ] 

ASF GitHub Bot commented on HIVE-17144:
---

GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/215

HIVE-17144 : export of temporary tables not working and it seems to be 
using distcp rather than filesystem copy



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-17144

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #215


commit df618700c5043422f4a75fe7afcffd7b9202ceb5
Author: Anishek Agarwal 
Date:   2017-08-01T05:24:11Z

HIVE-17144 : export of temporary tables not working and it seems to be 
using distcp rather than filesystem copy




> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use filesystem copy and not distcp 
> to do the job.





[jira] [Commented] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108365#comment-16108365
 ] 

Hive QA commented on HIVE-17167:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879727/HIVE-17167.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11035 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6205/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6205/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6205/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879727 - PreCommit-HIVE-Build

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.





[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108352#comment-16108352
 ] 

Xuefu Zhang commented on HIVE-17213:


Patch #1 looks good. However, I'd suggest the following:

1. Code style for 
{code}
+  for (FileSinkDesc fsConf:fileSinkDesc.getLinkedFileSinkDesc()) {
+fsConf.setDirName(tmpDir);
{code}
2. Possibly add a test

Also, I'd like to understand the change a bit more; I will follow up in person.

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch
>
>
> HoS file merging doesn't work properly because it doesn't correctly set the 
> linked file sinks, which are used to generate move tasks.





[jira] [Commented] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108312#comment-16108312
 ] 

Hive QA commented on HIVE-17218:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879721/HIVE-17218.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11018 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6204/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6204/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6204/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879721 - PreCommit-HIVE-Build

> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 1.2.2, 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17218.1.patch
>
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.





[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.1.patch

Patch v0 incorrectly deleted {{SparkFileSinkProcessor}}, which is also used to 
process file sinks for the non-union-all cases. Uploading patch v1 to fix 
the issue.

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch
>
>
> HoS file merging doesn't work properly because it doesn't correctly set the 
> linked file sinks, which are used to generate move tasks.





[jira] [Updated] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-07-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-17194:
---
Description: 
{code}
POST /cliservice HTTP/1.1
Content-Type: application/x-thrift
Accept: application/x-thrift
User-Agent: Java/THttpClient/HC
Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
Content-Length: 71
Host: localhost:10007
Connection: Keep-Alive
Accept-Encoding: gzip,deflate
X-XSRF-HEADER: true
{code}

The Beeline client clearly sends out HTTP compression headers which are ignored 
by the HTTP service layer in HS2.

After the patch, the response headers look like:

{code}
HTTP/1.1 200 OK
Date: Tue, 01 Aug 2017 01:47:23 GMT
Content-Type: application/x-thrift
Vary: Accept-Encoding, User-Agent
Content-Encoding: gzip
Transfer-Encoding: chunked
Server: Jetty(9.3.8.v20160314)
{code}

  was:
{code}
POST /cliservice HTTP/1.1
Content-Type: application/x-thrift
Accept: application/x-thrift
User-Agent: Java/THttpClient/HC
Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
Content-Length: 71
Host: localhost:10007
Connection: Keep-Alive
Accept-Encoding: gzip,deflate
X-XSRF-HEADER: true
{code}

The Beeline client clearly sends out HTTP compression headers which are ignored 
by the HTTP service layer in HS2.


> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.
> After the patch, the response headers look like:
> {code}
> HTTP/1.1 200 OK
> Date: Tue, 01 Aug 2017 01:47:23 GMT
> Content-Type: application/x-thrift
> Vary: Accept-Encoding, User-Agent
> Content-Encoding: gzip
> Transfer-Encoding: chunked
> Server: Jetty(9.3.8.v20160314)
> {code}
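
The negotiation shown in the two header dumps can be sketched with JDK classes alone: check whether the request's Accept-Encoding lists gzip and, if so, compress the response body. This is not the servlet filter from the attached patches. The class and method names are hypothetical, and q-values in the header are ignored for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical names; a JDK-only sketch of gzip content negotiation.
class GzipNegotiationSketch {

    /** True if the Accept-Encoding value lists gzip (q-values are ignored here). */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String token : acceptEncoding.split(",")) {
            int semi = token.indexOf(';');
            String coding = (semi >= 0 ? token.substring(0, semi) : token).trim();
            if (coding.equalsIgnoreCase("gzip")) {
                return true;
            }
        }
        return false;
    }

    /** Compresses a response body, as a filter would before writing it out. */
    static byte[] gzip(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses a body, as the HTTP client does transparently. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = "example thrift payload".getBytes(StandardCharsets.UTF_8);
        if (acceptsGzip("gzip,deflate")) {
            byte[] compressed = gzip(body);
            System.out.println(new String(gunzip(compressed), StandardCharsets.UTF_8));
        }
    }
}
```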





[jira] [Updated] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-07-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-17194:
---
Status: Patch Available  (was: Open)

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.





[jira] [Updated] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-07-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-17194:
---
Status: Open  (was: Patch Available)

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.





[jira] [Updated] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-07-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-17194:
---
Attachment: HIVE-17194.3.patch

The gzip header is eaten up by the HttpClient (outside of JDBC code), so this 
is nearly impossible to test.

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.





[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108259#comment-16108259
 ] 

Hive QA commented on HIVE-17006:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879706/HIVE-17006.02.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6203/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6203/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6203/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879706 - PreCommit-HIVE-Build

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but still a good idea. I messaged the dev list about it but didn't 
> get a response; we may follow up later.
> For now, do (3). 





[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-07-31 Thread Aroop Maliakkal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108243#comment-16108243
 ] 

Aroop Maliakkal commented on HIVE-17115:


[~daijy] :: These Hive tables are created on top of HBase tables. Here is one 
of the sample commands we used to create them.

{quote}
CREATE EXTERNAL TABLE test_20150326
(MD5 STRING,
image BINARY
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping"= ":key#s,z:imagebyte#b")
TBLPROPERTIES("hbase.table.name"= "images_00",
"hbase.table.default.storage.type" = "binary");
{quote}

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe and then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The Thrift client hangs, and the HiveMetaStore service's log shows an 
> exception such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14261) Support set/unset partition parameters

2017-07-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108192#comment-16108192
 ] 

Ashutosh Chauhan commented on HIVE-14261:
-

Somewhat.
I would also like to understand which use cases this feature unlocks.

> Support set/unset partition parameters
> --
>
> Key: HIVE-14261
> URL: https://issues.apache.org/jira/browse/HIVE-14261
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14261.01.patch
>
>






[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108190#comment-16108190
 ] 

Hive QA commented on HIVE-17213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879669/HIVE-17213.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 553 failed/errored test(s), 11018 tests executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[auto_sortmerge_join_16] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket4] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket5] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket6] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_partitioner] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[disable_merge_for_bucketing] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[dynamic_rdd_cache] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[empty_dir_in_table] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[gen_udf_example_add10] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap3] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap_auto] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_bucketed_table] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_map_operators] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_merge] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_num_buckets] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_reducers_power_two] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[insert_overwrite_directory2] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[list_bucket_dml_10] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge4] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge5] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge6] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge8] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge9] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_incompat1] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_incompat2] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[parallel_orderby] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[quotedid_smb] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[reduce_deduplicate] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[remote_script] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[root_dir_external_table] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[schemeAuthority2] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[schemeAuthority]
 

[jira] [Commented] (HIVE-17208) Repl dump should pass in db/table information to authorization API

2017-07-31 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108188#comment-16108188
 ] 

Thejas M Nair commented on HIVE-17208:
--

ReplicationSemanticAnalyzer - Since the location is not specified by the user, 
we don't need to check access to it.
ReplDumpTask - matches* methods - "public static" (instead of "static public") 
is the recommended Sun convention (and therefore the Hive convention).
ReplicationSemanticAnalyzer calling a static method in ReplDumpTask is a little 
unintuitive. How about moving that method to a higher-level utils class? Maybe 
the org.apache.hadoop.hive.ql.parse.repl.dump.Utils class?


> Repl dump should pass in db/table information to authorization API
> --
>
> Key: HIVE-17208
> URL: https://issues.apache.org/jira/browse/HIVE-17208
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17208.1.patch
>
>
> "repl dump" does not provide db/table information. That is necessary for 
> authorization replication in ranger.





[jira] [Commented] (HIVE-17209) ObjectCacheFactory should return null when tez shared object registry is not setup

2017-07-31 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108161#comment-16108161
 ] 

Rajesh Balamohan commented on HIVE-17209:
-

Thanks [~sershe]. This would also need a fix in ORC 
(https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java#L554).
Since {{getLiteralList()}} would be empty, it needs to check for an empty 
structure to avoid an IndexOutOfBoundsException. 
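
The empty-list check mentioned in this comment could look roughly like the sketch below. It is not the ORC code: {{LiteralGuard}} and {{firstLiteralOrNull}} are hypothetical names, and a plain {{List}} stands in for whatever {{getLiteralList()}} returns.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LiteralGuard {
    private LiteralGuard() {}

    /**
     * Returns the first literal, or null when the list is null or empty.
     * Calling literals.get(0) on an empty list would throw
     * IndexOutOfBoundsException, which is the failure being avoided.
     */
    static Object firstLiteralOrNull(List<?> literals) {
        if (literals == null || literals.isEmpty()) {
            return null; // caller decides how to treat "no literal available"
        }
        return literals.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstLiteralOrNull(Collections.emptyList()));
        System.out.println(firstLiteralOrNull(Arrays.asList(42, 7)));
    }
}
```

How the caller should react to a missing literal (skip the predicate, keep the row group, and so on) is left open here, since the comment does not say.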

> ObjectCacheFactory should return null when tez shared object registry is not 
> setup
> --
>
> Key: HIVE-17209
> URL: https://issues.apache.org/jira/browse/HIVE-17209
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17209.1.patch
>
>
> HIVE-15269 introduced dynamic min/max bloom filter 
> ("hive.tez.dynamic.semijoin.reduction=true"). This needs to access 
> ObjectCache and in tez, ObjectCache can only be created by {{TezProcessor}}.
> In the following case {{AM --> splits --> 
> OrcInputFormat.pickStripes::evaluatePredicateMinMax --> 
> DynamicValue.getLiteral --> objectCache access}}, the AM ends up throwing lots 
> of NPEs since the AM has not created an ObjectCache.  
> The ORC reader catches these exceptions, skips PPD and proceeds further. For 
> example, in Q95 it ends up throwing ~30,000 NPEs before completing split information.
> ObjectCacheFactory should return null when tez shared object registry is not 
> setup. 





[jira] [Updated] (HIVE-16985) LLAP IO: enable SMB join in elevator after the former is fixed

2017-07-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16985:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

committed to master

> LLAP IO: enable SMB join in elevator after the former is fixed
> --
>
> Key: HIVE-16985
> URL: https://issues.apache.org/jira/browse/HIVE-16985
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Deepak Jaiswal
> Fix For: 3.0.0
>
> Attachments: HIVE-16985.1.patch
>
>
> We currently skip the IO elevator when we encounter an SMB join (see 
> HIVE-16761). However, it might work with elevator with the code commented out 
> in HIVE-16761. Need to look again after HIVE-16965 is fixed.





[jira] [Comment Edited] (HIVE-17209) ObjectCacheFactory should return null when tez shared object registry is not setup

2017-07-31 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108161#comment-16108161
 ] 

Rajesh Balamohan edited comment on HIVE-17209 at 7/31/17 11:35 PM:
---

Thanks [~sershe]. This would also need a fix in ORC 
(https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java#L554).
Since {{getLiteralList()}} would be empty, it needs to check for an empty 
structure to avoid an IndexOutOfBoundsException. I will create a separate ticket 
for that in ORC.


was (Author: rajesh.balamohan):
Thanks [~sershe]. This would also need a fix in ORC 
(https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java#L554).
 Since {{getLiteralList()}} would be empty, it needs to check for empty 
structure to avoid IndexOutOfBoundsException. 

> ObjectCacheFactory should return null when tez shared object registry is not 
> setup
> --
>
> Key: HIVE-17209
> URL: https://issues.apache.org/jira/browse/HIVE-17209
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17209.1.patch
>
>
> HIVE-15269 introduced dynamic min/max bloom filter 
> ("hive.tez.dynamic.semijoin.reduction=true"). This needs to access 
> ObjectCache and in tez, ObjectCache can only be created by {{TezProcessor}}.
> In the following case {{AM --> splits --> 
> OrcInputFormat.pickStripes::evaluatePredicateMinMax --> 
> DynamicValue.getLiteral --> objectCache access}}, the AM ends up throwing lots 
> of NPEs since the AM has not created an ObjectCache.  
> The ORC reader catches these exceptions, skips PPD and proceeds further. For 
> example, in Q95 it ends up throwing ~30,000 NPEs before completing split information.
> ObjectCacheFactory should return null when tez shared object registry is not 
> setup. 





[jira] [Commented] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108158#comment-16108158
 ] 

Vihang Karajgaonkar commented on HIVE-17167:


Thanks for the changes [~alangates] +1 (pending tests)

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.
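
To make the "enums and static methods" idea in the description concrete, here is a minimal sketch. It is not the patch: {{MetastoreConfSketch}}, the enum constants, their key names and defaults are all made up for illustration, and {{java.util.Properties}} stands in for Hadoop's Configuration so that the example has no dependencies.

```java
import java.util.Properties;

final class MetastoreConfSketch {
    private MetastoreConfSketch() {}

    /** Hypothetical keys; names and default values are illustrative only. */
    enum ConfVars {
        EXAMPLE_URI("example.metastore.uri", "thrift://localhost:9083"),
        EXAMPLE_BATCH_SIZE("example.metastore.batch.size", "100");

        final String varname;
        final String defaultVal;

        ConfVars(String varname, String defaultVal) {
            this.varname = varname;
            this.defaultVal = defaultVal;
        }
    }

    /** Static accessor: works on the base type, so any subclass can be passed in. */
    static String getVar(Properties conf, ConfVars var) {
        return conf.getProperty(var.varname, var.defaultVal);
    }

    /** Typed accessor built on top of the string accessor. */
    static int getIntVar(Properties conf, ConfVars var) {
        return Integer.parseInt(getVar(conf, var));
    }

    static void setVar(Properties conf, ConfVars var, String value) {
        conf.setProperty(var.varname, value);
    }
}
```

Because the accessors take the base configuration type rather than extending it, a caller holding a subclass instance (HiveConf, in the description's terms) could pass it in unchanged, which is the property the description asks for.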





[jira] [Commented] (HIVE-14261) Support set/unset partition parameters

2017-07-31 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108150#comment-16108150
 ] 

Carl Steinbach commented on HIVE-14261:
---

[~ashutoshc], will adding a server-side property that disables this 
functionality satisfy your concerns?

> Support set/unset partition parameters
> --
>
> Key: HIVE-14261
> URL: https://issues.apache.org/jira/browse/HIVE-14261
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14261.01.patch
>
>






[jira] [Updated] (HIVE-17113) Duplicate bucket files can get written to table by runaway task

2017-07-31 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17113:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> Duplicate bucket files can get written to table by runaway task
> ---
>
> Key: HIVE-17113
> URL: https://issues.apache.org/jira/browse/HIVE-17113
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 3.0.0
>
> Attachments: HIVE-17113.1.patch, HIVE-17113.2.patch, 
> HIVE-17113.3.patch
>
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the 
> following happened:
> 1. Task attempt A_0 starts, but then stops making progress
> 2. The job was running with speculative execution on, and task attempt A_1 is 
> started
> 3. Task attempt A_1 finishes execution and saves its output to the temp 
> directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls 
> Utilities.removeTempOrDuplicateFiles() to check for duplicate bucket files
> 7. A_0 (still running) finally finishes and saves its file to the temp 
> directory. At this point we now have duplicate bucket files - oops!
> 8. Utilities.removeTempOrDuplicateFiles() moves the temp directory to the 
> final location, where it is later moved to the partition directory.
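
The duplicate check in the sequence above can be pictured with the sketch below. It is only an illustration, not the code of Utilities.removeTempOrDuplicateFiles(): the {{<taskId>_<attemptId>}} file naming, the "keep the larger file" rule, and the class name {{DuplicateBucketFiles}} are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DuplicateBucketFiles {
    private DuplicateBucketFiles() {}

    /**
     * Keeps one file per task id, preferring the larger file.
     * Input and output map file name to file size in bytes.
     */
    static Map<String, Long> dedupe(Map<String, Long> filesWithSizes) {
        Map<String, String> bestNameByTask = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : filesWithSizes.entrySet()) {
            String name = e.getKey();
            int sep = name.lastIndexOf('_');
            // Everything before the last '_' is treated as the task id (assumption).
            String taskId = sep < 0 ? name : name.substring(0, sep);
            String current = bestNameByTask.get(taskId);
            if (current == null || e.getValue() > filesWithSizes.get(current)) {
                bestNameByTask.put(taskId, name);
            }
        }
        Map<String, Long> kept = new LinkedHashMap<>();
        for (String name : bestNameByTask.values()) {
            kept.put(name, filesWithSizes.get(name));
        }
        return kept;
    }
}
```

A check of this kind only sees the files present at the moment it runs. A file written by a still-running attempt after the check has finished is never examined, which is the window the steps above describe.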





[jira] [Commented] (HIVE-16985) LLAP IO: enable SMB join in elevator after the former is fixed

2017-07-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108139#comment-16108139
 ] 

Sergey Shelukhin commented on HIVE-16985:
-

+1

> LLAP IO: enable SMB join in elevator after the former is fixed
> --
>
> Key: HIVE-16985
> URL: https://issues.apache.org/jira/browse/HIVE-16985
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16985.1.patch
>
>
> We currently skip the IO elevator when we encounter an SMB join (see 
> HIVE-16761). However, it might work with elevator with the code commented out 
> in HIVE-16761. Need to look again after HIVE-16965 is fixed.





[jira] [Updated] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17189:
---
Attachment: HIVE-17189.02.patch

Added a test based on Alan's suggestion.

> Fix backwards incompatibility in HiveMetaStoreClient
> 
>
> Key: HIVE-17189
> URL: https://issues.apache.org/jira/browse/HIVE-17189
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch
>
>
> HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and 
> {{alter partition}} commands. However, it changes the signature of @public 
> interface of MetastoreClient and removes some methods which breaks backwards 
> compatibility. This can be fixed easily by re-introducing the removed methods 
> and making them call into newly added method 
> {{alter_table_with_environment_context}}
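
The fix pattern described here (keep the old signature and have it call the new one) can be sketched as follows. The class {{CompatClientSketch}} and both method signatures are hypothetical and much simpler than the real metastore client interface; the list of recorded calls stands in for the remote call.

```java
import java.util.ArrayList;
import java.util.List;

final class CompatClientSketch {
    /** Records calls so the delegation can be observed; stands in for the remote call. */
    final List<String> calls = new ArrayList<>();

    /** Newer method that carries an extra, possibly null, context argument. */
    void alterTableWithEnvironmentContext(String db, String table, String newTable,
                                          Object envContext) {
        calls.add(db + "." + table + "->" + newTable + " ctx=" + envContext);
    }

    /** Re-introduced older signature: existing callers keep compiling and linking. */
    void alterTable(String db, String table, String newTable) {
        alterTableWithEnvironmentContext(db, table, newTable, null);
    }
}
```

The old method adds no behaviour of its own, so there is a single code path to maintain while callers compiled against the earlier interface continue to work.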





[jira] [Commented] (HIVE-17188) ObjectStore runs out of memory for large batches of addPartitions().

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108132#comment-16108132
 ] 

Vihang Karajgaonkar commented on HIVE-17188:


Looks good to me. 
+1 

> ObjectStore runs out of memory for large batches of addPartitions().
> 
>
> Key: HIVE-17188
> URL: https://issues.apache.org/jira/browse/HIVE-17188
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17188.1.patch
>
>
> For large batches (e.g. hundreds) of {{addPartitions()}}, the {{ObjectStore}} 
> runs out of memory. Flushing the {{PersistenceManager}} alleviates the 
> problem.
> Note: The problem being addressed here isn't so much with the size of the 
> hundreds of Partition objects, but the cruft that builds with the 
> PersistenceManager, in the JDO layer, as confirmed through memory-profiling.
> (Raising this on behalf of [~cdrome] and [~thiruvel].)
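
One way to apply "flushing alleviates the problem" is to flush at a fixed interval while adding a large batch. The sketch below assumes that approach; the attached patch may do it differently. {{BatchedAdd}} and its {{Store}} interface are hypothetical stand-ins and not the JDO PersistenceManager API.

```java
import java.util.List;

final class BatchedAdd {
    private BatchedAdd() {}

    /** Minimal stand-in for the persistence layer. */
    interface Store<T> {
        void persist(T item);
        void flush();
    }

    /**
     * Persists all items, flushing after every batchSize items and once more
     * at the end if anything is still pending. Returns the number of flushes.
     */
    static <T> int addAll(List<T> items, Store<T> store, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (T item : items) {
            store.persist(item);
            if (++pending == batchSize) {
                store.flush(); // lets the persistence layer release per-object state
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            store.flush();
            flushes++;
        }
        return flushes;
    }
}
```

The batch size is a tuning knob: smaller values bound the state held between flushes more tightly at the cost of more round trips.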





[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-07-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108110#comment-16108110
 ] 

Prasanth Jayachandran commented on HIVE-17220:
--

Although bloom-1 is fast in microbenchmarks (2-5x faster as there is only 1 
memory access), there is around a 2% increase in fpp. This will let more rows 
pass through the bloom filter, negating the performance gain. An alternative 
approach is to increase the stride size for hash mapping to more than 1 long. 
Will update the patch shortly with a bloom-k implementation.
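
One reading of the stride idea in this comment is a blocked bloom filter: each key picks one block of several consecutive longs, and all k bits for that key are set inside that block, so a probe touches one small contiguous region. The sketch below illustrates that technique under those assumptions and is not the patch; {{BlockedBloomSketch}} and its parameters are hypothetical, and the mixing function uses the constants of the MurmurHash3 64-bit finalizer.

```java
final class BlockedBloomSketch {
    private final long[] bits;
    private final int numBlocks;
    private final int blockLongs; // stride: longs per block (8 longs = 64 bytes)
    private final int numHashes;  // k

    BlockedBloomSketch(int numBlocks, int blockLongs, int numHashes) {
        if (numBlocks <= 0 || blockLongs <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("all sizes must be positive");
        }
        this.bits = new long[numBlocks * blockLongs];
        this.numBlocks = numBlocks;
        this.blockLongs = blockLongs;
        this.numHashes = numHashes;
    }

    void add(long key) {
        long h = mix(key);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        int base = Math.floorMod(h2, numBlocks) * blockLongs; // pick one block
        int blockBits = blockLongs * 64;
        for (int i = 1; i <= numHashes; i++) {
            int pos = Math.floorMod(h1 + i * h2, blockBits);  // bit inside the block
            bits[base + (pos >>> 6)] |= 1L << (pos & 63);
        }
    }

    boolean mightContain(long key) {
        long h = mix(key);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        int base = Math.floorMod(h2, numBlocks) * blockLongs;
        int blockBits = blockLongs * 64;
        for (int i = 1; i <= numHashes; i++) {
            int pos = Math.floorMod(h1 + i * h2, blockBits);
            if ((bits[base + (pos >>> 6)] & (1L << (pos & 63))) == 0) {
                return false; // a required bit is unset: key was never added
            }
        }
        return true;
    }

    /** 64-bit mixing step (MurmurHash3 finalizer constants). */
    static long mix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }
}
```

With {{blockLongs = 1}} this degenerates to the single-long case (all bits in one word); larger blocks trade a little locality for a lower false positive rate. Java does not guarantee that a block is aligned to a cache line, so the "one cache line per probe" property is only approximate here.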

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloom filter probes as a bottleneck 
> for some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache, and the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>    5101.935637  task-clock (msec)          #    0.461 CPUs utilized
>            346  context-switches           #    0.068 K/sec
>            336  cpu-migrations             #    0.066 K/sec
>          6,207  page-faults                #    0.001 M/sec
> 10,016,486,301  cycles                     #    1.963 GHz                      (26.90%)
>  5,751,692,176  stalled-cycles-frontend    #   57.42% frontend cycles idle     (27.05%)
>                 stalled-cycles-backend
> 14,359,914,397  instructions               #    1.43  insns per cycle
>                                            #    0.40  stalled cycles per insn  (33.78%)
>  2,200,632,861  branches                   #  431.333 M/sec                    (33.84%)
>      1,162,860  branch-misses              #    0.05% of all branches          (33.97%)
>  1,025,992,254  L1-dcache-loads            #  201.099 M/sec                    (26.56%)
>    432,663,098  L1-dcache-load-misses      #   42.17% of all L1-dcache hits    (14.49%)
>    331,383,297  LLC-loads                  #   64.952 M/sec                    (14.47%)
>        203,524  LLC-load-misses            #    0.06% of all LL-cache hits     (21.67%)
>                 L1-icache-loads
>      1,633,821  L1-icache-load-misses      #    0.320 M/sec                    (28.85%)
>    950,368,796  dTLB-loads                 #  186.276 M/sec                    (28.61%)
>    246,813,393  dTLB-load-misses           #   25.97% of all dTLB cache hits   (14.53%)
>         25,451  iTLB-loads                 #    0.005 M/sec                    (14.48%)
>         35,415  iTLB-load-misses           #  139.15% of all iTLB cache hits   (21.73%)
>                 L1-dcache-prefetches
>        175,958  L1-dcache-prefetch-misses  #    0.034 M/sec                    (28.94%)
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use cache efficient bloom filter for semijoin probing.





[jira] [Commented] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108100#comment-16108100
 ] 

Hive QA commented on HIVE-17217:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879692/HIVE-17217.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11018 tests executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6201/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6201/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6201/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879692 - PreCommit-HIVE-Build

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.
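
The check described here amounts to verifying that every split in a group reports the same path. A minimal sketch follows, with plain strings standing in for split paths; {{SamePathCheck}} is a hypothetical helper and not part of KeyValueInputMerger.

```java
import java.util.List;
import java.util.Objects;

final class SamePathCheck {
    private SamePathCheck() {}

    /** Throws if the given split paths are not all equal; an empty list passes. */
    static void assertSamePath(List<String> splitPaths) {
        String first = null;
        for (String p : splitPaths) {
            Objects.requireNonNull(p, "split path");
            if (first == null) {
                first = p; // remember the path of the first split
            } else if (!first.equals(p)) {
                throw new IllegalStateException(
                    "Grouped split mixes paths: " + first + " and " + p);
            }
        }
    }
}
```

Failing fast at this point turns a silent mis-grouping into an immediate, attributable error.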





[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList

2017-07-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17172:

Attachment: HIVE-17172.01.patch

Added a test

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>






[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-07-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
Attachment: HIVE-17220.WIP.patch

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloom filter probes as a bottleneck 
> for some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache, and the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>    5101.935637  task-clock (msec)          #    0.461 CPUs utilized
>            346  context-switches           #    0.068 K/sec
>            336  cpu-migrations             #    0.066 K/sec
>          6,207  page-faults                #    0.001 M/sec
> 10,016,486,301  cycles                     #    1.963 GHz                      (26.90%)
>  5,751,692,176  stalled-cycles-frontend    #   57.42% frontend cycles idle     (27.05%)
>                 stalled-cycles-backend
> 14,359,914,397  instructions               #    1.43  insns per cycle
>                                            #    0.40  stalled cycles per insn  (33.78%)
>  2,200,632,861  branches                   #  431.333 M/sec                    (33.84%)
>      1,162,860  branch-misses              #    0.05% of all branches          (33.97%)
>  1,025,992,254  L1-dcache-loads            #  201.099 M/sec                    (26.56%)
>    432,663,098  L1-dcache-load-misses      #   42.17% of all L1-dcache hits    (14.49%)
>    331,383,297  LLC-loads                  #   64.952 M/sec                    (14.47%)
>        203,524  LLC-load-misses            #    0.06% of all LL-cache hits     (21.67%)
>                 L1-icache-loads
>      1,633,821  L1-icache-load-misses      #    0.320 M/sec                    (28.85%)
>    950,368,796  dTLB-loads                 #  186.276 M/sec                    (28.61%)
>    246,813,393  dTLB-load-misses           #   25.97% of all dTLB cache hits   (14.53%)
>         25,451  iTLB-loads                 #    0.005 M/sec                    (14.48%)
>         35,415  iTLB-load-misses           #  139.15% of all iTLB cache hits   (21.73%)
>                 L1-dcache-prefetches
>        175,958  L1-dcache-prefetch-misses  #    0.034 M/sec                    (28.94%)
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use cache efficient bloom filter for semijoin probing.





[jira] [Assigned] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-07-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17220:



> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> [~gopalv] observed perf profiles showing bloom filter probes as a bottleneck 
> for some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache, and the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>    5101.935637  task-clock (msec)          #    0.461 CPUs utilized
>            346  context-switches           #    0.068 K/sec
>            336  cpu-migrations             #    0.066 K/sec
>          6,207  page-faults                #    0.001 M/sec
> 10,016,486,301  cycles                     #    1.963 GHz                      (26.90%)
>  5,751,692,176  stalled-cycles-frontend    #   57.42% frontend cycles idle     (27.05%)
>                 stalled-cycles-backend
> 14,359,914,397  instructions               #    1.43  insns per cycle
>                                            #    0.40  stalled cycles per insn  (33.78%)
>  2,200,632,861  branches                   #  431.333 M/sec                    (33.84%)
>      1,162,860  branch-misses              #    0.05% of all branches          (33.97%)
>  1,025,992,254  L1-dcache-loads            #  201.099 M/sec                    (26.56%)
>    432,663,098  L1-dcache-load-misses      #   42.17% of all L1-dcache hits    (14.49%)
>    331,383,297  LLC-loads                  #   64.952 M/sec                    (14.47%)
>        203,524  LLC-load-misses            #    0.06% of all LL-cache hits     (21.67%)
>                 L1-icache-loads
>      1,633,821  L1-icache-load-misses      #    0.320 M/sec                    (28.85%)
>    950,368,796  dTLB-loads                 #  186.276 M/sec                    (28.61%)
>    246,813,393  dTLB-load-misses           #   25.97% of all dTLB cache hits   (14.53%)
>         25,451  iTLB-loads                 #    0.005 M/sec                    (14.48%)
>         35,415  iTLB-load-misses           #  139.15% of all iTLB cache hits   (21.73%)
>                 L1-dcache-prefetches
>        175,958  L1-dcache-prefetch-misses  #    0.034 M/sec                    (28.94%)
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use cache efficient bloom filter for semijoin probing.





[jira] [Updated] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17167:
--
Status: Patch Available  (was: Open)

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.





[jira] [Updated] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17167:
--
Attachment: HIVE-17167.2.patch

New patch that reworks MetastoreConf.get() based on Vihang's suggestions, adds 
comments on hive-default.xml, adds a new method to get all resource file URIs, 
and removes dead code.

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.





[jira] [Updated] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17167:
--
Status: Open  (was: Patch Available)

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.





[jira] [Assigned] (HIVE-17219) NPE in SparkPartitionPruningSinkOperator#closeOp for query with partitioned join in subquery

2017-07-31 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17219:
---


> NPE in SparkPartitionPruningSinkOperator#closeOp for query with partitioned 
> join in subquery
> 
>
> Key: HIVE-17219
> URL: https://issues.apache.org/jira/browse/HIVE-17219
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The following query: {{select * from partitioned_table1 where 
> partitioned_table1.part_col in (select partitioned_table2.col from 
> partitioned_table2 join partitioned_table3 on partitioned_table3.col = 
> partitioned_table2.part_col)}} throws an NPE in 
> {{SparkPartitionPruningSinkOperator#closeOp}}
> The full stack trace is:
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 1 in stage 22.0 failed 4 times, most recent failure: Lost task 1.3 in 
> stage 22.0 (TID 37, 10.16.1.179): java.lang.IllegalStateException: Hit error 
> while closing operators - failing tree: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:194)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:147)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
> at org.apache.spark.scheduler.Task.run(Task.scala:85)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator.closeOp(SparkPartitionPruningSinkOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:723)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:171)
> ... 11 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator.flushToFile(SparkPartitionPruningSinkOperator.java:151)
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator.closeOp(SparkPartitionPruningSinkOperator.java:93)
> ... 19 more
> {code}
> The full setup is:
> {code}
> set hive.spark.dynamic.partition.pruning=true;
> create table partitioned_table1 (col int) partitioned by (part_col int);
> create table partitioned_table2 (col int) partitioned by (part_col int);
> create table partitioned_table3 (col int) partitioned by (part_col int);
> create table regular_table (col1 int, col2 int);
> insert into table regular_table values (0, 0), (1, 1), (2, 2);
> alter table partitioned_table1 add partition (part_col = 1);
> alter table partitioned_table1 add partition (part_col = 2);
> alter table partitioned_table1 add partition (part_col = 3);
> insert into table partitioned_table1 partition (part_col = 1) values (1), 
> (2), (3);
> insert into table partitioned_table1 partition (part_col = 2) values (1), 
> (2), (3);
> insert into table partitioned_table1 partition (part_col = 3) values (1), 
> (2), (3);
> alter table partitioned_table2 add partition (part_col = 1);
> alter table partitioned_table2 add partition (part_col = 2);
> alter table partitioned_table2 add partition (part_col = 3);
> insert into table partitioned_table2 partition (part_col = 1) values (1), 
> (2), (3);
> insert into table 

[jira] [Commented] (HIVE-17188) ObjectStore runs out of memory for large batches of addPartitions().

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108024#comment-16108024
 ] 

Mithun Radhakrishnan commented on HIVE-17188:
-

I'm +1 on this change. This has been running in production for some time now. 

I'd like to commit this to {{master}} and {{branch-2}}, tomorrow, unless there 
is objection.

> ObjectStore runs out of memory for large batches of addPartitions().
> 
>
> Key: HIVE-17188
> URL: https://issues.apache.org/jira/browse/HIVE-17188
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17188.1.patch
>
>
> For large batches (e.g. hundreds) of {{addPartitions()}}, the {{ObjectStore}} 
> runs out of memory. Flushing the {{PersistenceManager}} alleviates the 
> problem.
> Note: The problem being addressed here isn't so much with the size of the 
> hundreds of Partition objects, but the cruft that builds with the 
> PersistenceManager, in the JDO layer, as confirmed through memory-profiling.
> (Raising this on behalf of [~cdrome] and [~thiruvel].)
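
A minimal sketch of the flushing idea, assuming the flush happens once per fixed-size batch (the description does not say how often the patch flushes). The PersistenceSession interface is a hypothetical stand-in for the two JDO PersistenceManager operations involved, so the example runs without JDO on the classpath.

```java
import java.util.List;

// Sketch only: shows periodic flushing while persisting a large batch.
final class BatchedFlushSketch {

    /** Hypothetical stand-in for the small part of a JDO PersistenceManager used here. */
    interface PersistenceSession {
        void makePersistent(Object o);
        void flush();
    }

    private BatchedFlushSketch() {}

    /**
     * Persists every item and flushes after each full batch, plus once more
     * for a trailing partial batch. Returns the number of flushes performed.
     */
    static int persistInBatches(PersistenceSession session, List<?> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (Object item : items) {
            session.makePersistent(item);
            pending++;
            if (pending == batchSize) {
                session.flush();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            session.flush();
            flushes++;
        }
        return flushes;
    }
}
```

The batch size is the tuning knob in this sketch: smaller batches bound the state held between flushes more tightly at the cost of more round trips.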





[jira] [Updated] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17218:

Status: Patch Available  (was: Open)

> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 2.2.0, 1.2.2, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17218.1.patch
>
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.
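
A rough illustration of what client-side canonicalization could look like, using only the JDK. The class and method names are hypothetical, the realm EXAMPLE.COM is a placeholder, and this is not the code in the attached patch. InetAddress.getCanonicalHostName() is a best-effort lookup, so it can return the textual IP address when no fully qualified name is available.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Sketch only: canonicalize a host before substituting it for _HOST.
final class CanonicalHostSketch {

    private CanonicalHostSketch() {}

    /** Resolves a possibly aliased host name (e.g. a CNAME) to its canonical name. */
    static String canonicalize(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getCanonicalHostName();
    }

    /** Replaces the _HOST placeholder in a Kerberos principal with the lower-cased host. */
    static String fillHostPattern(String principal, String host) {
        return principal.replace("_HOST", host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws UnknownHostException {
        String host = args.length > 0 ? args[0] : "localhost";
        // The principal is built from the canonical name, not from the alias in the URI.
        System.out.println(fillHostPattern("hive/_HOST@EXAMPLE.COM", canonicalize(host)));
    }
}
```

With this ordering the service principal is derived from the name the server actually registered under, which is why a simplified CNAME in the thrift URL would no longer lead to a mismatch.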





[jira] [Updated] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17218:

Attachment: HIVE-17218.1.patch

Here's the proposed fix.

> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 1.2.2, 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17218.1.patch
>
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.





[jira] [Assigned] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17218:
---


> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 2.2.0, 1.2.2, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.





[jira] [Commented] (HIVE-17152) Improve security of random generator for HS2 cookies

2017-07-31 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107980#comment-16107980
 ] 

Tao Li commented on HIVE-17152:
---

All test failures except the below ones are tracked in HIVE-15058.

org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements (batchId=228)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation2 (batchId=228)

Those 2 tests passed locally on my box. If they keep recurring, then we can add 
them to HIVE-15058.

[~thejas] Can you please review the patch?

> Improve security of random generator for HS2 cookies
> 
>
> Key: HIVE-17152
> URL: https://issues.apache.org/jira/browse/HIVE-17152
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17152.1.patch
>
>
> The random number generated is used as a secret that is appended to a 
> sequence and hashed with SHA to implement a CookieSigner. If this is 
> attackable, then it's possible for an attacker to sign a cookie as if we had. 
> We should fix this and use SecureRandom as a stronger random function.
> HTTPAuthUtils has a similar issue. If that is attackable, an attacker might 
> be able to create a similar cookie. Paired with the above issue with the 
> CookieSigner, an attacker could plausibly spoof an HS2 cookie.
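
A minimal sketch of a cookie signer whose secret comes from SecureRandom, assuming the scheme in the description (hash the value together with a secret). The class name, the "&s=" separator, the 32-byte secret length, and the choice of SHA-256 are assumptions made for illustration and are not taken from the patch. An HMAC would be the more conventional construction; the plain hash is kept here only to mirror the description.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Sketch only: signs "value" as value + "&s=" + base64(SHA-256(value || secret)).
final class CookieSignerSketch {

    private static final String SEPARATOR = "&s=";
    private final byte[] secret;

    CookieSignerSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Generates a 32-byte secret from a cryptographically strong source. */
    static byte[] newSecret() {
        byte[] bytes = new byte[32];
        new SecureRandom().nextBytes(bytes);
        return bytes;
    }

    String sign(String value) {
        return value + SEPARATOR + signature(value);
    }

    /** Recomputes the signature and compares it in constant time. */
    boolean verify(String signed) {
        int idx = signed.lastIndexOf(SEPARATOR);
        if (idx < 0) {
            return false;
        }
        String value = signed.substring(0, idx);
        byte[] expected = signature(value).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signed.substring(idx + SEPARATOR.length()).getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }

    private String signature(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(value.getBytes(StandardCharsets.UTF_8));
            md.update(secret);
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

The point of the change is the source of the secret: java.util.Random output can be predicted from a few observed values, while SecureRandom is designed to resist that.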





[jira] [Updated] (HIVE-17152) Improve security of random generator for HS2 cookies

2017-07-31 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17152:
--
Status: Patch Available  (was: Open)

> Improve security of random generator for HS2 cookies
> 
>
> Key: HIVE-17152
> URL: https://issues.apache.org/jira/browse/HIVE-17152
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17152.1.patch
>
>
> The random number generated is used as a secret that is appended to a 
> sequence and hashed with SHA to implement a CookieSigner. If this is 
> attackable, then it's possible for an attacker to sign a cookie as if we had. 
> We should fix this and use SecureRandom as a stronger random function.
> HTTPAuthUtils has a similar issue. If that is attackable, an attacker might 
> be able to create a similar cookie. Paired with the above issue with the 
> CookieSigner, an attacker could plausibly spoof an HS2 cookie.





[jira] [Commented] (HIVE-17205) add functional support

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107967#comment-16107967
 ] 

Hive QA commented on HIVE-17205:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879687/HIVE-17205.03.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 11021 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) 
(batchId=281)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_mapjoin1] 
(batchId=81)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6200/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6200/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6200/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879687 - PreCommit-HIVE-Build

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work





[jira] [Commented] (HIVE-14013) Describe table doesn't show unicode properly

2017-07-31 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107963#comment-16107963
 ] 

Aihua Xu commented on HIVE-14013:
-

[~fenghaizhu] I can't see your attachment, but I tried a simple test and it 
works fine.

{noformat}
hive> desc unicode_comments_tbl1;
OK
col1    string  第一列 
p1  string  分割  
 
# Partition Information  
# col_name  data_type   comment 
 
p1  string  分割
{noformat}

> Describe table doesn't show unicode properly
> 
>
> Key: HIVE-14013
> URL: https://issues.apache.org/jira/browse/HIVE-14013
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.3.0
>
> Attachments: HIVE-14013.1.patch, HIVE-14013.2.patch, 
> HIVE-14013.3.patch, HIVE-14013.4.patch
>
>
> Describe table output shows comments as escaped code points rather than the 
> Unicode characters themselves.
> {noformat}
> hive> desc formatted t1;
> # Detailed Table Information 
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
> comment \u8868\u4E2D\u6587\u6D4B\u8BD5
> numFiles0   
> {noformat}





[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-07-31 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107942#comment-16107942
 ] 

Gopal V commented on HIVE-17194:


Jetty9 turned GzipFilter into a no-op; GzipHandler needs to be used in 
Jetty9 to get compression instead.

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.





[jira] [Updated] (HIVE-17006) LLAP: Parquet caching

2017-07-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17006:

Attachment: HIVE-17006.02.patch

Fixing the stream issue (seems to be version dependent)

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but is still a good idea. I messaged the dev list about it but didn't 
> get a response; we may follow up later.
> For now, do (3). 





[jira] [Commented] (HIVE-13989) Extended ACLs are not handled according to specification

2017-07-31 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107894#comment-16107894
 ] 

Chris Drome commented on HIVE-13989:


[~vgumashta], I'll review your comments and update accordingly.

> Extended ACLs are not handled according to specification
> 
>
> Key: HIVE-13989
> URL: https://issues.apache.org/jira/browse/HIVE-13989
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, 
> HIVE-13989-branch-1.patch, HIVE-13989-branch-2.2.patch, 
> HIVE-13989-branch-2.2.patch, HIVE-13989-branch-2.2.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs attempt to build up the directory sub-tree 
> programmatically and run some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     //the ACL api's also expect the tradition user/group/other permission in the form of ACL
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, sourcePerm.getOtherAction()));
>   }
> }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the directory sub-tree, so the above code is incomplete because it 
> effectively drops the DEFAULT rules. The second problem is with the call to 
> {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
> ACLs. When extended ACLs are used the GROUP permission is replaced with the 
> extended ACL mask. So the above code will apply the wrong permissions to the 
> GROUP. Instead the correct GROUP permissions now need to be pulled from the 
> AclEntry as returned by {{getAclStatus().getEntries()}}. See the 
> implementation of the new method {{getDefaultAclEntries}} for details.
> Similar issues exist with the HCatalog API. None of its methods account for 
> setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
> API allow the extended ACLs to be passed into the required methods similar to 
> how basic permissions are passed in. When building the directory sub-tree the 
> extended ACLs of the table directory are inherited by all sub-directories, 
> including the DEFAULT rules.
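
The GROUP point above can be sketched as follows. The Scope, Type, and Entry types are simplified, hypothetical stand-ins for Hadoop's AclEntryScope, AclEntryType, and AclEntry so that the example runs without Hadoop on the classpath, and groupPermission is not the method from the patch. Permissions are plain strings such as "r-x".

```java
import java.util.List;

// Sketch only: with extended ACLs the group bits of the basic permission hold
// the mask, so the real group permission must come from the unnamed ACCESS
// GROUP entry in the ACL entry list.
final class GroupAclSketch {

    enum Scope { ACCESS, DEFAULT }

    enum Type { USER, GROUP, MASK, OTHER }

    /** Hypothetical, simplified ACL entry; name is null for unnamed entries. */
    static final class Entry {
        final Scope scope;
        final Type type;
        final String name;
        final String perm;

        Entry(Scope scope, Type type, String name, String perm) {
            this.scope = scope;
            this.type = type;
            this.name = name;
            this.perm = perm;
        }
    }

    private GroupAclSketch() {}

    /**
     * Returns the permission of the unnamed ACCESS GROUP entry if one exists,
     * otherwise falls back to the group bits of the basic permission.
     */
    static String groupPermission(List<Entry> entries, String basicGroupBits) {
        for (Entry e : entries) {
            if (e.scope == Scope.ACCESS && e.type == Type.GROUP && e.name == null) {
                return e.perm;
            }
        }
        return basicGroupBits;
    }
}
```

Applied to the acl_test directory in the reproduction steps, the basic group bits read rwx (the mask) while the unnamed group entry reads r-x, so the sketch would return r-x.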
> Replicating the problem:
> Create a table to write data into (I will use acl_test as the destination and 
> words_text as the source) and set the ACLs as follows:
> {noformat}
> $ hdfs dfs -setfacl -m 
> default:user::rwx,default:group::r-x,default:mask::rwx,default:user:hdfs:rwx,group::r-x,user:hdfs:rwx
>  /user/cdrome/hive/acl_test
> $ hdfs dfs -ls -d /user/cdrome/hive/acl_test
> drwxrwx---+  - cdrome hdfs  0 2016-07-13 20:36 
> /user/cdrome/hive/acl_test
> $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
> # file: /user/cdrome/hive/acl_test
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::r-x
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::r-x
> default:mask::rwx
> default:other::---
> {noformat}
> Note that the basic GROUP permission is set to {{rwx}} after setting the 
> ACLs. The ACLs explicitly set the DEFAULT rules and a rule specifically for 
> the {{hdfs}} user.
> Run the following query to populate the table:
> {noformat}
> insert into acl_test partition (dt='a', ds='b') select a, b from words_text 
> where dt = 'c';
> {noformat}
> Note that words_text only has a single partition key.
> Now examine the ACLs for the resulting directories:
> {noformat}
> $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
> # file: /user/cdrome/hive/acl_test
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx

[jira] [Updated] (HIVE-16759) Add table type information to HMS log notifications

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16759:
---
Fix Version/s: 2.4.0
   3.0.0

> Add table type information to HMS log notifications
> ---
>
> Key: HIVE-16759
> URL: https://issues.apache.org/jira/browse/HIVE-16759
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Sergio Peña
>Assignee: Janaki Lahorani
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE16759.1.patch, HIVE16759.2.patch, HIVE16759.3.patch, 
> HIVE16759.3.patch, HIVE16759.4.patch, HIVE-16759-branch-2.01.patch
>
>
> The DB notifications used by HiveMetaStore should include the table type for 
> all notifications that include table events, such as create, drop and alter 
> table.
> This would be useful for consumers to identify views vs tables.





[jira] [Commented] (HIVE-16759) Add table type information to HMS log notifications

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107863#comment-16107863
 ] 

Vihang Karajgaonkar commented on HIVE-16759:


Pushed to branch-2. Thanks for your contribution [~janulatha]

> Add table type information to HMS log notifications
> ---
>
> Key: HIVE-16759
> URL: https://issues.apache.org/jira/browse/HIVE-16759
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Sergio Peña
>Assignee: Janaki Lahorani
> Attachments: HIVE16759.1.patch, HIVE16759.2.patch, HIVE16759.3.patch, 
> HIVE16759.3.patch, HIVE16759.4.patch, HIVE-16759-branch-2.01.patch
>
>
> The DB notifications used by HiveMetaStore should include the table type for 
> all notifications that include table events, such as create, drop and alter 
> table.
> This would be useful for consumers to identify views vs tables.





[jira] [Commented] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107845#comment-16107845
 ] 

Vihang Karajgaonkar commented on HIVE-17167:


Yes, without sub-classing I agree we cannot enforce this in code. Given that 
users can still do a conf.get("key") instead of MetastoreConf.getX(conf, key), 
adding a check for inconsistent values in MetastoreConf.getX() may not work for 
all the cases and will only add to the confusion.

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.
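
A minimal sketch of the "enums and static methods" pattern described above, written in Python for brevity rather than the Java the metastore uses. It is an illustration under stated assumptions, not the code in the attached patch: the names `ConfVars` and `get_var`, and the keys and defaults shown, are hypothetical, and a plain dict stands in for Hadoop's Configuration object.

```python
from enum import Enum


class ConfVars(Enum):
    """Hypothetical metastore settings; each member carries (key, default)."""
    THRIFT_URIS = ("metastore.thrift.uris", "")
    SERVER_MAX_THREADS = ("metastore.server.max.threads", 1000)

    @property
    def key(self):
        return self.value[0]

    @property
    def default(self):
        return self.value[1]


def get_var(conf, var):
    """Static-style accessor: read `var` from any dict-like config object."""
    raw = conf.get(var.key, var.default)
    # Coerce to the type of the declared default so callers get typed values.
    return type(var.default)(raw)
```

Because the accessor only needs `conf.get`, it works on any object that offers it, which mirrors the requirement that the tool accept a HiveConf without extending it. As the comment above notes, nothing stops a caller from bypassing the accessor with `conf.get("key")` directly.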



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107812#comment-16107812
 ] 

Deepak Jaiswal edited comment on HIVE-17217 at 7/31/17 7:17 PM:


Proper implementation of assert.
[~gopalv] Can you please review?


was (Author: djaiswal):
Proper implementation of assert.

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.
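
The check described above can be sketched as follows. This is a Python illustration only, not the contents of HIVE-17217.1.patch; the function name `assert_same_path` is hypothetical, and the grouped split is assumed to be reducible to a list of path strings.

```python
def assert_same_path(split_paths):
    """Raise AssertionError unless every split in the group has the same path."""
    distinct = set(split_paths)
    if len(distinct) > 1:
        # Raised explicitly so the check survives running with assertions disabled.
        raise AssertionError(
            "grouped split spans multiple paths: %s" % sorted(distinct)
        )
```

An empty group or a group with a single path passes; any second distinct path fails loudly instead of letting the merger read rows from the wrong file.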



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17217:
--
Attachment: HIVE-17217.1.patch

Proper implementation of assert.

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17217:
--
Status: Patch Available  (was: In Progress)

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17217:
--
Summary: SMB Join : Assert if paths are different in TezGroupedSplit in 
KeyValueInputMerger  (was: Bucket Map Join : Assert if paths are different in 
TezGroupedSplit in KeyValueInputMerger)

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17217) Bucket Map Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107804#comment-16107804
 ] 

Deepak Jaiswal commented on HIVE-17217:
---

Thanks for pointing this out. SMB it is. I will update the JIRA.

> Bucket Map Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> -
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17217) Bucket Map Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107797#comment-16107797
 ] 

Gopal V commented on HIVE-17217:


The SMB Join or Bucket MapJoin?

The BMJ can have different paths because there's no ordering required.

> Bucket Map Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> -
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17205) add functional support

2017-07-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
Attachment: HIVE-17205.03.patch

3 is the same as 2 - build bot didn't get triggered

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-31 Thread Sunitha Beeram (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107778#comment-16107778
 ] 

Sunitha Beeram commented on HIVE-16908:
---

[~mithun] thanks - will take a look at the earliest.

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch, 
> HIVE-16908.3.patch
>
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which 
> results in the PersistenceManagerFactory being closed, and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107761#comment-16107761
 ] 

Hive QA commented on HIVE-17212:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879656/HIVE-17212.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6199/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6199/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6199/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879656 - PreCommit-HIVE-Build

> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17212.01.patch
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> Generally, an insert operation generates an INSERT event to notify consumers 
> of the operation and its new data files.
> In this case, Hive should generate only an ADD_PARTITION event with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17209) ObjectCacheFactory should return null when tez shared object registry is not setup

2017-07-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107746#comment-16107746
 ] 

Sergey Shelukhin commented on HIVE-17209:
-

+1

> ObjectCacheFactory should return null when tez shared object registry is not 
> setup
> --
>
> Key: HIVE-17209
> URL: https://issues.apache.org/jira/browse/HIVE-17209
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17209.1.patch
>
>
> HIVE-15269 introduced dynamic min/max bloom filter 
> ("hive.tez.dynamic.semijoin.reduction=true"). This needs to access 
> ObjectCache and in tez, ObjectCache can only be created by {{TezProcessor}}.
> In the following case {{AM --> splits --> 
> OrcInputFormat.pickStripes::evaluatePredicateMinMax --> 
> DynamicValue.getLiteral --> objectCache access}}, the AM ends up throwing lots 
> of NPEs since the AM has not created an ObjectCache.  
> The ORC reader catches these exceptions, skips PPD and proceeds further. For 
> example, in Q95 it ends up throwing ~30,000 NPEs before completing split 
> information.
> ObjectCacheFactory should return null when the tez shared object registry is 
> not set up. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17214) check/fix conversion of non-acid to acid

2017-07-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107737#comment-16107737
 ] 

Sergey Shelukhin edited comment on HIVE-17214 at 7/31/17 6:31 PM:
--

There's a setting in Hive (and Tez/MR?) that basically makes it enumerate input 
directory contents recursively. Theoretically, the paths could be completely 
arbitrary. Union for Tez is just a special case that makes use of this feature. 
Not sure if anyone actually cares otherwise. Flattening it might be an option 
when converting to ACID... or throwing an error, and then flattening if some 
parameter is passed to the alter query/some config setting is set.
Oh also list bucketing makes use of this feature, kind of. I think Hive 
actually substitutes the paths when it's in use, but you can still read the 
table ignoring list bucketing, if the recursion is enabled. Not sure if anyone 
cares about list bucketing either ;)


was (Author: sershe):
There's a setting in Hive (and Tez/MR?) that basically makes it enumerate input 
directory contents recursively. Theoretically, the paths could be completely 
arbitrary. Union for Tez is just a special case that makes use of this feature. 
Not sure if anyone actually cares otherwise. Flattening it might be an option 
when converting to ACID... or throwing an error, and then flattening if some 
parameter is passed to the alter query/some config setting is set.

> check/fix conversion of non-acid to acid
> 
>
> Key: HIVE-17214
> URL: https://issues.apache.org/jira/browse/HIVE-17214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> bucketed tables have stricter rules for file layout on disk - bucket files 
> are direct children of a partition directory.
> for un-bucketed tables I'm not sure there are any rules
> for example, CTAS with Tez + Union operator creates 1 directory for each leg 
> of the union
> Supposedly Hive can read a table by picking all files recursively.  
> Can it also write (other than the CTAS example above) arbitrarily?
> Does it mean an Acid write can also write anywhere?
> Figure out what can be supported and how the existing layout can be checked.  
> Examining a full "ls -l -R" for a large table could be expensive. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17214) check/fix conversion of non-acid to acid

2017-07-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107737#comment-16107737
 ] 

Sergey Shelukhin commented on HIVE-17214:
-

There's a setting in Hive (and Tez/MR?) that basically makes it enumerate input 
directory contents recursively. Theoretically, the paths could be completely 
arbitrary. Union for Tez is just a special case that makes use of this feature. 
Not sure if anyone actually cares otherwise. Flattening it might be an option 
when converting to ACID... or throwing an error, and then flattening if some 
parameter is passed to the alter query/some config setting is set.

> check/fix conversion of non-acid to acid
> 
>
> Key: HIVE-17214
> URL: https://issues.apache.org/jira/browse/HIVE-17214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> bucketed tables have stricter rules for file layout on disk - bucket files 
> are direct children of a partition directory.
> for un-bucketed tables I'm not sure there are any rules
> for example, CTAS with Tez + Union operator creates 1 directory for each leg 
> of the union
> Supposedly Hive can read a table by picking all files recursively.  
> Can it also write (other than the CTAS example above) arbitrarily?
> Does it mean an Acid write can also write anywhere?
> Figure out what can be supported and how the existing layout can be checked.  
> Examining a full "ls -l -R" for a large table could be expensive. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16979) Cache UGI for metastore

2017-07-31 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107734#comment-16107734
 ] 

Tao Li commented on HIVE-16979:
---

All test failures are due to flaky tests tracked in HIVE-15058.

[~thejas] Can you please take a look at this change?

> Cache UGI for metastore
> ---
>
> Key: HIVE-16979
> URL: https://issues.apache.org/jira/browse/HIVE-16979
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, 
> HIVE-16979.3.patch
>
>
> FileSystem.closeAllForUGI is called per request against the metastore to 
> dispose of the UGI, which involves talking to the HDFS name node and is time 
> consuming. So the perf improvement would be caching and reusing the UGI.
> Each FileSystem.closeAllForUGI call could take up to 20 ms of E2E latency 
> against HDFS. Usually a Hive query results in several calls against the 
> metastore, so we can save up to 50-100 ms per Hive query.
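
The caching idea above can be sketched as follows, in Python for brevity. This is an assumption-laden illustration, not the attached patch: `HandleCache` and its `factory` argument are hypothetical stand-ins for however the metastore obtains a UGI for a user, and eviction or expiry of cached entries is deliberately left out.

```python
import threading


class HandleCache:
    """Reuse one per-user handle instead of creating and disposing one per request."""

    def __init__(self, factory):
        self._factory = factory   # hypothetical: builds a handle for a user name
        self._handles = {}
        self._lock = threading.Lock()

    def get(self, user):
        # The lock keeps two concurrent requests from building the same handle twice.
        with self._lock:
            if user not in self._handles:
                self._handles[user] = self._factory(user)
            return self._handles[user]
```

With a cache like this, the expensive create and dispose cycle is paid once per user rather than once per metastore call, which is where the per-query saving estimated above would come from.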



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17217) Bucket Map Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-17217:
-


> Bucket Map Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> -
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17217) Bucket Map Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-07-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17217 started by Deepak Jaiswal.
-
> Bucket Map Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> -
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17216) Additional qtests for HoS DPP

2017-07-31 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17216:
---


> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17202) Add InterfaceAudience and InterfaceStability annotations for HMS Listener APIs

2017-07-31 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17202:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review [~spena], committed to master.

> Add InterfaceAudience and InterfaceStability annotations for HMS Listener APIs
> --
>
> Key: HIVE-17202
> URL: https://issues.apache.org/jira/browse/HIVE-17202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17202.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17113) Duplicate bucket files can get written to table by runaway task

2017-07-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107703#comment-16107703
 ] 

Ashutosh Chauhan commented on HIVE-17113:
-

+1

> Duplicate bucket files can get written to table by runaway task
> ---
>
> Key: HIVE-17113
> URL: https://issues.apache.org/jira/browse/HIVE-17113
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17113.1.patch, HIVE-17113.2.patch, 
> HIVE-17113.3.patch
>
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the 
> following happened:
> 1. Task attempt A_0 starts, but then stops making progress
> 2. The job was running with speculative execution on, and task attempt A_1 is 
> started
> 3. Task attempt A_1 finishes execution and saves its output to the temp 
> directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls 
> Utilities.removeTempOrDuplicateFiles() to check for duplicate bucket files
> 7. A_0 (still running) finally finishes and saves its file to the temp 
> directory. At this point we now have duplicate bucket files - oops!
> 8. Utilities.removeTempOrDuplicateFiles() moves the temp directory to the 
> final location, where it is later moved to the partition directory.
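
The ordering problem in the steps above can be sketched as follows. This is a Python illustration under assumptions, not Hive's actual Utilities.removeTempOrDuplicateFiles(): the function name, the `<task>_<attempt>` file naming, and the rule that the highest attempt number wins are all hypothetical.

```python
def remove_duplicates(temp_files):
    """Keep one file per task id; names are assumed to look like '<task>_<attempt>'."""
    kept = {}
    for name in temp_files:
        task_id, _, attempt = name.rpartition("_")
        # Assumed rule: the highest attempt number for a task wins.
        if task_id not in kept or int(attempt) > kept[task_id][0]:
            kept[task_id] = (int(attempt), name)
    return sorted(name for _, name in kept.values())


# Step 6: the duplicate check runs while only A_1's output exists.
checked = remove_duplicates(["000001_1"])
# Step 7: the runaway attempt A_0 writes its file after the check has finished.
temp_dir = checked + ["000001_0"]
# Step 8: the whole directory is moved, so both bucket files reach the table.
```

The check itself is correct when it sees both files; the duplicate survives only because the late write lands between the check and the move.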



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17113) Duplicate bucket files can get written to table by runaway task

2017-07-31 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107690#comment-16107690
 ] 

Jason Dere commented on HIVE-17113:
---

[~ashutoshc] can you review this one?

> Duplicate bucket files can get written to table by runaway task
> ---
>
> Key: HIVE-17113
> URL: https://issues.apache.org/jira/browse/HIVE-17113
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17113.1.patch, HIVE-17113.2.patch, 
> HIVE-17113.3.patch
>
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the 
> following happened:
> 1. Task attempt A_0 starts, but then stops making progress
> 2. The job was running with speculative execution on, and task attempt A_1 is 
> started
> 3. Task attempt A_1 finishes execution and saves its output to the temp 
> directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls 
> Utilities.removeTempOrDuplicateFiles() to check for duplicate bucket files
> 7. A_0 (still running) finally finishes and saves its file to the temp 
> directory. At this point we now have duplicate bucket files - oops!
> 8. Utilities.removeTempOrDuplicateFiles() moves the temp directory to the 
> final location, where it is later moved to the partition directory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17215) Streaming Ingest API writing unbucketed tables

2017-07-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17215:
-


> Streaming Ingest API writing unbucketed tables
> --
>
> Key: HIVE-17215
> URL: https://issues.apache.org/jira/browse/HIVE-17215
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Currently the API expects the target table to be bucketed.
> It creates 1 writer per bucket per connection/partition.
> The simplest approach is to allow the API to create a single writer for 
> unbucketed tables.  
> If this doesn't provide enough write throughput, the client can create 
> another connection.
> Could add a parameter to the API to specify writer parallelism for unbucketed 
> tables.  If it's set to 2 for example, the writer will write delta_x_y_ 
> and delta_x_y_1 using statementId.  Maybe as a followup.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17214) check/fix conversion of non-acid to acid

2017-07-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17214:
-


> check/fix conversion of non-acid to acid
> 
>
> Key: HIVE-17214
> URL: https://issues.apache.org/jira/browse/HIVE-17214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> bucketed tables have stricter rules for file layout on disk - bucket files 
> are direct children of a partition directory.
> for un-bucketed tables I'm not sure there are any rules
> for example, CTAS with Tez + Union operator creates 1 directory for each leg 
> of the union
> Supposedly Hive can read a table by picking all files recursively.  
> Can it also write (other than the CTAS example above) arbitrarily?
> Does it mean an Acid write can also write anywhere?
> Figure out what can be supported and how the existing layout can be checked.  
> Examining a full "ls -l -R" for a large table could be expensive. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.0.patch

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Status: Patch Available  (was: Open)

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17213) HoS: file merging doesn't work for union all

2017-07-31 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned HIVE-17213:
---


> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17205) add functional support

2017-07-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
Attachment: HIVE-17205.02.patch

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107621#comment-16107621
 ] 

Mithun Radhakrishnan commented on HIVE-16908:
-

After co-opting this JIRA so that I could provide an alternative fix, I've 
reset ownership back to [~sbeeram].

Rather than rewrite the test to launch the target metastore in a separate 
process, I have reworded the failing tests without changing their intention. In 
the places where calls to the source and target metastores are interleaved, I 
fetch a new HMS client.

Fixing the static-state in the metastore is a larger problem that should be 
addressed separately.

[~sbeeram], I wonder if you might find this satisfactory.

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch, 
> HIVE-16908.3.patch
>
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which 
> results in the PersistenceManagerFactory being closed, and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107622#comment-16107622
 ] 

Hive QA commented on HIVE-16845:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879647/HIVE-16845.4.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11021 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6198/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6198/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6198/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879647 - PreCommit-HIVE-Build

> INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
> -
>
> Key: HIVE-16845
> URL: https://issues.apache.org/jira/browse/HIVE-16845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, 
> HIVE-16845.3.patch, HIVE-16845.4.patch
>
>
> *How to reproduce*
> - Create a partitioned table on S3:
> {noformat}
> CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string 
> COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION 
> 's3a://'; 
> {noformat}
> - Create a temp table:
> {noformat}
> create table tmp_table (id string, name string, date string, pid int) row 
> format delimited fields terminated by '\t' lines terminated by '\n' stored as 
> textfile;
> {noformat}
> - Load the following rows to the tmp table:
> {noformat}
> u1  value1  2017-04-10  1
> u2  value2  2017-04-10  1
> u3  value3  2017-04-10  10001
> {noformat}
> - Set the following parameters:
> -- hive.exec.dynamic.partition.mode=nonstrict
> -- mapreduce.input.fileinputformat.split.maxsize=10
> -- hive.blobstore.optimizations.enabled=true
> -- hive.blobstore.use.blobstore.as.scratchdir=false
> -- hive.merge.mapfiles=true
> - Insert the rows from the temp table into the s3 table:
> {noformat}
> INSERT OVERWRITE TABLE s3table
> PARTITION (reported_date, product_id)
> SELECT
>   t.id as user_id,
>   t.name as event_name,
>   t.date as reported_date,
>   t.pid as product_id
> FROM tmp_table t;
> {noformat}
> An NPE will occur with the following stacktrace:
> {noformat}
> 2017-05-08 21:32:50,607 ERROR 
> org.apache.hive.service.cli.operation.Operation: 
> [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.ConditionalTask. null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> 

[jira] [Assigned] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-16908:
---

Assignee: Sunitha Beeram  (was: Mithun Radhakrishnan)

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch, 
> HIVE-16908.3.patch
>
>
> Some of the tests in TestHCatClient.java, for ex:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which 
> closes the PersistenceManagerFactory, and hence the tests fail.





[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-07-31 Thread sarun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107598#comment-16107598
 ] 

sarun commented on HIVE-17115:
--

[~erik.fang] Can you please provide the table details?

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. The Thrift client hangs, and an 
> exception such as the following appears in the HiveMetaStore service's log:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
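
A note on why a trace like this can escape error handling: the top-level throwable is a NoClassDefFoundError, which is a LinkageError (an Error), not an Exception, so a plain catch (Exception e) does not see it. The sketch below is only an illustration of that Java behaviour; the class and method names are hypothetical and it is not the code in MetaStoreUtils or in the attached patches.

```java
/** Hypothetical sketch; none of these names come from Hive. */
final class LinkageErrorSketch {

    /** Stand-in for a SerDe initialization that fails because a dependency class is missing. */
    static void initializeSerDe() {
        throw new NoClassDefFoundError("org/apache/hadoop/hbase/util/Bytes");
    }

    /** Catches only Exception, so the Error thrown above propagates past the inner handler. */
    static String narrowCatch() {
        try {
            try {
                initializeSerDe();
                return "ok";
            } catch (Exception e) {
                return "handled by catch (Exception)";
            }
        } catch (Throwable t) {
            // Stands in for the worker thread's top level, where the Error finally surfaces.
            return "escaped as " + t.getClass().getSimpleName();
        }
    }

    /** Also catches LinkageError, so the failure can be turned into a normal error response. */
    static String wideCatch() {
        try {
            initializeSerDe();
            return "ok";
        } catch (Exception | LinkageError e) {
            return "converted: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(narrowCatch());
        System.out.println(wideCatch());
    }
}
```

Whether widening the catch is the right fix for the metastore is a separate question; the sketch only shows that the narrower handler never runs for this kind of failure.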





[jira] [Resolved] (HIVE-17173) Add some convenience redirects to the Hive site

2017-07-31 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-17173.
--
Resolution: Fixed

> Add some convenience redirects to the Hive site
> ---
>
> Key: HIVE-17173
> URL: https://issues.apache.org/jira/browse/HIVE-17173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> I'd propose that we add the following redirects to our site's .htaccess:
> * http://hive.apache.org/bugs -> https://issues.apache.org/jira/browse/hive
> * http://hive.apache.org/downloads -> 
> https://www.apache.org/dyn/closer.cgi/hive/
> * http://hive.apache.org/releases -> 
> https://hive.apache.org/docs/downloads.html
> * http://hive.apache.org/src -> https://github.com/apache/hive
> * http://hive.apache.org/web-src -> 
> https://svn.apache.org/repos/asf/hive/cms/trunk
> Thoughts?





[jira] [Closed] (HIVE-17173) Add some convenience redirects to the Hive site

2017-07-31 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley closed HIVE-17173.


I committed this to subversion.

> Add some convenience redirects to the Hive site
> ---
>
> Key: HIVE-17173
> URL: https://issues.apache.org/jira/browse/HIVE-17173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> I'd propose that we add the following redirects to our site's .htaccess:
> * http://hive.apache.org/bugs -> https://issues.apache.org/jira/browse/hive
> * http://hive.apache.org/downloads -> 
> https://www.apache.org/dyn/closer.cgi/hive/
> * http://hive.apache.org/releases -> 
> https://hive.apache.org/docs/downloads.html
> * http://hive.apache.org/src -> https://github.com/apache/hive
> * http://hive.apache.org/web-src -> 
> https://svn.apache.org/repos/asf/hive/cms/trunk
> Thoughts?





[jira] [Resolved] (HIVE-17171) Remove old javadoc versions

2017-07-31 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-17171.
--
Resolution: Fixed

OK, I committed the change. I also updated the site to point to the archived 
javadocs for the older versions. I removed the links for the *really* old 
versions: 0.10, 0.11, and hcat-0.5.

> Remove old javadoc versions
> ---
>
> Key: HIVE-17171
> URL: https://issues.apache.org/jira/browse/HIVE-17171
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> We currently have a lot of old javadoc versions. I'd propose that we keep the 
> following versions:
> * r1.2.2
> * r2.1.1
> * r2.2.0
> (Note that 2.3.0 was not checked in to the site.) In particular, I'd suggest 
> we remove:
> * hcat-r0.5.0
> * r0.10.0
> * r0.11.0
> * r0.12.0
> * r0.13.1
> * r1.0.1
> * r1.1.1
> * r2.0.1
> Any concerns?





[jira] [Closed] (HIVE-17171) Remove old javadoc versions

2017-07-31 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley closed HIVE-17171.


> Remove old javadoc versions
> ---
>
> Key: HIVE-17171
> URL: https://issues.apache.org/jira/browse/HIVE-17171
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> We currently have a lot of old javadoc versions. I'd propose that we keep the 
> following versions:
> * r1.2.2
> * r2.1.1
> * r2.2.0
> (Note that 2.3.0 was not checked in to the site.) In particular, I'd suggest 
> we remove:
> * hcat-r0.5.0
> * r0.10.0
> * r0.11.0
> * r0.12.0
> * r0.13.1
> * r1.0.1
> * r1.1.1
> * r2.0.1
> Any concerns?





[jira] [Commented] (HIVE-17167) Create metastore specific configuration tool

2017-07-31 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107528#comment-16107528
 ] 

Alan Gates commented on HIVE-17167:
---

The only way to enforce this in code is to subclass Configuration, which I 
don't want to do as it makes it impossible to operate on existing HiveConf 
objects.  I could make it so that when one of the MetastoreConf.getX methods is 
called, it checks to see if both the metastore key and the hive key are set and 
to different values, and throws if so.  But this doesn't prevent people using 
just Configuration get and set methods and screwing things up.
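
To make the check described above concrete, here is a minimal sketch. It assumes java.util.Properties as a stand-in for Hadoop's Configuration so that it runs without Hadoop on the classpath; the class name, the getVar method, and the key names are all hypothetical and are not the MetastoreConf API from the patch.

```java
import java.util.Properties;

/** Hypothetical sketch; the class, method, and key names are not Hive's actual API. */
final class MetastoreConfSketch {

    /**
     * Looks a setting up under its metastore-specific key and its legacy Hive key.
     * Throws if both keys are set and their values disagree.
     */
    static String getVar(Properties conf, String metastoreKey, String hiveKey, String defaultValue) {
        String metastoreVal = conf.getProperty(metastoreKey);
        String hiveVal = conf.getProperty(hiveKey);
        if (metastoreVal != null && hiveVal != null && !metastoreVal.equals(hiveVal)) {
            throw new IllegalStateException("Conflicting values: " + metastoreKey + "="
                + metastoreVal + " but " + hiveKey + "=" + hiveVal);
        }
        if (metastoreVal != null) {
            return metastoreVal;
        }
        return hiveVal != null ? hiveVal : defaultValue;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hive.example.key", "a");
        System.out.println(getVar(conf, "metastore.example.key", "hive.example.key", "none"));

        conf.setProperty("metastore.example.key", "b");
        try {
            getVar(conf, "metastore.example.key", "hive.example.key", "none");
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

As noted in the comment, a check like this only runs inside the accessor; code that calls the plain get and set methods directly never reaches it.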

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.





[jira] [Updated] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-07-31 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17212:

Status: Patch Available  (was: Open)

> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17212.01.patch
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> Generally, an insert operation generates an INSERT event to announce the 
> operation along with its new data files.
> In this case, Hive should generate only ADD_PARTITION events with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.





[jira] [Updated] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-07-31 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17212:

Attachment: HIVE-17212.01.patch

Added 01.patch with a test case to verify that a dynamic add partition through 
insert doesn't generate an INSERT event. The code already handles this, so I 
just added a test case.

Request [~daijy]/[~anishek]/[~thejas] to please review!

> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17212.01.patch
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> Generally, an insert operation generates an INSERT event to announce the 
> operation along with its new data files.
> In this case, Hive should generate only ADD_PARTITION events with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.





[jira] [Commented] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-07-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107472#comment-16107472
 ] 

Hive QA commented on HIVE-15305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879639/HIVE-15305.1.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10648 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=233)
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver
 (batchId=90)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.hcatalog.listener.TestTransactionalDbNotificationListener.sqlInsertPartition
 (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6197/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6197/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6197/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879639 - PreCommit-HIVE-Build

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 





[jira] [Updated] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-07-31 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17212:

Description: 
A partition is dynamically added if INSERT INTO is invoked on a non-existing 
partition.
Generally, an insert operation generates an INSERT event to announce the 
operation along with its new data files.
In this case, Hive should generate only ADD_PARTITION events with the new files 
added. It shouldn't create an INSERT event.
Need to test and verify this behaviour.

  was:
A partition is dynamically added if INSERT INTO is invoked on a non-existing 
partition.
In this case, Hive should generate only ADD_PARTITION events with the new files 
added. It shouldn't create INSERT event.
Need to test and verify this behaviour.


> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> Generally, an insert operation generates an INSERT event to announce the 
> operation along with its new data files.
> In this case, Hive should generate only ADD_PARTITION events with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.





[jira] [Assigned] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-07-31 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-17212:
---


> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> In this case, Hive should generate only ADD_PARTITION events with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.





[jira] [Commented] (HIVE-16759) Add table type information to HMS log notifications

2017-07-31 Thread Janaki Lahorani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107444#comment-16107444
 ] 

Janaki Lahorani commented on HIVE-16759:


Hi [~vihangk1], I uploaded a patch with conflicts resolved.  I would be 
grateful if you can review.  Thanks.

> Add table type information to HMS log notifications
> ---
>
> Key: HIVE-16759
> URL: https://issues.apache.org/jira/browse/HIVE-16759
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Sergio Peña
>Assignee: Janaki Lahorani
> Attachments: HIVE16759.1.patch, HIVE16759.2.patch, HIVE16759.3.patch, 
> HIVE16759.3.patch, HIVE16759.4.patch, HIVE-16759-branch-2.01.patch
>
>
> The DB notifications used by HiveMetaStore should include the table type for 
> all notifications that include table events, such as create, drop and alter 
> table.
> This would be useful for consumers to identify views vs tables.





[jira] [Updated] (HIVE-16759) Add table type information to HMS log notifications

2017-07-31 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-16759:
---
Attachment: HIVE-16759-branch-2.01.patch

> Add table type information to HMS log notifications
> ---
>
> Key: HIVE-16759
> URL: https://issues.apache.org/jira/browse/HIVE-16759
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Sergio Peña
>Assignee: Janaki Lahorani
> Attachments: HIVE16759.1.patch, HIVE16759.2.patch, HIVE16759.3.patch, 
> HIVE16759.3.patch, HIVE16759.4.patch, HIVE-16759-branch-2.01.patch
>
>
> The DB notifications used by HiveMetaStore should include the table type for 
> all notifications that include table events, such as create, drop and alter 
> table.
> This would be useful for consumers to identify views vs tables.





[jira] [Commented] (HIVE-17072) Make the parallelized timeout configurable in BeeLine tests

2017-07-31 Thread Marta Kuczora (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107429#comment-16107429
 ] 

Marta Kuczora commented on HIVE-17072:
--

Thanks a lot [~pvary] for committing the patch. I updated the documentation as 
well.

> Make the parallelized timeout configurable in BeeLine tests
> ---
>
> Key: HIVE-17072
> URL: https://issues.apache.org/jira/browse/HIVE-17072
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17072.1.patch, HIVE-17072.2.patch
>
>
> When running the BeeLine tests in parallel, the timeout is hardcoded in 
> Parallelized.java:
> {noformat}
> @Override
> public void finished() {
>   executor.shutdown();
>   try {
>     executor.awaitTermination(10, TimeUnit.MINUTES);
>   } catch (InterruptedException exc) {
>     throw new RuntimeException(exc);
>   }
> }
> {noformat}
> It would be better to make it configurable.
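
One way the hardcoded value could be made configurable is sketched below. This is only an illustration under stated assumptions: the system property name, the class name, and the fallback behaviour are hypothetical and are not taken from the committed patch or its documentation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; the property name and class are not from the Hive patch. */
final class ConfigurableTimeoutSketch {

    /** Assumed property name, chosen only for this example. */
    static final String TIMEOUT_PROPERTY = "example.parallel.timeout.minutes";

    /** Same value as the hardcoded timeout quoted above. */
    static final long DEFAULT_TIMEOUT_MINUTES = 10;

    /** Reads the timeout from a system property, falling back to the default on bad input. */
    static long timeoutMinutes() {
        String raw = System.getProperty(TIMEOUT_PROPERTY);
        if (raw == null) {
            return DEFAULT_TIMEOUT_MINUTES;
        }
        try {
            long parsed = Long.parseLong(raw.trim());
            return parsed > 0 ? parsed : DEFAULT_TIMEOUT_MINUTES;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT_MINUTES;
        }
    }

    /** Shuts the executor down and waits up to the configured timeout; true if it terminated. */
    static boolean finished(ExecutorService executor) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMinutes(), TimeUnit.MINUTES);
        } catch (InterruptedException exc) {
            Thread.currentThread().interrupt(); // preserve the interrupt status for callers
            throw new RuntimeException(exc);
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.submit(() -> System.out.println("test task"));
        System.out.println("terminated in time: " + finished(executor));
    }
}
```

With this shape the timeout could be overridden at launch, for example with -Dexample.parallel.timeout.minutes=40, while runs that set nothing keep the current 10 minutes.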





[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition

2017-07-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107423#comment-16107423
 ] 

Ashutosh Chauhan commented on HIVE-17148:
-

[~allgoodok] Can you add a testcase with your patch?

> Incorrect result for Hive join query with COALESCE in WHERE condition
> -
>
> Key: HIVE-17148
> URL: https://issues.apache.org/jira/browse/HIVE-17148
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.1
>Reporter: Vlad Gudikov
>Assignee: Vlad Gudikov
> Attachments: HIVE-17148.patch
>
>
> The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo 
> enabled:
> STEPS TO REPRODUCE:
> {code}
> Step 1: Create a table ct1
> create table ct1 (a1 string,b1 string);
> Step 2: Create a table ct2
> create table ct2 (a2 string);
> Step 3 : Insert following data into table ct1
> insert into table ct1 (a1) values ('1');
> Step 4 : Insert following data into table ct2
> insert into table ct2 (a2) values ('1');
> Step 5 : Execute the following query 
> select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2;
> {code}
> ACTUAL RESULT:
> {code}
> The query returns nothing;
> {code}
> EXPECTED RESULT:
> {code}
> 1   NULL   1
> {code}
> The issue seems to be caused by an incorrect query plan. In the plan we can 
> see:
> predicate:(a1 is not null and b1 is not null)
> which does not look correct. As a result, it filters out every row in which 
> any column mentioned in the COALESCE has a null value.
> Please find the query plan below:
> {code}
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1
>   File Output Operator [FS_10]
> Map Join Operator [MAPJOIN_15] (rows=1 width=4)
>   
> Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE]
>   BROADCAST [RS_7]
> PartitionCols:_col0
> Select Operator [SEL_5] (rows=1 width=1)
>   Output:["_col0"]
>   Filter Operator [FIL_14] (rows=1 width=1)
> predicate:a2 is not null
> TableScan [TS_3] (rows=1 width=1)
>   default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"]
> <-Select Operator [SEL_2] (rows=1 width=4)
> Output:["_col0","_col1"]
> Filter Operator [FIL_13] (rows=1 width=4)
>   predicate:(a1 is not null and b1 is not null)
>   TableScan [TS_0] (rows=1 width=4)
> default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"]
> {code}
> This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, 
> which creates this problem, is skipped.
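
The effect of the predicate in the plan above can be checked on the single row from the reproduction steps (a1 = '1', b1 = NULL, a2 = '1'). The sketch below models SQL NULL as Java null; the class and method names are hypothetical, and the "safe" filter shown is only my illustration of a predicate that preserves the join result, not necessarily what the attached patch generates.

```java
/** Hypothetical sketch modelling SQL NULL as Java null; not Hive or Calcite code. */
final class CoalesceFilterSketch {

    /** Returns the first non-null argument, or null if all are null (SQL COALESCE). */
    static String coalesce(String... values) {
        for (String v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    /** The filter shown in the plan: every COALESCE argument must be non-null. */
    static boolean planFilter(String a1, String b1) {
        return a1 != null && b1 != null;
    }

    /** A filter that only drops rows the inner join could never match. */
    static boolean safeFilter(String a1, String b1) {
        return coalesce(a1, b1) != null;
    }

    /** The join condition COALESCE(a1, b1) = a2, which is false when either side is NULL. */
    static boolean joinCondition(String a1, String b1, String a2) {
        String left = coalesce(a1, b1);
        return left != null && a2 != null && left.equals(a2);
    }

    public static void main(String[] args) {
        String a1 = "1";
        String b1 = null;
        String a2 = "1";
        System.out.println("join condition holds: " + joinCondition(a1, b1, a2));
        System.out.println("row survives plan filter: " + planFilter(a1, b1));
        System.out.println("row survives safe filter: " + safeFilter(a1, b1));
    }
}
```

The row satisfies the join condition but is removed by the plan's filter before the join runs, which matches the empty result reported above.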





[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

2017-07-31 Thread Marta Kuczora (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107406#comment-16107406
 ] 

Marta Kuczora commented on HIVE-16845:
--

[~stakiar], I believe there wasn't. There were only 5 failing tests in the 
"13/Jul/17 14:42" run and all of them were known flaky tests. The patch in the 
"13/Jul/17 18:57" run was the same, I just removed an empty file which was 
accidentally left in the patch. So the additional failing tests should not be 
due to the patch. 
But since it was 2 weeks ago, I think it is better to rerun the tests, so I 
uploaded the patch again.

> INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
> -
>
> Key: HIVE-16845
> URL: https://issues.apache.org/jira/browse/HIVE-16845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, 
> HIVE-16845.3.patch, HIVE-16845.4.patch
>
>
> *How to reproduce*
> - Create a partitioned table on S3:
> {noformat}
> CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string 
> COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION 
> 's3a://'; 
> {noformat}
> - Create a temp table:
> {noformat}
> create table tmp_table (id string, name string, date string, pid int) row 
> format delimited fields terminated by '\t' lines terminated by '\n' stored as 
> textfile;
> {noformat}
> - Load the following rows to the tmp table:
> {noformat}
> u1  value1  2017-04-10  1
> u2  value2  2017-04-10  1
> u3  value3  2017-04-10  10001
> {noformat}
> - Set the following parameters:
> -- hive.exec.dynamic.partition.mode=nonstrict
> -- mapreduce.input.fileinputformat.split.maxsize=10
> -- hive.blobstore.optimizations.enabled=true
> -- hive.blobstore.use.blobstore.as.scratchdir=false
> -- hive.merge.mapfiles=true
> - Insert the rows from the temp table into the s3 table:
> {noformat}
> INSERT OVERWRITE TABLE s3table
> PARTITION (reported_date, product_id)
> SELECT
>   t.id as user_id,
>   t.name as event_name,
>   t.date as reported_date,
>   t.pid as product_id
> FROM tmp_table t;
> {noformat}
> An NPE will occur with the following stacktrace:
> {noformat}
> 2017-05-08 21:32:50,607 ERROR 
> org.apache.hive.service.cli.operation.Operation: 
> [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.ConditionalTask. null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290)
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> ... 11 more 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

2017-07-31 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-16845:
-
Attachment: HIVE-16845.4.patch






[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-07-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107393#comment-16107393
 ] 

Sergio Peña commented on HIVE-16886:


[~anishek] Got it, I will run some tests with an Oracle DB as well to see if we 
get the same behavior. Btw, the above example has a duplicated EVENT_ID with a 
different NL_ID. Here are the lines:
{noformat}
|  5432 | 5097 | 1501109698 | CREATE_TABLE | metastore_test_db_HIVE_HIVEMETASTORE_2 |
|  5437 | 5097 | 1501109698 | CREATE_TABLE | metastore_test_db_HIVE_HIVEMETASTORE_1 |
{noformat}

Each {{HIVE_HIVEMETASTORE_N}} suffix marks a row written by a different HMS. In 
this case, HMS 1 and HMS 2 wrote the same event ID.

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>
> When multiple Hive Metastore servers are running and DB notifications are 
> enabled, notifications can be persisted with a duplicated event ID. 
> This does not happen with multiple threads in a single HMS node, because of 
> the locking acquired in the DbNotificationsLog class, but multiple HMS 
> instances can conflict.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, both write a new notification with that same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException,
>     InterruptedException {
>   final int NUM_THREADS = 2;
>   // countIn: every worker is ready; countOut: release all workers at once.
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   final HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS,
>       MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>       @Override
>       public Void call() throws Exception {
>         // Each worker uses its own ObjectStore, like a separate HMS would.
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0,
>                 EventMessage.EventType.CREATE_DATABASE.toString(),
>                 "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse =
>       objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails: it expects event ID 2 but gets 1. 
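
The read, increment, write-back sequence described in the issue can be replayed deterministically without a metastore. The following is only a minimal sketch: the Store class, its methods, and the class name EventIdRaceSketch are hypothetical stand-ins for the datastore row holding the last event ID, not Hive's ObjectStore API. It replays the interleaving in which both servers read before either writes, and contrasts it with an atomic increment, which is one possible way to avoid the duplicate and not necessarily the fix adopted for this Jira.

```java
import java.util.Arrays;

public final class EventIdRaceSketch {

    /** Hypothetical stand-in for the datastore cell holding the last event ID. */
    static final class Store {
        private long lastId = 0;

        long read() { return lastId; }

        void write(long id) { lastId = id; }

        /** Read and increment as one indivisible step. */
        synchronized long incrementAndGet() { return ++lastId; }
    }

    /** Replays the problematic interleaving: both servers read before either writes. */
    static long[] racyIds() {
        Store store = new Store();
        long seenByA = store.read();   // server A reads 0
        long seenByB = store.read();   // server B also reads 0
        long idA = seenByA + 1;        // A increments locally
        store.write(idA);              // A writes back 1
        long idB = seenByB + 1;        // B increments its stale copy
        store.write(idB);              // B writes back 1 again
        return new long[] {idA, idB};
    }

    /** Same two writers, but the ID is assigned by an atomic increment. */
    static long[] atomicIds() {
        Store store = new Store();
        return new long[] {store.incrementAndGet(), store.incrementAndGet()};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(racyIds()));   // prints [1, 1]
        System.out.println(Arrays.toString(atomicIds())); // prints [1, 2]
    }
}
```

In a real deployment the atomic step would have to live in the shared database rather than in one JVM, since the in-process lock is exactly what already protects a single HMS.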





[jira] [Commented] (HIVE-14903) from_utc_time function issue for CET daylight savings

2017-07-31 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107383#comment-16107383
 ] 

Barna Zsombor Klara commented on HIVE-14903:


It seems this is already fixed on the master branch:

{code}
select from_utc_timestamp('2016-10-30 00:30:00','CET');
+------------------------+
|          _c0           |
+------------------------+
| 2016-10-30 02:30:00.0  |
+------------------------+
{code}

If you can confirm that this is the case, could you please resolve the Jira?

> from_utc_time function issue for CET daylight savings
> -
>
> Key: HIVE-14903
> URL: https://issues.apache.org/jira/browse/HIVE-14903
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.1
>Reporter: Eric Lin
>Priority: Minor
>
> Based on https://en.wikipedia.org/wiki/Central_European_Summer_Time, summer 
> time runs from 1:00 UTC on the last Sunday of March to 1:00 UTC on the 
> last Sunday of October. See the test case below:
> Impala:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> Query: select from_utc_timestamp('2016-10-30 00:30:00','CET')
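> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 00:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 01:30:00                              |
> +--------------------------------------------------+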
> {code}
> Hive:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> INFO  : OK
> +------------------------+--+
> |          _c0           |
> +------------------------+--+
> | 2016-10-30 01:30:00.0  |
> +------------------------+--+
> {code}
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' );
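> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+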
> {code}
> At 00:30 UTC, daylight saving time has not yet ended, so the time difference 
> should still be 2 hours rather than 1. MySQL returned the correct result.
> At 01:30 UTC, the results are correct:
> Impala:
> {code}
> Query: select from_utc_timestamp('2016-10-30 01:30:00','CET')
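> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 01:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 02:30:00                              |
> +--------------------------------------------------+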
> Fetched 1 row(s) in 0.01s
> {code}
> Hive:
> {code}
> +------------------------+--+
> |          _c0           |
> +------------------------+--+
> | 2016-10-30 02:30:00.0  |
> +------------------------+--+
> 1 row selected (0.252 seconds)
> {code}
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' );
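> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+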
> 1 row in set (0.00 sec)
> {code}
> Seems like a bug.
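
The expected values in the report can be cross-checked outside Hive with the JDK's own time zone data. This is only a sketch using java.time; the class CetConversionSketch and its fromUtc helper are hypothetical names for illustration and say nothing about how Hive implements from_utc_timestamp. It interprets the input as a UTC instant and renders it in the given zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public final class CetConversionSketch {

    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Treats the input as a UTC wall-clock time and renders it in the target zone. */
    static String fromUtc(String utcTimestamp, String zone) {
        return LocalDateTime.parse(utcTimestamp, FMT)
            .atOffset(ZoneOffset.UTC)             // pin the instant to UTC
            .atZoneSameInstant(ZoneId.of(zone))   // same instant, target zone rules
            .format(FMT);
    }

    public static void main(String[] args) {
        // Before 01:00 UTC on 2016-10-30, summer time (+02:00) still applies.
        System.out.println(fromUtc("2016-10-30 00:30:00", "CET")); // 2016-10-30 02:30:00
        // After 01:00 UTC, standard time (+01:00) applies.
        System.out.println(fromUtc("2016-10-30 01:30:00", "CET")); // 2016-10-30 02:30:00
    }
}
```

Both inputs map to the same local time because the hour from 02:00 to 03:00 occurs twice when summer time ends, which agrees with the MySQL output quoted above.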




