[jira] [Commented] (HIVE-15082) Hive-1.2 cannot read data from complex data types with TIMESTAMP column, stored in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653262#comment-15653262 ] Lefty Leverenz commented on HIVE-15082: --- [~osayankin], it looks like you swapped the locations of patch-num and branch-name on patch 2, unless you meant it to be for branch-1.2 (but it still needs a patch-num). > Hive-1.2 cannot read data from complex data types with TIMESTAMP column, > stored in Parquet > -- > > Key: HIVE-15082 > URL: https://issues.apache.org/jira/browse/HIVE-15082 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Blocker > Attachments: HIVE-15082-branch-1.2.patch, HIVE-15082-branch-1.patch > > > *STEP 1. Create test data* > {code:sql} > select * from dual; > {code} > *EXPECTED RESULT:* > {noformat} > Pretty_UnIQUe_StrinG > {noformat} > {code:sql} > create table test_parquet1(login timestamp) stored as parquet; > insert overwrite table test_parquet1 select from_unixtime(unix_timestamp()) > from dual; > select * from test_parquet1 limit 1; > {code} > *EXPECTED RESULT:* > No exceptions. Current timestamp as result. > {noformat} > 2016-10-27 10:58:19 > {noformat} > *STEP 2. Store timestamp in array in parquet file* > {code:sql} > create table test_parquet2(x array<timestamp>) stored as parquet; > insert overwrite table test_parquet2 select array(login) from test_parquet1; > select * from test_parquet2; > {code} > *EXPECTED RESULT:* > No exceptions. Current timestamp in brackets as result. 
> {noformat} > ["2016-10-27 10:58:19"] > {noformat} > *ACTUAL RESULT:* > {noformat} > ERROR [main]: CliDriver (SessionState.java:printError(963)) - Failed with > exception java.io.IOException:parquet.io.ParquetDecodingException: Can not > read value at 0 in block -1 in file > hdfs:///user/hive/warehouse/test_parquet2/00_0 > java.io.IOException: parquet.io.ParquetDecodingException: Can not read value > at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0 > {noformat} > *ROOT-CAUSE:* > Incorrect initialization of the {{metadata}} {{HashMap}} leaves it with a > {{null}} value in the enumeration > {{org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter}} when > executing the following line: > {code:java} > boolean skipConversion = > Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname)); > {code} > in element {{ETIMESTAMP_CONVERTER}}. > The JVM throws an NPE, so the parquet library cannot read data from the file > and in turn throws > {noformat} > java.io.IOException:parquet.io.ParquetDecodingException: Can not read value > at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0 > {noformat} > *SOLUTION:* > Perform the initialization in a separate method to avoid overriding it with a > {{null}} value in the block of code > {code:java} > if (parent != null) { > setMetadata(parent.getMetadata()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
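The root cause and solution above can be sketched in plain Java. The classes below are illustrative stand-ins, not Hive's actual converter code; only the failure pattern is taken from the description: a field correctly initialized at declaration is later overwritten with a parent's {{null}} value, so the map lookup throws the NPE (note that {{Boolean.valueOf(null)}} itself is safe and returns false).

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for the converter hierarchy described above.
class ConverterSketch {
    private Map<String, String> metadata = new HashMap<>();

    Map<String, String> getMetadata() { return metadata; }

    // Buggy pattern: blindly adopting the parent's metadata can null the field.
    void setMetadataBuggy(Map<String, String> parentMetadata) {
        this.metadata = parentMetadata; // may be null -> NPE later
    }

    // Fixed pattern: a dedicated method that never overwrites with null.
    void setMetadataFixed(Map<String, String> parentMetadata) {
        if (parentMetadata != null) {
            this.metadata = parentMetadata;
        }
    }

    boolean skipConversion() {
        // The NPE happens here when metadata itself is null; Boolean.valueOf
        // of a null String is harmless and simply yields false.
        return Boolean.valueOf(
            metadata.get("hive.parquet.timestamp.skip.conversion"));
    }
}
```

Keeping the assignment in one guarded method means no call path can leave the field {{null}}, which is the essence of the proposed solution.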
[jira] [Updated] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15137: -- Attachment: HIVE-15137.1.patch > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > Attachments: HIVE-15137.1.patch > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15137: -- Status: Patch Available (was: Open) > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > Attachments: HIVE-15137.1.patch > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653225#comment-15653225 ] Daniel Dai commented on HIVE-15137: --- Here are the instructions to reproduce: 1. set the size of the thread pool to 1 (hive.metastore.fshandler.threads=1) 2. start the metastore 3. Start HiveCli with user1, run "ALTER TABLE table1 ADD PARTITION ..." 4. Start HiveCli with user2, run "ALTER TABLE table1 ADD PARTITION ..." The owner of both partition directories is user1. The cause of the issue is that the FileSystem object from the fs cache in Warehouse.mkdirs has the wrong uid. At the time mkdirs gets the FileSystem, UserGroupInformation.getCurrentUser() is user1 in both cases. Uploaded a patch which uses doAs inside the thread pool's threads. It is hard to write a UT. Manually tested. > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
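The repro above boils down to work in a shared pool inheriting whatever identity was current when the pool (or a cached FileSystem) was set up, instead of the submitter's. A plain-Java sketch of that pattern (a ThreadLocal stands in for the UGI here; this is an analogy, not Hive or Hadoop code — the actual patch wraps the work in UserGroupInformation.doAs):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

// Plain-Java stand-in: CURRENT_USER plays the role of
// UserGroupInformation.getCurrentUser(). Pool threads inherit nothing, so the
// task must capture the submitting user's identity and re-establish it inside
// the pool thread, which is what wrapping the work in doAs achieves.
class UgiSketch {
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Buggy: reads the user inside the pool thread, where it is unset/stale.
    static String mkdirsBuggy(ExecutorService pool) throws Exception {
        return pool.submit((Callable<String>) CURRENT_USER::get).get();
    }

    // Fixed: capture the submitter's user now, restore it in the pool thread.
    static String mkdirsFixed(ExecutorService pool) throws Exception {
        final String submitter = CURRENT_USER.get(); // captured at submit time
        return pool.submit(() -> {
            CURRENT_USER.set(submitter);             // analogous to ugi.doAs(...)
            try {
                return CURRENT_USER.get();           // would own the new dir
            } finally {
                CURRENT_USER.remove();               // don't leak into next task
            }
        }).get();
    }
}
```

With a pool of size 1 (as in the repro), every submission reuses the same long-lived thread, which is why the stale identity shows up so reliably.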
[jira] [Commented] (HIVE-15101) Spark client can be stuck in RUNNING state
[ https://issues.apache.org/jira/browse/HIVE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653187#comment-15653187 ] Rui Li commented on HIVE-15101: --- Hi [~tzenmyo], thanks for your input. I'll try your scenario and see what I can find. We already take care of timeouts in the Rpc code. Adding another timeout as in the patch may do no harm, but we should at least figure out why the existing logic doesn't work as expected. > Spark client can be stuck in RUNNING state > -- > > Key: HIVE-15101 > URL: https://issues.apache.org/jira/browse/HIVE-15101 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0, 2.1.0 > Environment: Hive 2.1.0 > Spark 1.6.2 >Reporter: Satoshi Iijima >Assignee: Satoshi Iijima > Attachments: HIVE-15101.patch, hadoop-yarn-nodemanager.log, > hive.log.gz > > > When a Hive-on-Spark job is executed in a YARN environment where an UNHEALTHY > NodeManager exists, the Spark client can get stuck in RUNNING state. > thread dump: > {code} > "008ee7b6-b083-4ac9-ae1c-b6097d9bf761 main" #1 prio=5 os_prio=0 > tid=0x7f14f4013800 nid=0x3855 in Object.wait() [0x7f14fd9b1000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at java.lang.Object.wait(Object.java:502) > at > io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254) > - locked <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32) > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31) > at > org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:104) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > - locked <0xf21b8e08> (a java.lang.Class for > org.apache.hive.spark.client.SparkClientFactory) > at > 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:95) > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:67) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136) > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653132#comment-15653132 ] Lefty Leverenz commented on HIVE-12891: --- Should this be documented in the Hive wiki? It could go in the Configuration doc, although we might need a new subsection for it. * [AdminManual -- Configuration -- Configuration Variables | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfigurationVariables] > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.<init>(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
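The "Relative path in absolute URI" error above is reproducible with java.net.URI directly: a "file:" scheme plus a relative path is exactly what the URI constructor rejects, and that surfaces through org.apache.hadoop.fs.Path. The usual remedy — an assumption here, not necessarily the committed fix — is to resolve the configured value to an absolute path before building file: URIs from it:

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Demonstrates the failure mode and a normalization step that avoids it.
class TmpDirSketch {
    // Anchors a relative java.io.tmpdir (e.g. "./tmp") at user.dir so that
    // URIs built from it are well-formed.
    static String normalizeTmpDir(String configured) {
        return new File(configured).getAbsolutePath();
    }

    // The multi-argument URI constructor validates the path: with a scheme
    // present, a path not starting with "/" raises URISyntaxException
    // ("Relative path in absolute URI"), as in the stack trace above.
    static boolean isRejected(String path) {
        try {
            new URI("file", null, path, null);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }
}
```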
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653034#comment-15653034 ] Hive QA commented on HIVE-13966: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838285/HIVE-13966.5.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2059/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2059/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2059/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838285 - PreCommit-HIVE-Build > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Mohit Sabharwal >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.4.patch, HIVE-13966.4.patch, > HIVE-13966.5.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation fails (in step 2), we still add an entry to the notification > log. Found this issue in testing. > This is still ok, as it is a false positive. > If the operation succeeds but adding to the notification log fails, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we do not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15101) Spark client can be stuck in RUNNING state
[ https://issues.apache.org/jira/browse/HIVE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652934#comment-15652934 ] Teruyoshi Zenmyo commented on HIVE-15101: - I have encountered the same issue in a testing environment without UNHEALTHY nodes (all of the nodes were active). I found that spark-submit.sh had failed due to a resource shortage (spark.driver.memory > yarn.scheduler.maximum-allocation-mb). The server-side timeout does not seem to take effect when spark-submit.sh fails, so the patch's client-side timeout would make this safer. > Spark client can be stuck in RUNNING state > -- > > Key: HIVE-15101 > URL: https://issues.apache.org/jira/browse/HIVE-15101 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0, 2.1.0 > Environment: Hive 2.1.0 > Spark 1.6.2 >Reporter: Satoshi Iijima >Assignee: Satoshi Iijima > Attachments: HIVE-15101.patch, hadoop-yarn-nodemanager.log, > hive.log.gz > > > When a Hive-on-Spark job is executed in a YARN environment where an UNHEALTHY > NodeManager exists, the Spark client can get stuck in RUNNING state. 
> thread dump: > {code} > "008ee7b6-b083-4ac9-ae1c-b6097d9bf761 main" #1 prio=5 os_prio=0 > tid=0x7f14f4013800 nid=0x3855 in Object.wait() [0x7f14fd9b1000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at java.lang.Object.wait(Object.java:502) > at > io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254) > - locked <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32) > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31) > at > org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:104) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > - locked <0xf21b8e08> (a java.lang.Class for > org.apache.hive.spark.client.SparkClientFactory) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:95) > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:67) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136) > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
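The hang in the stack trace above is an unbounded wait on a promise that is never completed when spark-submit dies early. The client-side timeout idea can be sketched with a plain CompletableFuture standing in for the Netty promise (illustrative only — not the patch itself, and the method name is made up for the sketch):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// If the driver process dies before completing the handshake, the promise is
// never completed: an untimed get() blocks forever (the RUNNING hang above),
// while a bounded wait turns the hang into a diagnosable error.
class ClientTimeoutSketch {
    static String awaitDriverHello(CompletableFuture<String> hello,
                                   long timeoutMs) throws Exception {
        try {
            return hello.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new RuntimeException(
                "Timed out waiting for the remote driver to connect", e);
        }
    }
}
```

As Rui Li notes above, a bounded wait is only a safety net; the open question of why the server-side timeout does not fire still deserves its own answer.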
[jira] [Updated] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13966: --- Attachment: HIVE-13966.5.patch > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Mohit Sabharwal >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.4.patch, HIVE-13966.4.patch, > HIVE-13966.5.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation fails (in step 2), we still add an entry to the notification > log. Found this issue in testing. > This is still ok, as it is a false positive. > If the operation succeeds but adding to the notification log fails, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we do not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
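The four-step sequence in the description can be sketched in plain Java (illustrative only, not the metastore API): appending to the notification log unconditionally in step 4 records events for operations that rolled back in step 3.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for steps 1-4 above. "committed" collapses openTransaction /
// operation / commit-or-rollback into one flag so the ordering problem is
// visible in isolation.
class NotificationSketch {
    final List<String> eventLog = new ArrayList<>();

    // Step 4 as described: unconditional append -> false positives on rollback.
    boolean runBuggy(String event, boolean operationSucceeds) {
        boolean committed = operationSucceeds;
        eventLog.add(event);
        return committed;
    }

    // Only log what actually committed, avoiding the false positive.
    boolean runFixed(String event, boolean operationSucceeds) {
        boolean committed = operationSucceeds;
        if (committed) {
            eventLog.add(event);
        }
        return committed;
    }
}
```

The false-negative direction (operation committed but the log write failed) is the harder half and is beyond a flag-level sketch; making the operation and the log write atomic, e.g. within one transaction, would be the natural way to close it.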
[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652892#comment-15652892 ] Ferdinand Xu commented on HIVE-15112: - A PR has been sent out for preview. The QTest part is not ready yet, and it is also pending the uncommitted patch HIVE-14815. > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Like HIVE-14815, we need to support the Parquet vectorized reader for the struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652889#comment-15652889 ] ASF GitHub Bot commented on HIVE-15112: --- GitHub user winningsix opened a pull request: https://github.com/apache/hive/pull/113 HIVE-15112 Implement Parquet vectorization reader for Struct type Patch includes: 1. support for struct type 2. UT refine To be done: QTest for struct type You can merge this pull request into a Git repository by running: $ git pull https://github.com/winningsix/hive complex_types Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/113.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #113 commit 37f50c7629b5ef2a8fb6e9f63caaec6223abf308 Author: Ferdinand XuDate: 2016-09-01T22:15:31Z HIVE-14815: Support vectorization for Parquet clean code and add qtest Refine code Clean code Clean up code Clean up clean up code Update qfile output files Clean up code Address comments Avoid creating new HiveDecimalWritable object Address more comments Remove unused imports Address further comments Fix NPE Fix for failed cases commit 891b219838e4978f2eb4d41c0016214d44cc1bb7 Author: Ferdinand Xu Date: 2016-11-07T06:10:16Z HIVE-15112: Implement Parquet vectorization reader for Complex types commit 26e513a2ac67dcfb05875e6ad7ba07f158be9073 Author: Ferdinand Xu Date: 2016-11-09T19:49:46Z Refactor UT > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Like HIVE-14815, we need support Parquet vectorized reader for struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers
[ https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652874#comment-15652874 ] Hive QA commented on HIVE-14453: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838280/HIVE-14453.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2058/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2058/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2058/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838280 - PreCommit-HIVE-Build > refactor physical writing of ORC data and metadata to FS from the logical > writers > - > > Key: HIVE-14453 > URL: https://issues.apache.org/jira/browse/HIVE-14453 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, > HIVE-14453.patch > > > ORC data doesn't have to go directly into an HDFS stream via buffers, it can > go somewhere else (e.g. a write-thru cache, or an addressable system that > doesn't require the stream blocks to be held in memory before writing them > all together). 
> To that effect, it would be nice to abstract the data block/metadata > structure creation from the physical file concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10924) add support for MERGE statement
[ https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652813#comment-15652813 ] Eugene Koifman commented on HIVE-10924: --- https://www.postgresql.org/message-id/1208372338.4259.202.ca...@ebony.site > add support for MERGE statement > --- > > Key: HIVE-10924 > URL: https://issues.apache.org/jira/browse/HIVE-10924 > Project: Hive > Issue Type: New Feature > Components: Query Planning, Query Processor, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > add support for > MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652765#comment-15652765 ] Hive QA commented on HIVE-15173: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838268/HIVE-15173.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 84 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_deep_filters] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal_native] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_pad_convert] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_10_0] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing_no_cbo] (batchId=58) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[metadata_only_queries] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_file_dump] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_decimal] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_windowing_expressions] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] (batchId=68) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_10_0] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_mapjoin] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_distinct] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_expressions] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_multipartitioning] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_navfn] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_ntile] (batchId=20) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_order_null] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_range_multiorder] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_rank] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_streaming] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_udaf] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_windowspec] (batchId=16) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=131) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=131) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=133) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mapjoin_decimal] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_predicate_pushdown] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[parquet_predicate_pushdown] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
[jira] [Updated] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14089: Description: HIVE-13617 is causing MiniLlapCliDriver following test failures {code} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join {code} was: HIVE-13617 is causing MiniLlapCliDriver following test failures {code} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join {code} Note to self - need to add multi-stripe test, and also test complex types with some nulls so that present stream is not suppressed. > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers
[ https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14453: Attachment: HIVE-14453.02.patch I'd like to revive this patch for HIVE-15147 (where we want to reencode parts of a text file to ORC for caching, and cache columns separately from each other). [~prasanth_j] can you please review? This is a refactoring, so no real logic changes as far as I see. > refactor physical writing of ORC data and metadata to FS from the logical > writers > - > > Key: HIVE-14453 > URL: https://issues.apache.org/jira/browse/HIVE-14453 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, > HIVE-14453.patch > > > ORC data doesn't have to go directly into an HDFS stream via buffers, it can > go somewhere else (e.g. a write-thru cache, or an addressable system that > doesn't require the stream blocks to be held in memory before writing them > all together). > To that effect, it would be nice to abstract the data block/metadata > structure creating from the physical file concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14990) run all tests for MM tables and fix the issues that are found
[ https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648980#comment-15648980 ] Sergey Shelukhin edited comment on HIVE-14990 at 11/10/16 1:58 AM: --- Looked at all the remaining tests. Out of 749 failed tests, about 100 failures and diffs are (or might be, at least) relevant. Many of them are similar, e.g. missing stats, but I don't know if they are missing stats for the same reason. Many, e.g. exim, may be due to unsupported/path-dependent scenarios that were not immediately obvious. Not sure why TestSparkCliDriver fails. It fails in client init for me with no useful logs (it logs that the child process exited with 127, then times out). I think we'll fix that during branch merge, if still broken. Crossing out ones that are actually irrelevant {panel} TestCliDriver: authorization_insert create_default_prop exim_04_evolved_parts -exim_11_managed_external- -exim_12_external_location- -exim_15_external_part- -exim_18_part_external- -exim_19_00_part_external_location- -exim_19_part_external_location- insert1 list_bucket_dml_8 mm_all orc_createas1 ppd_join4 stats_empty_dyn_part stats_partscan_1_23 temp_table_display_colstats_tbllvl temp_table_options1 vector_udf2 list_bucket_dml_14,list_bucket_* llap_acid insert_overwrite_directory2 authorization_load autoColumnStats_9 create_like drop_database_removes_partition_dirs drop_table_removes_partition_dirs index_auto_update exim_01_nonpart,exim_02_part,exim_04_all_part,exim_05_some_part,exim_06_one_part,exim_16_part_external,exim_17_part_managed,exim_20_part_managed_location load_overwrite materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_int_type_promotion orc_vectorization_ppd parquet_join2 partition_wise_fileformat,partition_wise_fileformat3 repl_1_drop,repl_3_exim_metadata sample6 sample_islocalmode_hook show_tablestatus smb_bucket_1 smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 stats_list_bucket stats_noscan_2 
symlink_text_input_format temp_table_precedence offset_limit_global_optimizer rand_partitionpruner2 TestEncryptedHDFSCliDriver: encryption_ctas encryption_drop_partition encryption_insert_values encryption_join_unencrypted_tbl encryption_load_data_to_encrypted_tables MiniLlapLocal: exchgpartition2lel cbo_rp_lineage2 create_merge_compressed deleteAnalyze delete_where_no_match delete_where_non_partitioned dynpart_sort_optimization escape2 insert1 lineage2 lineage3 orc_llap schema_evol_orc_nonvec_part schema_evol_orc_vec_part schema_evol_text_nonvec_part schema_evol_text_vec_part schema_evol_text_vecrow_part smb_mapjoin_6 tez_dml union_fast_stats update_all_types update_tmp_table update_where_no_match update_where_non_partitioned vector_outer_join1 vector_outer_join4 MiniLlap: load_fs2 orc_ppd_basic external_table_with_space_in_location_path file_with_header_footer import_exported_table schemeAuthority,schemeAuthority2 table_nonprintable Minimr: infer_bucket_sort_map_operators infer_bucket_sort_merge infer_bucket_sort_reducers_power_two root_dir_external_table scriptfile1 TestSymlinkTextInputFormat#testCombine TestJdbcWithLocalClusterSpark, etc. {panel} was (Author: sershe): Looked at all the remaining tests. Out of 749 failed tests, about 100 failures and diffs are (or might be, at least) relevant. Many of them are similar, e.g. missing stats, but I don't know if they are missing stats for the same reason. Many, e.g. exim, may be due to unsupported/path-dependent scenarios that were not immediately obvious. Not sure why TestSparkCliDriver fails. It fails in client init for me with no useful logs (it logs that the child process exited with 127, then times out). I think we'll fix that during branch merge, if still broken. 
{noformat} TestCliDriver: authorization_insert create_default_prop exim_04_evolved_parts exim_11_managed_external exim_12_external_location exim_15_external_part exim_18_part_external exim_19_00_part_external_location exim_19_part_external_location insert1 list_bucket_dml_8 mm_all orc_createas1 ppd_join4 stats_empty_dyn_part stats_partscan_1_23 temp_table_display_colstats_tbllvl temp_table_options1 vector_udf2 list_bucket_dml_14,list_bucket_* llap_acid insert_overwrite_directory2 authorization_load autoColumnStats_9 create_like drop_database_removes_partition_dirs drop_table_removes_partition_dirs index_auto_update exim_01_nonpart,exim_02_part,exim_04_all_part,exim_05_some_part,exim_06_one_part,exim_16_part_external,exim_17_part_managed,exim_20_part_managed_location load_overwrite materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_int_type_promotion orc_vectorization_ppd parquet_join2 partition_wise_fileformat,partition_wise_fileformat3 repl_1_drop,repl_3_exim_metadata sample6 sample_islocalmode_hook show_tablestatus smb_bucket_1 smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 stats_list_bucket
[jira] [Updated] (HIVE-15174) Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP authentication too
[ https://issues.apache.org/jira/browse/HIVE-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated HIVE-15174: - Description: Hive has implemented Kerberos principal mapping for authentication; the same should be implemented for LDAP authentication. Both Kerberos and LDAP are using Active Directory as a backend to store principals (in many cases), so it's natural to think this should work for LDAP too. The fact that this mapping works only for Kerberos and not for LDAP principals breaks authentication in our organization. was: Hive has implemented Kerberos principal mapping for authentication; the same should be implemented for LDAP authentication. Both Kerberos and LDAP are using Active Directory as a backend to store principals (in many cases), so it's natural to think this should work for LDAP too. The fact that IMPALA-2660 works only for Kerberos and not for LDAP principals breaks authentication in our organization. > Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP > authentication too > - > > Key: HIVE-15174 > URL: https://issues.apache.org/jira/browse/HIVE-15174 > Project: Hive > Issue Type: Bug > Components: Authentication, HiveServer2, Security >Affects Versions: 1.1.1, 1.2.1, 2.1.0 > Environment: Hive 1.1; Hadoop 2.6 >Reporter: Ruslan Dautkhanov > Labels: security > > Hive has implemented Kerberos principal mapping for authentication; the same > should be implemented for LDAP authentication. > Both Kerberos and LDAP are using Active Directory as a backend to store > principals (in many cases), so it's natural to think this should work for > LDAP too. > The fact that this mapping works only for Kerberos and not for LDAP > principals breaks authentication in our organization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
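For context on the request above: auth_to_local rules live in core-site.xml under hadoop.security.auth_to_local and rewrite a Kerberos principal into a short local user name. A minimal illustrative fragment follows — the realm EXAMPLE.COM is a placeholder, not taken from the issue:

```xml
<!-- Illustrative only: EXAMPLE.COM is a hypothetical realm. -->
<!-- Maps user/host@EXAMPLE.COM and user@EXAMPLE.COM to the short name "user". -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    RULE:[2:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    DEFAULT
  </value>
</property>
```

The issue asks HiveServer2 to apply these same rules to LDAP-authenticated principals, instead of only to Kerberos ones.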
[jira] [Updated] (HIVE-15174) Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP authentication too
[ https://issues.apache.org/jira/browse/HIVE-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated HIVE-15174: - Labels: security (was: ) > Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP > authentication too > - > > Key: HIVE-15174 > URL: https://issues.apache.org/jira/browse/HIVE-15174 > Project: Hive > Issue Type: Bug > Components: Authentication, HiveServer2, Security >Affects Versions: 1.1.1, 1.2.1, 2.1.0 > Environment: Hive 1.1; Hadoop 2.6 >Reporter: Ruslan Dautkhanov > Labels: security > > Hive has implemented Kerberos principal mapping for authentication; the same > should be implemented for LDAP authentication. > Both Kerberos and LDAP are using Active Directory as a backend to store > principals (in many cases), so it's natural to think this should work for > LDAP too. > The fact that IMPALA-2660 works only for Kerberos and not for LDAP principals > breaks authentication in our organization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-12891: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0. Thanks [~zsombor.klara] for the patch. If you want it committed to 2.1.1, please provide a patch for that branch as well. > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
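The failure mode in HIVE-12891 comes from a relative java.io.tmpdir producing a URI such as file:./tmp/..., i.e. a URI with a scheme but a relative path, which Hadoop's Path constructor rejects. A minimal sketch of the general workaround — absolutize the directory before building a file: URI — is below; the "./tmp" value and class name are hypothetical, not taken from the Hive patch:

```java
import java.io.File;
import java.net.URI;

public class TmpDirSketch {
    // Absolutize a possibly-relative tmpdir before using it to build a file: URI.
    // A relative input like "./tmp" would otherwise yield "file:./tmp", which
    // URI-consuming code (e.g. Hadoop's Path) treats as malformed.
    static URI scratchUri(String tmpDir) {
        File abs = new File(tmpDir).getAbsoluteFile();
        return abs.toURI(); // always file:/absolute/path/..., never file:./...
    }

    public static void main(String[] args) {
        URI u = scratchUri("./tmp"); // hypothetical relative setting
        System.out.println(u.isAbsolute() && u.getPath().startsWith("/"));
    }
}
```

The actual fix in the patch is in SessionState; this sketch only illustrates why resolving the directory to an absolute path sidesteps the URISyntaxException.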
[jira] [Updated] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15173: Attachment: HIVE-15173.patch Simple patch with testcase. > Allow dec as an alias for decimal > - > > Key: HIVE-15173 > URL: https://issues.apache.org/jira/browse/HIVE-15173 > Project: Hive > Issue Type: Sub-task > Components: Parser >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-15173.patch > > > Standard allows dec as an alias for decimal -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15173: Status: Patch Available (was: Open) > Allow dec as an alias for decimal > - > > Key: HIVE-15173 > URL: https://issues.apache.org/jira/browse/HIVE-15173 > Project: Hive > Issue Type: Sub-task > Components: Parser >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-15173.patch > > > Standard allows dec as an alias for decimal -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap
[ https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652522#comment-15652522 ] Hive QA commented on HIVE-15135: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838234/HIVE-15135.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10157 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=77) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=138) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=146) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=151) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2056/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2056/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2056/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838234 - PreCommit-HIVE-Build > Add an llap mode which fails if queries cannot run in llap > -- > > Key: HIVE-15135 > URL: https://issues.apache.org/jira/browse/HIVE-15135 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch > > > ALL currently ends up launching new containers for queries which cannot run > in llap. > There should be a mode where these queries don't run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic
[ https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652495#comment-15652495 ] Pengcheng Xiong commented on HIVE-13557: I have left some comments on RB; the patch can be improved. > Make interval keyword optional while specifying DAY in interval arithmetic > -- > > Key: HIVE-13557 > URL: https://issues.apache.org/jira/browse/HIVE-13557 > Project: Hive > Issue Type: Sub-task > Components: Types >Reporter: Ashutosh Chauhan >Assignee: Zoltan Haindrich > Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, > HIVE-13557.1.patch > > > Currently we support expressions like: {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) - INTERVAL '30' DAY) AND > DATE('2000-01-31') > {code} > We should support: > {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND > DATE('2000-01-31') > {code} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652421#comment-15652421 ] Hive QA commented on HIVE-15164: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838231/HIVE-15164.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] (batchId=89) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2055/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2055/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2055/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838231 - PreCommit-HIVE-Build > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch, HIVE-15164.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-15149: -- Component/s: Hooks > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement > Components: Hooks >Reporter: Jason Dere >Assignee: Li Lu > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-15149: - Assignee: Jason Dere (was: Li Lu) > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652357#comment-15652357 ] Xuefu Zhang edited comment on HIVE-15168 at 11/9/16 11:22 PM: -- Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. Also, please attach the patch here as well. was (Author: xuefuz): Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness, but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may result in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652357#comment-15652357 ] Xuefu Zhang commented on HIVE-15168: Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness, but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may result in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
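The race described in the issue above is easiest to see with a stripped-down event source that does not replay past events to late subscribers. All names below are hypothetical stand-ins, not the actual JobHandleImpl API: if the job can reach QUEUED between submission and addListener, the state change is silently lost, which is why registering listeners before submitting matters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a job-state event source that does NOT
// replay already-fired events to listeners registered later.
class JobEvents {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    void addListener(Consumer<String> l) { listeners.add(l); }
    void fire(String state) { listeners.forEach(l -> l.accept(state)); }
}

public class ListenerRaceSketch {
    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();

        // Buggy ordering: the QUEUED event fires (job already accepted)
        // before the listener is attached, so the state change is lost.
        JobEvents late = new JobEvents();
        late.fire("QUEUED");
        late.addListener(seen::add);
        System.out.println(seen); // []

        // Fixed ordering: attach the listener first, then submit.
        JobEvents early = new JobEvents();
        early.addListener(seen::add);
        early.fire("QUEUED");
        System.out.println(seen); // [QUEUED]
    }
}
```

Note that Netty futures themselves do notify listeners added after completion; the flakiness here concerns state events delivered over the RPC channel before the handle is wired up, which this non-replaying sketch models.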
[jira] [Assigned] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned HIVE-15149: Assignee: Li Lu (was: Jason Dere) > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement >Reporter: Jason Dere >Assignee: Li Lu > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14582) Add trunc(numeric) udf
[ https://issues.apache.org/jira/browse/HIVE-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652348#comment-15652348 ] Ashutosh Chauhan commented on HIVE-14582: - Any updates [~chinnalalam] ? > Add trunc(numeric) udf > -- > > Key: HIVE-14582 > URL: https://issues.apache.org/jira/browse/HIVE-14582 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Ashutosh Chauhan >Assignee: Chinna Rao Lalam > Attachments: HIVE-14582.patch > > > https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions200.htm -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic
[ https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652345#comment-15652345 ] Ashutosh Chauhan commented on HIVE-13557: - Looks good to me. [~pxiong] do you also want to take a look? > Make interval keyword optional while specifying DAY in interval arithmetic > -- > > Key: HIVE-13557 > URL: https://issues.apache.org/jira/browse/HIVE-13557 > Project: Hive > Issue Type: Sub-task > Components: Types >Reporter: Ashutosh Chauhan >Assignee: Zoltan Haindrich > Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, > HIVE-13557.1.patch > > > Currently we support expressions like: {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) - INTERVAL '30' DAY) AND > DATE('2000-01-31') > {code} > We should support: > {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND > DATE('2000-01-31') > {code} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-15149: -- Attachment: HIVE-15149.1.patch
Work in progress; added the following fields to the ATS event:
Hive query name
Hive configs
HiveServer2 IP address
Client IP
Execution mode (mr/tez/llap/spark)
Hive instance type (cli/hs2)
Tables read/written
Fixed thread name (originally was "ATSHook thread")
> Add additional information to ATSHook for Tez UI
>
> Key: HIVE-15149
> URL: https://issues.apache.org/jira/browse/HIVE-15149
> Project: Hive
> Issue Type: Improvement
> Reporter: Jason Dere
> Assignee: Jason Dere
> Attachments: HIVE-15149.1.patch
>
> Additional query details wanted for TEZ-3530. The additional details discussed include the following:
> Publish the following info (in addition to existing bits published today):
> Application Id to which the query was submitted (primary filter)
> DAG Id (primary filter)
> Hive query name (primary filter)
> Hive Configs (everything a set command would provide except for sensitive credential info)
> Potentially publish source of config, i.e. set in hive query script vs hive-site.xml, etc.
> Which HiveServer2 the query was submitted to
> Which IP/host the query was submitted from - not sure what filter support will be available.
> Which execution mode the query is running in (primary filter)
> What submission mode was used (cli/beeline/jdbc, etc.)
> User info (running as, actual end user, etc.) - not sure if already present
> Perf logger events. The data published should be able to create a timeline view of the query, i.e. actual submission time, query compile timestamps, execution timestamps, post-exec data moves, etc.
> Explain plan with enough details for visualizing.
> Databases and tables being queried (primary filter)
> Yarn queue info (primary filter)
> Caller context (primary filter)
> Original source, i.e. submitter
> Thread info in HS2 if needed (I believe Vikram may have added this earlier)
> Query time taken (with filter support)
> Additional context info, e.g. llap instance name and appId if required.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652319#comment-15652319 ] Hive QA commented on HIVE-15171: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838224/HIVE-15171.000.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10617 tests executed *Failed tests:* {noformat} TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=123) [ptf_seqfile.q,union_remove_23.q,parallel_join0.q,union_remove_9.q,join_thrift.q,skewjoinopt14.q,vectorized_mapjoin.q,union4.q,auto_join5.q,vectorized_shufflejoin.q,smb_mapjoin_20.q,groupby8_noskew.q,auto_sortmerge_join_10.q,groupby11.q,union_remove_16.q] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2054/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2054/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2054/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838224 - PreCommit-HIVE-Build > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652208#comment-15652208 ] Sergey Shelukhin commented on HIVE-15162: - +1 > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap
[ https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-15135: -- Attachment: HIVE-15135.02.patch Updated patch with the name changed to llap_only. Also modified MiniLlapLocal to use this mode and MiniLlap to use all. > Add an llap mode which fails if queries cannot run in llap > -- > > Key: HIVE-15135 > URL: https://issues.apache.org/jira/browse/HIVE-15135 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch > > > ALL currently ends up launching new containers for queries which cannot run > in llap. > There should be a mode where these queries don't run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652169#comment-15652169 ] Hive QA commented on HIVE-14089: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838219/HIVE-14089.11.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2053/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2053/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2053/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838219 - PreCommit-HIVE-Build > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15085) Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark
[ https://issues.apache.org/jira/browse/HIVE-15085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652156#comment-15652156 ] Siddharth Seth commented on HIVE-15085: --- [~prasanth_j] - could you please take a look. The test failures are unrelated - tracked under HIVE-15058. > Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark > - > > Key: HIVE-15085 > URL: https://issues.apache.org/jira/browse/HIVE-15085 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15085.01.patch, HIVE-15085.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-15164: -- Attachment: HIVE-15164.02.patch Updated patch to fix a test failure. > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch, HIVE-15164.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-15172) Flaky test: TestSparkCliDriver.testCliDriver[limit_pushdown]
[ https://issues.apache.org/jira/browse/HIVE-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-15172. Resolution: Fixed I have just pushed a fix for it. I had missed updating the golden file. > Flaky test: TestSparkCliDriver.testCliDriver[limit_pushdown] > > > Key: HIVE-15172 > URL: https://issues.apache.org/jira/browse/HIVE-15172 > Project: Hive > Issue Type: Sub-task > Components: Tests >Reporter: Jason Dere > > Looks like this has been failing on recent precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652004#comment-15652004 ] Jason Dere commented on HIVE-15162: --- [~ashutoshc] [~sershe] can you take a look? > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652002#comment-15652002 ] Jason Dere commented on HIVE-15162: --- The 4 failing tests are all listed as issues under HIVE-15058. > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-15171: - Status: Patch Available (was: Open) > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-15171: - Attachment: HIVE-15171.000.patch > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14089: Attachment: HIVE-14089.11.patch The trivial out file change (no inputs -> all inputs). > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl
[ https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651931#comment-15651931 ] Zoltan Haindrich commented on HIVE-14998: - [~thejas] can you please take a look at these changes? > Fix and update test: TestPluggableHiveSessionImpl > - > > Key: HIVE-14998 > URL: https://issues.apache.org/jira/browse/HIVE-14998 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-14998.1.patch > > > this test either prints an exception to stdout ... or not - in its > current form it isn't really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651926#comment-15651926 ] Zoltan Haindrich commented on HIVE-15161: - [~pxiong] can you please take a look at these changes? And one more thing: there are a few cases where "column_stats" is present but "basic_stats" is false - and hence omitted... they seem a bit odd - should I look into these? {code} autoColumnStats_4.q.out: COLUMN_STATS_ACCURATE {\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\"}} {code} > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns... this can be addressed; the org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
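The flakiness mentioned in the last bullet - map entry order leaking into the serialized stats - is easy to reproduce in miniature. The sketch below is Python rather than Hive's Java, purely for brevity; the analogous fix on the Java side is asking the serializer for a canonical (sorted) map-entry order, which the org.json API did not make easy.

```python
import json

# Two logically identical COLUMN_STATS maps built in different insertion
# orders, mimicking the nondeterministic column order that made q-file
# output comparisons flaky.
stats_a = {"b": "true", "a": "true"}
stats_b = {"a": "true", "b": "true"}

# A plain dump preserves insertion order, so the two strings differ:
assert json.dumps(stats_a) != json.dumps(stats_b)

# Sorting the keys yields one canonical serialization for both:
canonical = json.dumps(stats_a, sort_keys=True)
assert canonical == json.dumps(stats_b, sort_keys=True)
print(canonical)  # {"a": "true", "b": "true"}
```

With Jackson, a serializer configured to order map entries by key achieves the same deterministic output regardless of how the stats map was populated.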
[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651907#comment-15651907 ] Sergey Shelukhin commented on HIVE-14089: - vector_complex_join is a trivial explain change; the rest are known failures. [~prasanth_j] can you please review? thanks > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.WIP.2.patch, HIVE-14089.WIP.3.patch, > HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15085) Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark
[ https://issues.apache.org/jira/browse/HIVE-15085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651903#comment-15651903 ] Hive QA commented on HIVE-15085: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12837816/HIVE-15085.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2051/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2051/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2051/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12837816 - PreCommit-HIVE-Build > Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark > - > > Key: HIVE-15085 > URL: https://issues.apache.org/jira/browse/HIVE-15085 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15085.01.patch, HIVE-15085.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651862#comment-15651862 ] Sahil Takiar commented on HIVE-14271: - Yes, agree with Steve. Sergio summarized it well. Sounds like this is a reasonable change; [~spena], can you re-open this JIRA? > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it has finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may perform > better, but we should take care of the cleanup part in case > of write errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
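Option (a) from the issue - committing with an explicit copy rather than a rename - can be sketched with local files standing in for scratch and final paths. This is an illustrative toy, not FileSinkOperator's real API; the function name and parameters are invented.

```python
import os
import shutil
import tempfile

def commit(out_path, final_path, use_copy):
    """Publish a finished temporary file at its final location.

    On HDFS a rename is a cheap metadata operation, so a move is fine.
    On S3-like stores a 'rename' is really a copy plus delete of every
    object, so option (a) copies explicitly and leaves the temporary
    file to be removed by a later cleanup stage.
    """
    if use_copy:
        shutil.copyfile(out_path, final_path)  # option (a): copy, no rename
    else:
        os.replace(out_path, final_path)       # HDFS-style atomic move

tmpdir = tempfile.mkdtemp()
out = os.path.join(tmpdir, "000000_0.tmp")
final = os.path.join(tmpdir, "000000_0")
with open(out, "w") as f:
    f.write("row1\nrow2\n")
commit(out, final, use_copy=True)
print(os.path.exists(final), os.path.exists(out))  # True True: temp kept for cleanup
```

The copy path trades an extra delete (during cleanup) for never depending on rename semantics the store does not provide.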
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651874#comment-15651874 ] Zoltan Haindrich commented on HIVE-15161: - failures are unrelated: HIVE-15084 ; HIVE-15115 ; HIVE-15116 > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns... this can be addressed; the org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña reopened HIVE-14271: > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it has finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may perform > better, but we should take care of the cleanup part in case > of write errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14975) Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz
[ https://issues.apache.org/jira/browse/HIVE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-14975: Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-15058) > Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz > -- > > Key: HIVE-14975 > URL: https://issues.apache.org/jira/browse/HIVE-14975 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 >Reporter: Gopal V > > {code} > 2016-10-14T22:51:32,947 INFO [main] beeline.TestBeelineArgParsing: Add > /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar > for the driver class org.postgresql.Driver > Fail to add local jar due to the exception:java.util.zip.ZipException: error > in opening zip file > error in opening zip file > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14975) Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz
[ https://issues.apache.org/jira/browse/HIVE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-14975. - Resolution: Duplicate > Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz > -- > > Key: HIVE-14975 > URL: https://issues.apache.org/jira/browse/HIVE-14975 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Gopal V > > {code} > 2016-10-14T22:51:32,947 INFO [main] beeline.TestBeelineArgParsing: Add > /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar > for the driver class org.postgresql.Driver > Fail to add local jar due to the exception:java.util.zip.ZipException: error > in opening zip file > error in opening zip file > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651786#comment-15651786 ] Hive QA commented on HIVE-15164: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838092/HIVE-15164.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10618 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=126) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.llap.daemon.impl.TestLlapDaemonProtocolServerImpl.test (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2050/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2050/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2050/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838092 - PreCommit-HIVE-Build > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651757#comment-15651757 ] Sergio Peña commented on HIVE-15114: [~stakiar] No tests will be executed with this patch because the optimization only happens for blobstore, and we don't have automated tests for blobstore optimizations. It will be good to run a full set of tests once I attach a final patch, but for now I think we can wait as ptest won't give us any feedback for the change. > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge output files. > At the end of the conditional job there is a MoveTask. 
> Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > It seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. The {{ConditionalResolverMergeFiles}} resolver says that one of three tasks will > be returned: a move task only, a merge task only, or a merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
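The redundancy the issue describes - a scratch-to-scratch move immediately followed by a scratch-to-table move - can be illustrated with a small sketch. The helper below is hypothetical: Hive represents these steps as MoveTask objects in a task DAG, not as tuples in a list, but the collapsing idea is the same.

```python
def collapse_moves(tasks):
    """Collapse a chained move(src, mid) + move(mid, dst) into one
    move(src, dst) - the effect HIVE-15114 is after when the
    intermediate path lives on a blobstore where a rename is a copy.

    tasks: list of ("move", src, dst) tuples in execution order.
    """
    merged = []
    for kind, src, dst in tasks:
        # If the previous move's destination feeds this move's source,
        # replace both with a single move from first source to final dest.
        if merged and kind == "move" and merged[-1][0] == "move" and merged[-1][2] == src:
            _, first_src, _ = merged.pop()
            merged.append(("move", first_src, dst))
        else:
            merged.append((kind, src, dst))
    return merged

plan = [("move", "/scratch/tmp1/000000_0", "/scratch/tmp2/000000_0"),
        ("move", "/scratch/tmp2/000000_0", "/warehouse/t/000000_0")]
print(collapse_moves(plan))
# [('move', '/scratch/tmp1/000000_0', '/warehouse/t/000000_0')]
```

On S3 each eliminated intermediate move saves a full copy of the output data, which is why the extra MoveTask matters there and not on HDFS.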
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651741#comment-15651741 ] Sahil Takiar commented on HIVE-15114: - [~spena] should we "Submit Patch" so we can get some test results from Hive QA? > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge output files. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > It seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. 
The {{ConditionalResolverMergeFiles}} resolver says that one of three tasks will > be returned: a move task only, a merge task only, or a merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651701#comment-15651701 ] Pengcheng Xiong commented on HIVE-15023: [~kgyrtkirk], thanks for finding this out. I have pushed the patch to the master. Thanks again. > SimpleFetchOptimizer needs to optimize limit=0 > -- > > Key: HIVE-15023 > URL: https://issues.apache.org/jira/browse/HIVE-15023 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15023.01.patch, HIVE-15023.02.patch > > > on current master > {code} > hive> explain select key from src limit 0; > OK > STAGE DEPENDENCIES: > Stage-0 is a root stage > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: 0 > Processor Tree: > TableScan > alias: src > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: key (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 0 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > ListSink > Time taken: 7.534 seconds, Fetched: 20 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
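The optimization requested above - recognizing that a plan whose outermost limit is 0 can never produce a row, and short-circuiting the whole operator tree - can be sketched as a toy rule. The dict-based plan shape below is invented for illustration; Hive's SimpleFetchOptimizer operates on its own operator classes.

```python
def optimize_limit_zero(plan):
    """Toy limit-0 short-circuit: when the fetch limit is 0, drop the
    TableScan/Select/Limit operator tree entirely and return an empty
    fetch, so no table data is ever read. (Illustrative sketch only.)
    """
    if plan.get("limit") == 0:
        return {"limit": 0, "tree": None}  # nothing to scan or select
    return plan

before = {"limit": 0, "tree": ["TableScan src", "Select key", "Limit 0"]}
after = optimize_limit_zero(before)
print(after)  # {'limit': 0, 'tree': None}
```

The explain plan quoted in the issue shows exactly the "before" shape: a full TableScan/Select pipeline feeding a Limit of 0, all of which is dead work.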
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651657#comment-15651657 ] Sergio Peña commented on HIVE-15114: [~sershe] What do you think about this approach? It merges two MoveTasks into one, only for blobstore paths. > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first Move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge files at the end of the query. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > Seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. 
The {{ConditionalResolverMergeFiles}} says that one of three tasks will > be returned: move task only, merge task only, merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-15114: --- Attachment: HIVE-15114.WIP.1.patch Attaching a patch that is work-in-progress for an early review. I need to add some unit tests. [~stakiar] The patch uses a new dispatcher that is executed during physical optimization, and it looks for a ConditionalTask to do the optimization. Questions I have: 1. Should we move the optimization to the {{GenMapRedUtils.createMRWorkForMergingFiles}} instead? 2. Should we look for any MoveTask that links to another MoveTask on the whole plan instead of just focusing on the ConditionalTask? > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first Move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. 
This method creates a > conditional task which launches a job to merge files at the end of the query. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > Seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will > be returned: move task only, merge task only, merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
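The collapse the WIP patch aims for can be sketched in miniature. The classes below are simplified stand-ins, not Hive's real Task/MoveTask API: two chained moves through the scratch directory fold into one move straight to the final location.

```java
// Hypothetical sketch of merging two chained MoveTasks into one. A move
// "src -> tmp" followed by "tmp -> dst" becomes "src -> dst", skipping the
// intermediate scratch-directory hop. Not Hive's actual task classes.
public class MoveTaskCollapser {
    static class MoveTask {
        String src, dst;
        MoveTask child;  // at most one follow-up move, for simplicity
        MoveTask(String src, String dst) { this.src = src; this.dst = dst; }
    }

    // Repeatedly fold a move whose destination feeds directly into the next move.
    static MoveTask collapse(MoveTask head) {
        while (head.child != null && head.dst.equals(head.child.src)) {
            head.dst = head.child.dst;       // jump straight to the final location
            head.child = head.child.child;   // drop the now-redundant task
        }
        return head;
    }

    public static void main(String[] args) {
        MoveTask first = new MoveTask("/scratch/a", "/scratch/b");
        first.child = new MoveTask("/scratch/b", "/warehouse/t");
        MoveTask merged = collapse(first);
        System.out.println(merged.src + " -> " + merged.dst);  // /scratch/a -> /warehouse/t
    }
}
```

A real implementation would also have to check that no other task depends on the intermediate path, which is part of why the dispatcher in the patch only targets the ConditionalTask case.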
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651643#comment-15651643 ] Hive QA commented on HIVE-15090: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838189/HIVE-15090.3-branch-2.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 10462 tests executed *Failed tests:* {noformat} TestJdbcWithMiniHA - did not produce a TEST-*.xml file (likely timed out) (batchId=494) TestJdbcWithMiniMr - did not produce a TEST-*.xml file (likely timed out) (batchId=491) TestMsgBusConnection - did not produce a TEST-*.xml file (likely timed out) (batchId=362) TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file (likely timed out) (batchId=484) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats (batchId=92) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table_use_metadata (batchId=109) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 (batchId=87) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_schema_evol_3a (batchId=97) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_null_optimizer (batchId=154) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in (batchId=99) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_ppd_basic (batchId=521) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner (batchId=539) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic (batchId=187) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_schema_evol_3a (batchId=198) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in (batchId=199) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant (batchId=183) 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all (batchId=200) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_between_in (batchId=233) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching (batchId=492) org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=487) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2049/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2049/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2049/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838189 - PreCommit-HIVE-Build > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15093) S3-to-S3 Renames: Files should be moved individually rather than at a directory level
[ https://issues.apache.org/jira/browse/HIVE-15093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651540#comment-15651540 ] Sahil Takiar commented on HIVE-15093: - [~steve_l] thanks for your input. I'm happy to start looking into HADOOP-13600 if everyone agrees that is the better approach. I do have some questions though: * Is Hadoop 2.8+ released anywhere? I don't see artifacts published on Maven Central; Hive is currently using version 2.7.2. * HADOOP-13600 is targeted for Hadoop 2.9.0; do we know when that would be released? My main question is: if we do this in Hadoop, when will the optimization actually make it into Hive? [~ashutoshc] any chance you, or maybe someone on the PMC, could comment on this? In addition to Steve's concerns, [~yalovyyi] and [~poeppt] expressed similar concerns in earlier comments on this JIRA. > S3-to-S3 Renames: Files should be moved individually rather than at a > directory level > - > > Key: HIVE-15093 > URL: https://issues.apache.org/jira/browse/HIVE-15093 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15093.1.patch, HIVE-15093.2.patch, > HIVE-15093.3.patch, HIVE-15093.4.patch, HIVE-15093.5.patch, > HIVE-15093.6.patch, HIVE-15093.7.patch, HIVE-15093.8.patch, HIVE-15093.9.patch > > > Hive's MoveTask uses the Hive.moveFile method to move data within a > distributed filesystem as well as blobstore filesystems. > If the move is done within the same filesystem: > 1: If the source path is a subdirectory of the destination path, files will > be moved one by one using a threadpool of workers > 2: If the source path is not a subdirectory of the destination path, a single > rename operation is used to move the entire directory > The second option may not work well on blobstores such as S3. Renames are not > metadata operations and require copying all the data. 
Client connectors to > blobstores may not efficiently rename directories. Worst case, the connector > will copy each file one by one, sequentially rather than using a threadpool > of workers to copy the data (e.g. HADOOP-13600). > Hive already has code to rename files using a threadpool of workers, but this > only occurs in case number 1. > This JIRA aims to modify the code so that case 1 is triggered when copying > within a blobstore. The focus is on copies within a blobstore because > needToCopy will return true if the src and target filesystems are different, > in which case a different code path is triggered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
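The "case 1" behavior this JIRA wants to trigger for blobstores (per-file moves fanned out over a worker pool) looks roughly like the sketch below. The task body is a placeholder for a real FileSystem.rename() call; the class, paths, and method names are illustrative, not Hive's actual code.

```java
// Sketch of per-file renames over a thread pool instead of one
// directory-level rename. Each submitted task stands in for a
// fs.rename(new Path(srcDir, f), new Path(dstDir, f)) call.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRenameSketch {
    // Moves each file individually; returns the list of completed renames.
    static List<String> renameAll(List<String> files, String srcDir, String dstDir,
                                  int threads) throws InterruptedException {
        ConcurrentLinkedQueue<String> done = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String f : files) {
            pool.submit(() -> done.add(srcDir + "/" + f + " -> " + dstDir + "/" + f));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);  // real code must also surface task failures
        return new ArrayList<>(done);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> moved = renameAll(List.of("000000_0", "000001_0"),
                                       "/scratch/tmp", "s3a://bucket/warehouse/t", 4);
        System.out.println(moved.size() + " files renamed");
    }
}
```

On S3 each "rename" is a server-side COPY plus DELETE, so running them concurrently hides the per-object copy latency; that is the whole gain being debated in this thread.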
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651499#comment-15651499 ] Peter Vary commented on HIVE-15090: --- [~thejas] You are thinking like me :) ??Defining the exceptions that can be thrown by DelegationTokenStore that are not fatal and can be ignored.?? I chickened out of this since it is a compatibility change - at least in my unpracticed view. If I change the DelegationTokenStore interface to add the new type of exception, then anyone who has implemented their own DelegationTokenStore has to change it to work with the new version of Hive. ??Updating DBTokenStore to not throw what could be transient errors, and just log those?? ExpiredTokenRemover uses the following DelegationTokenStore methods: updateMasterKey, removeMasterKey, getAllDelegationTokenIdentifiers, removeToken, getToken. Changing the behavior of these methods could cause unexpected results. So I leaned toward your first suggestion, but HIVE-13090 was a longstanding issue (introduced on Dec 7, 2011) with very visible effects and only two jiras for it. I thought it was not common enough to warrant the compatibility change. What do you think, [~thejas]? Is it worth changing the DelegationTokenStore interface? You have more experience with Hive than me. 
Thanks, Peter > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
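The fix described in this issue (moving the catch inside the running loop) can be sketched as follows. The class is a simplified stand-in for the real ExpiredTokenRemover, not its actual code.

```java
// Sketch of the loop fix: the try/catch sits INSIDE the loop, so a single
// transient store failure is logged and the thread keeps running. Before the
// fix, the try/catch wrapped the whole loop and one failure ended the thread.
public class TokenRemoverLoopSketch {
    private final Runnable removeExpiredTokens;  // may throw on transient DB errors
    int survivedPasses = 0;

    TokenRemoverLoopSketch(Runnable removeExpiredTokens) {
        this.removeExpiredTokens = removeExpiredTokens;
    }

    void runPasses(int passes) {
        for (int i = 0; i < passes; i++) {
            try {
                removeExpiredTokens.run();
            } catch (RuntimeException e) {
                // Log and continue; the remover can recover once the DB is back.
                System.err.println("token removal failed, will retry: " + e.getMessage());
            }
            survivedPasses++;  // real code sleeps here before the next pass
        }
    }

    public static void main(String[] args) {
        TokenRemoverLoopSketch remover =
            new TokenRemoverLoopSketch(() -> { throw new RuntimeException("db down"); });
        remover.runPasses(3);
        System.out.println("completed passes despite failures: " + remover.survivedPasses);
    }
}
```

The interface-change alternative discussed above would instead declare which exceptions are non-fatal, letting the loop distinguish recoverable store errors from genuine bugs.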
[jira] [Commented] (HIVE-15093) S3-to-S3 Renames: Files should be moved individually rather than at a directory level
[ https://issues.apache.org/jira/browse/HIVE-15093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651394#comment-15651394 ] Steve Loughran commented on HIVE-15093: --- -1 (non binding) Doing parallel rename here is a stop-gap solution which will be obsolete the moment someone sits down to do it in s3a with an implementation that is more efficient in its scheduling of copy calls and, with tests and broader use, better tested. HADOOP-13600 proposes parallel renames. Nobody has written that yet, but I promise to review a patch people provide, with tests. Get that patch into Hadoop and there's only one place to maintain this stuff: no need to document/test another switch, maintain the option, have another codepath to keep alive, etc. The algorithm I proposed there would initially sort the files by size, so the larger renames are scheduled first. Given a thread pool smaller than the list of files to rename, this should ensure that the scheduling is closer to optimal. If you really, really want to do this in a separate piece of code, you should do the same. Also, there are enough other s3a speedups that you should be testing against Hadoop 2.8+, both to avoid optimising against a now-obsolete codepath and to help find and report any problems in our code. To summarise: go on, fix the code in Hadoop, simplify everyone's lives. 
> S3-to-S3 Renames: Files should be moved individually rather than at a > directory level > - > > Key: HIVE-15093 > URL: https://issues.apache.org/jira/browse/HIVE-15093 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15093.1.patch, HIVE-15093.2.patch, > HIVE-15093.3.patch, HIVE-15093.4.patch, HIVE-15093.5.patch, > HIVE-15093.6.patch, HIVE-15093.7.patch, HIVE-15093.8.patch, HIVE-15093.9.patch > > > Hive's MoveTask uses the Hive.moveFile method to move data within a > distributed filesystem as well as blobstore filesystems. > If the move is done within the same filesystem: > 1: If the source path is a subdirectory of the destination path, files will > be moved one by one using a threadpool of workers > 2: If the source path is not a subdirectory of the destination path, a single > rename operation is used to move the entire directory > The second option may not work well on blobstores such as S3. Renames are not > metadata operations and require copying all the data. Client connectors to > blobstores may not efficiently rename directories. Worst case, the connector > will copy each file one by one, sequentially rather than using a threadpool > of workers to copy the data (e.g. HADOOP-13600). > Hive already has code to rename files using a threadpool of workers, but this > only occurs in case number 1. > This JIRA aims to modify the code so that case 1 is triggered when copying > within a blobstore. The focus is on copies within a blobstore because > needToCopy will return true if the src and target filesystems are different, > in which case a different code path is triggered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
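Steve's largest-first scheduling idea can be shown in a few lines. FileInfo is a hypothetical stand-in for a FileStatus-like record; only the ordering logic is illustrated, not s3a's actual implementation.

```java
// Sketch of largest-first scheduling: sort files by size, descending, before
// submitting them to the copy pool, so the longest copies start earliest and
// the pool drains more evenly when it is smaller than the file list.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class LargestFirstScheduling {
    record FileInfo(String name, long size) {}

    static List<FileInfo> schedule(List<FileInfo> files) {
        List<FileInfo> order = new ArrayList<>(files);
        order.sort(Comparator.comparingLong(FileInfo::size).reversed());
        return order;  // submit to the copy pool in this order
    }

    public static void main(String[] args) {
        List<FileInfo> order = schedule(List.of(
            new FileInfo("small", 10),
            new FileInfo("big", 5000),
            new FileInfo("mid", 300)));
        System.out.println(order.get(0).name());  // big
    }
}
```

The intuition: with a pool of N workers, a large file submitted last becomes the tail of the whole job; submitting it first lets smaller copies fill in around it.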
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651362#comment-15651362 ] Sergio Peña commented on HIVE-14271: Agree with approach #2. If outPath and finalPath are scratch directories, then we can just write directly to finalPath and avoid the rename. [~ste...@apache.org] There is another patch to do S3-to-S3 renames in parallel to speed up the COPY operations (See HIVE-15093) > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, then the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may add > better performance calls, but we should take care of the cleanup part in case > of writing errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651344#comment-15651344 ] Thejas M Nair commented on HIVE-15090: -- Some options are - * Defining the exceptions that can be thrown by DelegationTokenStore that are not fatal and can be ignored. * Updating DBTokenStore to not throw what could be transient errors, and just log those [~pvary] What are your thoughts? > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651284#comment-15651284 ] Steve Loughran commented on HIVE-14271: --- Strategy 2 will eliminate one rename, which, with rename costs being O(data), is good. However, there's still one rename to go: the overhead of copying the data from scratch to final. This shouldn't be done in client-side code, as object store COPY operations happen server side; they're what rename() uses. If renames of files in a directory are issued in parallel, the rename can be sped up significantly; this works precisely because you can hold open the HTTP connections for the copy calls without much cost in network traffic. > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, then the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may add > better performance calls, but we should take care of the cleanup part in case > of writing errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-15090: -- Attachment: HIVE-15090.3-branch-2.1.patch Retriggering the patch, and hoping to get only the same failing tests as HIVE-15094 :) > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651254#comment-15651254 ] Peter Vary commented on HIVE-15090: --- Hi [~thejas], I was thinking along the same lines as you, but finally decided against it. My reasoning was that METASTORE_CLUSTER_DELEGATION_TOKEN_STORE_CLS is a configuration variable and could be set by the administrator to any class, which is why we will never be able to handle every future exception here correctly. So finally I decided to stick to a clean, easily understandable solution rather than create a partial solution for the DBTokenStore only. Since this one is already committed to master, if we find a better approach I think we should open another jira to handle it. I would be happy to help out there too. Thanks again for taking a look at this! Peter > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651218#comment-15651218 ] Barna Zsombor Klara commented on HIVE-15168: Sadly my wonderful fix only managed to prevent the testJobSubmission test from failing for one single, sad day. I think I found a second race condition; would you mind taking a second look at it, [~xuefuz], [~lirui]? I hope this time my fix will be a tiny bit more permanent... In the meantime I'll try running the test a couple of hundred times in a loop to see if it breaks again. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may end up in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
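The register-after-submit race described above can be reproduced deterministically with a toy event source: a listener attached only after the event fires never sees it, while attaching it before submission does. The Emitter class below is a simplified stand-in for the rpc/promise wiring in JobHandleImpl, not the actual Netty API.

```java
// Demonstrates why listener registration order matters: a naive event source
// only notifies listeners that exist at fire() time, so registering after
// submission can silently drop the QUEUED state change.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ListenerOrderSketch {
    // Deliberately naive: no event replay for late listeners.
    static class Emitter {
        private final List<Consumer<String>> listeners = new ArrayList<>();
        void addListener(Consumer<String> l) { listeners.add(l); }
        void fire(String event) { listeners.forEach(l -> l.accept(event)); }
    }

    static List<String> demo() {
        List<String> seen = new ArrayList<>();

        // Racy order (event fires, then register): the event is lost.
        Emitter racy = new Emitter();
        racy.fire("QUEUED");
        racy.addListener(s -> seen.add("racy:" + s));

        // Fixed order (register, then let the job complete): event observed.
        Emitter fixed = new Emitter();
        fixed.addListener(s -> seen.add("fixed:" + s));
        fixed.fire("QUEUED");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // only the pre-registered listener fires
    }
}
```

In the real code the window is narrow, which is why the test only fails intermittently; the sketch just makes the lost-event case unconditional.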
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651181#comment-15651181 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838179/HIVE-15161.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2048/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2048/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2048/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838179 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed, org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14541) Beeline does not prompt for username and password properly
[ https://issues.apache.org/jira/browse/HIVE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651085#comment-15651085 ] Miklos Csanady commented on HIVE-14541: --- Let me clarify: It should ask for username if the -u parameter is given without a username AND (no -n connection parameter given and (no javax.jdo.option.ConnectionUserName given and no ConnectionUserName parameter found)). For password prompt: if -p given without value OR (-u parameter is given with no passwd AND none of javax.jdo.option.ConnectionPassword and ConnectionPassword found). [~vihangk1] Am I correct? Miklos > Beeline does not prompt for username and password properly > -- > > Key: HIVE-14541 > URL: https://issues.apache.org/jira/browse/HIVE-14541 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Miklos Csanady > > In the default mode, when we connect using !connect > jdbc:hive2://localhost:1 (without providing user and password) beeline > prompts for it as expected. > But when we use beeline -u "url" and do not provide -n or -p arguments, it > does not prompt for the user/password > {noformat} > $ ./beeline -u jdbc:hive2://localhost:1 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. 
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:15 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.2.0-SNAPSHOT by Apache Hive > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} > {noformat} > $ ./beeline > Beeline version 2.2.0-SNAPSHOT by Apache Hive > beeline> !connect "jdbc:hive2://localhost:1" > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hive > Enter password for jdbc:hive2://localhost:1: > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:03 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
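Miklos's proposed rule above can be written out as plain boolean logic. The method and parameter names below are descriptive stand-ins, not Beeline's actual fields; the property flags collapse the javax.jdo.option.* and plain connection-property checks into one input each.

```java
// Sketch of the proposed prompting rule for beeline -u connections.
// Prompt for a username only when the URL was given without one and no other
// source (the -n option or connection properties) supplies it; prompt for a
// password when -p was given without a value, or when the URL carries none
// and no property supplies it.
public class PromptRuleSketch {
    static boolean shouldPromptUser(boolean urlGiven, boolean urlHasUser,
                                    boolean nOptionGiven, boolean propertyUserSet) {
        return urlGiven && !urlHasUser && !nOptionGiven && !propertyUserSet;
    }

    static boolean shouldPromptPassword(boolean pOptionGivenWithoutValue,
                                        boolean urlGiven, boolean urlHasPassword,
                                        boolean propertyPasswordSet) {
        return pOptionGivenWithoutValue
            || (urlGiven && !urlHasPassword && !propertyPasswordSet);
    }

    public static void main(String[] args) {
        // beeline -u jdbc:hive2://... with no -n/-p and no connection properties:
        System.out.println(shouldPromptUser(true, false, false, false));      // true
        System.out.println(shouldPromptPassword(false, true, false, false));  // true
    }
}
```

This matches the bug report: with a bare `-u` URL and no other credential source, both functions return true, so Beeline should prompt instead of connecting silently.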
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: HIVE-15161.4.patch > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
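One motivation in this issue is test flakiness caused by nondeterministic map entry order when serializing column stats to JSON. As an illustration only (the actual patch uses Jackson; this stdlib sketch just demonstrates the ordering fix, and the class and method names are invented here):

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal stdlib illustration of the ordering problem mentioned above: the
// core fix for flaky diffs is emitting column entries in a deterministic key
// order instead of whatever order a HashMap happens to iterate in.
public class StableJson {
    static String toJson(Map<String, Long> columnStats) {
        // Copying into a TreeMap yields sorted key order, so two runs over
        // the same stats always produce byte-identical output.
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Long> e : new TreeMap<>(columnStats).entrySet()) {
            if (!first) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":").append(e.getValue());
            first = false;
        }
        return sb.append("}").toString();
    }
}
```

Jackson offers the same guarantee natively via its serialization feature for ordering map entries by key, which is presumably why it is a convenient replacement here.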
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651015#comment-15651015 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838163/HIVE-15161.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10635 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=11) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest (batchId=191) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2047/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2047/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2047/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838163 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650960#comment-15650960 ] Zoltan Haindrich commented on HIVE-15023: - [~pxiong] it seems to me that there is a qtest which has "evaded" the output update ;) and it's affected by the limit 0 optimization: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/testReport/org.apache.hadoop.hive.cli/TestSparkCliDriver/testCliDriver_limit_pushdown_/ > SimpleFetchOptimizer needs to optimize limit=0 > -- > > Key: HIVE-15023 > URL: https://issues.apache.org/jira/browse/HIVE-15023 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15023.01.patch, HIVE-15023.02.patch > > > on current master > {code} > hive> explain select key from src limit 0; > OK > STAGE DEPENDENCIES: > Stage-0 is a root stage > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: 0 > Processor Tree: > TableScan > alias: src > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: key (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 0 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > ListSink > Time taken: 7.534 seconds, Fetched: 20 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
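For context on why limit=0 merits a special case in the plan above: a fetch path that checks the limit before pulling any rows never needs to touch the underlying scan, whereas the unoptimized plan still wires up a full TableScan. The sketch below is hypothetical (invented names, not SimpleFetchOptimizer's actual code) and only illustrates the short-circuit:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the limit=0 short-circuit: if the limit is
// checked up front, the row source (the "scan") is never consumed at all.
public class FetchWithLimit {
    static List<String> fetch(Iterator<String> scan, int limit) {
        List<String> out = new ArrayList<>();
        if (limit <= 0) {
            return out;  // short-circuit: no rows are read from the scan
        }
        while (scan.hasNext() && out.size() < limit) {
            out.add(scan.next());
        }
        return out;
    }
}
```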
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: (was: HIVE-15161.3.patch) > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: HIVE-15161.3.patch > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14541) Beeline does not prompt for username and password properly
[ https://issues.apache.org/jira/browse/HIVE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Csanady reassigned HIVE-14541: - Assignee: Miklos Csanady > Beeline does not prompt for username and password properly > -- > > Key: HIVE-14541 > URL: https://issues.apache.org/jira/browse/HIVE-14541 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Miklos Csanady > > In the default mode, when we connect using !connect > jdbc:hive2://localhost:1 (without providing user and password) beeline > prompts for it as expected. > But when we use beeline -u "url" and do not provide -n or -p arguments, it > does not prompt for the user/password > {noformat} > $ ./beeline -u jdbc:hive2://localhost:1 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:15 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. 
> Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.2.0-SNAPSHOT by Apache Hive > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} > {noformat} > $ ./beeline > Beeline version 2.2.0-SNAPSHOT by Apache Hive > beeline> !connect "jdbc:hive2://localhost:1" > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hive > Enter password for jdbc:hive2://localhost:1: > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:03 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650546#comment-15650546 ] Barna Zsombor Klara commented on HIVE-12891: Tests are flaky: https://issues.apache.org/jira/browse/HIVE-14936 - orc_ppd_schema_evol_3a https://issues.apache.org/jira/browse/HIVE-15169 - columnstats_part_coltype https://issues.apache.org/jira/browse/HIVE-15116 - join_acid_non_acid https://issues.apache.org/jira/browse/HIVE-15115 - union_fast_stats https://issues.apache.org/jira/browse/HIVE-15084 - explainanalyze_4, explainanalyze_5 https://issues.apache.org/jira/browse/HIVE-15168 - testJobSubmission https://issues.apache.org/jira/browse/HIVE-15170 - testTaskStatus > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... 
> at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
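The "Relative path in absolute URI" failure above can be reproduced with plain java.net.URI: its multi-argument constructor (which, presumably, org.apache.hadoop.fs.Path reaches internally) rejects a relative path whenever a scheme is present. The class and method names below are invented for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Minimal reproduction of the failure mode: constructing a URI with a
// "file" scheme but a relative path throws URISyntaxException with the
// same reason string seen in the stack trace above.
public class TmpDirRepro {
    static String tryBuild(String path) {
        try {
            new URI("file", null, path, null);
            return "ok";
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }
}
```

This is why resolving java.io.tmpdir to an absolute path before building scratch-dir URIs avoids the exception.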
[jira] [Resolved] (HIVE-15158) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null
[ https://issues.apache.org/jira/browse/HIVE-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] thauvin damien resolved HIVE-15158. --- Resolution: Duplicate Duplicate of HIVE-15157. > Partition Table With timestamp type on S3 storage --> Error in getting fields > from serde.Invalid Field null > --- > > Key: HIVE-15158 > URL: https://issues.apache.org/jira/browse/HIVE-15158 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 2.1.0 > Environment: JDK 1.8 101 >Reporter: thauvin damien > > Hello > I get the error above when I try to perform: > hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00'); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field null > Here is the description of the issue. > --External Hive table with dynamic partitioning enabled on AWS S3 storage. > --Partition table with timestamp type. > When I perform "show partitions table;" everything is fine: > hive> show partitions table; > OK > tsbucket=2016-10-01 11%3A00%3A00 > tsbucket=2016-10-28 16%3A00%3A00 > And when I perform "describe FORMATTED table;" everything is fine. > Is this a bug? 
> The stacktrace of hive.log : > 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 > main([])]: exec.DDLTask (DDLTask.java:failed(574)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields > from serde.Invalid Field null > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414) > at > org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: MetaException(message:Invalid Field null) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336) > at > 
org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409) > ... 21 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
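A side note on the partition values shown in this issue: the %3A sequences are percent-encoded colons, since Hive escapes characters that are unsafe in directory names when mapping a partition value to a path. Standard java.net.URLDecoder can reverse that escaping; the class name below is invented for illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Decodes a percent-escaped partition value such as the tsbucket values
// above; %3A is the percent-encoding of ':'.
public class PartitionValue {
    static String decode(String escaped) {
        try {
            return URLDecoder.decode(escaped, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);  // UTF-8 is always available
        }
    }
}
```

For example, decoding "2016-10-28 16%3A00%3A00" recovers the original timestamp "2016-10-28 16:00:00".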
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650127#comment-15650127 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838130/HIVE-15161.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10634 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_auto_partitioned] (batchId=27) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_1] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest (batchId=191) org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=272) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2046/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838130 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)