[jira] [Commented] (HIVE-15082) Hive-1.2 cannot read data from complex data types with TIMESTAMP column, stored in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653262#comment-15653262 ] Lefty Leverenz commented on HIVE-15082: --- [~osayankin], it looks like you swapped the locations of patch-num and branch-name on patch 2, unless you meant it to be for branch-1.2 (but it still needs a patch-num). > Hive-1.2 cannot read data from complex data types with TIMESTAMP column, > stored in Parquet > -- > > Key: HIVE-15082 > URL: https://issues.apache.org/jira/browse/HIVE-15082 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Blocker > Attachments: HIVE-15082-branch-1.2.patch, HIVE-15082-branch-1.patch > > > *STEP 1. Create test data* > {code:sql} > select * from dual; > {code} > *EXPECTED RESULT:* > {noformat} > Pretty_UnIQUe_StrinG > {noformat} > {code:sql} > create table test_parquet1(login timestamp) stored as parquet; > insert overwrite table test_parquet1 select from_unixtime(unix_timestamp()) > from dual; > select * from test_parquet1 limit 1; > {code} > *EXPECTED RESULT:* > No exceptions. Current timestamp as result. > {noformat} > 2016-10-27 10:58:19 > {noformat} > *STEP 2. Store timestamp in array in parquet file* > {code:sql} > create table test_parquet2(x array<timestamp>) stored as parquet; > insert overwrite table test_parquet2 select array(login) from test_parquet1; > select * from test_parquet2; > {code} > *EXPECTED RESULT:* > No exceptions. Current timestamp in brackets as result. 
> {noformat} > ["2016-10-27 10:58:19"] > {noformat} > *ACTUAL RESULT:* > {noformat} > ERROR [main]: CliDriver (SessionState.java:printError(963)) - Failed with > exception java.io.IOException:parquet.io.ParquetDecodingException: Can not > read value at 0 in block -1 in file > hdfs:///user/hive/warehouse/test_parquet2/00_0 > java.io.IOException: parquet.io.ParquetDecodingException: Can not read value > at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0 > {noformat} > *ROOT-CAUSE:* > Incorrect initialization of the {{metadata}} {{HashMap}} leaves it with a > {{null}} value in the enumeration > {{org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter}} when > executing the following line: > {code:java} > boolean skipConversion = > Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname)); > {code} > in element {{ETIMESTAMP_CONVERTER}}. > The JVM throws an NPE, so the parquet library cannot read data from the file > and in turn throws > {noformat} > java.io.IOException:parquet.io.ParquetDecodingException: Can not read value > at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0 > {noformat} > *SOLUTION:* > Perform the initialization in a separate method to avoid overriding it with a > {{null}} value in the block of code > {code:java} > if (parent != null) { > setMetadata(parent.getMetadata()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
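The root cause and solution above can be sketched in plain Java. The classes below are illustrative stand-ins, not Hive's actual converter code; only the failure pattern is taken from the description: a field correctly initialized at declaration is later overwritten with a parent's {{null}} value, so the map lookup throws the NPE (note that {{Boolean.valueOf(null)}} itself is safe and returns false).

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for the converter hierarchy described above.
class ConverterSketch {
    private Map<String, String> metadata = new HashMap<>();

    Map<String, String> getMetadata() { return metadata; }

    // Buggy pattern: blindly adopting the parent's metadata can null the field.
    void setMetadataBuggy(Map<String, String> parentMetadata) {
        this.metadata = parentMetadata; // may be null -> NPE later
    }

    // Fixed pattern: a dedicated method that never overwrites with null.
    void setMetadataFixed(Map<String, String> parentMetadata) {
        if (parentMetadata != null) {
            this.metadata = parentMetadata;
        }
    }

    boolean skipConversion() {
        // The NPE happens here when metadata itself is null; Boolean.valueOf
        // of a null String is harmless and simply yields false.
        return Boolean.valueOf(
            metadata.get("hive.parquet.timestamp.skip.conversion"));
    }
}
```

Keeping the assignment in one guarded method means no call path can leave the field {{null}}, which is the essence of the proposed solution.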
[jira] [Updated] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15137: -- Attachment: HIVE-15137.1.patch > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > Attachments: HIVE-15137.1.patch > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15137: -- Status: Patch Available (was: Open) > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > Attachments: HIVE-15137.1.patch > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15137) metastore add partitions background thread should use current username
[ https://issues.apache.org/jira/browse/HIVE-15137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653225#comment-15653225 ] Daniel Dai commented on HIVE-15137: --- Here are the instructions to reproduce: 1. set the size of the thread pool to 1 (hive.metastore.fshandler.threads=1) 2. start the metastore 3. Start HiveCli with user1, run "ALTER TABLE table1 ADD PARTITION ..." 4. Start HiveCli with user2, run "ALTER TABLE table1 ADD PARTITION ..." The owner of both partition directories is user1. The cause of the issue is that the FileSystem object from the fs cache in Warehouse.mkdirs has the wrong uid. At the time mkdirs gets the FileSystem, UserGroupInformation.getCurrentUser() is user1 in both cases. Uploaded a patch which uses doAs inside the thread pool's threads. It is hard to write a UT. Manually tested. > metastore add partitions background thread should use current username > -- > > Key: HIVE-15137 > URL: https://issues.apache.org/jira/browse/HIVE-15137 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 2.1.1 >Reporter: Thejas M Nair >Assignee: Daniel Dai > > > The background thread used in HIVE-13901 for adding partitions needs to be > reinitialized with the current UGI for each invocation. Otherwise, the user in > context when the thread was created would remain the current UGI during the > actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
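The repro above boils down to work in a shared pool inheriting whatever identity was current when the pool (or a cached FileSystem) was set up, instead of the submitter's. A plain-Java sketch of that pattern (a ThreadLocal stands in for the UGI here; this is an analogy, not Hive or Hadoop code — the actual patch wraps the work in UserGroupInformation.doAs):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

// Plain-Java stand-in: CURRENT_USER plays the role of
// UserGroupInformation.getCurrentUser(). Pool threads inherit nothing, so the
// task must capture the submitting user's identity and re-establish it inside
// the pool thread, which is what wrapping the work in doAs achieves.
class UgiSketch {
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Buggy: reads the user inside the pool thread, where it is unset/stale.
    static String mkdirsBuggy(ExecutorService pool) throws Exception {
        return pool.submit((Callable<String>) CURRENT_USER::get).get();
    }

    // Fixed: capture the submitter's user now, restore it in the pool thread.
    static String mkdirsFixed(ExecutorService pool) throws Exception {
        final String submitter = CURRENT_USER.get(); // captured at submit time
        return pool.submit(() -> {
            CURRENT_USER.set(submitter);             // analogous to ugi.doAs(...)
            try {
                return CURRENT_USER.get();           // would own the new dir
            } finally {
                CURRENT_USER.remove();               // don't leak into next task
            }
        }).get();
    }
}
```

With a pool of size 1 (as in the repro), every submission reuses the same long-lived thread, which is why the stale identity shows up so reliably.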
[jira] [Commented] (HIVE-15101) Spark client can be stuck in RUNNING state
[ https://issues.apache.org/jira/browse/HIVE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653187#comment-15653187 ] Rui Li commented on HIVE-15101: --- Hi [~tzenmyo], thanks for your input. I'll try your scenario and see what I can find. We already take care of timeouts in the Rpc code. Adding another timeout as in the patch may do no harm, but we should at least figure out why the existing logic doesn't work as expected. > Spark client can be stuck in RUNNING state > -- > > Key: HIVE-15101 > URL: https://issues.apache.org/jira/browse/HIVE-15101 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0, 2.1.0 > Environment: Hive 2.1.0 > Spark 1.6.2 >Reporter: Satoshi Iijima >Assignee: Satoshi Iijima > Attachments: HIVE-15101.patch, hadoop-yarn-nodemanager.log, > hive.log.gz > > > When a Hive-on-Spark job is executed in a YARN environment where an UNHEALTHY > NodeManager exists, the Spark client can get stuck in RUNNING state. > thread dump: > {code} > "008ee7b6-b083-4ac9-ae1c-b6097d9bf761 main" #1 prio=5 os_prio=0 > tid=0x7f14f4013800 nid=0x3855 in Object.wait() [0x7f14fd9b1000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at java.lang.Object.wait(Object.java:502) > at > io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254) > - locked <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32) > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31) > at > org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:104) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > - locked <0xf21b8e08> (a java.lang.Class for > org.apache.hive.spark.client.SparkClientFactory) > at > 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:95) > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:67) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136) > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653132#comment-15653132 ] Lefty Leverenz commented on HIVE-12891: --- Should this be documented in the Hive wiki? It could go in the Configuration doc, although we might need a new subsection for it. * [AdminManual -- Configuration -- Configuration Variables | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfigurationVariables] > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.<init>(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
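The "Relative path in absolute URI" error above is reproducible with java.net.URI directly: a "file:" scheme plus a relative path is exactly what the URI constructor rejects, and that surfaces through org.apache.hadoop.fs.Path. The usual remedy — an assumption here, not necessarily the committed fix — is to resolve the configured value to an absolute path before building file: URIs from it:

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Demonstrates the failure mode and a normalization step that avoids it.
class TmpDirSketch {
    // Anchors a relative java.io.tmpdir (e.g. "./tmp") at user.dir so that
    // URIs built from it are well-formed.
    static String normalizeTmpDir(String configured) {
        return new File(configured).getAbsolutePath();
    }

    // The multi-argument URI constructor validates the path: with a scheme
    // present, a path not starting with "/" raises URISyntaxException
    // ("Relative path in absolute URI"), as in the stack trace above.
    static boolean isRejected(String path) {
        try {
            new URI("file", null, path, null);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }
}
```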
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653034#comment-15653034 ] Hive QA commented on HIVE-13966: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838285/HIVE-13966.5.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2059/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2059/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2059/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838285 - PreCommit-HIVE-Build > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Mohit Sabharwal >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.4.patch, HIVE-13966.4.patch, > HIVE-13966.5.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation fails (in step 2), we still add an entry to the notification > log. Found this issue in testing. > This is still ok, as it is a false positive. > If the operation succeeds but adding to the notification log fails, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we do not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15101) Spark client can be stuck in RUNNING state
[ https://issues.apache.org/jira/browse/HIVE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652934#comment-15652934 ] Teruyoshi Zenmyo commented on HIVE-15101: - I have encountered the same issue in a testing environment without UNHEALTHY nodes (all of the nodes were active). I found that spark-submit.sh had failed due to a resource shortage (spark.driver.memory > yarn.scheduler.maximum-allocation-mb). The server-side timeout does not seem to take effect when spark-submit.sh fails, so the patch's client-side timeout would make this safer. > Spark client can be stuck in RUNNING state > -- > > Key: HIVE-15101 > URL: https://issues.apache.org/jira/browse/HIVE-15101 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0, 2.1.0 > Environment: Hive 2.1.0 > Spark 1.6.2 >Reporter: Satoshi Iijima >Assignee: Satoshi Iijima > Attachments: HIVE-15101.patch, hadoop-yarn-nodemanager.log, > hive.log.gz > > > When a Hive-on-Spark job is executed in a YARN environment where an UNHEALTHY > NodeManager exists, the Spark client can get stuck in RUNNING state. 
> thread dump: > {code} > "008ee7b6-b083-4ac9-ae1c-b6097d9bf761 main" #1 prio=5 os_prio=0 > tid=0x7f14f4013800 nid=0x3855 in Object.wait() [0x7f14fd9b1000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at java.lang.Object.wait(Object.java:502) > at > io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254) > - locked <0xf6615550> (a > io.netty.util.concurrent.DefaultPromise) > at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32) > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31) > at > org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:104) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > - locked <0xf21b8e08> (a java.lang.Class for > org.apache.hive.spark.client.SparkClientFactory) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99) > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:95) > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:67) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136) > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
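The hang in the stack trace above is an unbounded wait on a promise that is never completed when spark-submit dies early. The client-side timeout idea can be sketched with a plain CompletableFuture standing in for the Netty promise (illustrative only — not the patch itself, and the method name is made up for the sketch):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// If the driver process dies before completing the handshake, the promise is
// never completed: an untimed get() blocks forever (the RUNNING hang above),
// while a bounded wait turns the hang into a diagnosable error.
class ClientTimeoutSketch {
    static String awaitDriverHello(CompletableFuture<String> hello,
                                   long timeoutMs) throws Exception {
        try {
            return hello.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new RuntimeException(
                "Timed out waiting for the remote driver to connect", e);
        }
    }
}
```

As Rui Li notes above, a bounded wait is only a safety net; the open question of why the server-side timeout does not fire still deserves its own answer.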
[jira] [Updated] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13966: --- Attachment: HIVE-13966.5.patch > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Mohit Sabharwal >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.4.patch, HIVE-13966.4.patch, > HIVE-13966.5.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation fails (in step 2), we still add an entry to the notification > log. Found this issue in testing. > This is still ok, as it is a false positive. > If the operation succeeds but adding to the notification log fails, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we do not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
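The four-step sequence in the description can be sketched in plain Java (illustrative only, not the metastore API): appending to the notification log unconditionally in step 4 records events for operations that rolled back in step 3.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for steps 1-4 above. "committed" collapses openTransaction /
// operation / commit-or-rollback into one flag so the ordering problem is
// visible in isolation.
class NotificationSketch {
    final List<String> eventLog = new ArrayList<>();

    // Step 4 as described: unconditional append -> false positives on rollback.
    boolean runBuggy(String event, boolean operationSucceeds) {
        boolean committed = operationSucceeds;
        eventLog.add(event);
        return committed;
    }

    // Only log what actually committed, avoiding the false positive.
    boolean runFixed(String event, boolean operationSucceeds) {
        boolean committed = operationSucceeds;
        if (committed) {
            eventLog.add(event);
        }
        return committed;
    }
}
```

The false-negative direction (operation committed but the log write failed) is the harder half and is beyond a flag-level sketch; making the operation and the log write atomic, e.g. within one transaction, would be the natural way to close it.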
[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652892#comment-15652892 ] Ferdinand Xu commented on HIVE-15112: - A PR has been sent out for preview. The QTest part is not ready yet, and it is also pending the uncommitted patch HIVE-14815. > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Like HIVE-14815, we need to support the Parquet vectorized reader for the struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652889#comment-15652889 ] ASF GitHub Bot commented on HIVE-15112: --- GitHub user winningsix opened a pull request: https://github.com/apache/hive/pull/113 HIVE-15112 Implement Parquet vectorization reader for Struct type Patch includes: 1. support for struct type 2. UT refine To be done: QTest for struct type You can merge this pull request into a Git repository by running: $ git pull https://github.com/winningsix/hive complex_types Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/113.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #113 commit 37f50c7629b5ef2a8fb6e9f63caaec6223abf308 Author: Ferdinand XuDate: 2016-09-01T22:15:31Z HIVE-14815: Support vectorization for Parquet clean code and add qtest Refine code Clean code Clean up code Clean up clean up code Update qfile output files Clean up code Address comments Avoid creating new HiveDecimalWritable object Address more comments Remove unused imports Address further comments Fix NPE Fix for failed cases commit 891b219838e4978f2eb4d41c0016214d44cc1bb7 Author: Ferdinand Xu Date: 2016-11-07T06:10:16Z HIVE-15112: Implement Parquet vectorization reader for Complex types commit 26e513a2ac67dcfb05875e6ad7ba07f158be9073 Author: Ferdinand Xu Date: 2016-11-09T19:49:46Z Refactor UT > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Like HIVE-14815, we need support Parquet vectorized reader for struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers
[ https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652874#comment-15652874 ] Hive QA commented on HIVE-14453: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838280/HIVE-14453.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2058/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2058/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2058/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838280 - PreCommit-HIVE-Build > refactor physical writing of ORC data and metadata to FS from the logical > writers > - > > Key: HIVE-14453 > URL: https://issues.apache.org/jira/browse/HIVE-14453 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, > HIVE-14453.patch > > > ORC data doesn't have to go directly into an HDFS stream via buffers, it can > go somewhere else (e.g. a write-thru cache, or an addressable system that > doesn't require the stream blocks to be held in memory before writing them > all together). 
> To that effect, it would be nice to abstract the data block/metadata > structure creation from the physical file concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10924) add support for MERGE statement
[ https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652813#comment-15652813 ] Eugene Koifman commented on HIVE-10924: --- https://www.postgresql.org/message-id/1208372338.4259.202.ca...@ebony.site > add support for MERGE statement > --- > > Key: HIVE-10924 > URL: https://issues.apache.org/jira/browse/HIVE-10924 > Project: Hive > Issue Type: New Feature > Components: Query Planning, Query Processor, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > add support for > MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652765#comment-15652765 ] Hive QA commented on HIVE-15173: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838268/HIVE-15173.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 84 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_deep_filters] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal_native] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_pad_convert] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_10_0] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing_no_cbo] (batchId=58) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[metadata_only_queries] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_file_dump] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_decimal] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_windowing_expressions] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] (batchId=68) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_10_0] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_mapjoin] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_distinct] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_expressions] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_multipartitioning] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_navfn] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_ntile] (batchId=20) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_order_null] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_range_multiorder] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_rank] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_streaming] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_udaf] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_windowspec] (batchId=16) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=131) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=131) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=133) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mapjoin_decimal] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_predicate_pushdown] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[parquet_predicate_pushdown] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
[jira] [Updated] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14089: Description: HIVE-13617 is causing MiniLlapCliDriver following test failures {code} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join {code} was: HIVE-13617 is causing MiniLlapCliDriver following test failures {code} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join {code} Note to self - need to add multi-stripe test, and also test complex types with some nulls so that present stream is not suppressed. > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers
[ https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14453: Attachment: HIVE-14453.02.patch I'd like to revive this patch for HIVE-15147 (where we want to reencode parts of a text file to ORC for caching, and cache columns separately from each other). [~prasanth_j] can you please review? This is a refactoring, so no real logic changes as far as I see. > refactor physical writing of ORC data and metadata to FS from the logical > writers > - > > Key: HIVE-14453 > URL: https://issues.apache.org/jira/browse/HIVE-14453 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, > HIVE-14453.patch > > > ORC data doesn't have to go directly into an HDFS stream via buffers, it can > go somewhere else (e.g. a write-thru cache, or an addressable system that > doesn't require the stream blocks to be held in memory before writing them > all together). > To that effect, it would be nice to abstract the data block/metadata > structure creating from the physical file concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14990) run all tests for MM tables and fix the issues that are found
[ https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648980#comment-15648980 ] Sergey Shelukhin edited comment on HIVE-14990 at 11/10/16 1:58 AM: --- Looked at all the remaining tests. Out of 749 failed tests, about 100 failures and diffs are (or might be, at least) relevant. Many of them are similar, e.g. missing stats, but I don't know if they are missing stats for the same reason. Many, e.g. exim, may be due to unsupported/path-dependent scenarios that were not immediately obvious. Not sure why TestSparkCliDriver fails. It fails in client init for me with no useful logs (it logs that the child process exited with 127, then times out). I think we'll fix that during branch merge, if still broken. Crossing out ones that are actually irrelevant {panel} TestCliDriver: authorization_insert create_default_prop exim_04_evolved_parts -exim_11_managed_external- -exim_12_external_location- -exim_15_external_part- -exim_18_part_external- -exim_19_00_part_external_location- -exim_19_part_external_location- insert1 list_bucket_dml_8 mm_all orc_createas1 ppd_join4 stats_empty_dyn_part stats_partscan_1_23 temp_table_display_colstats_tbllvl temp_table_options1 vector_udf2 list_bucket_dml_14,list_bucket_* llap_acid insert_overwrite_directory2 authorization_load autoColumnStats_9 create_like drop_database_removes_partition_dirs drop_table_removes_partition_dirs index_auto_update exim_01_nonpart,exim_02_part,exim_04_all_part,exim_05_some_part,exim_06_one_part,exim_16_part_external,exim_17_part_managed,exim_20_part_managed_location load_overwrite materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_int_type_promotion orc_vectorization_ppd parquet_join2 partition_wise_fileformat,partition_wise_fileformat3 repl_1_drop,repl_3_exim_metadata sample6 sample_islocalmode_hook show_tablestatus smb_bucket_1 smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 stats_list_bucket stats_noscan_2 
symlink_text_input_format temp_table_precedence offset_limit_global_optimizer rand_partitionpruner2 TestEncryptedHDFSCliDriver: encryption_ctas encryption_drop_partition encryption_insert_values encryption_join_unencrypted_tbl encryption_load_data_to_encrypted_tables MiniLlapLocal: exchgpartition2lel cbo_rp_lineage2 create_merge_compressed deleteAnalyze delete_where_no_match delete_where_non_partitioned dynpart_sort_optimization escape2 insert1 lineage2 lineage3 orc_llap schema_evol_orc_nonvec_part schema_evol_orc_vec_part schema_evol_text_nonvec_part schema_evol_text_vec_part schema_evol_text_vecrow_part smb_mapjoin_6 tez_dml union_fast_stats update_all_types update_tmp_table update_where_no_match update_where_non_partitioned vector_outer_join1 vector_outer_join4 MiniLlap: load_fs2 orc_ppd_basic external_table_with_space_in_location_path file_with_header_footer import_exported_table schemeAuthority,schemeAuthority2 table_nonprintable Minimr: infer_bucket_sort_map_operators infer_bucket_sort_merge infer_bucket_sort_reducers_power_two root_dir_external_table scriptfile1 TestSymlinkTextInputFormat#testCombine TestJdbcWithLocalClusterSpark, etc. {panel} was (Author: sershe): Looked at all the remaining tests. Out of 749 failed tests, about 100 failures and diffs are (or might be, at least) relevant. Many of them are similar, e.g. missing stats, but I don't know if they are missing stats for the same reason. Many, e.g. exim, may be due to unsupported/path-dependent scenarios that were not immediately obvious. Not sure why TestSparkCliDriver fails. It fails in client init for me with no useful logs (it logs that the child process exited with 127, then times out). I think we'll fix that during branch merge, if still broken. 
{noformat} TestCliDriver: authorization_insert create_default_prop exim_04_evolved_parts exim_11_managed_external exim_12_external_location exim_15_external_part exim_18_part_external exim_19_00_part_external_location exim_19_part_external_location insert1 list_bucket_dml_8 mm_all orc_createas1 ppd_join4 stats_empty_dyn_part stats_partscan_1_23 temp_table_display_colstats_tbllvl temp_table_options1 vector_udf2 list_bucket_dml_14,list_bucket_* llap_acid insert_overwrite_directory2 authorization_load autoColumnStats_9 create_like drop_database_removes_partition_dirs drop_table_removes_partition_dirs index_auto_update exim_01_nonpart,exim_02_part,exim_04_all_part,exim_05_some_part,exim_06_one_part,exim_16_part_external,exim_17_part_managed,exim_20_part_managed_location load_overwrite materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_int_type_promotion orc_vectorization_ppd parquet_join2 partition_wise_fileformat,partition_wise_fileformat3 repl_1_drop,repl_3_exim_metadata sample6 sample_islocalmode_hook show_tablestatus smb_bucket_1 smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 stats_list_bucket
[jira] [Updated] (HIVE-15174) Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP authentication too
[ https://issues.apache.org/jira/browse/HIVE-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated HIVE-15174: - Description: Hive has implemented Kerberos principal mapping for authentication; the same should be implemented for LDAP authentication. Both Kerberos and LDAP are using Active Directory as a backend to store principals (in many cases), so it's natural to think this should work for LDAP too. The fact that this mapping works only for Kerberos and not for LDAP principals breaks authentication in our organization. was: Hive has implemented Kerberos principal mapping for authentication; the same should be implemented for LDAP authentication. Both Kerberos and LDAP are using Active Directory as a backend to store principals (in many cases), so it's natural to think this should work for LDAP too. The fact that IMPALA-2660 works only for Kerberos and not for LDAP principals breaks authentication in our organization. > Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP > authentication too > - > > Key: HIVE-15174 > URL: https://issues.apache.org/jira/browse/HIVE-15174 > Project: Hive > Issue Type: Bug > Components: Authentication, HiveServer2, Security >Affects Versions: 1.1.1, 1.2.1, 2.1.0 > Environment: Hive 1.1; Hadoop 2.6 >Reporter: Ruslan Dautkhanov > Labels: security > > Hive has implemented Kerberos principal mapping for authentication; the same > should be implemented for LDAP authentication. > Both Kerberos and LDAP are using Active Directory as a backend to store > principals (in many cases), so it's natural to think this should work for > LDAP too. > The fact that this mapping works only for Kerberos and not for LDAP > principals breaks authentication in our organization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
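For context on the request above: auth_to_local rules live in core-site.xml under hadoop.security.auth_to_local and rewrite a Kerberos principal into a short local user name. A minimal illustrative fragment follows — the realm EXAMPLE.COM is a placeholder, not taken from the issue:

```xml
<!-- Illustrative only: EXAMPLE.COM is a hypothetical realm. -->
<!-- Maps user/host@EXAMPLE.COM and user@EXAMPLE.COM to the short name "user". -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    RULE:[2:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    DEFAULT
  </value>
</property>
```

The issue asks HiveServer2 to apply these same rules to LDAP-authenticated principals, instead of only to Kerberos ones.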
[jira] [Updated] (HIVE-15174) Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP authentication too
[ https://issues.apache.org/jira/browse/HIVE-15174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated HIVE-15174: - Labels: security (was: ) > Respect auth_to_local rules from hdfs configs (core-site.xml) for LDAP > authentication too > - > > Key: HIVE-15174 > URL: https://issues.apache.org/jira/browse/HIVE-15174 > Project: Hive > Issue Type: Bug > Components: Authentication, HiveServer2, Security >Affects Versions: 1.1.1, 1.2.1, 2.1.0 > Environment: Hive 1.1; Hadoop 2.6 >Reporter: Ruslan Dautkhanov > Labels: security > > Hive has implemented Kerberos principal mapping for authentication; the same > should be implemented for LDAP authentication. > Both Kerberos and LDAP are using Active Directory as a backend to store > principals (in many cases), so it's natural to think this should work for > LDAP too. > The fact that IMPALA-2660 works only for Kerberos and not for LDAP principals > breaks authentication in our organization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-12891: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0. Thanks [~zsombor.klara] for the patch. If you want it committed to 2.1.1, please provide a patch for that branch as well. > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
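The failure mode in HIVE-12891 comes from a relative java.io.tmpdir producing a URI such as file:./tmp/..., i.e. a URI with a scheme but a relative path, which Hadoop's Path constructor rejects. A minimal sketch of the general workaround — absolutize the directory before building a file: URI — is below; the "./tmp" value and class name are hypothetical, not taken from the Hive patch:

```java
import java.io.File;
import java.net.URI;

public class TmpDirSketch {
    // Absolutize a possibly-relative tmpdir before using it to build a file: URI.
    // A relative input like "./tmp" would otherwise yield "file:./tmp", which
    // URI-consuming code (e.g. Hadoop's Path) treats as malformed.
    static URI scratchUri(String tmpDir) {
        File abs = new File(tmpDir).getAbsoluteFile();
        return abs.toURI(); // always file:/absolute/path/..., never file:./...
    }

    public static void main(String[] args) {
        URI u = scratchUri("./tmp"); // hypothetical relative setting
        System.out.println(u.isAbsolute() && u.getPath().startsWith("/"));
    }
}
```

The actual fix in the patch is in SessionState; this sketch only illustrates why resolving the directory to an absolute path sidesteps the URISyntaxException.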
[jira] [Updated] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15173: Attachment: HIVE-15173.patch Simple patch with testcase. > Allow dec as an alias for decimal > - > > Key: HIVE-15173 > URL: https://issues.apache.org/jira/browse/HIVE-15173 > Project: Hive > Issue Type: Sub-task > Components: Parser >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-15173.patch > > > Standard allows dec as an alias for decimal -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15173) Allow dec as an alias for decimal
[ https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15173: Status: Patch Available (was: Open) > Allow dec as an alias for decimal > - > > Key: HIVE-15173 > URL: https://issues.apache.org/jira/browse/HIVE-15173 > Project: Hive > Issue Type: Sub-task > Components: Parser >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-15173.patch > > > Standard allows dec as an alias for decimal -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap
[ https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652522#comment-15652522 ] Hive QA commented on HIVE-15135: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838234/HIVE-15135.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10157 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=77) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=138) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=146) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver (batchId=151) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2056/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2056/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2056/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838234 - PreCommit-HIVE-Build > Add an llap mode which fails if queries cannot run in llap > -- > > Key: HIVE-15135 > URL: https://issues.apache.org/jira/browse/HIVE-15135 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch > > > ALL currently ends up launching new containers for queries which cannot run > in llap. > There should be a mode where these queries don't run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic
[ https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652495#comment-15652495 ] Pengcheng Xiong commented on HIVE-13557: I have left some comments on RB; the patch can be improved. > Make interval keyword optional while specifying DAY in interval arithmetic > -- > > Key: HIVE-13557 > URL: https://issues.apache.org/jira/browse/HIVE-13557 > Project: Hive > Issue Type: Sub-task > Components: Types >Reporter: Ashutosh Chauhan >Assignee: Zoltan Haindrich > Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, > HIVE-13557.1.patch > > > Currently we support expressions like: {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) - INTERVAL '30' DAY) AND > DATE('2000-01-31') > {code} > We should support: > {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND > DATE('2000-01-31') > {code} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652421#comment-15652421 ] Hive QA commented on HIVE-15164: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838231/HIVE-15164.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] (batchId=89) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2055/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2055/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2055/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838231 - PreCommit-HIVE-Build > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch, HIVE-15164.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-15149: -- Component/s: Hooks > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement > Components: Hooks >Reporter: Jason Dere >Assignee: Li Lu > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-15149: - Assignee: Jason Dere (was: Li Lu) > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652357#comment-15652357 ] Xuefu Zhang edited comment on HIVE-15168 at 11/9/16 11:22 PM: -- Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. Also, please attach the patch here as well. was (Author: xuefuz): Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness, but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may result in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652357#comment-15652357 ] Xuefu Zhang commented on HIVE-15168: Could we have a few words describing the problem and the fix? It's not obvious while reading the code diff. Thanks. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness, but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may result in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
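The race described in the issue above is easiest to see with a stripped-down event source that does not replay past events to late subscribers. All names below are hypothetical stand-ins, not the actual JobHandleImpl API: if the job can reach QUEUED between submission and addListener, the state change is silently lost, which is why registering listeners before submitting matters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a job-state event source that does NOT
// replay already-fired events to listeners registered later.
class JobEvents {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    void addListener(Consumer<String> l) { listeners.add(l); }
    void fire(String state) { listeners.forEach(l -> l.accept(state)); }
}

public class ListenerRaceSketch {
    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();

        // Buggy ordering: the QUEUED event fires (job already accepted)
        // before the listener is attached, so the state change is lost.
        JobEvents late = new JobEvents();
        late.fire("QUEUED");
        late.addListener(seen::add);
        System.out.println(seen); // []

        // Fixed ordering: attach the listener first, then submit.
        JobEvents early = new JobEvents();
        early.addListener(seen::add);
        early.fire("QUEUED");
        System.out.println(seen); // [QUEUED]
    }
}
```

Note that Netty futures themselves do notify listeners added after completion; the flakiness here concerns state events delivered over the RPC channel before the handle is wired up, which this non-replaying sketch models.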
[jira] [Assigned] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned HIVE-15149: Assignee: Li Lu (was: Jason Dere) > Add additional information to ATSHook for Tez UI > > > Key: HIVE-15149 > URL: https://issues.apache.org/jira/browse/HIVE-15149 > Project: Hive > Issue Type: Improvement >Reporter: Jason Dere >Assignee: Li Lu > Attachments: HIVE-15149.1.patch > > > Additional query details wanted for TEZ-3530. The additional details > discussed include the following: > Publish the following info ( in addition to existing bits published today): > Application Id to which the query was submitted (primary filter) > DAG Id (primary filter) > Hive query name (primary filter) > Hive Configs (everything a set command would provide except for sensitive > credential info) > Potentially publish source of config i.e. set in hive query script vs > hive-site.xml, etc. > Which HiveServer2 the query was submitted to > *Which IP/host the query was submitted from - not sure what filter support > will be available. > Which execution mode the query is running in (primary filter) > What submission mode was used (cli/beeline/jdbc, etc) > User info ( running as, actual end user, etc) - not sure if already present > Perf logger events. The data published should be able to create a timeline > view of the query i.e. actual submission time, query compile timestamps, > execution timestamps, post-exec data moves, etc. > Explain plan with enough details for visualizing. > Databases and tables being queried (primary filter) > Yarn queue info (primary filter) > Caller context (primary filter) > Original source i.e. submitter > Thread info in HS2 if needed ( I believe Vikram may have added this earlier ) > Query time taken (with filter support ) > Additional context info e.g. llap instance name and appId if required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14582) Add trunc(numeric) udf
[ https://issues.apache.org/jira/browse/HIVE-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652348#comment-15652348 ] Ashutosh Chauhan commented on HIVE-14582: - Any updates [~chinnalalam] ? > Add trunc(numeric) udf > -- > > Key: HIVE-14582 > URL: https://issues.apache.org/jira/browse/HIVE-14582 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Ashutosh Chauhan >Assignee: Chinna Rao Lalam > Attachments: HIVE-14582.patch > > > https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions200.htm -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic
[ https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652345#comment-15652345 ] Ashutosh Chauhan commented on HIVE-13557: - Looks good to me. [~pxiong] do you also want to take a look? > Make interval keyword optional while specifying DAY in interval arithmetic > -- > > Key: HIVE-13557 > URL: https://issues.apache.org/jira/browse/HIVE-13557 > Project: Hive > Issue Type: Sub-task > Components: Types >Reporter: Ashutosh Chauhan >Assignee: Zoltan Haindrich > Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, > HIVE-13557.1.patch > > > Currently we support expressions like: {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) - INTERVAL '30' DAY) AND > DATE('2000-01-31') > {code} > We should support: > {code} > WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND > DATE('2000-01-31') > {code} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15149) Add additional information to ATSHook for Tez UI
[ https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-15149: -- Attachment: HIVE-15149.1.patch
Work in progress; added the following fields to the ATS event:
Hive query name
Hive configs
HiveServer2 IP address
Client IP
Execution mode (mr/tez/llap/spark)
Hive instance type (cli/hs2)
Tables read/written
Fixed thread name (originally was "ATSHook thread")
> Add additional information to ATSHook for Tez UI
>
> Key: HIVE-15149
> URL: https://issues.apache.org/jira/browse/HIVE-15149
> Project: Hive
> Issue Type: Improvement
> Reporter: Jason Dere
> Assignee: Jason Dere
> Attachments: HIVE-15149.1.patch
>
> Additional query details wanted for TEZ-3530. The additional details discussed include the following:
> Publish the following info (in addition to existing bits published today):
> Application Id to which the query was submitted (primary filter)
> DAG Id (primary filter)
> Hive query name (primary filter)
> Hive Configs (everything a set command would provide except for sensitive credential info)
> Potentially publish source of config, i.e. set in hive query script vs hive-site.xml, etc.
> Which HiveServer2 the query was submitted to
> Which IP/host the query was submitted from - not sure what filter support will be available.
> Which execution mode the query is running in (primary filter)
> What submission mode was used (cli/beeline/jdbc, etc.)
> User info (running as, actual end user, etc.) - not sure if already present
> Perf logger events. The data published should be able to create a timeline view of the query, i.e. actual submission time, query compile timestamps, execution timestamps, post-exec data moves, etc.
> Explain plan with enough details for visualizing.
> Databases and tables being queried (primary filter)
> Yarn queue info (primary filter)
> Caller context (primary filter)
> Original source, i.e. submitter
> Thread info in HS2 if needed (I believe Vikram may have added this earlier)
> Query time taken (with filter support)
> Additional context info, e.g. llap instance name and appId if required.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652319#comment-15652319 ] Hive QA commented on HIVE-15171: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838224/HIVE-15171.000.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10617 tests executed *Failed tests:* {noformat} TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=123) [ptf_seqfile.q,union_remove_23.q,parallel_join0.q,union_remove_9.q,join_thrift.q,skewjoinopt14.q,vectorized_mapjoin.q,union4.q,auto_join5.q,vectorized_shufflejoin.q,smb_mapjoin_20.q,groupby8_noskew.q,auto_sortmerge_join_10.q,groupby11.q,union_remove_16.q] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2054/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2054/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2054/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838224 - PreCommit-HIVE-Build > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652208#comment-15652208 ] Sergey Shelukhin commented on HIVE-15162: - +1 > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap
[ https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-15135: -- Attachment: HIVE-15135.02.patch Updated patch with the name changed to llap_only. Also modified MiniLlapLocal to use this mode and MiniLlap to use all. > Add an llap mode which fails if queries cannot run in llap > -- > > Key: HIVE-15135 > URL: https://issues.apache.org/jira/browse/HIVE-15135 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch > > > ALL currently ends up launching new containers for queries which cannot run > in llap. > There should be a mode where these queries don't run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652169#comment-15652169 ] Hive QA commented on HIVE-14089: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838219/HIVE-14089.11.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2053/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2053/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2053/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838219 - PreCommit-HIVE-Build > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15085) Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark
[ https://issues.apache.org/jira/browse/HIVE-15085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652156#comment-15652156 ] Siddharth Seth commented on HIVE-15085: --- [~prasanth_j] - could you please take a look. The test failures are unrelated - tracked under HIVE-15058. > Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark > - > > Key: HIVE-15085 > URL: https://issues.apache.org/jira/browse/HIVE-15085 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15085.01.patch, HIVE-15085.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-15164: -- Attachment: HIVE-15164.02.patch Updated patch to fix a test failure. > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch, HIVE-15164.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-15172) Flaky test: TestSparkCliDriver.testCliDriver[limit_pushdown]
[ https://issues.apache.org/jira/browse/HIVE-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-15172. Resolution: Fixed I have just pushed a fix for it. I had missed updating the golden file. > Flaky test: TestSparkCliDriver.testCliDriver[limit_pushdown] > > > Key: HIVE-15172 > URL: https://issues.apache.org/jira/browse/HIVE-15172 > Project: Hive > Issue Type: Sub-task > Components: Tests >Reporter: Jason Dere > > Looks like this has been failing on recent precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652004#comment-15652004 ] Jason Dere commented on HIVE-15162: --- [~ashutoshc] [~sershe] can you take a look? > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15162) NPE in ATSHook
[ https://issues.apache.org/jira/browse/HIVE-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652002#comment-15652002 ] Jason Dere commented on HIVE-15162: --- The 4 failing tests are all listed as issues under HIVE-15058. > NPE in ATSHook > -- > > Key: HIVE-15162 > URL: https://issues.apache.org/jira/browse/HIVE-15162 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15162.1.patch > > > {noformat} > 2016-11-08T14:21:15,025 INFO [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(156)) - Failed to submit plan to ATS: > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:141) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-15171: - Status: Patch Available (was: Open) > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15171) set SparkTask's jobID with application id
[ https://issues.apache.org/jira/browse/HIVE-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HIVE-15171: - Attachment: HIVE-15171.000.patch > set SparkTask's jobID with application id > - > > Key: HIVE-15171 > URL: https://issues.apache.org/jira/browse/HIVE-15171 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.1.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: HIVE-15171.000.patch > > > Set SparkTask's jobID to the application id. The information will be useful to > monitor the Spark application in a hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14089: Attachment: HIVE-14089.11.patch The trivial out file change (no inputs -> all inputs). > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.WIP.2.patch, > HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl
[ https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651931#comment-15651931 ] Zoltan Haindrich commented on HIVE-14998: - [~thejas] can you please take a look at these changes? > Fix and update test: TestPluggableHiveSessionImpl > - > > Key: HIVE-14998 > URL: https://issues.apache.org/jira/browse/HIVE-14998 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-14998.1.patch > > > this test either prints an exception to stdout ... or not - in its > current form it isn't really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651926#comment-15651926 ] Zoltan Haindrich commented on HIVE-15161: - [~pxiong] can you please take a look at these changes? And one more thing: there are a few cases where "column_stats" is present but "basic_stats" is false - and hence omitted... they seem a bit odd - should I look into these? {code} autoColumnStats_4.q.out: COLUMN_STATS_ACCURATE {\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\"}} {code} > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns... this can be addressed; the org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
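The flakiness mentioned in the last bullet - map entry order leaking into the serialized stats - is easy to reproduce in miniature. The sketch below is Python rather than Hive's Java, purely for brevity; the analogous fix on the Java side is asking the serializer for a canonical (sorted) map-entry order, which the org.json API did not make easy.

```python
import json

# Two logically identical COLUMN_STATS maps built in different insertion
# orders, mimicking the nondeterministic column order that made q-file
# output comparisons flaky.
stats_a = {"b": "true", "a": "true"}
stats_b = {"a": "true", "b": "true"}

# A plain dump preserves insertion order, so the two strings differ:
assert json.dumps(stats_a) != json.dumps(stats_b)

# Sorting the keys yields one canonical serialization for both:
canonical = json.dumps(stats_a, sort_keys=True)
assert canonical == json.dumps(stats_b, sort_keys=True)
print(canonical)  # {"a": "true", "b": "true"}
```

With Jackson, a serializer configured to order map entries by key achieves the same deterministic output regardless of how the stats map was populated.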
[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken
[ https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651907#comment-15651907 ] Sergey Shelukhin commented on HIVE-14089: - vector_complex_join is a trivial explain change; the rest are known failures. [~prasanth_j] can you please review? thanks > complex type support in LLAP IO is broken > -- > > Key: HIVE-14089 > URL: https://issues.apache.org/jira/browse/HIVE-14089 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, > HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, > HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, > HIVE-14089.10.patch, HIVE-14089.WIP.2.patch, HIVE-14089.WIP.3.patch, > HIVE-14089.WIP.patch > > > HIVE-13617 is causing MiniLlapCliDriver following test failures > {code} > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all > org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join > {code} > Note to self - need to add multi-stripe test, and also test complex types > with some nulls so that present stream is not suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15085) Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark
[ https://issues.apache.org/jira/browse/HIVE-15085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651903#comment-15651903 ] Hive QA commented on HIVE-15085: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12837816/HIVE-15085.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10632 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2051/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2051/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2051/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12837816 - PreCommit-HIVE-Build > Reduce the memory used by unit tests, MiniCliDriver, MiniLlapLocal, MiniSpark > - > > Key: HIVE-15085 > URL: https://issues.apache.org/jira/browse/HIVE-15085 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15085.01.patch, HIVE-15085.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651862#comment-15651862 ] Sahil Takiar commented on HIVE-14271: - Yes, agree with Steve. Sergio summarized it well. Sounds like this is a reasonable change; [~spena], can you re-open this JIRA? > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it has finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may perform > better, but we should take care of the cleanup part in case > of write errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
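Option (a) from the issue - committing with an explicit copy rather than a rename - can be sketched with local files standing in for scratch and final paths. This is an illustrative toy, not FileSinkOperator's real API; the function name and parameters are invented.

```python
import os
import shutil
import tempfile

def commit(out_path, final_path, use_copy):
    """Publish a finished temporary file at its final location.

    On HDFS a rename is a cheap metadata operation, so a move is fine.
    On S3-like stores a 'rename' is really a copy plus delete of every
    object, so option (a) copies explicitly and leaves the temporary
    file to be removed by a later cleanup stage.
    """
    if use_copy:
        shutil.copyfile(out_path, final_path)  # option (a): copy, no rename
    else:
        os.replace(out_path, final_path)       # HDFS-style atomic move

tmpdir = tempfile.mkdtemp()
out = os.path.join(tmpdir, "000000_0.tmp")
final = os.path.join(tmpdir, "000000_0")
with open(out, "w") as f:
    f.write("row1\nrow2\n")
commit(out, final, use_copy=True)
print(os.path.exists(final), os.path.exists(out))  # True True: temp kept for cleanup
```

The copy path trades an extra delete (during cleanup) for never depending on rename semantics the store does not provide.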
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651874#comment-15651874 ] Zoltan Haindrich commented on HIVE-15161: - failures are unrelated: HIVE-15084 ; HIVE-15115 ; HIVE-15116 > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns... this can be addressed; the org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña reopened HIVE-14271: > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it has finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may perform > better, but we should take care of the cleanup part in case > of write errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14975) Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz
[ https://issues.apache.org/jira/browse/HIVE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-14975: Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-15058) > Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz > -- > > Key: HIVE-14975 > URL: https://issues.apache.org/jira/browse/HIVE-14975 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 >Reporter: Gopal V > > {code} > 2016-10-14T22:51:32,947 INFO [main] beeline.TestBeelineArgParsing: Add > /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar > for the driver class org.postgresql.Driver > Fail to add local jar due to the exception:java.util.zip.ZipException: error > in opening zip file > error in opening zip file > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14975) Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz
[ https://issues.apache.org/jira/browse/HIVE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-14975. - Resolution: Duplicate > Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz > -- > > Key: HIVE-14975 > URL: https://issues.apache.org/jira/browse/HIVE-14975 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Gopal V > > {code} > 2016-10-14T22:51:32,947 INFO [main] beeline.TestBeelineArgParsing: Add > /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar > for the driver class org.postgresql.Driver > Fail to add local jar due to the exception:java.util.zip.ZipException: error > in opening zip file > error in opening zip file > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15164) Change default RPC port for llap to be a dynamic port
[ https://issues.apache.org/jira/browse/HIVE-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651786#comment-15651786 ] Hive QA commented on HIVE-15164: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838092/HIVE-15164.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10618 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=126) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.llap.daemon.impl.TestLlapDaemonProtocolServerImpl.test (batchId=277) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2050/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2050/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2050/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838092 - PreCommit-HIVE-Build > Change default RPC port for llap to be a dynamic port > - > > Key: HIVE-15164 > URL: https://issues.apache.org/jira/browse/HIVE-15164 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-15164.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651757#comment-15651757 ] Sergio Peña commented on HIVE-15114: [~stakiar] No tests will be executed with this patch because the optimization only happens for blobstore, and we don't have automated tests for blobstore optimizations. It will be good to run a full set of tests once I attach a final patch, but for now I think we can wait as ptest won't give us any feedback for the change. > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge output files. > At the end of the conditional job there is a MoveTask. 
> Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > It seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. The {{ConditionalResolverMergeFiles}} resolver says that one of three tasks will > be returned: a move task only, a merge task only, or a merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
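The redundancy the issue describes - a scratch-to-scratch move immediately followed by a scratch-to-table move - can be illustrated with a small sketch. The helper below is hypothetical: Hive represents these steps as MoveTask objects in a task DAG, not as tuples in a list, but the collapsing idea is the same.

```python
def collapse_moves(tasks):
    """Collapse a chained move(src, mid) + move(mid, dst) into one
    move(src, dst) - the effect HIVE-15114 is after when the
    intermediate path lives on a blobstore where a rename is a copy.

    tasks: list of ("move", src, dst) tuples in execution order.
    """
    merged = []
    for kind, src, dst in tasks:
        # If the previous move's destination feeds this move's source,
        # replace both with a single move from first source to final dest.
        if merged and kind == "move" and merged[-1][0] == "move" and merged[-1][2] == src:
            _, first_src, _ = merged.pop()
            merged.append(("move", first_src, dst))
        else:
            merged.append((kind, src, dst))
    return merged

plan = [("move", "/scratch/tmp1/000000_0", "/scratch/tmp2/000000_0"),
        ("move", "/scratch/tmp2/000000_0", "/warehouse/t/000000_0")]
print(collapse_moves(plan))
# [('move', '/scratch/tmp1/000000_0', '/warehouse/t/000000_0')]
```

On S3 each eliminated intermediate move saves a full copy of the output data, which is why the extra MoveTask matters there and not on HDFS.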
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651741#comment-15651741 ] Sahil Takiar commented on HIVE-15114: - [~spena] should we "Submit Patch" so we can get some test results from Hive QA? > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge output files. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > It seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. 
The {{ConditionalResolverMergeFiles}} resolver says that one of three tasks will > be returned: a move task only, a merge task only, or a merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651701#comment-15651701 ] Pengcheng Xiong commented on HIVE-15023: [~kgyrtkirk], thanks for finding this out. I have pushed the patch to the master. Thanks again. > SimpleFetchOptimizer needs to optimize limit=0 > -- > > Key: HIVE-15023 > URL: https://issues.apache.org/jira/browse/HIVE-15023 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15023.01.patch, HIVE-15023.02.patch > > > on current master > {code} > hive> explain select key from src limit 0; > OK > STAGE DEPENDENCIES: > Stage-0 is a root stage > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: 0 > Processor Tree: > TableScan > alias: src > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: key (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 0 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > ListSink > Time taken: 7.534 seconds, Fetched: 20 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
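The optimization requested above - recognizing that a plan whose outermost limit is 0 can never produce a row, and short-circuiting the whole operator tree - can be sketched as a toy rule. The dict-based plan shape below is invented for illustration; Hive's SimpleFetchOptimizer operates on its own operator classes.

```python
def optimize_limit_zero(plan):
    """Toy limit-0 short-circuit: when the fetch limit is 0, drop the
    TableScan/Select/Limit operator tree entirely and return an empty
    fetch, so no table data is ever read. (Illustrative sketch only.)
    """
    if plan.get("limit") == 0:
        return {"limit": 0, "tree": None}  # nothing to scan or select
    return plan

before = {"limit": 0, "tree": ["TableScan src", "Select key", "Limit 0"]}
after = optimize_limit_zero(before)
print(after)  # {'limit': 0, 'tree': None}
```

The explain plan quoted in the issue shows exactly the "before" shape: a full TableScan/Select pipeline feeding a Limit of 0, all of which is dead work.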
[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651657#comment-15651657 ] Sergio Peña commented on HIVE-15114: [~sershe] What do you think about this approach? It merges two MoveTasks into one, only for blobstore paths. > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first Move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a > conditional task which launches a job to merge files at the end of the query. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > Seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. 
The {{ConditionalResolverMergeFiles}} says that one of three tasks will > be returned: move task only, merge task only, merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators
[ https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-15114: --- Attachment: HIVE-15114.WIP.1.patch Attaching a patch that is work-in-progress for an early review. I need to add some unit tests. [~stakiar] The patch uses a new dispatcher that is executed during physical optimization, and it looks for a ConditionalTask to do the optimization. Questions I have: 1. Should we move the optimization to the {{GenMapRedUtils.createMRWorkForMergingFiles}} instead? 2. Should we look for any MoveTask that links to another MoveTask on the whole plan instead of just focusing on the ConditionalTask? > Remove extra MoveTask operators > --- > > Key: HIVE-15114 > URL: https://issues.apache.org/jira/browse/HIVE-15114 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sergio Peña > Attachments: HIVE-15114.WIP.1.patch > > > When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES > ...}}) an extraneous {{MoveTask}} is created. > This is problematic when the scratch directory is on S3 since renames require > copying the entire dataset. > For simple queries (like the one above), there are two MoveTasks. The first > one moves the output data from one file in the scratch directory to another > file in the scratch directory. The second MoveTask moves the data from the > scratch directory to its final table location. > The first MoveTask should not be necessary. The goal of this JIRA is to > remove it. This should help improve performance when running on S3. > It seems that the first Move might be caused by a dependency resolution > problem in the optimizer, where a dependent task doesn't get properly removed > when the task it depends on is filtered by a condition resolver. > A dummy {{MoveTask}} is added in the > {{GenMapRedUtils.createMRWorkForMergingFiles}} method. 
This method creates a > conditional task which launches a job to merge files at the end of the query. > At the end of the conditional job there is a MoveTask. > Even though Hive decides that the conditional merge job is not needed, it > seems the MoveTask is still added to the plan. > Seems this extra {{MoveTask}} may have been added intentionally. Not sure why > yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will > be returned: move task only, merge task only, merge task followed by a move > task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
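The collapse the WIP patch aims for can be sketched in miniature. The classes below are simplified stand-ins, not Hive's real Task/MoveTask API: two chained moves through the scratch directory fold into one move straight to the final location.

```java
// Hypothetical sketch of merging two chained MoveTasks into one. A move
// "src -> tmp" followed by "tmp -> dst" becomes "src -> dst", skipping the
// intermediate scratch-directory hop. Not Hive's actual task classes.
public class MoveTaskCollapser {
    static class MoveTask {
        String src, dst;
        MoveTask child;  // at most one follow-up move, for simplicity
        MoveTask(String src, String dst) { this.src = src; this.dst = dst; }
    }

    // Repeatedly fold a move whose destination feeds directly into the next move.
    static MoveTask collapse(MoveTask head) {
        while (head.child != null && head.dst.equals(head.child.src)) {
            head.dst = head.child.dst;       // jump straight to the final location
            head.child = head.child.child;   // drop the now-redundant task
        }
        return head;
    }

    public static void main(String[] args) {
        MoveTask first = new MoveTask("/scratch/a", "/scratch/b");
        first.child = new MoveTask("/scratch/b", "/warehouse/t");
        MoveTask merged = collapse(first);
        System.out.println(merged.src + " -> " + merged.dst);  // /scratch/a -> /warehouse/t
    }
}
```

A real implementation would also have to check that no other task depends on the intermediate path, which is part of why the dispatcher in the patch only targets the ConditionalTask case.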
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651643#comment-15651643 ] Hive QA commented on HIVE-15090: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838189/HIVE-15090.3-branch-2.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 10462 tests executed *Failed tests:* {noformat} TestJdbcWithMiniHA - did not produce a TEST-*.xml file (likely timed out) (batchId=494) TestJdbcWithMiniMr - did not produce a TEST-*.xml file (likely timed out) (batchId=491) TestMsgBusConnection - did not produce a TEST-*.xml file (likely timed out) (batchId=362) TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file (likely timed out) (batchId=484) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats (batchId=92) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table_use_metadata (batchId=109) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 (batchId=87) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_schema_evol_3a (batchId=97) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_null_optimizer (batchId=154) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in (batchId=99) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_ppd_basic (batchId=521) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner (batchId=539) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic (batchId=187) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_schema_evol_3a (batchId=198) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in (batchId=199) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant (batchId=183) 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all (batchId=200) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_between_in (batchId=233) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching (batchId=492) org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=487) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2049/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2049/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2049/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12838189 - PreCommit-HIVE-Build > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15093) S3-to-S3 Renames: Files should be moved individually rather than at a directory level
[ https://issues.apache.org/jira/browse/HIVE-15093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651540#comment-15651540 ] Sahil Takiar commented on HIVE-15093: - [~steve_l] thanks for your input. I'm happy to start looking into HADOOP-13600 if everyone agrees that is the better approach. I do have some questions though: * Is Hadoop 2.8+ released anywhere? I don't see artifacts published on Maven Central; Hive is currently using version 2.7.2. * HADOOP-13600 is targeted for Hadoop 2.9.0; do we know when that would be released? My main question is: if we do this in Hadoop, when will the optimization actually make it into Hive? [~ashutoshc] any chance you, or maybe someone on the PMC, could comment on this? In addition to Steve's concerns, [~yalovyyi] and [~poeppt] expressed similar concerns in earlier comments on this JIRA. > S3-to-S3 Renames: Files should be moved individually rather than at a > directory level > - > > Key: HIVE-15093 > URL: https://issues.apache.org/jira/browse/HIVE-15093 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15093.1.patch, HIVE-15093.2.patch, > HIVE-15093.3.patch, HIVE-15093.4.patch, HIVE-15093.5.patch, > HIVE-15093.6.patch, HIVE-15093.7.patch, HIVE-15093.8.patch, HIVE-15093.9.patch > > > Hive's MoveTask uses the Hive.moveFile method to move data within a > distributed filesystem as well as blobstore filesystems. > If the move is done within the same filesystem: > 1: If the source path is a subdirectory of the destination path, files will > be moved one by one using a threadpool of workers > 2: If the source path is not a subdirectory of the destination path, a single > rename operation is used to move the entire directory > The second option may not work well on blobstores such as S3. Renames are not > metadata operations and require copying all the data. 
Client connectors to > blobstores may not efficiently rename directories. Worst case, the connector > will copy each file one by one, sequentially rather than using a threadpool > of workers to copy the data (e.g. HADOOP-13600). > Hive already has code to rename files using a threadpool of workers, but this > only occurs in case number 1. > This JIRA aims to modify the code so that case 1 is triggered when copying > within a blobstore. The focus is on copies within a blobstore because > needToCopy will return true if the src and target filesystems are different, > in which case a different code path is triggered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
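The "case 1" behavior this JIRA wants to trigger for blobstores (per-file moves fanned out over a worker pool) looks roughly like the sketch below. The task body is a placeholder for a real FileSystem.rename() call; the class, paths, and method names are illustrative, not Hive's actual code.

```java
// Sketch of per-file renames over a thread pool instead of one
// directory-level rename. Each submitted task stands in for a
// fs.rename(new Path(srcDir, f), new Path(dstDir, f)) call.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRenameSketch {
    // Moves each file individually; returns the list of completed renames.
    static List<String> renameAll(List<String> files, String srcDir, String dstDir,
                                  int threads) throws InterruptedException {
        ConcurrentLinkedQueue<String> done = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String f : files) {
            pool.submit(() -> done.add(srcDir + "/" + f + " -> " + dstDir + "/" + f));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);  // real code must also surface task failures
        return new ArrayList<>(done);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> moved = renameAll(List.of("000000_0", "000001_0"),
                                       "/scratch/tmp", "s3a://bucket/warehouse/t", 4);
        System.out.println(moved.size() + " files renamed");
    }
}
```

On S3 each "rename" is a server-side COPY plus DELETE, so running them concurrently hides the per-object copy latency; that is the whole gain being debated in this thread.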
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651499#comment-15651499 ] Peter Vary commented on HIVE-15090: --- [~thejas] You are thinking like me :) ??Defining the exceptions that can be thrown by DelegationTokenStore that are not fatal and can be ignored.?? I chickened out of this since it is a compatibility change - at least in my unpracticed view. If I change the DelegationTokenStore interface to add the new type of exception, then anyone who has implemented their own DelegationTokenStore has to change it to work with the new version of Hive. ??Updating DBTokenStore to not throw what could be transient errors, and just log those?? ExpiredTokenRemover uses the following DelegationTokenStore methods: updateMasterKey, removeMasterKey, getAllDelegationTokenIdentifiers, removeToken, getToken. Changing the behavior of these methods could cause unexpected results. So I leaned toward your first suggestion, but HIVE-13090 was a longstanding issue (introduced on Dec 7, 2011) with very visible effects and only two jiras for it. I thought it was not common enough to warrant the compatibility change. What do you think, [~thejas]? Is it worth changing the DelegationTokenStore interface? You have more experience with Hive than me. 
Thanks, Peter > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
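The fix described in this issue (moving the catch inside the running loop) can be sketched as follows. The class is a simplified stand-in for the real ExpiredTokenRemover, not its actual code.

```java
// Sketch of the loop fix: the try/catch sits INSIDE the loop, so a single
// transient store failure is logged and the thread keeps running. Before the
// fix, the try/catch wrapped the whole loop and one failure ended the thread.
public class TokenRemoverLoopSketch {
    private final Runnable removeExpiredTokens;  // may throw on transient DB errors
    int survivedPasses = 0;

    TokenRemoverLoopSketch(Runnable removeExpiredTokens) {
        this.removeExpiredTokens = removeExpiredTokens;
    }

    void runPasses(int passes) {
        for (int i = 0; i < passes; i++) {
            try {
                removeExpiredTokens.run();
            } catch (RuntimeException e) {
                // Log and continue; the remover can recover once the DB is back.
                System.err.println("token removal failed, will retry: " + e.getMessage());
            }
            survivedPasses++;  // real code sleeps here before the next pass
        }
    }

    public static void main(String[] args) {
        TokenRemoverLoopSketch remover =
            new TokenRemoverLoopSketch(() -> { throw new RuntimeException("db down"); });
        remover.runPasses(3);
        System.out.println("completed passes despite failures: " + remover.survivedPasses);
    }
}
```

The interface-change alternative discussed above would instead declare which exceptions are non-fatal, letting the loop distinguish recoverable store errors from genuine bugs.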
[jira] [Commented] (HIVE-15093) S3-to-S3 Renames: Files should be moved individually rather than at a directory level
[ https://issues.apache.org/jira/browse/HIVE-15093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651394#comment-15651394 ] Steve Loughran commented on HIVE-15093: --- -1 (non binding) Doing parallel rename here is a stop-gap solution which will be obsolete the moment someone sits down to do it in s3a with an implementation that is more efficient in its scheduling of copy calls and, with tests and broader use, better tested. HADOOP-13600 proposes parallel renames. Nobody has written that yet, but I promise to review a patch people provide, with tests. Get that patch into Hadoop and there's only one place to maintain this stuff: no need to document/test another switch, maintain the option, have another codepath to keep alive, etc. The algorithm I proposed there would initially sort the files by size, so the larger renames are scheduled first. Given a thread pool smaller than the list of files to rename, this should ensure that the scheduling is closer to optimal. If you really, really want to do this in a separate piece of code, you should do the same. Also, there are enough other s3a speedups that you should be testing against Hadoop 2.8+, both to avoid optimising against a now-obsolete codepath and to help find and report any problems in our code. To summarise: go on, fix the code in Hadoop, simplify everyone's lives. 
> S3-to-S3 Renames: Files should be moved individually rather than at a > directory level > - > > Key: HIVE-15093 > URL: https://issues.apache.org/jira/browse/HIVE-15093 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 2.1.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15093.1.patch, HIVE-15093.2.patch, > HIVE-15093.3.patch, HIVE-15093.4.patch, HIVE-15093.5.patch, > HIVE-15093.6.patch, HIVE-15093.7.patch, HIVE-15093.8.patch, HIVE-15093.9.patch > > > Hive's MoveTask uses the Hive.moveFile method to move data within a > distributed filesystem as well as blobstore filesystems. > If the move is done within the same filesystem: > 1: If the source path is a subdirectory of the destination path, files will > be moved one by one using a threadpool of workers > 2: If the source path is not a subdirectory of the destination path, a single > rename operation is used to move the entire directory > The second option may not work well on blobstores such as S3. Renames are not > metadata operations and require copying all the data. Client connectors to > blobstores may not efficiently rename directories. Worst case, the connector > will copy each file one by one, sequentially rather than using a threadpool > of workers to copy the data (e.g. HADOOP-13600). > Hive already has code to rename files using a threadpool of workers, but this > only occurs in case number 1. > This JIRA aims to modify the code so that case 1 is triggered when copying > within a blobstore. The focus is on copies within a blobstore because > needToCopy will return true if the src and target filesystems are different, > in which case a different code path is triggered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
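Steve's largest-first scheduling idea can be shown in a few lines. FileInfo is a hypothetical stand-in for a FileStatus-like record; only the ordering logic is illustrated, not s3a's actual implementation.

```java
// Sketch of largest-first scheduling: sort files by size, descending, before
// submitting them to the copy pool, so the longest copies start earliest and
// the pool drains more evenly when it is smaller than the file list.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class LargestFirstScheduling {
    record FileInfo(String name, long size) {}

    static List<FileInfo> schedule(List<FileInfo> files) {
        List<FileInfo> order = new ArrayList<>(files);
        order.sort(Comparator.comparingLong(FileInfo::size).reversed());
        return order;  // submit to the copy pool in this order
    }

    public static void main(String[] args) {
        List<FileInfo> order = schedule(List.of(
            new FileInfo("small", 10),
            new FileInfo("big", 5000),
            new FileInfo("mid", 300)));
        System.out.println(order.get(0).name());  // big
    }
}
```

The intuition: with a pool of N workers, a large file submitted last becomes the tail of the whole job; submitting it first lets smaller copies fill in around it.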
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651362#comment-15651362 ] Sergio Peña commented on HIVE-14271: Agree with approach #2. If outPath and finalPath are scratch directories, then we can just write directly to finalPath and avoid the rename. [~ste...@apache.org] There is another patch to do S3-to-S3 renames in parallel to speed up the COPY operations (See HIVE-15093) > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, then the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may add > better performance calls, but we should take care of the cleanup part in case > of writing errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651344#comment-15651344 ] Thejas M Nair commented on HIVE-15090: -- Some options are - * Defining the exceptions that can be thrown by DelegationTokenStore that are not fatal and can be ignored. * Updating DBTokenStore to not throw what could be transient errors, and just log those [~pvary] What are your thoughts? > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651284#comment-15651284 ] Steve Loughran commented on HIVE-14271: --- Strategy 2 will eliminate one rename, which, with rename costs being O(data), is good. However, there's still one rename to go: the overhead of copying the data from scratch to final. This shouldn't be done in client-side code, as object store COPY operations happen server side; they're what rename() uses. If renames of files in a directory are issued in parallel, the rename can be sped up significantly; this works precisely because you can hold open the HTTP connections for the copy calls without much cost in network traffic. > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Sergio Peña > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, then the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may add > better performance calls, but we should take care of the cleanup part in case > of writing errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-15090: -- Attachment: HIVE-15090.3-branch-2.1.patch Retriggering the patch, and hoping to get only the same failing tests as HIVE-15094 :) > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.3-branch-2.1.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15090) Temporary DB failure can stop ExpiredTokenRemover thread
[ https://issues.apache.org/jira/browse/HIVE-15090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651254#comment-15651254 ] Peter Vary commented on HIVE-15090: --- Hi [~thejas], I was thinking along the same lines as you, but finally decided against it. My reasoning was that METASTORE_CLUSTER_DELEGATION_TOKEN_STORE_CLS is a configuration variable and could be set by the administrator to any class, which is why we will never be able to handle every future exception here correctly. So finally I decided to stick to a clean, easily understandable solution rather than create a partial solution for the DBTokenStore only. Since this one is already committed to master, if we find a better approach I think we should open another jira to handle it. I would be happy to help out there too. Thanks again for taking a look at this! Peter > Temporary DB failure can stop ExpiredTokenRemover thread > > > Key: HIVE-15090 > URL: https://issues.apache.org/jira/browse/HIVE-15090 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.1.0, 2.0.1, 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 2.2.0 > > Attachments: HIVE-15090.2-branch-2.1.patch, HIVE-15090.2.patch, > HIVE-15090.2.patch, HIVE-15090.patch > > > In HIVE-13090 we decided that we should not close the metastore if there is > an unexpected exception during the expired token removal process, but that > fix leaves a running metastore without ExpiredTokenRemover thread. > To fix this I will move the catch inside the running loop, and hope the > thread could recover from the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15168) Flaky test: TestSparkClient.testJobSubmission (still flaky)
[ https://issues.apache.org/jira/browse/HIVE-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651218#comment-15651218 ] Barna Zsombor Klara commented on HIVE-15168: Sadly my wonderful fix only managed to prevent the testJobSubmission test from failing for one single, sad day. I think I found a second race condition; would you mind taking a second look at it, [~xuefuz], [~lirui]? I hope this time my fix will be a tiny bit more permanent... In the meantime I'll try running the test a couple of hundred times in a loop to see if it breaks again. > Flaky test: TestSparkClient.testJobSubmission (still flaky) > --- > > Key: HIVE-15168 > URL: https://issues.apache.org/jira/browse/HIVE-15168 > Project: Hive > Issue Type: Sub-task >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > [HIVE-14910|https://issues.apache.org/jira/browse/HIVE-14910] already > addressed one source of flakiness but sadly not all of it, it seems. > In JobHandleImpl the listeners are registered after the job has been > submitted. > This may end up in a race condition. > {code} > // Link the RPC and the promise so that events from one are propagated to > the other as > // needed. > rpc.addListener(new > GenericFutureListener() { > @Override > public void operationComplete(io.netty.util.concurrent.Future > f) { > if (f.isSuccess()) { > handle.changeState(JobHandle.State.QUEUED); > } else if (!promise.isDone()) { > promise.setFailure(f.cause()); > } > } > }); > promise.addListener(new GenericFutureListener () { > @Override > public void operationComplete(Promise p) { > if (jobId != null) { > jobs.remove(jobId); > } > if (p.isCancelled() && !rpc.isDone()) { > rpc.cancel(true); > } > } > }); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
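The register-after-submit race described above can be reproduced deterministically with a toy event source: a listener attached only after the event fires never sees it, while attaching it before submission does. The Emitter class below is a simplified stand-in for the rpc/promise wiring in JobHandleImpl, not the actual Netty API.

```java
// Demonstrates why listener registration order matters: a naive event source
// only notifies listeners that exist at fire() time, so registering after
// submission can silently drop the QUEUED state change.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ListenerOrderSketch {
    // Deliberately naive: no event replay for late listeners.
    static class Emitter {
        private final List<Consumer<String>> listeners = new ArrayList<>();
        void addListener(Consumer<String> l) { listeners.add(l); }
        void fire(String event) { listeners.forEach(l -> l.accept(event)); }
    }

    static List<String> demo() {
        List<String> seen = new ArrayList<>();

        // Racy order (event fires, then register): the event is lost.
        Emitter racy = new Emitter();
        racy.fire("QUEUED");
        racy.addListener(s -> seen.add("racy:" + s));

        // Fixed order (register, then let the job complete): event observed.
        Emitter fixed = new Emitter();
        fixed.addListener(s -> seen.add("fixed:" + s));
        fixed.fire("QUEUED");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // only the pre-registered listener fires
    }
}
```

In the real code the window is narrow, which is why the test only fails intermittently; the sketch just makes the lost-event case unconditional.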
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651181#comment-15651181 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838179/HIVE-15161.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2048/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2048/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2048/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838179 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed, org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14541) Beeline does not prompt for username and password properly
[ https://issues.apache.org/jira/browse/HIVE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651085#comment-15651085 ] Miklos Csanady commented on HIVE-14541: --- Let me clarify: It should ask for username if the -u parameter is given without a username AND (no -n connection parameter given and (no javax.jdo.option.ConnectionUserName given and no ConnectionUserName parameter found)). For password prompt: if -p given without value OR (-u parameter is given with no passwd AND none of javax.jdo.option.ConnectionPassword and ConnectionPassword found). [~vihangk1] Am I correct? Miklos > Beeline does not prompt for username and password properly > -- > > Key: HIVE-14541 > URL: https://issues.apache.org/jira/browse/HIVE-14541 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Miklos Csanady > > In the default mode, when we connect using !connect > jdbc:hive2://localhost:1 (without providing user and password) beeline > prompts for it as expected. > But when we use beeline -u "url" and do not provide -n or -p arguments, it > does not prompt for the user/password > {noformat} > $ ./beeline -u jdbc:hive2://localhost:1 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. 
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:15 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.2.0-SNAPSHOT by Apache Hive > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} > {noformat} > $ ./beeline > Beeline version 2.2.0-SNAPSHOT by Apache Hive > beeline> !connect "jdbc:hive2://localhost:1" > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hive > Enter password for jdbc:hive2://localhost:1: > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:03 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
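Miklos's proposed rule above can be written out as plain boolean logic. The method and parameter names below are descriptive stand-ins, not Beeline's actual fields; the property flags collapse the javax.jdo.option.* and plain connection-property checks into one input each.

```java
// Sketch of the proposed prompting rule for beeline -u connections.
// Prompt for a username only when the URL was given without one and no other
// source (the -n option or connection properties) supplies it; prompt for a
// password when -p was given without a value, or when the URL carries none
// and no property supplies it.
public class PromptRuleSketch {
    static boolean shouldPromptUser(boolean urlGiven, boolean urlHasUser,
                                    boolean nOptionGiven, boolean propertyUserSet) {
        return urlGiven && !urlHasUser && !nOptionGiven && !propertyUserSet;
    }

    static boolean shouldPromptPassword(boolean pOptionGivenWithoutValue,
                                        boolean urlGiven, boolean urlHasPassword,
                                        boolean propertyPasswordSet) {
        return pOptionGivenWithoutValue
            || (urlGiven && !urlHasPassword && !propertyPasswordSet);
    }

    public static void main(String[] args) {
        // beeline -u jdbc:hive2://... with no -n/-p and no connection properties:
        System.out.println(shouldPromptUser(true, false, false, false));      // true
        System.out.println(shouldPromptPassword(false, true, false, false));  // true
    }
}
```

This matches the bug report: with a bare `-u` URL and no other credential source, both functions return true, so Beeline should prompt instead of connecting silently.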
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: HIVE-15161.4.patch > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
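One motivation in this issue is test flakiness caused by nondeterministic map entry order when serializing column stats to JSON. As an illustration only (the actual patch uses Jackson; this stdlib sketch just demonstrates the ordering fix, and the class and method names are invented here):

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal stdlib illustration of the ordering problem mentioned above: the
// core fix for flaky diffs is emitting column entries in a deterministic key
// order instead of whatever order a HashMap happens to iterate in.
public class StableJson {
    static String toJson(Map<String, Long> columnStats) {
        // Copying into a TreeMap yields sorted key order, so two runs over
        // the same stats always produce byte-identical output.
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Long> e : new TreeMap<>(columnStats).entrySet()) {
            if (!first) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":").append(e.getValue());
            first = false;
        }
        return sb.append("}").toString();
    }
}
```

Jackson offers the same guarantee natively via its serialization feature for ordering map entries by key, which is presumably why it is a convenient replacement here.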
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651015#comment-15651015 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838163/HIVE-15161.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10635 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=11) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest (batchId=191) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2047/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2047/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2047/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838163 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650960#comment-15650960 ] Zoltan Haindrich commented on HIVE-15023: - [~pxiong] it seems to me that there is a qtest which has "evaded" the output update ;) and it's affected by the limit 0 optimization: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/testReport/org.apache.hadoop.hive.cli/TestSparkCliDriver/testCliDriver_limit_pushdown_/ > SimpleFetchOptimizer needs to optimize limit=0 > -- > > Key: HIVE-15023 > URL: https://issues.apache.org/jira/browse/HIVE-15023 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15023.01.patch, HIVE-15023.02.patch > > > on current master > {code} > hive> explain select key from src limit 0; > OK > STAGE DEPENDENCIES: > Stage-0 is a root stage > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: 0 > Processor Tree: > TableScan > alias: src > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: key (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 0 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > ListSink > Time taken: 7.534 seconds, Fetched: 20 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
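For context on why limit=0 merits a special case in the plan above: a fetch path that checks the limit before pulling any rows never needs to touch the underlying scan, whereas the unoptimized plan still wires up a full TableScan. The sketch below is hypothetical (invented names, not SimpleFetchOptimizer's actual code) and only illustrates the short-circuit:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the limit=0 short-circuit: if the limit is
// checked up front, the row source (the "scan") is never consumed at all.
public class FetchWithLimit {
    static List<String> fetch(Iterator<String> scan, int limit) {
        List<String> out = new ArrayList<>();
        if (limit <= 0) {
            return out;  // short-circuit: no rows are read from the scan
        }
        while (scan.hasNext() && out.size() < limit) {
            out.add(scan.next());
        }
        return out;
    }
}
```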
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: (was: HIVE-15161.3.patch) > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: HIVE-15161.3.patch > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14541) Beeline does not prompt for username and password properly
[ https://issues.apache.org/jira/browse/HIVE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Csanady reassigned HIVE-14541: - Assignee: Miklos Csanady > Beeline does not prompt for username and password properly > -- > > Key: HIVE-14541 > URL: https://issues.apache.org/jira/browse/HIVE-14541 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Miklos Csanady > > In the default mode, when we connect using !connect > jdbc:hive2://localhost:1 (without providing user and password) beeline > prompts for it as expected. > But when we use beeline -u "url" and do not provide -n or -p arguments, it > does not prompt for the user/password > {noformat} > $ ./beeline -u jdbc:hive2://localhost:1 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:15 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. 
> Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.2.0-SNAPSHOT by Apache Hive > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} > {noformat} > $ ./beeline > Beeline version 2.2.0-SNAPSHOT by Apache Hive > beeline> !connect "jdbc:hive2://localhost:1" > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/Users/vihang/work/src/upstream/hive/packaging/target/apache-hive-2.2.0-SNAPSHOT-bin/apache-hive-2.2.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] > Connecting to jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hive > Enter password for jdbc:hive2://localhost:1: > Connected to: Apache Hive (version 2.2.0-SNAPSHOT) > Driver: Hive JDBC (version 2.2.0-SNAPSHOT) > 16/08/15 18:09:03 [main]: WARN jdbc.HiveConnection: Request to set autoCommit > to false; Hive does not support autoCommit=false. > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1> !quit > Closing: 0: jdbc:hive2://localhost:1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650546#comment-15650546 ] Barna Zsombor Klara commented on HIVE-12891: Tests are flaky: https://issues.apache.org/jira/browse/HIVE-14936 - orc_ppd_schema_evol_3a https://issues.apache.org/jira/browse/HIVE-15169 - columnstats_part_coltype https://issues.apache.org/jira/browse/HIVE-15116 - join_acid_non_acid https://issues.apache.org/jira/browse/HIVE-15115 - union_fast_stats https://issues.apache.org/jira/browse/HIVE-15084 - explainanalyze_4, explainanalyze_5 https://issues.apache.org/jira/browse/HIVE-15168 - testJobSubmission https://issues.apache.org/jira/browse/HIVE-15170 - testTaskStatus > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Barna Zsombor Klara > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12891.5.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... 
> at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
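The "Relative path in absolute URI" failure above can be reproduced with plain java.net.URI: its multi-argument constructor (which, presumably, org.apache.hadoop.fs.Path reaches internally) rejects a relative path whenever a scheme is present. The class and method names below are invented for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Minimal reproduction of the failure mode: constructing a URI with a
// "file" scheme but a relative path throws URISyntaxException with the
// same reason string seen in the stack trace above.
public class TmpDirRepro {
    static String tryBuild(String path) {
        try {
            new URI("file", null, path, null);
            return "ok";
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }
}
```

This is why resolving java.io.tmpdir to an absolute path before building scratch-dir URIs avoids the exception.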
[jira] [Resolved] (HIVE-15158) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null
[ https://issues.apache.org/jira/browse/HIVE-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] thauvin damien resolved HIVE-15158. --- Resolution: Duplicate Duplicate of HIVE-15157. > Partition Table With timestamp type on S3 storage --> Error in getting fields > from serde.Invalid Field null > --- > > Key: HIVE-15158 > URL: https://issues.apache.org/jira/browse/HIVE-15158 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 2.1.0 > Environment: JDK 1.8 101 >Reporter: thauvin damien > > Hello > I get the error above when I try to perform: > hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00'); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field null > Here is the description of the issue. > --External Hive table with dynamic partitioning enabled on AWS S3 storage. > --Partition table with timestamp type. > When I perform "show partitions table;" everything is fine: > hive> show partitions table; > OK > tsbucket=2016-10-01 11%3A00%3A00 > tsbucket=2016-10-28 16%3A00%3A00 > And when I perform "describe FORMATTED table;" everything is fine. > Is this a bug? 
> The stacktrace of hive.log : > 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 > main([])]: exec.DDLTask (DDLTask.java:failed(574)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields > from serde.Invalid Field null > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414) > at > org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: MetaException(message:Invalid Field null) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336) > at > 
org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409) > ... 21 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
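A side note on the partition values shown in this issue: the %3A sequences are percent-encoded colons, since Hive escapes characters that are unsafe in directory names when mapping a partition value to a path. Standard java.net.URLDecoder can reverse that escaping; the class name below is invented for illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Decodes a percent-escaped partition value such as the tsbucket values
// above; %3A is the percent-encoding of ':'.
public class PartitionValue {
    static String decode(String escaped) {
        try {
            return URLDecoder.decode(escaped, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);  // UTF-8 is always available
        }
    }
}
```

For example, decoding "2016-10-28 16%3A00%3A00" recovers the original timestamp "2016-10-28 16:00:00".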
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650127#comment-15650127 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12838130/HIVE-15161.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10634 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_auto_partitioned] (batchId=27) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_1] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121) org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest (batchId=191) org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=272) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2046/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2046/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12838130 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)