[jira] [Created] (HIVE-24904) CVE-2019-10172,CVE-2019-10202 vulnerabilities in jackson-mapper-asl-1.9.13.jar
Oleksiy Sayankin created HIVE-24904: --- Summary: CVE-2019-10172,CVE-2019-10202 vulnerabilities in jackson-mapper-asl-1.9.13.jar Key: HIVE-24904 URL: https://issues.apache.org/jira/browse/HIVE-24904 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin CVE list: CVE-2019-10172,CVE-2019-10202 CVSS score: High {code} ./packaging/target/apache-hive-4.0.0-SNAPSHOT-bin/apache-hive-4.0.0-SNAPSHOT-bin/lib/jackson-mapper-asl-1.9.13.jar {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
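A common remediation pattern for CVEs in the end-of-life {{org.codehaus.jackson}} artifacts is to exclude them from whichever dependency drags them in transitively and rely on the maintained {{com.fasterxml.jackson}} line instead. The fragment below is only an illustrative sketch, not a vetted patch for the Hive build; the {{hadoop-common}} coordinates are an assumption about where the jar comes from:

```xml
<!-- Hypothetical pom.xml fragment: exclude the EOL jackson-mapper-asl
     wherever it is pulled in transitively -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-mapper-asl</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Whether callers of the old codehaus API remain in the packaging would still need to be checked before such an exclusion is safe.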
[jira] [Created] (HIVE-24740) Can't order by an unselected column
Oleksiy Sayankin created HIVE-24740: --- Summary: Can't order by an unselected column Key: HIVE-24740 URL: https://issues.apache.org/jira/browse/HIVE-24740 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin {code} CREATE TABLE t1 (column1 STRING); {code} {code} select substr(column1,1,4), avg(column1) from t1 group by substr(column1,1,4) order by column1; {code} {code} org.apache.hadoop.hive.ql.parse.SemanticException: Line 3:87 Invalid table alias or column reference 'column1': (possible column names are: _c0, _c1, .(tok_function substr (tok_table_or_col column1) 1 4), .(tok_function avg (tok_table_or_col column1))) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genAllRexNode(CalcitePlanner.java:5645) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genAllRexNode(CalcitePlanner.java:5576) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.getOrderByExpression(CalcitePlanner.java:4326) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.beginGenOBLogicalPlan(CalcitePlanner.java:4230) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genOBLogicalPlan(CalcitePlanner.java:4136) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5326) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1864) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1810) at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1571) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:562) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12538) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:456) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:315) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:492) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:445) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:409) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:403) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at
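A possible workaround for the query above, while the bug stands, is to order by the select-list expression (or an alias of it) rather than the raw column, which the planner can resolve against the GROUP BY output. A sketch against the table from the report:

```sql
-- Order by the same expression that appears in the select list...
SELECT substr(column1, 1, 4), avg(column1)
FROM t1
GROUP BY substr(column1, 1, 4)
ORDER BY substr(column1, 1, 4);

-- ...or give it an alias and order by that:
SELECT substr(column1, 1, 4) AS c0, avg(column1)
FROM t1
GROUP BY substr(column1, 1, 4)
ORDER BY c0;
```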
[jira] [Created] (HIVE-23075) Add property for manual configuration of SSL version
Oleksiy Sayankin created HIVE-23075: --- Summary: Add property for manual configuration of SSL version Key: HIVE-23075 URL: https://issues.apache.org/jira/browse/HIVE-23075 Project: Hive Issue Type: Improvement Components: Security Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Add property for manual configuration of SSL version -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22980) Support custom path filter for ORC tables
Oleksiy Sayankin created HIVE-22980: --- Summary: Support custom path filter for ORC tables Key: HIVE-22980 URL: https://issues.apache.org/jira/browse/HIVE-22980 Project: Hive Issue Type: New Feature Components: ORC Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin The customer is looking for an option to specify a custom path filter for ORC tables. Please find the details from the customer requirement below. Problem statement/approach in the customer's words: {quote} Currently, the Orc file input format does not take in path filters set in the property "mapreduce.input.pathfilter.class" or "mapred.input.pathfilter.class". So, we cannot use custom filters with Orc files. The AcidUtils class has a static filter called "hiddenFilter" which is used by ORC to filter input paths. If we can pass the custom filter classes (set in the property mentioned above) to AcidUtils and replace hiddenFilter with a filter that does an "and" operation over hiddenFilter+customFilters, the filters would work well. On local testing, mapreduce.input.pathfilter.class seems to be working for Text tables but not for ORC tables. {quote} Our analysis: {{OrcInputFormat}} and {{FileInputFormat}} are different implementations of the {{InputFormat}} interface. The property "{{mapreduce.input.pathfilter.class}}" is only respected by {{FileInputFormat}}, not by other implementations of {{InputFormat}}. The customer wants the ability to filter out files based on paths/filenames; current ORC features like bloom filters and indexes are not good enough for them to minimize the number of disk read operations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
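The "and" combination the customer proposes can be sketched in a self-contained way with {{java.util.function.Predicate}}; this is only an illustration of the filtering logic (class and filter names here are made up), not Hive's actual {{AcidUtils}}/{{PathFilter}} API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CombinedPathFilter {
    // Stand-in for the built-in hidden-file filter: skip names starting with '.' or '_'
    static final Predicate<String> HIDDEN_FILTER =
        name -> !name.startsWith(".") && !name.startsWith("_");

    // AND the hidden filter with any user-supplied custom filters, as proposed above
    static Predicate<String> combine(List<Predicate<String>> customFilters) {
        Predicate<String> result = HIDDEN_FILTER;
        for (Predicate<String> f : customFilters) {
            result = result.and(f);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical custom filter: only accept files ending in ".orc"
        Predicate<String> orcOnly = name -> name.endsWith(".orc");
        Predicate<String> filter = combine(Arrays.asList(orcOnly));
        System.out.println(filter.test("part-00000.orc")); // accepted
        System.out.println(filter.test("_SUCCESS"));       // rejected by hidden filter
        System.out.println(filter.test("data.txt"));       // rejected by custom filter
    }
}
```

In real Hive code the filters would implement {{org.apache.hadoop.fs.PathFilter}} and be instantiated from the class names configured in the property.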
[jira] [Created] (HIVE-22919) StorageBasedAuthorizationProvider does not allow creating databases after changing hive.metastore.warehouse.dir
Oleksiy Sayankin created HIVE-22919: --- Summary: StorageBasedAuthorizationProvider does not allow creating databases after changing hive.metastore.warehouse.dir Key: HIVE-22919 URL: https://issues.apache.org/jira/browse/HIVE-22919 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

*ENVIRONMENT:* Hive-2.3

*STEPS TO REPRODUCE:*

1. Configure Storage Based Authorization:
{code:xml}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>
<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>
{code}

2. Create a few directories and change their owners and permissions:
{code:java}
hadoop fs -mkdir /tmp/m1
hadoop fs -mkdir /tmp/m2
hadoop fs -mkdir /tmp/m3
hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
hadoop fs -chmod 700 /tmp/m[1-3]
{code}

3. Check permissions:
{code:java}
[test@node2 ~]$ hadoop fs -ls /tmp | grep m[1-3]
drwx------   - testuser1 testuser1   0 2020-02-11 10:25 /tmp/m1
drwx------   - test      test        0 2020-02-11 10:25 /tmp/m2
drwx------   - testuser1 testuser1   1 2020-02-11 10:36 /tmp/m3
[test@node2 ~]$
{code}

4. Log into the Hive CLI as the *"testuser1"* user using the embedded Hive Metastore, with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
{code:java}
sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf hive.metastore.warehouse.dir=/tmp/m1
{code}

5. Perform the next steps:
{code:sql}
-- 1. Check the "hive.metastore.warehouse.dir" value:
SET hive.metastore.warehouse.dir;
-- 2. Set "hive.metastore.warehouse.dir" to a path to which "testuser1" does not have access:
SET hive.metastore.warehouse.dir=/tmp/m2;
-- 3. Try to create a database:
CREATE DATABASE m2;
-- 4. Set "hive.metastore.warehouse.dir" to a path to which "testuser1" has access:
SET hive.metastore.warehouse.dir=/tmp/m3;
-- 5. Try to create a database:
CREATE DATABASE m3;
{code}

*ACTUAL RESULT:*

Query 5 fails with the exception below. It does not honor the changed "hive.metastore.warehouse.dir" property:
{code:java}
hive> -- 5. Try to create a database:
hive> CREATE DATABASE m3;
FAILED: HiveException org.apache.hadoop.security.AccessControlException: User testuser1(user id 5001) does not have access to hdfs:/tmp/m2/m3.db
hive>
{code}

*EXPECTED RESULT:*

Query 5 creates a database.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22911) Lock entries are left over inside HIVE_LOCKS when using DbTxnManager
Oleksiy Sayankin created HIVE-22911: --- Summary: Lock entries are left over inside HIVE_LOCKS when using DbTxnManager Key: HIVE-22911 URL: https://issues.apache.org/jira/browse/HIVE-22911 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin We found lots of orphaned/old/leftover lock entries inside {{HIVE_LOCKS}}: more than 120k locks in the HIVE_LOCKS table of the MySQL database. We also checked the top 3 tables related to the existing locks:
{code}
mysql> select HL_DB, HL_TABLE, count(*) from HIVE_LOCKS group by 1,2 order by 3 desc limit 10;
+-------+----------+----------+
| HL_DB | HL_TABLE | count(*) |
+-------+----------+----------+
| db1   | table1   |    66984 |
| db1   | table2   |    33208 |
| db1   | table3   |     9315 |
…
{code}
For table "db1.table1" there are 3 Hive sessions involved, and each of those sessions is waiting for 22328 read locks. This is because "db1.table1" is a huge partitioned table with more than 200k child partitions. I am guessing each Hive session was trying to do a full table scan on it.
Grouping by column {{HL_LAST_HEARTBEAT}} instead, here is the list:
{code}
MariaDB [customer]> select cast(FROM_UNIXTIME(HL_LAST_HEARTBEAT/1000) as date) as dt, count(*) as cnt from HIVE_LOCKS
    -> group by 1 order by 1;
+------------+--------+
| dt         | cnt    |
+------------+--------+
| 1969-12-31 |      2 |
| 2019-05-20 |     10 |
| 2019-05-21 |      3 |
| 2019-05-23 |      5 |
| 2019-05-24 |      2 |
| 2019-05-25 |      1 |
| 2019-05-29 |      7 |
| 2019-05-30 |      2 |
| 2019-06-11 |     13 |
| 2019-06-28 |      3 |
| 2019-07-02 |      2 |
| 2019-07-04 |      5 |
| 2019-07-09 |      1 |
| 2019-07-15 |      2 |
| 2019-07-16 |      1 |
| 2019-07-18 |      2 |
| 2019-07-20 |      3 |
| 2019-07-29 |      5 |
| 2019-07-30 |      9 |
| 2019-07-31 |      7 |
| 2019-08-02 |      2 |
| 2019-08-06 |      5 |
| 2019-08-07 |     17 |
| 2019-08-08 |      8 |
| 2019-08-09 |      5 |
| 2019-08-21 |      1 |
| 2019-08-22 |     20 |
| 2019-08-23 |      1 |
| 2019-08-26 |      5 |
| 2019-08-27 |     98 |
| 2019-08-28 |      3 |
| 2019-08-29 |      1 |
| 2019-09-02 |      3 |
| 2019-09-04 |      3 |
| 2019-09-05 |    105 |
| 2019-09-06 |      3 |
| 2019-09-07 |      2 |
| 2019-09-09 |      6 |
| 2019-09-12 |      9 |
| 2019-09-13 |      1 |
| 2019-09-17 |      1 |
| 2019-09-24 |      3 |
| 2019-09-26 |      6 |
| 2019-09-27 |      4 |
| 2019-09-30 |      1 |
| 2019-10-01 |      2 |
| 2019-10-03 |      9 |
| 2019-10-04 |      2 |
| 2019-10-06 |      1 |
| 2019-10-08 |      1 |
| 2019-10-09 |      1 |
| 2019-10-10 |      6 |
| 2019-10-11 |      1 |
| 2019-10-16 |     13 |
| 2019-10-17 |      1 |
| 2019-10-18 |      2 |
| 2019-10-19 |      2 |
| 2019-10-21 |     10 |
| 2019-10-22 |      6 |
| 2019-10-28 |      2 |
| 2019-10-29 |      4 |
| 2019-10-30 |      2 |
| 2019-10-31 |      2 |
| 2019-11-05 |      2 |
| 2019-11-06 |      2 |
| 2019-11-11 |      1 |
| 2019-11-13 |      1 |
| 2019-11-14 |      1 |
| 2019-11-21 |      4 |
| 2019-11-26 |      1 |
| 2019-11-27 |      1 |
| 2019-12-05 |      4 |
| 2019-12-06 |      2 |
| 2019-12-12 |      1 |
| 2019-12-14 |      1 |
| 2019-12-15 |      3 |
| 2019-12-16 |      1 |
| 2019-12-17 |      1 |
| 2019-12-18 |      1 |
| 2019-12-19 |      2 |
| 2019-12-20 |      2 |
| 2019-12-23 |      1 |
| 2019-12-27 |      1 |
| 2020-01-07 |      1 |
| 2020-01-08 |     14 |
| 2020-01-09 |      2 |
| 2020-01-12 |    372 |
| 2020-01-14 |      2 |
| 2020-01-15 |      1 |
| 2020-01-20 |     11 |
| 2020-01-21 | 119253 |
| 2020-01-23 |    113 |
| 2020-01-24 |      4 |
| 2020-01-25 |    536 |
| 2020-01-26 |   2132 |
| 2020-01-27 |    396 |
| 2020-01-28 |      1 |
| 2020-01-29 |      3 |
| 2020-01-30 |     11 |
| 2020-01-31 |     11 |
| 2020-02-03 |      2 |
| 2020-02-04 |      4 |
| 2020-02-05 |      5 |
| 2020-02-06 |      8 |
| 2020-02-10 |     32 |
| 2020-02-11 |     15 |
| 2020-02-12 |     14 |
| 2020-02-13 |      1 |
| 2020-02-14 |     92 |
+------------+--------+
109 rows in set (0.16 sec)
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
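A hedged sketch of a diagnostic/cleanup query for such leftovers, assuming locks whose last heartbeat is far older than {{hive.txn.timeout}} can be treated as orphaned. Back up the metastore database first; deleting from metastore tables by hand is not an officially supported operation, and the 7-day cutoff is only an example:

```sql
-- Count candidate orphans: locks whose last heartbeat (in milliseconds) is older than 7 days
SELECT COUNT(*) FROM HIVE_LOCKS
WHERE HL_LAST_HEARTBEAT < (UNIX_TIMESTAMP(NOW() - INTERVAL 7 DAY) * 1000);

-- After verifying no live session still owns them, remove the stale entries
DELETE FROM HIVE_LOCKS
WHERE HL_LAST_HEARTBEAT < (UNIX_TIMESTAMP(NOW() - INTERVAL 7 DAY) * 1000);
```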
[jira] [Created] (HIVE-21961) Update jetty version to 9.4.x
Oleksiy Sayankin created HIVE-21961: --- Summary: Update jetty version to 9.4.x Key: HIVE-21961 URL: https://issues.apache.org/jira/browse/HIVE-21961 Project: Hive Issue Type: Task Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21929) Hive on Tez requires explicit setting of the property "hive.tez.container.size"
Oleksiy Sayankin created HIVE-21929: --- Summary: Hive on Tez requires explicit setting of the property "hive.tez.container.size" Key: HIVE-21929 URL: https://issues.apache.org/jira/browse/HIVE-21929 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Without an explicit setting of the property {{hive.tez.container.size}}, the Tez client submits a memory size of "-1" to YARN, and container creation is then rejected by YARN. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
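Until this is fixed, the workaround is to set the container size explicitly, either in hive-site.xml or per session. The 4096 MB value below is only an example; size it to your YARN container limits:

```sql
-- Per-session workaround in the Hive CLI / Beeline:
SET hive.tez.container.size=4096;
-- Optionally also bound the heap inside the container (typically ~80% of the container size):
SET hive.tez.java.opts=-Xmx3276m;
```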
[jira] [Created] (HIVE-21865) Add verification for Tez engine before starting Tez sessions
Oleksiy Sayankin created HIVE-21865: --- Summary: Add verification for Tez engine before starting Tez sessions Key: HIVE-21865 URL: https://issues.apache.org/jira/browse/HIVE-21865 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Here is the log of starting HS2 when engine is MR: {code} 2019-06-12T06:08:05,115 WARN [main] server.HiveServer2: Error starting HiveServer2 on attempt 1, will retry in 6ms java.lang.NoClassDefFoundError: org/apache/tez/dag/api/TezConfiguration at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession$AbstractTriggerValidator.startTriggerValidator(TezSessionPoolSession.java:74) at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.initTriggers(TezSessionPoolManager.java:207) at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:114) at org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:860) at org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:843) at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:766) at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1058) at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:144) at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1326) at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1170) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.TezConfiguration at 
java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 16 more {code} HS2 starts correctly, but the exception above is confusing for customers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
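The requested verification could be sketched as a simple classpath probe before initializing the Tez session pool. The class and method names below are illustrative, not the actual HiveServer2 code:

```java
public class TezAvailabilityCheck {
    // Returns true only if the given class can be loaded from the current classpath
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // HS2 could skip Tez session pool startup (instead of failing with
        // NoClassDefFoundError) when Tez classes are absent, e.g. when the engine is MR:
        if (!isClassPresent("org.apache.tez.dag.api.TezConfiguration")) {
            System.out.println("Tez not on classpath; skipping Tez session pool startup");
        }
    }
}
```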
[jira] [Created] (HIVE-21860) Incorrect FQDN of HadoopThriftAuthBridge23 in ShimLoader
Oleksiy Sayankin created HIVE-21860: --- Summary: Incorrect FQDN of HadoopThriftAuthBridge23 in ShimLoader Key: HIVE-21860 URL: https://issues.apache.org/jira/browse/HIVE-21860 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21843) UNION query with regular expressions works in Hive-2.1 and does not work in Hive-2.3
Oleksiy Sayankin created HIVE-21843: --- Summary: UNION query with regular expressions works in Hive-2.1 and does not work in Hive-2.3 Key: HIVE-21843 URL: https://issues.apache.org/jira/browse/HIVE-21843 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin *STEPS TO REPRODUCE:* 1. Create a table: {code:java}CREATE TABLE t (a1 INT, a2 INT); INSERT INTO TABLE t VALUES (1,1),(1,2),(2,1),(2,2);{code} 2. SET hive.support.quoted.identifiers to "none": {code:java}SET hive.support.quoted.identifiers=none;{code} 3. Run the query: {code:java}SELECT `(a1)?+.+` FROM t UNION SELECT `(a2)?+.+` FROM t;{code} *ACTUAL RESULT:* The query fails with an exception: {code:java}2019-05-23T01:36:53,604 ERROR [9aa457a9-1c74-466e-abef-ec2f007117f3 main] ql.Driver: FAILED: SemanticException Line 0:-1 Invalid column reference '`(a1)?+.+`' org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Invalid column reference '`(a1)?+.+`' at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11734) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11674) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11642) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11620) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:5225) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:6330) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9659) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10579) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10457) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11202) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:481) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11215) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:836) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
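Two untested workaround sketches until the regression is fixed: wrap each regex projection in a subquery so the quoted-identifier expansion happens before the UNION, or simply name the columns the regexes would resolve to (here {{`(a1)?+.+`}} means "all columns except a1", i.e. {{a2}}, and vice versa):

```sql
-- Wrap each side in a subquery...
SELECT * FROM (SELECT `(a1)?+.+` FROM t) s1
UNION
SELECT * FROM (SELECT `(a2)?+.+` FROM t) s2;

-- ...or list the surviving columns explicitly:
SELECT a2 FROM t
UNION
SELECT a1 FROM t;
```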
[jira] [Created] (HIVE-21802) Unexpected change in HiveQL clause order
Oleksiy Sayankin created HIVE-21802: --- Summary: Unexpected change in HiveQL clause order Key: HIVE-21802 URL: https://issues.apache.org/jira/browse/HIVE-21802 Project: Hive Issue Type: Bug Components: Parser, Query Processor Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin This query worked in Hive 1.2 ({{ORDER}} clause _before_ {{WINDOW}}): {code:java} CREATE TABLE ccdp_v02 AS SELECT * from (select cust_xref_id, cstone_last_updatetm, instal_late_pay_ct, ROW_NUMBER() over w1 as RN, a.report_dt from cstonedb3.gam_ccdp_us a where report_dt = '2019-05-01' and cust_xref_id in (1234) order by cust_xref_id, a.report_dt, cstone_last_updatetm desc WINDOW w1 as (partition by a.cust_xref_id, a.report_dt order by a.cstone_last_updatetm desc) ) tmp where RN=1 DISTRIBUTE BY report_dt; {code} In Hive 2.1 it fails with: {code:java} hive> SELECT id > FROM ( > SELECT > id, > a1, > ROW_NUMBER() OVER w1 AS RN, > b1 > FROM i a > ORDER BY id, b1, a1 DESC > WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) > ); NoViableAltException(257@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2269) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2479) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1692) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1313) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:42092) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:36765) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:37017) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:36663) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:35852) at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35740) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2307) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1335) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208) at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77) at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:836) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) FAILED: ParseException line 3:4 cannot recognize input near '(' 'SELECT' 'id' in joinSource hive> {code} *STEPS TO REPRODUCE:* 1. Create a table: {code:java} CREATE TABLE i (id INT, a1 INT, b1 BOOLEAN); {code} 2. 
Run the query which was working in Hive-1.2 ({{ORDER}} clause _before_ {{WINDOW}}): {code:java} SELECT id FROM ( SELECT id, a1, ROW_NUMBER() OVER w1 AS rn, b1 FROM i a ORDER BY id, b1, a1 DESC WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) ) tmp WHERE rn=1 DISTRIBUTE BY id; {code} *ACTUAL RESULT:* The query fails with the exception shown above. The query from Step 2 which works in Hive-2.3 ({{ORDER}} clause _after_ {{WINDOW}}) is: {code:java} SELECT id FROM ( SELECT id, a1, ROW_NUMBER() OVER w1 AS rn, b1 FROM i a WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) ORDER BY id, b1, a1 DESC ) tmp WHERE rn=1 DISTRIBUTE BY id; {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21207) Use 0.12.0 libthrift version in Hive
Oleksiy Sayankin created HIVE-21207: --- Summary: Use 0.12.0 libthrift version in Hive Key: HIVE-21207 URL: https://issues.apache.org/jira/browse/HIVE-21207 Project: Hive Issue Type: Improvement Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Use 0.12.0 libthrift version in Hive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20457) Create authorization mechanism for granting/revoking privileges to change Hive properties
Oleksiy Sayankin created HIVE-20457: --- Summary: Create authorization mechanism for granting/revoking privileges to change Hive properties Key: HIVE-20457 URL: https://issues.apache.org/jira/browse/HIVE-20457 Project: Hive Issue Type: Improvement Components: Security Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20348) Hive HCat does not create a proper "client" on kerberos cluster without hive metastore
Oleksiy Sayankin created HIVE-20348: --- Summary: Hive HCat does not create a proper "client" on kerberos cluster without hive metastore Key: HIVE-20348 URL: https://issues.apache.org/jira/browse/HIVE-20348 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin *STEPS TO REPRODUCE:* 1. Configure Hive to use embedded Metastore (do not specify {{hive.metastore.uris}} in {{hive-site.xml}}); 2. Create a database and a table in MySQL: {code:java} mysql -uroot -p123456 -e "CREATE DATABASE test;CREATE TABLE test.test (id INT);INSERT INTO test.test VALUES (1),(2),(3)" {code} 3. Create a table in Hive: {code:java} hive -e "CREATE TABLE default.test (id INT)" {code} 4. Run Sqoop import command: {code:java} sqoop import --connect 'jdbc:mysql://localhost:3306/test' --username root --password 123456 --table test --hcatalog-database "default" --hcatalog-table "test" --verbose -m 1 {code} *ACTUAL RESULT:* Sqoop import command fails with an exception: {code:java} 18/08/08 01:07:09 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. 
Cause : java.lang.NullPointerException at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:220) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70) at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:361) at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:783) at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98) at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259) at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:689) at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:498) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:606) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) Caused by: java.lang.NullPointerException at org.apache.hadoop.security.token.Token.decodeWritable(Token.java:256) at org.apache.hadoop.security.token.Token.decodeFromUrlString(Token.java:275) at org.apache.hive.hcatalog.common.HCatUtil.extractThriftToken(HCatUtil.java:351) at org.apache.hive.hcatalog.mapreduce.Security.handleSecurity(Security.java:139) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:214) ... 15 more {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19746) Hadoop credential provider allows reading passwords
Oleksiy Sayankin created HIVE-19746: --- Summary: Hadoop credential provider allows reading passwords Key: HIVE-19746 URL: https://issues.apache.org/jira/browse/HIVE-19746 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin This simple program allows reading any password from any {{jceks}} file:
{code}
package com.test.app;

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.alias.CredentialProvider;
import org.apache.hadoop.security.alias.CredentialProviderFactory;

public class PasswordReader {
  public static void main(String[] args) throws IOException {
    if (args == null || args.length == 0) {
      throw new IllegalArgumentException("Credential provider path is not set");
    }
    String credentialProviderPath = args[0];
    Configuration configuration = new Configuration();
    configuration.set(CredentialProviderFactory.CREDENTIAL_PROVIDER_PATH, credentialProviderPath);
    CredentialProvider credentialProvider = CredentialProviderFactory.getProviders(configuration).get(0);
    List<String> aliases = credentialProvider.getAliases();
    for (String alias : aliases) {
      System.out.println(alias + " = " + new String(configuration.getPassword(alias)));
    }
  }
}
{code}
{code}
java -cp $(hadoop classpath):password-reader.jar com.test.app.PasswordReader jceks://hdfs/user/hive/hivemetastore.jceks
{code}
*RESULT*
{code}
javax.jdo.option.connectionpassword = 123456
{code}
File {{jceks://hdfs/user/hive/hivemetastore.jceks}} has {{\-rw\-r\-\-r\-\-}} permissions and {{hdfs:hdfs}} owner:group. We can't remove the world-readable permissions here, because Hive is configured for impersonation to allow users other than {{hdfs}} to connect to HiveServer2. 
When I removed the world-readable permissions, I got the exception: {code} 2018-05-31T10:08:40,191 ERROR [pool-7-thread-41] fs.Inode: Marking failure for: /user/hive/hivemetastore.jceks, error: Input/output error 2018-05-31T10:08:40,192 ERROR [pool-7-thread-41] fs.Inode: Throwing exception for: /user/hive/hivemetastore.jceks, error: Input/output error 2018-05-31T10:08:40,192 ERROR [pool-7-thread-41] metastore.RetryingHMSHandler: java.lang.RuntimeException: Error getting metastore password: null at org.apache.hadoop.hive.metastore.ObjectStore.getDataSourceProps(ObjectStore.java:485) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:279) {code} Any ideas on how to protect the passwords? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
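One possible direction, as a hedged sketch using the standard `hadoop credential` CLI: keep the metastore keystore readable only by the service account that actually opens the JDBC connection. Whether this is compatible with impersonation depends on which process reads the keystore in your setup (with an embedded metastore inside each user session it will not be); the path and `hive:hive` ownership below are illustrative assumptions:

```shell
# Recreate the keystore under a restricted location, owned by the hive service user
hadoop credential create javax.jdo.option.connectionpassword \
    -provider jceks://hdfs/user/hive/secure/hivemetastore.jceks
hadoop fs -chown hive:hive /user/hive/secure/hivemetastore.jceks
hadoop fs -chmod 400 /user/hive/secure/hivemetastore.jceks
```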
[jira] [Created] (HIVE-19587) HeartBeat thread uses cancelled delegation token while connecting to meta on KERBEROS cluster
Oleksiy Sayankin created HIVE-19587: --- Summary: HeartBeat thread uses cancelled delegation token while connecting to meta on KERBEROS cluster Key: HIVE-19587 URL: https://issues.apache.org/jira/browse/HIVE-19587 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

*STEP 1. Create test data*

{code}
create table t1 (id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
create table t2 (id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
{code}

Generate 10 000 000 lines of random data:

{code}
package com.test.app;

import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.concurrent.ThreadLocalRandom;

public class App {
  public static void main(String[] args) throws FileNotFoundException {
    try (PrintWriter out = new PrintWriter("table.data")) {
      int min = 0;
      int max = 10_000;
      int numRows = 10_000_000;
      for (int i = 0; i < numRows; i++) {
        int randomNum = ThreadLocalRandom.current().nextInt(min, max + 1);
        out.println(randomNum);
      }
    }
  }
}
{code}

Upload the data to the Hive tables:

{code}
load data local inpath '/home/myuser/table.data' into table t1;
load data local inpath '/home/myuser/table.data' into table t2;
{code}

*STEP 2. Configure transactions in hive-site.xml*

{code}
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
{code}

*STEP 3. Configure hive.txn.timeout in hive-site.xml*

{code}
<property>
  <name>hive.txn.timeout</name>
  <value>10s</value>
</property>
{code}

*STEP 4. Connect via beeline to HS2 with KERBEROS*

{code}
!connect jdbc:hive2://node8.cluster:1/default;principal=myuser/node8.cluster@NODE8;ssl=true;sslTrustStore=/opt/myuser/conf/ssl_truststore
{code}

{code}
select count(*) from t1;
{code}

*STEP 5.
Close connection and reconnect*

{code}
!close
{code}

{code}
!connect jdbc:hive2://node8.cluster:1/default;principal=myuser/node8.cluster@NODE8;ssl=true;sslTrustStore=/opt/myuser/conf/ssl_truststore
{code}

*STEP 6. Perform a long-running query*

This query runs for about 600 s:

{code}
select count(*) from t1 join t2 on t1.id = t2.id;
{code}

*EXPECTED RESULT*

Query finishes successfully.

*ACTUAL RESULT*

{code}
2018-05-17T13:54:54,921 ERROR [pool-7-thread-10] transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
    at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:598)
    at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244)
    at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
    at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:663)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:660)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1613)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:660)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=myuser, renewer=myuser, realUser=, issueDate=1526565229297, maxDate=1527170029297, sequenceNumber=1, masterKeyId=1
    at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:104)
    at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56)
    at
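For reference, the {{issueDate}}/{{maxDate}} pair in the {{InvalidToken}} message pins down the token's maximum lifetime. A small stdlib calculation (class name is mine) shows it is exactly 7 days, and that the token was issued on the same day the error was logged, so it could not have reached {{maxDate}} naturally, which is consistent with the token having been cancelled on reconnect:

```java
import java.time.Duration;
import java.time.Instant;

public class TokenDates {
    public static void main(String[] args) {
        // Values copied from the InvalidToken message in the stack trace above.
        Instant issueDate = Instant.ofEpochMilli(1526565229297L);
        Instant maxDate = Instant.ofEpochMilli(1527170029297L);
        // Maximum token lifetime encoded in the token itself:
        System.out.println(Duration.between(issueDate, maxDate).toDays()); // 7
        // Issued on 2018-05-17 (UTC), minutes before the 13:54:54 error:
        System.out.println(issueDate); // 2018-05-17T13:53:49.297Z
    }
}
```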
[jira] [Created] (HIVE-19295) Some multiple inserts don't work on MR engine
Oleksiy Sayankin created HIVE-19295: --- Summary: Some multiple inserts don't work on MR engine Key: HIVE-19295 URL: https://issues.apache.org/jira/browse/HIVE-19295 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

*General Info*

Hive version: 2.3.3

{code}
commit 3f7dde31aed44b5440563d3f9d8a8887beccf0be
Author: Daniel Dai
Date:   Wed Mar 28 16:46:29 2018 -0700

    Preparing for 2.3.3 release
{code}

Hadoop version: 2.7.2.

Engine:

{code}
hive> set hive.execution.engine;
hive.execution.engine=mr
{code}

*STEP 1. Create test data*

{code}
DROP TABLE IF EXISTS customer_target;
DROP TABLE IF EXISTS customer_source;
{code}

{code}
CREATE TABLE customer_target (id STRING, first_name STRING, last_name STRING, age INT);
{code}

{code}
insert into customer_target values ('001', 'John', 'Smith', 45), ('002', 'Michael', 'Watson', 27), ('003', 'Den', 'Brown', 33);
SELECT id, first_name, last_name, age FROM customer_target;
{code}

{code}
+------+-------------+------------+------+
|  id  | first_name  | last_name  | age  |
+------+-------------+------------+------+
| 002  | Michael     | Watson     | 27   |
| 001  | John        | Smith      | 45   |
| 003  | Den         | Brown      | 33   |
+------+-------------+------------+------+
{code}

{code}
CREATE TABLE customer_source (id STRING, first_name STRING, last_name STRING, age INT);
insert into customer_source values ('001', 'Dorothi', 'Hogward', 77), ('007', 'Alex', 'Bowee', 1), ('088', 'Robert', 'Dowson', 25);
SELECT id, first_name, last_name, age FROM customer_source;
{code}

{code}
+------+-------------+------------+------+
|  id  | first_name  | last_name  | age  |
+------+-------------+------------+------+
| 088  | Robert      | Dowson     | 25   |
| 001  | Dorothi     | Hogward    | 77   |
| 007  | Alex        | Bowee      | 1    |
+------+-------------+------------+------+
{code}

*STEP 2.
Do multiple insert*

{code}
FROM `default`.`customer_target` `trg`
JOIN `default`.`customer_source` `src` ON `src`.`id` = `trg`.`id`
INSERT INTO `default`.`customer_target`   -- update clause
  select `trg`.`id`, `src`.`first_name`, `src`.`last_name`, `trg`.`age`
  WHERE `src`.`id` = `trg`.`id`
  sort by `trg`.id
INSERT INTO `default`.`customer_target`   -- insert clause
  select `src`.`id`, `src`.`first_name`, `src`.`last_name`, `src`.`age`
  WHERE `trg`.`id` IS NULL
{code}

*ACTUAL RESULT*

{code}
FAILED: SemanticException [Error 10087]: The same output cannot be present multiple times: customer_target
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19286) NPE in MERGE operator on MR mode
Oleksiy Sayankin created HIVE-19286: --- Summary: NPE in MERGE operator on MR mode Key: HIVE-19286 URL: https://issues.apache.org/jira/browse/HIVE-19286 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

{code}
DROP TABLE IF EXISTS customer_target;
DROP TABLE IF EXISTS customer_source;
{code}

{code}
CREATE TABLE customer_target (id STRING, first_name STRING, last_name STRING, age INT) clustered by (id) into 2 buckets stored as ORC TBLPROPERTIES ('transactional'='true');
{code}

{code}
insert into customer_target values ('001', 'John', 'Smith', 45), ('002', 'Michael', 'Watson', 27), ('003', 'Den', 'Brown', 33);
SELECT id, first_name, last_name, age FROM customer_target;
{code}

{code}
+------+-------------+------------+------+
|  id  | first_name  | last_name  | age  |
+------+-------------+------------+------+
| 002  | Michael     | Watson     | 27   |
| 001  | John        | Smith      | 45   |
| 003  | Den         | Brown      | 33   |
+------+-------------+------------+------+
{code}

{code}
CREATE TABLE customer_source (id STRING, first_name STRING, last_name STRING, age INT);
insert into customer_source values ('001', 'Dorothi', 'Hogward', 77), ('007', 'Alex', 'Bowee', 1), ('088', 'Robert', 'Dowson', 25);
SELECT id, first_name, last_name, age FROM customer_source;
{code}

{code}
+------+-------------+------------+------+
|  id  | first_name  | last_name  | age  |
+------+-------------+------------+------+
| 088  | Robert      | Dowson     | 25   |
| 001  | Dorothi     | Hogward    | 77   |
| 007  | Alex        | Bowee      | 1    |
+------+-------------+------------+------+
{code}

{code}
merge into customer_target trg
using customer_source src
on src.id = trg.id
when matched then update set first_name = src.first_name, last_name = src.last_name
when not matched then insert values (src.id, src.first_name, src.last_name, src.age);
{code}

{code}
2018-04-24T07:11:44,448 DEBUG [main] log.PerfLogger:
2018-04-24T07:11:44,448 INFO [main] exec.SerializationUtilities: Deserializing MapredLocalWork using kryo
2018-04-24T07:11:44,463 DEBUG [main] exec.Utilities: Hive Conf not found or Session not initiated, use thread based class loader instead
2018-04-24T07:11:44,538 DEBUG [main] log.PerfLogger:
2018-04-24T07:11:44,545 INFO
[main] mr.MapredLocalTask: 2018-04-24 07:11:44 Starting to launch local task to process map join; maximum memory = 477626368 2018-04-24T07:11:44,545 DEBUG [main] mr.MapredLocalTask: initializeOperators: trg, children = [HASHTABLESINK[37]] 2018-04-24T07:11:44,656 DEBUG [main] exec.Utilities: Hive Conf not found or Session not initiated, use thread based class loader instead 2018-04-24T07:11:44,676 INFO [main] mr.MapredLocalTask: fetchoperator for trg created 2018-04-24T07:11:44,676 INFO [main] exec.TableScanOperator: Initializing operator TS[0] 2018-04-24T07:11:44,676 DEBUG [main] exec.TableScanOperator: Initialization Done 0 TS 2018-04-24T07:11:44,676 DEBUG [main] exec.TableScanOperator: Operator 0 TS initialized 2018-04-24T07:11:44,676 DEBUG [main] exec.TableScanOperator: Initializing children of 0 TS 2018-04-24T07:11:44,676 DEBUG [main] exec.HashTableSinkOperator: Initializing child 37 HASHTABLESINK 2018-04-24T07:11:44,676 INFO [main] exec.HashTableSinkOperator: Initializing operator HASHTABLESINK[37] 2018-04-24T07:11:44,677 INFO [main] mapjoin.MapJoinMemoryExhaustionHandler: JVM Max Heap Size: 477626368 2018-04-24T07:11:44,680 ERROR [main] mr.MapredLocalTask: Hive Runtime Error: Map local work failed java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:91) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:153) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:366) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508) ~[hive-exec-2.3.3.jar:2.3.3] at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:508) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:411) ~[hive-exec-2.3.3.jar:2.3.3] at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:391) ~[hive-exec-2.3.3.jar:2.3.3]
[jira] [Created] (HIVE-18975) NPE when inserting NULL value in structure and array with HBase table
Oleksiy Sayankin created HIVE-18975: --- Summary: NPE when inserting NULL value in structure and array with HBase table Key: HIVE-18975 URL: https://issues.apache.org/jira/browse/HIVE-18975 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18728) Secure webHCat with SSL
Oleksiy Sayankin created HIVE-18728: --- Summary: Secure webHCat with SSL Key: HIVE-18728 URL: https://issues.apache.org/jira/browse/HIVE-18728 Project: Hive Issue Type: New Feature Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18702) "Insert overwrite table" doesn't clean the table directory before overwriting
Oleksiy Sayankin created HIVE-18702: --- Summary: "Insert overwrite table" doesn't clean the table directory before overwriting Key: HIVE-18702 URL: https://issues.apache.org/jira/browse/HIVE-18702 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

*STEP 1. Create test data*

{code}
nano /home/test/users.txt
{code}

Add to the file:

{code}
Peter,34
John,25
Mary,28
{code}

{code}
hadoop fs -mkdir /bug
hadoop fs -copyFromLocal /home/test/users.txt /bug
hadoop fs -ls /bug
{code}

*EXPECTED RESULT:*

{code}
Found 2 items
-rwxr-xr-x 3 root root 25 2015-10-15 16:11 /bug/users.txt
{code}

*STEP 2. Upload data to hive*

{code}
create external table bug(name string, age int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug';
select * from bug;
{code}

*EXPECTED RESULT:*

{code}
OK
Peter   34
John    25
Mary    28
{code}

{code}
create external table bug1(name string, age int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/bug1';
insert overwrite table bug select * from bug1;
select * from bug;
{code}

*EXPECTED RESULT:*

{code}
OK
Time taken: 0.097 seconds
{code}

*ACTUAL RESULT:*

{code}
hive> select * from bug;
OK
Peter   34
John    25
Mary    28
Time taken: 0.198 seconds, Fetched: 3 row(s)
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
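For clarity, the semantics the report expects from {{INSERT OVERWRITE}} can be sketched with plain {{java.nio}}. This only illustrates "clean the directory first, even when the new result is empty"; it is not Hive's implementation, and the names are mine:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class OverwriteSemantics {
    // "Overwrite" = remove the table directory's previous contents, then
    // write the new result files, even when the new result set is empty.
    static void overwrite(Path tableDir, String[] newFiles) throws IOException {
        try (DirectoryStream<Path> old = Files.newDirectoryStream(tableDir)) {
            for (Path p : old) {
                Files.delete(p); // delete previous contents first
            }
        }
        for (String f : newFiles) {
            Files.createFile(tableDir.resolve(f));
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("bug");
        Files.createFile(dir.resolve("users.txt"));
        overwrite(dir, new String[0]); // overwrite from an empty source table
        try (DirectoryStream<Path> left = Files.newDirectoryStream(dir)) {
            System.out.println(left.iterator().hasNext()); // false: directory is empty now
        }
    }
}
```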
[jira] [Created] (HIVE-18541) Secure HS2 web ui with PAM
Oleksiy Sayankin created HIVE-18541: --- Summary: Secure HS2 web ui with PAM Key: HIVE-18541 URL: https://issues.apache.org/jira/browse/HIVE-18541 Project: Hive Issue Type: Sub-task Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18279) Incorrect condition in StatsOptimizer
Oleksiy Sayankin created HIVE-18279: --- Summary: Incorrect condition in StatsOptimizer Key: HIVE-18279 URL: https://issues.apache.org/jira/browse/HIVE-18279 Project: Hive Issue Type: Improvement Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18115) Fix schema version info for Hive-2.3.2
Oleksiy Sayankin created HIVE-18115: --- Summary: Fix schema version info for Hive-2.3.2 Key: HIVE-18115 URL: https://issues.apache.org/jira/browse/HIVE-18115 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Priority: Minor Error while starting HiveMeta {code} Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Hive Schema version 2.3.2 does not match metastore's schema version 2.3.0 Metastore is not upgraded or corrupt at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7600) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7563) ~[hive-exec-2.3.2.jar:2.3.2] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141] at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) ~[hive-exec-2.3.2.jar:2.3.2] at com.sun.proxy.$Proxy23.verifySchema(Unknown Source) ~[?:?] 
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427) ~[hive-exec-2.3.2.jar:2.3.2] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_141] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_141] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_141] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) ~[hive-exec-2.3.2.jar:2.3.2] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.(RetryingHMSHandler.java:79) ~[hive-exec-2.3.2.jar:2.3.2] {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18060) UpdateInputAccessTimeHook fails for non-current database
Oleksiy Sayankin created HIVE-18060: --- Summary: UpdateInputAccessTimeHook fails for non-current database Key: HIVE-18060 URL: https://issues.apache.org/jira/browse/HIVE-18060 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Steps to reproduce: *STEP 1. Create DBs and tables* {code} hive> create database temp; hive> use temp; hive> create table test(id int); hive> create database temp2; hive> use temp2; hive> create table test2(id int); {code} *STEP 2. Set {{hive.exec.pre.hooks}}* {code} hive> set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec; {code} *STEP 3. Use {{desc}}* {code} hive> use temp; hive> desc temp2.test2; {code} *EXPECTED RESULT* Code works fine and shows table info *ACTUAL RESULT* {code} FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found test2) org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found test2 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1258) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1209) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1196) at org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:61) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1688) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:234) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:791) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652) at 
org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
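The trace above suggests the hook looks {{test2}} up in the current database ({{temp}}) instead of honoring the {{temp2.}} qualifier. A minimal sketch of the resolution the hook apparently needs (the helper name is hypothetical, not Hive's API):

```java
public class QualifiedName {
    // Resolve a possibly db-qualified table name against the current database:
    // "temp2.test2" -> {"temp2", "test2"}, bare "test" -> {currentDb, "test"}.
    static String[] resolve(String name, String currentDb) {
        int dot = name.indexOf('.');
        if (dot < 0) {
            return new String[] { currentDb, name };
        }
        return new String[] { name.substring(0, dot), name.substring(dot + 1) };
    }

    public static void main(String[] args) {
        String[] qualified = resolve("temp2.test2", "temp");
        System.out.println(qualified[0] + "." + qualified[1]); // temp2.test2
        String[] bare = resolve("test", "temp");
        System.out.println(bare[0] + "." + bare[1]); // temp.test
    }
}
```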
[jira] [Created] (HIVE-17098) Race condition in Hbase tables
Oleksiy Sayankin created HIVE-17098: --- Summary: Race condition in Hbase tables Key: HIVE-17098 URL: https://issues.apache.org/jira/browse/HIVE-17098 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin These steps simulate our customer production env. *STEP 1. Create test tables* {code} CREATE TABLE for_loading( key int, value string, age int, salary decimal (10,2) ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; {code} {code} CREATE TABLE test_1( key int, value string, age int, salary decimal (10,2) ) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( 'hbase.columns.mapping'=':key, cf1:value, cf1:age, cf1:salary', 'serialization.format'='1') TBLPROPERTIES ( 'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 'hbase.table.name'='test_1', 'numFiles'='0', 'numRows'='0', 'rawDataSize'='0', 'totalSize'='0', 'transient_lastDdlTime'='1495769316'); {code} {code} CREATE TABLE test_2( key int, value string, age int, salary decimal (10,2) ) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( 'hbase.columns.mapping'=':key, cf1:value, cf1:age, cf1:salary', 'serialization.format'='1') TBLPROPERTIES ( 'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 'hbase.table.name'='test_2', 'numFiles'='0', 'numRows'='0', 'rawDataSize'='0', 'totalSize'='0', 'transient_lastDdlTime'='1495769316'); {code} *STEP 2. 
Create test data*

{code}
import java.io.IOException;
import java.math.BigDecimal;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

import static java.lang.String.format;

public class Generator {

    private static List<String> lines = new ArrayList<>();
    private static List<String> name = Arrays.asList("Brian", "John", "Rodger", "Max", "Freddie", "Albert", "Fedor", "Lev", "Niccolo");

    public static void main(String[] args) {
        generateData(Integer.parseInt(args[0]), args[1]);
    }

    public static void generateData(int rowNumber, String file) {
        double minValue = 2.55;     // salary range
        double maxValue = 1000.03;
        Random random = new Random();
        for (int i = 1; i <= rowNumber; i++) {
            lines.add(i + "," + name.get(random.nextInt(name.size())) + "," + (random.nextInt(62) + 18) + ","
                + format("%.2f", minValue + (maxValue - minValue) * random.nextDouble()));
        }
        Path path = Paths.get(file);
        try {
            Files.write(path, lines, Charset.forName("UTF-8"), StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
{code}

{code}
javac Generator.java
java Generator 300 dataset.csv
hadoop fs -put dataset.csv /
{code}

*STEP 3. Upload test data*

{code}
load data local inpath '/home/myuser/dataset.csv' into table for_loading;
{code}

{code}
from for_loading insert into table test_1 select key,value,age,salary;
{code}

{code}
from for_loading insert into table test_2 select key,value,age,salary;
{code}

*STEP 4.
Run test queries*

Run in 5 parallel terminals for table {{test_1}}:

{code}
for i in {1..500}; do beeline -u "jdbc:hive2://localhost:1/default testuser1" -e "select * from test_1 limit 10;" 1>/dev/null; done
{code}

Run in 5 parallel terminals for table {{test_2}}:

{code}
for i in {1..500}; do beeline -u "jdbc:hive2://localhost:1/default testuser2" -e "select * from test_2 limit 10;" 1>/dev/null; done
{code}

*EXPECTED RESULT:*

All queries succeed.

*ACTUAL RESULT*

{code}
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.IllegalStateException: The input format instance has not been properly initialized. Ensure you call initializeTable either in your constructor or initialize method
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:484)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:308)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:847)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at
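The "has not been properly initialized ... call initializeTable either in your constructor or initialize method" message points at a check-then-act initialization race. A generic illustration of safe lazy initialization under concurrency (plain Java, not the HBase handler code; the class and field names are mine):

```java
public class LazyInitRace {
    // Illustrative only: an unsynchronized "if (table == null) init()" can be
    // observed half-done by a concurrent reader. Volatile publication plus a
    // synchronized double-check guarantees exactly one fully built instance.
    static class Holder {
        private volatile Object table;

        Object get() {
            Object t = table;
            if (t == null) {
                synchronized (this) {
                    if (table == null) {
                        table = new Object(); // stands in for initializeTable(...)
                    }
                    t = table;
                }
            }
            return t;
        }
    }

    public static void main(String[] args) throws Exception {
        Holder h = new Holder();
        Object[] seen = new Object[2];
        Thread a = new Thread(() -> seen[0] = h.get());
        Thread b = new Thread(() -> seen[1] = h.get());
        a.start(); b.start(); a.join(); b.join();
        System.out.println(seen[0] != null && seen[0] == seen[1]); // true: one shared instance
    }
}
```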
[jira] [Created] (HIVE-15082) Hive-1.2 cannot read data from complex data types with TIMESTAMP column, stored in Parquet
Oleksiy Sayankin created HIVE-15082: --- Summary: Hive-1.2 cannot read data from complex data types with TIMESTAMP column, stored in Parquet Key: HIVE-15082 URL: https://issues.apache.org/jira/browse/HIVE-15082 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Fix For: 1.2.2

*STEP 1. Create test data*

{code:sql}
select * from dual;
{code}

*EXPECTED RESULT:*

{noformat}
Pretty_UnIQUe_StrinG
{noformat}

{code:sql}
create table test_parquet1(login timestamp) stored as parquet;
insert overwrite table test_parquet1 select from_unixtime(unix_timestamp()) from dual;
select * from test_parquet1 limit 1;
{code}

*EXPECTED RESULT:* No exceptions. Current timestamp as result.

{noformat}
2016-10-27 10:58:19
{noformat}

*STEP 2. Store timestamp in array in parquet file*

{code:sql}
create table test_parquet2(x array<timestamp>) stored as parquet;
insert overwrite table test_parquet2 select array(login) from test_parquet1;
select * from test_parquet2;
{code}

*EXPECTED RESULT:* No exceptions. Current timestamp in brackets as result.

{noformat}
["2016-10-27 10:58:19"]
{noformat}

*ACTUAL RESULT:*

{noformat}
ERROR [main]: CliDriver (SessionState.java:printError(963)) - Failed with exception java.io.IOException:parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0
java.io.IOException: parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0
{noformat}

*ROOT-CAUSE:* Incorrect initialization of the {{metadata}} {{HashMap}} causes it to be {{null}} in the enumeration {{org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter}} when executing the following line:

{code:java}
boolean skipConversion = Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
{code}

in element {{ETIMESTAMP_CONVERTER}}.
The JVM throws an NPE there, so the parquet library cannot read data from the file and in its turn throws

{noformat}
java.io.IOException:parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/00_0
{noformat}

*SOLUTION:* Perform the initialization in a separate method so that it is not overridden with a {{null}} value in this block of code:

{code:java}
if (parent != null) {
  setMetadata(parent.getMetadata());
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
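A minimal reproduction of the root cause and of the null-safe shape of the check (the helper and the key string here are illustrative only; the actual patch moves initialization into a separate method as described above):

```java
import java.util.HashMap;
import java.util.Map;

public class SkipConversionGuard {
    // When the metadata map was never initialized, metadata.get(...) throws
    // NPE before Boolean.valueOf is even called. Guarding against null gives
    // the same default (false) as a missing key.
    static boolean skipConversion(Map<String, String> metadata, String key) {
        return metadata != null && Boolean.parseBoolean(metadata.get(key));
    }

    public static void main(String[] args) {
        String key = "hive.parquet.timestamp.skip.conversion"; // illustrative key name
        System.out.println(skipConversion(null, key));  // false, no NPE
        Map<String, String> m = new HashMap<>();
        m.put(key, "true");
        System.out.println(skipConversion(m, key));     // true
    }
}
```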
[jira] [Created] (HIVE-14777) Add support of Spark-2.0.0 in Hive-2.X.X
Oleksiy Sayankin created HIVE-14777: --- Summary: Add support of Spark-2.0.0 in Hive-2.X.X Key: HIVE-14777 URL: https://issues.apache.org/jira/browse/HIVE-14777 Project: Hive Issue Type: Wish Reporter: Oleksiy Sayankin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14696) Hive Query Fail with MetaException(message:org.datanucleus.exceptions.NucleusDataStoreException: Size request failed
Oleksiy Sayankin created HIVE-14696: --- Summary: Hive Query Fail with MetaException(message:org.datanucleus.exceptions.NucleusDataStoreException: Size request failed Key: HIVE-14696 URL: https://issues.apache.org/jira/browse/HIVE-14696 Project: Hive Issue Type: Bug Components: Metastore Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin We have a customer who is on Hive 0.13 and the queries seem to be failing with exception: {code} 2016-08-30 00:22:58,965 ERROR [main]: metadata.Hive (Hive.java:getPartition(1619)) - MetaException(message:org.datanucleus.exceptions.NucleusDataStoreException: Size request failed : SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=? AND THIS.`INTEGER_IDX`>=0) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_with_auth_result$get_partition_with_auth_resultStandardScheme.read(ThriftHiveMetastore.java:54171) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_with_auth_result$get_partition_with_auth_resultStandardScheme.read(ThriftHiveMetastore.java:54148) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_with_auth_result.read(ThriftHiveMetastore.java:54079) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition_with_auth(ThriftHiveMetastore.java:1689) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition_with_auth(ThriftHiveMetastore.java:1672) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:1003) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at com.sun.proxy.$Proxy9.getPartitionWithAuthInfo(Unknown Source) at 
org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1611)
    at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1565)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:370)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:456)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:466)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}

A similar JIRA for Hive 0.13: https://issues.apache.org/jira/browse/HIVE-8766

I suppose these are similar issues: both are related to Hive metastore performance, occur when the metastore is overloaded, and can surface as different exceptions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14544) LOAD DATA statement appends .0 to the partition name
Oleksiy Sayankin created HIVE-14544: --- Summary: LOAD DATA statement appends .0 to the partition name Key: HIVE-14544 URL: https://issues.apache.org/jira/browse/HIVE-14544 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

*STEP 1. Create file with data:*

{noformat}
echo 1 > /tmp/data.file
{noformat}

*STEP 2. Create table in hive:*

{noformat}
CREATE TABLE `issue` (`id` INT) PARTITIONED BY (`ts` TIMESTAMP);
{noformat}

*STEP 3. Insert data into table:*

{noformat}
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE `issue` PARTITION (`ts`) VALUES (1,'1970-01-01 00:00:00'),(2,'1980-01-01 00:00:00'),(3,'1990-01-01 00:00:00');
{noformat}

*STEP 4. Load data into table using hive:*

{noformat}
LOAD DATA LOCAL INPATH '/tmp/data.file' OVERWRITE INTO TABLE `issue` PARTITION (`ts`='2000-01-01 00:00:00');
{noformat}

*STEP 5. Run show partitions query:*

{noformat}
SHOW PARTITIONS `issue`;
{noformat}

*EXPECTED RESULT:*

{noformat}
ts=1970-01-01 00%3A00%3A00
ts=1980-01-01 00%3A00%3A00
ts=1990-01-01 00%3A00%3A00
ts=2000-01-01 00%3A00%3A00
{noformat}

*ACTUAL RESULT* We've gotten partitions with different precision:

{noformat}
ts=1970-01-01 00%3A00%3A00
ts=1980-01-01 00%3A00%3A00
ts=1990-01-01 00%3A00%3A00
ts=2000-01-01 00%3A00%3A00.0
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
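A plausible source of the stray suffix: {{java.sql.Timestamp.toString()}} always prints at least one fractional digit, so any code path that round-trips the partition value through a {{Timestamp}} re-renders it with a trailing {{.0}}. A stdlib demonstration of the printing behavior (this shows the mechanism only, not a claim about which Hive class does the round-trip):

```java
import java.sql.Timestamp;

public class TimestampSuffix {
    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2000-01-01 00:00:00");
        // Timestamp.toString() never drops the fractional part entirely:
        System.out.println(ts);         // 2000-01-01 00:00:00.0
        // so building a partition name from it appends ".0":
        System.out.println("ts=" + ts); // ts=2000-01-01 00:00:00.0
    }
}
```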
[jira] [Created] (HIVE-14145) Too small length of column 'PARAM_VALUE' in table 'SERDE_PARAMS'
Oleksiy Sayankin created HIVE-14145: --- Summary: Too small length of column 'PARAM_VALUE' in table 'SERDE_PARAMS' Key: HIVE-14145 URL: https://issues.apache.org/jira/browse/HIVE-14145 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

The customer has the following table:
{code}
create external table hive_hbase_test( HBASE_KEY string, ENTITY_NAME string, ENTITY_ID string, CLAIM_HEADER_ID string, CLAIM_LINE_ID string, MEDICAL_CLAIM_SOURCE_SYSTEM string, UNIQUE_MEMBER_ID string, MEMBER_SOURCE_SYSTEM string, SUBSCRIBER_ID string, COVERAGE_CLASS_CODE string, SERVICING_PROVIDER_ID string, PROVIDER_SOURCE_SYSTEM string, SERVICING_PROVIDER_SPECIALTY string, SERVICING_STANDARD_PROVIDER_SPECIALTY string, SERVICING_PROVIDER_TYPE_CODE string, REFERRING_PROVIDER_ID string, ADMITTING_PROVIDER_ID string, ATTENDING_PROVIDER_ID string, OPERATING_PROVIDER_ID string, BILLING_PROVIDER_ID string, ORDERING_PROVIDER_ID string, HEALTH_PLAN_SOURCE_ID string, HEALTH_PLAN_PAYER_NAME string, BUSINESS_UNIT string, OPERATING_UNIT string, PRODUCT string, MARKET string, DEPARTMENT string, IPA string, SUPPLEMENTAL_DATA_TYPE string, PSEUDO_CLAIM_FLAG string, CLAIM_STATUS string, CLAIM_LINE_STATUS string, CLAIM_DENIED_FLAG string, SERVICE_LINE_DENIED_FLAG string, DENIED_REASON_CODE string, SERVICE_LINE_DENIED_REASON_CODE string, DAYS_DENIED int, DIAGNOSIS_DATE timestamp, SERVICE_DATE TIMESTAMP, SERVICE_FROM_DATE TIMESTAMP, SERVICE_TO_DATE TIMESTAMP, ADMIT_DATE TIMESTAMP, ADMIT_TYPE string, ADMIT_SOURCE_TYPE string, DISCHARGE_DATE TIMESTAMP, DISCHARGE_STATUS_CODE string, SERVICE_LINE_TYPE_OF_SERVICE string, TYPE_OF_BILL_CODE string, INPATIENT_FLAG string, PLACE_OF_SERVICE_CODE string, FACILITY_CODE string, AUTHORIZATION_NUMBER string, CLAIM_REFERRAL_NUMBER string, CLAIM_TYPE string, CLAIM_ADJUSTMENT_TYPE string, ICD_DIAGNOSIS_CODE_1 string, PRESENT_ON_ADMISSION_FLAG_1 string, ICD_DIAGNOSIS_CODE_2 string, PRESENT_ON_ADMISSION_FLAG_2 string,
ICD_DIAGNOSIS_CODE_3 string, PRESENT_ON_ADMISSION_FLAG_3 string, ICD_DIAGNOSIS_CODE_4 string, PRESENT_ON_ADMISSION_FLAG_4 string, ICD_DIAGNOSIS_CODE_5 string, PRESENT_ON_ADMISSION_FLAG_5 string, ICD_DIAGNOSIS_CODE_6 string, PRESENT_ON_ADMISSION_FLAG_6 string, ICD_DIAGNOSIS_CODE_7 string, PRESENT_ON_ADMISSION_FLAG_7 string, ICD_DIAGNOSIS_CODE_8 string, PRESENT_ON_ADMISSION_FLAG_8 string, ICD_DIAGNOSIS_CODE_9 string, PRESENT_ON_ADMISSION_FLAG_9 string, ICD_DIAGNOSIS_CODE_10 string, PRESENT_ON_ADMISSION_FLAG_10 string, ICD_DIAGNOSIS_CODE_11 string, PRESENT_ON_ADMISSION_FLAG_11 string, ICD_DIAGNOSIS_CODE_12 string, PRESENT_ON_ADMISSION_FLAG_12 string, ICD_DIAGNOSIS_CODE_13 string, PRESENT_ON_ADMISSION_FLAG_13 string, ICD_DIAGNOSIS_CODE_14 string, PRESENT_ON_ADMISSION_FLAG_14 string, ICD_DIAGNOSIS_CODE_15 string, PRESENT_ON_ADMISSION_FLAG_15 string, ICD_DIAGNOSIS_CODE_16 string, PRESENT_ON_ADMISSION_FLAG_16 string, ICD_DIAGNOSIS_CODE_17 string, PRESENT_ON_ADMISSION_FLAG_17 string, ICD_DIAGNOSIS_CODE_18 string, PRESENT_ON_ADMISSION_FLAG_18 string, ICD_DIAGNOSIS_CODE_19 string, PRESENT_ON_ADMISSION_FLAG_19 string, ICD_DIAGNOSIS_CODE_20 string, PRESENT_ON_ADMISSION_FLAG_20 string, ICD_DIAGNOSIS_CODE_21 string, PRESENT_ON_ADMISSION_FLAG_21 string, ICD_DIAGNOSIS_CODE_22 string, PRESENT_ON_ADMISSION_FLAG_22 string, ICD_DIAGNOSIS_CODE_23 string, PRESENT_ON_ADMISSION_FLAG_23 string, ICD_DIAGNOSIS_CODE_24 string, PRESENT_ON_ADMISSION_FLAG_24 string, ICD_DIAGNOSIS_CODE_25 string, PRESENT_ON_ADMISSION_FLAG_25 string, QUANTITY_OF_SERVICES decimal(10,2), REVENUE_CODE string, PROCEDURE_CODE string, PROCEDURE_CODE_MODIFIER_1 string, PROCEDURE_CODE_MODIFIER_2 string, PROCEDURE_CODE_MODIFIER_3 string, PROCEDURE_CODE_MODIFIER_4 string, ICD_VERSION_CODE_TYPE string, ICD_PROCEDURE_CODE_1 string, ICD_PROCEDURE_CODE_2 string, ICD_PROCEDURE_CODE_3 string, ICD_PROCEDURE_CODE_4 string, ICD_PROCEDURE_CODE_5 string, ICD_PROCEDURE_CODE_6 string, ICD_PROCEDURE_CODE_7 string, ICD_PROCEDURE_CODE_8 
string, ICD_PROCEDURE_CODE_9 string, ICD_PROCEDURE_CODE_10 string, ICD_PROCEDURE_CODE_11 string, ICD_PROCEDURE_CODE_12 string, ICD_PROCEDURE_CODE_13 string, ICD_PROCEDURE_CODE_14 string, ICD_PROCEDURE_CODE_15 string, ICD_PROCEDURE_CODE_16 string, ICD_PROCEDURE_CODE_17 string, ICD_PROCEDURE_CODE_18 string, ICD_PROCEDURE_CODE_19 string, ICD_PROCEDURE_CODE_20 string, ICD_PROCEDURE_CODE_21 string, ICD_PROCEDURE_CODE_22 string, ICD_PROCEDURE_CODE_23 string, ICD_PROCEDURE_CODE_24 string, ICD_PROCEDURE_CODE_25 string, E_CODE_1 string, E_CODE_TYPE_1 string, E_CODE_2 string, E_CODE_TYPE_2 string, E_CODE_3 string, E_CODE_TYPE_3 string, EMERGENCY_FLAG string, HOSPITAL_RELATED_FLAG string, OUTSIDE_LABS_FLAG string, PPS_CODE string, NATIONAL_DRUG_CODE string, VALUE_AMOUNT decimal(10,2), CAPITATED_SERVICE_FLAG string, NETWORK_STATUS_FLAG string, ADJUDICATED_DATE
[jira] [Created] (HIVE-13706) Need default actualOutputFormat for HivePassThroughOutputFormat
Oleksiy Sayankin created HIVE-13706: --- Summary: Need default actualOutputFormat for HivePassThroughOutputFormat Key: HIVE-13706 URL: https://issues.apache.org/jira/browse/HIVE-13706 Project: Hive Issue Type: Bug Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

A customer migrated from Hive-0.13 to Hive-1.2. The old tables have this description:
{code}
Table Parameters:
	hbase.table.name      	/user/test/xyz
	storage_handler       	org.apache.hadoop.hive.hbase.HBaseStorageHandler
	transient_lastDdlTime 	1462273708

# Storage Information
SerDe Library:         	org.apache.hadoop.hive.hbase.HBaseSerDe
InputFormat:           	org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat
OutputFormat:          	org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat
Compressed:            	No
Num Buckets:           	-1
Bucket Columns:        	[]
Sort Columns:          	[]
Storage Desc Params:
	hbase.columns.mapping	:key,cf1:val
	serialization.format 	1
Time taken: 0.259 seconds, Fetched: 30 row(s)
{code}
Because HivePassThroughOutputFormat has no default constructor in Hive-1.2, an exception is thrown. I can reproduce it manually.

1. Create a table in Hive:
{code}
create table t1(id int)
STORED AS
INPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat";
{code}
2. Perform the query:
{code}
select count(*) from t1;
{code}
3.
See the exception:
{code}
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat.<init>()
	at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:85)
	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:277)
	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:272)
	at org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:3489)
	at org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:3417)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:372)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1656)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1415)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1198)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1062)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1052)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat.<init>()
	at java.lang.Class.getConstructor0(Class.java:2892)
	at java.lang.Class.getDeclaredConstructor(Class.java:2058)
	at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:79)
	... 25 more
{code}
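The failure pattern here, reflectively instantiating a class that lacks a no-argument constructor, can be illustrated with a short Python sketch (the class and helper names are hypothetical stand-ins; where Java's ReflectionUtil throws NoSuchMethodException for a missing `<init>()`, Python raises TypeError):

```python
class PassThroughFormat:
    # Only a parameterized constructor exists, mirroring how
    # HivePassThroughOutputFormat in Hive-1.2 has no default constructor.
    def __init__(self, actual_format):
        self.actual_format = actual_format

def new_instance(cls):
    # Stand-in for ReflectionUtil.newInstance: it assumes the class
    # can be constructed with no arguments.
    return cls()

try:
    new_instance(PassThroughFormat)
except TypeError as e:
    # Java's analogue: java.lang.NoSuchMethodException: ...<init>()
    print("instantiation failed:", e)
```

This is why the fix requested in the summary is a default actualOutputFormat: with a no-arg constructor in place, the reflective instantiation path succeeds.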
[jira] [Created] (HIVE-13085) Need an API / configuration parameter to find out the authenticated user from beeline
Oleksiy Sayankin created HIVE-13085: --- Summary: Need an API / configuration parameter to find out the authenticated user from beeline Key: HIVE-13085 URL: https://issues.apache.org/jira/browse/HIVE-13085 Project: Hive Issue Type: Improvement Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin

This request comes from a customer who wants an API / configuration parameter to identify the authenticated user from beeline. It is similar to the request in this thread: https://community.hortonworks.com/questions/2620/hadoop-environment-variable-or-configuration-varia.html but that approach is not feasible for the requestor. The general ask: once a user is logged in to beeline, identify who that user is, then use this information to enforce ACLs on tables through the customer's custom code.
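To make the ask concrete, here is a hypothetical sketch of the kind of custom ACL enforcement the customer describes, assuming some API (the subject of this request, not an existing one) hands the custom code the authenticated beeline username; table names and the ACL store are invented for illustration:

```python
# Hypothetical per-table read ACLs, keyed by "db.table".
TABLE_ACLS = {
    "sales.orders": {"alice", "bob"},
    "hr.salaries": {"carol"},
}

def may_read(user: str, table: str) -> bool:
    # Allow access only to users explicitly granted on the table;
    # `user` would come from the requested authenticated-user API.
    return user in TABLE_ACLS.get(table, set())

print(may_read("alice", "sales.orders"))  # True
print(may_read("alice", "hr.salaries"))   # False
```

The missing piece the JIRA asks for is the first argument: a reliable way to obtain `user` for the current beeline session.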
[jira] [Created] (HIVE-12749) Constant propagate returns string values in incorrect format
Oleksiy Sayankin created HIVE-12749: --- Summary: Constant propagate returns string values in incorrect format Key: HIVE-12749 URL: https://issues.apache.org/jira/browse/HIVE-12749 Project: Hive Issue Type: Bug Affects Versions: 1.2.0, 1.0.0 Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin Fix For: 2.0.0

h2. STEP 1. Create and upload test data

Execute in the command line:
{noformat}
nano stest.data
{noformat}
Add to the file:
{noformat}
000126,000777
000126,000778
000126,000779
000474,000888
000468,000889
000272,000880
{noformat}
{noformat}
hadoop fs -put stest.data /
{noformat}
{noformat}
hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> LOAD DATA INPATH '/stest.data' OVERWRITE INTO TABLE stest;
{noformat}

h2. STEP 2. Execute test query

{noformat}
hive> select x from stest where x = 126;
{noformat}
EXPECTED RESULT:
{noformat}
000126
000126
000126
{noformat}
ACTUAL RESULT:
{noformat}
126
126
126
{noformat}
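The expected semantics can be mimicked outside Hive: comparing a STRING column to the numeric literal 126 implicitly casts each string to a number, so '000126' matches, but the selected value must remain the stored string; the bug is that constant propagation instead returns the folded constant "126". A minimal Python sketch using the stest.data values of column x:

```python
# Values of column x from stest.data.
rows = ["000126", "000126", "000126", "000474", "000468", "000272"]

# Hive implicitly casts the STRING column to a number for `x = 126`,
# yet the projected value should stay the original stored string.
matches = [x for x in rows if float(x) == 126]
print(matches)  # ['000126', '000126', '000126']
```

In other words, the cast belongs only in the predicate; folding the constant back into the projection loses the leading zeros.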