[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529998#comment-14529998 ]

Brock Noland commented on HIVE-8065:
------------------------------------

In that case the results of the query are staged in ez1.

Support HDFS encryption functionality on Hive
---------------------------------------------
Key: HIVE-8065
URL: https://issues.apache.org/jira/browse/HIVE-8065
Project: Hive
Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña
Labels: Hive-Scrum

The new encryption support on HDFS makes Hive incompatible and unusable when this feature is used. HDFS encryption is designed so that a user can configure different encryption zones (or directories) for multi-tenant environments. Each encryption zone has an exclusive encryption key, such as AES-128 or AES-256. For security compliance, HDFS does not allow moving or renaming files between encryption zones; renames are allowed only within the same encryption zone. A copy between encryption zones is allowed. See HDFS-6134 for more details about the HDFS encryption design.

Hive currently uses a scratch directory (like /tmp/$user/$random). This scratch directory is used for intermediate data (between MR jobs) and for the final output of the Hive query, which is later moved to the table's directory location. If Hive tables are in different encryption zones than the scratch directory, Hive won't be able to rename those files/directories, which makes Hive unusable.

To handle this problem, we can place the scratch directory of the query/statement inside the same encryption zone as the table's directory location, so that the rename succeeds. Also, for statements that move files between encryption zones (i.e. LOAD DATA), a copy may be executed instead of a rename. This adds overhead when copying large data files, but it won't break encryption in Hive.

Another security consideration arises with join selects.
If Hive joins tables with different encryption key strengths, the results of the select might break the security compliance of the tables. Say two tables encrypted with 128-bit and 256-bit keys are joined; the temporary results might be stored in the 128-bit encryption zone, which conflicts with the compliance requirements of the table encrypted with 256 bits. To fix this, Hive should select the most strongly encrypted scratch directory, so that intermediate data can be stored temporarily with no compliance issues. For instance:

{noformat}
SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id;
{noformat}
- This should use a scratch directory (or staging directory) inside the table-aes256 table location.

{noformat}
INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1;
{noformat}
- This should use a scratch directory inside the table-aes1 location.

{noformat}
FROM table-unencrypted
INSERT OVERWRITE TABLE table-aes128 SELECT id, name
INSERT OVERWRITE TABLE table-aes256 SELECT id, name
{noformat}
- This should use a scratch directory in each of the table locations.
- The first SELECT will have its scratch directory in the table-aes128 directory.
- The second SELECT will have its scratch directory in the table-aes256 directory.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
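The staging-directory selection rule in the examples above can be illustrated with a toy sketch. This is not Hive's implementation: the zone-to-key-length table and the `.hive-staging` suffix are assumptions standing in for HDFS encryption-zone metadata.

```python
# Toy sketch of the rule above: stage query results inside the most
# strongly encrypted table location involved in the query.
# ZONE_KEY_BITS is a hypothetical stand-in for HDFS encryption-zone metadata.

ZONE_KEY_BITS = {
    "/warehouse/table-aes128": 128,
    "/warehouse/table-aes256": 256,
    "/warehouse/table-unencrypted": 0,  # not in any encryption zone
}

def staging_dir(table_paths):
    """Pick a staging directory inside the most strongly encrypted location."""
    strongest = max(table_paths, key=lambda p: ZONE_KEY_BITS.get(p, 0))
    return strongest + "/.hive-staging"

# Joining an aes128 and an aes256 table stages inside the aes256 location.
print(staging_dir(["/warehouse/table-aes128", "/warehouse/table-aes256"]))
# -> /warehouse/table-aes256/.hive-staging
```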
[jira] [Updated] (HIVE-9644) CASE comparison operator rotation optimization
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-9644:
-----------------------------------
Attachment: HIVE-9644.2.patch

Extended patch with folding of when UDF.

CASE comparison operator rotation optimization
----------------------------------------------
Key: HIVE-9644
URL: https://issues.apache.org/jira/browse/HIVE-9644
Project: Hive
Issue Type: Bug
Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch

Constant folding doesn't kick in for some automatically generated query patterns, which look like this:

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1 else null end)=1;
{code}

This should get rewritten by pushing the equality into the case branches:

{code}
select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1=1 else null=1 end);
{code}

ending up with a simplified filter condition, resolving itself as

{code}
select count(1) from store_sales where ss_sold_date = '1998-01-01';
{code}
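The proposed rewrite can be sketched on a toy expression tree. The tuple encoding below is a stand-in for Hive's ExprNodeDesc tree, not the actual optimizer code:

```python
# Sketch of the rewrite described above: push an equality into the branches
# of CASE ... WHEN, then constant-fold. Expressions are nested tuples, a
# hypothetical stand-in for Hive's expression nodes.

def push_eq_into_case(expr):
    # expr: ("=", ("case", col, when_val, then_val, else_val), literal)
    op, case, lit = expr
    _, col, when_val, then_val, else_val = case
    # (CASE col WHEN w THEN t ELSE e END) = lit
    #   ->  CASE col WHEN w THEN t=lit ELSE e=lit END
    then_folded = then_val == lit                                # 1=1  -> True
    else_folded = None if else_val is None else else_val == lit  # NULL=1 -> NULL
    if then_folded is True and else_folded is None:
        return ("=", col, when_val)  # whole filter simplifies to col = w
    return ("case", col, when_val, then_folded, else_folded)

rewritten = push_eq_into_case(
    ("=", ("case", "ss_sold_date", "1998-01-01", 1, None), 1))
# rewritten == ("=", "ss_sold_date", "1998-01-01")
```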
[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality
[ https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-10623:
--------------------------------
Attachment: HIVE-10623.patch

Hi [~xuefuz], could you help review this jira? Thank you!

Implement hive cli options using beeline functionality
------------------------------------------------------
Key: HIVE-10623
URL: https://issues.apache.org/jira/browse/HIVE-10623
Project: Hive
Issue Type: Sub-task
Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Attachments: HIVE-10623.patch

We need to support the original Hive CLI options for backwards compatibility.
[jira] [Commented] (HIVE-10625) Handle Authorization for 'select expr' hive queries in SQL Standard Authorization
[ https://issues.apache.org/jira/browse/HIVE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530251#comment-14530251 ]

Nemon Lou commented on HIVE-10625:
----------------------------------

And the error log in Hive Server:
{code}
2015-05-06 17:09:51,935 | ERROR | HiveServer2-Handler-Pool: Thread-114 | FAILED: HiveAuthzPluginException Error getting object from metastore for Object [type=TABLE_OR_VIEW, name=_dummy_database._dummy_table]
org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: Error getting object from metastore for Object [type=TABLE_OR_VIEW, name=_dummy_database._dummy_table]
    at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.throwGetObjErr(SQLAuthorizationUtils.java:310)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:272)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizationValidator.checkPrivileges(SQLStdHiveAuthorizationValidator.java:141)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizationValidator.checkPrivileges(SQLStdHiveAuthorizationValidator.java:93)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.checkPrivileges(HiveAuthorizerImpl.java:85)
    at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:770)
    at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:565)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:467)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1106)
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:102)
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:202)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:379)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:366)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy18.executeStatementAsync(Unknown Source)
    at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:415)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: NoSuchObjectException(message:_dummy_database._dummy_table table not found)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:32085)
    at
{code}
[jira] [Updated] (HIVE-10625) Handle Authorization for 'select expr' hive queries in SQL Standard Authorization
[ https://issues.apache.org/jira/browse/HIVE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated HIVE-10625:
-----------------------------
Description:
Hive internally rewrites this 'select expression' query into 'select expression from _dummy_database._dummy_table', where the dummy db and table are temporary entities for the current query. SQL Standard Authorization needs to handle these special objects.

Typing select reverse(123); in beeline will produce this error:
{code}
Error: Error while compiling statement: FAILED: HiveAuthzPluginException Error getting object from metastore for Object [type=TABLE_OR_VIEW, name=_dummy_database._dummy_table] (state=42000,code=4)
{code}

Handle Authorization for 'select expr' hive queries in SQL Standard Authorization
---------------------------------------------------------------------------------
Key: HIVE-10625
URL: https://issues.apache.org/jira/browse/HIVE-10625
Project: Hive
Issue Type: Bug
Components: Authorization, SQLStandardAuthorization
Affects Versions: 1.1.0
Reporter: Nemon Lou

Hive internally rewrites this 'select expression' query into 'select expression from _dummy_database._dummy_table', where the dummy db and table are temporary entities for the current query. SQL Standard Authorization needs to handle these special objects.

Typing select reverse(123); in beeline will produce this error:
{code}
Error: Error while compiling statement: FAILED: HiveAuthzPluginException Error getting object from metastore for Object [type=TABLE_OR_VIEW, name=_dummy_database._dummy_table] (state=42000,code=4)
{code}
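One plausible fix direction, sketched with hypothetical helper names (not Hive's actual authorizer API): treat the query-internal dummy entities as requiring no privilege lookup at all, so they never reach the metastore.

```python
# Sketch: filter out query-internal dummy entities before the privilege
# lookup that currently fails with "table not found" in the metastore.
# objects_to_check is an illustrative helper, not Hive's real code.

DUMMY_DB, DUMMY_TABLE = "_dummy_database", "_dummy_table"

def objects_to_check(objs):
    """Drop the (db, table) pairs that are query-internal placeholders."""
    return [o for o in objs if o != (DUMMY_DB, DUMMY_TABLE)]

# A 'select expr' query references only the dummy table, so nothing is
# left to authorize against the metastore.
remaining = objects_to_check([("_dummy_database", "_dummy_table")])
# remaining == []
```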
[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend
[ https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530263#comment-14530263 ]

Sushanth Sowmyan commented on HIVE-9456:
----------------------------------------

Precommit tests skipped this, saying that attachment id 12729762 had already been tested. As I prepared to upload a .3.patch identical to the .2.patch, I got to thinking: there's no point in the precommit tests running on this patch, since it affects only mssql, which the precommit tests do not use. [~thejas], what do you think? Should we go ahead and submit this as-is?

Make Hive support unicode with MSSQL as Metastore backend
---------------------------------------------------------
Key: HIVE-9456
URL: https://issues.apache.org/jira/browse/HIVE-9456
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.14.0
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Attachments: HIVE-9456.1.patch, HIVE-9456.2.patch, HIVE-9456.branch-1.2.patch

There are significant issues when Hive uses MSSQL as the metastore backend to support unicode, since MSSQL handles the varchar and nvarchar datatypes differently. The Hive 0.14 metastore mssql DDL script used varchar as the datatype, which can't handle multi-byte/unicode characters, e.g., Chinese characters. This JIRA tracks the implementation of unicode support in that case.
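The change the issue implies, switching MSSQL metastore columns from varchar to nvarchar, can be sketched as a naive textual DDL rewrite. The function below is illustrative only, not the actual upgrade script shipped with the patch:

```python
import re

# Hypothetical sketch: rewrite VARCHAR(n) columns to NVARCHAR(n) so
# multi-byte names survive in an MSSQL-backed metastore. The DDL snippet
# is illustrative, not the real metastore schema.

def to_nvarchar(ddl):
    """Naively rewrite VARCHAR(n) -> NVARCHAR(n) in a DDL string."""
    # \b keeps already-converted NVARCHAR( columns untouched,
    # so the rewrite is idempotent.
    return re.sub(r"\bVARCHAR\(", "NVARCHAR(", ddl)

print(to_nvarchar('"TBL_NAME" VARCHAR(128) NULL'))
# -> "TBL_NAME" NVARCHAR(128) NULL
```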
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530169#comment-14530169 ]

Hive QA commented on HIVE-9582:
-------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730401/HIVE-9582.8.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8900 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3751/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3751/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3751/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12730401 - PreCommit-HIVE-TRUNK-Build

HCatalog should use IMetaStoreClient interface
----------------------------------------------
Key: HIVE-9582
URL: https://issues.apache.org/jira/browse/HIVE-9582
Project: Hive
Issue Type: Sub-task
Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Labels: hcatalog, metastore, rolling_upgrade
Fix For: 1.2.0
Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, HIVE-9582.8.patch, HIVE-9583.1.patch

Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: during a failure, the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, so failures are costly, especially when they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore Server.
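The motivation above, retrying calls through a client interface so transient failures recover, can be sketched generically. This is not RetryingMetaStoreClient's API, just a minimal retry wrapper in the same spirit:

```python
import time

# Minimal retry wrapper in the spirit of Hive's RetryingMetaStoreClient:
# retry a call a few times on a transient failure before giving up.

def with_retries(fn, attempts=3, delay=0.0):
    last = None
    for _ in range(attempts):
        try:
            return fn()
        except ConnectionError as e:  # stand-in for a transient metastore error
            last = e
            time.sleep(delay)
    raise last

# A fake client call that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("metastore down")
    return "ok"
```

With `attempts=3`, the two transient failures are absorbed and the caller sees only the successful result.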
[jira] [Updated] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-10484:
--------------------------------
Description:
With vectorization and Tez enabled, TPC-DS Q70 fails with
{code}
Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
    ... 49 more
{code}

Query:
{code:sql}
select s_state
from (select s_state as s_state,
             sum(ss_net_profit),
             rank() over (partition by s_state order by sum(ss_net_profit) desc) as ranking
      from store_sales, store, date_dim
      where d_month_seq between 1193 and 1193+11
        and date_dim.d_date_sk = store_sales.ss_sold_date_sk
        and store.s_store_sk = store_sales.ss_store_sk
      group by s_state) tmp1
where ranking = 5
{code}

Vectorization : RuntimeException Big Table Retained Mapping duplicate column
----------------------------------------------------------------------------
Key: HIVE-10484
URL: https://issues.apache.org/jira/browse/HIVE-10484
Project: Hive
Issue Type: Bug
Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
Fix For: 1.2.0
Attachments: HIVE-10484.01.patch

With vectorization and Tez enabled, TPC-DS Q70 fails with
{code}
Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
    at
{code}
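The invariant the exception reports, that no output column may be added twice to the ordered column map, can be sketched like this. The class below mirrors VectorColumnOrderedMap in spirit only; names and shapes are illustrative:

```python
# Sketch of the invariant behind the exception above: an ordered column
# map rejects a second add of the same output column. Illustrative only,
# not Hive's VectorColumnOrderedMap.

class OrderedColumnMap:
    def __init__(self):
        self.entries = {}

    def add(self, ordered_col, value_col, type_name):
        if ordered_col in self.entries:
            raise RuntimeError(
                f"duplicate column {ordered_col} in ordered column map "
                f"{self.entries} when adding value column {value_col}, "
                f"type {type_name}")
        self.entries[ordered_col] = (value_col, type_name)

m = OrderedColumnMap()
m.add(6, 6, "int")     # first add succeeds
# m.add(6, 6, "int")   # a second add of column 6 would raise RuntimeError
```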
[jira] [Commented] (HIVE-9644) CASE comparison operator rotation optimization
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530188#comment-14530188 ]

Gopal V commented on HIVE-9644:
-------------------------------

The patch does not seem to have the comparison operator tree rotation; perhaps we should leave this JIRA open and open another one to hold the CASE/WHEN folding?

CASE comparison operator rotation optimization
----------------------------------------------
Key: HIVE-9644
URL: https://issues.apache.org/jira/browse/HIVE-9644
Project: Hive
Issue Type: Bug
Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch

Constant folding doesn't kick in for some automatically generated query patterns, which look like this:

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1 else null end)=1;
{code}

This should get rewritten by pushing the equality into the case branches:

{code}
select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1=1 else null=1 end);
{code}

ending up with a simplified filter condition, resolving itself as

{code}
select count(1) from store_sales where ss_sold_date = '1998-01-01';
{code}
[jira] [Commented] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP
[ https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530238#comment-14530238 ]

Matt McCline commented on HIVE-10308:
-------------------------------------

I ended up fixing this problem with one of my other changes because it caused some Q file failures.

Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP
------------------------------------------------------------------------------------------------
Key: HIVE-10308
URL: https://issues.apache.org/jira/browse/HIVE-10308
Project: Hive
Issue Type: Bug
Components: Vectorization
Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0
Reporter: Selina Zhang
Assignee: Matt McCline
Attachments: HIVE-10308.1.patch

Steps to reproduce:
{code}
CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC;
INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, 'one', 2, 'two') FROM src LIMIT 1;
CREATE TABLE test (key INT);
INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1;
set hive.vectorized.execution.enabled=true;
set hive.auto.convert.join=false;
select l.key from test l left outer join test_orc r on (l.key = r.a) where r.a is not null;
{code}

Stack trace:
{code}
Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191)
    at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198)
{code}
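A guard in the spirit of the eventual fix would check the row schema for unsupported complex types before enabling vectorization, falling back to row mode instead of throwing. This is a hypothetical sketch, not Hive's actual validation logic:

```python
# Hypothetical sketch: validate column types before vectorizing a plan,
# so an unsupported complex type (like MAP) disables vectorization
# instead of failing at operator initialization.

SUPPORTED = {"tinyint", "smallint", "int", "bigint",
             "float", "double", "string", "boolean", "timestamp"}

def can_vectorize(column_types):
    """True only if every column type is vectorization-friendly."""
    return all(t.lower() in SUPPORTED for t in column_types)

# The repro's schema contains a MAP column, so the plan should fall back.
print(can_vectorize(["int", "map<int,string>"]))  # -> False
```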
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530266#comment-14530266 ]

Sushanth Sowmyan commented on HIVE-9582:
----------------------------------------

Committed to master and branch-1.2. Thanks, Thiruvel and Thejas!

HCatalog should use IMetaStoreClient interface
----------------------------------------------
Key: HIVE-9582
URL: https://issues.apache.org/jira/browse/HIVE-9582
Project: Hive
Issue Type: Sub-task
Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Labels: hcatalog, metastore, rolling_upgrade
Fix For: 1.2.0
Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, HIVE-9582.8.patch, HIVE-9583.1.patch

Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: during a failure, the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, so failures are costly, especially when they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore Server.
[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge
[ https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530279#comment-14530279 ]

Sushanth Sowmyan commented on HIVE-9845:
----------------------------------------

Note: the precommit link, when it runs, will be at http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3761

HCatSplit repeats information making input split data size huge
---------------------------------------------------------------
Key: HIVE-9845
URL: https://issues.apache.org/jira/browse/HIVE-9845
Project: Hive
Issue Type: Bug
Components: HCatalog
Reporter: Rohini Palaniswamy
Assignee: Mithun Radhakrishnan
Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, HIVE-9845.5.patch

Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data with even triple the number of splits (100K+ splits and tasks) does not hit that issue.

{code}
HCatBaseInputFormat.java:

// Call getSplits on the InputFormat, create an
// HCatSplit for each underlying split.
// numSplits is 0 for our purposes.
org.apache.hadoop.mapred.InputSplit[] baseSplits = inputFormat.getSplits(jobConf, 0);

for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
  splits.add(new HCatSplit(partitionInfo, split, allCols));
}
{code}

Each HCatSplit duplicates the partition schema and table schema.
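The fix direction, holding one shared schema instead of serializing a copy into every split, can be sketched as follows. The class shapes are illustrative, not HCatalog's real classes:

```python
# Sketch of deduplicating per-split schema data: every split references
# one shared schema object rather than carrying its own copy, so the
# serialized split payload stays small. Illustrative classes only.

class TableSchemaRef:
    """Single shared schema; splits hold a reference, not a copy."""
    def __init__(self, columns):
        self.columns = columns

def make_splits(base_splits, shared_schema):
    # every (split, schema) pair points at the same schema instance
    return [(s, shared_schema) for s in base_splits]

schema = TableSchemaRef(["a", "b"])
splits = make_splits(["split-0", "split-1", "split-2"], schema)
# all splits share one schema object instead of three copies
```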
[jira] [Commented] (HIVE-10597) Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'
[ https://issues.apache.org/jira/browse/HIVE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530294#comment-14530294 ]

Hive QA commented on HIVE-10597:
--------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730515/HIVE-10597.02.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8902 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3752/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3752/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3752/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12730515 - PreCommit-HIVE-TRUNK-Build Relative path doesn't work with CREATE TABLE LOCATION 'relative/path' - Key: HIVE-10597 URL: https://issues.apache.org/jira/browse/HIVE-10597 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Reuben Kuhnert Assignee: Reuben Kuhnert Priority: Minor Attachments: HIVE-10597.01.patch, HIVE-10597.02.patch {code} 0: jdbc:hive2://a2110.halxg.cloudera.com:1000 CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3 like mydb.employees LOCATION 'data/stock'; Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.NullPointerException) (state=08S01,code=1) 0: jdbc:hive2://a2110.halxg.cloudera.com:1000 CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3 like mydb.employees LOCATION '/user/hive/data/stock'; No rows affected (0.369 seconds) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
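For reference, the behaviour such a fix needs (qualifying a relative LOCATION against a base directory instead of passing it through unresolved) can be sketched with java.nio standing in for Hadoop's Path API. This is illustrative only, not Hive's actual DDLTask code, and the warehouse path is a made-up example:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: resolve a user-supplied table LOCATION against a base
// directory so that 'data/stock' becomes a usable absolute path instead of
// triggering an NPE downstream. java.nio.file stands in for
// org.apache.hadoop.fs.Path here.
public class QualifyLocation {
    // Resolve 'location' against 'base' when it is not already absolute.
    static Path qualify(Path base, String location) {
        Path p = Paths.get(location);
        return p.isAbsolute() ? p : base.resolve(p).normalize();
    }

    public static void main(String[] args) {
        Path warehouse = Paths.get("/user/hive/warehouse");
        System.out.println(qualify(warehouse, "data/stock"));            // relative: qualified
        System.out.println(qualify(warehouse, "/user/hive/data/stock")); // absolute: unchanged
    }
}
```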
[jira] [Commented] (HIVE-10539) set default value of hive.repl.task.factory
[ https://issues.apache.org/jira/browse/HIVE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530264#comment-14530264 ] Sushanth Sowmyan commented on HIVE-10539: - None of the precommit test failures here are related. Committing. set default value of hive.repl.task.factory --- Key: HIVE-10539 URL: https://issues.apache.org/jira/browse/HIVE-10539 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-10539.1.patch, HIVE-10539.2.patch, HIVE-10539.3.patch hive.repl.task.factory does not have a default value set. It should be set to org.apache.hive.hcatalog.api.repl.exim.EximReplicationTaskFactory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
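The defaulting behaviour the patch adds can be sketched as follows. The property name and default class name come from the issue text; the lookup code is a plain java.util.Properties stand-in for HiveConf, not Hive's actual configuration machinery:

```java
import java.util.Properties;

// Minimal sketch of the defaulting behaviour HIVE-10539 asks for: when
// hive.repl.task.factory is unset, fall back to the EXIM factory class name.
public class ReplFactoryDefault {
    static final String KEY = "hive.repl.task.factory";
    static final String DEFAULT =
        "org.apache.hive.hcatalog.api.repl.exim.EximReplicationTaskFactory";

    // An explicitly configured value wins; otherwise the default is returned.
    static String factoryClass(Properties conf) {
        return conf.getProperty(KEY, DEFAULT);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(factoryClass(conf)); // unset -> default
        conf.setProperty(KEY, "com.example.MyFactory");
        System.out.println(factoryClass(conf)); // explicit value wins
    }
}
```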
[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530198#comment-14530198 ] Matt McCline commented on HIVE-10484: - Found the problem -- relatively simple fix. Vectorization : RuntimeException Big Table Retained Mapping duplicate column -- Key: HIVE-10484 URL: https://issues.apache.org/jira/browse/HIVE-10484 Project: Hive Issue Type: Bug Components: Tez, Vectorization Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Fix For: 1.2.0 Attachments: HIVE-10484.01.patch With vectorization and tez enabled TPC-DS Q70 fails with {code} Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79) ... 
49 more {code} Query {code:sql} select s_state from (select s_state as s_state, sum(ss_net_profit), rank() over ( partition by s_state order by sum(ss_net_profit) desc) as ranking from store_sales, store, date_dim where d_month_seq between 1193 and 1193+11 and date_dim.d_date_sk = store_sales.ss_sold_date_sk and store.s_store_sk = store_sales.ss_store_sk group by s_state ) tmp1 where ranking = 5 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
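The duplicate check that raises the exception above can be illustrated with a toy ordered map. This mirrors only the error message, not the actual VectorColumnOrderedMap logic:

```java
import java.util.TreeMap;

// Toy reconstruction of the duplicate-column check behind the stack trace:
// an ordered column->type map that rejects adding the same column twice.
public class OrderedColumnMap {
    private final TreeMap<Integer, String> map = new TreeMap<>();

    void add(int column, String typeName) {
        if (map.containsKey(column)) {
            throw new RuntimeException("duplicate column " + column
                + " in ordered column map " + map);
        }
        map.put(column, typeName);
    }

    public static void main(String[] args) {
        OrderedColumnMap m = new OrderedColumnMap();
        m.add(6, "int");
        m.add(21, "float");
        m.add(22, "int");
        try {
            m.add(6, "int"); // same failure mode as the TPC-DS Q70 plan
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // prints the duplicate-column message
        }
    }
}
```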
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530234#comment-14530234 ] Sushanth Sowmyan commented on HIVE-9582: The test failures don't seem related to this patch. Will go ahead and commit. HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Fix For: 1.2.0 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, HIVE-9582.8.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. Hence, during a failure, the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, so failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
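The value of coding against the interface can be sketched with a generic retry proxy. The MetaClient interface below is a hypothetical stand-in for IMetaStoreClient, and Hive's real RetryingMetaStoreClient is only analogous to this, not implemented this way:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Sketch: because callers depend on an interface, a dynamic proxy can wrap
// any implementation with retry-on-failure. Code that depends on the concrete
// client class cannot be wrapped this way.
public class RetryProxyDemo {
    interface MetaClient { String getTable(String name); }

    static int attempts = 0;

    // A client that fails twice, then succeeds (simulates transient errors).
    static MetaClient flaky() {
        return name -> {
            if (++attempts < 3) throw new RuntimeException("connection reset");
            return "table:" + name;
        };
    }

    @SuppressWarnings("unchecked")
    static <T> T withRetries(Class<T> iface, T delegate, int maxTries) {
        InvocationHandler h = (proxy, method, args) -> {
            RuntimeException last = null;
            for (int i = 0; i < maxTries; i++) {
                try {
                    return method.invoke(delegate, args);
                } catch (Exception e) {
                    last = new RuntimeException(e.getCause());
                }
            }
            throw last;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[]{iface}, h);
    }

    public static void main(String[] args) {
        MetaClient client = withRetries(MetaClient.class, flaky(), 5);
        System.out.println(client.getTable("employees")); // succeeds on the 3rd try
    }
}
```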
[jira] [Commented] (HIVE-10592) ORC file dump in JSON format
[ https://issues.apache.org/jira/browse/HIVE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529991#comment-14529991 ] Hive QA commented on HIVE-10592: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730381/HIVE-10592.3.patch {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8901 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition 
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3747/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3747/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3747/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 24 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12730381 - PreCommit-HIVE-TRUNK-Build ORC file dump in JSON format Key: HIVE-10592 URL: https://issues.apache.org/jira/browse/HIVE-10592 Project: Hive Issue Type: New Feature Affects Versions: 1.3.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-10592.1.patch, HIVE-10592.2.patch, HIVE-10592.3.patch, HIVE-10592.4.patch ORC file dump uses a custom format. It will be useful to dump ORC metadata in JSON format so that other tools can be built on top of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
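What a JSON-format metadata dump might look like can be sketched without any JSON library. The field names below are invented for illustration and do not reflect the patch's actual output schema:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: building a JSON object from metadata fields with plain
// string concatenation. Numbers are emitted bare, everything else quoted.
public class JsonDumpSketch {
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(e.getKey()).append("\":");
            Object v = e.getValue();
            sb.append(v instanceof Number ? v.toString() : "\"" + v + "\"");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("fileName", "part-00000.orc");   // hypothetical field names
        meta.put("compression", "ZLIB");
        meta.put("numberOfRows", 8901);
        System.out.println(toJson(meta));
        // {"fileName":"part-00000.orc","compression":"ZLIB","numberOfRows":8901}
    }
}
```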
[jira] [Updated] (HIVE-10568) Select count(distinct()) can have more optimal execution plan
[ https://issues.apache.org/jira/browse/HIVE-10568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10568: Attachment: HIVE-10568.1.patch Rebased on HIVE-10607. Patch is ready for review. [~jpullokkaran] can you take a look? Select count(distinct()) can have more optimal execution plan - Key: HIVE-10568 URL: https://issues.apache.org/jira/browse/HIVE-10568 Project: Hive Issue Type: Improvement Components: CBO, Logical Optimizer Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.1.0 Reporter: Mostafa Mokhtar Assignee: Ashutosh Chauhan Attachments: HIVE-10568.1.patch, HIVE-10568.patch, HIVE-10568.patch {code:sql} select count(distinct ss_ticket_number) from store_sales; {code} can be rewritten as {code:sql} select count(1) from (select distinct ss_ticket_number from store_sales) a; {code} which may run up to 3x faster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
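The equivalence behind the rewrite can be checked on in-memory data. The speedup itself depends on Hive's execution plan (the inner DISTINCT can be computed in parallel before a single final count rather than funnelling all rows to one reducer), which this sketch does not model:

```java
import java.util.Arrays;
import java.util.List;

// Both query shapes compute the same number, shown on a small column of data.
public class CountDistinctRewrite {
    // select count(distinct ss_ticket_number) from store_sales
    static long direct(List<Integer> col) {
        return col.stream().distinct().count();
    }

    // select count(1) from (select distinct ss_ticket_number from store_sales) a
    static long rewritten(List<Integer> col) {
        return col.stream().distinct()   // the inner distinct subquery
                  .map(x -> 1)           // count(1): the projected value is irrelevant
                  .count();
    }

    public static void main(String[] args) {
        List<Integer> ssTicketNumber = Arrays.asList(1, 1, 2, 3, 3, 3, 4);
        System.out.println(direct(ssTicketNumber));    // 4
        System.out.println(rewritten(ssTicketNumber)); // 4
    }
}
```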
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530084#comment-14530084 ] Hive QA commented on HIVE-9743: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730688/HIVE-9743.09.patch {color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 8905 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin_orig org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase 
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3750/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3750/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3750/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 27 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12730688 - PreCommit-HIVE-TRUNK-Build Incorrect result set for vectorized left outer join --- Key: HIVE-9743 URL: https://issues.apache.org/jira/browse/HIVE-9743 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: N Campbell Assignee: Matt McCline Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch This query is supposed to return 3 rows and will when run without Tez but returns 2 rows when run with Tez. select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2 > 15 ) tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 1 20 25 null 2 null 50 null instead of tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 0 10 15 null 1
[jira] [Assigned] (HIVE-10515) Create tests to cover existing (supported) Hive CLI functionality
[ https://issues.apache.org/jira/browse/HIVE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-10515: --- Assignee: Ferdinand Xu Create tests to cover existing (supported) Hive CLI functionality - Key: HIVE-10515 URL: https://issues.apache.org/jira/browse/HIVE-10515 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu After removing HiveServer1, Hive CLI's functionality is reduced to its original use case, a thick-client application. Let's identify this functionality so that we can maintain it when the implementation is changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9743: --- Attachment: HIVE-9743.091.patch Incorrect result set for vectorized left outer join --- Key: HIVE-9743 URL: https://issues.apache.org/jira/browse/HIVE-9743 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: N Campbell Assignee: Matt McCline Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, HIVE-9743.091.patch This query is supposed to return 3 rows and will when run without Tez but returns 2 rows when run with Tez. select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2 > 15 ) tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 1 20 25 null 2 null 50 null instead of tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 0 10 15 null 1 20 25 null 2 null 50 null create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) STORED AS orc ; 0|10|15 1|20|25 2|\N|50 create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE ; 0|10|BB 1|15|DD 2|\N|EE 3|10|FF -- This message was sent by Atlassian JIRA (v6.3.4#6332)
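Reference semantics for the query, run on the two tiny tables above, can be sketched as follows. The ON-clause comparison lost its operator in formatting; `tjoin1.c2 > 15` is assumed here, and with this data `>`, `<`, or `!=` all give the same all-null result. The key point is that a left outer join keeps every left row; a non-equi ON predicate only decides whether a match contributes, never whether the left row survives:

```java
import java.util.ArrayList;
import java.util.List;

// Nested-loop left outer join over the TJOIN1/TJOIN2 sample data.
public class LeftOuterJoinDemo {
    static final Integer[][] TJOIN1 = { {0, 10, 15}, {1, 20, 25}, {2, null, 50} };
    static final Object[][]  TJOIN2 = { {0, 10, "BB"}, {1, 15, "DD"}, {2, null, "EE"}, {3, 10, "FF"} };

    static List<String> join() {
        List<String> out = new ArrayList<>();
        for (Integer[] l : TJOIN1) {                      // columns: rnum, c1, c2
            boolean matched = false;
            for (Object[] r : TJOIN2) {
                // ON tjoin1.c1 = tjoin2.c1 AND tjoin1.c2 > 15 (operator assumed)
                if (l[1] != null && l[1].equals(r[1]) && l[2] > 15) {
                    out.add(l[0] + "|" + l[1] + "|" + l[2] + "|" + r[2]);
                    matched = true;
                }
            }
            // Left rows with no qualifying match still appear, null-extended.
            if (!matched) out.add(l[0] + "|" + l[1] + "|" + l[2] + "|null");
        }
        return out;
    }

    public static void main(String[] args) {
        // Expected: 0|10|15|null, 1|20|25|null, 2|null|50|null (3 rows)
        join().forEach(System.out::println);
    }
}
```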
[jira] [Updated] (HIVE-10548) Remove dependency to s3 repository in root pom
[ https://issues.apache.org/jira/browse/HIVE-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-10548: - Attachment: HIVE-10548.2.patch Remove dependency to s3 repository in root pom -- Key: HIVE-10548 URL: https://issues.apache.org/jira/browse/HIVE-10548 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10548.2.patch, HIVE-10548.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-10484: Attachment: HIVE-10484.01.patch Vectorization : RuntimeException Big Table Retained Mapping duplicate column -- Key: HIVE-10484 URL: https://issues.apache.org/jira/browse/HIVE-10484 Project: Hive Issue Type: Bug Components: Tez, Vectorization Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Fix For: 1.2.0 Attachments: HIVE-10484.01.patch With vectorization and tez enabled TPC-DS Q70 fails with {code} Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79) ... 
49 more {code} Query {code} select s_state from (select s_state as s_state, sum(ss_net_profit), rank() over ( partition by s_state order by sum(ss_net_profit) desc) as ranking from store_sales, store, date_dim where d_month_seq between 1193 and 1193+11 and date_dim.d_date_sk = store_sales.ss_sold_date_sk and store.s_store_sk = store_sales.ss_store_sk group by s_state ) tmp1 where ranking = 5 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10435) Make HiveSession implementation pluggable through configuration
[ https://issues.apache.org/jira/browse/HIVE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akshay Goyal updated HIVE-10435: Attachment: HIVE-10435.1.patch Make HiveSession implementation pluggable through configuration --- Key: HIVE-10435 URL: https://issues.apache.org/jira/browse/HIVE-10435 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Akshay Goyal Attachments: HIVE-10435.1.patch SessionManager in CLIService creates and keeps track of HiveSession. Right now, it creates HiveSessionImpl, which is one implementation of HiveSession. This improvement request is to make it pluggable through a configuration so that other implementations can be plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
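The pluggability pattern being requested (read an implementation class name from configuration and instantiate it reflectively) can be sketched as follows. The Session interface and factory below are simplified stand-ins, not HiveServer2's actual types:

```java
// Sketch of configuration-driven pluggability via reflection.
public class PluggableSessionDemo {
    interface Session { String open(String user); }

    // One implementation; others could be supplied by name, as long as they
    // implement Session and have a public no-arg constructor.
    public static class DefaultSession implements Session {
        public String open(String user) { return "default session for " + user; }
    }

    static Session create(String className) throws Exception {
        return (Session) Class.forName(className)
                              .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In HiveServer2 this class name would come from a HiveConf property.
        Session s = create(DefaultSession.class.getName());
        System.out.println(s.open("hive")); // default session for hive
    }
}
```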
[jira] [Updated] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-10453: Attachment: HIVE-10453.2.patch HS2 leaking open file descriptors when using UDFs - Key: HIVE-10453 URL: https://issues.apache.org/jira/browse/HIVE-10453 Project: Hive Issue Type: Bug Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch 1. Create a custom function with CREATE FUNCTION myfunc AS 'someudfclass' USING JAR 'hdfs:///tmp/myudf.jar'; 2. Create a simple JDBC client that just connects and runs a simple query using the function, such as: select myfunc(col1) from sometable 3. Disconnect. Check open files for HiveServer2 with: lsof -p HSProcID | grep myudf.jar You will see the leak as: {noformat} java 28718 ychen txt REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar java 28718 ychen 330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
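The leak pattern in miniature: a URLClassLoader keeps its jars open until close() is called. Whether the patch fixes HS2 in exactly this way is not shown by the issue text; this sketch only demonstrates that closing the loader is what releases the descriptor:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Session-scoped UDF jars are typically served through a URLClassLoader.
// If the loader is never closed when the session disconnects, the jar's
// file descriptor stays open and shows up in lsof, as in the repro above.
public class UdfJarLoaderDemo {
    public static void main(String[] args) throws Exception {
        URL jar = new URL("file:///tmp/myudf.jar"); // path from the repro steps
        URLClassLoader loader = new URLClassLoader(new URL[]{ jar });
        try {
            // ... resolve and run UDF classes through 'loader' ...
        } finally {
            loader.close(); // without this, lsof keeps showing myudf.jar
        }
        System.out.println("loader closed");
    }
}
```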
[jira] [Updated] (HIVE-10626) Spark plan needs to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-10626: Attachment: HIVE-10626-spark.patch The patch contains the diff between the basic patch and the latest patch. Spark plan needs to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch The basic patch from [HIVE-8858] was committed; the latest patch still needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10435) Make HiveSession implementation pluggable through configuration
[ https://issues.apache.org/jira/browse/HIVE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530621#comment-14530621 ] Hive QA commented on HIVE-10435: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730823/HIVE-10435.1.patch {color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 8901 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition 
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend org.apache.hive.hcatalog.streaming.TestStreaming.testAddPartition org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3779/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3779/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3779/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 28 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12730823 - PreCommit-HIVE-TRUNK-Build Make HiveSession implementation pluggable through configuration --- Key: HIVE-10435 URL: https://issues.apache.org/jira/browse/HIVE-10435 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Akshay Goyal Attachments: HIVE-10435.1.patch SessionManager in CLIService creates and keeps track of HiveSession. Right now, it creates HiveSessionImpl, which is one implementation of HiveSession. This improvement request is to make it pluggable through a configuration so that other implementations can be plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP
[ https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10308: Description: Steps to reproduce: {code:sql} CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC; INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src LIMIT 1; CREATE TABLE test(key INT) ; INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1; set hive.vectorized.execution.enabled=true; set hive.auto.convert.join=false; select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a is not null; {code} Stack trace: {noformat} Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198) {noformat} was: Steps to reproduce: CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC; INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src LIMIT 1; CREATE TABLE test(key INT) ; INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1; set hive.vectorized.execution.enabled=true; set hive.auto.convert.join=false; select l.key from test l left outer join test_orc r on (l.key=
r.a) where r.a is not null; Stack trace: Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP Key: HIVE-10308 URL: https://issues.apache.org/jira/browse/HIVE-10308 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0 Reporter: Selina Zhang Assignee: Matt McCline Attachments: HIVE-10308.1.patch Steps to reproduce: {code:sql} CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC; INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src LIMIT 1; CREATE TABLE test(key INT) ; INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1; set hive.vectorized.execution.enabled=true; set hive.auto.convert.join=false; select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a is not null; {code} Stack trace: {noformat} Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at
[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531019#comment-14531019 ] Matt McCline commented on HIVE-10484: - I just tried creating a smaller repro and did not succeed. I'll try taking the monster query and making a Q file... Vectorization : RuntimeException Big Table Retained Mapping duplicate column -- Key: HIVE-10484 URL: https://issues.apache.org/jira/browse/HIVE-10484 Project: Hive Issue Type: Bug Components: Tez, Vectorization Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Fix For: 1.2.0 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch With vectorization and tez enabled TPC-DS Q70 fails with {code} Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79) ... 
49 more {code} Query {code:sql} select s_state from (select s_state as s_state, sum(ss_net_profit), rank() over ( partition by s_state order by sum(ss_net_profit) desc) as ranking from store_sales, store, date_dim where d_month_seq between 1193 and 1193+11 and date_dim.d_date_sk = store_sales.ss_sold_date_sk and store.s_store_sk = store_sales.ss_store_sk group by s_state ) tmp1 where ranking <= 5 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
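The failure above is a duplicate-key guard firing while the vectorized mapjoin sets up its output-column mapping. As a rough illustration (a simplified stand-in, not Hive's actual `VectorColumnOrderedMap`), the invariant being enforced can be sketched as:

```java
import java.util.TreeMap;

// Simplified stand-in for VectorColumnOrderedMap: an ordered map of output
// column indices that rejects duplicates, mirroring the guard that throws
// the RuntimeException in the stack trace above.
public class OrderedColumnMap {
    private final TreeMap<Integer, String> map = new TreeMap<>();

    public void add(int column, String typeName) {
        if (map.containsKey(column)) {
            throw new RuntimeException("duplicate column " + column
                + " in ordered column map " + map);
        }
        map.put(column, typeName);
    }

    public String get(int column) {
        return map.get(column);
    }
}
```

Adding column 6 twice, as in the error message, trips the guard; the bug is that the big-table retained mapping tries to register the same column twice.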
[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531523#comment-14531523 ] Sushanth Sowmyan commented on HIVE-8696: Also, on a related note, this section is way too brittle: it will likely break with the next exception-chaining change, and it is very difficult to maintain from a readability perspective. {code} - assertTrue(e.getCause().getMessage().contains( + assertTrue(((InvocationTargetException) e.getCause().getCause().getCause()).getTargetException().getMessage().contains( "Could not connect to meta store using any of the URIs provided")); {code} I'm not going to ask to change that now, given that this patch itself is very important, and a blocker, but could you please file a follow-up jira to clean up this testcase. :) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. - Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.12.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Thiruvel Thirumoolan Fix For: 1.2.0 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, HIVE-8696.4.patch, HIVE-8696.poc.patch The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API that log in through keytabs will fail without retry, when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
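One less brittle alternative (a sketch, not the committed fix) is to walk the whole cause chain looking for the expected message instead of hard-coding the exact nesting depth of `getCause()` calls:

```java
// Sketch: match a message anywhere in a throwable's cause chain, so the test
// survives changes in how deeply the exception is wrapped. Note that
// InvocationTargetException exposes its target via getCause(), so a plain
// walk covers that case too.
public class CauseChainMatcher {
    /** Returns true if any throwable in the cause chain carries the fragment. */
    public static boolean chainContains(Throwable t, String fragment) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

The assertion then becomes `assertTrue(chainContains(e, "Could not connect to meta store using any of the URIs provided"))`, independent of wrapping depth.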
[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531598#comment-14531598 ] Thiruvel Thirumoolan commented on HIVE-8696: Agree. Raised HIVE-10637 to fix it.
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531628#comment-14531628 ] Vikram Dixit K commented on HIVE-9743: -- +1 Incorrect result set for vectorized left outer join --- Key: HIVE-9743 URL: https://issues.apache.org/jira/browse/HIVE-9743 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: N Campbell Assignee: Matt McCline Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, HIVE-9743.091.patch This query is supposed to return 3 rows and will when run without Tez but returns 2 rows when run with Tez. select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2 15 ) tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 1 20 25 null 2 null 50 null instead of tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2 0 10 15 null 1 20 25 null 2 null 50 null create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) STORED AS orc ; 0|10|15 1|20|25 2|\N|50 create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE ; 0|10|BB 1|15|DD 2|\N|EE 3|10|FF -- This message was sent by Atlassian JIRA (v6.3.4#6332)
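The expected result can be checked with a toy left outer join over the sample rows (a sketch only; the comparison operator in the ON clause is garbled in the report, so `tjoin1.c2 < 15` is assumed here — for this data, no row satisfies the condition, so every left row is null-extended either way):

```java
import java.util.ArrayList;
import java.util.List;

// Toy left outer join over the sample tjoin1/tjoin2 rows from the report.
// Assumes the garbled ON-clause comparison is "tjoin1.c2 < 15" (hypothetical).
public class LeftOuterJoinDemo {
    // tjoin1 rows: (rnum, c1, c2); nulls as in the data files above.
    static final Integer[][] TJOIN1 = {{0, 10, 15}, {1, 20, 25}, {2, null, 50}};
    // tjoin2 c1 values from its data file.
    static final Integer[] TJOIN2_C1 = {10, 15, null, 10};

    public static List<String> join() {
        List<String> out = new ArrayList<>();
        for (Integer[] l : TJOIN1) {
            boolean matched = false;
            for (Integer rc1 : TJOIN2_C1) {
                // ON: l.c1 = r.c1 AND l.c2 < 15 (SQL semantics: null never matches)
                if (l[1] != null && rc1 != null && l[1].equals(rc1)
                        && l[2] != null && l[2] < 15) {
                    matched = true;
                    out.add(l[0] + "," + l[1] + "," + l[2] + ",match");
                }
            }
            if (!matched) {
                out.add(l[0] + "," + l[1] + "," + l[2] + ",null"); // null-extended
            }
        }
        return out;
    }
}
```

This yields the three null-extended rows the report expects; the vectorized Tez path was dropping the `(0, 10, 15)` row.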
[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531646#comment-14531646 ] Szehon Ho commented on HIVE-10453: -- OK, as long as its not a typo and means something :) No problem. HS2 leaking open file descriptors when using UDFs - Key: HIVE-10453 URL: https://issues.apache.org/jira/browse/HIVE-10453 Project: Hive Issue Type: Bug Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch 1. create a custom function by CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar'; 2. Create a simple jdbc client, just do connect, run simple query which using the function such as: select myfunc(col1) from sometable 3. Disconnect. Check open file for HiveServer2 by: lsof -p HSProcID | grep myudf.jar You will see the leak as: {noformat} java 28718 ychen txt REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar java 28718 ychen 330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9644) Fold case when udfs
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531543#comment-14531543 ] Ashutosh Chauhan commented on HIVE-9644: Created HIVE-10636 as follow-up Fold case when udfs - Key: HIVE-9644 URL: https://issues.apache.org/jira/browse/HIVE-9644 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Gopal V Assignee: Ashutosh Chauhan Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.3.patch, HIVE-9644.patch Constant folding for queries don't kick in for some automatically generated query patterns which look like this. {code} hive explain select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1 else null end)=1; {code} This should get rewritten by pushing the equality into the case branches. {code} select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1=1 else null=1 end); {code} Ending up with a simplified filter condition, resolving itself as {code} select count(1) from store_sales where ss_sold_date= '1998-01-01' ; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531607#comment-14531607 ] Sushanth Sowmyan commented on HIVE-8696: Awesome, thanks!
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531621#comment-14531621 ] Hive QA commented on HIVE-9743: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730767/HIVE-9743.091.patch {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8904 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition 
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3785/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3785/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3785/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 24 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12730767 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531638#comment-14531638 ] Matt McCline commented on HIVE-9743: None of the test failures are related to my changes.
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531735#comment-14531735 ] Vikram Dixit K commented on HIVE-9743: -- Thanks Matt and Jason. (Fix For: 1.2.0)
[jira] [Commented] (HIVE-10628) Incorrect result when vectorized native mapjoin is enabled
[ https://issues.apache.org/jira/browse/HIVE-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531739#comment-14531739 ] Sushanth Sowmyan commented on HIVE-10628: - After discussion with Gunther, marking this for 1.2.1 instead of 1.2.0, since vectorization is optional by default, and so, there is a workaround. If this makes it before we're done with the RC process, we will include it in 1.2.0 itself, but we will not consider it a blocker for the 1.2.0 release. Incorrect result when vectorized native mapjoin is enabled -- Key: HIVE-10628 URL: https://issues.apache.org/jira/browse/HIVE-10628 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 1.2.0, 1.3.0 Incorrect results for this query: {noformat} select count(*) from store_sales ss join store_returns sr on (sr.sr_item_sk = ss.ss_item_sk and sr.sr_customer_sk = ss.ss_customer_sk and sr.sr_item_sk = ss.ss_item_sk) where ss.ss_net_paid 1000; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9644) CASE comparison operator rotation optimization
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9644: --- Attachment: HIVE-9644.3.patch Updated patch.
[jira] [Updated] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize
[ https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-10559: - Attachment: HIVE-10559.03.patch Upload 3rd patch for testing. [~hagleitn] Can you take another look? IndexOutOfBoundsException with RemoveDynamicPruningBySize - Key: HIVE-10559 URL: https://issues.apache.org/jira/browse/HIVE-10559 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0, 1.3.0 Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, HIVE-10559.03.patch, q85.q The problem can be reproduced by running the script attached. Backtrace {code} 2015-04-29 10:34:36,390 ERROR [main]: ql.Driver (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110) at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000) at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139) at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at 
junit.framework.TestSuite.run(TestSuite.java:250) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) ... {code}
[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-8696: --- Attachment: HIVE-8696.5.patch Sorry, updating patch 5 which should be complete. I was going through many iterations to test it locally, looks like I missed the complete patch.
[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531637#comment-14531637 ] Yongzhi Chen commented on HIVE-10453: - Thanks [~szehon] for reviewing it. The CUDFLoader means the classloader related to loading UDF jars; the C means class. I am not good with names.
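For context on the leak mechanism (a minimal sketch, not the patch itself): a `URLClassLoader` keeps the backing jar file open until it is closed, so a session-scoped loader for UDF jars that is never closed leaks one descriptor per jar, as the lsof output above shows. The method name below is illustrative, not a Hive API.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of the cleanup a session close needs: URLClassLoader.close()
// (Java 7+) releases the JarFile handles the loader holds, freeing the
// file descriptors seen in lsof. "releaseUdfJars" is a hypothetical name.
public class UdfLoaderCleanup {
    public static URLClassLoader newLoader(URL... jars) {
        return new URLClassLoader(jars);
    }

    /** Closes the loader; returns false if the close failed. */
    public static boolean releaseUdfJars(URLClassLoader udfLoader) {
        try {
            udfLoader.close();
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```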
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531704#comment-14531704 ] Sergio Peña commented on HIVE-9736: --- Hi [~mithun] This patch is causing the above tests to fail due to the change on {{Hadoop23Shims.checkFileAccess(FileSystem fs, Iterator<FileStatus> statuses, EnumSet<FsAction> actions)}}. The line that fails is {{accessMethod.invoke(fs, statuses.next(), combine(actions));}} I am running hadoop 2.6.0, and the FileSystem.access() method accepts a Path and an FsAction. When I run the code that checks path permissions, I get this error: {noformat} hive> explain select * from a join b on a.id = b.id; FAILED: SemanticException Unable to determine if hdfs://localhost:9000/user/hive/warehouse/a is read only: java.lang.IllegalArgumentException: argument type mismatch {noformat} Is there a follow-up jira for this error? StorageBasedAuthProvider should batch namenode-calls where possible. Key: HIVE-9736 URL: https://issues.apache.org/jira/browse/HIVE-9736 Project: Hive Issue Type: Bug Components: Metastore, Security Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Fix For: 1.2.0 Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch Consider a table partitioned by 2 keys (dt, region). Say a dt partition could have 1 associated regions. Consider that the user does: {code:sql} ALTER TABLE my_table DROP PARTITION (dt='20150101'); {code} As things stand now, {{StorageBasedAuthProvider}} will make individual {{DistributedFileSystem.listStatus()}} calls for each partition-directory, and authorize each one separately. It'd be faster to batch the calls, and examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
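The "argument type mismatch" above is reflection's generic failure mode: `Method.invoke` was handed a `FileStatus` where the resolved `FileSystem.access(Path, FsAction)` signature expects a `Path`. A minimal illustration with toy stand-in types (hypothetical names, not Hadoop's classes):

```java
import java.lang.reflect.Method;

// Toy reproduction of reflection's "argument type mismatch": Fs.access
// stands in for a method resolved via reflection (like FileSystem.access);
// passing an argument of the wrong runtime type fails only at invoke() time.
public class ReflectionMismatchDemo {
    public static class Fs {
        public String access(String path, Integer action) {
            return path + ":" + action;
        }
    }

    public static Object invokeWith(Object arg) {
        try {
            Method m = Fs.class.getMethod("access", String.class, Integer.class);
            return m.invoke(new Fs(), arg, 4);
        } catch (IllegalArgumentException e) {
            // This is the unchecked exception the Hive query surfaced.
            return "argument type mismatch";
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The fix direction would be to pass the value the signature expects (e.g. `statuses.next().getPath()` rather than the `FileStatus` itself), or to resolve the `access` method against the actual parameter types.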
[jira] [Commented] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission
[ https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531707#comment-14531707 ] Thejas M Nair commented on HIVE-10564: -- [~ekoifman] Thanks a lot for identifying the issue, reviewing the patch and verifying the fix! webhcat should use webhcat-site.xml properties for controller job submission Key: HIVE-10564 URL: https://issues.apache.org/jira/browse/HIVE-10564 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch webhcat should use webhcat-site.xml in configuration for the TempletonController map-only job that it launches. This will allow users to set any MR/HDFS properties that they want used for the controller job. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10556) ORC PPD schema on read related changes
[ https://issues.apache.org/jira/browse/HIVE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531539#comment-14531539 ] Sushanth Sowmyan commented on HIVE-10556: - Per discussion with [~hagleitn], this is okay to defer out to 1.2.1, so doing so - still marking it in the tentative list, so if it is reviewed and in by the time we finish our RC process, it'll be in, but otherwise, it'll track for 1.2.1 ORC PPD schema on read related changes -- Key: HIVE-10556 URL: https://issues.apache.org/jira/browse/HIVE-10556 Project: Hive Issue Type: Bug Affects Versions: 1.2.0, 1.3.0 Reporter: Prasanth Jayachandran Assignee: Gopal V Follow up for HIVE-10286. Some fixes need to be done for schema on read. For example, a Predicate.STRING with value '15' and integer min/max stats of 10/100 should return the YES_NO truth value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
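The intended semantics can be sketched as follows (an illustration only, not Hive's SearchArgument code; the class and method names are hypothetical): coerce the string literal into the stats' integer domain, rule the stripe out only when the value falls outside min/max, and otherwise answer "maybe" (YES_NO), since stats alone cannot prove a match.

```java
// Hypothetical sketch of the schema-on-read PPD rule described above:
// a STRING predicate value evaluated against integer column min/max stats.
public class PpdTruthSketch {
    public enum Truth { YES, NO, YES_NO }

    public static Truth equalsAgainstIntStats(String value, long min, long max) {
        long v;
        try {
            v = Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return Truth.YES_NO; // non-numeric literal: cannot decide from stats
        }
        if (v < min || v > max) {
            return Truth.NO;     // outside the range: no row can match, skip stripe
        }
        return Truth.YES_NO;     // inside the range: some rows may match
    }
}
```

With the example from the issue, value '15' against stats 10/100 lands inside the range, so the answer is YES_NO rather than NO.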
[jira] [Commented] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize
[ https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531537#comment-14531537 ] Sushanth Sowmyan commented on HIVE-10559: - Per discussion with [~hagleitn], this is okay to defer out to 1.2.1, so doing so - still marking it in the tentative list, so if it is reviewed and in by the time we finish our RC process, it'll be in, but otherwise, it'll track for 1.2.1 IndexOutOfBoundsException with RemoveDynamicPruningBySize - Key: HIVE-10559 URL: https://issues.apache.org/jira/browse/HIVE-10559 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0, 1.3.0 Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, HIVE-10559.03.patch, q85.q The problem can be reproduced by running the script attached. Backtrace {code}
2015-04-29 10:34:36,390 ERROR [main]: ql.Driver (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281)
at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123)
at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000)
at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139)
at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:176)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at
[jira] [Commented] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission
[ https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531588#comment-14531588 ] Eugene Koifman commented on HIVE-10564: --- [~thejas] I tested patch 2 - it runs clean. +1 webhcat should use webhcat-site.xml properties for controller job submission Key: HIVE-10564 URL: https://issues.apache.org/jira/browse/HIVE-10564 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch webhcat should use webhcat-site.xml in the configuration for the TempletonController map-only job that it launches. This will allow users to set any MR/HDFS properties that they want used for the controller job. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531670#comment-14531670 ] Thiruvel Thirumoolan commented on HIVE-8696: I ran all hcatalog tests locally and they passed. Will wait for precommit build to run. Hopefully no surprises. I also updated the review board entry with the latest patch. HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. - Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.12.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Thiruvel Thirumoolan Fix For: 1.2.0 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, HIVE-8696.4.patch, HIVE-8696.5.patch, HIVE-8696.poc.patch The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API that log in through keytabs will fail without retry, when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10638) HIVE-9736 introduces issues with Hadoop23Shims.checkFileAccess
[ https://issues.apache.org/jira/browse/HIVE-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531723#comment-14531723 ] Sushanth Sowmyan commented on HIVE-10638: - Marking as blocker for branch-1.2. HIVE-9736 introduces issues with Hadoop23Shims.checkFileAccess -- Key: HIVE-10638 URL: https://issues.apache.org/jira/browse/HIVE-10638 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Copy-pasting [~spena]'s comment in HIVE-9736: Hi [~mithun] This patch is causing the above tests to fail due to the change on {{Hadoop23Shims.checkFileAccess(FileSystem fs, Iterator<FileStatus> statuses, EnumSet<FsAction> actions)}}. The line that fails is {{accessMethod.invoke(fs, statuses.next(), combine(actions));}} I am running Hadoop 2.6.0, and the FileSystem.access() method accepts a Path and an FsAction. When I run the code that checks path permissions, I get this error: {noformat} hive> explain select * from a join b on a.id = b.id; FAILED: SemanticException Unable to determine if hdfs://localhost:9000/user/hive/warehouse/a is read only: java.lang.IllegalArgumentException: argument type mismatch {noformat} Is there a follow-up jira for this error? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
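The "argument type mismatch" failure mode described above can be reproduced with plain JDK reflection. The sketch below is a minimal stand-in, not Hive's actual shim code: the `Fs` and `Action` types are hypothetical substitutes for Hadoop's `FileSystem` and `FsAction`, illustrating how `Method.invoke` rejects an `EnumSet` argument when the reflected signature expects a single enum value.

```java
import java.lang.reflect.Method;
import java.util.EnumSet;

public class ReflectionMismatchDemo {
    public enum Action { READ, WRITE }

    public static class Fs {
        // Hypothetical stand-in for FileSystem#access(Path, FsAction):
        // the target method takes a single action, not a set of actions.
        public void access(String path, Action action) { }
    }

    public static String mismatchMessage() {
        try {
            Fs fs = new Fs();
            Method access = Fs.class.getMethod("access", String.class, Action.class);
            // Matching argument types: this call succeeds.
            access.invoke(fs, "/warehouse/a", Action.READ);
            // Passing an EnumSet where a single Action is expected fails at
            // invoke time, mirroring the error quoted in the bug report.
            access.invoke(fs, "/warehouse/a", EnumSet.of(Action.READ));
            return "no error";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        } catch (ReflectiveOperationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(mismatchMessage());
    }
}
```

This is consistent with the reported stack: the shim keeps calling `accessMethod.invoke` with the old combined-actions argument, while the Hadoop 2.6.0 method signature changed underneath it.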
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531722#comment-14531722 ] Sushanth Sowmyan commented on HIVE-9736: Hi Sergio, thanks for the catch, have filed https://issues.apache.org/jira/browse/HIVE-10638 for the same. [~mithun], could you please look at that issue? I will look through it too. StorageBasedAuthProvider should batch namenode-calls where possible. Key: HIVE-9736 URL: https://issues.apache.org/jira/browse/HIVE-9736 Project: Hive Issue Type: Bug Components: Metastore, Security Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Fix For: 1.2.0 Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch Consider a table partitioned by 2 keys (dt, region). Say a dt partition could have 1 associated regions. Consider that the user does: {code:sql} ALTER TABLE my_table DROP PARTITION (dt='20150101'); {code} As things stand now, {{StorageBasedAuthProvider}} will make individual {{DistributedFileSystem.listStatus()}} calls for each partition-directory, and authorize each one separately. It'd be faster to batch the calls, and examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
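The batching idea described above can be sketched generically: group partition directories by their parent directory so that one listing call per parent can replace one namenode call per partition. This is a hypothetical illustration of the optimization, not the actual patch; `groupByParent` and the path layout are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BatchedAuthSketch {
    // Group partition directories by parent directory so that a single
    // listing of the parent yields the FileStatus of every child partition,
    // letting the authorizer examine multiple FileStatus objects at once.
    public static Map<String, List<String>> groupByParent(List<String> partitionDirs) {
        return partitionDirs.stream()
                .collect(Collectors.groupingBy(p -> p.substring(0, p.lastIndexOf('/'))));
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
                "/warehouse/my_table/dt=20150101/region=us",
                "/warehouse/my_table/dt=20150101/region=eu",
                "/warehouse/my_table/dt=20150102/region=us");
        // Two distinct parents -> two listing calls instead of three
        // per-directory calls.
        System.out.println(groupByParent(dirs).keySet());
    }
}
```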
[jira] [Commented] (HIVE-9644) CASE comparison operator rotation optimization
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531519#comment-14531519 ] Ashutosh Chauhan commented on HIVE-9644: I will update the description of this jira (about folding udfs) since it has captured a bit of discussion. Will open another jira for operator rotation. CASE comparison operator rotation optimization -- Key: HIVE-9644 URL: https://issues.apache.org/jira/browse/HIVE-9644 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Gopal V Assignee: Ashutosh Chauhan Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch Constant folding doesn't kick in for some automatically generated query patterns which look like this. {code} hive> explain select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1 else null end)=1; {code} This should get rewritten by pushing the equality into the case branches. {code} select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1=1 else null=1 end); {code} Ending up with a simplified filter condition, resolving itself as {code} select count(1) from store_sales where ss_sold_date = '1998-01-01'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9644) Fold case when udfs
[ https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9644: --- Summary: Fold case when udfs (was: CASE comparison operator rotation optimization) Fold case when udfs - Key: HIVE-9644 URL: https://issues.apache.org/jira/browse/HIVE-9644 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Gopal V Assignee: Ashutosh Chauhan Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.3.patch, HIVE-9644.patch Constant folding doesn't kick in for some automatically generated query patterns which look like this. {code} hive> explain select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1 else null end)=1; {code} This should get rewritten by pushing the equality into the case branches. {code} select count(1) from store_sales where (case ss_sold_date when '1998-01-01' then 1=1 else null=1 end); {code} Ending up with a simplified filter condition, resolving itself as {code} select count(1) from store_sales where ss_sold_date = '1998-01-01'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10565) LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT OUTER JOIN repeated key correctly
[ https://issues.apache.org/jira/browse/HIVE-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531536#comment-14531536 ] Sushanth Sowmyan commented on HIVE-10565: - Per discussion with [~hagleitn], this is okay to defer out to 1.2.1, so doing so - still marking it in the tentative list, so if it is reviewed and in by the time we finish our RC process, it'll be in, but otherwise, it'll track for 1.2.1 LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT OUTER JOIN repeated key correctly Key: HIVE-10565 URL: https://issues.apache.org/jira/browse/HIVE-10565 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 1.2.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 1.2.0, 1.3.0 Attachments: HIVE-10565.01.patch, HIVE-10565.02.patch, HIVE-10565.03.patch, HIVE-10565.04.patch, HIVE-10565.05.patch, HIVE-10565.06.patch, HIVE-10565.07.patch, HIVE-10565.08.patch Filtering can knock out some of the rows for a repeated key, but those knocked-out rows still need to be included in the LEFT OUTER JOIN result; currently they are not when only some of a key's rows are filtered out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
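The expected semantics can be pinned down with a tiny scalar reference implementation (a sketch, not Hive's vectorized operator): even when a join-time filter knocks out every right-side row for a repeated key, each left row must still be emitted once with a null right side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

public class LeftOuterFilterDemo {
    // Reference LEFT OUTER JOIN with an extra join-time filter: a pair is
    // emitted only when keys match AND the filter passes; a left row whose
    // every candidate match is filtered out still appears, with a null
    // right side. (Illustrative types; real rows are not bare ints.)
    public static List<String> leftOuterJoin(int[] left, int[] right,
                                             BiPredicate<Integer, Integer> filter) {
        List<String> out = new ArrayList<>();
        for (int l : left) {
            boolean matched = false;
            for (int r : right) {
                if (l == r && filter.test(l, r)) {
                    out.add(l + "," + r);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(l + ",null");  // knocked-out rows survive the join
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Repeated key 1 on both sides, filter rejects every pair: both
        // left rows must still be emitted with null right sides.
        System.out.println(leftOuterJoin(new int[]{1, 1}, new int[]{1, 1}, (l, r) -> false));
    }
}
```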
[jira] [Updated] (HIVE-10530) Aggregate stats cache: bug fixes for RDBMS path
[ https://issues.apache.org/jira/browse/HIVE-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10530: Attachment: HIVE-10530.2.patch Aggregate stats cache: bug fixes for RDBMS path --- Key: HIVE-10530 URL: https://issues.apache.org/jira/browse/HIVE-10530 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 1.2.0 Attachments: HIVE-10530.1.patch, HIVE-10530.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count into account
[ https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531344#comment-14531344 ] Ashutosh Chauhan commented on HIVE-10526: - +1 LGTM CBO (Calcite Return Path): HiveCost epsilon comparison should take row count into account -- Key: HIVE-10526 URL: https://issues.apache.org/jira/browse/HIVE-10526 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10526.1.patch, HIVE-10526.2.patch, HIVE-10526.3.patch, HIVE-10526.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10626) Spark plan needs to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-10626: Attachment: HIVE-10626.2-spark.patch If it is not used outside of this class's scope, it makes sense. Updated the patch with a local variable. Spark plan needs to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch, HIVE-10626.2-spark.patch The basic patch from [HIVE-8858] was committed; the latest patch needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6679) HiveServer2 should support configuring the server-side socket timeout and keepalive for various transport types where applicable
[ https://issues.apache.org/jira/browse/HIVE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531463#comment-14531463 ] Sushanth Sowmyan commented on HIVE-6679: Per discussion with Vaibhav, confirming deferring out of branch-1.2 HiveServer2 should support configuring the server-side socket timeout and keepalive for various transport types where applicable -- Key: HIVE-6679 URL: https://issues.apache.org/jira/browse/HIVE-6679 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Prasad Mujumdar Assignee: Navis Labels: TODOC1.0, TODOC15 Fix For: 1.3.0 Attachments: HIVE-6679.1.patch.txt, HIVE-6679.2.patch.txt, HIVE-6679.3.patch, HIVE-6679.4.patch, HIVE-6679.5.patch, HIVE-6679.6.patch HiveServer2 should support a configurable server-side socket read timeout and TCP keep-alive option. The metastore server already supports this (and so does the old Hive server). We now have multiple client connectivity options like Kerberos, Delegation Token (Digest-MD5), Plain SASL, Plain SASL with SSL, and raw sockets. The configuration should be applicable to all types (if possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down
[ https://issues.apache.org/jira/browse/HIVE-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531462#comment-14531462 ] Vikram Dixit K edited comment on HIVE-10611 at 5/6/15 9:31 PM: --- Committed to trunk to alleviate some of the pressure. Thanks Ashutosh for the review. was (Author: vikram.dixit): Committed to trunk to alleviate some of the pressure. Mini tez tests wait for 5 minutes before shutting down -- Key: HIVE-10611 URL: https://issues.apache.org/jira/browse/HIVE-10611 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10611.1.patch Currently, at shutdown, the tez mini cluster waits for the session to close before shutting down the cluster. This ends up being 5 minutes - the default value. We can shut down the session to alleviate this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10612) HIVE-10578 broke TestSQLStdHiveAccessControllerHS2 tests
[ https://issues.apache.org/jira/browse/HIVE-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10612: Attachment: HIVE-10612.2.patch Attaching a duplicate HIVE-10612.2.patch in case last night's precommit breakage marked the .1.patch as processed. HIVE-10578 broke TestSQLStdHiveAccessControllerHS2 tests Key: HIVE-10612 URL: https://issues.apache.org/jira/browse/HIVE-10612 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-10612.1.patch, HIVE-10612.2.patch The change in HIVE-10578 has broken two tests in TestSQLStdHiveAccessControllerHS2 - testConfigProcessing and testConfigProcessingCustomSetWhitelistAppend. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge
[ https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531306#comment-14531306 ] Hive QA commented on HIVE-9845: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730874/HIVE-9845.6.patch {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 8900 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3783/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3783/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3783/
Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12730874 - PreCommit-HIVE-TRUNK-Build HCatSplit repeats information making input split data size huge --- Key: HIVE-9845 URL: https://issues.apache.org/jira/browse/HIVE-9845 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Rohini Palaniswamy Assignee: Mithun Radhakrishnan Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, HIVE-9845.5.patch, HIVE-9845.6.patch Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data which has even triple the number of splits (100K+ splits and tasks) does not hit that issue. {code}
HCatBaseInputFormat.java:
// Call getSplits on the InputFormat, create an
// HCatSplit for each underlying split.
// NumSplits is 0 for our purposes.
org.apache.hadoop.mapred.InputSplit[] baseSplits = inputFormat.getSplits(jobConf, 0);
for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
    splits.add(new HCatSplit(partitionInfo, split, allCols));
}
{code} Each HCatSplit duplicates the partition schema and table schema.
[jira] [Commented] (HIVE-10626) Spark plan needs to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531320#comment-14531320 ] Jimmy Xiang commented on HIVE-10626: I see. In order to use toString, we would have to add an extra field, which may not be good. It is probably better to create a StringBuilder local variable in logSparkPlan. What do you think? Spark plan needs to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch The basic patch from [HIVE-8858] was committed; the latest patch needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10563) MiniTezCliDriver tests ordering issues
[ https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531330#comment-14531330 ] Ashutosh Chauhan commented on HIVE-10563: - I have not verified in the code, but that's my understanding too. Once per q file should be sufficient to cover all queries in it. MiniTezCliDriver tests ordering issues -- Key: HIVE-10563 URL: https://issues.apache.org/jira/browse/HIVE-10563 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch There are a bunch of tests related to TestMiniTezCliDriver which give ordering issues when run on Centos/Windows/OSX -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
[ https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531360#comment-14531360 ] Sushanth Sowmyan commented on HIVE-10614: - Cool, will go ahead and commit HIVE-10614.1.patch to master and HIVE-10614.1.branch-0.12.patch to branch-1.2. [~hsubramaniyan], could you please file another jira and link it to HIVE-7018 and this jira as a follow-up jira to address removal of LINK_TARGET_ID in a manner that is schematool compliant? schemaTool upgrade from 0.14.0 to 1.3.0 causes failure -- Key: HIVE-10614 URL: https://issues.apache.org/jira/browse/HIVE-10614 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose {code}
| HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it |
1 row selected (0.004 seconds)
0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_TLBS_LINKID
No rows affected (0.005 seconds)
0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID
No rows affected (0.006 seconds)
0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_LINKID
No rows affected (0.002 seconds)
0: jdbc:mysql://node-1.example.com/hive> CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064)
Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224)
{code} Looks like HIVE-7018 has introduced a stored procedure as part of the MySQL upgrade script, and it is causing issues with the schematool upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
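One plausible reading of the "syntax ... near '' at line 1" error above (a hypothesis, not confirmed in this thread) is that the script runner splits statements on semicolons, which cuts the CREATE PROCEDURE body at its internal semicolons. The sketch below illustrates that failure mode with a naive splitter; `naiveSplit` and the shortened procedure are illustrative, not schematool's actual code.

```java
public class SemicolonSplitDemo {
    // A naive statement splitter of the kind SQL script runners often use.
    public static String[] naiveSplit(String script) {
        return script.split(";");
    }

    public static void main(String[] args) {
        String proc = "CREATE PROCEDURE P() BEGIN "
                + "ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID`; END";
        // The procedure body's internal semicolon cuts the statement into
        // two fragments, neither valid SQL on its own -- which would surface
        // as a MySQL 1064 syntax error like the one in the log.
        for (String fragment : naiveSplit(proc)) {
            System.out.println("[" + fragment.trim() + "]");
        }
    }
}
```

MySQL's own clients work around this by temporarily changing the statement delimiter while defining a procedure.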
[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-8696: --- Attachment: HIVE-8696.3.patch I think TestPassProperties would fail because the error message changed in the stack frame. Updating with test changes included. HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. - Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.12.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Thiruvel Thirumoolan Fix For: 1.2.0 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, HIVE-8696.poc.patch The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API that log in through keytabs will fail without retry, when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10563) MiniTezCliDriver tests ordering issues
[ https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10563: - Attachment: HIVE-10563.3.patch MiniTezCliDriver tests ordering issues -- Key: HIVE-10563 URL: https://issues.apache.org/jira/browse/HIVE-10563 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch, HIVE-10563.3.patch There are a bunch of tests related to TestMiniTezCliDriver which give ordering issues when run on Centos/Windows/OSX -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge
[ https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531404#comment-14531404 ] Sushanth Sowmyan commented on HIVE-9845: None of the test failures appear related; will go ahead and commit, +1. HCatSplit repeats information making input split data size huge --- Key: HIVE-9845 URL: https://issues.apache.org/jira/browse/HIVE-9845 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Rohini Palaniswamy Assignee: Mithun Radhakrishnan Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, HIVE-9845.5.patch, HIVE-9845.6.patch Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data which has even triple the number of splits (100K+ splits and tasks) does not hit that issue. {code}
HCatBaseInputFormat.java:
// Call getSplits on the InputFormat, create an
// HCatSplit for each underlying split.
// NumSplits is 0 for our purposes.
org.apache.hadoop.mapred.InputSplit[] baseSplits = inputFormat.getSplits(jobConf, 0);
for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
    splits.add(new HCatSplit(partitionInfo, split, allCols));
}
{code} Each HCatSplit duplicates the partition schema and table schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531467#comment-14531467 ] Hive QA commented on HIVE-10484: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730881/HIVE-10484.02.patch {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8900 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3784/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3784/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3784/
Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12730881 - PreCommit-HIVE-TRUNK-Build Vectorization : RuntimeException Big Table Retained Mapping duplicate column -- Key: HIVE-10484 URL: https://issues.apache.org/jira/browse/HIVE-10484 Project: Hive Issue Type: Bug Components: Tez, Vectorization Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Fix For: 1.2.0 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch With vectorization and tez enabled TPC-DS Q70 fails with {code} Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320) at
[jira] [Commented] (HIVE-10521) TxnHandler.timeOutTxns only times out some of the expired transactions
[ https://issues.apache.org/jira/browse/HIVE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531481#comment-14531481 ] Sushanth Sowmyan commented on HIVE-10521: - Given that precommit tests have now run, [~ekoifman]/[~alangates], could you please verify and commit this patch? TxnHandler.timeOutTxns only times out some of the expired transactions -- Key: HIVE-10521 URL: https://issues.apache.org/jira/browse/HIVE-10521 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-10521.2.patch, HIVE-10521.3.patch, HIVE-10521.4.patch, HIVE-10521.patch {code} for (int i = 0; i < 20 && rs.next(); i++) deadTxns.add(rs.getLong(1)); // We don't care whether all of the transactions get deleted or not, // if some didn't it most likely means someone else deleted them in the interim if (deadTxns.size() > 0) abortTxns(dbConn, deadTxns); {code} While it makes sense to limit the number of transactions aborted in one pass (since this gets translated to an IN clause) we should still make sure all are timed out. Also, 20 seems pretty small as a batch size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
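The concern above could be addressed by keeping the bounded IN clause but looping until every expired transaction is aborted. A minimal sketch of the batching step (the class name, method name, and batch size below are illustrative, not TxnHandler's actual code):

{code:java}
import java.util.ArrayList;
import java.util.List;

public class TxnBatchSketch {

    // Split the full set of expired transaction ids into bounded batches.
    // Each batch later becomes one IN (...) clause, so the clause stays
    // small, but no expired transaction is ever skipped.
    static List<List<Long>> toBatches(List<Long> expiredTxnIds, int batchSize) {
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < expiredTxnIds.size(); i += batchSize) {
            int end = Math.min(i + batchSize, expiredTxnIds.size());
            batches.add(new ArrayList<>(expiredTxnIds.subList(i, end)));
        }
        return batches;
    }
}
{code}

The caller would then invoke abortTxns(dbConn, batch) once per batch, so a larger batch size only changes the number of round trips, not which transactions get timed out.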
[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-8696: --- Attachment: HIVE-8696.4.patch HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. - Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.12.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Thiruvel Thirumoolan Fix For: 1.2.0 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, HIVE-8696.4.patch, HIVE-8696.poc.patch The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API that log in through keytabs will fail without retry, when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others
[ https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7018: --- Fix Version/s: (was: 1.2.0) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others - Key: HIVE-7018 URL: https://issues.apache.org/jira/browse/HIVE-7018 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Yongzhi Chen Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch It appears that at least postgres and oracle do not have the LINK_TARGET_ID column while mysql does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner
[ https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10635: - Description: In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool. We need to redo HIVE-7018 work once the script introduced for HIVE-7018 is schematool compliant. (was: In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool.) Redo HIVE-7018 in a schematool compatible manner Key: HIVE-10635 URL: https://issues.apache.org/jira/browse/HIVE-10635 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool. We need to redo HIVE-7018 work once the script introduced for HIVE-7018 is schematool compliant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10626) Spark paln need to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531382#comment-14531382 ] Jimmy Xiang commented on HIVE-10626: Thanks for making the change. +1 pending on test. Spark paln need to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch, HIVE-10626.2-spark.patch The [HIVE-8858] basic patch was committed; the latest patch needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531443#comment-14531443 ] Sergio Peña commented on HIVE-8065: --- Hey [~thejas] Here are some answers about the issues: 1. If the encrypted zone where the results will be written is read-only, then Hive will try to use the directory set by {{hive.exec.scratchdir}} only if the scratch directory is encrypted as well (see HIVE-8945). This might create a performance issue if the encrypted scratch directory is in a different encryption zone. The user may change that directory to a writable directory inside the same encryption zone to make the move faster. This might be a little tedious for users, but it is the only way to protect their data. 2. This is a little tricky. Currently, Hive selects the encryption zone that has the strongest cipher (aes128 vs aes256), and uses that location to store all final and intermediate results. This avoids writing intermediate data (aes256 to aes128), and then writing back the final result to aes256. Here we have another performance issue where final result files would be copied (and not renamed) to the destination table as encryption zones might be different. We did not do any work to deny access to stored results in another encryption zone. The solution only prevents encrypted data from touching unencrypted zones or more weakly encrypted zones. Other solutions, like Sentry, may address this access control. But without an access control mechanism, this issue exists on the scratch directory, doesn't it? Support HDFS encryption functionality on Hive - Key: HIVE-8065 URL: https://issues.apache.org/jira/browse/HIVE-8065 Project: Hive Issue Type: Improvement Affects Versions: 0.13.1 Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum The new encryption support on HDFS makes Hive incompatible and unusable when this feature is used.
HDFS encryption is designed so that a user can configure different encryption zones (or directories) for multi-tenant environments. An encryption zone has an exclusive encryption key, such as AES-128 or AES-256. For security compliance, HDFS does not allow moving/renaming files between encryption zones. Renames are allowed only inside the same encryption zone. A copy is allowed between encryption zones. See HDFS-6134 for more details about the HDFS encryption design. Hive currently uses a scratch directory (like /tmp/$user/$random). This scratch directory is used for the output of intermediate data (between MR jobs) and for the final output of the hive query, which is later moved to the table directory location. If Hive tables are in different encryption zones than the scratch directory, then Hive won't be able to rename those files/directories, and that makes Hive unusable. To handle this problem, we can change the scratch directory of the query/statement to be inside the same encryption zone as the table directory location. This way, the renaming process will succeed. Also, for statements that move files between encryption zones (e.g. LOAD DATA), a copy may be executed instead of a rename. This will cause an overhead when copying large data files, but it won't break the encryption on Hive. Another security consideration involves joins and selects. If Hive joins tables with different encryption key strengths, then the results of the select might break the security compliance of the tables. Say two tables with 128-bit and 256-bit encryption are joined; the temporary results might be stored in the 128-bit encryption zone, temporarily violating the compliance of the table encrypted with 256 bits. To fix this, Hive should be able to select the most strongly encrypted/secured scratch directory in order to store the intermediate data temporarily with no compliance issues.
For instance: {noformat} SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; {noformat} - This should use a scratch directory (or staging directory) inside the table-aes256 table location. {noformat} INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; {noformat} - This should use a scratch directory inside the table-aes1 location. {noformat} FROM table-unencrypted INSERT OVERWRITE TABLE table-aes128 SELECT id, name INSERT OVERWRITE TABLE table-aes256 SELECT id, name {noformat} - This should use a scratch directory in each of the tables' locations. - The first SELECT will have its scratch directory in the table-aes128 directory. - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
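The selection rule in these examples amounts to staging under the most strongly encrypted table location. A minimal sketch of that comparison, assuming the caller has already resolved each table path to its encryption key length (the class, method, and map are illustrative; Hive itself resolves zones through the HDFS encryption APIs):

{code:java}
import java.util.Map;

public class StagingDirChooser {

    // Pick the table location whose encryption key is strongest; intermediate
    // and final results are then staged under that location. An unencrypted
    // table maps to 0 bits, so it never wins against an encrypted one.
    static String chooseStagingParent(Map<String, Integer> tableKeyBits) {
        String best = null;
        int bestBits = -1;
        for (Map.Entry<String, Integer> e : tableKeyBits.entrySet()) {
            if (e.getValue() > bestBits) {
                bestBits = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
{code}

For the first example above, a map of {table-aes128 location -> 128, table-aes256 location -> 256} would select the table-aes256 location as the staging parent.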
[jira] [Commented] (HIVE-6679) HiveServer2 should support configurable the server side socket timeout and keepalive for various transports types where applicable
[ https://issues.apache.org/jira/browse/HIVE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531451#comment-14531451 ] Vaibhav Gumashta commented on HIVE-6679: Deferring this to 1.3 since it'll need some testing. HiveServer2 should support configurable the server side socket timeout and keepalive for various transports types where applicable -- Key: HIVE-6679 URL: https://issues.apache.org/jira/browse/HIVE-6679 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Prasad Mujumdar Assignee: Navis Labels: TODOC1.0, TODOC15 Fix For: 1.3.0 Attachments: HIVE-6679.1.patch.txt, HIVE-6679.2.patch.txt, HIVE-6679.3.patch, HIVE-6679.4.patch, HIVE-6679.5.patch, HIVE-6679.6.patch HiveServer2 should support a configurable server-side socket read timeout and TCP keep-alive option. The Metastore server already supports this (and so does the old hive server). We now have multiple client connectivity options like Kerberos, Delegation Token (Digest-MD5), Plain SASL, Plain SASL with SSL and raw sockets. The configuration should be applicable to all types (if possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission
[ https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10564: - Attachment: HIVE-10564.2.patch webhcat should use webhcat-site.xml properties for controller job submission Key: HIVE-10564 URL: https://issues.apache.org/jira/browse/HIVE-10564 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch webhcat should use webhcat-site.xml in configuration for the TempletonController map-only job that it launches. This will allow users to set any MR/hdfs properties they want used for the controller job. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner
[ https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair resolved HIVE-10635. -- Resolution: Duplicate Closing this, will track in HIVE-7018 itself. Redo HIVE-7018 in a schematool compatible manner Key: HIVE-10635 URL: https://issues.apache.org/jira/browse/HIVE-10635 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool. We need to redo HIVE-7018 work once the script introduced for HIVE-7018 is schematool compliant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others
[ https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531497#comment-14531497 ] Thejas M Nair commented on HIVE-7018: - As mentioned in HIVE-10635 - In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool. We need to redo HIVE-7018 work once the script introduced for HIVE-7018 is schematool compliant. Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others - Key: HIVE-7018 URL: https://issues.apache.org/jira/browse/HIVE-7018 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Yongzhi Chen Fix For: 1.2.0 Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch It appears that at least postgres and oracle do not have the LINK_TARGET_ID column while mysql does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others
[ https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reopened HIVE-7018: - As this patch has not gone into a release, it is easier to track the issue by reopening this jira. Closing HIVE-10635 Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others - Key: HIVE-7018 URL: https://issues.apache.org/jira/browse/HIVE-7018 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Yongzhi Chen Fix For: 1.2.0 Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch It appears that at least postgres and oracle do not have the LINK_TARGET_ID column while mysql does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10530) Aggregate stats cache: bug fixes for RDBMS path
[ https://issues.apache.org/jira/browse/HIVE-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531348#comment-14531348 ] Thejas M Nair commented on HIVE-10530: -- +1 pending tests Aggregate stats cache: bug fixes for RDBMS path --- Key: HIVE-10530 URL: https://issues.apache.org/jira/browse/HIVE-10530 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 1.2.0 Attachments: HIVE-10530.1.patch, HIVE-10530.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
[ https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531384#comment-14531384 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10614: -- [~sushanth] Thanks, HIVE-10635 is the follow-up jira. Thanks Hari schemaTool upgrade from 0.14.0 to 1.3.0 causes failure -- Key: HIVE-10614 URL: https://issues.apache.org/jira/browse/HIVE-10614 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Fix For: 1.2.0 Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose {code} ++--+ | | ++--+ | HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it | ++--+ 1 row selected (0.004 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_TLBS_LINKID No rows affected (0.005 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID No rows affected (0.006 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_LINKID No rows affected (0.002 seconds) 0: jdbc:mysql://node-1.example.com/hive CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064) Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! 
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229) at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Schema script failed, errorcode 2 at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355) at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326) at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224) {code} Looks like HIVE-7018 introduced a stored procedure as part of the mysql upgrade script, and it is causing issues with the schematool upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531401#comment-14531401 ] Brock Noland commented on HIVE-8065: bq. Write permissions are now required to read from these tables. Sergio can comment on how read-only tables are handled. We did think of this case. bq. Sensitive data from one zone will be stored in another. Note that file permissions are still enforced and zones are not meant to be an access control mechanism. For example, a user with appropriate permissions could copy data from one ez to another ez1. Nothing in this change changes that fact. Support HDFS encryption functionality on Hive - Key: HIVE-8065 URL: https://issues.apache.org/jira/browse/HIVE-8065 Project: Hive Issue Type: Improvement Affects Versions: 0.13.1 Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum The new encryption support on HDFS makes Hive incompatible and unusable when this feature is used. HDFS encryption is designed so that a user can configure different encryption zones (or directories) for multi-tenant environments. An encryption zone has an exclusive encryption key, such as AES-128 or AES-256. For security compliance, HDFS does not allow moving/renaming files between encryption zones. Renames are allowed only inside the same encryption zone. A copy is allowed between encryption zones. See HDFS-6134 for more details about the HDFS encryption design. Hive currently uses a scratch directory (like /tmp/$user/$random). This scratch directory is used for the output of intermediate data (between MR jobs) and for the final output of the hive query, which is later moved to the table directory location. If Hive tables are in different encryption zones than the scratch directory, then Hive won't be able to rename those files/directories, and that makes Hive unusable.
To handle this problem, we can change the scratch directory of the query/statement to be inside the same encryption zone as the table directory location. This way, the renaming process will succeed. Also, for statements that move files between encryption zones (e.g. LOAD DATA), a copy may be executed instead of a rename. This will cause an overhead when copying large data files, but it won't break the encryption on Hive. Another security consideration involves joins and selects. If Hive joins tables with different encryption key strengths, then the results of the select might break the security compliance of the tables. Say two tables with 128-bit and 256-bit encryption are joined; the temporary results might be stored in the 128-bit encryption zone, temporarily violating the compliance of the table encrypted with 256 bits. To fix this, Hive should be able to select the most strongly encrypted/secured scratch directory in order to store the intermediate data temporarily with no compliance issues. For instance: {noformat} SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; {noformat} - This should use a scratch directory (or staging directory) inside the table-aes256 table location. {noformat} INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; {noformat} - This should use a scratch directory inside the table-aes1 location. {noformat} FROM table-unencrypted INSERT OVERWRITE TABLE table-aes128 SELECT id, name INSERT OVERWRITE TABLE table-aes256 SELECT id, name {noformat} - This should use a scratch directory in each of the tables' locations. - The first SELECT will have its scratch directory in the table-aes128 directory. - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531401#comment-14531401 ] Brock Noland edited comment on HIVE-8065 at 5/6/15 9:00 PM: bq. Write permissions are now required to read from these tables. Sergio can comment how read-only tables are handled. We did think of this case. bq. Sensitive data from one zone will be stored in another. Note that permissions are still enforced and zones are not meant to be an access control mechanism. For example, a user with appropriate permissions could copy data from one ez to another ez1. Nothing in this change, changes that fact. was (Author: brocknoland): bq. Write permissions are now required to read from these tables. Sergio can comment how read-only tables are handled. We did think of this case. bq. Sensitive data from one zone will be stored in another. Note that file permissions are still enforced and zones are not meant to be an access control mechanism. For example, a user with appropriate permissions could copy data from one ez to another ez1. Nothing in this change, changes that fact. Support HDFS encryption functionality on Hive - Key: HIVE-8065 URL: https://issues.apache.org/jira/browse/HIVE-8065 Project: Hive Issue Type: Improvement Affects Versions: 0.13.1 Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum The new encryption support on HDFS makes Hive incompatible and unusable when this feature is used. HDFS encryption is designed so that a user can configure different encryption zones (or directories) for multi-tenant environments. An encryption zone has an exclusive encryption key, such as AES-128 or AES-256. For security compliance, HDFS does not allow moving/renaming files between encryption zones. Renames are allowed only inside the same encryption zone. A copy is allowed between encryption zones. See HDFS-6134 for more details about the HDFS encryption design.
Hive currently uses a scratch directory (like /tmp/$user/$random). This scratch directory is used for the output of intermediate data (between MR jobs) and for the final output of the hive query, which is later moved to the table directory location. If Hive tables are in different encryption zones than the scratch directory, then Hive won't be able to rename those files/directories, and that makes Hive unusable. To handle this problem, we can change the scratch directory of the query/statement to be inside the same encryption zone as the table directory location. This way, the renaming process will succeed. Also, for statements that move files between encryption zones (e.g. LOAD DATA), a copy may be executed instead of a rename. This will cause an overhead when copying large data files, but it won't break the encryption on Hive. Another security consideration involves joins and selects. If Hive joins tables with different encryption key strengths, then the results of the select might break the security compliance of the tables. Say two tables with 128-bit and 256-bit encryption are joined; the temporary results might be stored in the 128-bit encryption zone, temporarily violating the compliance of the table encrypted with 256 bits. To fix this, Hive should be able to select the most strongly encrypted/secured scratch directory in order to store the intermediate data temporarily with no compliance issues. For instance: {noformat} SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; {noformat} - This should use a scratch directory (or staging directory) inside the table-aes256 table location. {noformat} INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; {noformat} - This should use a scratch directory inside the table-aes1 location.
{noformat} FROM table-unencrypted INSERT OVERWRITE TABLE table-aes128 SELECT id, name INSERT OVERWRITE TABLE table-aes256 SELECT id, name {noformat} - This should use a scratch directory in each of the tables' locations. - The first SELECT will have its scratch directory in the table-aes128 directory. - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down
[ https://issues.apache.org/jira/browse/HIVE-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531462#comment-14531462 ] Vikram Dixit K edited comment on HIVE-10611 at 5/6/15 9:32 PM: --- Committed to trunk to alleviate some of the HiveQA pressure. Thanks Ashutosh for the review. was (Author: vikram.dixit): Committed to trunk to alleviate some of the pressure. Thanks Ashutosh for the review. Mini tez tests wait for 5 minutes before shutting down -- Key: HIVE-10611 URL: https://issues.apache.org/jira/browse/HIVE-10611 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10611.1.patch Currently, at shutdown, the tez mini cluster waits for the session to close before shutting down the cluster. This ends up being 5 minutes - the default value. We can shut down the session to alleviate this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10506) CBO (Calcite Return Path): Disallow return path to be enable if CBO is off
[ https://issues.apache.org/jira/browse/HIVE-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531487#comment-14531487 ] Sushanth Sowmyan commented on HIVE-10506: - Hi, given that precommit tests have run, and this has been +1ed, could we please get this patch committed in? CBO (Calcite Return Path): Disallow return path to be enable if CBO is off -- Key: HIVE-10506 URL: https://issues.apache.org/jira/browse/HIVE-10506 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10506.01.patch, HIVE-10506.patch If hive.cbo.enable=false and hive.cbo.returnpath=true then some optimizations would kick in. It's quite possible that in customer environment, they might end up in these scenarios; we should prevent it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column
[ https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531494#comment-14531494 ] Matt McCline commented on HIVE-10484: - [~vikram.dixit] Thank You! Vectorization : RuntimeException Big Table Retained Mapping duplicate column -- Key: HIVE-10484 URL: https://issues.apache.org/jira/browse/HIVE-10484 Project: Hive Issue Type: Bug Components: Tez, Vectorization Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Fix For: 1.2.0 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch With vectorization and tez enabled TPC-DS Q70 fails with {code} Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79) ... 
49 more {code} Query {code:sql} select s_state from (select s_state as s_state, sum(ss_net_profit), rank() over ( partition by s_state order by sum(ss_net_profit) desc) as ranking from store_sales, store, date_dim where d_month_seq between 1193 and 1193+11 and date_dim.d_date_sk = store_sales.ss_sold_date_sk and store.s_store_sk = store_sales.ss_store_sk group by s_state ) tmp1 where ranking = 5 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10626) Spark plan needs to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531310#comment-14531310 ] Chinna Rao Lalam commented on HIVE-10626: - This is one exception from [HIVE-8858]: {quote} 3. It would be better if we log this graph in one line. The easiest way is to have a toString() method in SparkPlan and then we can just log the string representation of SparkPlan. {quote} Spark plan needs to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch The basic [HIVE-8858] patch was committed; the latest patch needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8524) When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS
[ https://issues.apache.org/jira/browse/HIVE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-8524. Resolution: Fixed Fix Version/s: 1.2.0 Fixed via HIVE-9720 When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS Key: HIVE-8524 URL: https://issues.apache.org/jira/browse/HIVE-8524 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Mostafa Mokhtar Assignee: Ashutosh Chauhan Labels: hive Fix For: 1.2.0 When a Hive table is renamed, the name is not updated in TAB_COL_STATS and PART_COL_STATS. Repro 1) Create table 2) insert rows 3) Analyze table t1 compute statistics for columns; 4) set hive.stats.fetch.column.stats=true; 5) Explain select * from t1 where c1 = x 6) ALTER TABLE t1 RENAME TO t2; 7) Explain select * from t2 where c1 = x ; /* stats will be missing */ 8) Query the Metastore tables to validate According to the documentation, the Metastore should be updated: {code} This statement lets you change the name of a table to a different name. As of version 0.6, a rename on a managed table moves its HDFS location as well. (Older Hive versions just renamed the table in the metastore without moving the HDFS location.) {code} Another related issue is that the schema of the stats tables is not consistent with TBLS and DBS: those two tables are normalized, while TAB_COL_STATS and PART_COL_STATS have TABLE_NAME and DB_NAME denormalized in them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8524) When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS
[ https://issues.apache.org/jira/browse/HIVE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8524: --- Affects Version/s: 1.1.0 1.0.0 When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS Key: HIVE-8524 URL: https://issues.apache.org/jira/browse/HIVE-8524 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Mostafa Mokhtar Assignee: Ashutosh Chauhan Labels: hive Fix For: 1.2.0 When a Hive table is renamed, the name is not updated in TAB_COL_STATS and PART_COL_STATS. Repro 1) Create table 2) insert rows 3) Analyze table t1 compute statistics for columns; 4) set hive.stats.fetch.column.stats=true; 5) Explain select * from t1 where c1 = x 6) ALTER TABLE t1 RENAME TO t2; 7) Explain select * from t2 where c1 = x ; /* stats will be missing */ 8) Query the Metastore tables to validate According to the documentation, the Metastore should be updated: {code} This statement lets you change the name of a table to a different name. As of version 0.6, a rename on a managed table moves its HDFS location as well. (Older Hive versions just renamed the table in the metastore without moving the HDFS location.) {code} Another related issue is that the schema of the stats tables is not consistent with TBLS and DBS: those two tables are normalized, while TAB_COL_STATS and PART_COL_STATS have TABLE_NAME and DB_NAME denormalized in them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner
[ https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10635: - Component/s: Metastore Redo HIVE-7018 in a schematool compatible manner Key: HIVE-10635 URL: https://issues.apache.org/jira/browse/HIVE-10635 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan In HIVE-10614, we had to revert HIVE-7018 because it was not schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via schematool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9451) Add max size of column dictionaries to ORC metadata
[ https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531432#comment-14531432 ] Sushanth Sowmyan commented on HIVE-9451: After discussion with Owen, marking as tentative for 1.2 - i.e. this will not hold up the RC process for 1.2.0, but if it makes it before we release, it'll be part of 1.2.0. This will still be honoured for inclusion in a 1.2.1 when we do it. Add max size of column dictionaries to ORC metadata --- Key: HIVE-9451 URL: https://issues.apache.org/jira/browse/HIVE-9451 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Labels: ORC Fix For: 1.2.0 Attachments: HIVE-9451.patch, HIVE-9451.patch To predict the amount of memory required to read an ORC file we need to know the size of the dictionaries for the columns that we are reading. I propose adding the number of bytes for each column's dictionary to the stripe's column statistics. The file's column statistics would have the maximum dictionary size for each column. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10595) Dropping a table can cause NPEs in the compactor
[ https://issues.apache.org/jira/browse/HIVE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10595: Attachment: HIVE-10595.1.patch Duplicating HIVE-10595.patch as HIVE-10595.1.patch to submit through precommit. Dropping a table can cause NPEs in the compactor Key: HIVE-10595 URL: https://issues.apache.org/jira/browse/HIVE-10595 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-10595.1.patch, HIVE-10595.patch Reproduction: # start metastore with compactor off # insert enough entries in a table to trigger a compaction # drop the table # stop metastore # restart metastore with compactor on Result: NPE in the compactor threads. I suspect this would also happen if the inserts and drops were done in between a run of the compactor, but I haven't proven it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
[ https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10614: - Attachment: (was: HIVE-10614.1-master.patch) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure -- Key: HIVE-10614 URL: https://issues.apache.org/jira/browse/HIVE-10614 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose {code} ++--+ | | ++--+ | HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it | ++--+ 1 row selected (0.004 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_TLBS_LINKID No rows affected (0.005 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID No rows affected (0.006 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_LINKID No rows affected (0.002 seconds) 0: jdbc:mysql://node-1.example.com/hive CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064) Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! 
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229) at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Schema script failed, errorcode 2 at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355) at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326) at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224) {code} Looks like HIVE-7018 introduced a stored procedure as part of the MySQL upgrade script, and it is causing issues with the schematool upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
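A plausible reason the stored procedure breaks under schematool, sketched below: a JDBC-based script runner sends the script one ';'-terminated statement at a time, so a CREATE PROCEDURE body containing its own internal semicolons gets truncated at the first ';', which would produce exactly the kind of "syntax to use near '' at line 1" error shown above. This splitter is a simplified illustration of the failure mode, not schematool's actual code.

```java
public class StatementSplitDemo {
    // Simplified illustration (not schematool's real parser): splitting a
    // SQL script on ';' cuts a stored-procedure body into fragments.
    static String[] split(String script) {
        return script.split(";");
    }

    public static void main(String[] args) {
        String proc = "CREATE PROCEDURE RM_TLBS_LINKID() BEGIN "
                + "ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END";
        String[] parts = split(proc);
        // The CREATE PROCEDURE statement reaches MySQL without its END,
        // so the server reports a syntax error "at line 1".
        System.out.println(parts.length); // prints 2
        System.out.println(parts[0].contains("END")); // prints false
    }
}
```

This is also why interactive MySQL clients require a DELIMITER change around procedure definitions; a statement-by-statement JDBC runner has no equivalent, which is consistent with HIVE-7018 later being reverted rather than patched in place.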
[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
[ https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10614: - Attachment: (was: HIVE-10614.1.patch) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure -- Key: HIVE-10614 URL: https://issues.apache.org/jira/browse/HIVE-10614 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose {code} ++--+ | | ++--+ | HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it | ++--+ 1 row selected (0.004 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_TLBS_LINKID No rows affected (0.005 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID No rows affected (0.006 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_LINKID No rows affected (0.002 seconds) 0: jdbc:mysql://node-1.example.com/hive CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064) Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! 
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229) at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Schema script failed, errorcode 2 at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355) at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326) at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224) {code} Looks like HIVE-7018 introduced a stored procedure as part of the MySQL upgrade script, and it is causing issues with the schematool upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531112#comment-14531112 ] Szehon Ho commented on HIVE-10453: -- Good catch, +1. One minor question: is the 'CUDFLoader' method name a typo, or intentional? HS2 leaking open file descriptors when using UDFs - Key: HIVE-10453 URL: https://issues.apache.org/jira/browse/HIVE-10453 Project: Hive Issue Type: Bug Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch 1. Create a custom function with CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar'; 2. Create a simple JDBC client: just connect and run a simple query that uses the function, such as: select myfunc(col1) from sometable 3. Disconnect. Check open files for HiveServer2 with: lsof -p HSProcID | grep myudf.jar You will see the leak as: {noformat} java 28718 ychen txt REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar java 28718 ychen 330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
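The leak shown above is typical of a UDF jar classloader that is never closed: since Java 7, URLClassLoader implements Closeable, and closing it releases the open file handle on the jar. A minimal sketch of that fix pattern (the helper name is hypothetical, not the actual HIVE-10453 patch):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class UdfLoaderCleanup {
    // Hypothetical helper: create a loader for a UDF jar, use it, then
    // close it so the descriptor on the jar file is released when the
    // session disconnects.
    static String useAndClose(URL[] jarUrls) throws Exception {
        try (URLClassLoader loader = new URLClassLoader(
                jarUrls, UdfLoaderCleanup.class.getClassLoader())) {
            // ... resolve and invoke the UDF class via 'loader' here ...
            return "released";
        } // try-with-resources calls close(), freeing the jar handle
    }

    public static void main(String[] args) throws Exception {
        System.out.println(useAndClose(new URL[0])); // prints "released"
    }
}
```

After a change like this, re-running the `lsof -p HSProcID | grep myudf.jar` check from the repro should show no lingering entries once the client disconnects.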
[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
[ https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10614: - Attachment: HIVE-10614.1.patch schemaTool upgrade from 0.14.0 to 1.3.0 causes failure -- Key: HIVE-10614 URL: https://issues.apache.org/jira/browse/HIVE-10614 Project: Hive Issue Type: Bug Components: Metastore Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose {code} ++--+ | | ++--+ | HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it | ++--+ 1 row selected (0.004 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_TLBS_LINKID No rows affected (0.005 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID No rows affected (0.006 seconds) 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_LINKID No rows affected (0.002 seconds) 0: jdbc:mysql://node-1.example.com/hive CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064) Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !! 
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229) at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Schema script failed, errorcode 2 at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355) at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326) at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224) {code} Looks like HIVE-7018 introduced a stored procedure as part of the MySQL upgrade script, and it is causing issues with the schematool upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10595) Dropping a table can cause NPEs in the compactor
[ https://issues.apache.org/jira/browse/HIVE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531154#comment-14531154 ] Eugene Koifman commented on HIVE-10595: --- I filed a ticket to investigate the more general issue. +1 this patch Dropping a table can cause NPEs in the compactor Key: HIVE-10595 URL: https://issues.apache.org/jira/browse/HIVE-10595 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-10595.patch Reproduction: # start metastore with compactor off # insert enough entries in a table to trigger a compaction # drop the table # stop metastore # restart metastore with compactor on Result: NPE in the compactor threads. I suspect this would also happen if the inserts and drops were done in between a run of the compactor, but I haven't proven it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10633) LLAP: remove GC setting from runLlapDaemon
[ https://issues.apache.org/jira/browse/HIVE-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-10633: --- Assignee: Sergey Shelukhin LLAP: remove GC setting from runLlapDaemon -- Key: HIVE-10633 URL: https://issues.apache.org/jira/browse/HIVE-10633 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-10633.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10634) The HMS upgrade test script on LXC is exiting with an error even if the tests ran successfully
[ https://issues.apache.org/jira/browse/HIVE-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531251#comment-14531251 ] Szehon Ho commented on HIVE-10634: -- +1 The HMS upgrade test script on LXC is exiting with an error even if the tests ran successfully -- Key: HIVE-10634 URL: https://issues.apache.org/jira/browse/HIVE-10634 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10634.1.patch The execute-test-on-lxc.sh script is exiting with error code '1' after the tests were executed, even if the tests did not fail. This causes PreCommit-HIVE-METASTORE-Test to publish invalid results to Jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10626) Spark plan needs to be updated [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531275#comment-14531275 ] Jimmy Xiang commented on HIVE-10626: Why do we need SparkPlan.toString()? Spark plan needs to be updated [Spark Branch] Key: HIVE-10626 URL: https://issues.apache.org/jira/browse/HIVE-10626 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch The basic [HIVE-8858] patch was committed; the latest patch needs to be committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)