[jira] [Commented] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211557#comment-17211557 ]

Pravin Sinha commented on HIVE-24254:
-------------------------------------

+1

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Status: In Progress  (was: Patch Available)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Attachment: HIVE-24254.03.patch
        Status: Patch Available  (was: In Progress)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Status: In Progress  (was: Patch Available)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Attachment: HIVE-24254.02.patch
        Status: Patch Available  (was: In Progress)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Work logged] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?focusedWorklogId=498829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498829 ]

ASF GitHub Bot logged work on HIVE-24254:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 02:41
            Start Date: 10/Oct/20 02:41
    Worklog Time Spent: 10m

Work Description: aasha commented on pull request #1567:
URL: https://github.com/apache/hive/pull/1567#issuecomment-706472758

   Please find the responses:
   a) Added a test. To enable a successful recycle we need either an ACL set up or the permission check for write operations disabled. I have done the latter, as the ACL set-up was not working with the available APIs. With dfs.permissions.enabled: regardless of whether permissions are on or off, chmod, chgrp, chown and setfacl always check permissions. So the setOwner call was failing without the patch, but the move was successful, as anyone can write to HDFS.
   b) Yes, added as part of the same test. Along with this, a CM Clearer test is also added.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id:     (was: 498829)
    Time Spent: 20m  (was: 10m)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
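For context, the write-permission toggle discussed in (a) is the stock HDFS setting dfs.permissions.enabled. A minimal hdfs-site.xml sketch for such a test cluster could look like the following (the property name is standard HDFS; the surrounding test wiring is not shown and is an assumption):

```
<configuration>
  <!-- Turn off permission enforcement for regular reads/writes in the test
       cluster, so the recycle's file move succeeds for any user. -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- Note: even with enforcement off, chmod/chgrp/chown/setfacl still verify
       that the caller is the owner or a superuser, which is why the setOwner
       call failed before the patch while the move itself succeeded. -->
</configuration>
```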
[jira] [Work logged] (HIVE-24004) Improve performance for filter hook for superuser path
    [ https://issues.apache.org/jira/browse/HIVE-24004?focusedWorklogId=498816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498816 ]

ASF GitHub Bot logged work on HIVE-24004:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1373:
URL: https://github.com/apache/hive/pull/1373#issuecomment-706458124

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:    (was: 498816)
    Time Spent: 1h  (was: 50m)

> Improve performance for filter hook for superuser path
> ------------------------------------------------------
>
>                 Key: HIVE-24004
>                 URL: https://issues.apache.org/jira/browse/HIVE-24004
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: Sam An
>            Assignee: Sam An
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HiveMetaStoreAuthorizer, the sequence of creating the authorizer can be optimized so that, for a superuser for whom authorization can be skipped, we also skip creating the authorizer.
[jira] [Work logged] (HIVE-8950) Add support in ParquetHiveSerde to create table schema from a parquet file
    [ https://issues.apache.org/jira/browse/HIVE-8950?focusedWorklogId=498812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498812 ]

ASF GitHub Bot logged work on HIVE-8950:
----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] closed pull request #1353:
URL: https://github.com/apache/hive/pull/1353

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498812)
    Time Spent: 0.5h  (was: 20m)

> Add support in ParquetHiveSerde to create table schema from a parquet file
> --------------------------------------------------------------------------
>
>                 Key: HIVE-8950
>                 URL: https://issues.apache.org/jira/browse/HIVE-8950
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashish Singh
>            Assignee: Ashish Singh
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-8950.1.patch, HIVE-8950.10.patch, HIVE-8950.11.patch, HIVE-8950.2.patch, HIVE-8950.3.patch, HIVE-8950.4.patch, HIVE-8950.5.patch, HIVE-8950.6.patch, HIVE-8950.7.patch, HIVE-8950.8.patch, HIVE-8950.9.patch, HIVE-8950.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> PARQUET-76 and PARQUET-47 ask for creating parquet backed tables without having to specify the column names and types. As parquet files store their schema in the footer, it is possible to generate the hive schema from a parquet file's metadata. This will improve the usability of parquet backed tables.
[jira] [Work logged] (HIVE-23977) Consolidate partition fetch to one place
    [ https://issues.apache.org/jira/browse/HIVE-23977?focusedWorklogId=498813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498813 ]

ASF GitHub Bot logged work on HIVE-23977:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] closed pull request #1354:
URL: https://github.com/apache/hive/pull/1354

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498813)
    Time Spent: 0.5h  (was: 20m)

> Consolidate partition fetch to one place
> ----------------------------------------
>
>                 Key: HIVE-23977
>                 URL: https://issues.apache.org/jira/browse/HIVE-23977
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
[jira] [Work logged] (HIVE-23960) Partition with no column statistics leads to unbalanced calls to openTransaction/commitTransaction error during get_partitions_by_names
    [ https://issues.apache.org/jira/browse/HIVE-23960?focusedWorklogId=498815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498815 ]

ASF GitHub Bot logged work on HIVE-23960:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1343:
URL: https://github.com/apache/hive/pull/1343#issuecomment-706458147

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 498815)
    Time Spent: 50m  (was: 40m)

> Partition with no column statistics leads to unbalanced calls to openTransaction/commitTransaction error during get_partitions_by_names
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-23960
>                 URL: https://issues.apache.org/jira/browse/HIVE-23960
>             Project: Hive
>          Issue Type: Task
>            Reporter: Pravin Sinha
>            Assignee: Pravin Sinha
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23960.01.patch, HIVE-23960.02.patch, HIVE-23960.03.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> {color:#172b4d}Creating a partition with data and then adding another partition leads to unbalanced calls to open/commit transaction during the get_partitions_by_names call.{color}
> {color:#172b4d}The issue was discovered during a REPL DUMP operation, which uses this HMS call to get the metadata of partitions. This error occurs when there is a partition with no column statistics.{color}
> {color:#172b4d}To reproduce:{color}
> {code:java}
> CREATE TABLE student_part_acid(name string, age int, gpa double) PARTITIONED BY (ds string) STORED AS orc;
> LOAD DATA INPATH '/user/hive/partDir/student_part_acid/ds=20110924' INTO TABLE student_part_acid partition(ds=20110924);
> ALTER TABLE student_part_acid ADD PARTITION (ds=20110925);
> {code}
> Now if we try to perform a REPL DUMP, it fails with the error "Unbalanced calls to open/commit transaction" on the HS2 side.
[jira] [Work logged] (HIVE-22934) Hive server interactive log counters to error stream
    [ https://issues.apache.org/jira/browse/HIVE-22934?focusedWorklogId=498814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498814 ]

ASF GitHub Bot logged work on HIVE-22934:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1200:
URL: https://github.com/apache/hive/pull/1200#issuecomment-706458154

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498814)
    Time Spent: 0.5h  (was: 20m)

> Hive server interactive log counters to error stream
> -----------------------------------------------------
>
>                 Key: HIVE-22934
>                 URL: https://issues.apache.org/jira/browse/HIVE-22934
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Slim Bouguerra
>            Assignee: Ramesh Kumar Thangarajan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-22934.01.patch, HIVE-22934.02.patch, HIVE-22934.03.patch, HIVE-22934.04.patch, HIVE-22934.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive server is logging the console output to the system error stream. This needs to be fixed because, first, we do not roll the file and, second, writing to such a file is sequential and can lead to throttling/poor performance.
> {code}
> -rw-r--r-- 1 hive hadoop 9.5G Feb 26 17:22 hive-server2-interactive.err
> {code}
[jira] [Work logged] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?focusedWorklogId=498780&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498780 ]

ASF GitHub Bot logged work on HIVE-24255:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:58
            Start Date: 09/Oct/20 22:58
    Worklog Time Spent: 10m

Work Description: nareshpr opened a new pull request #1568:
URL: https://github.com/apache/hive/pull/1568

Issue Time Tracking
-------------------
            Worklog Id: (was: 498780)
    Remaining Estimate: 0h
            Time Spent: 10m

> StorageHandler with select-limit query is returning 0 rows
> ----------------------------------------------------------
>
>                 Key: HIVE-24255
>                 URL: https://issues.apache.org/jira/browse/HIVE-24255
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
> {code}
> ==> Wrong Result <==
> {code:java}
> set hive.limit.optimize.enable=true;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
> ----------------------------------------------------------------------------------------------
> +------------+----------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+----------------------+-----------+-----------------+-----------------+
> +------------+----------------------+-----------+-----------------+-----------------+
> {code}
> ==> Correct Result <==
> {code:java}
> set hive.limit.optimize.enable=false;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
> ----------------------------------------------------------------------------------------------
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> {code}
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-24255:
----------------------------------
    Labels: pull-request-available  (was: )

> StorageHandler with select-limit query is returning 0 rows
> ----------------------------------------------------------
>
>                 Key: HIVE-24255
>                 URL: https://issues.apache.org/jira/browse/HIVE-24255
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
> {code}
> ==> Wrong Result <==
> {code:java}
> set hive.limit.optimize.enable=true;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
> ----------------------------------------------------------------------------------------------
> +------------+----------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+----------------------+-----------+-----------------+-----------------+
> +------------+----------------------+-----------+-----------------+-----------------+
> {code}
> ==> Correct Result <==
> {code:java}
> set hive.limit.optimize.enable=false;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
> ----------------------------------------------------------------------------------------------
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> {code}
[jira] [Work logged] (HIVE-24120) Plugin for external DatabaseProduct in standalone HMS
    [ https://issues.apache.org/jira/browse/HIVE-24120?focusedWorklogId=498767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498767 ]

ASF GitHub Bot logged work on HIVE-24120:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:26
            Start Date: 09/Oct/20 22:26
    Worklog Time Spent: 10m

Work Description: gatorblue commented on a change in pull request #1470:
URL: https://github.com/apache/hive/pull/1470#discussion_r502691897

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
## @@ -20,71 +20,666 @@
 import java.sql.SQLException;
 import java.sql.SQLTransactionRollbackException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;

-/** Database product infered via JDBC. */
-public enum DatabaseProduct {
-  DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, OTHER;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import com.google.common.base.Preconditions;
+
+/** Database product inferred via JDBC. Encapsulates all SQL logic associated with
+ * the database product.
+ * This class is a singleton, which is instantiated the first time
+ * method determineDatabaseProduct is invoked.
+ * Tests that need to create multiple instances can use the reset method
+ * */
+public class DatabaseProduct implements Configurable {
+  static final private Logger LOG = LoggerFactory.getLogger(DatabaseProduct.class.getName());
+
+  private static enum DbType {DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, CUSTOM, UNDEFINED};
+  public DbType dbType;
+
+  // Singleton instance
+  private static DatabaseProduct theDatabaseProduct;
+
+  Configuration myConf;
+  /**
+   * Protected constructor for singleton class
+   * @param id
+   */
+  protected DatabaseProduct() {}
+
+  public static final String DERBY_NAME = "derby";
+  public static final String SQL_SERVER_NAME = "microsoft sql server";
+  public static final String MYSQL_NAME = "mysql";
+  public static final String POSTGRESQL_NAME = "postgresql";
+  public static final String ORACLE_NAME = "oracle";
+  public static final String UNDEFINED_NAME = "other";
+
   /**
    * Determine the database product type
    * @param productName string to defer database connection
    * @return database product type
    */
-  public static DatabaseProduct determineDatabaseProduct(String productName) throws SQLException {
-    if (productName == null) {
-      return OTHER;
+  public static DatabaseProduct determineDatabaseProduct(String productName, Configuration c) {
+    DbType dbt;
+
+    if (theDatabaseProduct != null) {
+      Preconditions.checkState(theDatabaseProduct.dbType == getDbType(productName));
+      return theDatabaseProduct;
     }
+
+    // This method may be invoked by concurrent connections
+    synchronized (DatabaseProduct.class) {
+
+      if (productName == null) {
+        productName = UNDEFINED_NAME;
+      }
+
+      dbt = getDbType(productName);
+
+      // Check for null again in case of race condition
+      if (theDatabaseProduct == null) {
+        final Configuration conf = c != null ? c : MetastoreConf.newMetastoreConf();
+        // Check if we are using an external database product
+        boolean isExternal = MetastoreConf.getBoolVar(conf, ConfVars.USE_CUSTOM_RDBMS);
+
+        if (isExternal) {
+          // The DatabaseProduct will be created by instantiating an external class via
+          // reflection. The external class can override any method in the current class
+          String className = MetastoreConf.getVar(conf, ConfVars.CUSTOM_RDBMS_CLASSNAME);
+
+          if (className != null) {
+            try {
+              theDatabaseProduct = (DatabaseProduct)
+                  ReflectionUtils.newInstance(Class.forName(className), conf);
+
+              LOG.info(String.format("Using custom RDBMS %s. Overriding DbType: %s", className, dbt));
+              dbt = DbType.CUSTOM;
+            } catch (Exception e) {
+              LOG.warn("Caught exception instantiating custom database product. Reverting to " + dbt, e);
+            }
+          } else {
+            LOG.warn("Unexpected: metastore.use.custom.database.product was set, " +
+                "but metastore.custom.database.product.classname was not. Reverting to " + dbt);
+          }
+        }
+
+        if (theDatabaseProduct == null) {
+          theDatabaseProduct = new DatabaseProduct(); Re
[jira] [Work logged] (HIVE-24120) Plugin for external DatabaseProduct in standalone HMS
    [ https://issues.apache.org/jira/browse/HIVE-24120?focusedWorklogId=498761&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498761 ]

ASF GitHub Bot logged work on HIVE-24120:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:19
            Start Date: 09/Oct/20 22:19
    Worklog Time Spent: 10m

Work Description: gatorblue commented on a change in pull request #1470:
URL: https://github.com/apache/hive/pull/1470#discussion_r502689641

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
## @@ -20,71 +20,666 @@
 import java.sql.SQLException;
 import java.sql.SQLTransactionRollbackException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;

-/** Database product infered via JDBC. */
-public enum DatabaseProduct {
-  DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, OTHER;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import com.google.common.base.Preconditions;
+
+/** Database product inferred via JDBC. Encapsulates all SQL logic associated with
+ * the database product.
+ * This class is a singleton, which is instantiated the first time
+ * method determineDatabaseProduct is invoked.
+ * Tests that need to create multiple instances can use the reset method
+ * */
+public class DatabaseProduct implements Configurable {
+  static final private Logger LOG = LoggerFactory.getLogger(DatabaseProduct.class.getName());
+
+  private static enum DbType {DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, CUSTOM, UNDEFINED};
+  public DbType dbType;
+
+  // Singleton instance
+  private static DatabaseProduct theDatabaseProduct;
+
+  Configuration myConf;
+  /**
+   * Protected constructor for singleton class
+   * @param id
+   */
+  protected DatabaseProduct() {}
+
+  public static final String DERBY_NAME = "derby";
+  public static final String SQL_SERVER_NAME = "microsoft sql server";
+  public static final String MYSQL_NAME = "mysql";
+  public static final String POSTGRESQL_NAME = "postgresql";
+  public static final String ORACLE_NAME = "oracle";
+  public static final String UNDEFINED_NAME = "other";
+
   /**
    * Determine the database product type
    * @param productName string to defer database connection
    * @return database product type
    */
-  public static DatabaseProduct determineDatabaseProduct(String productName) throws SQLException {
-    if (productName == null) {
-      return OTHER;
+  public static DatabaseProduct determineDatabaseProduct(String productName, Configuration c) {
+    DbType dbt;
+
+    if (theDatabaseProduct != null) {
+      Preconditions.checkState(theDatabaseProduct.dbType == getDbType(productName));
+      return theDatabaseProduct;
     }
+
+    // This method may be invoked by concurrent connections
+    synchronized (DatabaseProduct.class) {
+
+      if (productName == null) {
+        productName = UNDEFINED_NAME;
+      }
+
+      dbt = getDbType(productName);
+
+      // Check for null again in case of race condition
+      if (theDatabaseProduct == null) {
+        final Configuration conf = c != null ? c : MetastoreConf.newMetastoreConf();
+        // Check if we are using an external database product
+        boolean isExternal = MetastoreConf.getBoolVar(conf, ConfVars.USE_CUSTOM_RDBMS);
+
+        if (isExternal) {
+          // The DatabaseProduct will be created by instantiating an external class via
+          // reflection. The external class can override any method in the current class
+          String className = MetastoreConf.getVar(conf, ConfVars.CUSTOM_RDBMS_CLASSNAME);
+
+          if (className != null) {
+            try {
+              theDatabaseProduct = (DatabaseProduct)
+                  ReflectionUtils.newInstance(Class.forName(className), conf);
+
+              LOG.info(String.format("Using custom RDBMS %s. Overriding DbType: %s", className, dbt));
+              dbt = DbType.CUSTOM;
+            } catch (Exception e) {
+              LOG.warn("Caught exception instantiating custom database product. Reverting to " + dbt, e);

Review comment:
   I changed the code to throw a RuntimeException instead. This method is called in a few places where it's not clear what to do with a regular Exception. Hope this makes sense.
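The hunk quoted above combines two patterns: a double-checked-locking singleton and an override class loaded by reflection, with the review settling on throwing a RuntimeException when the custom class cannot be instantiated. A stripped-down, self-contained sketch of that combination follows; the class and method names (ProductRegistry, quoteIdentifier, reset) are invented for illustration and are not Hive's actual API.

```java
// Minimal sketch of a lazily-created singleton that can be replaced by a
// user-supplied subclass loaded via reflection (names are illustrative).
class ProductRegistry {
  // volatile so the fast path can read the instance without locking
  private static volatile ProductRegistry instance;

  protected ProductRegistry() {}

  // Example of overridable, product-specific behavior a subclass could change.
  public String quoteIdentifier(String id) {
    return "\"" + id + "\"";
  }

  public static ProductRegistry get(String customClassName) {
    ProductRegistry local = instance;
    if (local != null) {
      return local;          // fast path: already created
    }
    // Slow path: concurrent callers race to here; exactly one creates it.
    synchronized (ProductRegistry.class) {
      if (instance == null) {
        if (customClassName != null) {
          try {
            instance = (ProductRegistry) Class.forName(customClassName)
                .getDeclaredConstructor().newInstance();
          } catch (ReflectiveOperationException e) {
            // Fail loudly rather than silently reverting to the default,
            // mirroring the review outcome above.
            throw new RuntimeException(
                "Cannot instantiate custom product class " + customClassName, e);
          }
        } else {
          instance = new ProductRegistry();
        }
      }
      return instance;
    }
  }

  // Visible for tests, mirroring the "reset" method mentioned in the javadoc.
  static void reset() {
    instance = null;
  }
}
```

A caller would obtain the shared instance with `ProductRegistry.get(null)` (or pass a class name read from configuration); repeated calls return the same object.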
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-24255:
------------------------------
    Description:
{code:java}
CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
{code}
==> Wrong Result <==
{code:java}
set hive.limit.optimize.enable=true;
select * from dbs limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
{code}
==> Correct Result <==
{code:java}
set hive.limit.optimize.enable=false;
select * from dbs limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
{code}

  was:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
++--
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-24255:
------------------------------
    Description:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
{code}

  was:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
++---
[jira] [Assigned] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
[ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R reassigned HIVE-24255: - > StorageHandler with select-limit query is returning 0 rows > -- > > Key: HIVE-24255 > URL: https://issues.apache.org/jira/browse/HIVE-24255 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > > > {code:java} > CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name > string, owner_name string, owner_type string) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT > `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`'); > ==> Wrong Result <== > set hive.limit.optimize.enable=true; > select * from test_table limit 1; > -- > VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > -- > Map 1 .. container SUCCEEDED 0 0 0 0 0 0 > -- > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 0.91 s > -- > ++--+---+-+-+ > | dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | > dbs.owner_type | > ++--+---+-+-+ > ++--+---+-+-+ > ==> Correct Result <== > set hive.limit.optimize.enable=false; > select * from test_table limit 1; > -- > VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > -- > Map 1 .. container SUCCEEDED 1 1 0 0 0 0 > -- > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 4.11 s > -- > +++---+-+-+ > | dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | > dbs.owner_type | > +++---+-+-+ > | 1 | hdfs://abcd:8020/warehouse/tablespace/managed/hive | default | public | > ROLE | > {code} > +++---+-+-+ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211384#comment-17211384 ] Cole Mackenzie commented on HIVE-23700: --- I am also receiving this error with the following dependencies when building a fat jar. {code:java} <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-sql_2.12</artifactId> <version>3.0.1</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-hive_2.12</artifactId> <version>3.0.1</version> </dependency> {code} Stacktrace: {code:java} Caused by: java.lang.IllegalArgumentException: URI is not hierarchical at java.io.File.<init>(File.java:420) at org.apache.hadoop.hive.conf.HiveConf.findConfigFile(HiveConf.java:176) at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:145) ... 58 common frames omitted {code} > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 2h > Remaining Estimate: 118h > > HiveConf static initialization fails when the jar URI is opaque, for example > when it is embedded as a fat jar in a Spring Boot application. Initialization > of the HiveConf static block then fails and the HiveConf class does not get > classloaded. The opaque URI in my case looks like this: > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle the `IllegalArgumentException` > thrown when the jar `URI` is passed to `File`. > To surface this issue, three conditions need to be met: > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
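The "URI is not hierarchical" error above comes from the `java.io.File(URI)` constructor, which rejects any opaque (non-hierarchical) URI before it even checks the scheme — and the nested `jar:file:...!/...!/` URIs produced by fat-jar classloaders are opaque by definition. A minimal, self-contained sketch of the defensive handling the report asks for (the helper name `toFileOrNull` is illustrative, not Hive's actual API):

```java
import java.io.File;
import java.net.URI;

public class OpaqueUriDemo {
    // Hypothetical helper mirroring the proposed fix: return null instead of
    // propagating IllegalArgumentException for URIs File(URI) cannot handle.
    public static File toFileOrNull(URI uri) {
        try {
            return new File(uri);
        } catch (IllegalArgumentException e) {
            // Opaque URIs (e.g. nested jar: URIs inside a fat jar) are not
            // hierarchical, so the File(URI) constructor rejects them.
            return null;
        }
    }

    public static void main(String[] args) {
        URI opaque = URI.create(
            "jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/");
        URI hierarchical = URI.create("file:/etc/hive/conf/hive-site.xml");

        System.out.println(toFileOrNull(opaque));        // null
        System.out.println(toFileOrNull(hierarchical));  // /etc/hive/conf/hive-site.xml
    }
}
```

With a guard like this, a config-file search could simply skip an opaque classpath entry and fall through to the next search location (HIVE_CONF_DIR, HIVE_HOME) instead of failing the whole static initialization.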
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24254: -- Labels: pull-request-available (was: ) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24254.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?focusedWorklogId=498713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498713 ] ASF GitHub Bot logged work on HIVE-24254: - Author: ASF GitHub Bot Created on: 09/Oct/20 19:32 Start Date: 09/Oct/20 19:32 Worklog Time Spent: 10m Work Description: aasha opened a new pull request #1567: URL: https://github.com/apache/hive/pull/1567 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498713) Remaining Estimate: 0h Time Spent: 10m > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: HIVE-24254.01.patch Status: Patch Available (was: In Progress) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Status: In Progress (was: Patch Available) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: (was: HIVE-24254.01.patch) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: HIVE-24254.01.patch Status: Patch Available (was: In Progress) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24254 started by Aasha Medhi. -- > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24208) LLAP: query job stuck due to race conditions
[ https://issues.apache.org/jira/browse/HIVE-24208?focusedWorklogId=498668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498668 ] ASF GitHub Bot logged work on HIVE-24208: - Author: ASF GitHub Bot Created on: 09/Oct/20 17:09 Start Date: 09/Oct/20 17:09 Worklog Time Spent: 10m Work Description: bymm commented on pull request #1534: URL: https://github.com/apache/hive/pull/1534#issuecomment-706298999 Unittest is added. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498668) Time Spent: 0.5h (was: 20m) > LLAP: query job stuck due to race conditions > > > Key: HIVE-24208 > URL: https://issues.apache.org/jira/browse/HIVE-24208 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.4 >Reporter: Yuriy Baltovskyy >Assignee: Yuriy Baltovskyy >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When issuing an LLAP query, sometimes the TEZ job on LLAP server never ends > and it never returns the data reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24246: -- Labels: pull-request-available (was: ) > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24246.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24246: --- Attachment: HIVE-24246.01.patch Status: Patch Available (was: In Progress) > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24246 started by Aasha Medhi. -- > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?focusedWorklogId=498626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498626 ] ASF GitHub Bot logged work on HIVE-24246: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:41 Start Date: 09/Oct/20 15:41 Worklog Time Spent: 10m Work Description: aasha opened a new pull request #1566: URL: https://github.com/apache/hive/pull/1566 …esource name ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498626) Remaining Estimate: 0h Time Spent: 10m > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures
[ https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=498624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498624 ] ASF GitHub Bot logged work on HIVE-24217: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:33 Start Date: 09/Oct/20 15:33 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1542: URL: https://github.com/apache/hive/pull/1542#discussion_r502512642 ## File path: standalone-metastore/metastore-server/src/main/resources/package.jdo ## @@ -1549,6 +1549,83 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Review comment: > I'm not sure if they never participate in a query. If one wants to discover the stored procedures which are currently stored in a DB and find out on what data they operate they would need to do some clumsy string manipulations on the signature. I believe you are thinking about `information_schema` stuff - its not set in stone that we have to get all that data from the metastore db - for this case we might add a few UDFs parameter info into an array or something ; so we will still store simple things in the metastore - but we could transform it into more readable in. > Considering that other DB engines also store these information separately I would like to keep it as it is for now and see how it works in practice. Later on when we have multi language support we can revisit this issue. yes it might be..but it would be better to revisit stuff like this if its really needed; and not after we have introduced "something" which later we should care for even if we don't want to I still think there will be no real benefit of "storing it" in a decomposed manner - it will be harder to go forward in case stuff changes - and right now will not use it for anything ; so let's remove it..and add it only if there is a real need for it. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498624) Time Spent: 1h 40m (was: 1.5h) > HMS storage backend for HPL/SQL stored procedures > - > > Key: HIVE-24217 > URL: https://issues.apache.org/jira/browse/HIVE-24217 > Project: Hive > Issue Type: Bug > Components: Hive, hpl/sql, Metastore >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > Labels: pull-request-available > Attachments: HPL_SQL storedproc HMS storage.pdf > > Time Spent: 1h 40m > Remaining Estimate: 0h > > HPL/SQL procedures are currently stored in text files. The goal of this Jira > is to implement a Metastore backend for storing and loading these procedures. > This is an incremental step towards having fully capable stored procedures in > Hive. > > See the attached design for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211074#comment-17211074 ] Denys Kuzmenko commented on HIVE-24211: --- Merged to master. Thank you for the review, [~pvarga] and [~pvary]! > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-24211. --- Resolution: Fixed > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?focusedWorklogId=498609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498609 ] ASF GitHub Bot logged work on HIVE-24211: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:08 Start Date: 09/Oct/20 15:08 Worklog Time Spent: 10m Work Description: deniskuzZ merged pull request #1533: URL: https://github.com/apache/hive/pull/1533 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498609) Time Spent: 50m (was: 40m) > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
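The partition-level problem described in HIVE-24211 can be illustrated with a toy model (all names here are hypothetical — this is not Hive's actual TxnHandler code): when committed writeIds are tracked per table only, a commit to *any* partition invalidates every concurrent writer's snapshot of that table, even when the writes never touch the same partition.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotConflictSketch {
    // Toy model: committed writeIds keyed by table name only, with no
    // partition information — mirroring the limitation in the description.
    public static final Map<String, Long> committedWriteIds = new HashMap<>();

    public static long snapshot(String table) {
        return committedWriteIds.getOrDefault(table, 0L);
    }

    public static void commitWrite(String table) {
        committedWriteIds.merge(table, 1L, Long::sum);
    }

    public static boolean snapshotStillValid(String table, long snapshotWriteId) {
        return committedWriteIds.getOrDefault(table, 0L) == snapshotWriteId;
    }

    public static void main(String[] args) {
        // Txn A records its snapshot before acquiring locks.
        long txnASnapshot = snapshot("t");
        // Txn B writes a *different* partition of t and commits first.
        commitWrite("t");
        // Txn A's snapshot now looks obsolete even though the two writes
        // were on disjoint partitions — a false conflict.
        System.out.println(snapshotStillValid("t", txnASnapshot)); // false
    }
}
```

This is why the fix replaces the snapshot-invalidation check with a WriteSet comparison, which does carry enough information to distinguish disjoint writes.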
[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0
[ https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211061#comment-17211061 ] Chinna Rao Lalam commented on HIVE-21737: - Hi [~iemejia], Verified this patch and found these 2 test failures with the below exception {quote}avro_deserialize_map_null.q parquet_map_null.q {quote} {quote}Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Invalid default for field avreau_col_1: null not a [] {quote} It looks like these exceptions are due to a backward-compatibility break in the new Avro version. https://issues.apache.org/jira/browse/AVRO-2817 We tried setting *Schema.Parser.setValidateDefaults(false)* to turn off default validation, e.g. in org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils#getSchemaFor(java.io.File), but it did not work. [~iemejia] any idea/workaround for this issue? > Upgrade Avro to version 1.10.0 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Avro >= 1.9.x brings a lot of fixes, including a leaner version of Avro without > Jackson in the public API and without Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-24236. - Fix Version/s: 4.0.0 Resolution: Fixed > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) > ... 
29 more > ) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > {noformat} > and > {noformat} > Caused by: java.util.NoSuchElementException: Timeout waiting for idle object > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) > at
[jira] [Work started] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24236 started by Yongzhi Chen. --- > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) > ... 
29 more > ) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > {noformat} > and > {noformat} > Caused by: java.util.NoSuchElementException: Timeout waiting for idle object > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(Po
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498535 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:15 Start Date: 09/Oct/20 14:15 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706119583 @kasakrisz @kgyrtkirk The flaky test runs successfully: http://ci.hive.apache.org/job/hive-flaky-check/126/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498535) Time Spent: 1h 20m (was: 1h 10m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi reassigned HIVE-24254: -- > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498499 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:12 Start Date: 09/Oct/20 14:12 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1548: URL: https://github.com/apache/hive/pull/1548#discussion_r501713463 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java ## @@ -853,6 +857,273 @@ public void majorCompactAfterAbort() throws Exception { Lists.newArrayList(5, 6), 1); } + @Test + public void testCleanAbortCompactAfterAbortTwoPartitions() throws Exception { +String dbName = "default"; +String tblName = "cws"; + +HiveStreamingConnection connection1 = prepareTableTwoPartitionsAndConnection(dbName, tblName, 1); +HiveStreamingConnection connection2 = prepareTableTwoPartitionsAndConnection(dbName, tblName, 1); + +connection1.beginTransaction(); +connection1.write("1,1".getBytes()); +connection1.write("2,2".getBytes()); +connection1.abortTransaction(); + +connection2.beginTransaction(); +connection2.write("1,3".getBytes()); +connection2.write("2,3".getBytes()); +connection2.write("3,3".getBytes()); +connection2.abortTransaction(); + +assertAndCompactCleanAbort(dbName, tblName); + +connection1.close(); +connection2.close(); + } + + @Test + public void testCleanAbortCompactAfterAbort() throws Exception { +String dbName = "default"; +String tblName = "cws"; + +// Create three folders with two different transactions +HiveStreamingConnection connection1 = prepareTableAndConnection(dbName, tblName, 1); +HiveStreamingConnection connection2 = prepareTableAndConnection(dbName, tblName, 1); + +connection1.beginTransaction(); +connection1.write("1,1".getBytes()); +connection1.write("2,2".getBytes()); +connection1.abortTransaction(); + +connection2.beginTransaction(); 
+connection2.write("1,3".getBytes()); +connection2.write("2,3".getBytes()); +connection2.write("3,3".getBytes()); +connection2.abortTransaction(); + +assertAndCompactCleanAbort(dbName, tblName); + +connection1.close(); +connection2.close(); + } + + private void assertAndCompactCleanAbort(String dbName, String tblName) throws Exception { +IMetaStoreClient msClient = new HiveMetaStoreClient(conf); +TxnStore txnHandler = TxnUtils.getTxnStore(conf); +Table table = msClient.getTable(dbName, tblName); +FileSystem fs = FileSystem.get(conf); +FileStatus[] stat = +fs.listStatus(new Path(table.getSd().getLocation())); +if (3 != stat.length) { + Assert.fail("Expecting three directories corresponding to three partitions, FileStatus[] stat " + Arrays.toString(stat)); +} + +int count = TxnDbUtil.countQueryAgent(conf, "select count(*) from TXN_COMPONENTS where TC_OPERATION_TYPE='p'"); +// We should have two rows corresponding to the two aborted transactions +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from TXN_COMPONENTS"), 2, count); + +runInitiator(conf); +count = TxnDbUtil.countQueryAgent(conf, "select count(*) from COMPACTION_QUEUE where CQ_TYPE='p'"); +// Only one job is added to the queue per table. This job corresponds to all the entries for a particular table +// with rows in TXN_COMPONENTS +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from COMPACTION_QUEUE"), 1, count); + +ShowCompactResponse rsp = txnHandler.showCompact(new ShowCompactRequest()); +Assert.assertEquals(1, rsp.getCompacts().size()); +Assert.assertEquals(TxnStore.CLEANING_RESPONSE, rsp.getCompacts().get(0).getState()); +Assert.assertEquals("cws", rsp.getCompacts().get(0).getTablename()); +Assert.assertEquals(CompactionType.CLEAN_ABORTED, +rsp.getCompacts().get(0).getType()); + +runCleaner(conf); + +// After the cleaner runs TXN_COMPONENTS and COMPACTION_QUEUE should have zero rows, also the folders should have been deleted. 
+count = TxnDbUtil.countQueryAgent(conf, "select count(*) from TXN_COMPONENTS"); +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from TXN_COMPONENTS"), 0, count); + +count = TxnDbUtil.countQueryAgent(conf, "select count(*) from COMPACTION_QUEUE"); +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from COMPACTION_QUEUE"), 0, count); + +RemoteIterator<LocatedFileStatus> it = +fs.listFiles(new Path(table.getSd().getLocation()), true); +if (it.hasNext()) { + Assert.fail("Expecting compaction to have cleaned the directories, FileStatus[] stat " + Arrays.toString(stat)); Review comment: I think this assert is quite misleading
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498473 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:10 Start Date: 09/Oct/20 14:10 Worklog Time Spent: 10m Work Description: pkumarsinha commented on a change in pull request #1563: URL: https://github.com/apache/hive/pull/1563#discussion_r502202414 ## File path: ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestAtlasDumpTask.java ## @@ -96,4 +105,13 @@ public void testAtlasDumpMetrics() throws Exception { Assert.assertTrue(eventDetailsCaptor .getAllValues().get(1).toString().contains("{\"dbName\":\"srcDB\",\"dumpEndTime\"")); } + + @Test + public void testAtlasRestClientBuilder() throws SemanticException, IOException { +mockStatic(UserGroupInformation.class); + when(UserGroupInformation.getLoginUser()).thenReturn(mock(UserGroupInformation.class)); +AtlasRestClientBuilder atlasRestClientBuilder = new AtlasRestClientBuilder("http://localhost:31000"); +AtlasRestClient atlasClient = atlasRestClientBuilder.getClient(conf); +Assert.assertTrue(atlasClient != null); Review comment: HiveConf is mocked, so hive in test is not present (so false). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498473) Time Spent: 1h (was: 50m) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=498480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498480 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:10 Start Date: 09/Oct/20 14:10 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502273628 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: I think this error message and code should be placed in the `ErrorMsg` class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498480) Time Spent: 1h (was: 50m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > If a HiveStatement is run asynchronously as a task, like in a thread or future, > and we interrupt the task, the HiveStatement would continue to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24241) Enable SharedWorkOptimizer to merge downstream operators after an optimization step
[ https://issues.apache.org/jira/browse/HIVE-24241?focusedWorklogId=498437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498437 ] ASF GitHub Bot logged work on HIVE-24241: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:07 Start Date: 09/Oct/20 14:07 Worklog Time Spent: 10m Work Description: kgyrtkirk opened a new pull request #1562: URL: https://github.com/apache/hive/pull/1562 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498437) Time Spent: 20m (was: 10m) > Enable SharedWorkOptimizer to merge downstream operators after an > optimization step > --- > > Key: HIVE-24241 > URL: https://issues.apache.org/jira/browse/HIVE-24241 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24224) Fix skipping header/footer for Hive on Tez on compressed files
[ https://issues.apache.org/jira/browse/HIVE-24224?focusedWorklogId=498381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498381 ] ASF GitHub Bot logged work on HIVE-24224: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:02 Start Date: 09/Oct/20 14:02 Worklog Time Spent: 10m Work Description: kgyrtkirk closed pull request #1546: URL: https://github.com/apache/hive/pull/1546 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498381) Time Spent: 50m (was: 40m) > Fix skipping header/footer for Hive on Tez on compressed files > -- > > Key: HIVE-24224 > URL: https://issues.apache.org/jira/browse/HIVE-24224 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Compressed file with Hive on Tez returns header and footers - for both > select * and select count ( * ): > {noformat} > printf "offset,id,other\n9,\"20200315 X00 1356\",123\n17,\"20200315 X00 > 1357\",123\nrst,rst,rst" > data.csv > hdfs dfs -put -f data.csv /apps/hive/warehouse/bz2test/bz2tbl1/ > bzip2 -f data.csv > hdfs dfs -put -f data.csv.bz2 /apps/hive/warehouse/bz2test/bz2tbl2/ > beeline -e "CREATE EXTERNAL TABLE default.bz2tst2 ( > sequence int, > id string, > other string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '/apps/hive/warehouse/bz2test/bz2tbl2' > TBLPROPERTIES ( > 'skip.header.line.count'='1', > 'skip.footer.line.count'='1');" > beeline -e " > SET hive.fetch.task.conversion = none; > SELECT * FROM default.bz2tst2;"
> +-------------------+---------------------+----------------+
> | bz2tst2.sequence  | bz2tst2.id          | bz2tst2.other  |
> +-------------------+---------------------+----------------+
> | offset            | id                  | other          |
> | 9                 | 20200315 X00 1356   | 123            |
> | 17                | 20200315 X00 1357   | 123            |
> | rst               | rst                 | rst            |
> +-------------------+---------------------+----------------+
> {noformat} > PS: HIVE-22769 addressed the issue for Hive on LLAP. -- This message was sent by Atlassian Jira (v8.3.4#803005)
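The repro above comes down to how `skip.header.line.count`/`skip.footer.line.count` behave on a single-split input: a bzip2-compressed file is read as one split, so the reader itself must drop the first and last lines. The intended semantics can be sketched in isolation (this is an illustrative model, not Hive's actual record-reader code):

```java
import java.util.ArrayList;
import java.util.List;

class HeaderFooterSkip {
    // Illustrative model of skip.header.line.count=H / skip.footer.line.count=F
    // applied to the rows of a single split: keep rows[H .. size-F).
    static List<String> skip(List<String> rows, int header, int footer) {
        if (rows.size() <= header + footer) {
            return List.of(); // nothing survives the header+footer trim
        }
        return new ArrayList<>(rows.subList(header, rows.size() - footer));
    }
}
```

With the four lines produced by the `printf` in the repro and header=footer=1, only the two data rows should survive; the bug report shows the compressed table returning all four instead.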
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498364 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:01 Start Date: 09/Oct/20 14:01 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1563: URL: https://github.com/apache/hive/pull/1563#discussion_r502163365 ## File path: ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestAtlasDumpTask.java ## @@ -96,4 +105,13 @@ public void testAtlasDumpMetrics() throws Exception { Assert.assertTrue(eventDetailsCaptor .getAllValues().get(1).toString().contains("{\"dbName\":\"srcDB\",\"dumpEndTime\"")); } + + @Test + public void testAtlasRestClientBuilder() throws SemanticException, IOException { +mockStatic(UserGroupInformation.class); + when(UserGroupInformation.getLoginUser()).thenReturn(mock(UserGroupInformation.class)); +AtlasRestClientBuilder atlasRestClientBuilder = new AtlasRestClientBuilder("http://localhost:31000"); +AtlasRestClient atlasClient = atlasRestClientBuilder.getClient(conf); +Assert.assertTrue(atlasClient != null); Review comment: hive in test repl is set to true. It will return a No Op Client which will never be null. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498364) Time Spent: 50m (was: 40m) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=498292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498292 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:55 Start Date: 09/Oct/20 13:55 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1271: URL: https://github.com/apache/hive/pull/1271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498292) Time Spent: 5.5h (was: 5h 20m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ). While dropping a partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ), which is incompatible with the deserialization happening in > PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > ), hence the query fails with "Failed to deserialize the expression".
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498282 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:54 Start Date: 09/Oct/20 13:54 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1565: URL: https://github.com/apache/hive/pull/1565 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498282) Time Spent: 1h 10m (was: 1h) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=498239&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498239 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:51 Start Date: 09/Oct/20 13:51 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502403149 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: Done, thank you very much! @kgyrtkirk This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498239) Time Spent: 50m (was: 40m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > If a HiveStatement is run asynchronously as a task, like in a thread or future, > and we interrupt the task, the HiveStatement would continue to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
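The patch quoted in the diff above adds an interrupt check at the top of the polling loop in HiveStatement.waitForOperationToComplete(). A stripped-down sketch of the same pattern (not the real HiveStatement — the status supplier stands in for the Thrift status call; SQLSTATE "70100" is the code used in the PR):

```java
import java.sql.SQLException;
import java.util.function.IntSupplier;

class InterruptiblePoller {
    // Poll until statusCheck reports completion (returns 0), but abort with
    // SQLSTATE 70100 as soon as the calling thread's interrupt flag is set,
    // instead of spinning until the server-side operation finishes.
    static int pollUntilComplete(IntSupplier statusCheck) throws SQLException {
        int polls = 0;
        do {
            if (Thread.currentThread().isInterrupted()) {
                throw new SQLException("Interrupted while polling on the operation status", "70100");
            }
            polls++;
        } while (statusCheck.getAsInt() != 0); // 0 = operation complete
        return polls;
    }
}
```

The key design point is that the check reads the interrupt *flag* without clearing it, so a caller that cancels the task can still observe the interruption after the SQLException propagates.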
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498199 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:48 Start Date: 09/Oct/20 13:48 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #1548: URL: https://github.com/apache/hive/pull/1548#issuecomment-705545322 looks like master is broken right now This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498199) Time Spent: 10h 40m (was: 10.5h) > Make sure transactions get cleaned if they are aborted before addPartitions > is called > - > > Key: HIVE-21052 > URL: https://issues.apache.org/jira/browse/HIVE-21052 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0, 3.1.1 >Reporter: Jaume M >Assignee: Jaume M >Priority: Critical > Labels: pull-request-available > Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch, > HIVE-21052.10.patch, HIVE-21052.11.patch, HIVE-21052.12.patch, > HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch, > HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch, > HIVE-21052.8.patch, HIVE-21052.9.patch > > Time Spent: 10h 40m > Remaining Estimate: 0h > > If the transaction is aborted between openTxn and addPartitions and data has > been written on the table the transaction manager will think it's an empty > transaction and no cleaning will be done. > This is currently an issue in the streaming API and in micromanaged tables. 
> As proposed by [~ekoifman] this can be solved by: > * Writing an entry with a special marker to TXN_COMPONENTS at openTxn and > when addPartitions is called remove this entry from TXN_COMPONENTS and add > the corresponding partition entry to TXN_COMPONENTS. > * If the cleaner finds an entry with a special marker in TXN_COMPONENTS that > specifies that a transaction was opened and it was aborted it must generate > jobs for the worker for every possible partition available. > cc [~ewohlstadter] -- This message was sent by Atlassian Jira (v8.3.4#803005)
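The marker-row lifecycle proposed above can be sketched as a toy in-memory model. This is not Hive code — TXN_COMPONENTS is really a metastore table, and the marker value here is hypothetical — but it shows the three states: marker written at openTxn, marker swapped for partition rows at addPartitions, and a surviving marker signalling an abort-before-addPartitions that needs a full clean.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TxnComponentsSketch {
    static final String MARKER = "_ENTIRE_TABLE_"; // hypothetical marker value
    final Map<Long, List<String>> txnComponents = new HashMap<>();

    // openTxn immediately records the special marker row for the txn.
    void openTxn(long txnId) {
        txnComponents.put(txnId, new ArrayList<>(List.of(MARKER)));
    }

    // addPartitions swaps the marker for the real partition rows.
    void addPartitions(long txnId, List<String> partitions) {
        List<String> rows = txnComponents.get(txnId);
        rows.remove(MARKER);
        rows.addAll(partitions);
    }

    // Cleaner's view: a surviving marker on an aborted txn means the abort
    // happened before addPartitions, so every possible partition must be cleaned.
    boolean needsFullTableClean(long abortedTxnId) {
        return txnComponents.getOrDefault(abortedTxnId, List.of()).contains(MARKER);
    }
}
```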
[jira] [Work logged] (HIVE-21611) Date.getTime() can be changed to System.currentTimeMillis()
[ https://issues.apache.org/jira/browse/HIVE-21611?focusedWorklogId=498188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498188 ] ASF GitHub Bot logged work on HIVE-21611: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:47 Start Date: 09/Oct/20 13:47 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1334: URL: https://github.com/apache/hive/pull/1334 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498188) Time Spent: 2.5h (was: 2h 20m) > Date.getTime() can be changed to System.currentTimeMillis() > --- > > Key: HIVE-21611 > URL: https://issues.apache.org/jira/browse/HIVE-21611 > Project: Hive > Issue Type: Bug >Reporter: bd2019us >Assignee: Hunter Logan >Priority: Major > Labels: pull-request-available > Attachments: 1.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > Hello, > I found that System.currentTimeMillis() can be used here instead of new > Date().getTime(). > Since new Date() is a thin wrapper around the lightweight method > System.currentTimeMillis(), performance will be greatly damaged if it is > invoked too many times. > According to my local testing in the same environment, > System.currentTimeMillis() can achieve a speedup of up to 5x (435 ms vs 2073 > ms) when these two methods are invoked 5,000,000 times. -- This message was sent by Atlassian Jira (v8.3.4#803005)
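The equivalence claimed above is easy to check: java.util.Date's no-arg constructor just captures System.currentTimeMillis(), so the two expressions read the same clock and differ only by one short-lived allocation per call. A minimal illustration:

```java
import java.util.Date;

class TimeMillisDemo {
    // Allocates a Date only to read the timestamp it captured at construction.
    static long viaDate() {
        return new Date().getTime();
    }

    // Reads the same clock directly, with no per-call allocation.
    static long direct() {
        return System.currentTimeMillis();
    }
}
```

Both return the current epoch milliseconds; in a hot loop the direct form avoids allocator and GC pressure, which is the whole substance of the ticket.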
[jira] [Assigned] (HIVE-24253) HMS needs to support keystore/truststores types besides JKS
[ https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen reassigned HIVE-24253: --- > HMS needs to support keystore/truststores types besides JKS > --- > > Key: HIVE-24253 > URL: https://issues.apache.org/jira/browse/HIVE-24253 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > > When HiveMetaStoreClient connects to HMS with enabled SSL, HMS should support > the default keystore type specified for the JDK and not always use JKS. Same > as HIVE-23958 for hive, HMS should support to set additional > keystore/truststore types used for different applications like for FIPS > crypto algorithms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24225) FIX S3A recordReader policy selection
[ https://issues.apache.org/jira/browse/HIVE-24225?focusedWorklogId=498058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498058 ] ASF GitHub Bot logged work on HIVE-24225: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:36 Start Date: 09/Oct/20 13:36 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #1547: URL: https://github.com/apache/hive/pull/1547#issuecomment-705446922 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498058) Time Spent: 1.5h (was: 1h 20m) > FIX S3A recordReader policy selection > - > > Key: HIVE-24225 > URL: https://issues.apache.org/jira/browse/HIVE-24225 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Dynamic S3A recordReader policy selection can cause issues on lazy > initialized FS objects -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24203) Implement stats annotation rule for the LateralViewJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-24203?focusedWorklogId=498075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498075 ] ASF GitHub Bot logged work on HIVE-24203: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:37 Start Date: 09/Oct/20 13:37 Worklog Time Spent: 10m Work Description: okumin commented on a change in pull request #1531: URL: https://github.com/apache/hive/pull/1531#discussion_r501689451

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java

@@ -2921,6 +2920,97 @@

  /**
   * LateralViewJoinOperator changes the data size and column level statistics.
   *
   * A diagram of LATERAL VIEW.
   *
   *     [Lateral View Forward]
   *           /        \
   *      [Select]   [Select]
   *          |          |
   *          |        [UDTF]
   *           \        /
   *     [Lateral View Join]
   *
   * For each row of the source, the left branch just picks columns and the right branch processes UDTF.
   * And then LVJ joins a row from the left branch with rows from the right branch.
   * The join has one-to-many relationship since UDTF can generate multiple rows.
   *
   * This rule multiplies the stats from the left branch by T(right) / T(left) and sums up the both sides.
   */
  public static class LateralViewJoinStatsRule extends DefaultStatsRule implements SemanticNodeProcessor {
    @Override
    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
                          Object... nodeOutputs) throws SemanticException {
      final LateralViewJoinOperator lop = (LateralViewJoinOperator) nd;
      final AnnotateStatsProcCtx aspCtx = (AnnotateStatsProcCtx) procCtx;
      final HiveConf conf = aspCtx.getConf();

      if (!isAllParentsContainStatistics(lop)) {
        return null;
      }

      final List<Operator<? extends OperatorDesc>> parents = lop.getParentOperators();
      if (parents.size() != 2) {
        LOG.warn("LateralViewJoinOperator should have just two parents but actually has "
            + parents.size() + " parents.");
        return null;
      }

      final Statistics selectStats = parents.get(LateralViewJoinOperator.SELECT_TAG).getStatistics();
      final Statistics udtfStats = parents.get(LateralViewJoinOperator.UDTF_TAG).getStatistics();

      final double factor = (double) udtfStats.getNumRows() / (double) selectStats.getNumRows();
      final long selectDataSize = StatsUtils.safeMult(selectStats.getDataSize(), factor);
      final long dataSize = StatsUtils.safeAdd(selectDataSize, udtfStats.getDataSize());
      Statistics joinedStats = new Statistics(udtfStats.getNumRows(), dataSize, 0, 0);

      if (satisfyPrecondition(selectStats) && satisfyPrecondition(udtfStats)) {
        final Map<String, ExprNodeDesc> columnExprMap = lop.getColumnExprMap();
        final RowSchema schema = lop.getSchema();

        joinedStats.updateColumnStatsState(selectStats.getColumnStatsState());
        final List<ColStatistics> selectColStats = StatsUtils
            .getColStatisticsFromExprMap(conf, selectStats, columnExprMap, schema);
        joinedStats.addToColumnStats(multiplyColStats(selectColStats, factor));

        joinedStats.updateColumnStatsState(udtfStats.getColumnStatsState());
        final List<ColStatistics> udtfColStats = StatsUtils
            .getColStatisticsFromExprMap(conf, udtfStats, columnExprMap, schema);
        joinedStats.addToColumnStats(udtfColStats);

        joinedStats = applyRuntimeStats(aspCtx.getParseContext().getContext(), joinedStats, lop);
        lop.setStatistics(joinedStats);

        if (LOG.isDebugEnabled()) {
          LOG.debug("[0] STATS-" + lop.toString() + ": " + joinedStats.extendedToString());
        }

Review comment: I also agree and I did that. https://github.com/apache/hive/pull/1531/commits/d333d5d70184a1cf1f0c0f239e9229965e486202
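The stats rule discussed above multiplies the left-branch stats by T(right) / T(left) and sums both sides. A toy sketch of that arithmetic with illustrative numbers (a simplified stand-in for the real StatsRulesProcFactory code, without the safeMult/safeAdd overflow guards):

```java
public class LvjStatsSketch {
    // Mirrors the data-size arithmetic described in the review comment:
    // scale the SELECT branch by T(udtf) / T(select), then add the UDTF branch.
    static long joinedDataSize(long selectRows, long selectSize,
                               long udtfRows, long udtfSize) {
        double factor = (double) udtfRows / (double) selectRows;
        long scaledSelect = (long) (selectSize * factor);
        return scaledSelect + udtfSize;
    }

    public static void main(String[] args) {
        // 100 source rows exploded into 300 rows by the UDTF: factor = 3,
        // so the 1000-byte SELECT side is scaled to 3000, plus 600 from UDTF.
        System.out.println(LvjStatsSketch.joinedDataSize(100, 1000, 300, 600)); // prints 3600
    }
}
```

The joined row count in the real rule is simply the UDTF branch's row count, since the join is one-to-many from the SELECT side.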
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498169 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:45 Start Date: 09/Oct/20 13:45 Worklog Time Spent: 10m Work Description: kasakrisz commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706053133 I ran this test locally using current master and got the same result as in this PR. Let's wait for the flaky test run results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498169) Time Spent: 1h (was: 50m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23948) Improve Query Results Cache
[ https://issues.apache.org/jira/browse/HIVE-23948?focusedWorklogId=498163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498163 ] ASF GitHub Bot logged work on HIVE-23948: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:45 Start Date: 09/Oct/20 13:45 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1335: URL: https://github.com/apache/hive/pull/1335 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498163) Time Spent: 1h (was: 50m) > Improve Query Results Cache > --- > > Key: HIVE-23948 > URL: https://issues.apache.org/jira/browse/HIVE-23948 > Project: Hive > Issue Type: Improvement >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Creating a Jira for this github PR from before github was actively used > [https://github.com/apache/hive/pull/652] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498134 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:42 Start Date: 09/Oct/20 13:42 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1548: URL: https://github.com/apache/hive/pull/1548#discussion_r499754423

## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java

@@ -2839,6 +2848,87 @@

  public static void setNonTransactional(Map<String, String> tblProps) {
    tblProps.remove(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
  }

  /**
   * Look for delta directories matching the list of writeIds and delete them.
   * @param rootPartition root partition to look for the delta directories
   * @param conf configuration
   * @param writeIds list of writeIds to look for in the delta directories
   * @return list of deleted directories.
   * @throws IOException
   */
  public static List<Path> deleteDeltaDirectories(Path rootPartition, Configuration conf, Set<Long> writeIds)
      throws IOException {
    FileSystem fs = rootPartition.getFileSystem(conf);

    PathFilter filter = (p) -> {
      String name = p.getName();
      for (Long wId : writeIds) {
        if (name.startsWith(deltaSubdir(wId, wId)) && !name.contains("=")) {

Review comment: changed, included delete_delta as well

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java

@@ -97,9 +100,9 @@

    long minOpenTxnId = txnHandler.findMinOpenTxnIdForCleaner();
    LOG.info("Cleaning based on min open txn id: " + minOpenTxnId);
    List<CompletableFuture<Void>> cleanerList = new ArrayList<>();
    for (CompactionInfo compactionInfo : txnHandler.findReadyToClean()) {
      cleanerList.add(CompletableFuture.runAsync(CompactorUtil.ThrowingRunnable.unchecked(() ->
          clean(compactionInfo, minOpenTxnId)), cleanerExecutor));
    }

Review comment: 1. In original patch Map tableLock = new ConcurrentHashMap<>() was used to prevent a concurrent p-clean (where the whole table will be scanned). I think, that is resolved by grouping p-cleans and recording list of writeIds that needs to be removed. @vpnvishv is that correct? Also we do not allow concurrent Cleaners, their execution is mutexed. 2. was related to the following issue based on Map tableLock = new Conc
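The PathFilter in the review above matches delta directories by write id while skipping partition directories (whose names contain `=`). A simplified, name-only sketch; the zero-padded `delta_%07d_%07d` layout is an assumption about what `AcidUtils.deltaSubdir` produces, and the real filter operates on Path objects:

```java
import java.util.Set;

public class DeltaFilterSketch {
    // Assumed stand-in for AcidUtils.deltaSubdir(wId, wId): delta dirs are
    // named with zero-padded min/max write ids, e.g. delta_0000042_0000042.
    static String deltaSubdir(long writeId) {
        return String.format("delta_%07d_%07d", writeId, writeId);
    }

    // Mirrors the filter logic: accept delta dirs for the given write ids,
    // reject partition dirs (their names contain '=').
    static boolean matches(String dirName, Set<Long> writeIds) {
        if (dirName.contains("=")) {
            return false;
        }
        for (Long wId : writeIds) {
            if (dirName.startsWith(deltaSubdir(wId))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<Long> ids = Set.of(42L);
        System.out.println(matches("delta_0000042_0000042", ids)); // true
        System.out.println(matches("part=1", ids));                // false
    }
}
```

Per the review comment, the committed version also matches `delete_delta` directories, which this sketch omits.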
[jira] [Work logged] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError
[ https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=498131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498131 ] ASF GitHub Bot logged work on HIVE-23800: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:42 Start Date: 09/Oct/20 13:42 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1205: URL: https://github.com/apache/hive/pull/1205#discussion_r502158480 ## File path: ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java ## @@ -45,7 +47,50 @@ public class HookContext { static public enum HookType { -PRE_EXEC_HOOK, POST_EXEC_HOOK, ON_FAILURE_HOOK + Review comment: Checked on my test and production env, it shows that the hooks compiled for the old api can be reused without any changes with the new implementation. ## File path: ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java ## @@ -39,57 +36,27 @@ import org.apache.hadoop.hive.ql.parse.HiveSemanticAnalyzerHook; import org.apache.hadoop.hive.ql.parse.HiveSemanticAnalyzerHookContext; import org.apache.hadoop.hive.ql.session.SessionState; -import org.apache.hadoop.hive.ql.session.SessionState.LogHelper; import org.apache.hive.common.util.HiveStringUtils; +import static org.apache.hadoop.hive.ql.hooks.HookContext.HookType.*; + /** * Handles hook executions for {@link Driver}. */ public class HookRunner { private static final String CLASS_NAME = Driver.class.getName(); private final HiveConf conf; - private LogHelper console; - private List queryHooks = new ArrayList<>(); - private List saHooks = new ArrayList<>(); - private List driverRunHooks = new ArrayList<>(); - private List preExecHooks = new ArrayList<>(); - private List postExecHooks = new ArrayList<>(); - private List onFailureHooks = new ArrayList<>(); - private boolean initialized = false; + private final HooksLoader loader; Review comment: Rename it to HiveHooks instead. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498131) Time Spent: 5.5h (was: 5h 20m) > Add hooks when HiveServer2 stops due to OutOfMemoryError > > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 5.5h > Remaining Estimate: 0h > > Make the OOM hook an interface of HiveServer2, so users can implement the hook to > do something before HS2 stops, such as dumping the heap or alerting the > devops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498110&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498110 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:40 Start Date: 09/Oct/20 13:40 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706028333 @kasakrisz could you please take a look? I think we have the same resultset in a different order... so SORT_QUERY_RESULTS should have ignored this difference This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498110) Time Spent: 50m (was: 40m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498050 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:31 Start Date: 09/Oct/20 13:31 Worklog Time Spent: 10m Work Description: pkumarsinha opened a new pull request #1563: URL: https://github.com/apache/hive/pull/1563 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498050) Time Spent: 40m (was: 0.5h) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=498048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498048 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:30 Start Date: 09/Oct/20 13:30 Worklog Time Spent: 10m Work Description: shameersss1 commented on pull request #1271: URL: https://github.com/apache/hive/pull/1271#issuecomment-705332460 @kgyrtkirk Thank you for taking the time to review this! Yes, +1 from my side for deprecating the Kryo stuff. The string-based approach is cool, but I am not sure how easy or difficult it will be to make the changes. I will try to explore this from my side as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498048) Time Spent: 5h 20m (was: 5h 10m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 20m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition paths > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) 
~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), While dropping partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Ms
[jira] [Work logged] (HIVE-24242) Relax safety checks in SharedWorkOptimizer
[ https://issues.apache.org/jira/browse/HIVE-24242?focusedWorklogId=498021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498021 ] ASF GitHub Bot logged work on HIVE-24242: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:28 Start Date: 09/Oct/20 13:28 Worklog Time Spent: 10m Work Description: kgyrtkirk opened a new pull request #1564: URL: https://github.com/apache/hive/pull/1564 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498021) Time Spent: 20m (was: 10m) > Relax safety checks in SharedWorkOptimizer > -- > > Key: HIVE-24242 > URL: https://issues.apache.org/jira/browse/HIVE-24242 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > there are some checks to lock out problematic cases > For UnionOperator > [here|https://github.com/apache/hive/blob/1507d80fd47aad38b87bba4fd58c1427ba89dbbf/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java#L1571] > This check could prevent the optimization even if the Union is only visible > from only 1 of the TS ops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?focusedWorklogId=498017&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498017 ] ASF GitHub Bot logged work on HIVE-24236: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:27 Start Date: 09/Oct/20 13:27 Worklog Time Spent: 10m Work Description: yongzhi merged pull request #1559: URL: https://github.com/apache/hive/pull/1559 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498017) Time Spent: 1.5h (was: 1h 20m) > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. 
The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrow
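The connection-leak class of bug described in HIVE-24236 above can be illustrated with a toy pool counter: the fix is to guarantee that a borrowed connection is returned on every code path, which try-with-resources gives for free. This is a generic sketch, not the actual TxnHandler code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakSketch {
    // Counts how many toy "connections" are still checked out of the pool.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Toy pooled connection: borrowing increments the counter, closing
    // decrements it, so a leak would leave OPEN > 0.
    static class Conn implements AutoCloseable {
        Conn() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    static void heartbeat(boolean fail) {
        try (Conn c = new Conn()) { // closed on all paths, including throws
            if (fail) {
                throw new RuntimeException("heartbeat failed");
            }
        } catch (RuntimeException e) {
            // swallowed for the demo; the connection was already returned
        }
    }

    public static void main(String[] args) {
        heartbeat(false);
        heartbeat(true);
        System.out.println("open connections: " + OPEN.get()); // prints 0
    }
}
```

Without try-with-resources (or an equivalent finally block), the failing heartbeat path would leave a connection checked out, and a busy pool eventually hits the "Cannot get a connection" error in the stack trace above.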
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=497950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497950 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 12:48 Start Date: 09/Oct/20 12:48 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502403149

## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java

@@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException {
    // Poll on the operation status, till the operation is complete
    do {
      try {
+       if (Thread.currentThread().isInterrupted()) {
+         throw new SQLException("Interrupted while polling on the operation status", "70100");

Review comment: Done, thank you very much! @kgyrtkirk This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497950) Time Spent: 40m (was: 0.5h) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > If a HiveStatement runs asynchronously as a task, such as in a thread or future, > and we interrupt the task, the HiveStatement continues to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such a case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
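The change under review adds an interrupt check inside the status-polling loop. A standalone sketch of the same pattern; note the real HiveStatement throws SQLException with SQLState "70100", while IllegalStateException stands in here to keep the example self-contained:

```java
import java.util.function.IntSupplier;

public class PollInterruptSketch {
    // Poll a status source until it reports completion, but bail out as soon
    // as the calling thread is interrupted instead of spinning forever.
    static int pollUntilDone(IntSupplier statusCheck) {
        int status;
        do {
            if (Thread.currentThread().isInterrupted()) {
                // Real code: throw new SQLException("...", "70100")
                throw new IllegalStateException(
                    "Interrupted while polling on the operation status");
            }
            status = statusCheck.getAsInt(); // 0 = still running, 1 = finished
        } while (status == 0);
        return status;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Reports "running" twice, then "finished" on the third poll.
        int result = pollUntilDone(() -> ++calls[0] >= 3 ? 1 : 0);
        System.out.println(result + " after " + calls[0] + " polls"); // prints 1 after 3 polls
    }
}
```

Checking `isInterrupted()` (rather than `Thread.interrupted()`) leaves the interrupt flag set, so callers further up the stack can still observe it.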
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210869#comment-17210869 ] Hyukjin Kwon commented on HIVE-16391: - SPARK-20202 is resolved now. Spark does not use Hive 1.2 fork anymore, and does not need 1.2.x release. I am tentatively resolving this ticket. > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Affects Versions: 1.2.2 >Reporter: Reynold Xin >Assignee: Saisai Shao >Priority: Major > Labels: pull-request-available > Fix For: 1.2.3 > > Attachments: HIVE-16391.1.patch, HIVE-16391.2.patch, HIVE-16391.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated HIVE-16391: Resolution: Not A Problem Status: Resolved (was: Patch Available) > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Affects Versions: 1.2.2 >Reporter: Reynold Xin >Assignee: Saisai Shao >Priority: Major > Labels: pull-request-available > Fix For: 1.2.3 > > Attachments: HIVE-16391.1.patch, HIVE-16391.2.patch, HIVE-16391.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24252) Improve decision model for using semijoin reducers
[ https://issues.apache.org/jira/browse/HIVE-24252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24252: -- > Improve decision model for using semijoin reducers > -- > > Key: HIVE-24252 > URL: https://issues.apache.org/jira/browse/HIVE-24252 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > After a few experiments with the TPC-DS 10TB dataset, we observed that in some > cases semijoin reducers were not effective; they didn't reduce the number of > records, or they reduced the relation only a tiny bit. > In some cases we can make the semijoin reducer more effective by adding more > columns, but this also requires a bigger bloom filter, so the decision on the > number of columns to include in the bloom becomes more delicate. > The current decision model always chooses multi-column semijoin reducers when > they are available, but this may not always be beneficial if a single column > alone can significantly reduce the target relation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24251) Improve bloom filter size estimation for multi column semijoin reducers
[ https://issues.apache.org/jira/browse/HIVE-24251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24251: -- > Improve bloom filter size estimation for multi column semijoin reducers > --- > > Key: HIVE-24251 > URL: https://issues.apache.org/jira/browse/HIVE-24251 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > There are various cases where the expected size of the bloom filter is > largely underestimated, making the semijoin reducer completely ineffective. > This is more relevant for multi-column semijoin reducers, since the current > [code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273] > does not take them into account. -- This message was sent by Atlassian Jira (v8.3.4#803005)
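For context on why the underestimation in HIVE-24251 matters: the standard Bloom filter sizing formula is m = -n ln(p) / (ln 2)^2 bits for n expected entries at false-positive rate p, so underestimating n directly shrinks the filter and inflates its false-positive rate. The sketch below illustrates the multi-column case, where the combined key can have up to the product of the per-column NDVs; this NDV combination rule is an assumption for illustration only, not the estimator GenericUDAFBloomFilter actually uses.

```java
// Illustrative Bloom filter sizing: the standard bits formula plus a
// simple (assumed) upper bound on distinct values of a multi-column key.
public class BloomFilterSizing {
    /** Optimal number of bits for n expected entries at false-positive rate p. */
    static long optimalNumBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /**
     * Upper bound on the number of distinct combined keys for a
     * multi-column key: the product of the per-column NDVs.
     */
    static long combinedNdvUpperBound(long... perColumnNdv) {
        long ndv = 1;
        for (long v : perColumnNdv) {
            ndv = Math.multiplyExact(ndv, v);
        }
        return ndv;
    }

    public static void main(String[] args) {
        // Sizing from one column's NDV vs. the combined key's upper bound:
        long singleCol = optimalNumBits(1_000, 0.05);
        long multiCol = optimalNumBits(combinedNdvUpperBound(1_000, 500), 0.05);
        System.out.println("bits sized for 1k single-column keys: " + singleCol);
        System.out.println("bits sized for the 1k x 500 combined key: " + multiCol);
    }
}
```

The gap between the two sizes is the failure mode the ticket describes: a filter sized for one column's NDV saturates once the real multi-column key space is much larger, and the semijoin reducer stops filtering anything.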
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497884 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 11:10 Start Date: 09/Oct/20 11:10 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706119583 @kasakrisz @kgyrtkirk The flaky test runs successfully: http://ci.hive.apache.org/job/hive-flaky-check/126/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497884) Time Spent: 40m (was: 0.5h) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210834#comment-17210834 ] Aasha Medhi commented on HIVE-24244: +1 > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24234) Improve checkHashModeEfficiency in VectorGroupByOperator
[ https://issues.apache.org/jira/browse/HIVE-24234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned HIVE-24234: --- Assignee: Rajesh Balamohan > Improve checkHashModeEfficiency in VectorGroupByOperator > > > Key: HIVE-24234 > URL: https://issues.apache.org/jira/browse/HIVE-24234 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24234.wip.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, {{VectorGroupByOperator::checkHashModeEfficiency}} compares the > number of entries with the number of input records that have been processed. For > grouping sets, it accounts for the grouping set length as well. > The issue is that the condition becomes invalid after processing a large number of > input records, which prevents the system from switching over to streaming > mode. > E.g. assume 500,000 input records processed, with 9 grouping sets and > 100,000 entries in the hashtable. The hashtable would never cross 4,500,000 > entries, as the max size itself is 1M by default. > It would be good to compare the input records (adjusted for grouping sets) > with the number of output records (along with the size of the hashtable) to > determine hashing or streaming mode. > E.g. Q67. -- This message was sent by Atlassian Jira (v8.3.4#803005)
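The decision the ticket proposes can be sketched as follows. This is a hedged illustration of the idea, comparing grouping-set-adjusted input against the number of groups produced rather than against the capped hashtable size; the method name, parameters, and the ratio threshold are assumptions, and `VectorGroupByOperator`'s real logic differs.

```java
// Hedged sketch of a hash-vs-streaming efficiency check for a vectorized
// group-by: hashing is only worthwhile while it is actually aggregating,
// i.e. while the number of groups stays well below the adjusted input.
public class HashModeCheck {
    /**
     * Returns true if the operator should fall back to streaming mode.
     *
     * @param inputRecords      raw input records processed so far
     * @param groupingSets      grouping set count (adjusts the input count,
     *                          since each row feeds every grouping set)
     * @param groupsProduced    distinct groups emitted so far
     * @param minReductionRatio minimum fraction of rows the hash table must
     *                          absorb for hashing to stay worthwhile (assumed)
     */
    static boolean shouldSwitchToStreaming(long inputRecords, int groupingSets,
                                           long groupsProduced, double minReductionRatio) {
        long adjustedInput = inputRecords * groupingSets;
        if (adjustedInput == 0) {
            return false;
        }
        // If nearly every adjusted input row yields its own group, the hash
        // table is not reducing the data; streaming avoids its overhead.
        double reduction = 1.0 - (double) groupsProduced / adjustedInput;
        return reduction < minReductionRatio;
    }

    public static void main(String[] args) {
        // The ticket's example: 500k rows, 9 grouping sets, 100k groups.
        // Reduction is ~97.8%, so hashing stays effective: prints false.
        System.out.println(shouldSwitchToStreaming(500_000, 9, 100_000, 0.5));
        // Nearly one group per row: almost no reduction, so prints true.
        System.out.println(shouldSwitchToStreaming(100_000, 1, 95_000, 0.5));
    }
}
```

The key difference from the current check described above is that the comparison cannot go stale: the ratio is computed from groups produced rather than from a hashtable whose size is capped (1M entries by default) far below the adjusted input count.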
[jira] [Updated] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24249: -- Description: {code:java} create table t1(col0 int) STORED AS ORC TBLPROPERTIES ('transactional'='true'); create materialized view mv1 as select * from t1 where col0 > 2; create view v1 as select sub.* from (select * from t1 where col0 > 2) sub where sub.col0 = 10; {code} The planner realize that the view definition has a subquery which match the materialized view query and replaces it to the materialized view scan. {code:java} HiveProject($f0=[CAST(10):INTEGER]) HiveFilter(condition=[=(10, $0)]) HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) {code} Then exception is thrown: {code:java} org.apache.hadoop.hive.ql.parse.SemanticException: View definition references materialized view default.mv1 at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.apache.hadoo
[jira] [Commented] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210750#comment-17210750 ] Zhihua Deng commented on HIVE-24248: Thanks much for your time [~kgyrtkirk] , I added a flaky check for this: [http://ci.hive.apache.org/job/hive-flaky-check/126/] > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210732#comment-17210732 ] Krisztian Kasa commented on HIVE-24249: --- A possible solution could be disabling materialized view rewrite during view creation. cc [~jcamachorodriguez] > Create View fails if a materialized view exists with the same query > --- > > Key: HIVE-24249 > URL: https://issues.apache.org/jira/browse/HIVE-24249 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create table t1(col0 int) STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > create materialized view mv1 as > select * from t1 where col0 > 2; > create view mv1 as > select sub.* from (select * from t1 where col0 > 2) sub > where sub.col0 = 10; > {code} > The planner realize that the view definition has a subquery which match the > materialized view query and replaces it to the materialized view scan. 
> {code:java} > HiveProject($f0=[CAST(10):INTEGER]) > HiveFilter(condition=[=(10, $0)]) > HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) > {code} > Then exception is thrown: > {code:java} > org.apache.hadoop.hive.ql.parse.SemanticException: View definition > references materialized view default.mv1 > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) > at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) > at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) > at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) > at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) > at > 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.run
[jira] [Assigned] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-24249: - > Create View fails if a materialized view exists with the same query > --- > > Key: HIVE-24249 > URL: https://issues.apache.org/jira/browse/HIVE-24249 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create table t1(col0 int) STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > create materialized view mv1 as > select * from t1 where col0 > 2; > create view mv1 as > select sub.* from (select * from t1 where col0 > 2) sub > where sub.col0 = 10; > {code} > The planner realize that the view definition has a subquery which match the > materialized view query and replaces it to the materialized view scan. > {code:java} > HiveProject($f0=[CAST(10):INTEGER]) > HiveFilter(condition=[=(10, $0)]) > HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) > {code} > Then exception is thrown: > {code:java} > org.apache.hadoop.hive.ql.parse.SemanticException: View definition > references materialized view default.mv1 > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) > at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) > at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) > 
at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) > at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) > at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) > at > org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChil
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497826 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:42 Start Date: 09/Oct/20 08:42 Worklog Time Spent: 10m Work Description: kasakrisz commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706053133 I run this test locally using current master and got the same result like in this PR. Let's wait for flaky test run results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497826) Time Spent: 0.5h (was: 20m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=497822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497822 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:34 Start Date: 09/Oct/20 08:34 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502273628 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: I think this error message and code should be placed in the `ErrorMsg` class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497822) Time Spent: 0.5h (was: 20m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > If running HiveStatement asynchronously as a task like in a thread or future, > if we interrupt the task, the HiveStatement would continue to poll on the > operation state until finish. It's may better to provide a way to abort the > executing in such case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=497813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497813 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:22 Start Date: 09/Oct/20 08:22 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1271: URL: https://github.com/apache/hive/pull/1271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497813) Time Spent: 5h 10m (was: 5h) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), While dropping partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ) which is incompatible during deserializtion happening in > PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > ) hence the query fails with Failed to deseriali
[jira] [Resolved] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-23851.
-
Resolution: Fixed

Merged into master. Thank you Syed Shameerur Rahman!

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> ----------------------------------------------------------------------------
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create an external table
> # Run the msck command to sync all the partitions with the metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
>
> *Stack trace:*
> {code:java}
> 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
> java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
> at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
> at org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_192]
> {code}
>
> *Cause:*
> For msck repair with partition filtering, the expression proxy class is expected to be PartitionExpressionForMetastore ( https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 ). While dropping a partition, however, the drop-partition filter expression is serialized differently ( https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 ), which is incompatible with the deserialization done in PartitionExpressionForMetastore ( https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 ), so the query fails with "Failed to deserialize the expression".
>
> *Solutions:*
> I could think of two approaches to this problem:
> # Since PartitionExpressionForMetastore is required only during the partition pruning step, we can switch the expression proxy class back to MsckPartitionExpressionProxy once the partition pruning step is done.
> # Alternatively, make the serialization of the msck drop-partition filter expression compatible with PartitionExpressionForMetastore. We can do this via reflection, since the drop-partition serialization happens in the Msck class (standalone-metastore); this way we can completely remove the need for the MsckPartitionExpressionProxy class.
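The reflection idea in the second approach can be sketched in isolation: call a serializer's static method by class name and method name, so the calling module needs no compile-time dependency on it. This is a minimal, hypothetical illustration only; the helper and class names below are not Hive's actual API, and the demo target is a JDK class rather than Hive's serializer.

```java
import java.lang.reflect.Method;

public class ReflectiveCallSketch {
    // Invoke a static method located only by its class and method names.
    // This is the reflection pattern from solution 2: the caller has no
    // compile-time dependency on the target class.
    public static Object invokeStatic(String className, String methodName,
                                      Class<?>[] paramTypes, Object[] args) throws Exception {
        Class<?> clazz = Class.forName(className);
        Method m = clazz.getMethod(methodName, paramTypes);
        // First argument is null because the target method is static.
        return m.invoke(null, args);
    }

    public static void main(String[] args) throws Exception {
        // Demonstrate with a JDK class; Hive would instead target the
        // serializer living in another module.
        Object result = invokeStatic("java.lang.Integer", "parseInt",
                new Class<?>[]{String.class}, new Object[]{"42"});
        System.out.println(result); // prints "42"
    }
}
```

The trade-off is the usual one with reflection: the cross-module coupling moves from the compiler to a string, so a renamed class or changed signature only fails at runtime.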
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497803 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 07:48 Start Date: 09/Oct/20 07:48 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706028333 @kasakrisz could you please take a look? I think we have the same resultset in a different order... so SORT_QUERY_RESULTS should have ignored this difference This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497803) Time Spent: 20m (was: 10m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
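For context on the comment above: SORT_QUERY_RESULTS is meant to make q-file output comparisons order-insensitive. A minimal sketch of that idea (illustrative only, not the CliDriver's actual implementation) sorts both row sets before comparing, which would have absorbed the reordered rows in the flaky diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedCompareSketch {
    // Compare two result sets while ignoring row order: sort both
    // lexicographically, then compare element by element.
    public static boolean equalIgnoringOrder(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        // The rows from the flaky diff: same values, different order.
        List<String> expected = List.of("1\t1", "1\t2", "2\t1", "2\t2");
        List<String> actual   = List.of("2\t2", "1\t2", "2\t1", "1\t1");
        System.out.println(equalIgnoringOrder(expected, actual)); // prints "true"
    }
}
```

If a diff still appears under such a comparison, the result sets differ in content and not merely in order, which is why the test was disabled pending a flaky check rather than masked.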
[jira] [Comment Edited] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210686#comment-17210686 ] Zoltan Haindrich edited comment on HIVE-24248 at 10/9/20, 7:45 AM: --- Thank you [~dengzh] for opening this ticket - I was also about to do the same :) I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ was (Author: kgyrtkirk): I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210686#comment-17210686 ] Zoltan Haindrich commented on HIVE-24248: - I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497786 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 07:03 Start Date: 09/Oct/20 07:03 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1565: URL: https://github.com/apache/hive/pull/1565 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497786) Remaining Estimate: 0h Time Spent: 10m > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24248: -- Labels: pull-request-available (was: ) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)