[jira] [Work started] (HIVE-25653) Precision problem in STD, STDDEV_SAMP, STDDEV_POP
[ https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25653 started by Ashish Sharma.

> Precision problem in STD, STDDEV_SAMP, STDDEV_POP
> -
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
> Issue Type: Improvement
> Reporter: Ashish Sharma
> Assignee: Ashish Sharma
> Priority: Major
>
> Description
> *Script*
> create table test (col1 int);
> insert into test values
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_SAMP, STDDEV(col1) AS STDDEV,
> STDDEV_POP(col1) AS STDDEV_POP from test;
> *Result*
> STDDEV_SAMP            STDDEV                 STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13
> *Expected*
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0            0       0

-- This message was sent by Atlassian Jira (v8.3.4#803005)
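The tiny E-13 values reported above are the signature of the one-pass formula Var = E[x^2] - (E[x])^2, in which two nearly equal large terms are subtracted and rounding error survives the cancellation. The following is a minimal Java sketch (illustrative only, not Hive's actual UDAF code) contrasting that formula with Welford's online algorithm, which yields exactly 0 for a constant column:

```java
// Contrast the naive one-pass variance formula with Welford's online
// algorithm. Hypothetical demo code, not Hive's stddev implementation.
public class VarianceDemo {
    // Naive: accumulate sum and sum of squares, then E[x^2] - mean^2.
    // The two large terms cancel, so rounding error dominates the result.
    public static double naivePopVariance(double[] xs) {
        double sum = 0, sumSq = 0;
        for (double x : xs) {
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / xs.length;
        return sumSq / xs.length - mean * mean;
    }

    // Welford: incrementally update the running mean and the centered
    // second moment. For a constant stream, every delta after the first
    // row is exactly zero, so the variance is exactly zero.
    public static double welfordPopVariance(double[] xs) {
        double mean = 0, m2 = 0;
        int n = 0;
        for (double x : xs) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }
        return m2 / n;
    }

    public static void main(String[] args) {
        double[] xs = new double[7];
        java.util.Arrays.fill(xs, 10230.72);
        System.out.println("naive:   " + naivePopVariance(xs));
        System.out.println("welford: " + welfordPopVariance(xs));
    }
}
```

Because Welford's update works on deltas from the running mean rather than on raw sums of squares, it is the standard remedy for exactly the class of precision loss this issue describes.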
[jira] [Work logged] (HIVE-25626) JDBCStorageHandler CBO fails when JDBC_PASSWORD_URI is used
[ https://issues.apache.org/jira/browse/HIVE-25626?focusedWorklogId=671667=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671667 ] ASF GitHub Bot logged work on HIVE-25626:
Author: ASF GitHub Bot
Created on: 28/Oct/21 21:31
Start Date: 28/Oct/21 21:31
Worklog Time Spent: 10m
Work Description: cravani commented on a change in pull request #2734: URL: https://github.com/apache/hive/pull/2734#discussion_r738789344

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
## @@ -3011,9 +3011,14 @@ private RelNode genTableLogicalPlan(String tableAlias, QB qb) throws SemanticExc
     final String user = tabMetaData.getProperty(Constants.JDBC_USERNAME);
     String pswd = tabMetaData.getProperty(Constants.JDBC_PASSWORD);
     if (pswd == null) {
-      String keystore = tabMetaData.getProperty(Constants.JDBC_KEYSTORE);
-      String key = tabMetaData.getProperty(Constants.JDBC_KEY);
-      pswd = Utilities.getPasswdFromKeystore(keystore, key);
+      if (!(tabMetaData.getProperty(Constants.JDBC_PASSWORD_URI) == null)) {
+        pswd = Utilities.getPasswdFromUri(tabMetaData.getProperty(Constants.JDBC_PASSWORD_URI));
+      } else {
+        String keystore = tabMetaData.getProperty(Constants.JDBC_KEYSTORE);
+        String key = tabMetaData.getProperty(Constants.JDBC_KEY);
+        pswd = Utilities.getPasswdFromKeystore(keystore, key);
+      }
     }

Review comment: @zabetak Thank you for the comments. The problem with the final String is that it would lead to a compilation error if used in the else block. I modified the patch a bit and submitted the PR again. The test case is still pending; would it be OK if I write the test case after HIVE-25594 gets pushed, maybe in another Jira?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671667) Time Spent: 0.5h (was: 20m) > JDBCStorageHandler CBO fails when JDBC_PASSWORD_URI is used > --- > > Key: HIVE-25626 > URL: https://issues.apache.org/jira/browse/HIVE-25626 > Project: Hive > Issue Type: Bug > Components: Hive, JDBC storage handler >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When table created with JDBCStorageHandler and JDBC_PASSWORD_URI is used as a > password mechanism, CBO fails causing all the data to be fetched from DB and > then processed in hive. -- This message was sent by Atlassian Jira (v8.3.4#803005)
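The resolution order established by the patch is: explicit JDBC_PASSWORD first, then JDBC_PASSWORD_URI, then the keystore/key pair. A hedged sketch of that ordering follows; the property-name strings and the two resolver callbacks are illustrative stand-ins for Hive's Constants and Utilities helpers, not the actual API:

```java
import java.util.Map;
import java.util.function.Function;

// Sketch of the three-step password resolution order for the JDBC storage
// handler: plain property, then a password URI, then keystore+key. The two
// Function parameters stand in for Hive's Utilities.getPasswdFromUri and
// Utilities.getPasswdFromKeystore; the property keys are assumptions.
public class PasswordResolver {
    public static String resolve(Map<String, String> props,
                                 Function<String, String> fromUri,
                                 Function<String[], String> fromKeystore) {
        // 1. Plain-text password property wins if present.
        String pswd = props.get("hive.sql.dbcp.password");
        if (pswd != null) {
            return pswd;
        }
        // 2. Otherwise, dereference a password URI if one is configured.
        String uri = props.get("hive.sql.dbcp.password.uri");
        if (uri != null) {
            return fromUri.apply(uri);
        }
        // 3. Fall back to looking the key up in a keystore.
        String keystore = props.get("hive.sql.dbcp.password.keystore");
        String key = props.get("hive.sql.dbcp.password.key");
        return fromKeystore.apply(new String[]{keystore, key});
    }
}
```

Keeping the three branches in one place makes it clear that exactly one mechanism is consulted per lookup, which is what the CBO path was missing before the patch.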
[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
[ https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=671550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671550 ] ASF GitHub Bot logged work on HIVE-25591: - Author: ASF GitHub Bot Created on: 28/Oct/21 16:59 Start Date: 28/Oct/21 16:59 Worklog Time Spent: 10m Work Description: zabetak commented on pull request #2759: URL: https://github.com/apache/hive/pull/2759#issuecomment-954030659 Hey @cravani please have a look as well and let me know what you think. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671550) Time Spent: 20m (was: 10m) > CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema > > > Key: HIVE-25591 > URL: https://issues.apache.org/jira/browse/HIVE-25591 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Consider the following use case where tables reside in some user-defined > schema in some JDBC compliant database: > +Postgres+ > {code:sql} > create schema world; > create table if not exists world.country (name varchar(80) not null); > insert into world.country (name) values ('India'); > insert into world.country (name) values ('Russia'); > insert into world.country (name) values ('USA'); > {code} > The following DDL statement in Hive fails: > +Hive+ > {code:sql} > CREATE EXTERNAL TABLE country (name varchar(80)) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "POSTGRES", > "hive.sql.jdbc.driver" = "org.postgresql.Driver", > "hive.sql.jdbc.url" = 
"jdbc:postgresql://localhost:5432/test", > "hive.sql.dbcp.username" = "user", > "hive.sql.dbcp.password" = "pwd", > "hive.sql.schema" = "world", > "hive.sql.table" = "country"); > {code} > The exception is the following: > {noformat} > org.postgresql.util.PSQLException: ERROR: relation "country" does not exist > Position: 15 > at > org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) > ~[postgresql-42.2.14.jar:42.2.14] > at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at > org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83) > [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) > [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95) > [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78) > [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at
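The fix described in the pull request is to include hive.sql.schema, when present, in the SQL that Hive generates for the remote database, so that Postgres resolves world.country rather than the bare country. A minimal sketch of that qualification step (the class, method names, and quoting style here are illustrative, not the actual GenericJdbcDatabaseAccessor code):

```java
// Sketch of schema-aware table qualification for a generated JDBC query:
// if a schema is configured, prefix it to the table name so the remote
// database resolves the right namespace. Names and quoting are assumptions.
public class TableQualifier {
    public static String qualifiedName(String schema, String table) {
        // Quote identifiers so mixed-case or reserved names survive.
        String quotedTable = "\"" + table + "\"";
        if (schema == null || schema.isEmpty()) {
            return quotedTable;
        }
        return "\"" + schema + "\"." + quotedTable;
    }

    public static String selectAll(String schema, String table) {
        return "SELECT * FROM " + qualifiedName(schema, table);
    }
}
```

With the schema folded in, the probe query that previously failed with "relation \"country\" does not exist" would instead target "world"."country".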
[jira] [Updated] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
[ https://issues.apache.org/jira/browse/HIVE-25591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25591: -- Labels: pull-request-available (was: ) > CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema > > > Key: HIVE-25591 > URL: https://issues.apache.org/jira/browse/HIVE-25591 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Consider the following use case where tables reside in some user-defined > schema in some JDBC compliant database: > +Postgres+ > {code:sql} > create schema world; > create table if not exists world.country (name varchar(80) not null); > insert into world.country (name) values ('India'); > insert into world.country (name) values ('Russia'); > insert into world.country (name) values ('USA'); > {code} > The following DDL statement in Hive fails: > +Hive+ > {code:sql} > CREATE EXTERNAL TABLE country (name varchar(80)) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "POSTGRES", > "hive.sql.jdbc.driver" = "org.postgresql.Driver", > "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test", > "hive.sql.dbcp.username" = "user", > "hive.sql.dbcp.password" = "pwd", > "hive.sql.schema" = "world", > "hive.sql.table" = "country"); > {code} > The exception is the following: > {noformat} > org.postgresql.util.PSQLException: ERROR: relation "country" does not exist > Position: 15 > at > org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) > ~[postgresql-42.2.14.jar:42.2.14] > 
at > org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) > ~[postgresql-42.2.14.jar:42.2.14] > at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at > org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83) > [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) > [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95) > [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78) > [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:717) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableDesc.toTable(CreateTableDesc.java:933) > 
[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:59) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at
[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
[ https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=671549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671549 ] ASF GitHub Bot logged work on HIVE-25591: - Author: ASF GitHub Bot Created on: 28/Oct/21 16:58 Start Date: 28/Oct/21 16:58 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request #2759: URL: https://github.com/apache/hive/pull/2759 The tests rely on HIVE-25594 for which there is a separate pull request (https://github.com/apache/hive/pull/2742). Please do not review https://github.com/apache/hive/commit/cb3026b4db9454c12d5376c71a28eb34b35d783d here. If there are remarks please comment on https://github.com/apache/hive/pull/2742 instead. ### What changes were proposed in this pull request? 1. Remove getOriQueryToExecute method in favor of getQueryToExecute 2. Move getQueryToExecute method into GenericJdbcDatabaseAccessor to improve encapsulation since the method is only used in this class. 3. Include hive.sql.schema if available when generating the SQL query. 4. Add tests/usage samples of hive.sql.schema property in different DBMS. ### Why are the changes needed? 1. Avoid failures when the table is in non-default schema. 2. Demonstrate how hive.sql.schema can be used in different DBMS. 3. Minor encapsulation improvement. ### Does this PR introduce _any_ user-facing change? Fixes a failure. ### How was this patch tested? `mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile_regex="jdbc_table_with_schema.*" -Dtest.output.overwrite` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671549) Remaining Estimate: 0h Time Spent: 10m > CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema > > > Key: HIVE-25591 > URL: https://issues.apache.org/jira/browse/HIVE-25591 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Consider the following use case where tables reside in some user-defined > schema in some JDBC compliant database: > +Postgres+ > {code:sql} > create schema world; > create table if not exists world.country (name varchar(80) not null); > insert into world.country (name) values ('India'); > insert into world.country (name) values ('Russia'); > insert into world.country (name) values ('USA'); > {code} > The following DDL statement in Hive fails: > +Hive+ > {code:sql} > CREATE EXTERNAL TABLE country (name varchar(80)) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "POSTGRES", > "hive.sql.jdbc.driver" = "org.postgresql.Driver", > "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test", > "hive.sql.dbcp.username" = "user", > "hive.sql.dbcp.password" = "pwd", > "hive.sql.schema" = "world", > "hive.sql.table" = "country"); > {code} > The exception is the following: > {noformat} > org.postgresql.util.PSQLException: ERROR: relation "country" does not exist > Position: 15 > at > org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) > 
~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) > ~[postgresql-42.2.14.jar:42.2.14] > at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103) > ~[postgresql-42.2.14.jar:42.2.14] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at > org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) > ~[commons-dbcp2-2.7.0.jar:2.7.0] > at >
[jira] [Updated] (HIVE-25658) Fix regex for masking totalSize table properties in Iceberg q-tests
[ https://issues.apache.org/jira/browse/HIVE-25658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25658: -- Labels: pull-request-available (was: ) > Fix regex for masking totalSize table properties in Iceberg q-tests > --- > > Key: HIVE-25658 > URL: https://issues.apache.org/jira/browse/HIVE-25658 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25607 introduced a text replace regex for masking out the totalSize > table property values in Iceberg q.out files. The regex however did not cover > all of the props in the q.out files, so here is the fix for the regex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25658) Fix regex for masking totalSize table properties in Iceberg q-tests
[ https://issues.apache.org/jira/browse/HIVE-25658?focusedWorklogId=671495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671495 ] ASF GitHub Bot logged work on HIVE-25658: - Author: ASF GitHub Bot Created on: 28/Oct/21 15:11 Start Date: 28/Oct/21 15:11 Worklog Time Spent: 10m Work Description: marton-bod merged pull request #2757: URL: https://github.com/apache/hive/pull/2757 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671495) Remaining Estimate: 0h Time Spent: 10m > Fix regex for masking totalSize table properties in Iceberg q-tests > --- > > Key: HIVE-25658 > URL: https://issues.apache.org/jira/browse/HIVE-25658 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25607 introduced a text replace regex for masking out the totalSize > table property values in Iceberg q.out files. The regex however did not cover > all of the props in the q.out files, so here is the fix for the regex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25658) Fix regex for masking totalSize table properties in Iceberg q-tests
[ https://issues.apache.org/jira/browse/HIVE-25658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Bod resolved HIVE-25658. --- Resolution: Fixed > Fix regex for masking totalSize table properties in Iceberg q-tests > --- > > Key: HIVE-25658 > URL: https://issues.apache.org/jira/browse/HIVE-25658 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25607 introduced a text replace regex for masking out the totalSize > table property values in Iceberg q.out files. The regex however did not cover > all of the props in the q.out files, so here is the fix for the regex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25658) Fix regex for masking totalSize table properties in Iceberg q-tests
[ https://issues.apache.org/jira/browse/HIVE-25658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435485#comment-17435485 ] Marton Bod commented on HIVE-25658: --- Committed to master. Thanks [~szita] for the review! > Fix regex for masking totalSize table properties in Iceberg q-tests > --- > > Key: HIVE-25658 > URL: https://issues.apache.org/jira/browse/HIVE-25658 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25607 introduced a text replace regex for masking out the totalSize > table property values in Iceberg q.out files. The regex however did not cover > all of the props in the q.out files, so here is the fix for the regex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25660) File Format (ORC/AVRO/TextFile...) available in information schema for bulk query
[ https://issues.apache.org/jira/browse/HIVE-25660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon AUBERT updated HIVE-25660:
Description:
Hello all,
As of today, when you want to know the file format of every table, you have, as far as I know, two solutions:
- a loop in shell
- a loop in the tool you use for HQL queries, and then parse the answer, etc.
I think this is way too complicated for such a very basic need. So a table_file_format (or partition_file_format, I don't know) in the information_schema would be a very precious help for monitoring. It can be directly read by a reporting tool (Superset, Tableau, PowerBi, Qlik, whatever you want).
Best regards,
Simon

was:
Hello all,
As of today, when you want to know the file format of every table, you have, as far I know, two solutions :
-a loop in shell
-a loop in the tool you use for HQL queries, and then parse the answer, etc..
I think this is way too complicated for such a very basic need. So a table_file_format (or partition_file_format, I don't know) in the information_schema would be a very precious help for monitoring.
Best regards,
Simon

> File Format (ORC/AVRO/TextFile...) available in information schema for bulk query
> -
>
> Key: HIVE-25660
> URL: https://issues.apache.org/jira/browse/HIVE-25660
> Project: Hive
> Issue Type: Improvement
> Components: File Formats, Metastore
> Reporter: Simon AUBERT
> Priority: Major
>
> Hello all,
> As of today, when you want to know the file format of every table, you have, as far as I know, two solutions:
> - a loop in shell
> - a loop in the tool you use for HQL queries, and then parse the answer, etc.
> I think this is way too complicated for such a very basic need. So a table_file_format (or partition_file_format, I don't know) in the information_schema would be a very precious help for monitoring. It can be directly read by a reporting tool (Superset, Tableau, PowerBi, Qlik, whatever you want).
> Best regards, > Simon -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25660) File Format (ORC/AVRO/TextFile...) available in information schema for bulk query
[ https://issues.apache.org/jira/browse/HIVE-25660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon AUBERT updated HIVE-25660: Description: Hello all, As of today, when you want to know the file format of every table, you have, as far I know, two solutions : -a loop in shell -a loop in the tool you use for HQL queries, and then parse the answer, etc.. I think this is way too complicated for such a very basic need. So a table_file_format (or partition_file_format, I don't know) in the information_schema would be a very precious help for monitoring. Best regards, Simon was: Hello all, As of today, when you want to know the file format of every table, you have, as far I know, two solutions : -a loop in shell -a loop in the tool you use for HQL queries, and then parse the answer, etc.. I think this is way too complicated for such a very basic need. So a table_file_format (or partion_file_format, I don't know) in the information_schema would be a very precious help for monitoring. Best regards, Simon > File Format (ORC/AVRO/TextFile...) available in information schema for bulk > query > - > > Key: HIVE-25660 > URL: https://issues.apache.org/jira/browse/HIVE-25660 > Project: Hive > Issue Type: Improvement > Components: File Formats, Metastore >Reporter: Simon AUBERT >Priority: Major > > Hello all, > As of today, when you want to know the file format of every table, you have, > as far I know, two solutions : > -a loop in shell > -a loop in the tool you use for HQL queries, and then parse the answer, etc.. > I think this is way too complicated for such a very basic need. So a > table_file_format (or partition_file_format, I don't know) in the > information_schema would be a very precious help for monitoring. > Best regards, > Simon -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=671480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671480 ] ASF GitHub Bot logged work on HIVE-25659: - Author: ASF GitHub Bot Created on: 28/Oct/21 14:50 Start Date: 28/Oct/21 14:50 Worklog Time Spent: 10m Work Description: guptanikhil007 opened a new pull request #2758: URL: https://github.com/apache/hive/pull/2758 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671480) Remaining Estimate: 0h Time Spent: 10m > Divide IN/(NOT IN) queries based on number of max parameters SQL engine can > support > --- > > Key: HIVE-25659 > URL: https://issues.apache.org/jira/browse/HIVE-25659 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Function > org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings > can generate queries with huge number of parameters with very small value of > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while > generating delete query for completed_compactions table > Example: > {code:java} > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 > DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) > Number of parameters in a single query = 4759 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25659: -- Labels: pull-request-available (was: ) > Divide IN/(NOT IN) queries based on number of max parameters SQL engine can > support > --- > > Key: HIVE-25659 > URL: https://issues.apache.org/jira/browse/HIVE-25659 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Function > org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings > can generate queries with huge number of parameters with very small value of > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while > generating delete query for completed_compactions table > Example: > {code:java} > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 > DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) > Number of parameters in a single query = 4759 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435458#comment-17435458 ] Nikhil Gupta commented on HIVE-25659:
---
Code to find the number of parameters:
{noformat}
public static void main(String[] args) {
  Configuration conf = MetastoreConf.newMetastoreConf();
  conf.set(ConfVars.DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE.getVarname(), "100");
  conf.set(ConfVars.DIRECT_SQL_MAX_QUERY_LENGTH.getVarname(), "10");
  List<String> queries = new ArrayList<>();
  List<Long> deleteSet = new ArrayList<>();
  for (long i = 0; i < 1; i++) {
    deleteSet.add(i + 1);
  }
  StringBuilder prefix = new StringBuilder();
  StringBuilder suffix = new StringBuilder();
  prefix.append("delete from COMPLETED_COMPACTIONS where ");
  suffix.append("");
  List<String> questions = new ArrayList<>(deleteSet.size());
  for (int i = 0; i < deleteSet.size(); i++) {
    questions.add("?");
  }
  List<Integer> counts = TxnUtils.buildQueryWithINClauseStrings(conf, queries, prefix, suffix,
      questions, "cc_id", false, false);
  System.out.println(queries.get(0).chars().filter(ch -> ch == '?').count());
}
{noformat}

> Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
> -
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
> Issue Type: Bug
> Components: Standalone Metastore
> Affects Versions: 3.1.0, 4.0.0
> Reporter: Nikhil Gupta
> Assignee: Nikhil Gupta
> Priority: Minor
> Fix For: 4.0.0
>
> Function org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings can generate queries with a huge number of parameters even with very small values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while generating the delete query for the completed_compactions table.
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
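The behavior the issue asks for is a hard cap on the number of ? placeholders per generated statement. A simplified sketch of splitting one logical delete into multiple IN-clause queries by a maximum parameter count (illustrative only; the real TxnUtils.buildQueryWithINClauseStrings also enforces the maximum query length in KB and handles NOT IN):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of dividing a large IN list into several queries, each holding at
// most maxParams placeholders. Standalone illustration of the requested
// fix, not the actual TxnUtils logic.
public class InClauseSplitter {
    public static List<String> buildDeleteQueries(String prefix, String column,
                                                  int idCount, int maxParams) {
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < idCount; start += maxParams) {
            // Each chunk gets at most maxParams placeholders.
            int n = Math.min(maxParams, idCount - start);
            StringBuilder q = new StringBuilder(prefix)
                .append(column).append(" IN (");
            for (int i = 0; i < n; i++) {
                q.append(i == 0 ? "?" : ",?");
            }
            queries.add(q.append(")").toString());
        }
        return queries;
    }
}
```

Splitting by placeholder count, independently of query length, keeps every generated statement under whatever parameter limit the backing SQL engine imposes.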
[jira] [Updated] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil Gupta updated HIVE-25659: Description: Function org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings can generate queries with huge number of parameters with very small value of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while generating delete query for completed_compactions table Example: {code:java} DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) Number of parameters in a single query = 4759 {code} was: Function org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings can generate queries with huge number of parameters with very small value of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while generating delete query for completed_compactions table Example: DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) Number of parameters in a single query = 4759 > Divide IN/(NOT IN) queries based on number of max parameters SQL engine can > support > --- > > Key: HIVE-25659 > URL: https://issues.apache.org/jira/browse/HIVE-25659 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > Fix For: 4.0.0 > > > Function > org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings > can generate queries with huge number of parameters with very small value of > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while > generating delete query for completed_compactions table > Example: > {code:java} > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 > DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) > Number of parameters in a single query = 4759 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil Gupta updated HIVE-25659: Description: Function org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings can generate queries with a huge number of parameters even for very small values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while generating the delete query for the completed_compactions table Example: DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) Number of parameters in a single query = 4759 was: > Divide IN/(NOT IN) queries based on number of max parameters SQL engine can > support > --- > > Key: HIVE-25659 > URL: https://issues.apache.org/jira/browse/HIVE-25659 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > Fix For: 4.0.0 > > > Function > org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings > can generate queries with a huge number of parameters even for very small values of > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while > generating the delete query for the completed_compactions table > Example: > DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100 > DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB) > Number of parameters in a single query = 4759 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support
[ https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil Gupta reassigned HIVE-25659: --- Assignee: Nikhil Gupta > Divide IN/(NOT IN) queries based on number of max parameters SQL engine can > support > --- > > Key: HIVE-25659 > URL: https://issues.apache.org/jira/browse/HIVE-25659 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > Fix For: 4.0.0 > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
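The division proposed in HIVE-25659 amounts to capping each generated query by both an element count and a query length. The sketch below is a hypothetical illustration of that batching idea, not Hive's actual TxnUtils#buildQueryWithINClauseStrings; the class and method names are invented:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a list of IDs into several DELETE ... IN (...)
// queries so that no single query holds more than maxElements parameters
// or exceeds maxQueryLength characters.
public class InClauseBatcher {

    public static List<String> buildDeleteQueries(String table, String column,
            List<Long> ids, int maxElements, int maxQueryLength) {
        List<String> queries = new ArrayList<>();
        String prefix = "DELETE FROM " + table + " WHERE " + column + " IN (";
        StringBuilder current = new StringBuilder(prefix);
        int elements = 0;
        for (Long id : ids) {
            String next = String.valueOf(id);
            // Flush the current batch when either limit would be exceeded.
            boolean tooMany = elements >= maxElements;
            boolean tooLong = current.length() + next.length() + 3 > maxQueryLength;
            if (elements > 0 && (tooMany || tooLong)) {
                queries.add(current.append(")").toString());
                current = new StringBuilder(prefix);
                elements = 0;
            }
            if (elements > 0) {
                current.append(", ");
            }
            current.append(next);
            elements++;
        }
        if (elements > 0) {
            queries.add(current.append(")").toString());
        }
        return queries;
    }
}
```

With maxElements = 2, five IDs produce three queries of at most two parameters each; checking both limits inside the loop is what prevents the pathological 4759-parameter query described in the issue.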
[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing
[ https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=671426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671426 ] ASF GitHub Bot logged work on HIVE-25596: - Author: ASF GitHub Bot Created on: 28/Oct/21 13:15 Start Date: 28/Oct/21 13:15 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2724: URL: https://github.com/apache/hive/pull/2724#discussion_r73835 ## File path: metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql ## @@ -1466,7 +1466,8 @@ CREATE EXTERNAL TABLE IF NOT EXISTS `REPLICATION_METRICS` ( `POLICY_NAME` string, `DUMP_EXECUTION_ID` bigint, `METADATA` string, -`PROGRESS` string +`PROGRESS` string, +`MESSAGE_FORMAT` varchar(16) Review comment: Yes, varchar is supported in hive-schema files. Actually, we're using the same syntax in the case of the Notification_log table, so I kept it the same for both tables. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671426) Time Spent: 1h 50m (was: 1h 40m) > Compress Hive Replication Metrics while storing > --- > > Key: HIVE-25596 > URL: https://issues.apache.org/jira/browse/HIVE-25596 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Compress the json fields of sys.replication_metrics table to optimise RDBMS > space usage. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25628) Avoid unnecessary file ops if Iceberg table is LLAP cached
[ https://issues.apache.org/jira/browse/HIVE-25628?focusedWorklogId=671424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671424 ] ASF GitHub Bot logged work on HIVE-25628: - Author: ASF GitHub Bot Created on: 28/Oct/21 13:12 Start Date: 28/Oct/21 13:12 Worklog Time Spent: 10m Work Description: szlta merged pull request #2748: URL: https://github.com/apache/hive/pull/2748 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671424) Time Spent: 0.5h (was: 20m) > Avoid unnecessary file ops if Iceberg table is LLAP cached > -- > > Key: HIVE-25628 > URL: https://issues.apache.org/jira/browse/HIVE-25628 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When query execution is vectorized for an Iceberg table, we need to > make an extra file open operation on the ORC file to learn what the file > schema is (to be matched later with the logical schema). > In LLAP configuration the file schema could be retrieved through the LLAP cache, > as ORC metadata is cached, so we should avoid the file operation when > possible. > Also: LLAP relies on cache keys that are usually triplets of file information > and are constructed by an FS.listStatus call. For Iceberg tables we should > rely on such file information provided by Iceberg's metadata to avoid this > call too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25628) Avoid unnecessary file ops if Iceberg table is LLAP cached
[ https://issues.apache.org/jira/browse/HIVE-25628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita resolved HIVE-25628. --- Fix Version/s: 4.0.0 Resolution: Fixed Committed to master. Thanks for the review [~mbod]! > Avoid unnecessary file ops if Iceberg table is LLAP cached > -- > > Key: HIVE-25628 > URL: https://issues.apache.org/jira/browse/HIVE-25628 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > When query execution is vectorized for an Iceberg table, we need to > make an extra file open operation on the ORC file to learn what the file > schema is (to be matched later with the logical schema). > In LLAP configuration the file schema could be retrieved through the LLAP cache, > as ORC metadata is cached, so we should avoid the file operation when > possible. > Also: LLAP relies on cache keys that are usually triplets of file information > and are constructed by an FS.listStatus call. For Iceberg tables we should > rely on such file information provided by Iceberg's metadata to avoid this > call too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
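For illustration, the "triplet of file information" mentioned in HIVE-25628 can be thought of as a (path, modification time, length) tuple used as a cache key. The class below is a hypothetical sketch, not Hive's actual LLAP code; the point of the change is that these three values can come from Iceberg's own file metadata instead of an extra FileSystem.listStatus call:

```java
// Hypothetical sketch of an LLAP-style cache key built from a
// (path, modificationTime, length) triplet. Whether the triplet comes from
// FS.listStatus or from Iceberg metadata, equal triplets must yield equal
// keys, so equals/hashCode cover all three fields.
import java.util.Objects;

public final class FileCacheKey {
    private final String path;
    private final long modificationTime;
    private final long length;

    public FileCacheKey(String path, long modificationTime, long length) {
        this.path = path;
        this.modificationTime = modificationTime;
        this.length = length;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FileCacheKey)) {
            return false;
        }
        FileCacheKey k = (FileCacheKey) o;
        return modificationTime == k.modificationTime
                && length == k.length
                && path.equals(k.path);
    }

    @Override
    public int hashCode() {
        return Objects.hash(path, modificationTime, length);
    }
}
```

A key built from Iceberg's metadata and one built from listStatus compare equal as long as the three values match, which is what lets the metadata-based path skip the filesystem call.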
[jira] [Assigned] (HIVE-25658) Fix regex for masking totalSize table properties in Iceberg q-tests
[ https://issues.apache.org/jira/browse/HIVE-25658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Bod reassigned HIVE-25658: - > Fix regex for masking totalSize table properties in Iceberg q-tests > --- > > Key: HIVE-25658 > URL: https://issues.apache.org/jira/browse/HIVE-25658 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > > HIVE-25607 introduced a text replace regex for masking out the totalSize > table property values in Iceberg q.out files. The regex however did not cover > all of the props in the q.out files, so here is the fix for the regex. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing
[ https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=671407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671407 ] ASF GitHub Bot logged work on HIVE-25596: - Author: ASF GitHub Bot Created on: 28/Oct/21 12:26 Start Date: 28/Oct/21 12:26 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #2724: URL: https://github.com/apache/hive/pull/2724#discussion_r738331249 ## File path: metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql ## @@ -527,7 +527,8 @@ CREATE EXTERNAL TABLE IF NOT EXISTS `REPLICATION_METRICS` ( `POLICY_NAME` string, `DUMP_EXECUTION_ID` bigint, `METADATA` string, -`PROGRESS` string +`PROGRESS` string, +`MESSAGE_FORMAT` varchar(16) Review comment: varchar or string? ## File path: standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql ## @@ -1367,7 +1367,8 @@ CREATE TABLE "REPLICATION_METRICS" ( "RM_DUMP_EXECUTION_ID" bigint NOT NULL, "RM_METADATA" varchar(max), "RM_PROGRESS" varchar(max), - "RM_START_TIME" integer NOT NULL + "RM_START_TIME" integer NOT NULL, + MESSAGE_FORMAT nvarchar(16), Review comment: typo nvarchar ## File path: metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql ## @@ -1466,7 +1466,8 @@ CREATE EXTERNAL TABLE IF NOT EXISTS `REPLICATION_METRICS` ( `POLICY_NAME` string, `DUMP_EXECUTION_ID` bigint, `METADATA` string, -`PROGRESS` string +`PROGRESS` string, +`MESSAGE_FORMAT` varchar(16) Review comment: is varchar supported? can this be string? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671407) Time Spent: 1h 40m (was: 1.5h) > Compress Hive Replication Metrics while storing > --- > > Key: HIVE-25596 > URL: https://issues.apache.org/jira/browse/HIVE-25596 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Compress the json fields of sys.replication_metrics table to optimise RDBMS > space usage. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25650) Make workerId and workerVersionId optional in the FindNextCompactRequest
[ https://issues.apache.org/jira/browse/HIVE-25650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Csomor resolved HIVE-25650. -- Fix Version/s: 4.0.0 Resolution: Fixed > Make workerId and workerVersionId optional in the FindNextCompactRequest > > > Key: HIVE-25650 > URL: https://issues.apache.org/jira/browse/HIVE-25650 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 4.0.0 >Reporter: Viktor Csomor >Assignee: Viktor Csomor >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > In hive_metastore.thrift the FindNextCompactRequest struct's fields are > required: > {code} > struct FindNextCompactRequest { > 1: required string workerId, > 2: required string workerVersion > }{code} > these should probably be made optional, to avoid breaking compaction if > they're not available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25650) Make workerId and workerVersionId optional in the FindNextCompactRequest
[ https://issues.apache.org/jira/browse/HIVE-25650?focusedWorklogId=671302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671302 ] ASF GitHub Bot logged work on HIVE-25650: - Author: ASF GitHub Bot Created on: 28/Oct/21 08:58 Start Date: 28/Oct/21 08:58 Worklog Time Spent: 10m Work Description: lcspinter merged pull request #2749: URL: https://github.com/apache/hive/pull/2749 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671302) Time Spent: 0.5h (was: 20m) > Make workerId and workerVersionId optional in the FindNextCompactRequest > > > Key: HIVE-25650 > URL: https://issues.apache.org/jira/browse/HIVE-25650 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 4.0.0 >Reporter: Viktor Csomor >Assignee: Viktor Csomor >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In hive_metastore.thrift the FindNextCompactRequest struct's fields are > required: > {code} > struct FindNextCompactRequest { > 1: required string workerId, > 2: required string workerVersion > }{code} > these should probably be made optional, to avoid breaking compaction if > they're not available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
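The change described in HIVE-25650 swaps the qualifier on both fields of the quoted struct. Assuming the field names stay exactly as in the snippet above, the optional variant would read:

```thrift
struct FindNextCompactRequest {
  1: optional string workerId,
  2: optional string workerVersion
}
```

With `optional`, a client that omits these fields can still issue the request instead of failing Thrift validation, which is the compaction-breaking scenario the issue describes.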
[jira] [Work logged] (HIVE-25656) Get materialized view state based on number of affected rows by transactions
[ https://issues.apache.org/jira/browse/HIVE-25656?focusedWorklogId=671293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671293 ] ASF GitHub Bot logged work on HIVE-25656: - Author: ASF GitHub Bot Created on: 28/Oct/21 08:38 Start Date: 28/Oct/21 08:38 Worklog Time Spent: 10m Work Description: kasakrisz opened a new pull request #2756: URL: https://github.com/apache/hive/pull/2756 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 671293) Remaining Estimate: 0h Time Spent: 10m > Get materialized view state based on number of affected rows by transactions > > > Key: HIVE-25656 > URL: https://issues.apache.org/jira/browse/HIVE-25656 > Project: Hive > Issue Type: Improvement > Components: Materialized views, Transactions >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > To enable faster incremental rebuild of materialized views, the presence of > update/delete operations on the source tables of the view since the last > rebuild must be checked. Based on the outcome, different plans are generated for > the update/delete and insert-only scenarios. > Currently this is done by querying the COMPLETED_TXN_COMPONENTS table; however, > the records from this table are cleaned when MV source tables are compacted. > This reduces the chances of incremental MV rebuild. > The goal of this patch is to find an alternative way to store and retrieve > this information. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25656) Get materialized view state based on number of affected rows by transactions
[ https://issues.apache.org/jira/browse/HIVE-25656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25656: -- Labels: pull-request-available (was: ) > Get materialized view state based on number of affected rows by transactions > > > Key: HIVE-25656 > URL: https://issues.apache.org/jira/browse/HIVE-25656 > Project: Hive > Issue Type: Improvement > Components: Materialized views, Transactions >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > To enable faster incremental rebuild of materialized views, the presence of > update/delete operations on the source tables of the view since the last > rebuild must be checked. Based on the outcome, different plans are generated for > the update/delete and insert-only scenarios. > Currently this is done by querying the COMPLETED_TXN_COMPONENTS table; however, > the records from this table are cleaned when MV source tables are compacted. > This reduces the chances of incremental MV rebuild. > The goal of this patch is to find an alternative way to store and retrieve > this information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25656) Get materialized view state based on number of affected rows by transactions
[ https://issues.apache.org/jira/browse/HIVE-25656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-25656: - > Get materialized view state based on number of affected rows by transactions > > > Key: HIVE-25656 > URL: https://issues.apache.org/jira/browse/HIVE-25656 > Project: Hive > Issue Type: Improvement > Components: Materialized views, Transactions >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Fix For: 4.0.0 > > > To enable faster incremental rebuild of materialized views, the presence of > update/delete operations on the source tables of the view since the last > rebuild must be checked. Based on the outcome, different plans are generated for > the update/delete and insert-only scenarios. > Currently this is done by querying the COMPLETED_TXN_COMPONENTS table; however, > the records from this table are cleaned when MV source tables are compacted. > This reduces the chances of incremental MV rebuild. > The goal of this patch is to find an alternative way to store and retrieve > this information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25647) hadoop memo
[ https://issues.apache.org/jira/browse/HIVE-25647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] St Li updated HIVE-25647: - Description:
Nodes: master, slave1, slave2   //opt represent wechat hadoop bigdata dev //2019: bigdata competition
Web UIs: hadoop 50070, hbase 16010, storm 8080

#hostname
hostnamectl set-hostname master && bash
hostname master && bash
hostname slave1/slave2 && bash
vim /etc/hostname      # master/slave1/slave2
vim /etc/hosts         # <ip> master / <ip> slave1 / <ip> slave2

#yum
cd /etc/yum.repos.d && rm -rf *
wget http://172.16.47.240/bigdata/repofile/bigdata.repo
yum clean all

#firewall
systemctl stop firewalld
systemctl status firewalld

#timezone
tzselect               # 5-9-1-1
echo "TZ='Asia/Shanghai'; export TZ" >> /etc/profile && source /etc/profile

#ntp
yum install -y ntp
vim /etc/ntp.conf      # comment out: server 0~3.centos.pool.ntp.org iburst; add:
server 127.127.1.0
fudge 127.127.1.0 stratum 10
/bin/systemctl restart ntpd.service
ntpdate master         # on slave1, slave2

#crontab
service crond status
/sbin/service crond start
crontab -e
*/30 8-17 * * * /usr/sbin/ntpdate master
crontab -l

#ssh password-less login
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2

#install jdk
mkdir -p /usr/java
tar -zxvf jdk-8u171-linux-x64.tar.gz -C /usr/java/
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_171
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile && java -version
scp -r /usr/java root@slave1:/usr/
scp -r /usr/java root@slave2:/usr/

#install hadoop
mkdir -p /usr/hadoop && cd /usr/hadoop
tar -zxvf /usr/hadoop/hadoop-2.7.3.tar.gz -C /usr/hadoop/
rm -rf /usr/hadoop/hadoop-2.7.3.tar.gz
vim /etc/profile
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
hadoop                 # test
# in hadoop-env.sh / mapred-env.sh / yarn-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_171

##core-site.xml
fs.default.name = hdfs://master:9000
hadoop.tmp.dir = /usr/hadoop/hadoop-2.7.3/hdfs/tmp
io.file.buffer.size = 131072
fs.checkpoint.period = 60
fs.checkpoint.size = 67108864

##hdfs-site.xml
dfs.replication = 2
dfs.namenode.name.dir = file:/usr/hadoop/hadoop-2.7.3/hdfs/name
dfs.datanode.data.dir = file:/usr/hadoop/hadoop-2.7.3/hdfs/data

##yarn-site.xml
yarn.resourcemanager.address = master:18040
yarn.resourcemanager.scheduler.address = master:18030
yarn.resourcemanager.webapp.address = master:18088
yarn.resourcemanager.resource-tracker.address = master:18025
yarn.resourcemanager.admin.address = master:18141
yarn.nodemanager.aux-services = mapreduce_shuffle
yarn.nodemanager.auxservices.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler

##mapred-site.xml
mapreduce.framework.name = yarn

#masters/slaves files
echo master > masters && echo slave1 > slaves && echo slave2 >> slaves

#hadoop format
hadoop namenode -format    # on master; look for "has been successfully formatted"

#start hadoop
start-all.sh
# master: NameNode, SecondaryNameNode, ResourceManager
# slave1~2: DataNode, NodeManager
# or start daemons individually:
start-dfs.sh
start-yarn.sh
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh start resourcemanager
hadoop-daemon.sh start nodemanager

#test hdfs & mapreduce
hadoop fs -mkdir /input
hadoop fs -put $HADOOP_HOME/README.txt /input
# browse http://master:50070
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.4.jar

#install hive
yum -y install mysql-community-server
# slave2: mysql server, slave1: hive server, master: hive client
systemctl daemon-reload
systemctl start mysqld
cat /var/log/mysqld.log | grep "temporary password"
mysql -uroot -p
set global validate_password_policy=0;
set global validate_password_length=4;
alter user 'root'@'localhost' identified by '123456';
mysql -uroot -p123456
create user 'root'@'%' identified by '123456';
grant all privileges on *.* to 'root'@'%' with grant option;
flush privileges;

mkdir -p /usr/hive
tar -zxvf /usr/hive/apache-hive-2.1.1-bin.tar.gz -C /usr/hive/
vim /etc/profile       # for hive
export HIVE_HOME=/usr/hive/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile
cd $HIVE_HOME/conf && vim hive-env.sh
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.3
export HIVE_CONF_DIR=/usr/hive/apache-hive-2.1.1-bin/conf
export HIVE_AUX_JARS_PATH=/usr/hive/apache-hive-2.1.1-bin/lib
cp $HIVE_HOME/lib/jline-2.12.jar $HADOOP_HOME/share/hadoop/yarn/lib/

##slave1 hive-server
cd $HIVE_HOME/lib && wget (or cp) mysql-connector-java-5.1.47-bin.jar
hive-site.xml (hive-server):
hive.metastore.warehouse.dir = /user/hive_remote/warehouse
javax.jdo.option.ConnectionDriverName = com.mysql.jdbc.Driver
javax.jdo.option.ConnectionURL = jdbc:mysql://slave2:3306/hive?createDatabaseIfNotExist=true&useSSL=false
javax.jdo.option.ConnectionUserName = root
javax.jdo.option.ConnectionPassword = 123456
hive-site.xml (hive