[jira] [Created] (HIVE-25442) Initiator speed-up: only read compaction history once per loop
Zoltan Chovan created HIVE-25442:

Summary: Initiator speed-up: only read compaction history once per loop
Key: HIVE-25442
URL: https://issues.apache.org/jira/browse/HIVE-25442
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

In checkFailedCompactions (which is called for every partition in the list of potential compaction candidates) we select from the metadata table COMPLETED_COMPACTIONS. But the Initiator main loop already holds a ShowCompactResponse, so we can use that instead. For cases where the metadata is huge, this will help speed up the Initiator.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
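A minimal sketch of the idea, not the actual Initiator code: the history that the main loop has already fetched is indexed by partition once, and the per-partition failure check becomes an in-memory lookup. `CompactionRecord`, `CompactionHistoryIndex`, the key format, and the `"failed"` state string are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one entry of the compaction history; not Hive's own class.
final class CompactionRecord {
  final String partitionKey; // assumed key format, e.g. "db.table/partition"
  final String state;        // assumed state names, e.g. "failed", "succeeded"

  CompactionRecord(String partitionKey, String state) {
    this.partitionKey = partitionKey;
    this.state = state;
  }
}

final class CompactionHistoryIndex {
  private final Map<String, List<CompactionRecord>> byPartition = new HashMap<>();

  // Built once per Initiator loop from the history that was already fetched.
  CompactionHistoryIndex(List<CompactionRecord> history) {
    for (CompactionRecord r : history) {
      byPartition.computeIfAbsent(r.partitionKey, k -> new ArrayList<>()).add(r);
    }
  }

  // Counts failed compactions for one partition without another metastore query.
  long failedCount(String partitionKey) {
    List<CompactionRecord> records = byPartition.get(partitionKey);
    if (records == null) {
      return 0;
    }
    return records.stream().filter(r -> "failed".equals(r.state)).count();
  }
}
```

With this shape, the cost of reading the history is paid once per loop instead of once per candidate partition.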
[jira] [Created] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation
Zoltan Chovan created HIVE-25346:

Summary: cleanTxnToWriteIdTable breaks SNAPSHOT isolation
Key: HIVE-25346
URL: https://issues.apache.org/jira/browse/HIVE-25346
Project: Hive
Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
[jira] [Created] (HIVE-24798) refactor TxnHandler.cleanupRecords to use predefined query strings
Zoltan Chovan created HIVE-24798:

Summary: refactor TxnHandler.cleanupRecords to use predefined query strings
Key: HIVE-24798
URL: https://issues.apache.org/jira/browse/HIVE-24798
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

TxnHandler.cleanupRecords should use predefined query strings instead of using a StringBuffer to build the delete queries.
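A rough sketch of what "predefined query strings" could look like, under stated assumptions: the class, constant names, and the table and column names below are illustrative and are not taken from TxnHandler. Each delete is a constant with a `?` placeholder, and the value is bound through a `PreparedStatement` rather than appended to a buffer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for the predefined statements; names are assumptions.
final class CleanupQueries {
  static final String DELETE_TXN_COMPONENTS_BY_DB =
      "DELETE FROM \"TXN_COMPONENTS\" WHERE \"TC_DATABASE\" = ?";
  static final String DELETE_COMPLETED_TXN_COMPONENTS_BY_DB =
      "DELETE FROM \"COMPLETED_TXN_COMPONENTS\" WHERE \"CTC_DATABASE\" = ?";

  // Statements to run when a whole database is cleaned up.
  static final List<String> DB_LEVEL = Collections.unmodifiableList(Arrays.asList(
      DELETE_TXN_COMPONENTS_BY_DB,
      DELETE_COMPLETED_TXN_COMPONENTS_BY_DB));

  // Runs every predefined delete with the database name bound as a parameter.
  static int cleanupDatabase(Connection conn, String dbName) throws SQLException {
    int deleted = 0;
    for (String sql : DB_LEVEL) {
      try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, dbName);
        deleted += ps.executeUpdate();
      }
    }
    return deleted;
  }
}
```

Keeping the statements as constants makes them easy to review in one place and avoids building SQL from user-supplied names.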
[jira] [Created] (HIVE-24753) Non blocking DROP PARTITION implementation
Zoltan Chovan created HIVE-24753:

Summary: Non blocking DROP PARTITION implementation
Key: HIVE-24753
URL: https://issues.apache.org/jira/browse/HIVE-24753
Project: Hive
Issue Type: New Feature
Reporter: Zoltan Chovan

Implement a way to execute drop partition operations without having to wait for currently running read operations to finish.
[jira] [Created] (HIVE-24445) Non blocking DROP table implementation
Zoltan Chovan created HIVE-24445:

Summary: Non blocking DROP table implementation
Key: HIVE-24445
URL: https://issues.apache.org/jira/browse/HIVE-24445
Project: Hive
Issue Type: New Feature
Components: Hive
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Implement a way to execute drop table operations without having to wait for currently running read operations to finish.
[jira] [Created] (HIVE-23391) Change requested lock for ALTER TABLE ADD COLUMN to DDL_SHARED
Zoltan Chovan created HIVE-23391:

Summary: Change requested lock for ALTER TABLE ADD COLUMN to DDL_SHARED
Key: HIVE-23391
URL: https://issues.apache.org/jira/browse/HIVE-23391
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan

A long-running query can block a simple add column query, because add column currently requires a DDL_EXCLUSIVE lock. By changing this to a shared lock, this metadata-only query can be executed without having to wait for the previous query to finish.
[jira] [Created] (HIVE-23363) Upgrade DataNucleus dependency to 5.2
Zoltan Chovan created HIVE-23363:

Summary: Upgrade DataNucleus dependency to 5.2
Key: HIVE-23363
URL: https://issues.apache.org/jira/browse/HIVE-23363
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
[jira] [Created] (HIVE-23044) Make sure Cleaner doesn't delete delta directories for running queries
Zoltan Chovan created HIVE-23044:

Summary: Make sure Cleaner doesn't delete delta directories for running queries
Key: HIVE-23044
URL: https://issues.apache.org/jira/browse/HIVE-23044
Project: Hive
Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
[jira] [Created] (HIVE-22899) Make sure qtests clean up copied files from test directories
Zoltan Chovan created HIVE-22899:

Summary: Make sure qtests clean up copied files from test directories
Key: HIVE-22899
URL: https://issues.apache.org/jira/browse/HIVE-22899
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Several qtest files copy schema or test files to the test directories (such as ${system:test.tmp.dir} and ${hiveconf:hive.metastore.warehouse.dir}), often without changing the name of the copied file. When the same file is copied by another qtest to the same directory, the copy, and hence the test, fails. This can lead to flaky tests when any two of these qtests get scheduled in the same batch.

To avoid these failures, we should make sure that the files copied to the test dirs have unique names and that they are cleaned up by the same qtest file that copies them.
[jira] [Created] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
Zoltan Chovan created HIVE-22869:

Summary: Add locking benchmark to metastore-tools/metastore-benchmarks
Key: HIVE-22869
URL: https://issues.apache.org/jira/browse/HIVE-22869
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Add the possibility to run benchmarks on opening locks in the HMS.
[jira] [Created] (HIVE-22804) Ensure ANSI quotes are used for mysql connections
Zoltan Chovan created HIVE-22804:

Summary: Ensure ANSI quotes are used for mysql connections
Key: HIVE-22804
URL: https://issues.apache.org/jira/browse/HIVE-22804
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Recent changes in direct SQL queries to resolve Postgres issues (e.g. TxnHandler in HIVE-22663) break compatibility with the MySQL backend db.

A workaround for these issues is to add a session config to the MySQL connection string, e.g.:
{code:java}
jdbc:mysql://localhost:3306/db?sessionVariables=sql_mode=ANSI_QUOTES
{code}
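One way the workaround could be applied automatically is sketched below. This is only an illustration: `MySqlUrl` and `withAnsiQuotes` are hypothetical names, and the helper assumes the URL does not already carry a different `sessionVariables` entry (merging into an existing one is not handled).

```java
// Hypothetical helper; not part of Hive.
final class MySqlUrl {
  static final String ANSI_QUOTES = "sessionVariables=sql_mode=ANSI_QUOTES";

  // Appends the ANSI_QUOTES session variable to MySQL JDBC URLs and leaves
  // other URLs, or URLs that already set it, untouched.
  static String withAnsiQuotes(String jdbcUrl) {
    if (!jdbcUrl.startsWith("jdbc:mysql:") || jdbcUrl.contains("sql_mode=ANSI_QUOTES")) {
      return jdbcUrl;
    }
    String separator = jdbcUrl.contains("?") ? "&" : "?";
    return jdbcUrl + separator + ANSI_QUOTES;
  }
}
```

For example, `MySqlUrl.withAnsiQuotes("jdbc:mysql://localhost:3306/db")` yields the connection string shown above.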
[jira] [Created] (HIVE-22790) QTests seem to fail when using MySQL docker backend db
Zoltan Chovan created HIVE-22790:

Summary: QTests seem to fail when using MySQL docker backend db
Key: HIVE-22790
URL: https://issues.apache.org/jira/browse/HIVE-22790
Project: Hive
Issue Type: Bug
Components: Standalone Metastore, Testing Infrastructure
Reporter: Zoltan Chovan

QTests seem to fail when run against the MySQL docker backend db, for example with the following command:
{code:java}
mvn test -Dtest.output.overwrite=true -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mysql
{code}
Interestingly, it seems to fail even when checking out the patch (HIVE-21954) that introduced the dockerized testing.

cc: [~abstractdog]
[jira] [Created] (HIVE-22750) Consolidate LockType naming
Zoltan Chovan created HIVE-22750:

Summary: Consolidate LockType naming
Key: HIVE-22750
URL: https://issues.apache.org/jira/browse/HIVE-22750
Project: Hive
Issue Type: Improvement
Components: Transactions
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Extend enum with string literal to remove unnecessary `id` to `char` casting for the LockType:
{code:java}
switch (lockType) {
  case EXCLUSIVE:
    lockChar = LOCK_EXCLUSIVE;
    break;
  case SHARED_READ:
    lockChar = LOCK_SHARED;
    break;
  case SHARED_WRITE:
    lockChar = LOCK_SEMI_SHARED;
    break;
}
{code}
Consolidate LockType naming in code and schema upgrade scripts:
{code:java}
CASE WHEN HL.`HL_LOCK_TYPE` = 'e' THEN 'exclusive'
     WHEN HL.`HL_LOCK_TYPE` = 'r' THEN 'shared'
     WHEN HL.`HL_LOCK_TYPE` = 'w' THEN *'semi-shared'*
END AS LOCK_TYPE,
{code}
EXCL_DROP
EXCL_WRITE
SHARED_WRITE
SHARED_READ
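The "enum with string literal" idea might look roughly like the sketch below. `LockTypeSketch` is a hypothetical name, not the real LockType, and the pairing of each constant with the characters 'e', 'r', 'w' is an assumption inferred from the switch and the CASE expression above.

```java
// Hypothetical enum; each constant carries its stored character and display label,
// so no switch is needed to translate between them.
enum LockTypeSketch {
  EXCLUSIVE('e', "exclusive"),
  SHARED_READ('r', "shared"),
  SHARED_WRITE('w', "semi-shared");

  private final char code;
  private final String label;

  LockTypeSketch(char code, String label) {
    this.code = code;
    this.label = label;
  }

  char code() {
    return code;
  }

  String label() {
    return label;
  }

  // Reverse lookup from the stored character to the enum constant.
  static LockTypeSketch fromCode(char code) {
    for (LockTypeSketch t : values()) {
      if (t.code == code) {
        return t;
      }
    }
    throw new IllegalArgumentException("Unknown lock type code: " + code);
  }
}
```

With the mapping kept on the enum, the code and the schema scripts can refer to a single list of names and characters.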
[jira] [Created] (HIVE-22741) Speed up ObjectStore method getTableMeta
Zoltan Chovan created HIVE-22741:

Summary: Speed up ObjectStore method getTableMeta
Key: HIVE-22741
URL: https://issues.apache.org/jira/browse/HIVE-22741
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
[jira] [Created] (HIVE-22727) Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts
Zoltan Chovan created HIVE-22727:

Summary: Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts
Key: HIVE-22727
URL: https://issues.apache.org/jira/browse/HIVE-22727
Project: Hive
Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
[jira] [Created] (HIVE-22628) Add locks and transactions tables from sys db to information_schema
Zoltan Chovan created HIVE-22628:

Summary: Add locks and transactions tables from sys db to information_schema
Key: HIVE-22628
URL: https://issues.apache.org/jira/browse/HIVE-22628
Project: Hive
Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
Fix For: 4.0.0
[jira] [Created] (HIVE-22627) Add schema changes introduced in HIVE-21443 to the schema upgrade scripts
Zoltan Chovan created HIVE-22627:

Summary: Add schema changes introduced in HIVE-21443 to the schema upgrade scripts
Key: HIVE-22627
URL: https://issues.apache.org/jira/browse/HIVE-22627
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
[jira] [Created] (HIVE-22553) Expose locks and transactions in sys db
Zoltan Chovan created HIVE-22553:

Summary: Expose locks and transactions in sys db
Key: HIVE-22553
URL: https://issues.apache.org/jira/browse/HIVE-22553
Project: Hive
Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

Create new sysdb tables/views to access lock and transaction data.

This provides admins with live data about ongoing locks and transactions. Because the data lives in the sys db, access to this information can be restricted to select privileged users.

Information about locks and compactions can be joined and accessed at the same time. Compaction-related transactions would also be visible.
[jira] [Created] (HIVE-22546) Postgres schema not using quoted identifiers for certain tables
Zoltan Chovan created HIVE-22546:

Summary: Postgres schema not using quoted identifiers for certain tables
Key: HIVE-22546
URL: https://issues.apache.org/jira/browse/HIVE-22546
Project: Hive
Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

In the latest postgresql schema (standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql) the following tables have lowercase table and column names:
{code:java}
aux_table
compaction_queue
completed_compactions
completed_txn_components
hive_locks
materialization_rebuild_locks
min_history_level
next_compaction_queue_id
next_lock_id
next_txn_id
next_write_id
repl_txn_map
runtime_stats
txn_components
txn_to_write_id
txns
write_set
{code}
As these tables are referenced from the Hive sys database, the queries to these tables will fail with a "Table not found" error.

The problem is that the table and column names are not enclosed in quotes, so postgres will turn these identifiers into lowercase.
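The case-folding behaviour behind this can be illustrated with a small sketch. `PgIdentifiers` and its methods are hypothetical helpers that only mimic the rule (unquoted identifiers are folded to lowercase, double-quoted ones are kept verbatim); they are not part of Hive or of the PostgreSQL driver.

```java
import java.util.Locale;

// Hypothetical helper mimicking PostgreSQL identifier handling.
final class PgIdentifiers {
  // Returns the name PostgreSQL would look up for an identifier as written in SQL text.
  static String resolve(String written) {
    if (written.length() >= 2 && written.startsWith("\"") && written.endsWith("\"")) {
      // Quoted: case is preserved, doubled quotes stand for a literal quote.
      return written.substring(1, written.length() - 1).replace("\"\"", "\"");
    }
    // Unquoted: folded to lowercase.
    return written.toLowerCase(Locale.ROOT);
  }

  // Wraps a name in double quotes so that its case survives.
  static String quote(String name) {
    return "\"" + name.replace("\"", "\"\"") + "\"";
  }
}
```

A table created as `CREATE TABLE COMPACTION_QUEUE` therefore ends up stored as `compaction_queue`, and a later reference to `"COMPACTION_QUEUE"` does not find it.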
[jira] [Created] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels
Zoltan Chovan created HIVE-20267:

Summary: Expanding WebUI to include form to dynamically config log levels
Key: HIVE-20267
URL: https://issues.apache.org/jira/browse/HIVE-20267
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan

To make it easier to change log levels at runtime, the WebUI can be extended to interact with the Log4j2ConfiguratorServlet. This way the servlet can be used directly, and users/admins don't need to execute curl commands from the command line.
[jira] [Created] (HIVE-17334) Problem in interpreting structure columns in view.
Zoltan Chovan created HIVE-17334:

Summary: Problem in interpreting structure columns in view.
Key: HIVE-17334
URL: https://issues.apache.org/jira/browse/HIVE-17334
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Priority: Minor

Reproduction steps, to set up the env:
{code:java}
drop view if exists test_db.view_a;
drop view if exists test_db.view_b;
drop table if exists test_db.table_a;
create database if not exists test_db;
create external table if not exists test_db.table_a (
  `id` bigint,
  `users` array>
)
partitioned by (p_date int)
stored as parquet
location '/user/hive/';
{code}
With this, the following scenario will fail:
{code:java}
select
  t.p_date,
  t.id,
  p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over (partition by t.id) as total_actions
from test_db.table_a as t
lateral view explode(t.users) users_two as p
group by t.p_date, t.id, p.user_id;

create view if not exists test_db.view_a as
select
  t.p_date,
  t.id,
  p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over (partition by t.id) as total_actions
from test_db.table_a as t
lateral view explode(t.users) users_two as p
group by t.p_date, t.id, p.user_id;

select * from test_db.view_a where p_date = 20170711;
{code}
The following scenario will succeed:
{code:java}
create view if not exists test_db.view_b as
select
  base.*,
  sum(base.counter) over (partition by base.id) as total_actions
from (
  select
    t.p_date,
    t.id,
    p.user_id,
    sum(p.counter) as counter
  from test_db.table_a as t
  lateral view explode(t.users) users_two as p
  group by t.p_date, t.id, p.user_id
) as base;

select * from test_db.view_b where p_date = 20170711;
{code}
As you can see, the only difference is the addition of "sum(sum(p.counter)) over (partition by t.id) as total_actions".

If the view is created as follows, then the query that was breaking works:
{code:java}
create view if not exists test_db.view_c as
select
  dt.p_date,
  dt.id,
  users_two.p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over (partition by dt.id) as total_actions
from test_db.table_a as dt
lateral view explode(dt.users) users_two as p
group by dt.p_date, dt.id, users_two.p.user_id;

select * from test_db.view_c where p_date = 20170711;
{code}
The workaround is to use the table name along with the column name, e.g. users_two.p.user_id.

Please advise if this is a bug and if it could be fixed.
[jira] [Created] (HIVE-17310) Regex column referencing in aggregate functions/group by
Zoltan Chovan created HIVE-17310:

Summary: Regex column referencing in aggregate functions/group by
Key: HIVE-17310
URL: https://issues.apache.org/jira/browse/HIVE-17310
Project: Hive
Issue Type: New Feature
Reporter: Zoltan Chovan
Priority: Minor

The following works as expected:
{code:java}
set hive.support.quoted.identifiers=none;
SELECT `(id|created_at)?+.+`
FROM test_table
WHERE date between "2017-07-01" and "2017-08-01";
{code}
However, the query fails when adding count/group by as follows:
{code:java}
set hive.support.quoted.identifiers=none;
SELECT count(*), `(id|created_at)?+.+`
FROM test_table
WHERE date between "2017-07-01" and "2017-08-01"
GROUP BY `(id|created_at)?+.+`;
{code}
Currently this fails with an error. Would it be feasible to implement this feature?
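For reference, the column list that such a regex expands to can be sketched with java.util.regex. `RegexColumns` is a hypothetical helper, and it assumes that a column is selected when the whole name matches the pattern case-insensitively; Hive's own expansion may differ in detail. The requested feature would need GROUP BY to expand to the same list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; expands a regex column specification against known column names.
final class RegexColumns {
  static List<String> select(String regex, List<String> columns) {
    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    List<String> selected = new ArrayList<>();
    for (String column : columns) {
      // matches() requires the entire column name to match the pattern.
      if (pattern.matcher(column).matches()) {
        selected.add(column);
      }
    }
    return selected;
  }
}
```

The possessive `?+` consumes `id` or `created_at` without backtracking, so `.+` has nothing left to match for exactly those two names and they are excluded, while every other column is kept.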
[jira] [Created] (HIVE-17058) Separate configuration parameter for setting staging directory on S3
Zoltan Chovan created HIVE-17058:

Summary: Separate configuration parameter for setting staging directory on S3
Key: HIVE-17058
URL: https://issues.apache.org/jira/browse/HIVE-17058
Project: Hive
Issue Type: Improvement
Reporter: Zoltan Chovan
Priority: Minor

Currently there is one parameter that is used for setting the staging directory: hive.exec.stagingdir, which will be used both on HDFS and on S3 when set.

A current workaround is to manually set the staging directory before inserting into an S3 table.

The requested feature is to introduce a new parameter that would only set the location of the staging directory on S3, thus separating it from the HDFS staging directory setting.
[jira] [Created] (HIVE-15792) Hive should raise SemanticException when LPAD/RPAD pad character's length is 0
Zoltan Chovan created HIVE-15792:

Summary: Hive should raise SemanticException when LPAD/RPAD pad character's length is 0
Key: HIVE-15792
URL: https://issues.apache.org/jira/browse/HIVE-15792
Project: Hive
Issue Type: Bug
Reporter: Zoltan Chovan
Priority: Minor

For example, SELECT LPAD('A', 2, ''); will cause an infinite loop, and the running query will hang without any error.

It would be great if this could be prevented by checking the pad character's length and throwing a SemanticException if it is 0.
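The requested guard can be sketched as follows. This is not Hive's LPAD implementation: `PadSketch.lpad` is a hypothetical stand-alone function, and IllegalArgumentException stands in for SemanticException so that the example has no Hive dependency.

```java
// Hypothetical left-pad with the proposed check on the pad string.
final class PadSketch {
  static String lpad(String s, int len, String pad) {
    if (len < 0) {
      throw new IllegalArgumentException("target length must not be negative");
    }
    if (pad.isEmpty()) {
      // Without this check the padding loop below would never make progress.
      throw new IllegalArgumentException("pad string must not be empty");
    }
    if (s.length() >= len) {
      return s.substring(0, len); // already long enough: truncate to len
    }
    int missing = len - s.length();
    StringBuilder padding = new StringBuilder(len);
    while (padding.length() < missing) {
      padding.append(pad);
    }
    padding.setLength(missing); // cut off any overshoot from a multi-character pad
    return padding.append(s).toString();
  }
}
```

With the check in place, `PadSketch.lpad("A", 2, "")` fails fast instead of looping.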