[jira] [Created] (HIVE-25442) Initiator speed-up: only read compaction history once per loop

2021-08-10 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-25442:


 Summary: Initiator speed-up: only read compaction history once per 
loop
 Key: HIVE-25442
 URL: https://issues.apache.org/jira/browse/HIVE-25442
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


In checkFailedCompactions (which is called for every partition in the list of 
potentials to compact) we select from metadata table COMPLETED_COMPACTIONS.

But the Initiator main loop already has a ShowCompactResponse. We can use that 
instead.

For cases where metadata is huge, this will help speed up the Initiator.
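
A minimal sketch of the idea, assuming the compaction history has already been fetched once at the top of the Initiator loop. `CompactionEntry`, `FailedCompactionIndex`, the `"failed"` state string and the simple count-versus-threshold rule are hypothetical stand-ins, not Hive's actual classes or logic. The sketch only shows how the per-partition check can become an in-memory lookup rather than a query against COMPLETED_COMPACTIONS.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one entry of the compaction history (not a Hive class).
final class CompactionEntry {
  final String db;
  final String table;
  final String partition;
  final String state;

  CompactionEntry(String db, String table, String partition, String state) {
    this.db = db;
    this.table = table;
    this.partition = partition;
    this.state = state;
  }
}

// Index built once per loop iteration from the already-fetched history.
final class FailedCompactionIndex {
  private final Map<String, Integer> failedCounts = new HashMap<>();

  FailedCompactionIndex(List<CompactionEntry> history) {
    for (CompactionEntry e : history) {
      if ("failed".equals(e.state)) { // the state string is an assumption
        failedCounts.merge(key(e.db, e.table, e.partition), 1, Integer::sum);
      }
    }
  }

  private static String key(String db, String table, String partition) {
    return db + "." + table + "/" + partition;
  }

  // In-memory replacement for the per-partition metadata query.
  boolean exceedsThreshold(String db, String table, String partition, int threshold) {
    return failedCounts.getOrDefault(key(db, table, partition), 0) >= threshold;
  }
}
```

With an index like this, each candidate partition costs one map lookup, so the number of metadata queries no longer grows with the number of partitions considered.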



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation

2021-07-19 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-25346:


 Summary: cleanTxnToWriteIdTable breaks SNAPSHOT isolation
 Key: HIVE-25346
 URL: https://issues.apache.org/jira/browse/HIVE-25346
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan








[jira] [Created] (HIVE-24798) refactor TxnHandler.cleanupRecords to use predefined query strings

2021-02-19 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-24798:


 Summary: refactor TxnHandler.cleanupRecords to use predefined 
query strings
 Key: HIVE-24798
 URL: https://issues.apache.org/jira/browse/HIVE-24798
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


TxnHandler.cleanupRecords should use predefined query strings instead of using 
a stringbuffer to build the delete queries.





[jira] [Created] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-08 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-24753:


 Summary: Non blocking DROP PARTITION implementation
 Key: HIVE-24753
 URL: https://issues.apache.org/jira/browse/HIVE-24753
 Project: Hive
  Issue Type: New Feature
Reporter: Zoltan Chovan


Implement a way to execute drop partition operations in a way that doesn't have 
to wait for currently running read operations to be finished.





[jira] [Created] (HIVE-24445) Non blocking DROP table implementation

2020-11-30 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-24445:


 Summary: Non blocking DROP table implementation
 Key: HIVE-24445
 URL: https://issues.apache.org/jira/browse/HIVE-24445
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


Implement a way to execute drop table operations in a way that doesn't have to 
wait for currently running read operations to be finished.





[jira] [Created] (HIVE-23391) Change requested lock for ALTER TABLE ADD COLUMN to DDL_SHARED

2020-05-07 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-23391:


 Summary: Change requested lock for ALTER TABLE ADD COLUMN to 
DDL_SHARED
 Key: HIVE-23391
 URL: https://issues.apache.org/jira/browse/HIVE-23391
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan


A long running query can block a simple add column query, as the add column 
will require a DDL_EXCLUSIVE lock currently. By changing this to a shared lock, 
this metadata only query can be executed without having to wait for the 
previous query to finish.





[jira] [Created] (HIVE-23363) Upgrade DataNucleus dependency to 5.2

2020-05-04 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-23363:


 Summary: Upgrade DataNucleus dependency to 5.2
 Key: HIVE-23363
 URL: https://issues.apache.org/jira/browse/HIVE-23363
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan








[jira] [Created] (HIVE-23044) Make sure Cleaner doesn't delete delta directories for running queries

2020-03-18 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-23044:


 Summary: Make sure Cleaner doesn't delete delta directories for 
running queries
 Key: HIVE-23044
 URL: https://issues.apache.org/jira/browse/HIVE-23044
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan








[jira] [Created] (HIVE-22899) Make sure qtests clean up copied files from test directories

2020-02-18 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22899:


 Summary: Make sure qtests clean up copied files from test 
directories
 Key: HIVE-22899
 URL: https://issues.apache.org/jira/browse/HIVE-22899
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


Several qtest files copy schema or test files to the test directories 
(such as ${system:test.tmp.dir} and ${hiveconf:hive.metastore.warehouse.dir}), 
often without changing the name of the copied file. When the same file is 
copied by another qtest to the same directory, the copy, and hence the test, 
fails. This can lead to flaky tests when any two of these qtests get scheduled 
in the same batch.

 

To avoid these failures, we should make sure the files copied to the 
test dirs have unique names and that they are cleaned up by the same qtest 
file that copies them.





[jira] [Created] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks

2020-02-11 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22869:


 Summary: Add locking benchmark to 
metastore-tools/metastore-benchmarks
 Key: HIVE-22869
 URL: https://issues.apache.org/jira/browse/HIVE-22869
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


Add the possibility to run benchmarks on opening locks in the HMS.





[jira] [Created] (HIVE-22804) Ensure ANSI quotes are used for mysql connections

2020-01-31 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22804:


 Summary: Ensure ANSI quotes are used for mysql connections
 Key: HIVE-22804
 URL: https://issues.apache.org/jira/browse/HIVE-22804
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


Recent changes in direct SQL queries to resolve Postgres issues (e.g. TxnHandler 
in HIVE-22663) break compatibility with the MySQL backend db. 

A workaround for these issues is to add a session config to the mysql 
connection string, e.g.:
{code:java}
jdbc:mysql://localhost:3306/db?sessionVariables=sql_mode=ANSI_QUOTES
{code}
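
As a rough illustration of the workaround, the session setting could be appended to an existing connection string programmatically. The class and method names below are hypothetical; only the `sessionVariables=sql_mode=ANSI_QUOTES` parameter comes from the example above.

```java
// Hypothetical helper (not part of Hive) that adds the ANSI_QUOTES session
// variable to a MySQL JDBC URL unless session variables are already set.
final class MysqlUrlSketch {
  static final String ANSI_QUOTES = "sessionVariables=sql_mode=ANSI_QUOTES";

  static String withAnsiQuotes(String jdbcUrl) {
    if (jdbcUrl.contains("sessionVariables=")) {
      return jdbcUrl; // leave explicitly configured session variables alone
    }
    char separator = jdbcUrl.contains("?") ? '&' : '?';
    return jdbcUrl + separator + ANSI_QUOTES;
  }
}
```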
 





[jira] [Created] (HIVE-22790) QTests seem to fail when using MySQL docker backend db

2020-01-29 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22790:


 Summary: QTests seem to fail when using MySQL docker backend db
 Key: HIVE-22790
 URL: https://issues.apache.org/jira/browse/HIVE-22790
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore, Testing Infrastructure
Reporter: Zoltan Chovan


Qtests seem to fail when run with, for example, the following command:

 
{code:java}
mvn test -Dtest.output.overwrite=true -Pitests -pl itests/qtest 
-Dtest=TestCliDriver -Dqfile=partition_params_postgres.q 
-Dtest.metastore.db=mysql
{code}
 

 

Interestingly it seems to fail even when checking out the patch (HIVE-21954) 
that introduced the dockerized testing.

 

cc: [~abstractdog]





[jira] [Created] (HIVE-22750) Consolidate LockType naming

2020-01-20 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22750:


 Summary: Consolidate LockType naming
 Key: HIVE-22750
 URL: https://issues.apache.org/jira/browse/HIVE-22750
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan




Extend enum with string literal to remove unnecessary `id` to `char` casting 
for the LockType:


{code:java}
switch (lockType) {
  case EXCLUSIVE:
    lockChar = LOCK_EXCLUSIVE;
    break;
  case SHARED_READ:
    lockChar = LOCK_SHARED;
    break;
  case SHARED_WRITE:
    lockChar = LOCK_SEMI_SHARED;
    break;
}
{code}
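
One way the switch above could be avoided is to let each enum constant carry its own encoding. The following is a sketch only: `LockTypeSketch` and its methods are hypothetical names, not Hive's actual enum, and the single-character codes 'e', 'r' and 'w' are the ones that appear in this issue's SQL fragment.

```java
// Hypothetical enum (not Hive's LockType) in which every constant carries
// the character stored in the metastore, so no switch-based mapping is needed.
enum LockTypeSketch {
  EXCLUSIVE('e'),
  SHARED_READ('r'),
  SHARED_WRITE('w');

  private final char encoding;

  LockTypeSketch(char encoding) {
    this.encoding = encoding;
  }

  char getEncoding() {
    return encoding;
  }

  // Reverse lookup from the stored character back to the enum constant.
  static LockTypeSketch fromEncoding(char c) {
    for (LockTypeSketch type : values()) {
      if (type.encoding == c) {
        return type;
      }
    }
    throw new IllegalArgumentException("Unknown lock type encoding: " + c);
  }
}
```

Keeping the encoding next to the constant means a newly added lock type cannot be forgotten in a separate switch statement.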


Consolidate LockType naming in code and schema upgrade scripts:


{code:java}
CASE WHEN HL.`HL_LOCK_TYPE` = 'e' THEN 'exclusive' WHEN HL.`HL_LOCK_TYPE` = 'r' 
THEN 'shared' WHEN HL.`HL_LOCK_TYPE` = 'w' THEN *'semi-shared'* END AS 
LOCK_TYPE,

{code}

EXCL_DROP
EXCL_WRITE
SHARED_WRITE
SHARED_READ






[jira] [Created] (HIVE-22741) Speed up ObjectStore method getTableMeta

2020-01-17 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22741:


 Summary: Speed up ObjectStore method getTableMeta 
 Key: HIVE-22741
 URL: https://issues.apache.org/jira/browse/HIVE-22741
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan








[jira] [Created] (HIVE-22727) Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts

2020-01-14 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22727:


 Summary: Add hive db schema changes introduced in HIVE-21884 to 
the schema upgrade scripts
 Key: HIVE-22727
 URL: https://issues.apache.org/jira/browse/HIVE-22727
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan








[jira] [Created] (HIVE-22628) Add locks and transactions tables from sys db to information_schema

2019-12-11 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22628:


 Summary: Add locks and transactions tables from sys db to 
information_schema
 Key: HIVE-22628
 URL: https://issues.apache.org/jira/browse/HIVE-22628
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
 Fix For: 4.0.0








[jira] [Created] (HIVE-22627) Add schema changes introduced in HIVE-21443 to the schema upgrade scripts

2019-12-11 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22627:


 Summary: Add schema changes introduced in HIVE-21443 to the schema 
upgrade scripts
 Key: HIVE-22627
 URL: https://issues.apache.org/jira/browse/HIVE-22627
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan








[jira] [Created] (HIVE-22553) Expose locks and transactions in sys db

2019-11-27 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22553:


 Summary: Expose locks and transactions in sys db
 Key: HIVE-22553
 URL: https://issues.apache.org/jira/browse/HIVE-22553
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


Create new sysdb tables/views to access lock and transaction data.

This provides admins with live data about ongoing locks and 
transactions. Because the data lives in the sys db, access to it can be 
restricted to select privileged users.



Information about locks and compactions can be joined and accessed at the same 
time.

Compaction related transactions would also be visible.





[jira] [Created] (HIVE-22546) Postgres schema not using quoted identifiers for certain tables

2019-11-26 Thread Zoltan Chovan (Jira)
Zoltan Chovan created HIVE-22546:


 Summary: Postgres schema not using quoted identifiers for certain 
tables
 Key: HIVE-22546
 URL: https://issues.apache.org/jira/browse/HIVE-22546
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


In the latest postgresql schema 
(standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql)
 the following tables have lowercase table and column names:
{code:java}
aux_table 
compaction_queue 
completed_compactions 
completed_txn_components 
hive_locks 
materialization_rebuild_locks 
min_history_level 
next_compaction_queue_id 
next_lock_id 
next_txn_id 
next_write_id 
repl_txn_map 
runtime_stats 
txn_components 
txn_to_write_id 
txns 
write_set{code}
As these tables are referenced from the Hive sys database, the queries to these 
tables will fail with a "Table not found" error.

The problem is that the table and column names are not enclosed in quotes, so 
postgres will turn these identifiers into lowercase.
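
To illustrate the case-folding behavior described above, here is a small sketch. `PgIdentifiers` is a hypothetical helper, not Hive code: `foldUnquoted` mimics how Postgres treats an unquoted identifier, and `quote` produces the double-quoted form that preserves case.

```java
import java.util.Locale;

// Hypothetical helper illustrating Postgres identifier handling.
final class PgIdentifiers {
  // Unquoted identifiers are folded to lower case by Postgres.
  static String foldUnquoted(String identifier) {
    return identifier.toLowerCase(Locale.ROOT);
  }

  // Double-quoted identifiers keep their case; embedded quotes are doubled.
  static String quote(String identifier) {
    return "\"" + identifier.replace("\"", "\"\"") + "\"";
  }
}
```

For example, an unquoted `TXNS` in the schema script ends up as `txns`, so a later lookup of the quoted identifier `"TXNS"` does not find it.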

 





[jira] [Created] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels

2018-07-29 Thread Zoltan Chovan (JIRA)
Zoltan Chovan created HIVE-20267:


 Summary: Expanding WebUI to include form to dynamically config log 
levels 
 Key: HIVE-20267
 URL: https://issues.apache.org/jira/browse/HIVE-20267
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan


To make it easier to change log levels at runtime, the WebUI can be extended 
to interact with the Log4j2ConfiguratorServlet. Users and admins could then 
change log levels directly from the WebUI instead of executing curl commands 
from the command line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-17334) Problem in interpreting structure columns in view.

2017-08-16 Thread Zoltan Chovan (JIRA)
Zoltan Chovan created HIVE-17334:


 Summary: Problem in interpreting structure columns in view.
 Key: HIVE-17334
 URL: https://issues.apache.org/jira/browse/HIVE-17334
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Priority: Minor


Reproduction steps, to set up the env:


{code:java}
drop view if exists test_db.view_a;
drop view if exists test_db.view_b;
drop table if exists test_db.table_a;

create database if not exists test_db;

create external table if not exists test_db.table_a (
`id` bigint,
`users` array>
)
partitioned by (p_date int)
stored as parquet
location '/user/hive/';
{code}

With this, the following scenario will fail:


{code:java}
select
  t.p_date,
  t.id,
  p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over
(partition by t.id) as total_actions
from test_db.table_a as t
lateral view explode(t.users) users_two as p
group by t.p_date, t.id, p.user_id;

create view if not exists test_db.view_a as 
select 
  t.p_date,
  t.id,
  p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over
(partition by t.id) as total_actions
from test_db.table_a as t
lateral view explode(t.users) users_two as p
group by t.p_date, t.id, p.user_id;


select *
from test_db.view_a
where p_date = 20170711;
{code}


The following scenario will succeed:

{code:java}
create view if not exists test_db.view_b as
select
  base.*,
  sum(base.counter) over
(partition by base.id) as total_actions
from (
  select 
t.p_date,
t.id,
p.user_id,
sum(p.counter) as counter
  from test_db.table_a as t
  lateral view explode(t.users) users_two as p
  group by t.p_date, t.id, p.user_id
) as base;

select *
from test_db.view_b
where p_date = 20170711;
{code}


As you can see, the only difference is the addition of "sum(sum(p.counter)) 
over (partition by t.id) as total_actions".

If the view is created as follows then the query that was breaking works.


{code:java}
create view if not exists test_db.view_c as 
select 
  dt.p_date,
  dt.id,
  users_two.p.user_id,
  sum(p.counter) as counter,
  sum(sum(p.counter)) over
(partition by dt.id) as total_actions
from test_db.table_a as dt
lateral view explode(dt.users) users_two as p
group by dt.p_date, dt.id, users_two.p.user_id;

select *
from test_db.view_c
where p_date = 20170711;

{code}

The workaround is to use the table alias along with the column name, e.g. 
users_two.p.user_id. Please advise if this is a bug and if it could be fixed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-17310) Regex column referencing in aggregate functions/group by

2017-08-14 Thread Zoltan Chovan (JIRA)
Zoltan Chovan created HIVE-17310:


 Summary: Regex column referencing in aggregate functions/group by
 Key: HIVE-17310
 URL: https://issues.apache.org/jira/browse/HIVE-17310
 Project: Hive
  Issue Type: New Feature
Reporter: Zoltan Chovan
Priority: Minor


The following works as expected:

{code:java}
set hive.support.quoted.identifiers=none;
SELECT `(id|created_at)?+.+`
FROM test_table
WHERE date between "2017-07-01" and "2017-08-01";
{code}

However, the query fails when count/group by is added, as follows:


{code:java}
set hive.support.quoted.identifiers=none;

SELECT count(*), `(id|created_at)?+.+`
FROM test_table
WHERE date between "2017-07-01" and "2017-08-01"
GROUP BY `(id|created_at)?+.+` ;
{code}


Currently this fails with an error. Would it be feasible to implement this 
feature?





[jira] [Created] (HIVE-17058) Separate configuration parameter for setting staging directory on S3

2017-07-07 Thread Zoltan Chovan (JIRA)
Zoltan Chovan created HIVE-17058:


 Summary: Separate configuration parameter for setting staging 
directory on S3 
 Key: HIVE-17058
 URL: https://issues.apache.org/jira/browse/HIVE-17058
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Chovan
Priority: Minor


Currently there is one parameter for setting the staging directory: 
hive.exec.stagingdir, which, when set, is used on both HDFS and S3. A current 
workaround is to manually set the staging directory before inserting into an 
S3 table.
The requested feature is to introduce a new parameter that would set only the 
location of the staging directory on S3, thus separating it from the HDFS 
staging directory setting.





[jira] [Created] (HIVE-15792) Hive should raise SemanticException when LPAD/RPAD pad character's length is 0

2017-02-02 Thread Zoltan Chovan (JIRA)
Zoltan Chovan created HIVE-15792:


 Summary: Hive should raise SemanticException when LPAD/RPAD pad 
character's length is 0
 Key: HIVE-15792
 URL: https://issues.apache.org/jira/browse/HIVE-15792
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Chovan
Priority: Minor


For example SELECT LPAD('A', 2, ''); will cause an infinite loop and the 
running query will hang without any error.

It would be great if this could be prevented by checking the pad character's 
length and if it's 0 then throw a SemanticException.
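
A minimal sketch of the proposed guard, assuming a standalone padding function. `PadSketch.lpad` is a hypothetical illustration rather than Hive's UDF, and it throws `IllegalArgumentException` as a stand-in for the SemanticException requested above.

```java
// Hypothetical left-pad with an up-front check for an empty pad string,
// which would otherwise make the padding loop below run forever.
final class PadSketch {
  static String lpad(String s, int len, String pad) {
    if (pad.isEmpty()) {
      throw new IllegalArgumentException("pad string must not be empty");
    }
    if (len <= s.length()) {
      return s.substring(0, Math.max(len, 0)); // truncate when already long enough
    }
    int missing = len - s.length();
    StringBuilder sb = new StringBuilder(len);
    while (sb.length() < missing) {
      sb.append(pad); // always makes progress because pad is non-empty
    }
    sb.setLength(missing); // trim overshoot from a multi-character pad
    return sb.append(s).toString();
  }
}
```

Rejecting the empty pad before the loop is what prevents the hang: with a zero-length pad the loop condition can never become false.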




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)