[jira] [Created] (HIVE-26429) Set default value of hive.txn.xlock.ctas to true and update lineage info for CTAS queries.

2022-07-26 Thread Simhadri G (Jira)
Simhadri G created HIVE-26429:
-

 Summary: Set default value of hive.txn.xlock.ctas to true and 
update lineage info for CTAS queries.
 Key: HIVE-26429
 URL: https://issues.apache.org/jira/browse/HIVE-26429
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26424) When decimal type has overflowed the specified precision it must throw an error/warning instead of succeeding with NULL entries

2022-07-22 Thread Simhadri G (Jira)
Simhadri G created HIVE-26424:
-

 Summary: When decimal type has overflowed the specified precision 
it must throw an error/warning instead of succeeding with NULL entries
 Key: HIVE-26424
 URL: https://issues.apache.org/jira/browse/HIVE-26424
 Project: Hive
  Issue Type: Bug
Reporter: Simhadri G


When a value overflows the precision declared for a decimal type, the cast 
silently returns NULL, as seen below:
{code:java}
0: jdbc:hive2://localhost:10001/> select cast(48932.19 AS DECIMAL(6,6));
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.178 seconds){code}
 

This can be a significant issue when inserting a large amount of data from one 
table to another, because entire columns can end up containing only NULL 
entries, as seen below:

 
{code:java}


0: jdbc:hive2://localhost:10001/> select * from t2;
+-----------+
|  t2.num   |
+-----------+
| 28367.81  |
| 49632.19  |
| NULL      |
| 28367.81  |
| 49632.19  |
| NULL      |
+-----------+
6 rows selected (0.202 seconds)

0: jdbc:hive2://localhost:10001/> create table t3(num decimal(20,10));

0: jdbc:hive2://localhost:10001/> insert into t3 select cast(t2.num as decimal(5,2)) from t2;
12 rows affected (40.97 seconds)

0: jdbc:hive2://localhost:10001/> select * from t3;
+---------+
| t3.num  |
+---------+
| NULL    |
| NULL    |
| NULL    |
| NULL    |
| NULL    |
| NULL    |
+---------+
6 rows selected (0.205 seconds){code}
I think it would be better to throw an error like the one below instead of 
succeeding, similar to MySQL.
{code:java}
ERROR : Out of range value for column 'cast(num as decimal(5,2))' {code}
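
The behaviour requested above can be sketched outside Hive. The Python snippet below is only an illustration under stated assumptions: `cast_decimal` and its `strict` flag are hypothetical names, not Hive APIs, and half-up rounding is assumed. It relies on the usual SQL rule that DECIMAL(p,s) leaves p - s digits for the integer part, which is why 48932.19 cannot fit in DECIMAL(6,6), and it contrasts the silent-NULL result with raising an error.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_decimal(value, precision, scale, strict=False):
    """Cast value to DECIMAL(precision, scale).

    Hypothetical helper, not a Hive API. With strict=False an overflow
    yields None (mirroring the NULL results); with strict=True it raises.
    """
    if value is None:
        return None
    # Round to the target scale first (half-up rounding is an assumption).
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal("0.01")
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    # DECIMAL(p, s) leaves p - s digits for the integer part.
    limit = Decimal(10) ** (precision - scale)
    if abs(rounded) >= limit:
        if strict:
            raise OverflowError(
                f"Out of range value {value} for DECIMAL({precision},{scale})"
            )
        return None
    return rounded
```

With `strict=False`, `cast_decimal("48932.19", 6, 6)` and `cast_decimal("28367.81", 5, 2)` both return `None`, matching the NULL entries in the report; with `strict=True` the same calls raise `OverflowError`.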
 





[jira] [Created] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-20 Thread Simhadri G (Jira)
Simhadri G created HIVE-26244:
-

 Summary: Implementing locking for concurrent ctas
 Key: HIVE-26244
 URL: https://issues.apache.org/jira/browse/HIVE-26244
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G








[jira] [Created] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database

2022-05-09 Thread Simhadri G (Jira)
Simhadri G created HIVE-26215:
-

 Summary: Expose the MIN_HISTORY_LEVEL table through Hive sys 
database
 Key: HIVE-26215
 URL: https://issues.apache.org/jira/browse/HIVE-26215
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G


While we still (partially) use MIN_HISTORY_LEVEL for the Cleaner, we should 
expose it as a sys table so we can see what might be blocking the Cleaner 
thread.

 





[jira] [Created] (HIVE-26009) Determine number of buckets for implicitly bucketed ACIDv2 tables

2022-03-07 Thread Simhadri G (Jira)
Simhadri G created HIVE-26009:
-

 Summary: Determine number of buckets for implicitly bucketed 
ACIDv2 tables 
 Key: HIVE-26009
 URL: https://issues.apache.org/jira/browse/HIVE-26009
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G


Hive tries to set the number of reducers equal to the number of buckets here: 
[https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6958]
 

 

The numberOfBuckets for implicitly bucketed tables is set to -1 by default. 
When this is the case, it is left to Hive to estimate the number of reducers 
required for the job, based on the job input and configuration parameters.

[https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3369]

 

This estimate is not optimal in all cases. In the worst case, it can result in 
a single reducer being launched, which can lead to a significant performance 
bottleneck.

 

Ideally, the number of reducers launched should equal the number of buckets, 
which is the case for explicitly bucketed tables.
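
To illustrate the problem described above, here is a minimal Python sketch. It is not Hive's code: the function names, parameters and default values are assumptions chosen for illustration, and the size-based formula is a simplification of the estimation linked above. It shows how a small input can collapse to one reducer when the bucket count is -1, while an explicit bucket count is used directly.

```python
MB = 1024 * 1024


def estimate_reducers(total_input_bytes, bytes_per_reducer, max_reducers):
    """Size-based estimate: one reducer per bytes_per_reducer of input,
    clamped to the range [1, max_reducers]. Simplified, illustrative only."""
    wanted = -(-total_input_bytes // bytes_per_reducer)  # integer ceiling
    return max(1, min(max_reducers, wanted))


def choose_reducers(num_buckets, total_input_bytes,
                    bytes_per_reducer=256 * MB, max_reducers=1009):
    """Use the bucket count when it is known, otherwise fall back to the
    size-based estimate. The default values here are illustrative."""
    if num_buckets > 0:
        return num_buckets
    return estimate_reducers(total_input_bytes, bytes_per_reducer, max_reducers)
```

Under these assumptions, `choose_reducers(-1, 100 * MB)` yields a single reducer, whereas `choose_reducers(8, 100 * MB)` yields eight, one per bucket.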

 

 





[jira] [Created] (HIVE-25471) Clear entries in Privilege table - sys.tbl_col_privs when privilege synchroniser is disabled to avoid stale permission.

2021-08-20 Thread Simhadri G (Jira)
Simhadri G created HIVE-25471:
-

 Summary: Clear entries in Privilege table - sys.tbl_col_privs  
when privilege synchroniser is disabled to avoid stale permission.
 Key: HIVE-25471
 URL: https://issues.apache.org/jira/browse/HIVE-25471
 Project: Hive
  Issue Type: Task
Reporter: Simhadri G
Assignee: Simhadri G








[jira] [Created] (HIVE-24497) Node heartbeats from LLAP Daemon to the client are not matching leading to timeout.

2020-12-07 Thread Simhadri G (Jira)
Simhadri G created HIVE-24497:
-

 Summary: Node heartbeats from LLAP Daemon to the client are not 
matching leading to timeout.
 Key: HIVE-24497
 URL: https://issues.apache.org/jira/browse/HIVE-24497
 Project: Hive
  Issue Type: Sub-task
Reporter: Simhadri G
Assignee: Simhadri G


A node heartbeat contains information about all the tasks that were submitted 
to that LLAP Daemon. In cloud deployments, the client is not able to match 
these heartbeats because of differences in hostname and port.





[jira] [Created] (HIVE-23361) Optimising privilege synchroniser

2020-05-04 Thread Simhadri G (Jira)
Simhadri G created HIVE-23361:
-

 Summary: Optimising privilege synchroniser
 Key: HIVE-23361
 URL: https://issues.apache.org/jira/browse/HIVE-23361
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Simhadri G


The privilege synchronizer pulls the list of databases, tables and columns from 
the Hive Metastore. For each of these objects it fetches the privilege 
information and invokes the HMS API to refresh the privilege information in 
HMS. This patch stores the privilege information as a bit string. This is done 
to reduce the size of the tbl_col_privs table in the metastore.
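
As a rough illustration of the bit-string idea, the Python sketch below packs a set of granted privileges into one string. The privilege list, its order and the `encode_privileges` name are assumptions made for this example; the encoding used by the actual patch is not described here.

```python
# Assumed, illustrative privilege order (not taken from the patch).
PRIVILEGES = ("SELECT", "INSERT", "UPDATE", "DELETE")


def encode_privileges(granted):
    """Encode a collection of granted privilege names as a bit string,
    one character per entry in PRIVILEGES ('1' = granted, '0' = not)."""
    granted = set(granted)
    unknown = granted - set(PRIVILEGES)
    if unknown:
        raise ValueError(f"Unknown privileges: {sorted(unknown)}")
    return "".join("1" if name in granted else "0" for name in PRIVILEGES)
```

With this assumed order, a principal holding SELECT and UPDATE would be stored as the single value `1010`.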





[jira] [Created] (HIVE-23301) Optimising privilege synchroniser: UDF for updating privileges

2020-04-26 Thread Simhadri G (Jira)
Simhadri G created HIVE-23301:
-

 Summary: Optimising privilege synchroniser: UDF for updating 
privileges
 Key: HIVE-23301
 URL: https://issues.apache.org/jira/browse/HIVE-23301
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, UDF
Affects Versions: 3.1.1
Reporter: Simhadri G
 Attachments: UDFSplitMapPrivs.patch

The privilege synchronizer pulls the list of databases, tables and columns from 
the Hive Metastore. For each of these objects it fetches the privilege 
information and invokes the HMS API to refresh the privilege information in 
HMS. The current UDF maps a bit string to privileges, based on whether each 
privilege is granted or not.
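
The mapping performed by such a UDF might look like the following Python sketch. It is not the attached patch: the privilege order and the `split_map_privs` name are hypothetical, and the sketch only assumes that each position in the bit string corresponds to one privilege.

```python
# Assumed, illustrative privilege order (not taken from the patch).
PRIVILEGES = ("SELECT", "INSERT", "UPDATE", "DELETE")


def split_map_privs(bits):
    """Map a bit string to {privilege name: granted?}, one bit per
    entry in PRIVILEGES. Rejects strings of the wrong length or alphabet."""
    if len(bits) != len(PRIVILEGES) or set(bits) - {"0", "1"}:
        raise ValueError(f"Expected {len(PRIVILEGES)} characters of 0/1, got {bits!r}")
    return {name: bit == "1" for name, bit in zip(PRIVILEGES, bits)}
```

For example, `split_map_privs("1010")` marks SELECT and UPDATE as granted and INSERT and DELETE as not granted.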


