[
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009801#comment-17009801
]
Peter Vary commented on HIVE-22654:
-----------------------------------
Thanks [~rajesh.balamohan]! Fair point about the complicated query.
Not sure if you already have a reviewer or not, here are my questions:
* We have a "global" locksForPartitions for every batch. Would it make sense to
calculate this for every batch instead? If I did not miss something we still
might end up with too much element in the in clause if the result of
getLocksBeingHeldForPartitions(locksBeingChecked) is too high. The specific
place where the in clause could grow is this:
{code}
if (!locksForPartitions.isEmpty()) {
query.append(" and (hl_partition is null or hl_partition in(");
first = true;
for (String s : locksForPartitions) {
if (first) {
first = false;
} else {
query.append(", ");
}
query.append('\'');
query.append(s);
query.append('\'');
}
query.append("))");
}
{code}
* Is the checkLock(Connection dbConn, long extLockId, Set<String>
locksForPartitions, Statement stmt, List<LockInfo> locksBeingChecked) really
checks the locks? My understanding is that the main goal of this method is to
collect the related locks, and by the way really checks the concurrent writes
too... My feeling is that the name is a little misleading.
* As part of the suggestion above, I would move out the writeset check from the
current private checkLock method. It does not have to be batched, and we might
be better issuing it in one select instead of issuing it in every batch. (also
if we do it before collecting the lock data we might save some execution time
by failing fast)
* Handling/closing the Statement and the ResultSet objects seems a little dodgy
for me. {stmt} is input for the private checkLock method, but closed by us if
there is a write, but not recreated without closure in other cases. We might
want to review this part of the code as well.
* The calculation of {{sawNull}} was "global" for every batch. Now with
batching this could be problematic. We might have to calculate the value before
starting the batching, and use this as an input parameter.
These are the potential issues I have been able to identify, maybe having a
pull request or review board would help us to have better communication
channels.
Thanks for taking this up!
Peter
> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000
> --------------------------------------------------------------
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
> Issue Type: Bug
> Reporter: Gopal Vijayaraghavan
> Assignee: Rajesh Balamohan
> Priority: Major
> Attachments: HIVE-22654.1.patch, HIVE-22654.2.patch
>
>
> The following loop can end up with too many entries within the IN clause
> throwing.
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
> // If any of the partition requests are null, then I need to pull all
> // partition locks for this table.
> sawNull = false;
> strings.clear();
> for (LockInfo info : locksBeingChecked) {
> if (info.partition == null) {
> sawNull = true;
> break;
> } else {
> strings.add(info.partition);
> }
> }
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]:
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) -
> MetaException(message:Unable to update transaction database
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in
> a list is 1000
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)