[
https://issues.apache.org/jira/browse/HIVE-27217?focusedWorklogId=854901&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-854901
]
ASF GitHub Bot logged work on HIVE-27217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Apr/23 21:25
Start Date: 04/Apr/23 21:25
Worklog Time Spent: 10m
Work Description: jfsii commented on PR #4197:
URL: https://github.com/apache/hive/pull/4197#issuecomment-1496625307
Re: @TuroczyX
When it is re-thrown, it bubbles up to the user as an exception during
execution. It is treated like any other Thrift exception, such as when
the HMS server goes down or the call times out.
Swallowing the exception here is not a good choice, because a missing write
notification can affect systems such as event notification and
replication. It is much preferable to fail visibly so that a user can
take action (checking HMS logs, making sure HMS connectivity is good,
etc.) rather than having to figure out why the notification log is missing
entries long after the fact. I feel like this is an oversight made when
implementing the fallback mechanism rather than a purposeful choice to ignore
these exceptions.
I found this because the Impala catalogd implementation does not
implement add_write_notification_log_in_batch, which causes it to throw a
different TApplicationException, and thus the
add_write_notification_log_in_batch call fails silently. Since catalogd also
relies on the notification log/events, depending on the timing of events, the
metadata cache would end up containing incorrect metadata because it didn't
realize a partition was updated. This could be fixed on the catalogd side
by throwing an UNKNOWN_METHOD or WRONG_METHOD_NAME exception, but it is
still entirely possible for a TApplicationException of a different type to
be thrown. I've seen "UNKNOWN" in the past, but there are also
INVALID_MESSAGE_TYPE/BAD_SEQUENCE_ID/MISSING_RESULT, which could very well show
up depending on network conditions and/or issues during processing on the
HMS side.
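To make the intended behavior concrete, here is a minimal sketch of the
pattern, not the actual Hive.java code. AppException is a stand-in for
org.apache.thrift.TApplicationException (the type codes mirror Thrift's), and
NotificationClient, addInBatch, addOne and addWriteNotifications are
hypothetical names used only for illustration. The point is that only
UNKNOWN_METHOD and WRONG_METHOD_NAME trigger the per-event fallback, and every
other type is rethrown.

```java
import java.util.List;

public class WriteNotificationFallbackSketch {

    /** Stand-in for Thrift's TApplicationException; not the real class. */
    static class AppException extends Exception {
        static final int UNKNOWN = 0;
        static final int UNKNOWN_METHOD = 1;
        static final int INVALID_MESSAGE_TYPE = 2;
        static final int WRONG_METHOD_NAME = 3;
        static final int BAD_SEQUENCE_ID = 4;
        static final int MISSING_RESULT = 5;

        private final int type;

        AppException(int type, String message) {
            super(message);
            this.type = type;
        }

        int getType() {
            return type;
        }
    }

    /** Hypothetical, simplified view of the metastore client calls involved. */
    interface NotificationClient {
        void addInBatch(List<String> events) throws AppException;

        void addOne(String event) throws AppException;
    }

    /**
     * Tries the batch call first. Falls back to one call per event only when
     * the server reports that it does not know the batch method; any other
     * failure is rethrown so that the caller sees it.
     */
    static void addWriteNotifications(NotificationClient client, List<String> events)
            throws AppException {
        try {
            client.addInBatch(events);
        } catch (AppException e) {
            int type = e.getType();
            if (type == AppException.UNKNOWN_METHOD
                    || type == AppException.WRONG_METHOD_NAME) {
                // Older server without the batch API: write events one by one.
                for (String event : events) {
                    client.addOne(event);
                }
            } else {
                // Do not swallow: a missing notification must be visible.
                throw e;
            }
        }
    }
}
```

With this shape, a server that answers with UNKNOWN, MISSING_RESULT or any
other unexpected type fails the operation instead of leaving a silent gap in
the notification log.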
Issue Time Tracking
-------------------
Worklog Id: (was: 854901)
Time Spent: 40m (was: 0.5h)
> addWriteNotificationLogInBatch can silently fail
> ------------------------------------------------
>
> Key: HIVE-27217
> URL: https://issues.apache.org/jira/browse/HIVE-27217
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: John Sherman
> Assignee: John Sherman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Debugging an issue, I noticed that addWriteNotificationLogInBatch in
> Hive.java can fail silently if the TApplicationException thrown is not
> TApplicationException.UNKNOWN_METHOD or
> TApplicationException.WRONG_METHOD_NAME.
> https://github.com/apache/hive/blob/40a7d689e51d02fa9b324553fd1810d0ad043080/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3359-L3381
> Failures to write to the notification log can be very difficult to debug; we
> should rethrow the exception so that the failure is very visible.