[
https://issues.apache.org/jira/browse/IMPALA-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18015677#comment-18015677
]
ASF subversion and git services commented on IMPALA-12829:
----------------------------------------------------------
Commit 50926b5d8e941c5cc10fd77d0b4556e3441c41e7 in impala's branch
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=50926b5d8 ]
IMPALA-12829: Skip processing transaction events if the table is
HMS sync disabled.
For transactional tables, the event processor does not skip abort_txn
and commit_txn events when the database/table is HMS sync disabled.
This processing is unnecessary; skipping abort_txn and commit_txn
events when the corresponding database or transactional table is HMS
sync disabled reduces event processor lag. The database name and table
name are present in the Alloc_write_id_event, so skipping that event
when HMS sync is disabled is already implemented. Since the database
name and table name are not present in the abort_txn and commit_txn
events, we need to check whether HMS sync is disabled in the HMS table
property when the table object is extracted in
CatalogServiceCatalog#addWriteIdsToTable().
Also fixed the partitions and table refreshed metrics for the CommitTxn
event.
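The skip decision described above can be sketched roughly as follows.
This is an illustration only, not the code from the commit: the class
TxnEventSkipSketch, its method name, and the plain-map representation of
database and table parameters are hypothetical. The property key
impala.disableHmsSync is the one named in the issue description; letting
a table-level value override the database-level value is an assumption
made for this sketch.

```java
import java.util.Map;

// Hypothetical helper, not part of Impala: decides whether transactional
// events for a table can be skipped because HMS sync is disabled.
final class TxnEventSkipSketch {
  // Property key named in the issue description.
  static final String DISABLE_HMS_SYNC_KEY = "impala.disableHmsSync";

  private TxnEventSkipSketch() {}

  /**
   * Returns true if HMS sync is disabled for the table.
   * Assumption: a table-level value, when present, overrides the
   * database-level value; a missing value means sync is enabled.
   */
  static boolean isHmsSyncDisabled(
      Map<String, String> dbParams, Map<String, String> tblParams) {
    String tblValue = tblParams.get(DISABLE_HMS_SYNC_KEY);
    if (tblValue != null) {
      return Boolean.parseBoolean(tblValue.trim());
    }
    String dbValue = dbParams.get(DISABLE_HMS_SYNC_KEY);
    return dbValue != null && Boolean.parseBoolean(dbValue.trim());
  }
}
```

Because abort_txn and commit_txn events carry no database or table
name, a check like this can only run once the affected table has been
resolved, which is why the commit places it where the table object is
extracted.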
Additional Issues discovered during testing:
1) CatalogServiceCatalog#reloadTableIfExists() didn't verify whether the
current eventId is older than the table's lastSyncEventId, which led to
unnecessary reloading of the table for commit txns.
2) Insert queries from Impala didn't update the validWriteIdList for
transactional tables in the cache, so the CommitTxn events generated by
those inserts triggered another reload of unpartitioned transactional
tables when the events were consumed. Fixed it by updating the
validWriteIdList in the cache.
3) CommitTxn events generated after AlterTable events led to incorrect
results if the file metadata reload was skipped for the AlterTable
event. The AlterTable event updates the writeId from the metastore but
doesn't reload file metadata, which yields incorrect results. This is
fixed in the HdfsTable class by not skipping the file metadata reload
if the writeId has changed.
4) Increased timeouts in the TestEventProcessingWithImpala test class
to avoid flakiness for transactional events in the event processor
caused by the catalogd_load_metadata_delay config.
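Items 1 and 3 above both come down to a small guard condition. The
sketch below illustrates them under stated assumptions; it is not the
code from the commit. The class ReloadDecisionSketch and its method and
parameter names are hypothetical, and treating an event whose id equals
the last synced event id as already applied is an assumption (the
commit message only says "older than").

```java
// Hypothetical helper, not part of Impala: guard conditions for
// reloading a table while consuming metastore events.
final class ReloadDecisionSketch {
  private ReloadDecisionSketch() {}

  /**
   * Item 1: an event at or before the table's last synced event id has
   * already been applied, so reloading for it would be redundant.
   * Assumption: equality counts as already applied.
   */
  static boolean isEventStale(long eventId, long lastSyncedEventId) {
    return eventId <= lastSyncedEventId;
  }

  /**
   * Item 3: even when the caller asks to skip the file metadata reload,
   * the reload must still happen if the writeId changed, otherwise the
   * cached files no longer match the new writeId.
   */
  static boolean mustReloadFileMetadata(
      boolean skipRequested, long cachedWriteId, long metastoreWriteId) {
    return !skipRequested || cachedWriteId != metastoreWriteId;
  }
}
```

For example, with a last synced event id of 10, an event with id 5 is
treated as stale and an event with id 11 is not; a skip request is
honored only while the cached and metastore writeIds match.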
Testing:
- Added end-to-end tests to verify transaction events are skipped.
Change-Id: I5d0ecb3b756755bc04c66a538a9ae6b88011a019
Reviewed-on: http://gerrit.cloudera.org:8080/21175
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Skip processing abort_txn and commit_txn events if the table is HMS sync
> disabled
> ---------------------------------------------------------------------------------
>
> Key: IMPALA-12829
> URL: https://issues.apache.org/jira/browse/IMPALA-12829
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Sai Hemanth Gantasala
> Assignee: Sai Hemanth Gantasala
> Priority: Critical
> Labels: catalog-2024
>
> For transactional tables, the event processor does not skip abort_txn and
> commit_txn events if the table has HMS sync disabled, i.e.
> {{impala.disableHmsSync}} is set to true in the HMS table properties.
> This processing is unnecessary; skipping abort_txn and commit_txn events
> when the corresponding transactional tables are HMS sync disabled reduces
> event processor lag.