[ 
https://issues.apache.org/jira/browse/IMPALA-11501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803555#comment-17803555
 ] 

ASF subversion and git services commented on IMPALA-11501:
----------------------------------------------------------

Commit 5f434e84678e4401c26172a7121c8e3c70ab664f in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5f434e846 ]

IMPALA-12670: getIfPresent should throw the cause of error

CatalogdMetaProvider maintains a map (a Guava cache) as its local
catalog cache. It has a piggyback mechanism for loading metadata from
catalogd: when concurrent threads want to load the same content
(identified by the same key), only one of them actually sends the
request and loads the result into the cache. The other threads wait
and get the result when the work is done.

The piggyback mechanism is implemented by putting a Future object as the
value when the key doesn't exist in the cache. The Future object handles
the loading. Other threads that want the same value just invoke
Future.get() to wait. See more in the comments in loadWithCaching().
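
As a rough illustration of the piggyback idea, here is a minimal sketch.
It is not the CatalogdMetaProvider code: the class PiggybackCache, its
load() method, and the use of ConcurrentHashMap and CompletableFuture
(instead of the Guava cache) are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;

/** Illustrative only: a hypothetical class, not the CatalogdMetaProvider code. */
public class PiggybackCache<K, V> {
  // The cached value is a future, so concurrent callers can share one load.
  private final ConcurrentHashMap<K, CompletableFuture<V>> cache =
      new ConcurrentHashMap<>();

  public V load(K key, Callable<V> loader)
      throws InterruptedException, ExecutionException {
    CompletableFuture<V> mine = new CompletableFuture<>();
    CompletableFuture<V> existing = cache.putIfAbsent(key, mine);
    if (existing != null) {
      // Another thread is loading (or has loaded) this key: piggyback on it.
      return existing.get();
    }
    try {
      // This thread won the race, so it performs the only real fetch.
      mine.complete(loader.call());
    } catch (Throwable t) {
      // Publish the failure to waiters and evict so a later call can retry.
      mine.completeExceptionally(t);
      cache.remove(key, mine);
    }
    // Throws ExecutionException (wrapping the cause) if the load failed.
    return mine.get();
  }
}
```

In this sketch a failed load is evicted so that a later call can retry,
while threads already waiting on the future still observe the failure.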

If there are any errors thrown in the loading process, Future.get() will
encapsulate the error into an ExecutionException and throw it instead.
The cause could be an InconsistentMetadataFetchException which indicates
FE should retry the planning. It's handled in Frontend#getTExecRequest().

In loadWithCaching(), we throw the cause of the exception thrown by
Future.get(), so the InconsistentMetadataFetchException can be handled
as expected. However, the error handling in getIfPresent() is
inconsistent with this: it throws the caught exception rather than its
cause, so retriable failures are not retried. Note that this is an
existing bug, but it became easier to hit after IMPALA-11501 because
getIfPresent() is now used in LocalDb#getTableIfCached(), which is used
in many places.
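
The difference between throwing the wrapper and throwing its cause can
be sketched as follows. Everything here is hypothetical:
RetriableFetchException stands in for a retriable error such as
InconsistentMetadataFetchException, and getUnwrapped() and
planWithRetry() are names invented for the example, not Impala methods.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Illustrative only: hypothetical names, not the Impala frontend code. */
public class UnwrapCause {
  /** Hypothetical stand-in for a retriable metadata-fetch error. */
  public static class RetriableFetchException extends RuntimeException {
    public RetriableFetchException(String msg) { super(msg); }
  }

  /** Waits on the future and rethrows the original failure, not the wrapper. */
  public static <V> V getUnwrapped(Future<V> f) {
    try {
      return f.get();
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      if (cause instanceof RuntimeException) throw (RuntimeException) cause;
      if (cause instanceof Error) throw (Error) cause;
      // Checked causes are wrapped so this method needs no throws clause.
      throw new RuntimeException(cause);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
  }

  /** A caller that retries only when it sees the retriable type directly. */
  public static <V> V planWithRetry(Supplier<Future<V>> attempt, int maxAttempts) {
    for (int i = 1; ; i++) {
      try {
        return getUnwrapped(attempt.get());
      } catch (RetriableFetchException e) {
        if (i >= maxAttempts) throw e;  // give up after the last attempt
      }
    }
  }
}
```

If getUnwrapped() rethrew the ExecutionException instead, the catch
clause in planWithRetry() would never match and no retry would happen,
which mirrors the failure mode described above.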

This patch fixes getIfPresent() to use the Future object with the same
logic (including error handling) as loadWithCaching(). It also adds
more logging on both the catalogd and impalad sides when the lookup
status is abnormal.

In order to test the loading error more easily, this patch adds a hidden
flag, inject_failure_ratio_in_catalog_fetch, to randomly inject
retriable errors.
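
A ratio-based failure injector could look roughly like the sketch
below. It assumes the ratio is a probability between 0 and 1, which
this message does not state, and the names FetchFailureInjector,
InjectedFetchException and maybeFail() are made up for illustration;
they are not how the flag is implemented in Impala.

```java
import java.util.Random;

/** Illustrative only: hypothetical names, not the Impala implementation. */
public class FetchFailureInjector {
  /** Hypothetical retriable error raised by the injector. */
  public static class InjectedFetchException extends RuntimeException {
    public InjectedFetchException(String msg) { super(msg); }
  }

  private final double failureRatio;
  private final Random random;

  /** failureRatio is assumed to be a probability in [0.0, 1.0]. */
  public FetchFailureInjector(double failureRatio, Random random) {
    if (failureRatio < 0.0 || failureRatio > 1.0) {
      throw new IllegalArgumentException("ratio must be in [0, 1]: " + failureRatio);
    }
    this.failureRatio = failureRatio;
    this.random = random;
  }

  /** Call before a fetch; throws with probability failureRatio. */
  public void maybeFail(String what) {
    // nextDouble() is uniform in [0.0, 1.0), so a ratio of 0.0 never
    // fails and a ratio of 1.0 always fails.
    if (random.nextDouble() < failureRatio) {
      throw new InjectedFetchException("Injected failure while fetching " + what);
    }
  }
}
```

Passing the Random in keeps the injector reproducible in tests when a
fixed seed is used.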

Tests
 - Ran test_local_catalog_ddls_with_invalidate_metadata 700 times.
 - Added an e2e test that fails easily without this fix.

Change-Id: I74268ba2bb700988107780e13ffbdbb4c767d09d
Reviewed-on: http://gerrit.cloudera.org:8080/20853
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Add flag to allow metadata-cache operations on masked tables
> ------------------------------------------------------------
>
>                 Key: IMPALA-11501
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11501
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Security
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 4.4.0
>
>
> "REFRESH <table>" and "INVALIDATE METADATA <table>" are the table level 
> metadata-cache operations that only used in Impala (not Hive, SparkSQL or 
> else).
> In Hive-Ranger plugin, when a table is masked (either by column-masking or 
> row-filtering policy) for a user, the user can't perform any modification 
> (insert/delete/update) on the table (RANGER-1087, RANGER-1100). However, Hive 
> doesn't have those metadata-cache operations. It's a grey area whether we 
> should block them or not.
> Currently, Impala blocks metadata-cache operations as well (IMPALA-10554, 
> IMPALA-11281). However, it's possible that, before upgrade, some 
> data-consumer jobs already have REFRESH in them. It'd be better to have a 
> flag to allow such operations for smooth upgrade process.
> The flag can be something like "allow_refresh_by_masked_users".
> CC [~fangyurao], [~csringhofer]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
