[ 
https://issues.apache.org/jira/browse/IMPALA-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134595#comment-17134595
 ] 

ASF subversion and git services commented on IMPALA-9791:
---------------------------------------------------------

Commit 0cb44242d20532945e5fb09f5bbef6c65415a753 in impala's branch 
refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0cb4424 ]

IMPALA-9791: Support validWriteIdList in getPartialCatalogObject API

This change enhances the Catalog-v2 API getPartialCatalogObject to
support ValidWriteIdList as an optional field in the TableInfoSelector.
When such a field is provided by the clients, catalog compares the
provided ValidWriteIdList with the cached ValidWriteIdList of the
table. The catalog reloads the table if it determines that the cached
table is stale with respect to the ValidWriteIdList provided.
In case the table is already at or above the requested ValidWriteIdList,
catalog uses the cached table metadata to return only the file
descriptors that are consistent with the provided ValidWriteIdList.
Note that in case of compactions it is possible that the requested
ValidWriteIdList cannot be satisfied using the cached file-metadata
for some partitions. For such partitions, catalog re-fetches the
file-metadata from the FileSystem.

In order to implement the fall-back to getting the file-metadata from
the filesystem, the patch refactors some of the file-metadata loading
logic into ParallelFileMetadataLoader, which also helps simplify some
methods in HdfsTable.java. Additionally, it modifies the
WriteIdBasedPredicate to optionally do a strict check, which throws an
exception in some scenarios.
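
The filtering and fall-back described above can be illustrated with a
minimal sketch. It is not the code from this patch: CachedFile,
WriteIdFilter and filterCached are hypothetical names, and the
ValidWriteIdList is reduced to a single high watermark (open and
aborted write ids are ignored). A compacted file is modelled as one
entry covering a range of write ids; when that range straddles the
client's high watermark, the strict check throws and the caller falls
back to re-listing the partition from the filesystem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical cached file entry covering write ids [minWriteId, maxWriteId]. */
final class CachedFile {
    final String name;
    final long minWriteId;
    final long maxWriteId;

    CachedFile(String name, long minWriteId, long maxWriteId) {
        this.name = name;
        this.minWriteId = minWriteId;
        this.maxWriteId = maxWriteId;
    }
}

/** Simplified write-id based predicate with an optional strict mode. */
final class WriteIdFilter {
    private final long clientHighWatermark;
    private final boolean strict;

    WriteIdFilter(long clientHighWatermark, boolean strict) {
        this.clientHighWatermark = clientHighWatermark;
        this.strict = strict;
    }

    boolean accept(CachedFile file) {
        // Every write id in the file is visible to the client.
        if (file.maxWriteId <= clientHighWatermark) return true;
        // Every write id in the file is newer than the client's view.
        if (file.minWriteId > clientHighWatermark) return false;
        // The file mixes visible and invisible write ids (for example
        // after a compaction), so the cache cannot serve this client.
        if (strict) {
            throw new IllegalStateException(
                "Cached file " + file.name + " straddles write id "
                    + clientHighWatermark);
        }
        return false;
    }

    /**
     * Returns the cached files visible to the client, or an empty
     * Optional when the caller must re-fetch file metadata from the
     * filesystem for this partition.
     */
    static Optional<List<CachedFile>> filterCached(
            List<CachedFile> cached, long clientHighWatermark) {
        WriteIdFilter filter = new WriteIdFilter(clientHighWatermark, true);
        List<CachedFile> visible = new ArrayList<>();
        try {
            for (CachedFile file : cached) {
                if (filter.accept(file)) visible.add(file);
            }
        } catch (IllegalStateException e) {
            return Optional.empty();
        }
        return Optional.of(visible);
    }
}
```

In this sketch the non-strict mode silently drops a straddling file,
while the strict mode turns the same situation into an error that the
caller can react to.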

This is helpful to provide a snapshot view of the table metadata during
query compilation with respect to other changes happening to the table
concurrently. Note that this change does not implement the coordinator
side changes needed for catalog clients to use such a field. Those will
be taken up in a separate change to keep this patch smaller.

Testing:
1. Ran existing file-metadata loader tests.
2. Added a new test which exercises the various cases for
ValidWriteIdList comparison.
3. Ran core tests along with the dependent MetastoreClientPool
patch (IMPALA-9824).

Change-Id: Ied2c7c3cb2009c407e8fbc3af4722b0d34f57c4a
Reviewed-on: http://gerrit.cloudera.org:8080/16008
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Support validWriteIdList in getPartialCatalogObject
> ---------------------------------------------------
>
>                 Key: IMPALA-9791
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9791
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>              Labels: impala-acid
>
> When transactional tables are being queried, the coordinator (or any other 
> Catalog client) can optionally provide a ValidWriteIdList of the table. In 
> that case, catalog can return the metadata which is consistent with the given 
> ValidWriteIdList. There are the following 3 possibilities:
> 1. Client-provided ValidWriteIdList is more recent.
> In this case, catalog should reload the table and then send the metadata 
> consistent with the provided writeIdList.
> 2. Client ValidWriteIdList is the same.
> Catalog can return the cached metadata directly.
> 3. Client ValidWriteIdList is stale with respect to the one in catalog.
> In this case, catalog can attempt to return metadata which is consistent with 
> the client's view of the writeIdList. Note that in case 1, it is possible 
> that after the reload, catalog moves ahead of the client's writeIdList, so 
> the handling described in case 3 applies after the reload as well.
> Having such an enhancement to the API can help provide consistent reads 
> for ACID tables (see IMPALA-8788)
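
The three possibilities listed in the description can be sketched as a
small decision function. This is only an illustration under stated
assumptions, not the catalog's implementation: SimpleWriteIdList,
CatalogAction and WriteIdListComparison are hypothetical names, and the
write id list is reduced to a high watermark plus the set of write ids
at or below it that are still open or aborted.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified stand-in for a ValidWriteIdList. */
final class SimpleWriteIdList {
    final long highWatermark;
    // Write ids at or below the high watermark that are open or aborted.
    final Set<Long> invalidWriteIds;

    SimpleWriteIdList(long highWatermark, Set<Long> invalidWriteIds) {
        this.highWatermark = highWatermark;
        this.invalidWriteIds = new TreeSet<>(invalidWriteIds);
    }

    boolean isWriteIdValid(long writeId) {
        return writeId <= highWatermark && !invalidWriteIds.contains(writeId);
    }
}

/** What the catalog does for each of the three cases. */
enum CatalogAction { RELOAD_THEN_FILTER, RETURN_CACHED, FILTER_CACHED }

final class WriteIdListComparison {
    static CatalogAction decide(SimpleWriteIdList client,
                                SimpleWriteIdList cached) {
        // Case 1: the client sees writes the cache has not caught up with.
        if (client.highWatermark > cached.highWatermark) {
            return CatalogAction.RELOAD_THEN_FILTER;
        }
        for (long id : cached.invalidWriteIds) {
            if (client.isWriteIdValid(id)) {
                return CatalogAction.RELOAD_THEN_FILTER;
            }
        }
        // Case 2: both sides agree on the state of the table.
        if (client.highWatermark == cached.highWatermark
                && client.invalidWriteIds.equals(cached.invalidWriteIds)) {
            return CatalogAction.RETURN_CACHED;
        }
        // Case 3: the client is behind; serve a filtered view of the cache.
        return CatalogAction.FILTER_CACHED;
    }
}
```

RELOAD_THEN_FILTER reflects the note in case 1: after a reload the
catalog may already be ahead of the client, so the reloaded metadata
still has to be filtered against the client's list.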



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
