Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/11182
to look at the new patch set (#2).
Change subject: IMPALA-7127 (continued): initial fetch-from-catalogd
implementation
......................................................................
IMPALA-7127 (continued): initial fetch-from-catalogd implementation
This patch adds a new RPC to the catalogd which allows a client to fetch
a partial view of table or database metadata. Various subsets of
information can be specified and are sent back in fairly "raw" format.
A new MetaProvider implementation is added which uses this API to
support granular fetching of metadata into the impalad. The interface
had to be reworked in a few ways to support this:
- This API identifies partitions by ID rather than by name. So, the
listPartitions API now returns opaque PartitionRefs which are passed
back to the MetaProvider when loading more partition details. The new
implementation stores the IDs in these refs while the direct-to-HMS
implementation just uses names.
- The fetching of file descriptors was merged into the loading of other
partition metadata. I couldn't think of any cases where we needed to
list partition details without also fetching the file descriptors so
it simplified things a bit to merge the two. This was a lot easier to
implement for CatalogdMetaProvider since the file metadata is stored
by partition rather than looked up by a directory as in the previous
API.
This necessitated moving some of the logic out of LocalFsTable into
DirectMetaProvider, so LocalFsTable no longer deals directly with HDFS
APIs like FileStatus.
- The handling of "default partition" for an unpartitioned table moved
into the MetaProvider implementations themselves instead of LocalFsTable.
This is because the CatalogdMetaProvider sees the "default partition" as a
partition that actually has an identifier on the catalogd, whereas the
DirectMetaProvider does not. So, now both providers export the
"default partition" as a partition like all the others.
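To illustrate the opaque-ref idea from the first bullet, here is a minimal sketch, not the code in this patch. PartitionRef is named in the description above, but its shape here, along with IdRef, NameRef and toPartitionIds, is a hypothetical stand-in: the provider that issued a ref is the only party that interprets what is inside it.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionRefSketch {
  /** Opaque handle handed out when listing partitions; callers see only the name. */
  interface PartitionRef {
    String getName();
  }

  /** Hypothetical ref for a catalogd-backed provider: also carries the partition ID. */
  static final class IdRef implements PartitionRef {
    final long id;
    private final String name;
    IdRef(long id, String name) { this.id = id; this.name = name; }
    @Override public String getName() { return name; }
  }

  /** Hypothetical ref for a direct-to-HMS provider: the name is all it needs. */
  static final class NameRef implements PartitionRef {
    private final String name;
    NameRef(String name) { this.name = name; }
    @Override public String getName() { return name; }
  }

  /** Turns refs back into the IDs that an ID-based request would carry. */
  static List<Long> toPartitionIds(List<PartitionRef> refs) {
    List<Long> ids = new ArrayList<>();
    for (PartitionRef ref : refs) {
      // Refs are only meaningful to the provider that created them.
      if (!(ref instanceof IdRef)) {
        throw new IllegalArgumentException(
            "ref was not issued by this provider: " + ref.getName());
      }
      ids.add(((IdRef) ref).id);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<PartitionRef> refs = new ArrayList<>();
    refs.add(new IdRef(17L, "year=2018/month=7"));
    refs.add(new IdRef(18L, "year=2018/month=8"));
    System.out.println(toPartitionIds(refs));
  }
}
```

Callers never inspect the ref beyond its name, so each provider is free to keep whatever lookup key suits its backend.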
This patch also starts to address one of the potential semantic risks of
partial caching on the impalad. If one query fetches some subset of
partitions, then a DDL occurs to change the table metadata, and another
query is submitted, we want to ensure that the metadata for the latter
query still reads a consistent snapshot. In other words, we need to
ensure that the metadata like partition list and table schema come from
the same snapshot as the finer-grained metadata like partition contents.
In order to implement this, the MetaProvider API now requires that
callers use a 'TableRef' object to specify the table to be read, instead
of the dbName/tableName. In the DirectMetaProvider we don't have any
convenient version numbers for a table, so the TableRef just
encapsulates the naming. In the CatalogdMetaProvider, we additionally
store the version number of the table, and then all subsequent requests
verify that the version number has not changed. If a request detects a
concurrent modification, an exception is thrown. I tested this manually
for now by running queries against a table in a loop from my shell while
issuing concurrent 'refresh' queries to that same table. I plan to add
functionality to the planner to issue an automatic "replan" when
this exception is detected.
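The version check described above can be sketched roughly as follows. This is an illustration under assumptions, not the patch's implementation: VersionedProvider and its methods are hypothetical, the TableRef fields are guessed from the description, and the exception class here is a local stand-in that only borrows the name of the file added in this change.

```java
import java.util.HashMap;
import java.util.Map;

final class TableRefSketch {
  /** Local stand-in; only the name is taken from the file list of this change. */
  static final class InconsistentMetadataFetchException extends RuntimeException {
    InconsistentMetadataFetchException(String msg) { super(msg); }
  }

  /** Names the table and pins the version seen when the table was first loaded. */
  static final class TableRef {
    final String dbName;
    final String tableName;
    final long version;
    TableRef(String dbName, String tableName, long version) {
      this.dbName = dbName;
      this.tableName = tableName;
      this.version = version;
    }
  }

  /** Hypothetical provider that tracks one version number per table. */
  static final class VersionedProvider {
    private final Map<String, Long> versions = new HashMap<>();

    void putTable(String db, String tbl, long version) {
      versions.put(db + "." + tbl, version);
    }

    TableRef loadTable(String db, String tbl) {
      Long v = versions.get(db + "." + tbl);
      if (v == null) throw new IllegalArgumentException("no such table");
      return new TableRef(db, tbl, v);
    }

    /** Simulates a DDL or 'refresh' that bumps the table version. */
    void refresh(String db, String tbl) {
      versions.merge(db + "." + tbl, 1L, Long::sum);
    }

    /** Every fine-grained fetch re-checks the version pinned in the ref. */
    String loadPartitionList(TableRef ref) {
      Long current = versions.get(ref.dbName + "." + ref.tableName);
      if (current == null || current != ref.version) {
        throw new InconsistentMetadataFetchException(
            "table changed since version " + ref.version);
      }
      return "partitions of " + ref.tableName + " at version " + ref.version;
    }
  }

  public static void main(String[] args) {
    VersionedProvider provider = new VersionedProvider();
    provider.putTable("db", "t", 5L);
    TableRef ref = provider.loadTable("db", "t");
    System.out.println(provider.loadPartitionList(ref));
    provider.refresh("db", "t");
    try {
      provider.loadPartitionList(ref);
    } catch (InconsistentMetadataFetchException e) {
      System.out.println("stale ref detected: " + e.getMessage());
    }
  }
}
```

A planner-level "replan" would then amount to catching this exception, loading a fresh TableRef, and retrying.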
Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M common/fbs/CMakeLists.txt
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
A fe/src/main/java/org/apache/impala/catalog/local/InconsistentMetadataFetchException.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalHbaseTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalView.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
A fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
33 files changed, 1,482 insertions(+), 231 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11182/2
--
To view, visit http://gerrit.cloudera.org:8080/11182
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
Gerrit-Change-Number: 11182
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>