Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11182

to look at the new patch set (#7).

Change subject: IMPALA-7436: initial fetch-from-catalogd implementation
......................................................................

IMPALA-7436: initial fetch-from-catalogd implementation

This patch adds a new RPC to the catalogd which allows a client to fetch
a partial view of table or database metadata. Various subsets of
information can be specified and are sent back in fairly "raw" format.

A new MetaProvider implementation is added which uses this API to
support granular fetching of metadata into the impalad. The interface
had to be reworked in a few ways to support this:

- The new API identifies partitions by ID rather than by name. So, the
  listPartitions API now returns opaque PartitionRefs which are passed
  back to the MetaProvider when loading more partition details. The new
  implementation stores the IDs in these refs while the direct-to-HMS
  implementation just uses names.

- The fetching of file descriptors was merged into the loading of other
  partition metadata. I couldn't think of any cases where we needed to
  list partition details without also fetching the file descriptors so
  it simplified things a bit to merge the two. This was a lot easier to
  implement for CatalogdMetaProvider since the file metadata is stored
  by partition rather than looked up by a directory as in the previous
  API.

  This necessitated moving some of the logic out of LocalFsTable into
  DirectMetaProvider, so LocalFsTable no longer deals directly with HDFS
  APIs like FileStatus.

- The handling of the "default partition" for an unpartitioned table
  moved into the MetaProvider implementations themselves instead of
  LocalFsTable. This is because the CatalogdMetaProvider sees the
  "default partition" as a partition that actually has an identifier on
  the catalogd, whereas the DirectMetaProvider does not. So, now both
  providers export the "default partition" as a partition like all the
  others.
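
As a rough illustration of the opaque-ref idea in the first bullet, here
is a minimal sketch. It is not the code in this patch: only the names
PartitionRef and MetaProvider come from the description above, while
NameRef, IdRef, getName(), getId() and toIds() are hypothetical names
chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; it does not mirror the real Impala classes.
final class PartitionRefSketch {
  /** Opaque handle handed out by a listPartitions-style call. */
  interface PartitionRef {
    String getName();
  }

  /** Direct-to-HMS flavor (assumed shape): the name is the only identity. */
  static final class NameRef implements PartitionRef {
    private final String name;
    NameRef(String name) { this.name = name; }
    @Override public String getName() { return name; }
  }

  /** Catalogd flavor (assumed shape): also carries the catalogd-side ID. */
  static final class IdRef implements PartitionRef {
    private final String name;
    private final long id;
    IdRef(String name, long id) { this.name = name; this.id = id; }
    @Override public String getName() { return name; }
    long getId() { return id; }
  }

  /**
   * Turns refs back into IDs, as a catalogd-backed provider might do
   * before asking for more partition details. Refs that came from a
   * different provider are rejected.
   */
  static List<Long> toIds(List<PartitionRef> refs) {
    List<Long> ids = new ArrayList<>();
    for (PartitionRef ref : refs) {
      if (!(ref instanceof IdRef)) {
        throw new IllegalArgumentException(
            "ref was not produced by this provider: " + ref.getName());
      }
      ids.add(((IdRef) ref).getId());
    }
    return ids;
  }
}
```

Because callers only ever see the PartitionRef interface, each provider
is free to keep whatever identity it needs inside the ref.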

This patch also starts to address one of the potential semantic risks of
partial caching on the impalad. If one query fetches some subset of
partitions, then a DDL occurs to change the table metadata, and another
query is submitted, we want to ensure that the metadata for the latter
query still reads a consistent snapshot. In other words, we need to
ensure that the metadata like partition list and table schema come from
the same snapshot as the finer-grained metadata like partition contents.

In order to implement this, the MetaProvider API now requires that
callers use a 'TableRef' object to specify the table to be read, instead
of the dbName/tableName. In the DirectMetaProvider we don't have any
convenient version numbers for a table, so the TableRef just
encapsulates the naming. In the CatalogdMetaProvider, we additionally
store the version number of the table, and then all subsequent requests
verify that the version number has not changed. If a request detects a
concurrent modification, an exception is thrown. In a future patch,
I'm planning on having the frontend catch the exception and trigger a
"re-plan".
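
The version check described above could look roughly like the following
sketch. It is an illustration under stated assumptions, not the code in
this patch: the TableRef fields, the checkVersion() helper, and the
nested exception class (a stand-in for the
InconsistentMetadataFetchException file added here, whose real
superclass and constructor are not shown in this message) are all
hypothetical.

```java
// Hypothetical sketch only; it does not mirror the real Impala classes.
final class TableRefSketch {
  /** Stand-in for the exception type added by the patch (shape assumed). */
  static final class InconsistentMetadataFetchException
      extends RuntimeException {
    InconsistentMetadataFetchException(String msg) { super(msg); }
  }

  /** Names the table and pins the version seen when it was first loaded. */
  static final class TableRef {
    final String dbName;
    final String tableName;
    final long version;
    TableRef(String dbName, String tableName, long version) {
      this.dbName = dbName;
      this.tableName = tableName;
      this.version = version;
    }
  }

  /**
   * Compares the version reported with a later response against the one
   * pinned in the ref. A mismatch means the table changed in between, so
   * the finer-grained metadata would come from a different snapshot.
   */
  static void checkVersion(TableRef ref, long versionInResponse) {
    if (ref.version != versionInResponse) {
      throw new InconsistentMetadataFetchException(
          "Table " + ref.dbName + "." + ref.tableName + " changed from version "
          + ref.version + " to " + versionInResponse);
    }
  }
}
```

A caller that catches the exception would discard the ref and start over
from a fresh table load, which is the "re-plan" behavior planned for a
later patch.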

Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M common/fbs/CMakeLists.txt
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
A fe/src/main/java/org/apache/impala/catalog/local/InconsistentMetadataFetchException.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalHbaseTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalView.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
A fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
36 files changed, 1,630 insertions(+), 285 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11182/7
--
To view, visit http://gerrit.cloudera.org:8080/11182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
Gerrit-Change-Number: 11182
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
