Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18811 )


Change subject: IMPALA-11469: Make prefix of ignored staging dirs configurable
......................................................................

IMPALA-11469: Make prefix of ignored staging dirs configurable

External systems like Hive or Spark write temporary or "non-data"
files in the table location. Catalogd skips them when loading file
metadata. However, the prefixes of the skipped dirs are currently
hard-coded. We recently found that Spark streaming generates a
_spark_metadata dir, which is not handled correctly.

To avoid future code changes when interacting with more systems, this
patch adds a new catalogd startup flag, ignored_dir_prefix_list. It is a
comma-separated list of prefixes of the dirs to ignore. Currently, the
default value is ".,_tmp.,_spark_metadata". Users can add more prefixes
as needed.
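
The prefix matching described above can be sketched roughly as follows.
This is a minimal illustration, not the code in FileSystemUtil.java: the
class name IgnoredDirPrefixSketch and the methods parsePrefixes and
isIgnoredDir are hypothetical, and the sketch assumes a dir is ignored
when its name starts with any prefix in the list. Only the flag's
default value is taken from this change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names here are not taken from the Impala code base.
class IgnoredDirPrefixSketch {
  // Default value of ignored_dir_prefix_list as stated in this change.
  static final String DEFAULT_PREFIX_LIST = ".,_tmp.,_spark_metadata";

  // Splits the comma-separated flag value into prefixes, dropping empty entries.
  static List<String> parsePrefixes(String flagValue) {
    List<String> prefixes = new ArrayList<>();
    for (String part : flagValue.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) prefixes.add(trimmed);
    }
    return prefixes;
  }

  // Assumption: a dir is ignored if its name starts with any configured prefix.
  static boolean isIgnoredDir(String dirName, List<String> prefixes) {
    for (String prefix : prefixes) {
      if (dirName.startsWith(prefix)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> prefixes = parsePrefixes(DEFAULT_PREFIX_LIST);
    String[] names = {"_spark_metadata", ".hive-staging_1", "_tmp.abc", "year=2022"};
    for (String name : names) {
      System.out.println(name + " -> " + isIgnoredDir(name, prefixes));
    }
  }
}
```

Under these assumptions, a partition dir such as year=2022 is still
loaded, while _spark_metadata and dot-prefixed staging dirs are skipped.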

Tests:
 - Add a case for _spark_metadata in FileSystemUtilTest

Change-Id: I108bfa823281a35d28932f7ccce0b12a0c5af57d
---
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
5 files changed, 50 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/18811/1
--
To view, visit http://gerrit.cloudera.org:8080/18811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I108bfa823281a35d28932f7ccce0b12a0c5af57d
Gerrit-Change-Number: 18811
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
