Csaba Ringhofer has uploaded a new patch set (#19) to the change originally 
created by Sudhanshu Arora. ( http://gerrit.cloudera.org:8080/13334 )

Change subject: acid: Filter unwanted files based on ACID state.
......................................................................

acid: Filter unwanted files based on ACID state.

- Added new functionality in AcidUtils to filter out files in
  uncommitted directories, and to find the latest valid base data and
  filter out files corresponding to older deltas or bases.

- Changed Table loading to only load writeIds for transactional tables,
  and enabled a previously-ignored unit test.

- Modified Hive configuration to enable support for compactions:
-- Tez has to be on the HMS classpath, since HMS, rather than HS2,
   actually schedules compactions.
-- A worker thread has to be configured for the compactor, or else
   compactions do not proceed even when manually triggered.
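
The filtering described in the first bullet above can be illustrated with
a minimal sketch. This is not the AcidUtils code from this change: the
class and method names are hypothetical, and it assumes Hive's usual
directory naming (base_<writeId> and delta_<min>_<max>[_<stmtId>]), paths
relative to a single partition directory, and un-compacted insert-only
deltas where min equals max. Files outside any base or delta directory are
passed through unchanged, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the AcidUtils implementation.
class AcidFilterSketch {
  // Assumed directory naming: base_<writeId>, delta_<min>_<max>[_<stmtId>].
  private static final Pattern BASE = Pattern.compile("base_(\\d+)");
  private static final Pattern DELTA =
      Pattern.compile("delta_(\\d+)_(\\d+)(?:_\\d+)?");

  // A write ID counts as committed here if it is at or below the high
  // watermark and is not listed as open or aborted.
  static boolean isValid(long id, long highWatermark, Set<Long> invalidIds) {
    return id <= highWatermark && !invalidIds.contains(id);
  }

  // Returns the first path component, or "" for a file in the root.
  static String topDir(String relPath) {
    int slash = relPath.indexOf('/');
    return slash < 0 ? "" : relPath.substring(0, slash);
  }

  static List<String> filter(List<String> relPaths, long highWatermark,
      Set<Long> invalidIds) {
    // Pass 1: find the latest base whose write ID is committed.
    long bestBase = -1;
    for (String p : relPaths) {
      Matcher m = BASE.matcher(topDir(p));
      if (m.matches()) {
        long id = Long.parseLong(m.group(1));
        if (isValid(id, highWatermark, invalidIds)) {
          bestBase = Math.max(bestBase, id);
        }
      }
    }
    // Pass 2: keep the chosen base plus committed deltas newer than it.
    List<String> kept = new ArrayList<>();
    for (String p : relPaths) {
      String dir = topDir(p);
      Matcher base = BASE.matcher(dir);
      Matcher delta = DELTA.matcher(dir);
      if (base.matches()) {
        if (Long.parseLong(base.group(1)) == bestBase) kept.add(p);
      } else if (delta.matches()) {
        long id = Long.parseLong(delta.group(1));
        if (id > bestBase && isValid(id, highWatermark, invalidIds)) {
          kept.add(p);
        }
      } else {
        kept.add(p);  // Not in an ACID directory: passed through as-is.
      }
    }
    return kept;
  }
}
```

For example, given base_0000002, base_0000005, and deltas for write IDs
3, 6 and 7, with a high watermark of 7 and write ID 7 still open, the
sketch keeps only the files under base_0000005 and the delta for write
ID 6.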

Testing:
- New unit tests (AcidUtilsTest) for filtering logic.
- New e2e test to read data written by Hive in an insert-only table,
  with INSERT, INSERT OVERWRITE, and compaction. Also tests negative
  cases e2e.

To enable the e2e test, this adds support for a 'HIVE_QUERY' section to
the test script files. To make it reasonably fast, this uses Thrift to
connect to HS2 rather than shelling out to beeline. In order for this to
work properly, a bit of extra special-casing had to be added to the test
utility.

This commit was co-authored by Sudhanshu Arora and Todd Lipcon.

Change-Id: Icf0aeb36e10c827ead59ed7f67e731199394fe8e
---
M fe/pom.xml
M fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
A fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/run-hive-server.sh
A testdata/workloads/functional-query/queries/QueryTest/acid-compaction.test
A testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/acid.test
M tests/common/impala_connection.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
A tests/query_test/test_acid.py
M tests/util/test_file_parser.py
22 files changed, 824 insertions(+), 182 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/13334/19
--
To view, visit http://gerrit.cloudera.org:8080/13334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icf0aeb36e10c827ead59ed7f67e731199394fe8e
Gerrit-Change-Number: 13334
Gerrit-PatchSet: 19
Gerrit-Owner: Sudhanshu Arora <sudhan...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <sudhan...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
