Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8260 )

Change subject: PREVIEW: IMPALA-6052: Change HDFS layout for test tables
......................................................................


Patch Set 3:

(1 comment)

> (2 comments)
 >
 > The direction here seems fine to me. This is a case where I think
 > you'll need to run both exhaustive and S3 tests before commit,
 > since this is so very cross-cutting.
 >
 > What's the bigger picture of what brought you here?

I ran the exhaustive tests. Dataload on S3 relies on snapshots, so this change 
is not easy to test there.

Partly this is just a cleanup for cleanup's sake. I don't like listing HDFS and 
finding hundreds of directories with no structure. I think developers will be 
better off whenever they need to look at the HDFS files.

A separate motivation is that the inconsistency has created problems when 
loading data on a remote cluster. The current way of loading data on a remote 
cluster is to copy the HDFS data over and then recreate the tables and 
metadata. However, if dataload can't tell that a table is already populated (by 
looking at the disk usage of directories on HDFS), then it will try to do 
inserts or loads as well as the create table statements. IMPALA-6068 had to be 
reverted because dataload couldn't tell that a table was already populated, and 
it tried to do a LOAD DATA LOCAL statement, which can't work on a remote 
cluster. This change makes the HDFS layout consistent, so that dataload can 
accurately detect whether a table is populated.
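
As a rough illustration of the check described above, here is a minimal sketch. 
It is not the actual dataload code: the function names, the example paths, and 
the dir_sizes mapping (directory path to size in bytes, as might be parsed from 
an HDFS disk usage listing) are all hypothetical.

```python
def is_populated(table_dir, dir_sizes):
    """Return True if the table's directory already holds data.

    dir_sizes is a hypothetical mapping from an HDFS directory path (without
    a trailing slash) to its size in bytes. A missing entry counts as empty.
    """
    return dir_sizes.get(table_dir.rstrip("/"), 0) > 0


def statements_to_run(table_dir, dir_sizes):
    """Pick which kinds of statements are needed for one table."""
    if is_populated(table_dir, dir_sizes):
        # The data was already copied over, so only the table and its
        # metadata need to be recreated.
        return ["create"]
    # Nothing on disk yet: create the table, then insert or load the data.
    return ["create", "load"]
```

With a consistent layout, every table maps to one predictable directory, so a 
lookup like this cannot miss data that was copied to an unexpected location and 
fall back to a load statement that fails on a remote cluster.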

http://gerrit.cloudera.org:8080/#/c/8260/3/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/8260/3/testdata/bin/generate-schema-statements.py@473
PS3, Line 473:   if p.returncode != 0:
             :     print "eval_section command failed: {0}".format(cmd)
             :     assert(False)
> nit: I think this is equivalent to:
Done



--
To view, visit http://gerrit.cloudera.org:8080/8260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba27ba6d3c7e445795e750281070963bbe1bb51
Gerrit-Change-Number: 8260
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Comment-Date: Fri, 29 Dec 2017 23:24:10 +0000
Gerrit-HasComments: Yes
