David Knupp has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
......................................................................


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4769/13//COMMIT_MSG
Commit Message:

Line 51: still problems to work out with the remote data load script itself.
> Did you try loading data on a remote cluster and running tests on in with t
Yes, many times. I should update this sentence to be more clear.

This is mainly a reference to the several "clean up" changes that Harrison 
suggested earlier, for which JIRAs have been opened. We can address those when 
there's time. More pressing than that cleanup is the fact that we need to have 
this checked in, in order to validate Impala running against a remote 
CDH 5.10 cluster, and time is getting short. We have less than two weeks now.

There were some other actual problems that were mysterious to me initially. 
E.g., Kudu-related failures started appearing once recent Kudu changes were 
submitted -- until I realized that this issue was breaking things: 

https://jira.cloudera.com/browse/OPSAPS-37322

But after tweaking the cluster, data loading works, and tests run -- though 
many tests may need to be tweaked to work remotely.


http://gerrit.cloudera.org:8080/#/c/4769/13/bin/remote_data_load.py
File bin/remote_data_load.py:

Line 534:         sys.exit(1)
> In general, I think it's a bad practice to call sys.exit inside functions. 
OK, I'll move this. I'd seen this pattern used in other scripts here 
(e.g., load-data.py, which we use for local data loading), so I didn't know it 
was a frowned-upon practice.
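
For reference, a minimal sketch of the shape this could take, assuming the 
function raises on failure and only the entry point decides the exit code. The 
names DataLoadError, run_step, and main are hypothetical and are not taken from 
remote_data_load.py:

```python
import sys


class DataLoadError(Exception):
    """Hypothetical error type for a failed data load step."""


def run_step(name, succeeded):
    # Hypothetical helper: raise instead of calling sys.exit(), so that
    # callers (and tests) can decide how to handle the failure.
    if not succeeded:
        raise DataLoadError("step failed: %s" % name)
    return name


def main(steps):
    # The single place where a failure is translated into an exit code.
    try:
        for name, succeeded in steps:
            run_step(name, succeeded)
    except DataLoadError as e:
        sys.stderr.write("%s\n" % e)
        return 1
    return 0

# In the script's entry point, the return value would be passed along:
#   sys.exit(main(steps))
```

With this layout the functions stay importable and testable, since nothing 
below the entry point can terminate the interpreter.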


http://gerrit.cloudera.org:8080/#/c/4769/13/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

PS13, Line 120: Wide tables fail due to the SERDEPROPERTIES limits
> is this a new issue? Is it specific to remote data loading?
For our mini-cluster, we work around this problem here:

https://github.com/apache/incubator-impala/blob/master/bin/create-test-configuration.sh#L99

However, create-test-configuration.sh is part of our local build process. It 
doesn't get called when CDH is deployed to a remote cluster. Besides, that 
script assumes that the metastore database will always be postgres, which is 
not the case when testing against a remote cluster.

Before the change to this file, I had been using another hand-rolled script to 
configure the property separately after deployment. With this, I can drop that 
step.

I've also tested the local data load after this change, and it's unaffected.


-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <h...@hotmail.com>
Gerrit-Reviewer: Martin Grund <grundprin...@gmail.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-HasComments: Yes
