David Knupp has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
......................................................................

Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4769/13//COMMIT_MSG
Commit Message:

Line 51: still problems to work out with the remote data load script itself.
> Did you try loading data on a remote cluster and running tests on in with t

Yes, many times. I should update this sentence to be clearer. It mainly refers to the several "clean up" changes that Harrison suggested earlier, for which JIRAs have been opened. We can address those when there's time. More pressing than cleaning up all those things is the fact that we need to have this checked in so that we can validate Impala running against a remote CDH 5.10 cluster, and time is getting short. We have less than two weeks now.

There were some other actual problems that were mysterious to me initially. E.g., Kudu-related failures started appearing once recent Kudu changes were submitted, until I realized that this issue was breaking things:

https://jira.cloudera.com/browse/OPSAPS-37322

But after tweaking the cluster, data loading works and tests run, though many tests may need to be tweaked to work remotely.


http://gerrit.cloudera.org:8080/#/c/4769/13/bin/remote_data_load.py
File bin/remote_data_load.py:

Line 534: sys.exit(1)
> In general, I think it's a bad practice to call sys.exit inside functions.

OK, I'll move this. I'd seen this pattern used in other scripts here (e.g., load-data.py, which we use for local data loading), so I didn't know it was a frowned-upon practice.


http://gerrit.cloudera.org:8080/#/c/4769/13/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

PS13, Line 120: Wide tables fail due to the SERDEPROPERTIES limits
> is this a new issue? Is it specific to remote data loading?

For our mini-cluster, we work around this problem here:

https://github.com/apache/incubator-impala/blob/master/bin/create-test-configuration.sh#L99

However, create-test-configuration.sh is part of our local build process.
It doesn't get called when CDH is deployed to a remote cluster. Besides, that script assumes that the metastore database will always be postgres, which is not the case when testing against a remote cluster.

Before the change to this file, I had been using another hand-rolled script to configure the property separately after deployment. With this, I can drop that step. I've also tested the local data load after this change, and it's unaffected.


--
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <h...@hotmail.com>
Gerrit-Reviewer: Martin Grund <grundprin...@gmail.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-HasComments: Yes