[Impala-ASF-CR] DRAFT PREVIEW: Prereqs for load test system testing
Harrison Sheinblatt has abandoned this change. Change subject: DRAFT PREVIEW: Prereqs for load test system testing .. Abandoned New one submitted -- To view, visit http://gerrit.cloudera.org:8080/3785 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ibd330d8a19ae5d968b669a4258cf826d0db28cdd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Harrison Sheinblatt Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] Prereqs for load test system testing
Harrison Sheinblatt has uploaded a new change for review. http://gerrit.cloudera.org:8080/5851 Change subject: Prereqs for load test system testing .. Prereqs for load test system testing Additional py.test parameters to control load testing, include externally linked test modules in python path, add CmService methods to query timeseries and Impala query data from CM if CM is part of SUT, as well as update admission control configuration parameters. Also added a dependency for pyjavaproperties and updated the version of python_dateutil to the virtual env. Change-Id: I38b5b4fe8c30d1c5f591bbe37b6be8bb8a5398ab --- M bin/impala-python-common.sh M bin/set-pythonpath.sh M infra/python/deps/requirements.txt M tests/comparison/cluster.py M tests/conftest.py 5 files changed, 649 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/5851/1 -- To view, visit http://gerrit.cloudera.org:8080/5851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I38b5b4fe8c30d1c5f591bbe37b6be8bb8a5398ab Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Harrison Sheinblatt
[Impala-ASF-CR] Disable HDFS ACLs on the minicluster
Harrison Sheinblatt has posted comments on this change. Change subject: Disable HDFS ACLs on the minicluster .. Patch Set 1: I think we can't make this change globally as we'd drop test coverage. I think we need to put this in as a workaround temporarily only for data load, but for testing leave the acls on. Not sure this is complete or referring to the right setting though, so still need to double check: $ find . -type f | xargs grep -i "Skip.*hdfs_acls" ./metadata/test_hdfs_permissions.py:@SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfIsilon.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfIsilon.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfIsilon.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfIsilon.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfS3.hdfs_acls ./query_test/test_insert_behaviour.py: @SkipIfIsilon.hdfs_acls ./custom_cluster/test_insert_behaviour.py:@SkipIfS3.hdfs_acls -- To view, visit http://gerrit.cloudera.org:8080/5350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1799ef34dca3b94acdfcde1fe7e5d143bdfd24c7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Harrison Sheinblatt Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
Harrison Sheinblatt has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: I think this is an OK solution to unblock current work given we have Jiras for key improvements: simplifying the calling method and considering using 'recover partitions' generally for the 'new' (or newly used) use case where we have loaded the HDFS snapshot but not the metadata snapshot. I do suggest the following: 1. Add a comment in this 'alter' section that the statement is for the code path where we have restored the HDFS snapshot but still need to generate the metadata. 2. File a new Jira to investigate other cases where there may be a problem with this use case or the .test sections aren't used as intended. E.g. in alltypeserror in functional_schema_template.sql, there is no ALTER section, but the CREATE section adds partitions explicitly. 3. Update the jira to refactor the calling python method to link the other jiras if not already, and mention that we want all 3 key code paths to be simpler in comments. I think it's too hard to make larger scale changes now given the priority of the dependent work. The reasons are: 1. We'd have to fix a lot of .test file sections to be consistent, this would require a lot of time going through them all and a lot of testing given that there are at least 3 code paths through each test file to verify. 2. Pulling the key logic up one level into the python requires a large refactor which should probably be accomplished when we refactor the key method there (jira already exists). Right now that method is too complicated to make changes to and test quickly. -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Harrison Sheinblatt Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model
Harrison Sheinblatt has posted comments on this change. Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in object model .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4873/1/tests/comparison/common.py File tests/comparison/common.py: Line 524: return tuple(col for col in self._cols if col.is_primary_key) In general I believe primary keys to be ordered, and that the order is independent of declaration order of columns in the create table statement. I think it would be better to store the primary keys as a separate list like _cols to maintain that order information in the Table class. -- To view, visit http://gerrit.cloudera.org:8080/4873 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster
Harrison Sheinblatt has posted comments on this change. Change subject: Enabling end-to-end tests on a remote cluster .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4769/1/bin/remote_data_load.py File bin/remote_data_load.py: PS1, Line 365: main > I'm having a bit of trouble parsing this sentence. Can you clarify? With the parser options directly in main() it would be difficult to invoke the main() logic from another python script without shelling out to execute the script as a sub process. If instead, you define the parse options in a separate method, and create a method that does all the logic in main() but takes a parameter of the args, then another python program could set an arg dictionary and invoke the main logic directly without need to shell out. -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster
Harrison Sheinblatt has posted comments on this change. Change subject: Enabling end-to-end tests on a remote cluster .. Patch Set 1: (3 comments) Responded to comments. http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/compute-table-stats.sh File testdata/bin/compute-table-stats.sh: PS1, Line 27: IMPALAD > IMPALA-4346 has been filed. Can you reference the Jira in a comment? http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: PS1, Line 38: HS2_HOST_PORT > I think the latter is preferable -- corral all the required configs in one Is it reasonable to add a comment referencing the Jira here? http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/setup-hdfs-env.sh File testdata/bin/setup-hdfs-env.sh: PS1, Line 53: CACHEADMIN_ARGS > Clarification: can you be more explicit about the check you want? Something If the is_kerberized block is executed above, then the CACHADMIN_ARGS would include '-owner ${PREVIOUS_USER}'. If HADOOP_USER_NAME is also true, then we add another '-owner ${USER}' to this, which probably breaks it. I think there are probably 4 bugs: 1) The is_kerberized block above probably isn't supported and should be removed and 2) the CACHEADMIN_ARGS definition logic needs a clear conditional, ideally in a single location, that sets the user/group/owner information properly in a way that you can easily tell it's always well-defined. Here it looks like the logic is intended to be that if it's kerberized it sets owner one way, if it's not kerberized and the hadoop user is defined it's set another way and if it's not kerberized and the hadoop user is not defined it stays undefined. If we want to keep the is_kerberized logic in one place, then we can have it set another parameter about owner fields and here only update it if it's set already. 3) CACHEADMIN_ARGS is prob! ably the wrong name as it is for the -addPool command and sets a subset of the args 4) We should explicitly set all arguments to cacheadmin, -addPool if possible (e.g. mode maxTtl) -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
Harrison Sheinblatt has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/4348/9/testdata/bin/check-hbase-nodes.py File testdata/bin/check-hbase-nodes.py: Line 135: zk_client.stop() A problem is that zk client wants you to stop() and then close(). In this case, if there is an exception or something that gets through, the close() will be called without stop(). This may work without a 'graceful shutdown' of the client, and be OK. I've done a 2-phase context manager in the past to ensure stop() is called before close(). -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes