Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12212 )

Change subject: [scripts] Add initial test scripts for backup/restore testing
......................................................................


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py
File src/kudu/scripts/backup-perf.py:

PS1:
Some of the lines in this file are too long.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py@50
PS1, Line 50: self.last_tick_ = timeit.default_timer()
Nit: I don't think we need the underscore suffix to denote class variables; in Python you have to prepend 'self.' to every class variable reference anyway, so it's usually pretty obvious when you're accessing a class variable.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py@71
PS1, Line 71: Backported from Python 2.7 as it's implemented as pure python on stdlib.
Not really understanding the justification. I thought this was so you could use check_output() on Python 2.6, since it was introduced in 2.7?


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py@76
PS1, Line 76: output, unused_err = process.communicate()
Nit: Python convention is to use _ in place of a variable whose value you don't care about. But if this is a faithful backport, ignore this comment.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py@223
PS1, Line 223: kudu-backup2_2.11-*.jar
Could we glob this even more aggressively so that we're resilient to Spark/Scala version changes too? For kudu-spark2-tools as well.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/backup-perf.py@244
PS1, Line 244: # Actions
I think you can simplify the true/false stuff with action='store_true'; doesn't that imply type, choices, and default?


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/grep_stats.py
File src/kudu/scripts/grep_stats.py:

PS1:
License header.

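On the Line 244 comment in backup-perf.py, here is a minimal sketch of what action='store_true' could look like. The flag names (--load, --backup) are hypothetical and not taken from the script. With store_true the option takes no value and defaults to False, so type and choices no longer apply.

```python
import argparse

# Hypothetical flags for illustration; the real script's options may differ.
parser = argparse.ArgumentParser(description="store_true sketch")
parser.add_argument("--load", action="store_true",
                    help="Load the table before running")
parser.add_argument("--backup", action="store_true",
                    help="Run the backup step")

# Passing the bare flag turns it on; omitting it leaves the default (False).
args = parser.parse_args(["--load"])
print(args.load, args.backup)
```

One consequence is that callers would pass the bare flag (--load) rather than a value (--load true), so any wrapper script that invokes the tool would need to change accordingly.
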

http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/grep_stats.py@6
PS1, Line 6: for line in sys.stdin:
This script isn't used yet, and I think you could replace it with an awk snippet anyway. Something like https://stackoverflow.com/questions/38972736/how-to-select-lines-between-two-patterns/38972737#38972737.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/run_backup_tests.sh
File src/kudu/scripts/run_backup_tests.sh:

http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/run_backup_tests.sh@51
PS1, Line 51: NUM_TABLET_SERVERS=$((`kudu tserver list localhost -format=csv | wc -l`))
This should use $KUDU_MASTER_ADDRESSES instead of localhost.


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/run_backup_tests.sh@62
PS1, Line 62: OPTS="$OPTS --kudu-spark-tools-jar $ROOT/java/kudu-spark-tools/build/libs/kudu-spark2-tools_2.11-1.9.0-SNAPSHOT.jar"
            : OPTS="$OPTS --kudu-backup-jar $ROOT/java/kudu-backup/build/libs/kudu-backup2_2.11-1.9.0-SNAPSHOT.jar"
Could we glob these so we needn't update this when the version changes?


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/run_backup_tests.sh@81
PS1, Line 81: hadoop fs -rm -r -skipTrash $BACKUP_PATH/$TS
Should we do something like this before we get started, to clear out any leftover state from a previous interrupted run?


http://gerrit.cloudera.org:8080/#/c/12212/1/src/kudu/scripts/run_backup_tests.sh@89
PS1, Line 89: exit 0
This isn't necessary; the script will exit with status 0 on its own at the end.

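On the two globbing comments (backup-perf.py line 223 and run_backup_tests.sh line 62), a rough sketch of version-agnostic jar lookup on the Python side. The helper name find_single_jar and the pattern "kudu-backup*.jar" are assumptions for illustration, not code from the patch. A broad pattern may also match other jars built into the same directory, so this sketch insists on exactly one match rather than silently picking one.

```python
import glob
import os


def find_single_jar(libs_dir, pattern):
    """Return the only jar in libs_dir matching pattern, or raise ValueError."""
    matches = sorted(glob.glob(os.path.join(libs_dir, pattern)))
    if len(matches) != 1:
        # Zero matches means the jar was not built; several means the
        # pattern is too loose for this directory.
        raise ValueError("expected exactly one jar matching %s in %s, found %d"
                         % (pattern, libs_dir, len(matches)))
    return matches[0]
```

It could then be called as find_single_jar(libs_dir, "kudu-backup*.jar"), where libs_dir is the relevant build/libs directory; if the lookup lived in the Python script, the shell wrapper would not need to hardcode the version string at all.
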

-- 
To view, visit http://gerrit.cloudera.org:8080/12212
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c8efc3778b20687f0c1bc4c825f49f8f24e6d3b
Gerrit-Change-Number: 12212
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Comment-Date: Fri, 11 Jan 2019 04:25:35 +0000
Gerrit-HasComments: Yes
