Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-3764: fuzz test HDFS scanners ......................................................................
Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/3448/3/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: Line 642: rng = random.Random() I don't think we need to create a Random object, because it's inconvenient to pass it everywhere. Just call random.seed(x). Then instead of calling rng.randint(...), you can call random.randint(...). The behavior should be identical. Line 644: if random_seed is None: random_seed = time.time() how about: random_seed = os.environ.get("SCANNER_FUZZ_SEED") or time.time() Also, you don't need to put None in the get function, it defaults to that. PS3, Line 676: not nit: I think it's a little easier to read if you put "not" right before "in". if "SCANNER_FUZZ_KEEP_FILES" not in os.environ Line 701: """ Start the comment on this line. Here and elsewhere. PS3, Line 721: copy nit: how about copy_num? PS3, Line 722: copy%d_%s I think here it's easier to use format here: "copy{}_{}".format(copy, filename) so you don't have to worry about differentiating between strings and integers. Also, in general .format() should be preferred over % because % is being deprecated. There is no % in Python 3, to which we will have to switch at some point. PS3, Line 726: self.corrupt_file(filepath, rng) How about moving this to the loop above under shutil.copyfile? self.corrupt_file(copypath) PS3, Line 750: 0, 254 Maybe leave a comment why only 255 different values are generated (instead of 256)? -- To view, visit http://gerrit.cloudera.org:8080/3448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I50cf43195a7c582caa02c85ae400ea2256fa3a3b Gerrit-PatchSet: 3 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-HasComments: Yes
