Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 )
Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet ...................................................................... Patch Set 4: (4 comments) I like the overall approach. I have a few small naming/style issues, but I think this is getting closer. http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@101 PS4, Line 101: """ Clone an existing parquet table with codec as none in the : unique database. This cloned table is passed to run_fuzz_test : which clones the table and corrupts the table. The test later : checks that there is no crash while performing SQL queries on : a corrupt table. : """ I think this comment should focus on why this test is different from the others. For example, it should explain that the parquet tables in the default schema are always compressed. So, in order to test uncompressed parquet, we need to create a new source table. I think you can skip the last two sentences. http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@111 PS4, Line 111: db_name = unique_database I would prefer to emphasize that the source and destination are the unique_database. To make that clearer, I think I would get rid of this variable and just use unique_database directly in each location. http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@117 PS4, Line 117: functional_parquet.alltypes Can we extend this to do fuzzing on decimal_tbl as well? I was thinking this could be a loop that runs fuzzing over a list of tables (that happens to have two entries). http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@118 PS4, Line 118: .format(fq_tbl_name)) This indentation is a bit awkward. I don't think .format should be on its own line. One way to get around this is to use only 4 space indentation for the second line (" select"...) and then put the .format on that line. -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Mon, 25 Sep 2017 23:25:30 +0000 Gerrit-HasComments: Yes