Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2466: Add more tests for the HDFS parquet scanner.
......................................................................
Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/1500/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

Line 238: def test_multiple_blocks_one_rg(self, vector):
one_row_group

Line 247: def _multiple_blocks_helper(self, query, total_rows, one_rg=False, ranges_per_node=1):
Can you add a brief comment summarizing what this function does? Also one_row_group here; one_rg may be hard to parse.

Line 248: result = self.client.execute(query)
Since this function makes assumptions about the query below, how about you pass in the fully-qualified table name (e.g. "functional_parquet.whatever") instead of the full query?

Line 250: assert result.data[0] == total_rows
Maybe rename total_rows to rows_in_table. Also make this assert result.data[0] == str(total_rows); then you can pass in an int, which is more intuitive.

Line 260: assert len(num_row_groups_list) == len(scan_ranges_complete_list) == 4
Personally I like this broken up into two asserts better (I think the combined form will make the output less helpful if one of the conditions fails). Also assert len(num_rows_read_list) == 4?

Line 338:
trailing whitespace

--
To view, visit http://gerrit.cloudera.org:8080/1500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4faccd9ce3fad42402652c8f17d4e7aa3d593368
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-HasComments: Yes
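
For reference, a minimal sketch of what the Line 250 and Line 260 suggestions might look like. The function names check_row_count and check_scan_counters and the expected_nodes parameter are hypothetical and not part of the patch. The sketch assumes, as the Line 250 comment implies, that the result rows come back as strings.

```python
def check_row_count(result_data, rows_in_table):
    """Compare the first result row against an integer row count.

    Assumes result rows are strings, so the int is converted with str()
    and callers can pass a plain integer.
    """
    assert result_data[0] == str(rows_in_table)


def check_scan_counters(num_row_groups_list, scan_ranges_complete_list,
                        num_rows_read_list, expected_nodes=4):
    """Check that each counter list has one entry per expected node.

    One assert per list: a failure points at the exact list whose length
    is wrong, which a single chained comparison does not.
    """
    assert len(num_row_groups_list) == expected_nodes
    assert len(scan_ranges_complete_list) == expected_nodes
    assert len(num_rows_read_list) == expected_nodes


if __name__ == "__main__":
    # Hypothetical values, only to exercise the helpers.
    check_row_count(["40000"], 40000)
    check_scan_counters(["1"] * 4, ["1"] * 4, ["10000"] * 4)
```

With the chained form, a failure does not say which of the two lengths was off; with separate asserts the failing line identifies it directly.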
