Tim Armstrong has posted comments on this change. (
http://gerrit.cloudera.org:8080/8056 )
Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed
parquet
......................................................................
Patch Set 6:
(4 comments)
The test doesn't appear to be creating uncompressed parquet files. In the
query profile for a scan of
test_fuzz_uncompressed_parquet_fc4a3734.parquet_uncomp_dst_decimal_tbl I see:
  HDFS_SCAN_NODE (id=0):(Total: 5.970ms, non-child: 5.970ms, % non-child: 100.00%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>): -1:1/1.43 KB
    ExecOption: PARQUET Codegen Enabled, Codegen enabled: 1 out of 1
    Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0%
    File Formats: PARQUET/SNAPPY:6
It looks like compression_codec isn't modified so we're just getting the
default behaviour of using snappy compression.
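To illustrate, here is a minimal sketch of one way the test could force
uncompressed output and then check the profile. Both helper names are
hypothetical and not part of the patch. It assumes the statements run in a
single session, so that the SET applies to the CTAS, and that the profile
carries a "File Formats:" line shaped like the one quoted above.

```python
import re


def uncompressed_copy_statements(dst_tbl, src_tbl):
    """Return SQL that copies src_tbl into a new uncompressed Parquet table.

    Hypothetical helper: both arguments are assumed to be fully qualified
    table names, and the statements must run in the same session.
    """
    return [
        # Without this the Parquet writer uses its default codec (Snappy).
        "set compression_codec=none",
        "create table {0} stored as parquet as select * from {1}".format(
            dst_tbl, src_tbl),
    ]


def parse_file_formats(profile_text):
    """Sum the "File Formats:" counters of every scan node in a profile.

    Returns a dict such as {"PARQUET/SNAPPY": 6}; empty if no such line.
    """
    counts = {}
    for line in re.findall(r"File Formats:\s*(.+)", profile_text):
        for entry in line.split():
            fmt_codec, _, count = entry.rpartition(":")
            counts[fmt_codec] = counts.get(fmt_codec, 0) + int(count)
    return counts
```

A test could then assert that no key returned by parse_file_formats() ends
in "/SNAPPY" after scanning the new table.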
http://gerrit.cloudera.org:8080/#/c/8056/6//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/8056/6//COMMIT_MSG@12
PS6, Line 12:
Can you mention what testing you did? The fuzz tests are randomised, so we
should run them in a loop for a while to confirm that they're stable.
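As a rough sketch of what "in a loop" could look like, the helper below
reruns a command until it fails or the iteration budget runs out. The helper
name and the FUZZ_TEST_CMD invocation are assumptions for illustration; use
whatever command you normally run this test file with.

```python
import subprocess
import sys

# Assumed invocation, shown only as an example; adjust to your environment.
FUZZ_TEST_CMD = ["impala-py.test", "tests/query_test/test_scanners_fuzz.py"]


def run_repeatedly(cmd, iterations):
    """Run cmd up to `iterations` times, stopping at the first failure.

    Returns the 1-based index of the first failing run, or None if every
    run exited with status 0.
    """
    for i in range(1, iterations + 1):
        if subprocess.call(cmd) != 0:
            sys.stderr.write("run %d of %d failed\n" % (i, iterations))
            return i
    return None
```

For example, run_repeatedly(FUZZ_TEST_CMD, 50) would stop at the first
unstable run and report which iteration it was.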
http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:
http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@101
PS6, Line 101: """Parquet tables in default schema are compressed, so in order
It's weird that parquet/none means parquet/snappy. I'm unsure what the history
is here, and we don't need to change it, but it is confusing.
http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@116
PS6, Line 116: " select * from functional_parquet.{1}".format(fq_tbl_name, orig_tbl_name))
Long lines > 90 chars here and just below.
http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@140
PS6, Line 140: self.execute_query("create table %s.%s like %s.%s" % (fuzz_db, fuzz_table, src_db, src_table))
Long line
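One possible way to keep this under 90 chars is to build the statement
separately and break before the format arguments. The function name below is
hypothetical; only the SQL text and the four argument names come from the
patch.

```python
def create_like_stmt(fuzz_db, fuzz_table, src_db, src_table):
    """Build the CREATE TABLE ... LIKE statement quoted in the review."""
    # Putting the arguments on a continuation line keeps each line short.
    return "create table {0}.{1} like {2}.{3}".format(
        fuzz_db, fuzz_table, src_db, src_table)
```

The result can then be passed to self.execute_query() on its own line.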
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-Change-Number: 8056
Gerrit-PatchSet: 6
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Tue, 03 Oct 2017 18:38:43 +0000
Gerrit-HasComments: Yes