Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3764,3914: fuzz test HDFS scanners and fix parquet bugs 
found
......................................................................


Patch Set 10:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3448/9/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 649:       FILE_CHECK_GE(col_reader->def_level(),
> Doesn't have to be done now, but FILE_CHECK places would be nice if they ca
I'm not sure I exactly understand. FILE_CHECKs are DCHECKs now so we should 
crash if the scanner actually exercises them.


http://gerrit.cloudera.org:8080/#/c/3448/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 497:   if (range->offset_ < 0) {
> this seems weird. makes it unclear if it's an invariant or not. I think we 
Yeah that is weird. Fixed it.


http://gerrit.cloudera.org:8080/#/c/3448/9/be/src/runtime/scoped-buffer.h
File be/src/runtime/scoped-buffer.h:

Line 13: // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> mind putting in the ASF license text instead?
Done


http://gerrit.cloudera.org:8080/#/c/3448/9/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

Line 171:     if (UNLIKELY(buffer_len == 0)) return Status("Dictionary cannot 
be 0 bytes");
> who can call this with buffer_len == 0? why isn't != 0 an invariant while <
My thought was that negative lengths are clearly nonsensical, but the fact that 
zero-length buffers aren't valid dictionaries seems closer to an implementation 
detail. Or at least it isn't obvious to someone working on the scanner code; 
calling SetData() with length = 0 doesn't obviously look wrong. 

So it seemed reasonable that we should validate it since we're validating the 
input here anyway. I can remove the check if you prefer since the callsite in 
parquet checks it anyway.


http://gerrit.cloudera.org:8080/#/c/3448/9/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 1: # Licensed to the Apache Software Foundation (ASF) under one
> please put ASF license here rather than copyright.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I50cf43195a7c582caa02c85ae400ea2256fa3a3b
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to