Dan Hecht has posted comments on this change.

Change subject: IMPALA-3441, IMPALA-3659: check for malformed Avro data
......................................................................


Patch Set 11:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner-test.cc
File be/src/exec/hdfs-avro-scanner-test.cc:

Line 304:   // TestReadAvroString(data, 10, sv, 1);
i assume you'll reenable this?


http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

PS11, Line 189: num_entries
this should be validated (like line 123).


Line 483:     DCHECK_GE(num_records, 0);
should return status


Line 485:     DCHECK_GE(compressed_size, 0);
same


http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner.h
File be/src/exec/hdfs-avro-scanner.h:

PS11, Line 283: Report
Consider renaming these: SetStatusCorruptData(), SetStatusInvalidValue() to 
make it clearer what these do.


http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/read-write-util.cc
File be/src/exec/read-write-util.cc:

PS11, Line 25: FindZLongLength
how about calling it FindZIntegerLength() since it's really a utility for 
ReadZInteger()?


PS11, Line 36: ReadZLong
ReadZInteger()


PS11, Line 38: ReadZLongSlow
ReadZIntegerSlow


Line 61: ZResult ReadWriteUtil::ReadZInteger(uint8_t** buf, uint8_t* buf_end) {
maybe add:
DCHECK(MAX_LEN == MAX_ZINT_LEN || MAX_LEN == MAX_ZLONG_LEN);


Line 87: }
does ReadZInteger get inlined? might be safer to put these in the header so 
that the wrappers get inlined at the callsite.


http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/read-write-util.h
File be/src/exec/read-write-util.h:

Line 109:   /// is incremented past the encoded long.
should document buf_end


Line 247: // }
remove commented out code


http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 244:     Status ReportInvalidInt();
it's pretty confusing that these are named "Report" given that, unlike the new 
scanner routines which do report, these don't actually report the error but 
just generate the error status.  Anyway, not your change, we can leave it for 
later. (See suggestion for Avro case to help differentiate).


-- 
To view, visit http://gerrit.cloudera.org:8080/3072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I801a11c496a128e02c564c2a9c44baa5a97be132
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to