Dan Hecht has posted comments on this change. Change subject: IMPALA-3441, IMPALA-3659: check for malformed Avro data ......................................................................
Patch Set 11: (13 comments) http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner-test.cc File be/src/exec/hdfs-avro-scanner-test.cc: Line 304: // TestReadAvroString(data, 10, sv, 1); i assume you'll reenable this? http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner.cc File be/src/exec/hdfs-avro-scanner.cc: PS11, Line 189: num_entries this should be validated (like line 123). Line 483: DCHECK_GE(num_records, 0); should return status Line 485: DCHECK_GE(compressed_size, 0); same http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/hdfs-avro-scanner.h File be/src/exec/hdfs-avro-scanner.h: PS11, Line 283: Report Consider renaming these: SetStatusCorruptData(), SetStatusInvalidValue() to make it clearer what these do. http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/read-write-util.cc File be/src/exec/read-write-util.cc: PS11, Line 25: FindZLongLength how about calling it FindZIntegerLength() since it's really a utility for ReadZInteger()? PS11, Line 36: ReadZLong ReadZInteger() PS11, Line 38: ReadZLongSlow ReadZIntegerSlow Line 61: ZResult ReadWriteUtil::ReadZInteger(uint8_t** buf, uint8_t* buf_end) { maybe add: DCHECK(MAX_LEN == MAX_ZINT_LEN || MAX_LEN == MAX_ZLONG_LEN); Line 87: } does ReadZInteger get inlined? might be safer to put these in the header so that the wrappers get inlined at the callsite. http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/read-write-util.h File be/src/exec/read-write-util.h: Line 109: /// is incremented past the encoded long. should document buf_end Line 247: // } remove commented out code http://gerrit.cloudera.org:8080/#/c/3072/11/be/src/exec/scanner-context.h File be/src/exec/scanner-context.h: Line 244: Status ReportInvalidInt(); it's pretty confusing that these are named "Report" given that, unlike the new scanner routines which do report, these don't actually report the error but just generate the error status. Anyway, not your change, we can leave it for later. (See suggestion for Avro case to help differentiate). -- To view, visit http://gerrit.cloudera.org:8080/3072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I801a11c496a128e02c564c2a9c44baa5a97be132 Gerrit-PatchSet: 11 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Skye Wanderman-Milne <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
