Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-3441: check for malformed Avro data
......................................................................


Patch Set 5:

(4 comments)

I used perf top with the --perf_map option and ReadZLong() is where most of the 
time is spent:

  18.47%  impalad [.] _ZN6impala13ReadWriteUtil9ReadZLongEPPhPlS3_
   9.11%  impalad [.] _ZN6snappy13RawUncompressEPNS_6SourceEPc

Unfortunately I need to add even more error checking to this function 
(IMPALA-3659), so maybe we should get this in and work on ReadZLong() perf in 
that patch.

http://gerrit.cloudera.org:8080/#/c/3072/6/be/src/exec/hdfs-avro-scanner-ir.cc
File be/src/exec/hdfs-avro-scanner-ir.cc:

Line 176:     ReportInvalidValue(TErrorCode::AVRO_INVALID_LENGTH, len);
> UNLIKELY
Done


Line 180:     ReportCorruptData(TErrorCode::AVRO_TRUNCATED_BLOCK);
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/3072/5/be/src/exec/hdfs-avro-scanner-test.cc
File be/src/exec/hdfs-avro-scanner-test.cc:

Line 190:   // TODO: we don't handle invalid values (e.g. overflow)
> There's an ASSERT_DEBUG_DEATH macro that does some magic to assert that it 
Dan pointed out that we need to fix this one way or another or we still may 
crash on corrupt files, so I'm gonna skip this for now since I need to fix it 
ASAP anyway. I'll add a test case for it when I fix it.


http://gerrit.cloudera.org:8080/#/c/3072/6/testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
File 
testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test:

Line 7
> Are you intending to fix this in this patch?
Oops, forgot to address this. I'm having a lot of trouble matching these error 
messages with the current test verifier logic, so I commented them out for now 
and meant to file a JIRA to fix this.


-- 
To view, visit http://gerrit.cloudera.org:8080/3072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I801a11c496a128e02c564c2a9c44baa5a97be132
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to