[
https://issues.apache.org/jira/browse/IMPALA-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663969#comment-16663969
]
Tim Armstrong commented on IMPALA-7756:
---------------------------------------
Maybe it's cheap enough to add an UNLIKELY() branch at some of these places
though. It looks like it's mostly in the nested types scanning code.
> Create a "strict mode" for file parsing in debug builds
> -------------------------------------------------------
>
> Key: IMPALA-7756
> URL: https://issues.apache.org/jira/browse/IMPALA-7756
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Infrastructure
> Reporter: Csaba Ringhofer
> Priority: Minor
> Labels: parquet
>
> Errors in data files can be handled in several ways:
> a. the error can lead to a non-ok STATUS returned to the user
> b. some errors are ignored intentionally to support files written by buggy
> writers from the past or to avoid the performance cost of checking /
> complexity of bubbling up the status
> c. some errors lead to hitting a DCHECK - this means that it will be ignored
> in release builds, but will cause crash in debug builds
> c. is problematic because fuzz tests can generate such corrupt files, leading
> to crashed (and flaky) tests. On the other side, ignoring errors can mean
> that bugs in writers are not detected by Impala's tests.
> c. could be replaced with DCHECK like macros that are switched off if a flag
> (e.g. 'strict_parsers') is false. The flag would have no effect in release
> builds, and would be true by default during tests, while some specific tests
> (fuzz tests + parser tests for corrupt files) would start impalads with the
> flag turned off.
> The drawback would be that related tests would have to be custom cluster
> tests, making the test suite slower. The main benefit is that it is easy to
> add such checks to the code and they can provide guardrails during file
> writer development.
> An alternative approach would be to implement this as a query option, which
> would make tests simpler and faster, but would make the backend changes less
> trivial.
> There are already macros that seem to have the same purpose (e.g.
> FILE_CHECK_GE),, but they are implemented as simple DCHECKs (see
> https://github.com/apache/impala/blob/15e8ce4f273945ce548fe677ee0140dea8068e6d/be/src/util/debug-util.h#L153I)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]