[
https://issues.apache.org/jira/browse/IMPALA-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yida Wu resolved IMPALA-13894.
------------------------------
Fix Version/s: Impala 5.0.0
Resolution: Fixed
> Tuple cache correctness verification should proceed past file size differences
> ------------------------------------------------------------------------------
>
> Key: IMPALA-13894
> URL: https://issues.apache.org/jira/browse/IMPALA-13894
> Project: IMPALA
> Issue Type: Task
> Components: Backend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Yida Wu
> Priority: Major
> Fix For: Impala 5.0.0
>
>
> Tuple cache correctness verification does a fast check to see if the two
> files are identical. If it determines that they are not identical, then it
> can proceed to a slow check that corrects for order differences.
> This fast check looks at the file sizes and if they are not the same, it
> returns a not-OK status:
> {noformat}
> if (file1_length != file2_length || file1_length == TUPLE_TEXT_FILE_SIZE_ERROR) {
>   return Status(TErrorCode::TUPLE_CACHE_INCONSISTENCY,
>       Substitute("Size of file '$0' (size: $1) and '$2' (size: $3) are different",
>           path_a + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file1_length,
>           path_b + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file2_length));
> }
> {noformat}
> Returning a not-OK status causes the calling code to skip the slow check,
> which can give more detail about what is different. We should change this to
> set *passed = false and let the slower check go forward so that it produces
> a more informative error message. It's also unclear whether the same rows in
> a different order would always have the same size.