[ https://issues.apache.org/jira/browse/IMPALA-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761492#comment-16761492 ]
ASF subversion and git services commented on IMPALA-8151:
---------------------------------------------------------
Commit ae96a9fb19e0a2e0a5529f2f36d3b5ee0d336f69 in impala's branch
refs/heads/master from poojanilangekar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ae96a9f ]
IMPALA-8151: Use sizeof() in HiveUdfCall to specify non-primitive type's size
Previously, data type sizes were hardcoded in
HiveUdfCall::Evaluate(). Since IMPALA-7367 removed the padding
from STRING and VARCHAR types, the hardcoded 16-byte memcpy could
read past the end of the actual value and cause a crash. This change
replaces the hardcoded values with sizeof() calls to determine the
size of non-primitive types (STRING, VARCHAR and TIMESTAMP) to avoid
similar issues in the future.
Testing:
Ran test_udfs.py on an ASAN build.
Added logs to manually verify the size of bytes copied.
Change-Id: I919c330546fa86b474ab66245b20ceb1f5525b41
Reviewed-on: http://gerrit.cloudera.org:8080/12355
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
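
The approach described in the commit message can be sketched roughly as
follows. This is a minimal illustration, not the code from the commit:
ColType, StringVal, TimestampVal and CopyInput are hypothetical stand-ins
for Impala's real types and for the switch in HiveUdfCall::Evaluate(), and
the packed attribute assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for illustration; not Impala's real declarations.
enum class ColType { kInt, kBigint, kTimestamp, kString };

// Packed, so no trailing padding follows the length field (GCC/Clang syntax).
struct __attribute__((packed)) StringVal {
  char* ptr;
  int32_t len;
};

struct TimestampVal {
  int64_t time_of_day;
  int32_t date;
};

// Copies one input value and returns the number of bytes copied.
// Sizes of non-primitive types come from sizeof(), so a layout change
// cannot silently disagree with the copy length.
inline std::size_t CopyInput(ColType type, void* dst, const void* src) {
  assert(dst != nullptr && src != nullptr);
  std::size_t n = 0;
  switch (type) {
    case ColType::kInt:
      n = sizeof(int32_t);
      break;
    case ColType::kBigint:
      n = sizeof(int64_t);
      break;
    case ColType::kTimestamp:
      n = sizeof(TimestampVal);
      break;
    case ColType::kString:
      n = sizeof(StringVal);
      break;
  }
  std::memcpy(dst, src, n);
  return n;
}
```

Because each sizeof() is a compile-time constant, the compiler can still
specialize each memcpy the same way it could with a literal size.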
> HiveUdfCall assumes StringValue is 16 bytes
> -------------------------------------------
>
> Key: IMPALA-8151
> URL: https://issues.apache.org/jira/browse/IMPALA-8151
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0
> Reporter: Tim Armstrong
> Assignee: Pooja Nilangekar
> Priority: Blocker
> Labels: crash
>
> HiveUdfCall has the sizes of internal types hardcoded as magic numbers:
> {code}
> switch (GetChild(i)->type().type) {
>   case TYPE_BOOLEAN:
>   case TYPE_TINYINT:
>     // Using explicit sizes helps the compiler unroll memcpy
>     memcpy(input_ptr, v, 1);
>     break;
>   case TYPE_SMALLINT:
>     memcpy(input_ptr, v, 2);
>     break;
>   case TYPE_INT:
>   case TYPE_FLOAT:
>     memcpy(input_ptr, v, 4);
>     break;
>   case TYPE_BIGINT:
>   case TYPE_DOUBLE:
>     memcpy(input_ptr, v, 8);
>     break;
>   case TYPE_TIMESTAMP:
>   case TYPE_STRING:
>   case TYPE_VARCHAR:
>     memcpy(input_ptr, v, 16);
>     break;
>   default:
>     DCHECK(false) << "NYI";
> }
> {code}
> STRING and VARCHAR were only 16 bytes because of padding. This padding is
> removed by IMPALA-7367, so the 16-byte memcpy will read past the end of the
> actual value. This could in theory lead to a crash.
> We need to change the value, but we should probably also switch to
> sizeof(StringValue) so that it doesn't get broken by similar changes in
> future.
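
To illustrate why a hardcoded 16 stops matching once padding is removed,
here is a small sketch. PaddedString, PackedString and CopyValue are
hypothetical names made up for this example, not Impala's definitions, and
the packed attribute assumes GCC or Clang. On a typical 64-bit platform the
padded struct would be 16 bytes and the packed one 12, but the exact numbers
depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for illustration; not Impala's actual StringValue.

// Default layout: the compiler may add trailing padding after 'len' so the
// struct size is a multiple of the pointer's alignment.
struct PaddedString {
  char* ptr;
  int32_t len;
};

// Packed layout (GCC/Clang syntax): no padding, so the size is exactly the
// pointer plus the 4-byte length.
struct __attribute__((packed)) PackedString {
  char* ptr;
  int32_t len;
};

// Copies exactly one T. The size is taken from the type itself, so it
// follows the struct layout instead of a magic number.
template <typename T>
inline void CopyValue(void* dst, const T* src) {
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, sizeof(T));
}
```

A memcpy of 16 bytes from a PackedString would read bytes that do not
belong to the value, whereas CopyValue never copies more than sizeof(T).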