[ https://issues.apache.org/jira/browse/IMPALA-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758629#comment-16758629 ]
Pooja Nilangekar commented on IMPALA-8151:
------------------------------------------
I agree. I believe it would make sense to use sizeof() for all other datatypes
as well, since datatypes like TIMESTAMP may be modified in the future. Or would
that add too much overhead?
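
A minimal sketch of that idea, assuming hypothetical stand-in names
(StringValueSketch, SketchType, CopyValue, CopyInput) rather than the actual
HiveUdfCall code or Impala's real type definitions. Since sizeof() is evaluated
at compile time, each memcpy below still gets a constant size, so this should
not cost anything at runtime compared with the hardcoded numbers:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an internal string slot type; Impala's real
// StringValue is defined in the backend and may differ.
struct StringValueSketch {
  char* ptr;
  int32_t len;
};

// Hypothetical, reduced type enum used only for this illustration.
enum class SketchType { BOOLEAN, TINYINT, SMALLINT, INT, FLOAT, BIGINT, DOUBLE, STRING };

// Copies one value of type T. sizeof(T) is a compile-time constant, so the
// compiler can still lower this memcpy to a fixed-size copy.
template <typename T>
inline std::size_t CopyValue(void* dst, const void* src) {
  std::memcpy(dst, src, sizeof(T));
  return sizeof(T);
}

// Copies one input value and returns the number of bytes copied,
// or 0 if the type is not handled.
inline std::size_t CopyInput(SketchType type, void* dst, const void* src) {
  switch (type) {
    case SketchType::BOOLEAN:  return CopyValue<bool>(dst, src);
    case SketchType::TINYINT:  return CopyValue<int8_t>(dst, src);
    case SketchType::SMALLINT: return CopyValue<int16_t>(dst, src);
    case SketchType::INT:      return CopyValue<int32_t>(dst, src);
    case SketchType::FLOAT:    return CopyValue<float>(dst, src);
    case SketchType::BIGINT:   return CopyValue<int64_t>(dst, src);
    case SketchType::DOUBLE:   return CopyValue<double>(dst, src);
    case SketchType::STRING:   return CopyValue<StringValueSketch>(dst, src);
  }
  return 0;
}
```

With this shape, a layout change to any one type only changes what sizeof()
reports for it, and no copy size has to be updated by hand.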
> HiveUdfCall assumes StringValue is 16 bytes
> -------------------------------------------
>
> Key: IMPALA-8151
> URL: https://issues.apache.org/jira/browse/IMPALA-8151
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0
> Reporter: Tim Armstrong
> Assignee: Pooja Nilangekar
> Priority: Blocker
> Labels: crash
>
> HiveUdfCall has the sizes of internal types hardcoded as magic numbers:
> {code}
> switch (GetChild(i)->type().type) {
>   case TYPE_BOOLEAN:
>   case TYPE_TINYINT:
>     // Using explicit sizes helps the compiler unroll memcpy
>     memcpy(input_ptr, v, 1);
>     break;
>   case TYPE_SMALLINT:
>     memcpy(input_ptr, v, 2);
>     break;
>   case TYPE_INT:
>   case TYPE_FLOAT:
>     memcpy(input_ptr, v, 4);
>     break;
>   case TYPE_BIGINT:
>   case TYPE_DOUBLE:
>     memcpy(input_ptr, v, 8);
>     break;
>   case TYPE_TIMESTAMP:
>   case TYPE_STRING:
>   case TYPE_VARCHAR:
>     memcpy(input_ptr, v, 16);
>     break;
>   default:
>     DCHECK(false) << "NYI";
> }
> {code}
> STRING and VARCHAR values were only 16 bytes because of padding. IMPALA-7367
> removes this padding, so the 16-byte memcpy will read past the end of the
> actual value. This could in theory lead to a crash.
> We need to change the value, but we should probably also switch to
> sizeof(StringValue) so that it doesn't get broken by similar changes in the
> future.
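
The padding effect described above can be illustrated with a small sketch. The
two structs are hypothetical layouts (a pointer plus a 32-bit length), not
Impala's actual StringValue, and the packed attribute assumes GCC or Clang. On
a typical 64-bit ABI the first struct is 16 bytes and the second is 12, so a
hardcoded 16-byte copy would read 4 bytes past the packed value:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical unpacked layout: the compiler may add tail padding so that
// the struct size is a multiple of the pointer alignment.
struct PaddedStringSketch {
  char* ptr;
  int32_t len;
};

// Hypothetical packed layout (GCC/Clang attribute): no padding is added,
// so the size is exactly the sum of the member sizes.
struct __attribute__((packed)) PackedStringSketch {
  char* ptr;
  int32_t len;
};

// The copy size that was hardcoded in the snippet quoted above.
constexpr std::size_t kHardcodedCopySize = 16;

// Number of bytes a hardcoded 16-byte copy would read past the end of the
// packed struct (0 if the struct is at least 16 bytes).
constexpr std::size_t kOverrunBytes =
    kHardcodedCopySize > sizeof(PackedStringSketch)
        ? kHardcodedCopySize - sizeof(PackedStringSketch)
        : 0;

// Deriving the size from the type keeps the copy in sync with the layout.
constexpr std::size_t kSafeCopySize = sizeof(PackedStringSketch);
```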