[ 
https://issues.apache.org/jira/browse/IMPALA-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761492#comment-16761492
 ] 

ASF subversion and git services commented on IMPALA-8151:
---------------------------------------------------------

Commit ae96a9fb19e0a2e0a5529f2f36d3b5ee0d336f69 in impala's branch 
refs/heads/master from poojanilangekar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ae96a9f ]

IMPALA-8151: Use sizeof() in HiveUdfCall to specify non-primitive type's size

Previously, data type sizes were hardcoded in
HiveUdfCall::Evaluate(). Since IMPALA-7367 removed the padding
from STRING and VARCHAR types, the hardcoded 16-byte copy could read
past the end of the actual value and cause a crash. This change
replaces the hardcoded values with sizeof() calls to determine the
size of non-primitive types (STRING, VARCHAR and TIMESTAMP) to avoid
similar issues in the future.
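
As a rough illustration of the approach described above (not the actual patch), the copy size can be derived from the type itself instead of a literal. The struct and enum names below are hypothetical stand-ins; the real StringValue and TimestampValue definitions live in the Impala backend and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for Impala's internal value types (assumption:
// a pointer plus a 32-bit length, and a time-of-day plus a date).
struct StringValueSketch {
  char* ptr;
  int32_t len;
};

struct TimestampValueSketch {
  int64_t time_of_day;
  int32_t date;
};

// Hypothetical type tags, standing in for TYPE_INT, TYPE_STRING, etc.
enum SketchType { SKETCH_INT, SKETCH_STRING, SKETCH_TIMESTAMP };

// Copies one input value into the buffer and returns the number of bytes
// copied. Sizes come from sizeof(), so a layout change in a struct is
// picked up automatically instead of silently over-reading.
inline std::size_t CopyInput(SketchType type, void* input_ptr, const void* v) {
  assert(input_ptr != nullptr && v != nullptr);
  std::size_t n = 0;
  switch (type) {
    case SKETCH_INT:
      n = sizeof(int32_t);
      break;
    case SKETCH_STRING:
      n = sizeof(StringValueSketch);
      break;
    case SKETCH_TIMESTAMP:
      n = sizeof(TimestampValueSketch);
      break;
  }
  std::memcpy(input_ptr, v, n);
  return n;
}
```

Because sizeof() is a compile-time constant, the compiler can still unroll each memcpy as it could with the literal sizes.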

Testing:
Ran test_udfs.py on an ASAN build.
Added logs to manually verify the number of bytes copied.

Change-Id: I919c330546fa86b474ab66245b20ceb1f5525b41
Reviewed-on: http://gerrit.cloudera.org:8080/12355
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> HiveUdfCall assumes StringValue is 16 bytes
> -------------------------------------------
>
>                 Key: IMPALA-8151
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8151
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Tim Armstrong
>            Assignee: Pooja Nilangekar
>            Priority: Blocker
>              Labels: crash
>
> HiveUdfCall has the sizes of internal types hardcoded as magic numbers:
> {code}
>       switch (GetChild(i)->type().type) {
>         case TYPE_BOOLEAN:
>         case TYPE_TINYINT:
>           // Using explicit sizes helps the compiler unroll memcpy
>           memcpy(input_ptr, v, 1);
>           break;
>         case TYPE_SMALLINT:
>           memcpy(input_ptr, v, 2);
>           break;
>         case TYPE_INT:
>         case TYPE_FLOAT:
>           memcpy(input_ptr, v, 4);
>           break;
>         case TYPE_BIGINT:
>         case TYPE_DOUBLE:
>           memcpy(input_ptr, v, 8);
>           break;
>         case TYPE_TIMESTAMP:
>         case TYPE_STRING:
>         case TYPE_VARCHAR:
>           memcpy(input_ptr, v, 16);
>           break;
>         default:
>           DCHECK(false) << "NYI";
>       }
> {code}
> STRING and VARCHAR were only 16 bytes because of padding. IMPALA-7367 
> removes this padding, so the 16-byte memcpy will read past the end of the 
> actual value. This could in theory lead to a crash.
> We need to change the value, but we should probably also switch to 
> sizeof(StringValue) so that similar changes do not break it in the 
> future.
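
To make the padding point in the description concrete, here is a small sketch with two hypothetical layouts (neither is Impala's actual StringValue). It assumes a compiler that supports `#pragma pack`, as GCC, Clang and MSVC do. On a typical 64-bit platform with 8-byte pointers, the padded struct is 16 bytes and the packed one is 12, so a hardcoded 16-byte copy would read 4 bytes past a packed value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout with default alignment: the compiler may add
// trailing padding so the struct size is a multiple of pointer alignment.
struct PaddedString {
  char* ptr;
  int32_t len;
};

// Same members with padding disabled (assumption: #pragma pack support).
#pragma pack(push, 1)
struct PackedString {
  char* ptr;
  int32_t len;
};
#pragma pack(pop)

// Bytes that a copy of `hardcoded` bytes would read beyond an object of
// `actual` bytes; zero when the copy fits.
constexpr std::size_t OverreadBytes(std::size_t hardcoded, std::size_t actual) {
  return hardcoded > actual ? hardcoded - actual : 0;
}

// Runtime guard that a copy of `n` bytes stays within `actual` bytes.
inline void CheckCopyFits(std::size_t n, std::size_t actual) {
  assert(n <= actual);
}

// The packed layout has no padding; the padded one is never smaller.
static_assert(sizeof(PackedString) == sizeof(char*) + sizeof(int32_t),
              "packed layout should have no padding");
static_assert(sizeof(PaddedString) >= sizeof(PackedString),
              "padding can only grow the struct");
```

A check such as `static_assert(OverreadBytes(16, sizeof(PackedString)) == 0, ...)` would fail to compile on such a platform, which is one way a hardcoded size could be caught at build time rather than under ASAN.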



