[
https://issues.apache.org/jira/browse/IMPALA-7833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768833#comment-16768833
]
Bikramjeet Vig commented on IMPALA-7833:
----------------------------------------
Oh right, I forgot about that query option. So I dug a little more, and this is
what I found:
I was able to trigger an int32 overflow in *concat's* implementation, but it
does not crash Impala. It adds all the input string lengths into an int32, which
just wraps around, but the result gets caught by the (str_len >
StringVal::MAX_LENGTH) check in the StringVal constructor because it is
compared with an unsigned int. I think we could get past that check by crossing
the uint limit of 4 GB, so I tried doing that but hit the process mem limit
instead. Regardless, overflowing in the concat implementation is definitely
wrong and susceptible to unpredictable behavior, so we should fix that. The
query I used to overflow the size is:
select concat(lpad('foo', 1073741820, ' '), lpad('foo', 1073741820, ' '), lpad('foo', 1073741820, ' '));
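To make the wrap-around concrete, here is a minimal C++ sketch of the arithmetic only; it is not the actual Impala code. The names kMaxLength, SumLengthsWrapped, RejectedByLengthCheck and TotalLengthFits are hypothetical, and kMaxLength = 1 << 30 is an assumption based on the 1 GB figure for StringVal::MAX_LENGTH. The wrap is modeled with unsigned arithmetic because signed overflow is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Assumed stand-in for StringVal::MAX_LENGTH (1 GB); not taken from Impala.
constexpr uint32_t kMaxLength = 1u << 30;

// Models summing input lengths into a 32-bit counter. The sum is done in
// uint32_t so the wrap-around is well defined, then reinterpreted as int32
// (modular since C++20, and in practice on two's-complement targets before).
int32_t SumLengthsWrapped(std::initializer_list<int32_t> lens) {
  uint32_t total = 0;
  for (int32_t len : lens) {
    assert(len >= 0);  // individual lengths are never negative
    total += static_cast<uint32_t>(len);
  }
  return static_cast<int32_t>(total);
}

// Models a (str_len > MAX_LENGTH) check where the comparison is unsigned:
// a negative str_len turns into a huge unsigned value and is rejected.
bool RejectedByLengthCheck(int32_t str_len) {
  return static_cast<uint32_t>(str_len) > kMaxLength;
}

// One possible fix (a sketch): accumulate in int64 so the total cannot wrap
// for any realistic number of inputs, then compare against the limit.
bool TotalLengthFits(std::initializer_list<int32_t> lens) {
  int64_t total = 0;
  for (int32_t len : lens) total += len;
  return total <= static_cast<int64_t>(kMaxLength);
}
```

With the three lengths from the query, the wrapped sum is negative and the unsigned comparison rejects it. With four lengths of 1073741824 the sum is exactly 2^32, which wraps to 0 and would slip past the same comparison; the int64 version rejects both cases.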
As to why *lpad/rpad/space* crash: that happens when the size variable of type
int32 is initialized with an int64 value and gets set to zero instead (probably
due to conversion checks added by the compiler), so we allocate no memory for
the StringVal. Then at the end, when it tries to use memset/memcpy with the
non-zero length, it crashes.
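A note on the zero: it may not need any compiler-inserted check. 17179869184 is 2^34, so its low 32 bits are all zero and a plain narrowing conversion to int32 already yields 0. Below is a minimal C++ sketch of that arithmetic and of a range check done before narrowing. NarrowToInt32, LengthIsUsable and CheckedLength are hypothetical names, not Impala functions, and kMaxLength = 1 << 30 is an assumed stand-in for StringVal::MAX_LENGTH.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-in for StringVal::MAX_LENGTH (1 GB); not taken from Impala.
constexpr int64_t kMaxLength = int64_t{1} << 30;

// Models storing an int64 length into an int32 variable: only the low
// 32 bits survive. Going through uint32_t keeps the truncation well defined.
int32_t NarrowToInt32(int64_t value) {
  return static_cast<int32_t>(static_cast<uint32_t>(value));
}

// Range check performed on the full 64-bit value, before any narrowing.
bool LengthIsUsable(int64_t requested_len) {
  return requested_len >= 0 && requested_len <= kMaxLength;
}

// Narrows only after the range check has passed, so no bits are lost.
int32_t CheckedLength(int64_t requested_len) {
  assert(LengthIsUsable(requested_len));
  return static_cast<int32_t>(requested_len);
}
```

The point of the sketch is the ordering: if the length is validated while it is still an int64, the allocation size and the length later passed to memset/memcpy cannot disagree.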
With *regex_escape* I don't think we can trigger an int32 overflow.
StringVal::MAX_LENGTH is 1 GB and regex_escape directly creates a StringVal
with double the size of its input, so as long as input_str.size is <=
StringVal::MAX_LENGTH we are fine, since the StringVal constructor won't let us
create a string larger than 1 GB. So it seems like regex_escape will work fine.
> Audit and fix other string builtins for long string handling
> ------------------------------------------------------------
>
> Key: IMPALA-7833
> URL: https://issues.apache.org/jira/browse/IMPALA-7833
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.11.0, Impala 3.0, Impala 3.1.0
> Reporter: Tim Armstrong
> Priority: Critical
> Labels: crash, ramp-up
>
> Following on from IMPALA-7822, there are some other string builtins that seem
> to follow the same pattern of having a string size overflow an int passed
> into the StringVal constructor. I think in some cases we get lucky and it
> works out, but others it seems possible to crash given the right input
> values.
> Here are some examples of cases where we can hit such bugs:
> {noformat}
> select lpad('foo', 17179869184 , ' ');
> select rpad('foo', 17179869184 , ' ');
> select space(17179869184 );
> {noformat}