[
https://issues.apache.org/jira/browse/SPARK-44840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756574#comment-17756574
]
Serge Rielau edited comment on SPARK-44840 at 8/20/23 7:41 PM:
---------------------------------------------------------------
[~srowen] There is no standard as such.
However, there are multiple reasons not to be compatible with Snowflake:
1. Precedent: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not
'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but
we have tried to move towards 1-based as well. E.g., element_at() is 1-based,
and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use -1 for the second-to-last element because
1 is their second element. Because they are 0-based, they are unable to use
array_insert() to append an element (short of passing the (length - 1) as a
parameter). So the proposal is objectively more powerful.
was (Author: JIRAUSER288374):
[~srowen] There is no standard as such.
However, there are multiple reasons not to be compatible with Snowflake:
1. Precedent: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not
'l').
2. array access has been a mixed bag for us (some 0, some 1-based), but we have
tried to move towards 1-based as well. e.g., element_at() is 1-based, and we
use -1 (!) to get the last element.
3. Snowflake had no choice but to use 1 for the second last element because 1
is their second element. Because they are 0-based they are unable to use
array_insert() to append an element (short of passing the (length - 1) as a
parameter). So the proposal is objectively more powerful.
> array_insert() gives wrong results for negative index
> -----------------------------------------------------
>
> Key: SPARK-44840
> URL: https://issues.apache.org/jira/browse/SPARK-44840
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Assignee: Max Gekk
> Priority: Major
>
> Unlike in Snowflake, we decided that array_insert() is 1-based.
> This means 1 is the first element in an array and -1 is the last.
> This matches the behavior of functions such as substr() and element_at().
>
> {code:java}
> > SELECT array_insert(array('a', 'b', 'c'), 1, 'z');
> ["z","a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 0, 'z');
> Error
> > SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
> ["a","b","c","z"]
> > SELECT array_insert(array('a', 'b', 'c'), 5, 'z');
> ["a","b","c",NULL,"z"]
> > SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
> ["z",NULL,"a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 2, cast(NULL AS STRING));
> ["a",NULL,"b","c"]
> {code}
>
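For illustration only, the 1-based behavior shown in the examples quoted above can be modeled in a few lines of Python. This is a sketch of the semantics as the issue description states them, not Spark's implementation; the function name array_insert_model is hypothetical, and None stands in for SQL NULL.

```python
def array_insert_model(arr, pos, item):
    """Model of the 1-based array_insert() behavior in the issue examples.

    Hypothetical helper, not Spark code. Positive positions count from the
    front (1 = first element); negative positions count from the back
    (-1 = insert after the last element, i.e. append).
    """
    if pos == 0:
        # The issue description shows position 0 as an error.
        raise ValueError("position 0 is invalid: indexing is 1-based")

    n = len(arr)
    if pos > 0:
        idx = pos - 1  # convert to a 0-based insertion point
        if idx <= n:
            return arr[:idx] + [item] + arr[idx:]
        # Past the end: pad with NULLs so the item lands at position pos.
        return arr + [None] * (idx - n) + [item]

    # Negative position: -1 maps to insertion point n (append).
    idx = n + pos + 1
    if idx >= 0:
        return arr[:idx] + [item] + arr[idx:]
    # Before the start: item goes first, padded with NULLs up to the array.
    return [item] + [None] * (-idx) + arr


if __name__ == "__main__":
    base = ["a", "b", "c"]
    for pos in (1, -1, 5, -5):
        print(pos, array_insert_model(base, pos, "z"))
```

With this convention, appending needs no knowledge of the array length: position -1 always places the new item last, which is the point made in item 3 of the comment.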