[ 
https://issues.apache.org/jira/browse/CALCITE-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753883#comment-17753883
 ] 

Ran Tao edited comment on CALCITE-5830 at 8/14/23 4:03 AM:
-----------------------------------------------------------

[~jiajunbernoulli] Hi, Jiajun. I get your point. However, the result and 
behavior you describe are wrong. IMHO, first of all, Calcite should follow 
Apache Spark, not Databricks Spark. Of course, I have looked into your case and 
have some details: 

apache-spark-3.4.0:
{code:java}
spark-sql (default)> SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
["a","b","z","c"]
Time taken: 4.478 seconds, Fetched 1 row(s) {code}
apache-spark-3.4.1:
{code:java}
spark-sql (default)> SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
["z",null,null,"a","b","c"]
Time taken: 3.587 seconds, Fetched 1 row(s)
 {code}
 

1. The Databricks doc for this function describes an old version (that 
behavior was wrong and was fixed immediately, but the doc was not updated). I 
opened a discussion on the Spark dev list: 
[https://lists.apache.org/thread/1p5hkql96k5qc5ww6wkd7mq6qdbgyz1n]. It was a 
doc error, and they will update it.

2. Here is the behavior of the latest stable Databricks Spark. It is the same 
as open-source Apache Spark and the same as the implementation in this PR, and 
it is the correct result. You can test it in Databricks as well.

!image-2023-08-14-11-20-46-189.png|width=583,height=506!

Finally, Spark behaves this way because it distinguishes prepends (negative 
index) from appends (positive index). I have replied to you in the PR.
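
To make the prepend/append distinction concrete, here is a minimal Java sketch 
that reproduces the Spark 3.4.x results quoted above and the two examples from 
the issue description. It is only an illustration, not the code in the PR: the 
class and method names are hypothetical, and rejecting pos = 0 with an 
exception is my assumption about how an invalid index should be handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not Calcite's or Spark's actual implementation.
class ArrayInsertSketch {

  static <T> List<T> arrayInsert(List<T> arr, int pos, T val) {
    if (pos == 0) {
      // Assumption: index 0 is invalid because indices are 1-based.
      throw new IllegalArgumentException("pos must not be 0");
    }
    List<T> out = new ArrayList<>(arr);
    if (pos > 0) {
      // Append side: pad with nulls when pos is beyond the end, then insert.
      while (out.size() < pos - 1) {
        out.add(null);
      }
      out.add(pos - 1, val);
    } else {
      int idx = arr.size() + pos; // 0-based slot counted from the end
      if (idx >= 0) {
        out.add(idx, val);
      } else {
        // Prepend side: val goes first, followed by -idx nulls, then the array.
        out.addAll(0, Collections.<T>nCopies(-idx, null));
        out.add(0, val);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // prints [a, b, z, c]
    System.out.println(arrayInsert(Arrays.asList("a", "b", "c"), -1, "z"));
    // prints [z, null, null, a, b, c]
    System.out.println(arrayInsert(Arrays.asList("a", "b", "c"), -5, "z"));
  }
}
```

With a negative index inside the array, the value is inserted relative to the 
end; once the index goes past the front, the array is extended on the left 
with nulls, which mirrors what a positive index past the end does on the right.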



> Add ARRAY_INSERT function(enabled in Spark library)
> ---------------------------------------------------
>
>                 Key: CALCITE-5830
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5830
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.34.0
>            Reporter: Ran Tao
>            Assignee: Ran Tao
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2023-08-14-11-17-33-205.png, 
> image-2023-08-14-11-20-46-189.png
>
>
> array_insert(x, pos, val) - Places val into index pos of array x. Array 
> indices start at 1, or start from the end if index is negative. Index above 
> array size appends the array, or prepends the array if index is negative, 
> with 'null' elements
> *Examples:*
> > SELECT array_insert(array(1, 2, 3, 4), 5, 5); [1,2,3,4,5]
> > SELECT array_insert(array(5, 3, 2, 1), -3, 4); [5,4,3,2,1] 
> https://spark.apache.org/docs/latest/api/sql/index.html#array_insert


