Hi, devs.

I found that the ARRAY_INSERT [1] function (added in Spark 3.4.0) has
different semantics from its Databricks counterpart [2].

e.g.

// spark
SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
 ["a","b","z","c"]

// databricks
SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
 ["a","b","c","z"]

// spark
SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
 ["z",null,null,"a","b","c"]

// databricks
SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
 ["z",NULL,"a","b","c"]

It looks like the way Databricks handles inserting at a negative index
is more reasonable.
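
To make the difference concrete, here is a rough Scala sketch of the two
interpretations as I infer them from the outputs above. This is my own
paraphrase, not the actual implementation in either system; insertAt,
sparkInsert, and databricksInsert are hypothetical helpers, and only the
negative-index case is modeled:

// Hypothetical sketch inferred from the observed outputs, not the real code.
// A negative 0-based index means "pad the front with nulls".
def insertAt[T](arr: Seq[T], idx0: Int, elem: T): Seq[Any] =
  if (idx0 < 0) elem +: (Seq.fill[Any](-idx0)(null) ++ arr)
  else (arr.take(idx0) :+ elem) ++ arr.drop(idx0)

// Spark 3.4.0: 0-based index = pos + len, so pos = -1 points AT the
// last element and inserts before it.
def sparkInsert[T](arr: Seq[T], pos: Int, elem: T): Seq[Any] = {
  require(pos < 0, "sketch models the negative-index case only")
  insertAt(arr, pos + arr.length, elem)
}

// Databricks: 0-based index = pos + len + 1, so pos = -1 inserts AFTER
// the last element (mirroring pos = 1, which inserts before the first).
def databricksInsert[T](arr: Seq[T], pos: Int, elem: T): Seq[Any] = {
  require(pos < 0, "sketch models the negative-index case only")
  insertAt(arr, pos + arr.length + 1, elem)
}

// sparkInsert(Seq("a","b","c"), -1, "z")      => List(a, b, z, c)
// databricksInsert(Seq("a","b","c"), -1, "z") => List(a, b, c, z)
// sparkInsert(Seq("a","b","c"), -5, "z")      => List(z, null, null, a, b, c)
// databricksInsert(Seq("a","b","c"), -5, "z") => List(z, null, a, b, c)

Under this reading, the two systems differ only by the "+1" in the
negative-index mapping.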

Of course, I have read the Spark source code and I understand its logic,
but my question is whether Spark was designed this way on purpose.


[1] https://spark.apache.org/docs/latest/api/sql/index.html#array_insert
[2] https://docs.databricks.com/en/sql/language-manual/functions/array_insert.html


Best Regards,
Ran Tao
https://github.com/chucheng92
