Daniel-Davies commented on PR #38867: URL: https://github.com/apache/spark/pull/38867#issuecomment-1337219140
@LuciferYang for quick feedback, I'd be grateful for an overarching review of the method, and some assistance with the following questions:

- Core behaviour: one interesting property of Snowflake's array_insert function is that it lets you extend the array beyond (numElements + 1) if you specify a far-away index. For example, array_insert([1,2,3], 10, 4) will print [1,2,3,null,null,null,null,null,null,4]. It would worry me a bit if an array could grow to astronomical sizes through some kind of mistake (e.g. are we happy to take the risk of the 'pos' column containing a value of 2,000,000,000?), so I've returned null if the provided 'pos' index is out of bounds for the array. Let me know if the Snowflake behaviour should be reproduced exactly instead.
- I've used the Scala library's 'patch' function to implement the behaviour, which prioritises minimal code over performance. Please let me know if this should be changed.
- I'm not yet clear on how to implicitly cast the provided 'item' parameter (e.g. when I provide an array of LongType and try to insert an IntegerType). I think the array type should probably not change, though (e.g. if I provide a LongType array but a StringType item for insertion, it doesn't feel right to cast the whole array to StringType).
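To make the first two questions concrete, here is a minimal sketch in plain Scala of the two candidate behaviours. It is not the code in the PR: the object name `ArrayInsertSketch`, both helper names, the `maxLength` guard, and the use of `Option` in place of SQL null are all hypothetical, and positions are assumed to be 1-based so that the result matches the example above (where 4 ends up as the tenth element).

```scala
// Hypothetical sketch, not the PR's implementation.
// Assumption: 'pos' is 1-based; None stands in for a SQL null result.
object ArrayInsertSketch {

  // Bounded behaviour: return None when 'pos' falls outside 1..(numElements + 1).
  def insertOrNull[A](arr: Seq[A], pos: Int, item: A): Option[Seq[A]] = {
    val idx = pos - 1
    if (idx < 0 || idx > arr.length) None
    else Some(arr.patch(idx, Seq(item), 0)) // replacing 0 elements means a pure insert
  }

  // Padding behaviour: fill the gap with nulls (None elements here),
  // but refuse positions past a caller-chosen limit (hypothetical guard).
  def insertWithPadding[A](
      arr: Seq[Option[A]],
      pos: Int,
      item: A,
      maxLength: Int = 1000): Option[Seq[Option[A]]] = {
    val idx = pos - 1
    if (idx < 0 || pos > maxLength) None
    else {
      // padTo leaves the array unchanged when idx is within its length.
      val padded = arr.padTo(idx, Option.empty[A])
      Some(padded.patch(idx, Seq(Some(item)), 0))
    }
  }
}
```

With the bounded variant, `insertOrNull(Seq(1, 2, 3), 10, 4)` yields `None`, whereas the padding variant inserts six empty slots before the new item. A size cap like `maxLength` is one possible middle ground between the two, if reproducing the padding behaviour turns out to be preferred.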
