[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-554776903 No problem @kiszk ! I just made some changes based on your comments! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-554749156 ping @kiszk This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-531099338 I understood the same @ueshin since he said to match presto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-530767308 Ok I understand. Also I think that might be a different PR? and then have different signatures for array_sort. For example: array_sort(column) -> asc by default array_sort(column, order) array_sort(column, comparatorFunction) Or you just want to have array_sort(column, comparatorFunction)? I think having array_sort(column, order) is easier to use, and most of the times you just want to order asc or desc. At the same time having the comparatorFunction can open more possibilities. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-530207202 How does it translate for this PR @ueshin ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-530031571 I think it would be good to have array_sort to be able to order in both ways, all the array function from 2.4.0 are "array_something" and it's kind of confusing to have sort_array. I think it would be great to have that unified. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-529994076 > I see, is it that the ordering of nulls is different? > I think we'd at least want the doc of each to comment on the difference with the other very similar method, to make this clear. I suppose one day we may add control over null ordering and deprecate one in favor of the other. Okay @srowen , I'll explain the ordering of nulls for all as they are now: `sort_array` in `ascending order` nulls are at the `beginning` `sort_array` in `descending order` nulls are at the `end` `array_sort` in `ascending order` nulls are at the `end` And after the PR: `sort_array` will remain the same `array_sort` in `ascending order` nulls are at the `end` `array_sort` in `descending order` nulls are at the `end` So this way the actual behaviour of `array_sort` remains the same. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-529977320 > @ueshin yeah I agree, the bigger question is, why do we have both `array_sort` and `sort_array`? > > @kiszk looks like you added `array_sort` to match Presto in https://issues.apache.org/jira/browse/SPARK-23921 . Presto supports a comparator function rather than an ascending flag. > > If `sort_array` does what we want already, and is the older method here, why not make `array_sort` an alias for it? We could deprecate one, but which one? `sort_array` has been around longer, but `array_sort` is at least the name in one other system. > > Overall I'd just like to avoid duplicating the implementation, at least, even if both are supported as the function name. @srowen we couldn't make an alias between sort_array and array_sort because they don't have the same null policy, it would change the behaviour of array_sort. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour
Gschiavon commented on issue #25728: [SPARK-29020][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-529976886 @srowen we couldn't make an alias between `sort_array` and `array_sort` because they don't have the same null policy, it would change the behaviour of `array_sort`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org