Github user lokm01 commented on the issue:
https://github.com/apache/spark/pull/21121
@ueshin We currently use our own implementation of zipWithIndex when we
explode an array and need to preserve the order of its elements (especially if
a subsequent transformation involves a shuffle).
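
To make the pattern concrete, here is a minimal sketch in plain Python, not Spark code. The helper names (`zip_with_index`, `explode_with_index`, `collect_in_order`) are hypothetical and chosen only for illustration, and the random shuffle is a stand-in for a Spark shuffle. It assumes the goal is to recover the original element order after the exploded rows have been reordered.

```python
import random


def zip_with_index(values):
    """Pair each element with its position, e.g. ['a', 'b'] -> [('a', 0), ('b', 1)]."""
    return [(value, index) for index, value in enumerate(values)]


def explode_with_index(rows):
    """Flatten (key, array) rows into (key, element, index) rows."""
    return [
        (key, value, index)
        for key, array in rows
        for value, index in zip_with_index(array)
    ]


def collect_in_order(exploded):
    """Regroup exploded rows by key and sort each group by the stored index."""
    grouped = {}
    for key, value, index in exploded:
        grouped.setdefault(key, []).append((index, value))
    return {key: [value for _, value in sorted(pairs)] for key, pairs in grouped.items()}


rows = [("r1", ["a", "b", "c"]), ("r2", ["x", "y"])]
exploded = explode_with_index(rows)

# Stand-in for a shuffle: row order after this step is arbitrary.
random.Random(0).shuffle(exploded)

# The stored index lets us rebuild each array in its original order.
restored = collect_in_order(exploded)
```

Because every exploded row carries its own position, the original order can be rebuilt regardless of how the rows were reordered in between.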
Sure, once transform becomes available, it will be much better and more
performant to use that. But since we're dealing with production applications,
we would like to start rewriting these jobs with small "drop-in"
replacements for functions such as zipWithIndex before going for a major
rewrite with higher-order functions (HOFs) in Spark SQL.
I've seen many threads in the community that recommend the same approach
when dealing with these difficult array cases, so I'm pretty sure it will
benefit other users.