[GitHub] spark issue #21121: [SPARK-24042][SQL] Collection function: zip_with_index

lokm01 Fri, 27 Apr 2018 05:02:11 -0700

Github user lokm01 commented on the issue:

    https://github.com/apache/spark/pull/21121
  
    @ueshin Currently we use our own implementation of zipWithIndex whenever we 
explode an array and need to preserve the order of its elements (especially 
when a subsequent transformation involves a shuffle).
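    
    For illustration, the pattern looks roughly like the sketch below. It is 
plain Python rather than Spark code, and the helper names (zip_with_index, 
explode) and the sample rows are made up for the example; the random shuffle 
only stands in for a real shuffle.

```python
import random


def zip_with_index(values):
    """Pair each element with its position, like Scala's zipWithIndex."""
    return [(value, index) for index, value in enumerate(values)]


def explode(rows):
    """Emit one output row per (value, index) pair in each row's array."""
    return [
        (key, value, index)
        for key, indexed in rows
        for value, index in indexed
    ]


# Hypothetical input: one row per key, each holding an ordered array.
rows = [("a", ["x", "y", "z"]), ("b", ["p", "q"])]

# Attach the index before exploding, so every output row remembers its position.
indexed_rows = [(key, zip_with_index(values)) for key, values in rows]
exploded = explode(indexed_rows)

# Stand-in for a shuffle: after this, row order is no longer meaningful.
random.Random(0).shuffle(exploded)

# Regroup by key and sort on the saved index to recover the original order.
restored = {}
for key, value, index in sorted(exploded, key=lambda row: (row[0], row[2])):
    restored.setdefault(key, []).append(value)
```

    Without the saved index, nothing in the exploded rows would record where 
each element sat in its array once the rows have been reordered.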
    
    Sure, once transform becomes available, it will be better and more 
performant to use that. But since we're dealing with production applications, 
we would like to start by rewriting these jobs with small "drop-in" 
replacements for functions such as zipWithIndex before attempting a major 
rewrite with HOFs in Spark SQL.
    
    I've seen many threads in the community that recommend the same approach 
for these difficult array cases, so I'm pretty sure it will benefit other 
users.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
