landlord-matt commented on PR #43871: URL: https://github.com/apache/spark/pull/43871#issuecomment-1824563107
@JoshRosen: Thanks for clarifying! While it is true that the code is well written and available for anyone to read, I still think it would be nice to have a text version for us dummies. Instead of explaining it in the function, I also had this brilliant idea of explaining it in the data type document. What do you think of this proposal? https://github.com/apache/spark/pull/43981

Also, I am still wondering about this section. It loops through all the fields in the schema and adds each one to that magic ordering array. I assume that the ordering array corresponds to "ORDER BY 1,2,3", but what is "1,2,3" in this case? How does it iterate through the schema? https://github.com/apache/spark/blob/fdcd20f4b51c3ddddaae12f7d3f429e7b77c9f5e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala#L318-L329

If you want, you can also check out my other proposal about collect_list, which was ignored :( https://github.com/apache/spark/pull/43787
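To make the question about the ordering array concrete, here is a minimal Python sketch of what I imagine that loop does. This is my guess, not Spark's code: `compare_values` and `make_struct_comparator` are hypothetical names, and sorting NULLs first is an assumption on my part. The idea is that "1,2,3" would simply be the field positions in schema order.

```python
from functools import cmp_to_key


def compare_values(a, b):
    """Three-way compare of two field values; None (NULL) sorts first (assumed)."""
    if a is None and b is None:
        return 0
    if a is None:
        return -1
    if b is None:
        return 1
    return (a > b) - (a < b)


def make_struct_comparator(field_names):
    """Build a comparator with one ascending sort key per field, in schema order."""
    # The "ordering array": ordinal 0 for the first field, 1 for the second, ...
    ordinals = list(range(len(field_names)))

    def compare(left, right):
        # Walk the fields in order; the first field that differs decides.
        for i in ordinals:
            result = compare_values(left[i], right[i])
            if result != 0:
                return result
        return 0  # every field compared equal

    return compare


# Hypothetical three-field struct schema and some struct values as tuples.
schema = ["a", "b", "c"]
rows = [(2, "x", 1.0), (1, "z", 3.0), (1, "y", 9.0), (1, None, 0.0)]

struct_compare = make_struct_comparator(schema)
sorted_rows = sorted(rows, key=cmp_to_key(struct_compare))
print(sorted_rows)
```

If that picture is right, comparing two structs is the same as sorting by the first field, then breaking ties with the second, then the third, and so on.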
