jhaberstroh-sharethis opened a new pull request, #41686: URL: https://github.com/apache/spark/pull/41686
The `on` field complained when I passed it a tuple. The code checked for `list` exactly, so it wrapped the tuple into a list as `[on]`, leading to immediate failure. This was surprising: typically, tuple and list should be interchangeable, and typically tuple is the more readily accepted type. I have proposed a change that moves towards the principle of least surprise for this situation.

The reason it checked for `list` exactly is that `Column` counts as an `Iterable` object, because it implements `__iter__`. It only does this because it has `__getitem__` implemented, which allows it to be iterated over with `iter()`. That caused bad behavior, so `__iter__` was implemented to raise an exception any time a `Column` is iterated over. That change was implemented in SPARK-10417: https://github.com/apache/spark/pull/8574

The Python docs also note that `isinstance(x, Iterable)` does not detect every iterable, and that calling `iter()` is the only reliable check. For reference:

https://stackoverflow.com/questions/1952464/in-python-how-do-i-determine-if-an-object-is-iterable
https://docs.python.org/3/library/collections.abc.html#collections.abc.Iterable

There will be no user-facing changes for existing working code. It will only fix code that did not work previously.

### How was this patch tested?

Tests for:

* `isinstance_interable` behaves as expected for all combinations of (str, col) and (bare, list, tuple).
* `to_list_column_style` creates a list when passed any of these types, and the list contains a non-iterable (as defined).
* All of these different joins produce the same result.
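For illustration, here is a minimal, self-contained sketch of the iterability behavior described above. It does not use PySpark and is not the code from this patch: `GetItemOnly`, `ColumnLike`, `can_iterate`, and `to_list` are hypothetical names made up for the example, and `ColumnLike` only mimics the relevant `Column` behavior (it defines `__getitem__`, and its `__iter__` raises).

```python
from collections.abc import Iterable


class GetItemOnly:
    """Hypothetical class that is iterable only via the __getitem__ protocol."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]


class ColumnLike:
    """Hypothetical stand-in for a Column: has __getitem__, but __iter__ raises."""

    def __getitem__(self, key):
        return self

    def __iter__(self):
        raise TypeError("Column is not iterable")


def can_iterate(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


def to_list(on):
    """Hypothetical normalizer: wrap bare values, convert lists and tuples."""
    # Strings are iterable, but a column name should be treated as one value.
    if isinstance(on, str) or not can_iterate(on):
        return [on]
    return list(on)


col = ColumnLike()

# isinstance() only looks for an __iter__ attribute, so it gets both cases wrong.
print(isinstance(col, Iterable))               # True, yet iterating raises
print(isinstance(GetItemOnly([1]), Iterable))  # False, yet iterating works

# Calling iter() reflects what actually happens at iteration time.
print(can_iterate(col))                  # False
print(can_iterate(GetItemOnly([1, 2])))  # True

# Tuples and lists are now handled the same way.
print(to_list(("a", "b")))  # ['a', 'b']
print(to_list("a"))         # ['a']
```

Under these assumptions, a bare string, a bare column-like object, a list, and a tuple all normalize to a list, which is the interchangeability this change aims for.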
