asmello commented on issue #23882: [SPARK-26979][PYTHON] Add missing string column name support for some SQL functions
URL: https://github.com/apache/spark/pull/23882#issuecomment-473842884

It's definitely better after the merge than before, so I see no reason to revert. That said, because the fix for the math functions is still missing, I agree a follow-up is required.

Cases like `when()` and `array_contains()` warrant a deeper discussion. Again, it's not that support for column names is missing there; those functions accept string literals, which take precedence in the API. In a way, supporting columns at all is an afterthought there, as their original formulation only supported literals. So we either remove column support from them entirely (at the cost of losing functionality) or drop the literal support (which redefines the API in a big way). But neither option has anything to do with supporting column names, so this should be reserved for another PR.

> The main reason I was initially worried was that we should see if it makes sense to support string as columns in PySpark's API

That's a valid concern, but, again, it's better to be consistent for now. It's not a "just-work-for-now" situation, either; it's a full fix for the consistency problem. String support is a related but separate discussion.
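
To make the `when()` / `array_contains()` point concrete, here is a rough sketch of the ambiguity in plain Python. It is a toy model, not PySpark code: `Column`, `Literal`, `as_column`, `as_value` and this `array_contains` are hypothetical stand-ins I made up for illustration, and they only assume what is said above, namely that a string in a value position is read as a literal.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Union


@dataclass(frozen=True)
class Column:
    """Toy stand-in for a column reference (not pyspark.sql.Column)."""
    name: str


@dataclass(frozen=True)
class Literal:
    """Toy stand-in for a constant value."""
    value: Any


def as_column(arg: Union[Column, str]) -> Column:
    # Name-style argument: a bare string is read as a column name.
    return arg if isinstance(arg, Column) else Column(arg)


def as_value(arg: Any) -> Union[Column, Literal]:
    # Value-style argument: a bare string is read as a literal, so only
    # an explicit Column object can refer to a column in this position.
    return arg if isinstance(arg, Column) else Literal(arg)


def array_contains(col: Union[Column, str], value: Any) -> Tuple[str, Column, Union[Column, Literal]]:
    # Hypothetical function with the same argument shape as discussed above:
    # first argument is a column (name or object), second is a value.
    return ("array_contains", as_column(col), as_value(value))


# The same string type means two different things depending on position.
expr = array_contains("tags", "spark")
```

In this sketch `"tags"` resolves to a column while `"spark"` stays a literal, so reading strings as column names in the second position would silently change what existing calls mean. That is why I think it needs its own discussion rather than being folded into this PR.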
