asmello edited a comment on issue #23879: [SPARK-26979][SQL] Add missing column name support for SQL functions
URL: https://github.com/apache/spark/pull/23879#issuecomment-466769335

> These generics are only applied to `functions.py`. Can you then whitelist one by one?

I had whitelisted them one by one at the `_create_function()` uses between lines 257-273, but I forgot to do the same for line 1451. My bad, I truly apologise. It turns out all functions defined there are exceptions too. Still, the point stands that uses of `_create_function()` are the only cases where the JVM function is called directly without argument conversion, and with those functions also taken care of, the SQL API should be fully consistent in this regard.

> Also, there are few exceptions like `from_json` that takes `schema` as `Column` but also `DataType`

This is a different matter altogether. `schema` is not the column being operated on in this case; it's a schema specification, so there's no reason it should take a column name here. It could, I suppose, but that sounds like an anti-pattern to me.

> Plus, we should make it consistent in `dataframe.py` as well.

What kind of impact does this have there?

> `Column` and string are completely different types. It's not iterable or something related with duck-typing.

The same can be said about `max(1, 2, 3)` and `max([1, 2, 3])`, you know. As long as they are related semantically, Python encourages different calling patterns.

BTW, I've found more than one case of `_create_function()` being used to define the same function more than once. This is dangerous. I feel it should be removed altogether.

**I'll close this PR now, as I'm convinced the change should be made on PySpark's side.**
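
To illustrate the calling-pattern point, here is a minimal sketch in plain Python with no PySpark dependency. The `Column` class, the `_to_column` helper, and the `upper` function below are hypothetical stand-ins invented for this example; they are not the actual code in `functions.py`. The sketch only shows how a function can convert its argument so that a column name and a column object are both accepted.

```python
class Column:
    """Hypothetical stand-in for a column expression (not pyspark.sql.Column)."""

    def __init__(self, expr):
        self.expr = expr


def _to_column(col):
    """Accept either a Column or a column name (str) and return a Column."""
    if isinstance(col, Column):
        return col
    if isinstance(col, str):
        return Column(col)
    raise TypeError("expected Column or str, got {}".format(type(col).__name__))


def upper(col):
    """Hypothetical SQL-style function that converts its argument before use."""
    return Column("upper({})".format(_to_column(col).expr))


# Both calling patterns build the same expression, in the same way that
# max(1, 2, 3) and max([1, 2, 3]) return the same value.
by_name = upper("name")
by_column = upper(Column("name"))
```

Under these assumptions, a function that skips the conversion step would accept only one of the two argument forms, which is the inconsistency discussed above.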
