zhengruifeng commented on PR #49879:
URL: https://github.com/apache/spark/pull/49879#issuecomment-2650680171

   https://github.com/apache/spark/pull/49879#issuecomment-2650528939
   
   @yaooqinn The problem is that Spark doesn't handle string arguments 
consistently: the same argument in very similar functions can be treated 
in different ways.
   
   For example, 
https://github.com/apache/spark/blob/59dd406ffab6f7df7f36fe7befe121822e68bf00/python/pyspark/sql/functions/builtin.py#L18495-L18499
   
   And this inconsistency actually caused unexpected results:
   
   A user changed his code from `element_at(c, "a")` to `try_element_at(c, 
"a")`, and the query still ran successfully but generated unexpected results, 
because the input DataFrame has a column 'a' and the string was resolved as 
that column rather than as a literal. That is why I fixed such type hints 
and added some notes like this.
   
   There are 500+ function APIs and column APIs; we cannot expect users to 
always check the API reference.
   
   With a `Column` argument, users can express exactly what they want: 
`col("a")` or `lit("a")`. The query may fail, and the SQL engine will report 
what happened, but it won't silently generate _wrong_ results.
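
   The failure mode described above can be sketched in plain Python. This is 
a toy model, not PySpark: `Col`, `Lit`, `resolve`, `element_at_like` and 
`try_element_at_like` are hypothetical names invented for illustration, and 
the row is a plain dict. It only assumes that one function reads a bare 
string as a literal while its sibling reads it as a column name.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Col:
    """Hypothetical stand-in for an explicit column reference."""
    name: str


@dataclass(frozen=True)
class Lit:
    """Hypothetical stand-in for an explicit literal value."""
    value: Any


def resolve(row: dict, arg: Any) -> Any:
    """Evaluate an explicit Col or Lit against a single row."""
    if isinstance(arg, Col):
        if arg.name not in row:
            # An unknown column fails loudly instead of returning a value.
            raise KeyError(f"unresolved column: {arg.name}")
        return row[arg.name]
    if isinstance(arg, Lit):
        return arg.value
    raise TypeError(f"expected Col or Lit, got {type(arg).__name__}")


def element_at_like(row: dict, map_col: str, key: Any) -> Any:
    """Toy lookup that reads a bare string key as a LITERAL."""
    if isinstance(key, str):
        key = Lit(key)
    return resolve(row, Col(map_col)).get(resolve(row, key))


def try_element_at_like(row: dict, map_col: str, key: Any) -> Any:
    """Toy lookup that reads a bare string key as a COLUMN NAME."""
    if isinstance(key, str):
        key = Col(key)
    return resolve(row, Col(map_col)).get(resolve(row, key))


# The row has a map column "c" and, by coincidence, a column named "a".
row = {"c": {"a": 1, "b": 2}, "a": "b"}

literal_result = element_at_like(row, "c", "a")           # looks up key "a"
column_result = try_element_at_like(row, "c", "a")        # looks up key row["a"]
explicit_result = try_element_at_like(row, "c", Lit("a"))  # intent is explicit

print(literal_result, column_result, explicit_result)
```

   In this sketch the same call shape gives different answers for the two 
functions, and nothing fails, because a column named "a" happens to exist. 
Passing `Lit("a")` or `Col("a")` removes the ambiguity, and a `Col` that does 
not exist raises an error rather than producing a value.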


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

