[GitHub] [spark] HyukjinKwon edited a comment on pull request #28106: [SPARK-31335][SQL] Add try function support

GitBox Thu, 07 May 2020 19:45:25 -0700


HyukjinKwon edited a comment on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-625597600



   @yaooqinn, I discussed this offline with a few other people, and I decided 
to share their concerns here because they look valid and worth addressing:
   
   The main concern is that the scope of `TRY` might be unclear to end users. 
For example, `TRY(a / MyUDF(b))` catches both the exceptions thrown by `MyUDF` 
and the division-by-zero error. Should users write `TRY(a / TRY(MyUDF(b)))` or 
`TRY(a / MyUDF(b))`? Another example is `TRY(SUM(a/b))` vs 
`TRY(SUM(TRY(a/b)))`.
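
To make the `SUM` example concrete, here is a minimal Python sketch of the two scopes. It is only a toy model, not the PR's implementation: `try_`, `sql_sum`, and `divide` are hypothetical helpers, `None` stands in for `NULL`, and division is assumed to raise on a zero divisor.

```python
from typing import Callable, Iterable, Optional


def try_(thunk: Callable[[], Optional[float]]) -> Optional[float]:
    """Hypothetical model of TRY: evaluate the expression, map any error to NULL (None)."""
    try:
        return thunk()
    except Exception:
        return None


def sql_sum(values: Iterable[Optional[float]]) -> Optional[float]:
    """SQL-style SUM: skips NULL inputs, returns NULL if there is no non-NULL input."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def divide(a: float, b: float) -> float:
    """Division assumed to raise on a zero divisor."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b


rows = [(10.0, 2.0), (6.0, 0.0), (9.0, 3.0)]

# TRY(SUM(a/b)): one bad row fails the whole aggregate, so the result is NULL.
outer_only = try_(lambda: sql_sum(divide(a, b) for a, b in rows))

# TRY(SUM(TRY(a/b))): the bad row becomes NULL and SUM skips it.
nested = try_(lambda: sql_sum(try_(lambda a=a, b=b: divide(a, b)) for a, b in rows))
```

In this model the two spellings give different answers for the same data (`NULL` versus the sum of the valid rows), which is the kind of ambiguity an end user would have to reason about.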
   
   Subqueries might be a problem as well:
   
   ```
   TRY(a IN (SELECT ... WHERE a/b > 1)),
   ```
   
   Errors from `a/b` will propagate all the way up to the `TRY`, and the whole 
expression will be replaced with `NULL`; however, I guess one could equally 
expect that only `a/b` itself should evaluate to `NULL`?
   
   How does it work in these cases?
   - When the expression requires a shuffle, such as with window functions?
   - When a runtime exception occurs in a vectorized pandas UDF: will the 
exception be raised once per batch?
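
To illustrate why the batch question matters, here is a toy Python model. It does not describe how Spark or the PR actually behaves: `reciprocal_batch` and `try_per_batch` are hypothetical names, and the sketch assumes the error handler can only see a failure for the batch as a whole.

```python
from typing import Callable, List, Optional


def reciprocal_batch(batch: List[float]) -> List[float]:
    """Hypothetical vectorized UDF: handles a whole batch, raises if any element is zero."""
    return [1.0 / x for x in batch]


def try_per_batch(
    udf: Callable[[List[float]], List[float]],
    batches: List[List[float]],
) -> List[Optional[float]]:
    """Assumed batch-level TRY: if the UDF raises, every row in that batch becomes NULL."""
    out: List[Optional[float]] = []
    for batch in batches:
        try:
            out.extend(udf(batch))
        except Exception:
            out.extend([None] * len(batch))
    return out


batches = [[1.0, 2.0], [4.0, 0.0]]
result = try_per_batch(reciprocal_batch, batches)
```

Under this assumption, the valid row `4.0` in the second batch is lost along with the failing row, so the result would depend on how rows happen to be batched.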
   
   Maybe it's best to check how the two references you pointed out behave in 
these cases.
   
   It looks like some other vendors chose to add `safe_*` or `try_*` 
expressions whose scope is clear. For example:
   
   https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
   https://docs.snowflake.com/en/sql-reference/functions/try_cast.html
   https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#safe_prefix
   
   Maybe we should take a step back and think about this a bit more.
   
   




