[GitHub] [spark] HyukjinKwon commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-07 Thread GitBox


HyukjinKwon commented on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-625597600


   @yaooqinn, I discussed this offline with a few people I know, and decided to share their concerns here because they look valid and worth addressing.
   
   The main concern is that the scope of `TRY` might be unclear to end users. For example, `TRY(a / MyUDF(b))` catches both the exceptions from `MyUDF` and the division by zero, so it is not obvious whether users should write `TRY(a / TRY(MyUDF(b)))` or `TRY(a / MyUDF(b))`. Another example is `TRY(SUM(a/b))` vs `TRY(SUM(TRY(a/b)))`.
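To make the scoping question concrete, here is a minimal plain-Python sketch of the semantics being discussed. It is not Spark code: `try_`, `my_udf`, and the sample rows are hypothetical stand-ins, and `None` plays the role of SQL `NULL`.

```python
def try_(thunk):
    """Hypothetical model of TRY: evaluate thunk, return None (NULL) on any error."""
    try:
        return thunk()
    except Exception:
        return None


def my_udf(b):
    # Hypothetical UDF that rejects negative input.
    if b < 0:
        raise ValueError("negative input")
    return b


def outer_only(a, b):
    # Models TRY(a / MyUDF(b)): UDF errors and division by zero look identical.
    return try_(lambda: a / my_udf(b))


def nested(a, b):
    # Models TRY(a / TRY(MyUDF(b))): a NULL divisor yields NULL, as in SQL.
    def divide():
        d = try_(lambda: my_udf(b))
        return None if d is None else a / d
    return try_(divide)


def try_sum(rows):
    # Models TRY(SUM(a / b)): one failing row turns the whole aggregate into NULL.
    return try_(lambda: sum(a / b for a, b in rows))


def sum_try(rows):
    # Models TRY(SUM(TRY(a / b))): failing rows become NULL and SUM skips them.
    vals = [try_(lambda a=a, b=b: a / b) for a, b in rows]
    return try_(lambda: sum(v for v in vals if v is not None))


rows = [(6, 3), (1, 0), (4, 2)]
```

In this model both `outer_only` and `nested` return `None` for a UDF failure and for a zero divisor, so the caller cannot tell the two apart, while `try_sum(rows)` and `sum_try(rows)` give different results for the same data.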
   
   Subqueries might be a problem as well:
   
   ```
   TRY(a IN (SELECT ... WHERE a/b > 1)),
   ```
   
   Errors from `a/b` will be propagated all the way up to the `TRY`, and the whole expression will be replaced with `NULL`; however, I guess it is `a/b` that should return `NULL`?
   
   How does it work in these cases:
   - When the expression requires a shuffle, such as a window function?
   - When a runtime exception occurs in a vectorized pandas UDF? Will the exception be raised once for the whole batch?
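The batch question can be illustrated with a small plain-Python sketch. It does not describe how Spark or pandas UDFs actually behave; `batch_udf`, `try_batch`, and `try_rows` are hypothetical names used only to show why per-batch failures lose per-row granularity.

```python
def batch_udf(values):
    # Hypothetical vectorized UDF: receives a whole batch and raises on the first bad value.
    out = []
    for v in values:
        if v == 0:
            raise ZeroDivisionError("zero in batch")
        out.append(10 / v)
    return out


def try_batch(fn, values):
    # Guarding the batch call: the only available fallback is NULL for every row.
    try:
        return fn(values)
    except Exception:
        return [None] * len(values)


def try_rows(fn, values):
    # Guarding each row separately, by calling the UDF on batches of one.
    return [try_batch(fn, [v])[0] for v in values]


batch = [5, 0, 2]
per_batch = try_batch(batch_udf, batch)
per_row = try_rows(batch_udf, batch)
```

Under these assumptions, `per_batch` is `NULL` for all three rows, while `per_row` keeps the two rows that succeed, which is the difference the question above is asking about.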
   
   It looks like some vendors chose to add `safe_*` or `try_*` expressions whose scope is clear. For example:
   
   https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
   https://docs.snowflake.com/en/sql-reference/functions/try_cast.html
   https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#safe_prefix
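For contrast, a narrowly scoped function in the spirit of `try_cast` can be sketched in plain Python as follows. This is an illustrative model only, not any vendor's implementation; `try_cast_int` and `my_udf` are hypothetical.

```python
def try_cast_int(value):
    """Hypothetical scoped function: only the cast itself is guarded."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def my_udf(s):
    # Hypothetical UDF that fails on missing input.
    if s is None:
        raise RuntimeError("UDF failed")
    return s.strip()


ok = try_cast_int(my_udf(" 42 "))
bad_cast = try_cast_int("abc")

try:
    try_cast_int(my_udf(None))
    udf_error_propagated = False
except RuntimeError:
    # The argument is evaluated before the cast, so the UDF error is not swallowed.
    udf_error_propagated = True
```

Here a failed cast becomes `None`, but an error raised while computing the argument still surfaces, so the scope of what is suppressed is clear from the function name.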
   
   Maybe we should think about this more.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-06 Thread GitBox


HyukjinKwon commented on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-624485631


   retest this please






[GitHub] [spark] HyukjinKwon commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-05 Thread GitBox


HyukjinKwon commented on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-624456778










[GitHub] [spark] HyukjinKwon commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-04-27 Thread GitBox


HyukjinKwon commented on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-620359601


   cc @cloud-fan too






[GitHub] [spark] HyukjinKwon commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-04-27 Thread GitBox


HyukjinKwon commented on pull request #28106:
URL: https://github.com/apache/spark/pull/28106#issuecomment-620359343


   I'm positive on this function.


