Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

via GitHub Sun, 02 Feb 2025 01:00:52 -0800


shehabgamin commented on PR #14392:
URL: https://github.com/apache/datafusion/pull/14392#issuecomment-2629306233


   > In the optimizer, we rely on the name to do such optimization, so if we 
rename it to something like 'spark_count' we might need to add the Spark name 
to those optimizer rules as well, which increases the maintenance cost. If we 
assume the DataFusion native functions and Spark functions are mutually 
exclusive (I guess we do), then having a consistent name for the optimizer is 
the preferred choice.
   
   I'm glad you brought this up, @jayzhan211. 
   
   Some Spark functions behave identically to DataFusion functions but have 
different names. For example:
   - Spark’s `startswith(str, substr)` corresponds to DataFusion’s 
`expr_fn::starts_with(str, substr)`
   
   There are also cases where functions take input arguments in a different 
order. For example:
   - Spark’s `position(substr, str)` corresponds to DataFusion’s 
`expr_fn::strpos(str, substr)`
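
   The two cases above (rename only, and rename plus argument reordering) can be sketched as a small translation step. This is a minimal illustration in plain Rust, not code from this PR or from DataFusion: the `Call` type and the `spark_to_datafusion` function are hypothetical, and function names are carried as plain strings rather than as real `expr_fn` calls.

```rust
/// A function call reduced to its name and argument list.
/// Hypothetical type, standing in for a real expression tree.
#[derive(Debug, Clone, PartialEq)]
struct Call {
    name: String,
    args: Vec<String>,
}

/// Translate a Spark-style call into the equivalent DataFusion-style call.
/// Returns `None` when this sketch knows of no equivalent.
fn spark_to_datafusion(call: &Call) -> Option<Call> {
    match (call.name.as_str(), call.args.as_slice()) {
        // Same behavior, different name: rename only.
        ("startswith", [s, prefix]) => Some(Call {
            name: "starts_with".to_string(),
            args: vec![s.clone(), prefix.clone()],
        }),
        // Same behavior, different argument order: rename and swap.
        ("position", [substr, s]) => Some(Call {
            name: "strpos".to_string(),
            args: vec![s.clone(), substr.clone()],
        }),
        // Anything else is left for a Spark-specific implementation.
        _ => None,
    }
}
```

   With a mapping like this, a call such as `position(substr, str)` would reach the optimizer under the name `strpos` with its arguments swapped, so rules keyed on the DataFusion name would not need a second Spark-specific entry.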


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

