+1
It is important that the same function can be called from all of the different APIs.

Ryan Berti <rbe...@netflix.com.invalid> wrote on Thu, May 25, 2023 at 01:48:

> During my recent experience developing functions, I found that identifying
> the locations (SQL + Connect functions.scala + functions.py, FunctionRegistry,
> + whatever is required for R) and the standards for adding function signatures
> was not straightforward (should you use optional args or overloaded
> functions? which col/lit helpers should be used when?). Are there docs
> describing all of the locations + standards for defining a function? If
> not, that'd be great to have too.
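>
> For concreteness, a hypothetical Python sketch of the choices in question
> (the names and signatures below are made up for illustration, not an
> assertion of the project standard):
>
>     from pyspark.sql.column import Column
>     from pyspark.sql.functions import lit
>
>     # Optional-arg style: one Python signature covers both arities
>     # (the Scala side could instead expose two overloads).
>     def percentile(col, percentage, frequency=1):
>         # lit() helper: let callers pass either a Column or a plain value
>         if not isinstance(percentage, Column):
>             percentage = lit(percentage)
>         ...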
>
> Ryan Berti
>
> On Wed, May 24, 2023 at 12:44 AM Enrico Minack <i...@enrico.minack.dev>
> wrote:
>
>> +1
>>
>> Functions available in SQL (or, more generally, in one API) should be
>> available in all APIs. I am very much in favor of this.
>>
>> Enrico
>>
>>
>> On 24.05.23 at 09:41, Hyukjin Kwon wrote:
>>
>> Hi all,
>>
>> I would like to discuss adding all SQL functions into the Scala, Python and
>> R APIs.
>> There are around 175 SQL functions that do not exist in Scala, Python and R.
>> For example, we don’t have pyspark.sql.functions.percentile, but you can
>> invoke it as a SQL function, e.g., SELECT percentile(...).
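>>
>> As a minimal PySpark sketch (the table and column names are made up):
>>
>>     df.createOrReplaceTempView("t")
>>     spark.sql("SELECT percentile(value, 0.5) FROM t")   # works in SQL
>>
>>     from pyspark.sql import functions as F
>>     F.percentile(F.col("value"), 0.5)   # AttributeError: no such function today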
>>
>> The reason why we did not add all functions in the first place is that we
>> wanted to add only commonly used functions; see also
>> https://github.com/apache/spark/pull/21318 (which I agreed with at the time).
>>
>> However, this has been raised multiple times over the years by the OSS
>> community, on the dev mailing list, in JIRAs, on Stack Overflow, etc.
>> It seems to be confusing which functions are available and which are not.
>>
>> Yes, we have a workaround: we can call any expression via expr("...") or
>> call_udf("...", Column ...).
>> But it still does not seem very user-friendly, because users expect these
>> functions to be available under the functions namespace.
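>>
>> A minimal sketch of the workaround (the column name is made up):
>>
>>     from pyspark.sql.functions import expr, call_udf, col, lit
>>
>>     df.select(expr("percentile(value, 0.5)"))
>>     # or equivalently, passing the arguments as Columns:
>>     df.select(call_udf("percentile", col("value"), lit(0.5)))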
>>
>> Therefore, I would like to propose adding all these expressions to all
>> languages, so that Spark is simpler and less confusing, e.g., there is no
>> need to guess which API exists under functions and which does not.
>>
>> Any thoughts?
>>
>>
>>
