maropu commented on pull request #30867:
URL: https://github.com/apache/spark/pull/30867#issuecomment-748818566


   Thanks for the comment, @dongjoon-hyun !
   
   > The category is based on the outputType or inputType? Some functions are 
at the intersection of both types. e.g. StructToCsv or CreateArray.
   
   The basic policy for re-categorizing functions is that functions defined in 
the same file are categorized into the same group. But, yea, the two cases you 
pointed out above are ambiguous, I think. In the current approach, 
`StructToCsv` is categorized into `csv_funcs` because it lives in the 
`csvExpressions.scala` file (that is, it is categorized based on its 
functionality). `CreateArray` is categorized into `array_funcs` based on its 
output type (even though the other functions in `array_funcs` are categorized 
based on their input types...).
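   
   For reference, here is a minimal sketch (not the actual Spark source; the 
class name and the usage text are placeholders) of how a group tag is attached 
through the `ExpressionDescription` annotation:
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.ExpressionDescription
   
   // Placeholder standing in for an expression defined in
   // csvExpressions.scala; only the `group` field matters here.
   @ExpressionDescription(
     usage = "_FUNC_(expr) - placeholder usage text.",
     group = "csv_funcs")
   class PlaceholderCsvExpression
   ```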
   
   > The definition of array_func and collection_func?
   
   `array_funcs` and `map_funcs` are sub-groups of `collection_funcs` in the 
current approach. For example, `array_contains` is used only for arrays, so it 
is assigned to `array_funcs`. On the other hand, `reverse` is used for both 
arrays and strings, so it is assigned to `collection_funcs`.
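   
   As a concrete illustration (a minimal self-contained snippet, not taken 
from the PR; the local session setup is just for demonstration), `reverse` 
accepts both arrays and strings, while `array_contains` accepts only arrays:
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   val spark = SparkSession.builder().master("local[1]").appName("demo").getOrCreate()
   
   // reverse works on arrays and on strings, hence collection_funcs.
   spark.sql("SELECT reverse(array(1, 2, 3))").show()           // [3, 2, 1]
   spark.sql("SELECT reverse('Spark SQL')").show()              // LQS krapS
   // array_contains only accepts arrays, hence array_funcs.
   spark.sql("SELECT array_contains(array(1, 2, 3), 2)").show() // true
   ```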
   
   Anyway, this is a first shot at re-categorizing them, so I'm open to other 
ideas.
   

