GitHub user actuaryzhang commented on the issue:

    https://github.com/apache/spark/pull/18025
  
    @felixcheung I just made a new commit which I think has the cleanest 
solution so far. In this one, I implemented grouping for all aggregate 
functions for Column, except those that are also defined for other classes 
(`count`, `first` and `last`). As you can see, it achieves the following:
    - Centralized documentation for easy navigation.
    - Fewer items in `See also`.
    - Better examples using shared data. This avoids creating a data frame for 
each function, as would be needed if they were documented separately.
    - Cleaner structure and far fewer Rd files.
    - No duplicated `@param` definitions.
    - No need to write meaningless examples for trivial functions (because of 
grouping).
    
    In this version, I also demonstrate that for methods defined by multiple 
classes (`count`, `first` and `last`), we can still document them in their own 
Rd files and simply link to them in the `See also` section. Of course, we could 
also combine the docs for these three into something like `shared_methods.Rd`, 
since each of them is tiny.
    
    Also, to facilitate review, perhaps we can break the changes into several 
PRs, one for each of `aggregate_functions`, `datetime_functions`, 
`math_functions`, and `misc_functions`?
    
    After making the change to the Column methods, I will work on the doc for 
SparkDataFrame and GroupedData. 
    
    Please let me know your thoughts. 
    
    
     

