alamb opened a new issue, #12432:
URL: https://github.com/apache/datafusion/issues/12432

   ### Is your feature request related to a problem or challenge?
   
   When we add a new function to DataFusion's library, we have to remember to 
document it in the user guide, for example in 
https://datafusion.apache.org/user-guide/sql/scalar_functions.html
   
   I observed this recently in 
https://github.com/apache/datafusion/pull/12429#pullrequestreview-2296500404. 
This likely means we have forgotten to document some functions or that the 
documentation has drifted over time.
   
   This also means the help text for functions can only be found on the 
DataFusion website and not, for example, within the function itself. 
   
   It would be awesome if you could do something like this from SQL:
   ```sql
   > DESCRIBE sqrt;
   
   Returns the square root of a number.
   
   sqrt(numeric_expression)
   
   Arguments
   * numeric_expression: Numeric expression to operate on. Can be a constant, 
column, or function, and any combination of arithmetic operators.
   ```
   
   
   
   
   ### Describe the solution you'd like
   
   I would like:
   1. The help text / description of a ScalarUDFImpl (and AggregateUDFImpl) to be 
available programmatically (see https://github.com/apache/datafusion/issues/8366)
   2. The contents of the [SQL reference 
guide](https://datafusion.apache.org/user-guide/sql/scalar_functions.html#sqrt) 
to be automatically generated from the source code 
   
   DataFusion already does something like this for 
[`ConfigOptions`](https://docs.rs/datafusion/latest/datafusion/common/config/struct.ConfigOptions.html)
   
   For example, the comments in 
https://docs.rs/datafusion/latest/datafusion/config/struct.SqlParserOptions.html
 are added to the documentation programmatically:
   * Run this script: 
https://github.com/apache/datafusion/blob/main/dev/update_config_docs.sh
   * Which runs this program: 
https://github.com/apache/datafusion/blob/3ece7a736193a87941a00eb35f3001df282fd075/datafusion/core/src/bin/print_config_docs.rs#L4
   * Which updates this file: 
https://github.com/apache/datafusion/blob/main/docs/source/user-guide/configs.md
   
   ### Describe alternatives you've considered
   
   I suggest this as a high-level approach:
   1. Add methods to the 
[`ScalarUDFImpl`](https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.ScalarUDFImpl.html)
 trait as proposed by @universalmind303 in 
https://github.com/apache/datafusion/issues/8366 `ScalarUDFImpl::description` 
and `ScalarUDFImpl::sql_example`
   2. Create a script and extraction program, similar to how it is done for 
`ConfigOptions`, to generate the SQL reference from those functions.
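   
   As a rough illustration of step 1, here is a minimal, self-contained sketch. 
The `DocumentedFunction` trait is a hypothetical stand-in for `ScalarUDFImpl` 
(it is not DataFusion's actual API), and the `description` / `sql_example` 
method names follow the proposal in issue 8366. The `describe` helper is also 
an assumption, showing what a `DESCRIBE sqrt` statement could print:
   
   ```rust
   /// Hypothetical stand-in for DataFusion's `ScalarUDFImpl`; only the parts
   /// relevant to documentation are modeled here.
   pub trait DocumentedFunction {
       /// Name of the function as used in SQL.
       fn name(&self) -> &str;
       /// Short help text (the proposed `description` method).
       fn description(&self) -> &str;
       /// Example invocation (the proposed `sql_example` method).
       fn sql_example(&self) -> &str;
   }
   
   /// Illustrative function type; the help text is copied from the
   /// `DESCRIBE sqrt` example earlier in this issue.
   pub struct Sqrt;
   
   impl DocumentedFunction for Sqrt {
       fn name(&self) -> &str {
           "sqrt"
       }
       fn description(&self) -> &str {
           "Returns the square root of a number."
       }
       fn sql_example(&self) -> &str {
           "sqrt(numeric_expression)"
       }
   }
   
   /// Renders the text a hypothetical `DESCRIBE <function>` could return.
   pub fn describe(func: &dyn DocumentedFunction) -> String {
       format!("{}\n\n{}\n", func.description(), func.sql_example())
   }
   ```
   
   Because the help text lives on the function itself, both a SQL statement and 
a documentation generator could read it from the same place.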
   
   In terms of implementation order I would personally suggest breaking this 
project into smaller parts:
   
   A first PR that does:
   1. Add the methods to the trait
   2. Move the documentation for one or two of the functions into the trait 
implementations / code
   3. Add a script that creates some of the documentation (perhaps we could start 
by creating a new temporary page like 
https://datafusion.apache.org/user-guide/sql/scalar_functions_new.html that has 
only the auto-generated documentation)
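   
   The extraction program in step 3 could look roughly like the sketch below. 
Everything here is an assumption for illustration: the `FunctionDoc` struct, 
the `render_markdown` function, and the page layout are hypothetical and are 
not taken from `print_config_docs.rs` or any existing DataFusion code. The 
idea is only that the program collects the per-function help text and emits 
one markdown section per function, sorted by name so the output is stable:
   
   ```rust
   /// Hypothetical record of what the extraction program would read from
   /// each function (for example via `description` and `sql_example`).
   pub struct FunctionDoc {
       pub name: &'static str,
       pub description: &'static str,
       pub sql_example: &'static str,
   }
   
   /// Renders a markdown page with one section per function.
   /// Sections are sorted by function name so regenerating the page
   /// produces a deterministic file.
   pub fn render_markdown(docs: &[FunctionDoc]) -> String {
       let mut sorted: Vec<&FunctionDoc> = docs.iter().collect();
       sorted.sort_by_key(|doc| doc.name);
   
       let mut out = String::from("# Scalar Functions (auto-generated)\n");
       for doc in sorted {
           out.push_str(&format!(
               "\n### `{}`\n\n{}\n\nSyntax: `{}`\n",
               doc.name, doc.description, doc.sql_example
           ));
       }
       out
   }
   ```
   
   A wrapper script, similar in spirit to `dev/update_config_docs.sh`, could then 
write this output to the temporary page.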
   
   Then we can work in multiple PRs to port the remaining documentation over to 
the code (which will automatically result in the new page getting updated) 
   
   And then finally we can remove the old page when all functions are ported. 
   
   If we start working on this project, we (I) can file follow-on tickets to track 
porting the remaining functions / doing the same thing for aggregate functions, 
etc. 
   
   
   ### Additional context
   
   
   
   Similarly, GlareDB has a way to automatically annotate functions with 
documentation, and @universalmind303 proposed something similar here: 
https://github.com/apache/datafusion/issues/8366
   
   Also, @findepi is considering implementing `SHOW FUNCTIONS` as part of 
https://github.com/apache/datafusion/issues/12144, which could likely take 
advantage of this documentation if it were present.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

