Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-19 Thread via GitHub


alamb commented on code in PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#discussion_r1848211730


##
datafusion/doc-gen/src/lib.rs:
##
@@ -190,7 +177,7 @@ impl DocumentationBuilder {
 self
 }
 
-pub fn build(self) -> Result {
+pub fn build(self) -> Documentation {

Review Comment:
   I like that idea



##
Cargo.toml:
##
@@ -100,6 +102,7 @@ datafusion = { path = "datafusion/core", version = 
"43.0.0", default-features =
 datafusion-catalog = { path = "datafusion/catalog", version = "43.0.0" }
 datafusion-common = { path = "datafusion/common", version = "43.0.0", 
default-features = false }
 datafusion-common-runtime = { path = "datafusion/common-runtime", version = 
"43.0.0" }
+datafusion-doc = { path = "datafusion/doc-gen", version = "43.0.0" }

Review Comment:
   looks like the directory name is still `doc` when calling it `doc` would be 
more consistent with the others



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


comphead commented on PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#issuecomment-2484303050

   Another thing comes to my mind, we can automate transforming current API 
documentation approach to attribute based one wherever possible. There are more 
complicated options with macros, we can leave them with API approach for now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


comphead commented on PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#issuecomment-2484292557

   Done, 
   
   > Maybe we could call the function something like macro_doc or 
derived_documentation to make it easier to find (aka so someone who runs across 
it can easily grep and find the macro definition)
   
   Good idea, maybe we can name it as `user_documentation`. I'd prefer to leave 
it to the following PR though, because it will affect dozens of files


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


comphead commented on code in PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#discussion_r1847352401


##
datafusion/doc-gen/Cargo.toml:
##
@@ -0,0 +1,35 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+[package]
+name = "datafusion-doc-gen"

Review Comment:
   Yes, procedural macros crates are not very flexible and requires all 
structures to be predefined before the macros is used. Renaming it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


comphead commented on code in PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#discussion_r1847350883


##
datafusion/doc-gen/src/lib.rs:
##
@@ -190,7 +177,7 @@ impl DocumentationBuilder {
 self
 }
 
-pub fn build(self) -> Result {
+pub fn build(self) -> Documentation {

Review Comment:
   Done. I was thinking to convert the builder to support constructor `new` 
with mandatory arguments so we dont need having this panics, I'll do it in 
follow up PR 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


alamb commented on code in PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#discussion_r1847317488


##
datafusion/functions/src/datetime/to_date.rs:
##
@@ -162,7 +151,7 @@ impl ScalarUDFImpl for ToDateFunc {
 }
 
 fn documentation(&self) -> Option<&Documentation> {

Review Comment:
   Maybe we could call the function something like `macro_doc` or 
`derived_documentation` to make it easier to find (aka so someone who runs 
across it can easily `grep` and find the macro definition)



##
datafusion/functions/src/datetime/to_date.rs:
##
@@ -162,7 +151,7 @@ impl ScalarUDFImpl for ToDateFunc {
 }
 
 fn documentation(&self) -> Option<&Documentation> {

Review Comment:
   I think as long as it is well documented, this boiler plate is ok



##
datafusion/expr/src/lib.rs:
##
@@ -66,6 +65,7 @@ pub mod var_provider;
 pub mod window_frame;
 pub mod window_state;
 
+pub use datafusion_doc_gen::{DocSection, Documentation, DocumentationBuilder};

Review Comment:
   👍 



##
datafusion/functions/src/datetime/to_date.rs:
##
@@ -26,9 +26,42 @@ use 
datafusion_expr::scalar_doc_sections::DOC_SECTION_DATETIME;
 use datafusion_expr::{
 ColumnarValue, Documentation, ScalarUDFImpl, Signature, Volatility,
 };
+use datafusion_macros::udf_doc;
 use std::any::Any;
 use std::sync::OnceLock;
 
+#[udf_doc(
+doc_section(include = "true", label = "Time and Date Functions"),

Review Comment:
   SInce we auto generate the documentation, part of the PR review would be to 
look at the generated pages. I think we might be able to catch typos at that 
stage too



##
datafusion/functions/src/datetime/to_date.rs:
##
@@ -22,13 +22,46 @@ use arrow::error::ArrowError::ParseError;
 use arrow::{array::types::Date32Type, compute::kernels::cast_utils::Parser};
 use datafusion_common::error::DataFusionError;
 use datafusion_common::{arrow_err, exec_err, internal_datafusion_err, Result};
-use datafusion_expr::scalar_doc_sections::DOC_SECTION_DATETIME;
 use datafusion_expr::{
 ColumnarValue, Documentation, ScalarUDFImpl, Signature, Volatility,
 };
+use datafusion_macros::user_doc;
 use std::any::Any;
 use std::sync::OnceLock;
 
+#[user_doc(
+doc_section(include = "true", label = "Time and Date Functions"),
+description = r"Converts a value to a date (`-MM-DD`).
+Supports strings, integer and double types as input.
+Strings are parsed as -MM-DD (e.g. '2023-07-20') if no [Chrono 
format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html)s are 
provided.
+Integers and doubles are interpreted as days since the unix epoch 
(`1970-01-01T00:00:00Z`).
+Returns the corresponding date.
+
+Note: `to_date` returns Date32, which represents its values as the number of 
days since unix epoch(`1970-01-01`) stored as signed 32 bit value. The largest 
supported date value is `-12-31`.",
+syntax_example = "to_date('2017-05-31', '%Y-%m-%d')",

Review Comment:
   this is pretty fancy



##
datafusion/macros/src/lib.rs:
##
@@ -0,0 +1,154 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+extern crate proc_macro;
+use proc_macro::TokenStream;
+use quote::quote;
+use syn::{parse_macro_input, DeriveInput, LitStr};
+
+#[proc_macro_attribute]
+pub fn user_doc(args: TokenStream, input: TokenStream) -> TokenStream {

Review Comment:
   I think this macro needs user facing documentation (I realize that is quite 
meta 😆 ). Specifically a 
   1. Introduction of what it does
   2. A list of the optional arguments (`doc_section_include`, 
`doc_section_lbl`, etc) and examples
   3. That it creates a standard `impl ... ` with a `doc_function`
   
   It would also be really neat to make a doc example of it in use, if possible



##
datafusion/doc-gen/src/lib.rs:
##
@@ -190,7 +177,7 @@ impl DocumentationBuilder {
 self
 }
 
-pub fn build(self) -> Result {
+pub fn build(self) -> Documentation {

Review Comment:
   Perhaps it is worth documenting that this will panic if doc section, 
description, or syntax example is missing



##
datafusion/doc-gen/Cargo.toml:
##
@@ -0,0 +1,35 @@
+# Licensed to the Apache Sof

Re: [PR] User documentation: Generate docs from macros. [datafusion]

2024-11-18 Thread via GitHub


comphead commented on PR #12822:
URL: https://github.com/apache/datafusion/pull/12822#issuecomment-2483936429

   @alamb @Omega359 the PR is ready for review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org