[ 
https://issues.apache.org/jira/browse/FLINK-25777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537522#comment-17537522
 ] 

Alexander Preuss commented on FLINK-25777:
------------------------------------------

As a first step, we investigated how to automate generating the docs from the 
current {{ConnectorOptions}} classes with Hugo. Our experimental setup looked 
like this:

1) Symlink a specific ConnectorOptions Java file into Hugo's {{/static}} 
folder.

2) Create a shortcode that processes the file, like this:
{code:java}
{{ $optionsFile := os.ReadFile "static/connector-options/ElasticsearchConnectorOptions.java" }}
{{ findRE "ConfigOption<(.|\n)*?;" $optionsFile }}
{code}
3) Include the shortcode on the documentation page for the connector. This 
produces output like this:


{code:java}
[ConfigOption<List<String>> HOSTS_OPTION = ConfigOptions.key("hosts") .stringType() .asList() .noDefaultValue() .withDescription("Elasticsearch hosts to connect to.");
ConfigOption<String> INDEX_OPTION = ConfigOptions.key("index") .stringType() .noDefaultValue() .withDescription("Elasticsearch index for every record.");
....]
{code}
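The same pattern can also be tried outside Hugo. The following is a minimal 
sketch in plain Java, not part of our experiment: the class name 
RegexOptionScan, the option EXAMPLE_OPTION and the constant DEFAULT_EXAMPLE 
are hypothetical and do not come from the Flink code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexOptionScan {
    // Same pattern as in the shortcode: lazily match from "ConfigOption<"
    // up to the next ';', allowing newlines in between.
    static final Pattern OPTION = Pattern.compile("ConfigOption<(.|\\n)*?;");

    // Returns every substring of the source text that the pattern matches.
    static List<String> scan(String source) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = OPTION.matcher(source);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical option definition, written the way a long
        // description is usually wrapped in source code.
        String source =
                "public static final ConfigOption<String> EXAMPLE_OPTION =\n"
                        + "    ConfigOptions.key(\"example\")\n"
                        + "        .stringType()\n"
                        + "        .defaultValue(DEFAULT_EXAMPLE)\n"
                        + "        .withDescription(\"First part; \"\n"
                        + "            + \"second part.\");\n";
        for (String match : scan(source)) {
            System.out.println(match);
        }
    }
}
```

Running main prints a single match that stops at the semicolon inside the 
description string, so the second half of the description is lost, and the 
captured text still contains the unresolved DEFAULT_EXAMPLE reference.
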
One major issue with this approach is that parsing the individual options and 
their building blocks (description, default value) with regexes is very 
brittle. Another issue is that long descriptions are split across several 
lines in the source, so the output would have to merge the individual string 
fragments and strip the " + " signs that join them. It also means that any 
variable referenced in the code would appear as-is in the docs instead of 
being resolved to its actual value.
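
For comparison, reading the option objects at runtime avoids both problems, 
because string concatenation and referenced constants are already evaluated by 
then. The following is only a sketch under assumptions: Option is a 
hypothetical stand-in for Flink's ConfigOption, ExampleOptions is a 
hypothetical options class, and none of this describes how 
{{ConfigOptionsDocGenerator}} is actually implemented.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class ReflectiveOptionScan {
    // Hypothetical stand-in for Flink's ConfigOption, reduced to three
    // strings so that the sketch runs without Flink on the classpath.
    static final class Option {
        final String key;
        final String defaultValue;
        final String description;

        Option(String key, String defaultValue, String description) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.description = description;
        }
    }

    // Hypothetical options class with a referenced constant and a
    // description that is concatenated across two string literals.
    static final class ExampleOptions {
        static final String DEFAULT_EXAMPLE = "fallback";

        public static final Option EXAMPLE_OPTION =
                new Option("example", DEFAULT_EXAMPLE, "First part; " + "second part.");
    }

    // Collects one "key | default | description" row per static Option field.
    static List<String> describe(Class<?> holder) {
        List<String> rows = new ArrayList<>();
        for (Field field : holder.getDeclaredFields()) {
            boolean isStatic = Modifier.isStatic(field.getModifiers());
            if (!isStatic || !Option.class.isAssignableFrom(field.getType())) {
                continue; // skip plain constants such as DEFAULT_EXAMPLE
            }
            try {
                field.setAccessible(true);
                Option option = (Option) field.get(null);
                rows.add(option.key + " | " + option.defaultValue + " | " + option.description);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read " + field.getName(), e);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : describe(ExampleOptions.class)) {
            System.out.println(row);
        }
    }
}
```

The trade-off is that this needs the compiled classes on the classpath, which 
a Hugo shortcode that only reads source files cannot provide.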

After a short discussion, we also considered linking to the Javadoc of the 
ConnectorOptions classes instead of creating the table. While this might be a 
valid approach, the current Javadoc lacks the information that is visible in 
the code. For example, this entry 
([https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/streaming/connectors/elasticsearch/table/ElasticsearchConnectorOptions.html#BULK_FLASH_MAX_SIZE_OPTION])
 shows only the option's type and says nothing about the default value or the 
description. 
Improving the Javadoc output generated from the ConnectorOptions classes might 
be a first step in the right direction. For now, we decided to keep creating 
the option tables manually.

> Generate documentation for Table factories (formats and connectors)
> -------------------------------------------------------------------
>
>                 Key: FLINK-25777
>                 URL: https://issues.apache.org/jira/browse/FLINK-25777
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Connectors / Common, Documentation, Formats (JSON, Avro, 
> Parquet, ORC, SequenceFile)
>            Reporter: Francesco Guardiani
>            Priority: Critical
>
> The goal of this issue is to automatically generate, from code, the 
> documentation of configuration options for table connectors and formats.
> This issue includes:
> * Tweak {{ConfigOptionsDocGenerator}} to work with 
> {{Factory#requiredOptions}}, {{Factory#optionalOptions}} and the newly 
> introduced {{DynamicTableFactory#forwardOptions}} and 
> {{FormatFactory#forwardOptions}}. See also 
> https://github.com/apache/flink/pull/18290 for reference. From these methods 
> we should extract whether an option is required or not, and whether it is 
> forwardable or not.
> * Decide where the generator output should go, and how to link/include it 
> in the connector/format docs pages.
> * Enable the code generator in CI.
> * Regenerate all the existing docs.



