Re: Refactor reference.md

Julian Hyde Wed, 24 Jan 2024 14:07:42 -0800

The current review process is that I hand-edit most of the submissions. (Or 
make detailed suggestions: ‘Convert this verb from declarative to imperative. 
Remove the space before the open parenthesis.’) If we increase the scope of the 
documentation, that puts more burden on me.


We should not increase the scope of the documentation unless we introduce 
generation. Generation will at least allow us to automate some checks, such as 
that the examples actually parse and return the results the doc says they do.
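
To make the idea of automated checks concrete, here is a minimal sketch of a 
style check for one row of the function table. It is only an illustration: the 
function name lint_row is hypothetical, the three-cell row layout is assumed 
from the TO_CHAR row quoted later in this thread, and the two rules are the 
review comments mentioned in this thread (no space before the open parenthesis, 
no '.' at the end of the line). Checking that examples parse and return the 
documented results would additionally need a way to run SQL against Calcite, 
which is not shown.

```python
import re


def lint_row(row: str) -> list[str]:
    """Return style problems found in one '| dialects | signature | description' row.

    Assumption: a row has exactly three pipe-separated cells and no cell
    contains a literal '|'.
    """
    # Drop the outer pipes, then split into cells and trim whitespace.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 3:
        return ["expected 3 cells, found %d" % len(cells)]
    _dialects, signature, description = cells

    problems = []
    # Rule 1: the function name must be immediately followed by '('.
    if re.search(r"\w\s+\(", signature):
        problems.append("space before open parenthesis in signature")
    # Rule 2: descriptions do not end with a period.
    if description.endswith("."):
        problems.append("description ends with '.'")
    return problems
```

A check like this could run over every table row in a build step, so that a 
reviewer only has to look at the rows it flags.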

Julian


> On Jan 24, 2024, at 1:55 PM, Mihai Budiu <mbu...@gmail.com> wrote:
> 
> I am not proposing a new process, the existing review process would continue 
> to apply. The page would still be part of the repository. Just a separate web 
> page on the calcite site, unbundled from the SQL language page.
> 
> Mihai
> 
> ________________________________
> From: Julian Hyde <jhyde.apa...@gmail.com>
> Sent: Wednesday, January 24, 2024 1:27 PM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Re: Refactor reference.md
> 
> "The documentation would be incrementally improved, like the code base." Or 
> it might incrementally decline into a shambles. Sure, this is open source, 
> and open source can sometimes create miracles, but we need to be realistic. 
> We need an owner, and systems in place, to overcome the effects of entropy.
> 
> Other products have a separate page for each function, and an index 
> containing all functions. For example, see PostGIS: 
> https://postgis.net/docs/manual-1.5/ST_MakeLine.html. But take a look at the 
> meta tags at the top of the page - it’s generated from DocBook. That is a 
> tell that it is maintained by a professional writer.
> 
> Julian
> 
> 
> 
>> On Jan 24, 2024, at 1:12 PM, Mihai Budiu <mbu...@gmail.com> wrote:
>> 
>> The documentation would be incrementally improved, like the codebase.
>> We could start by just moving it to a different file. The narrow table also 
>> makes it difficult to read, so perhaps we can reformat that. A third column 
>> would be nice for examples, but it would mostly be empty initially. Some 
>> functions require additional clarifications as long text; maybe these can be 
>> footnotes?
>> 
>> Another solution is to make a separate table for each class of functions: 
>> string, numeric, array, etc. That would make it easier to intersperse with 
>> additional notes.
>> 
>> The terse format makes it very difficult to explain things that are subtle. 
>> For example, see my PR https://github.com/apache/calcite/pull/3571 which 
>> only attempts to clarify something, but has not been approved since early 
>> December.
>> 
>> Mihai
>> ________________________________
>> From: Julian Hyde <jhyde.apa...@gmail.com>
>> Sent: Wednesday, January 24, 2024 12:54 PM
>> To: dev@calcite.apache.org <dev@calcite.apache.org>
>> Subject: Re: Refactor reference.md
>> 
>> Extra documentation would be nice. But who is going to write (and maintain) 
>> this extra documentation?
>> 
>> Even the current documentation takes a lot of work. When reviewing a PR to 
>> add a function, I have to tell people to remove a ‘.’ at the end of the line 
>> to be consistent with the existing doc. Without those efforts, the 
>> documentation would be a shambles, and no one would trust it. We have over 
>> 500 functions.
>> 
>> Julian
>> 
>> 
>>> On Jan 24, 2024, at 9:46 AM, Mihai Budiu <mbu...@gmail.com> wrote:
>>> 
>>> I think we should make a separate document for the functions, and in 
>>> general give more details about the functions' behavior. The current model 
>>> is to give a very brief description of the function, but that's often not 
>>> enough: users have to resort either to experiments or to reading 
>>> documentation from other databases. The behavior should be described for 
>>> corner cases, and ideally there should be examples as well.
>>> 
>>> Mihai
>>> ________________________________
>>> From: Cancai Cai <can...@apache.org>
>>> Sent: Wednesday, January 24, 2024 7:14 AM
>>> To: dev@calcite.apache.org <dev@calcite.apache.org>
>>> Subject: Refactor reference.md
>>> 
>>> Hey Calcite Devs,
>>> 
>>> I am currently working on CALCITE-6215
>>> <https://issues.apache.org/jira/browse/CALCITE-6215>. During my work, I
>>> have noticed that certain functions have multiple variations with different
>>> parameter types in their respective databases. For example, in PostgreSQL,
>>> the to_char function supports multiple forms such as to_char(timestamp,
>>> text), to_char(interval, text), and to_char(numeric_type, text).
>>> 
>>> However, the description in Calcite is not clear enough. For instance, the
>>> reference.md document describes the to_char function as follows:
>>> 
>>> | m o p | TO_CHAR(timestamp, format) | Converts *timestamp* to a string
>>> using the format *format*.
>>> 
>>> This description may not provide enough clarity for users to understand the
>>> usage of each function across different databases.
>>> 
>>> I suggest considering adding specific links to the corresponding database
>>> functions in the reference.md document to enhance its completeness. This
>>> would allow users to easily access the documentation for the respective
>>> database functions.
>>> 
>>> Thanks as always,
>>> 
>>> Cancai Cai
>>> 
>>> https://www.postgresql.org/docs/16/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE
>> 
>
