On 08.04.24 21:29, Daniel Gustafsson wrote:
Over in [0] I asked whether it would be worthwhile converting all our README
files to Markdown, and since it wasn't met with pitchforks I figured it would
be an interesting exercise to see what it would take (my honest gut feeling
was that it would be way too intrusive).  Markdown does bring a few key
features however so IMHO it's worth attempting to see:

* New developers are very used to reading/writing it
* Using a defined format ensures some level of consistency
* Many users and contributors new *as well as* old like reading documentation
   nicely formatted in a browser
* The documentation now prints really well
* pandoc et al. can be used to render nice-looking PDFs
* All the same benefits as discussed in [0]

The plan was to follow Gruber's original motivation for Markdown closely:

        "The idea is that a Markdown-formatted document should be publishable
        as-is, as plain text, without looking like it’s been marked up with
        tags or formatting instructions."

This translates to making the least amount of changes to achieve a) retained
plain text readability at today's level, b) proper Markdown rendering, not
looking like text files in an HTML window, and c) absolutely no reflows and
minimal impact on git blame.

I started looking through this and immediately found a bunch of tiny problems. (This is probably in part because the READMEs under src/backend/access/ are some of the more complicated ones, but then they are also the ones that might benefit most from better rendering.)

One general problem is that original Markdown and GitHub-flavored Markdown (GFM) are incompatible in some interesting aspects. For example, the line

    A split initially marks the left page with the F_FOLLOW_RIGHT flag.

is rendered by GFM as you'd expect.  But original Markdown converts it to

    A split initially marks the left page with the F<em>FOLLOW</em>RIGHT
    flag.

This kind of problem is pervasive, as you'd expect.
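
To get a rough idea of how many places are affected, something like the following could be run over the READMEs. This is only a heuristic sketch, not an emulation of the original Markdown parser: the function name and the "two or more underscores in one word" rule are my own assumptions, and it simply skips lines indented by four spaces or a tab on the assumption that those render as code blocks.

```python
import re

# Heuristic (assumption): a word containing two or more underscores with
# text in between, such as F_FOLLOW_RIGHT, is a candidate for being
# rendered as emphasis by original Markdown.
INTRAWORD = re.compile(r"\b\w*_\w+_\w*\b")


def find_intraword_underscores(text):
    """Return (line number, word) pairs for suspicious identifiers."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Indented lines are assumed to be code blocks, where emphasis
        # is not applied.
        if line.startswith("    ") or line.startswith("\t"):
            continue
        for match in INTRAWORD.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Every hit would then need to be wrapped in backticks or otherwise escaped, which is exactly the kind of change that touches a lot of lines.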

Another incompatibility is that GFM accepts "1)" as a list marker (which appears to be used often in the READMEs), but original Markdown does not. This then also affects surrounding formatting.
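
The "1)" markers could be located in a similar way. Again, this is just a sketch under my own assumptions (hypothetical function name, and a marker is taken to be digits followed by ")" with at most three leading spaces); it does not try to decide whether the line really is meant as a list item.

```python
import re

# Assumption: an ordered-list marker of the form "1)" starts the line,
# optionally preceded by up to three spaces, and is followed by text.
PAREN_MARKER = re.compile(r"^ {0,3}\d+\)\s+\S")


def find_paren_list_markers(text):
    """Return (line number, stripped line) pairs using "N)" markers."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(text.splitlines(), start=1)
        if PAREN_MARKER.match(line)
    ]
```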

Also, the READMEs often do not indent lists unambiguously. For example, if you look into src/backend/optimizer/README, section "Join Tree Construction", there are two list items, but it's not immediately clear which paragraphs belong to the list and which ones follow the list. This also interacts with the previous point. The resulting formatting in GFM is quite misleading.

src/port/README.md is a similar case.

There are also various places where whitespace is used for ad-hoc formatting. Consider, for example, this passage in src/backend/access/gin/README:

  the "category" of the null entry.  These are the possible categories:

    1 = ordinary null key value extracted from an indexable item
    2 = placeholder for zero-key indexable item
    3 = placeholder for null indexable item

  Placeholder null entries are inserted into the index because otherwise

But the Markdown rendering does not preserve the list-like formatting; it just flows the lines together.

There is a similar case with the authors list at the end of src/backend/access/gist/README.md.

src/test/README.md wasn't touched by your patch, but it also needs adjustments for list formatting.


In summary, I think before we could accept this, we'd need to go through this with a fine-toothed comb line by line and page by page to make sure the formatting is still sound. And we'd need to figure out which Markdown flavor to target.


