Re: [squid-dev] squid.conf future

2020-02-29 Thread Alex Rousskov
On 2/29/20 6:07 AM, Francesco Chemolli wrote:
> I like the idea of markdown, if possible.
> It is simple and gets the job done.
> Would processing it be more cumbersome than the alternatives?

If we look at the big picture rather than the next step, then the
answer, IMO, is "no". Many modern projects use markdown. I suspect there
are more modern/supported/quality tools processing markdown than
modern/supported/quality tools processing linuxdoc or similar XML-like
markup.

Moreover, _if_ we want that, we can avoid doing any markdown-to-HTML
conversion ourselves! Services like Github Pages do it for you
automatically. I am not (yet) saying that we should use such a service,
but I think we should at least _consider_ it because such services have
some serious advantages over the current web site integration approach.


Cheers,

Alex.


> On Wed, Feb 26, 2020 at 8:43 PM Alex Rousskov wrote:
> 
> On 2/25/20 1:31 AM, Amos Jeffries wrote:
> 
> > Any suggestions of formats I should look at then?
> 
> I believe cf.data.pre should use two primary formats, each optimized
> specifically for the content it is applied to. The secondary details of
> each format will evolve, but here is where I would start today:
> 
> 1. YAML-like metadata to supply formal details about each directive. We
> already use this today, and I expect no significant changes in the
> immediate future (even though many improvements are possible in this
> area!).
> 
> 2. Markdown for informal documentation inside DOC_START...DOC_END,
> DEFAULT_DOC, and similar text blocks. Preprocessing includes validation
> of internal references (for sure), generation of internal anchors
> (probably), and removal of minimal common indentation (not sure; need to
> experiment/discuss). Other preprocessing actions may be desirable as
> well, of course. An agreement on the lower-level details and a
> non-trivial conversion effort would be required. We can discuss whether
> to do it incrementally or once-and-for-all.
> 
> 
> HTH,
> 
> Alex.
> ___
> squid-dev mailing list
> squid-dev@lists.squid-cache.org 
> http://lists.squid-cache.org/listinfo/squid-dev
> 
> 
> 
> -- 
>     Francesco

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] squid.conf future

2020-02-29 Thread Francesco Chemolli
I like the idea of markdown, if possible.
It is simple and gets the job done.
Would processing it be more cumbersome than the alternatives?

On Wed, Feb 26, 2020 at 8:43 PM Alex Rousskov <
rouss...@measurement-factory.com> wrote:

> On 2/25/20 1:31 AM, Amos Jeffries wrote:
>
> > Any suggestions of formats I should look at then?
>
> I believe cf.data.pre should use two primary formats, each optimized
> specifically for the content it is applied to. The secondary details of
> each format will evolve, but here is where I would start today:
>
> 1. YAML-like metadata to supply formal details about each directive. We
> already use this today, and I expect no significant changes in the
> immediate future (even though many improvements are possible in this
> area!).
>
> 2. Markdown for informal documentation inside DOC_START...DOC_END,
> DEFAULT_DOC, and similar text blocks. Preprocessing includes validation
> of internal references (for sure), generation of internal anchors
> (probably), and removal of minimal common indentation (not sure; need to
> experiment/discuss). Other preprocessing actions may be desirable as
> well, of course. An agreement on the lower-level details and a
> non-trivial conversion effort would be required. We can discuss whether
> to do it incrementally or once-and-for-all.
>
>
> HTH,
>
> Alex.
>


-- 
Francesco


Re: [squid-dev] squid.conf future

2020-02-26 Thread Alex Rousskov
On 2/25/20 1:31 AM, Amos Jeffries wrote:

> Any suggestions of formats I should look at then?

I believe cf.data.pre should use two primary formats, each optimized
specifically for the content it is applied to. The secondary details of
each format will evolve, but here is where I would start today:

1. YAML-like metadata to supply formal details about each directive. We
already use this today, and I expect no significant changes in the
immediate future (even though many improvements are possible in this area!).

2. Markdown for informal documentation inside DOC_START...DOC_END,
DEFAULT_DOC, and similar text blocks. Preprocessing includes validation
of internal references (for sure), generation of internal anchors
(probably), and removal of minimal common indentation (not sure; need to
experiment/discuss). Other preprocessing actions may be desirable as
well, of course. An agreement on the lower-level details and a
non-trivial conversion effort would be required. We can discuss whether
to do it incrementally or once-and-for-all.
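
A minimal Python sketch of the three preprocessing steps named in item 2,
for illustration only. The anchor scheme (a `directive-` prefix), the
`[text](#anchor)` reference syntax, and all function names are assumptions,
not anything cf_gen does today; `foo` and `bar_baz` are placeholder
directive names.

```python
from __future__ import annotations

import re
import textwrap


def dedent_block(text: str) -> str:
    """Remove the minimal common indentation from a documentation block."""
    return textwrap.dedent(text).strip("\n")


def anchor_for(directive: str) -> str:
    """Derive an internal anchor from a directive name (hypothetical scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", directive.lower()).strip("-")
    return "directive-" + slug


# Assumed reference syntax: a Markdown link whose target is "#<anchor>".
REFERENCE = re.compile(r"\]\(#([^)]+)\)")


def broken_references(docs: dict[str, str]) -> list[tuple[str, str]]:
    """Return (directive, target) pairs for links to anchors nobody defines."""
    known = {anchor_for(name) for name in docs}
    return [
        (name, target)
        for name, text in docs.items()
        for target in REFERENCE.findall(text)
        if target not in known
    ]


if __name__ == "__main__":
    docs = {
        "foo": dedent_block("""
            Placeholder text. See also [bar_baz](#directive-bar-baz).
        """),
        "bar_baz": dedent_block("""
            Placeholder text. See [missing](#directive-missing).
        """),
    }
    for directive, target in broken_references(docs):
        print(f"{directive}: unknown reference target #{target}")
```

Whether indentation removal belongs in the preprocessor at all is still
the open question noted above; the sketch only shows that it is cheap to
try.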


HTH,

Alex.


Re: [squid-dev] squid.conf future

2020-02-25 Thread Alex Rousskov
On 2/25/20 1:31 AM, Amos Jeffries wrote:
> On 25/02/20 6:11 am, Alex Rousskov wrote:
>> On 2/24/20 3:11 AM, Amos Jeffries wrote:
>>> For the future I am considering a switch of cf.data.pre to a format like
>>> SGML or XML which we can better generate the website contents from.

>> If you meant something specific by "SGML", please clarify.

> We have the Linuxdoc toolchain already used for release notes
> etc. So long as we have a simple set of rules about the markup used for
> bits that cf_gen needs to pull out for code generation, we can use any of
> the more powerful markup in the documentation comment parts.

With all due respect to LDP, LinuxDoc feels like a dying project
these days -- not a lot of activity and a lot of stale sites. The
current(?) toolchain maintainer said[1] that modern (at the time)
developers prefer DocBook: "DocBook DTD [...] is now a more popular DTD
than LinuxDoc in writing technical software documentation".

That was ... 11 years ago.

[1] https://gitlab.com/agmartin/linuxdoc-tools



>> Automated rendering of squid.conf sources, including web site content
>> generation, should be straightforward with any good source format,
>> including writer-friendly formats. Thus, web site generation is not an
>> important deciding criterion here AFAICT.

> It is an existing use-case for documentation output we need to maintain.

Agreed. My point is _not_ that we do not need to support web site
generation. My point is that any decent tool, including custom scripts,
can generate web sites these days (and, in most cases, do a better job
than what we have today). Thus, we should decide based on other, more
selective factors first.


>> IMO, an ideal markup language for cf.data.pre (or its replacements)
>> would satisfy these draft high-level criteria:
>>
>> 1. Writer-friendly. Proper whitespace, indentation, and other
>> presentation features of the _rendered_ output are the responsibility of
>> renderers, not content writers. Decent _sources_ formatting should be
>> automatically handled by popular modern text editors that developers
>> already use. No torturing humans with counting tags or brackets.

> This nullifies the argument that XML is torturous. Good editing tools
> can handle XML easily.

Good editors can close the current tag for you, but closing tags and
dealing with all the other machine noise is still tedious for most
humans. XML is just not designed to be friendly to human writers (and
readers!). It is like JSON: Yes, one can edit JSON by hand, especially
with a good editor, but that does not make it human-friendly. Both
formats are meant for exchanging information between programs.


>> 2. Expressive enough to define all the squid.conf concepts that we want
>> to keep/support, so that they can be rendered beautifully without hacks.
>> For example, if we agree that those sections are a good idea, then this
>> item includes support for introduction sections that define no
>> configuration options themselves.

> What are you calling squid.conf concepts here?

Everything that may need to be referenced or rendered specially. For
example, directive names, directive parameter lists, individual
parameter documentation, parameter defaults, default parameter
documentation, configuration examples, C++ macro guards, AND prose
elements such as sections, paragraphs, lists, emphasized phrases,
verbatim text, hyperlinks, etc.
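
To make that list concrete, here is a rough Python sketch of a record that
a renderer might build for one directive. Every class and field name is
hypothetical; it only shows that the formal concepts (names, parameters,
defaults, macro guards) are structured data, while the prose elements can
stay as Markdown text inside the `doc` fields.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Parameter:
    """One directive parameter with its own documentation and default."""
    name: str
    doc: str = ""                 # Markdown prose
    default: str | None = None    # None means "no default"


@dataclass
class Directive:
    """Hypothetical in-memory model of one squid.conf directive."""
    name: str
    parameters: list[Parameter] = field(default_factory=list)
    doc: str = ""                 # Markdown prose (sections, lists, links)
    default_doc: str = ""         # prose describing the default behavior
    examples: list[str] = field(default_factory=list)  # verbatim snippets
    macro_guard: str | None = None  # C++ macro guard, if any

    def synopsis(self) -> str:
        """Render 'name <param1> <param2>' for headings and indexes."""
        return " ".join([self.name] + [f"<{p.name}>" for p in self.parameters])


if __name__ == "__main__":
    # "foo" and "size" are placeholders, not real squid.conf names.
    d = Directive("foo", [Parameter("size", default="10")])
    print(d.synopsis())
```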



>> 3. Supports documentation duplication avoidance so that we do not have
>> to duplicate a lot of text or refer the reader to directive X for
>> details of directive Y functionality.

> The XML idea supports that. I am not sure about SGML.

With XML (and many SGML DTDs), the question is not so much whether it is
_possible_ to support Foo or Bar, but how difficult that support is
going to be (for documentation writers, readers, and tool
developers/admins). I suspect that reusing XML snippets is going to
require custom tooling unless those snippets are isolated into
entities/macros. We can live with that isolation, but a more flexible
"foo.faz documentation is the same as bar.baz documentation (after
replacing baz with faz)" may work a lot better.
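
A small Python sketch of that "same documentation after replacing baz with
faz" idea, assuming documentation blocks are kept in a plain dictionary
keyed by a name such as `bar.baz`. The function name and the storage
layout are made up for illustration; no existing tool works this way.

```python
from __future__ import annotations

import re


def derive_doc(docs: dict[str, str], source: str,
               replacements: dict[str, str]) -> str:
    """Reuse docs[source], replacing whole words per `replacements`."""
    text = docs[source]
    for old, new in replacements.items():
        # \b keeps "baz" from matching inside longer words like "bazaar".
        # The lambda avoids interpreting backslashes in `new`.
        text = re.sub(rf"\b{re.escape(old)}\b", lambda _m, new=new: new, text)
    return text


if __name__ == "__main__":
    docs = {"bar.baz": "The baz option limits bazaar entries; baz is optional."}
    docs["foo.faz"] = derive_doc(docs, "bar.baz", {"baz": "faz"})
    print(docs["foo.faz"])
```

The writer states the relationship once and the tool produces the second
text, so the two descriptions cannot drift apart.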

N.B. If by "SGML" you mean Linuxdoc DTD, then I am not sure whether it
supports quoting. SGML itself, being a meta-language (compared to XML),
can "support" anything XML can support and a lot more.


> None of the other text syntaxes I'm aware of have nice writer-friendly
> referencing. The YAML-like one we currently have is a case in point.

I am not aware of _nice_ referencing in XML either. FWIW, Markdown
referencing is OK. Certainly not nice, just OK. We have no referencing
today in cf.data.pre AFAIK.

Please note that referencing and quoting/reusing content are different
beasts: Item 3 is about the latter (which is more difficult to find good
support for compared to the more prevailing referencing).


>> 6. Git-friendly: Adding two new unrelated directives does not lead to
>> conflicting pull requests.
> 
> This is 

Re: [squid-dev] squid.conf future

2020-02-24 Thread Amos Jeffries
On 25/02/20 6:11 am, Alex Rousskov wrote:
> On 2/24/20 3:11 AM, Amos Jeffries wrote:
> 
>> While doing some polish to cf_gen tool (PR #558) I am faced with some
>> large code edits to get that tool any more compliant with our current
>> guidelines. With that comes the question of whether that more detailed
>> work is worth doing at all ...
> 
> Probably not. Even PR #558 changes might be going a bit too far (or not
> far enough). Ideally, we should agree on key code cleanup principles
> before doing such cleanup, to minimize tensions in every such PR.
> Cleanup for the sake of cleanup should be done under a general
> agreement/consent rather than ad-hoc. I am working on the corresponding
> suggestions but need another week or so to post a specific proposal.
> 
> 
>> For the future I am considering a switch of cf.data.pre to a format like
>> SGML or XML which we can better generate the website contents from.
> 
> I do support fixing cf.data.pre-related issues -- they are a well-known
> constant (albeit moderate) pain for developers and users alike. However,
> using writer-unfriendly formats such as XML is not the best solution
> IMO. SGML may be a good fit, but that concept covers such a wide variety
> of languages that it is difficult to say anything specific about it in
> this context (e.g., both raw XML and wiki-like markups can be valid
> SGML!). If you meant something specific by "SGML", please clarify.

Exactly. We have the Linuxdoc toolchain already used for release notes
etc. So long as we have a simple set of rules about the markup used for
bits that cf_gen needs to pull out for code generation, we can use any of
the more powerful markup in the documentation comment parts.


> 
> Automated rendering of squid.conf sources, including web site content
> generation, should be straightforward with any good source format,
> including writer-friendly formats. Thus, web site generation is not an
> important deciding criterion here AFAICT.

It is an existing use-case for documentation output we need to maintain.
We can still decide to forgo adding nice-to-have outputs that do not exist.


> 
> IMO, an ideal markup language for cf.data.pre (or its replacements)
> would satisfy these draft high-level criteria:
> 
> 1. Writer-friendly. Proper whitespace, indentation, and other
> presentation features of the _rendered_ output are the responsibility of
> renderers, not content writers. Decent _sources_ formatting should be
> automatically handled by popular modern text editors that developers
> already use. No torturing humans with counting tags or brackets.

This nullifies the argument that XML is torturous. Good editing tools
can handle XML easily.

For writers dealing with the tags directly, a simple SGML markup is
better, but not by a huge amount.


> 
> 2. Expressive enough to define all the squid.conf concepts that we want
> to keep/support, so that they can be rendered beautifully without hacks.
> For example, if we agree that those sections are a good idea, then this
> item includes support for introduction sections that define no
> configuration options themselves.

What are you calling squid.conf concepts here?


> 
> 3. Supports documentation duplication avoidance so that we do not have
> to duplicate a lot of text or refer the reader to directive X for
> details of directive Y functionality.
> 

The XML idea supports that. I am not sure about SGML.

None of the other text syntaxes I'm aware of have nice writer-friendly
referencing. The YAML-like one we currently have is a case in point.



> 4. Allows for automated validation of internal cross-references (and
> possibly other internal concepts that can be validated). Specification
> of these cross-references is covered by item 2.
> 
> 5. Allows for automated spellchecking without dangerous exceptions.
> 

Any syntax we choose with good tooling should support that. If not, the
requirement to translate between formats will at least involve moving
the text parts into a format that can be spell-checked (HTML).


> 6. Git-friendly: Adding two new unrelated directives does not lead to
> conflicting pull requests.

This is unrealistic so long as the source code remains in one file. Only
edits to independent files are guaranteed not to conflict.
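
For comparison only, a Python sketch of the independent-files alternative:
one file per directive, concatenated in sorted order into a single stream
for the existing tooling. The directory layout, the `.md` suffix, and the
function name are assumptions made for illustration, not a proposal that
matches any current build step.

```python
from pathlib import Path


def assemble(directory: Path, suffix: str = ".md") -> str:
    """Join per-directive files, sorted by path, into one document."""
    parts = [
        path.read_text(encoding="utf-8").rstrip("\n")
        for path in sorted(directory.glob(f"*{suffix}"))
    ]
    # A blank line between directives keeps the blocks visually separate.
    return "\n\n".join(parts) + "\n"
```

With such a layout, two pull requests that each add a new directive touch
different files and therefore cannot conflict textually.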

What I am considering is a change to the internal syntax within
cf.data.pre. At most a filename/extension change to match. It remains a
source code file like any other.


> 
> 7. Either already well-known or easy to learn by example (as far as
> major used concepts are concerned).
> 

AFAIK, that effectively means SGML or XML.


> 8. Can be easily parsed using programming languages that our renderers
> are (going to be) written in (e.g., using existing parser libraries). We
> should probably discuss whether these renderers should be (re)written in
> some specific languages.

This is where XML has the advantage over wider SGML. Both are
parseable, but XML end-tags and libxml make it a bit simpler for the
cf_gen 

Re: [squid-dev] squid.conf future

2020-02-24 Thread Alex Rousskov
On 2/24/20 3:11 AM, Amos Jeffries wrote:

> While doing some polish to cf_gen tool (PR #558) I am faced with some
> large code edits to get that tool any more compliant with our current
> guidelines. With that comes the question of whether that more detailed
> work is worth doing at all ...

Probably not. Even PR #558 changes might be going a bit too far (or not
far enough). Ideally, we should agree on key code cleanup principles
before doing such cleanup, to minimize tensions in every such PR.
Cleanup for the sake of cleanup should be done under a general
agreement/consent rather than ad-hoc. I am working on the corresponding
suggestions but need another week or so to post a specific proposal.


> For the future I am considering a switch of cf.data.pre to a format like
> SGML or XML which we can better generate the website contents from.

I do support fixing cf.data.pre-related issues -- they are a well-known
constant (albeit moderate) pain for developers and users alike. However,
using writer-unfriendly formats such as XML is not the best solution
IMO. SGML may be a good fit, but that concept covers such a wide variety
of languages that it is difficult to say anything specific about it in
this context (e.g., both raw XML and wiki-like markups can be valid
SGML!). If you meant something specific by "SGML", please clarify.

Automated rendering of squid.conf sources, including web site content
generation, should be straightforward with any good source format,
including writer-friendly formats. Thus, web site generation is not an
important deciding criterion here AFAICT.

IMO, an ideal markup language for cf.data.pre (or its replacements)
would satisfy these draft high-level criteria:

1. Writer-friendly. Proper whitespace, indentation, and other
presentation features of the _rendered_ output are the responsibility of
renderers, not content writers. Decent _sources_ formatting should be
automatically handled by popular modern text editors that developers
already use. No torturing humans with counting tags or brackets.

2. Expressive enough to define all the squid.conf concepts that we want
to keep/support, so that they can be rendered beautifully without hacks.
For example, if we agree that those sections are a good idea, then this
item includes support for introduction sections that define no
configuration options themselves.

3. Supports documentation duplication avoidance so that we do not have
to duplicate a lot of text or refer the reader to directive X for
details of directive Y functionality.

4. Allows for automated validation of internal cross-references (and
possibly other internal concepts that can be validated). Specification
of these cross-references is covered by item 2.

5. Allows for automated spellchecking without dangerous exceptions.

6. Git-friendly: Adding two new unrelated directives does not lead to
conflicting pull requests.

7. Either already well-known or easy to learn by example (as far as
major used concepts are concerned).

8. Can be easily parsed using programming languages that our renderers
are (going to be) written in (e.g., using existing parser libraries). We
should probably discuss whether these renderers should be (re)written in
some specific languages.

9. Translation-friendly. (I do not know what that entails, but I am sure
that others can detail this requirement.)

It is unlikely that we can find a language that fully satisfies all the
criteria, but I hope that we can come close. It is not a new/unusual
problem. Let's not rush into rewrites until we agree on this.


> The main point in favour of these is that we already have infrastructure
> in place for ESI and release notes. It would be less work to re-use one
> of those than integrate a new library or tooling for some other format.

Reusing existing infrastructure is a nice bonus, of course, but I think
that any major format rework should be focusing on optimizing for the
long-term. Any infrastructure changes required to render static content
on a web site seem relatively small to me. (And does not ESI support
injection of any content, not just XML-based?)


Thank you,

Alex.


[squid-dev] squid.conf future

2020-02-24 Thread Amos Jeffries
Hi all,

While doing some polish to cf_gen tool (PR #558) I am faced with some
large code edits to get that tool any more compliant with our current
guidelines. With that comes the question of whether that more detailed
work is worth doing at all ...


For the future I am considering a switch of cf.data.pre to a format like
SGML or XML which we can better generate the website contents from.

The main point in favour of these is that we already have infrastructure
in place for ESI and release notes. It would be less work to re-use one
of those than integrate a new library or tooling for some other format.


Amos