On Sat, Sep 13, 2025 at 11:28:56PM +0100, Gavin Smith wrote:
> On Sat, Sep 13, 2025 at 05:48:30PM +0100, Gavin Smith wrote:
> > * The language name mapping is extremely rudimentary:
> > 
> > my %highlight_type_languages_name_mappings = (
> >   'source-highlight' => {
> >     'C++' => 'C',
> >     'Perl' => 'perl',
> >   },
> >   'highlight' => {
> >     'C++' => 'c++',
> >   },
> >   'pygments' => {
> >     'C++' => 'c++',
> >   }
> > );
> > 
> > Is this useful or necessary for us to maintain on a program-by-program
> > basis?
> 
> Here's what I propose about "language name mapping".  There are two
> possibilities:
> 
> * Basic: The argument on the @example line (or value of
> HIGHLIGHT_SYNTAX_DEFAULT_LANGUAGE) is used directly in the call to the
> syntax highlighting program.  This would require the user changing e.g.
> "@example C++" to "@example c++" or "@example C" - not a big deal at all.
> 
> * Advanced: If that is not enough: the user has to create their own
> wrapper script which could process the language names.

I would have preferred it to work out of the box, but I agree that a
single language name mapping that suits all users may not be possible.

> The user might need to create their own wrapper script handling language
> names, anyway.  In pygments, "lexers" (what we are calling language names) have
> anyway.  In pygments, "lexers" (what we are calling language names) have
> "options", of which there are many:
> 
> https://pygments.org/docs/lexers/#
> 
> If they want to provide different options for different languages, then this
> information would have to be in their wrapper script.
> 
> Generally, there could be many language-specific options that the user
> might want to provide and there is no point for us to try to provide defaults.

The default is:
pygmentize -f html -O noclasses=True

It is generic enough, I believe.  My feeling is that it covers most use
cases, although using a wrapper script would also be ok.
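
For illustration, here is a minimal sketch of what such a wrapper could
look like in Python.  The two tables, the php entry, and the calling
convention (language name as the only argument, code on standard input)
are assumptions of mine, not something texi2any defines.  Only the
pygmentize options -l, -f, -O and -P are taken as given.

```python
import subprocess
import sys

# Hypothetical mapping from names used on @example lines to pygments lexer names.
LEXER_NAMES = {"C++": "c++", "Perl": "perl"}

# Hypothetical per-language lexer options; the php entry is only an example.
LEXER_OPTIONS = {"php": ["startinline=True"]}


def build_command(language):
    """Return the pygmentize argument list for LANGUAGE."""
    # Unmapped names are passed through unchanged.
    lexer = LEXER_NAMES.get(language, language)
    command = ["pygmentize", "-f", "html", "-O", "noclasses=True", "-l", lexer]
    for option in LEXER_OPTIONS.get(lexer, []):
        command += ["-P", option]
    return command


def main(argv):
    """Run pygmentize on standard input and return its exit status."""
    if len(argv) != 2:
        print("usage: wrapper LANGUAGE < code", file=sys.stderr)
        return 2
    # stdin, stdout and stderr are inherited, so pygmentize's own error
    # messages reach the caller unchanged.
    return subprocess.run(build_command(argv[1])).returncode
```

Ending the file with sys.exit(main(sys.argv)) would turn it into a
script that passes on pygmentize's exit status.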

> This then has implications for the "checks on languages" - checking that
> the language is recognised by the highlighter program.  I have only just
> understood that highlight_syntax.pm does this (the 'highlight_setup' function
> was just a bunch of code I didn't really understand).  If the wrapper
> script does its own conversion of language names there can't be any
> error checking on the highlight_syntax.pm side.  It's up to the wrapper
> script to make sure it invokes the highlighting program with the correct
> language (lexer) name (and any other options are correct too).  Again, this
> does not seem like a problem.  We could capture any error output by
> HIGHLIGHT_SYNTAX (or HIGHLIGHT_SYNTAX_PROGRAM or similar variable) so
> that the reason for the error is apparent to the user, and not highlight
> the output if the highlighter program (or wrapper script) exits
> unsuccessfully.

That is already what is done if the HIGHLIGHT_SYNTAX value is none of
"highlight", "pygments" or "source-highlight".

In that case, the user has to do the language analysis herself and
reject or map languages.  That is ok for advanced use, but I would
have liked to avoid it for basic use.
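
To make "reject or map" concrete, the check on the wrapper side could
be as small as the following Python sketch.  The table contents, the
function names and the message text are hypothetical.  The only point
is that an unknown language gives a message on standard error and a
non-zero status.

```python
import sys

# Hypothetical table: names accepted on @example lines, mapped to the
# names the highlighting program expects.
KNOWN_LANGUAGES = {
    "C++": "c++",
    "c++": "c++",
    "C": "c",
    "Perl": "perl",
    "perl": "perl",
}


def resolve_language(name):
    """Return the highlighter's name for NAME, or None to reject it."""
    return KNOWN_LANGUAGES.get(name)


def check_language(name, err=sys.stderr):
    """Return 0 if NAME is accepted, otherwise print a message and return 1."""
    if resolve_language(name) is None:
        print(f"highlight wrapper: unknown language '{name}'", file=err)
        return 1
    return 0
```

A wrapper would call check_language() first and exit with its return
value when that value is non-zero, before invoking the highlighter.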

-- 
Pat
