Re: Rewriting Plaintext/Info converter in C?

Patrice Dumas Thu, 25 Jun 2026 13:34:21 -0700

On Wed, Jun 24, 2026 at 07:38:33PM +0100, Gavin Smith wrote:
> On Tue, Jun 23, 2026 at 11:51:20PM +0200, Patrice Dumas wrote:
> > > I looked at HTML.pm to see if I could see how this module was implemented
> > > in C as part of the program.  I noticed there were two modules, one called
> > > HTML.pm, the other called HTMLNonXS.pm.
> > 
> > HTMLNonXS.pm has all the functions that have an XS interface and
> > are not needed when the XS interface is loaded
> > (tta/perl/XSTexinfo/convert/ConvertHTMLXS.xs).  HTML.pm has the
> > functions that do not have an XS interface and, in practice, need to
> > be defined even when the XS interface is loaded.  There is no other
> > logic behind that organization; one should consider that the Perl
> > module is the union of the two files.  This organization makes
> > managing the XS interface much easier, and can sometimes help in
> > deciding whether some functions are better in the XS interface or not.
> 
> What I still don't understand is why most of HTML.pm needs to be loaded
> even if the XS modules are used.  HTML.pm has, for example,
> _convert_email_command, which is part of the conversion code.  This should
> not be used if the XS modules are used, so could be described as "non-XS"
> code.  If I understand correctly, _convert_email_command is not in
> HTMLNonXS.pm because it does not directly have an XSUB defined in
> ConvertHTMLXS.xs.


The case of the HTML converter is special, because user-defined code in
customization files can call many functions that are not overridden.
In particular, most functions that take an element as an argument
cannot have an XS override, because we cannot find the C tree element
based on the Perl tree element (the exceptions are nodes, sections,
commands with index entries, and global commands).  In general, I tried
to have XS overrides as much as possible, as it simplifies the code by
removing the need to resynchronize the Perl data, but there are still
many functions that remain needed in Perl.  In particular, convert_tree
called from user-defined code is pure Perl, which means that _convert
needs to be there too (though there are XS interfaces to change the
conversion state).
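
To illustrate the constraint on element arguments, here is a toy sketch
in C.  It is not the actual Texinfo code: the TOY_* types, toy_register
and toy_find are all hypothetical names made up for this example.  The
assumption is that only some kinds of elements are recorded in a table
when the tree is built, so only those can be found again in C from a
handle kept on the Perl side; any other element has no handle, and a
function receiving it has to stay in Perl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element kinds, not the real Texinfo types. */
enum toy_kind {
    TOY_NODE, TOY_SECTION, TOY_INDEX_COMMAND, TOY_GLOBAL_COMMAND,
    TOY_OTHER                      /* any element that is not tracked */
};

typedef struct {
    enum toy_kind kind;
    const char *text;
} TOY_ELEMENT;

#define TOY_MAX_REGISTERED 16

/* C elements that can be found again from a handle. */
static const TOY_ELEMENT *toy_registered[TOY_MAX_REGISTERED];
static size_t toy_registered_nr = 0;

/* Record an element and return a 1-based handle, or 0 if elements of
   this kind are not tracked. */
size_t
toy_register (const TOY_ELEMENT *e)
{
    if (e->kind == TOY_OTHER)
        return 0;
    assert (toy_registered_nr < TOY_MAX_REGISTERED);
    toy_registered[toy_registered_nr++] = e;
    return toy_registered_nr;
}

/* Find the C element for a handle; NULL means there is no C element to
   work on, so the caller would need the pure Perl code path. */
const TOY_ELEMENT *
toy_find (size_t handle)
{
    if (handle == 0 || handle > toy_registered_nr)
        return NULL;
    return toy_registered[handle - 1];
}
```

With this layout, an override taking a node could call toy_find on the
handle and work on the C data, while a function taking an arbitrary
element would get NULL and could not be overridden.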

> If this split is useful, then there might be a better naming convention
> or structure.  Maybe a fourfold split, like HTML.pm / HTMLPublic.pm /
> HTMLNonXS.pm / HTMLPublicNonXS.pm.
> 
> HTMLNonXS.pm could contain the pure Perl conversion code like
> _convert_email_command as mentioned, while HTMLPublicNonXS.pm would
> contain the functions for use by other parts of the code and the API,
> as HTMLNonXS.pm currently is.  HTML.pm and/or HTMLPublic.pm would load
> the HTML XS modules or the non-XS modules as required.
>
> (Another idea: HTMLPrivate.pm / HTML.pm / HTMLPrivateNonXS.pm / HTMLNonXS.pm.
> HTML.pm would be the "public" module.)
> 
> Does that make sense?

Would they all be in the same Texinfo::Convert::HTML package?

Maybe something that could be relevant is how I structured the C code:

html_converter_init_options.c: setup converter defaults.
html_prepare_converter.c: initializations independent of anything,
    converter initialization, setting directions, targets and filenames,
    htmlxref and css files, document units, applying customization,
    registering conversion functions, conversion initialization,
    units initialization.
convert_html.c: string translation and tree conversion, last
    preparations of conversion and conversion functions
    (html_convert_tree_append, convert_output_unit,
    html_convert_output), conversion finalization.
html_conversion_state.c: manage conversion state and CSS element
    selection.
format_html.c: target, links, href and root command text formatting,
    formatting format_* functions and elements formatting
    plus html_output_internal_links.
html_converter_finish.c: free converter.
html_converter_api.c: some "high-level" functions corresponding to the
    Perl API not already defined elsewhere.  (called from C only,
    I think, not from Perl/XS).

-- 
Pat
