Hi again!

Lennart, you nerdsniped me.

Dixi quod…
>Lennart Jablonka dixit:
>>> I’m not sure whether it may then also self-close all tags but would
>>> assume so (except I know tech is… tricky).
>>
>> As in an XML document, <asdf/> and <asdf></asdf> are entirely equivalent,
>> yes, the server may then “self-close” all empty elements.
>
>That’s what made me say I’d assume so, but I know tech, which is
>why I hesitate.

I found hints that the empty, not self-closed tags are still
required even in XML, but I forgot where I saw them during the
subsequent hacking, which took m̲u̲c̲h̲ longer than expected.

But here is that hacking’s result. Find attached an LD_PRELOAD library
that makes “xmlstarlet fo”, without -o (because it then uses yet other
libxml2 function calls), output XHTML ☻

Prepare:

$ sudo apt-get install libxml2-dev

Compile and link:

$ gcc -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fstack-protector-strong \
      -Wformat -Werror=format-security -Wall -Wextra \
      $(xml2-config --cflags) -DPIC -fPIC -shared -o libforceXHTML.so \
      forceXHTML.c

Use:

$ LD_PRELOAD=$PWD/libforceXHTML.so xmlstarlet fo [-n] [-e encoding] filename|-

C̲a̲v̲e̲a̲t̲:̲ without -n (that is, when indenting) it breaks up the
“old browser-safe” framing for CSS and JS:

 <style type="text/css"><!--/*--><![CDATA[/*><!--*/
  …
 /*]]>*/--></style>
 <script type="text/javascript"><!--//--><![CDATA[//><!--
  …
 //--><!]]></script>

This is because in XML, the <!--/*--> or <!--//--> is a
comment node inside the style/script node (as is correct),
and libxml2’s “XHTML” output code writes a newline after
each node if indenting. xhtmlNodeListDumpOutput() is
static, so it cannot be replaced with LD_PRELOAD hacks.
But the OP was not formatting/indenting their XML anyway,
so this strikes me as a suitable postprocessing step. I did
verify, for one static XHTML file, that it properly adds
the spaces (as in <br />) and does not self-close the
other empty elements.
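
To illustrate what “adds spaces and not-self-closes” means, here is
a minimal sketch of the XHTML 1.0 compatibility rule for empty
elements. It is not libxml2’s code, and I am only assuming the XHTML
output mode follows the same rule; the helper names
is_void_element() and format_empty_element() are made up for this
example. The element list is the set declared EMPTY in the XHTML 1.0
DTDs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* elements declared EMPTY in the XHTML 1.0 DTDs */
static const char *const void_elements[] = {
        "area", "base", "basefont", "br", "col", "frame", "hr",
        "img", "input", "isindex", "link", "meta", "param"
};

/* hypothetical helper: 1 if name is a void element, 0 otherwise */
int
is_void_element(const char *name)
{
        size_t i;

        assert(name != NULL);
        for (i = 0; i < sizeof(void_elements) / sizeof(void_elements[0]); ++i)
                if (strcmp(name, void_elements[i]) == 0)
                        return (1);
        return (0);
}

/*
 * hypothetical helper: write an element without attributes or
 * children into buf; the return value is that of snprintf()
 */
int
format_empty_element(char *buf, size_t len, const char *name)
{
        /* void elements are minimised, with a space before the slash */
        if (is_void_element(name))
                return (snprintf(buf, len, "<%s />", name));
        /* everything else keeps an explicit end tag */
        return (snprintf(buf, len, "<%s></%s>", name, name));
}
```

With a sufficiently large buffer, format_empty_element(buf,
sizeof(buf), "br") writes “<br />”, whereas "script" gives
“<script></script>”.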

This was initially very mildly based on libxml2 itself,
whose public API sucks badly enough that I had to redraft
it from the beginning. (This is the reason it took so long.)
I publish this under Ⓕ CC0.

Enjoy,
//mirabilos
PS: Shlomi Fish, when replying to me, please send to the list,
    as your provider fails badly enough at SMTP that it cannot
    send eMails directly to me :/
-- 
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh
/* sudo apt-get install libxml2-dev */
/*
 * gcc -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fstack-protector-strong \
 *     -Wformat -Werror=format-security -Wall -Wextra \
 *     $(xml2-config --cflags) -DPIC -fPIC -shared -o libforceXHTML.so \
 *     forceXHTML.c
 */
/* LD_PRELOAD=$PWD/libforceXHTML.so xmlstarlet fo [-n] [-e encoding] filename|- */
/* do not use the xmlstarlet fo -o option! */

#include <stddef.h>
#include <stdio.h>
/* the include path comes from $(xml2-config --cflags) */
#include <libxml/tree.h>
#include <libxml/xmlsave.h>

static int
rpl_xmlSaveFormatFileEnc(const char *filename, xmlDocPtr cur,
    const char *encoding)
{
        xmlSaveCtxtPtr ctx;
        int i;

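        /* a NULL document cannot be saved */
        if (!cur)
                return (-1);

        /* fall back to the encoding recorded in the document */
        if (!encoding)
                encoding = (const void *)cur->encoding;

        /* always serialise as XHTML; indent only if the global says so */
        i = XML_SAVE_XHTML;
        if (xmlIndentTreeOutput)
                i |= XML_SAVE_FORMAT;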

        ctx = xmlSaveToFilename(filename, encoding, i);
        if (!ctx)
                return (-1);
        i = xmlSaveDoc(ctx, cur);
        /* fucking libxml2 API gives us no way to check close() errors */
        if (i >= 0)
                i = xmlSaveClose(ctx);
        else
                xmlSaveClose(ctx);
        fprintf(stderr, "\nI: forceXHTML save %s\n",
            i < 0 ? "failed" : "ok");
        return (i);
}

int
xmlSaveFormatFileEnc(const char *filename, xmlDocPtr cur,
    const char *encoding, int format __attribute__((__unused__)))
{
        return (rpl_xmlSaveFormatFileEnc(filename, cur, encoding));
}

int
xmlSaveFormatFile(const char *filename, xmlDocPtr cur,
    int format __attribute__((__unused__)))
{
        return (rpl_xmlSaveFormatFileEnc(filename, cur, NULL));
}
