On Mon, Aug 30, 2021 at 10:00:51AM -0700, Per Bothner wrote:
> > However, having a customization variable to
> > output only numerical entities would be ok to me, maybe something like
> >
> > USE_ONLY_NUMERICAL_ENTITY or NO_NAMED_ENTITY to avoid confusion with
> > USE_NUMERICAL_ENTITY.
>
> I think more valuable would an "XML_COMPATIBLE" variable.
> In addition to numeric entities, it would guarantee to close all tags.
> E.g. instead of <br> it would emit <br/> - which also works with
> most (all?) HTML parsers. And possibly other issues.

I don't see the point in adding customization variables to fine-tune
details like whether to use named entities or not. If named entities
are valid HTML, there's no problem. I believe the decimal entities
existed first, although hexadecimal entities would certainly be more
legible, especially for codepoints > 255.

>
> What I'm looking for is:
> (1) Be able to post-process html output with xml tools, such as xslt.
> (2) Generate valid epub3 ebooks.

These seem like valid goals, so I would be happy to see patches that
produce XML output, likely as an option.
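
To make concrete what such an option would have to do, here is a rough
sketch in Python rather than texi2any's Perl. The function names and the
short void-tag list are made up for illustration; nothing here is
existing texi2any code, and a real patch would emit the right form
directly instead of rewriting the output with regular expressions.

```python
import re
from html.entities import name2codepoint

# Illustrative subset only; HTML defines more void elements than these.
VOID_TAGS = ("br", "hr", "img")

# Entities predefined by XML itself, so they can stay as they are.
XML_PREDEFINED = ("amp", "lt", "gt", "quot", "apos")


def named_to_numeric(text, hexadecimal=False):
    """Replace named entities such as &eacute; with numeric references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            # Unknown name: leave it untouched rather than guess.
            return match.group(0)
        if hexadecimal:
            return f"&#x{codepoint:X};"
        return f"&#{codepoint};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


def close_void_tags(text):
    """Rewrite <br> as <br/> (and likewise for the other VOID_TAGS)."""
    pattern = r"<(%s)(\s[^<>]*?)?\s*/?>" % "|".join(VOID_TAGS)
    return re.sub(
        pattern,
        lambda m: f"<{m.group(1)}{m.group(2) or ''}/>",
        text,
    )
```

With this sketch, named_to_numeric("caf&eacute;") gives "caf&#233;",
or "caf&#xE9;" with hexadecimal=True, and close_void_tags("a<br>b")
gives "a<br/>b".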
