On Thu, May 25, 2023 at 5:35 PM Thorsten Glaser
<2020...@bugs.launchpad.net> wrote:
>
> I doubt this is a bug: nowhere do you pass the validator a DTD, and
> entities are defined in the DTD.
>
> It’s best practice nowadays to not use entities but just write the UTF-8
> characters directly.
>
> An em dash surrounded by hair spaces is: “ — ” (for your copy/paste
> convenience)

I think you are right - this is not a bug. I took a quick peek at RFC
3470, and I don't see where HTML entity references are optional.

Sorry about that. I got some bad info off the internet (surprise,
surprise). It said to use the character entity reference for emdash
due to portability problems when using the character itself.
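
For anyone who hits the same error, here is a minimal sketch of one workaround: rewriting the named reference as the numeric character reference &#8212; (U+2014), which an XML parser resolves without any DTD. The function name mdash_to_numeric is hypothetical and not part of the build script in the bug description.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): rewrite &mdash; as the
# numeric character reference &#8212; (U+2014), which needs no DTD.
mdash_to_numeric() {
    # In the sed replacement, "\&" is a literal ampersand; a bare "&"
    # would re-insert the matched text instead.
    sed 's/&mdash;/\&#8212;/g'
}

# Example: filter one line of text through the helper.
printf '%s\n' 'two output devices &mdash; the printer' | mdash_to_numeric
```

The filter reads standard input and writes standard output, so it could be applied per file with the same write-to-temporary-then-mv pattern the formatting loop already uses. Writing the UTF-8 character directly, as suggested above, avoids the need for any rewriting.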

Jeff

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to libxml2 in Ubuntu.
https://bugs.launchpad.net/bugs/2020814

Title:
  xmllint does not recognize emdash (&mdash;)

Status in libxml2 package in Ubuntu:
  New

Bug description:
  I'm using Ubuntu 22.04.2 LTS, x86_64, fully patched. I'm using DocBook to
  build a PDF. One of the steps in my build script is to validate and format
  the XML using xmllint from libxml2-utils 2.9.13+dfsg-1ubuntu0.3:

      echo "Validating book..."
      if ! xmllint --xinclude --noout --postvalid book.xml
      then
          echo "Validation failed. Exiting."
          exit 1
      fi
      echo "Complete."

      echo "Formatting source code..."
      for file in *.xml
      do
          if xmllint --format "${file}" --output "${file}.format"
          then
              mv "${file}.format" "${file}"
          fi
      done
      echo "Complete."

  When I added an emdash (&mdash;) the book failed to format:

      Validating book...
      Complete.
      Formatting source code...
      ch02.xml:58: parser error : Entity 'mdash' not defined
       injections are remediated using several methods. And two output devices &mdash;
                                                                                     ^
      ch02.xml:58: parser error : Entity 'mdash' not defined
       methods. And two output devices &mdash; the printer and plaintext email &mdash;
                                                                                     ^
      Complete.

  The text is:

      <para>... And two output devices &mdash; the printer and plaintext email &mdash; do not require...</para>

  It seems like emdash should be recognized.
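
  For reference, a minimal sketch of why the parser rejects the entity. XML predefines only five named entities (amp, lt, gt, apos, quot); any other name, mdash included, has to be declared in a DTD before it is used. The helper name try_parse and the bare <para> root element are illustrative assumptions, not part of the book's sources.

```shell
#!/bin/sh
# Hypothetical helper: read XML on stdin and report whether xmllint parses it.
try_parse() {
    tmp=$(mktemp) || return 2
    cat > "$tmp"
    if xmllint --noout "$tmp" 2>/dev/null
    then
        echo "parsed"
    else
        echo "failed"
    fi
    rm -f "$tmp"
}

# No DTD: the parser has no declaration for &mdash;.
try_parse <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<para>two output devices &mdash; the printer</para>
EOF

# Same text, with the entity declared in an internal DTD subset.
try_parse <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE para [
  <!ENTITY mdash "&#8212;">
]>
<para>two output devices &mdash; the printer</para>
EOF
```

  The first document should be reported as failed and the second as parsed. A DocBook source that references the DocBook DTD would normally pick up mdash from the entity sets that DTD pulls in, provided the parser is asked to load it.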

  -----

  $ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 22.04.2 LTS
  Release:        22.04
  Codename:       jammy

  -----

  $ xmllint --version
  xmllint: using libxml version 20913
     compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma

  $ command -v xmllint
  /usr/bin/xmllint

  $ dpkg -S /usr/bin/xmllint
  libxml2-utils: /usr/bin/xmllint

  $ apt-cache show libxml2-utils
  Package: libxml2-utils
  Architecture: amd64
  Version: 2.9.13+dfsg-1ubuntu0.3
  Multi-Arch: foreign
  Priority: optional
  Section: text
  Source: libxml2
  Origin: Ubuntu
  Maintainer: Ubuntu Developers <ubuntu-devel-disc...@lists.ubuntu.com>
  Original-Maintainer: Debian XML/SGML Group <debian-xml-sgml-p...@lists.alioth.debian.org>
  Bugs: https://bugs.launchpad.net/ubuntu/+filebug
  Installed-Size: 202
  Depends: libc6 (>= 2.34), libxml2 (>= 2.9.0)
  Filename: pool/main/libx/libxml2/libxml2-utils_2.9.13+dfsg-1ubuntu0.3_amd64.deb
  Size: 40192
  MD5sum: 3ca7de07562010fcaabf255ea8fea9c4
  SHA1: 128a9cfaff49e85f2ab08578f389eecb21f17766
  SHA256: c279c07caf909545e2cedb7845b5ac652e0a70f9784e5faf799a1a01441b4649
  SHA512: 51600d7206c9a5568fdaeee9adddbc48962fc094cc479f4bd42c0714b3725cd3200937f8c876a897db08fc50d891005d7dbabfb6ae12ad27ad6ed416f8b6a03d
  Homepage: http://xmlsoft.org
  ...

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libxml2/+bug/2020814/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
