Public bug reported:

In the past I ran HTML pages through HTML Tidy provided by MacPorts on
OS X. I'm now working on an Ubuntu 16.10/Yakkety system, and its HTML
Tidy is doing an awful job on the pages. When I diff the pages, nearly
everything has changed.

For example, Ubuntu's HTML Tidy is not indenting, it's adding extra
characters, and it's stripping whitespace that should remain. Others have
experienced the problem, too:
https://stackoverflow.com/questions/24505764/html-tidy-stripping-space-at-the-start

Please update to a more recent version of HTML Tidy.

**********

The nice thing about this report is that the pages and the script are
located at https://github.com/weidai11/website. You can reproduce the
problem with the following. You don't even need to make a change. Just
run `cleanup.sh` and diff the result.


   git clone https://github.com/weidai11/website
   cd website
   ./cleanup.sh
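
To put a number on "nearly everything has changed", a minimal sketch
like the following could be used. `changed_lines` is a hypothetical
helper, not something in the repository. It only assumes a saved copy of
a page from before the run and the same page after it.

```shell
# Hypothetical helper (not part of cleanup.sh): count the lines that
# differ between two versions of a file.
changed_lines() {
    # $1: copy saved before the run, $2: file after the run.
    # diff marks removed lines with "<" and added lines with ">";
    # grep -c counts them. "|| true" keeps a zero count from being
    # treated as a failure.
    diff "$1" "$2" | grep -c '^[<>]' || true
}
```

Inside the clone, `git diff --stat` gives a similar per-file summary
without saving copies first.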

HTML Tidy is invoked with the following in the script:

   # Cleanup HTML files
   for file in *.html
   do
      echo "**************** $file ****************"

      echo "tidy: processing file $file..."
      "$HTML_TIDY" --quiet yes --output-bom no --indent auto --wrap 90 -m "$file"

      echo "sed: processing file $file..."

      # Delete trailing whitespace
      "$SED" "${SED_OPTS[@]}" -e's/[[:space:]]*$//' "$file"

      # Delete the generator markup tag
      "$SED" "${SED_OPTS[@]}" -e'/<meta name="generator"/d' "$file"

      # Fix CRLF endings after sed
      unix2dos "$file"
   done
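
Since the loop also runs sed and unix2dos, it can help to look at Tidy's
output on its own. The following is a sketch under assumptions, not part
of `cleanup.sh`: `tidy_only` is a hypothetical name, and it reuses the
options above minus `-m`, so the input file is left untouched and the
result goes to a second file.

```shell
# Hypothetical helper (not part of cleanup.sh): run only the HTML Tidy
# step and write the result to a separate file.
tidy_only() {
    # $1: input HTML file, $2: where to write Tidy's output.
    # Without -m, Tidy writes the cleaned markup to stdout.
    # Tidy exits with 1 when it only issued warnings, so accept that too.
    "${HTML_TIDY:-tidy}" --quiet yes --output-bom no --indent auto --wrap 90 "$1" > "$2" || [ $? -eq 1 ]
}
```

For example, `tidy_only index.html /tmp/index.tidy.html` followed by
`diff index.html /tmp/index.tidy.html` shows only what Tidy changed
(`index.html` is a placeholder for any page in the repository).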


**********

   $ lsb_release -a
   No LSB modules are available.
   Distributor ID: Ubuntu
   Description:    Ubuntu 16.10
   Release:        16.10
   Codename:       yakkety

**********

$ apt-cache show tidy
Package: tidy
Priority: optional
Section: universe/web
Installed-Size: 83
Maintainer: Ubuntu Developers <[email protected]>
Original-Maintainer: Jason Thomas <[email protected]>
Architecture: amd64
Source: tidy-html5
Version: 1:5.2.0-2
Depends: libc6 (>= 2.14), libtidy5 (= 1:5.2.0-2)
Filename: pool/universe/t/tidy-html5/tidy_5.2.0-2_amd64.deb
Size: 25524
MD5sum: 06fda2013e8edb31fbc37fb2bb407e5c
SHA1: 58b1b60cd8bc2a084d78d374b24fefd24acc7783
SHA256: 6c9492519b78c37f3ac97c88237b7832e4f50d3eb303364e5afb44ecbe0ed548
...

** Affects: tidy (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1660537

Title:
  HTML Tidy is doing a poor job; please update to newer HTML Tidy
