Control: retitle -1 licensecheck: misparses utf8-encoded files by default

Quoting Ximin Luo (2017-07-05 18:00:28)
> licensecheck seems to generate bad output for unicode files such as:
>
> https://sources.debian.net/src/sagemath/7.6-2/sage/src/doc/ja/tutorial/tour_rings.rst
>
> An example command line is:
>
> $ licensecheck -l250 --deb-machine --merge-licenses
> src/doc/ja/tutorial/tour_rings.rst
>
> I get glyphs like <U+008D>ã<U+0081> suggesting that maybe it is
> getting utf-8-encoded twice.

Licensecheck reads data as Latin1 by default.

Explicitly tell licensecheck to use (or, more accurately, to first try)
utf8:

  licensecheck -l250 --deb-machine --merge-licenses --encoding utf8 tour_rings.rst

I agree that this is not optimal: nowadays licensecheck should use utf8
by default. I am just not quite certain how to go about that: whether it
is ok to simply switch, or whether I should make a minor or major
version bump when making such a change.

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
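
For illustration only, here is a minimal Python sketch of how glyphs like
the ones reported can arise when UTF-8 bytes are decoded as Latin1 and
then written out again as UTF-8. It does not reproduce licensecheck's
actual (Perl) code path; the function names and the sample character are
assumptions chosen for the example.

```python
# Hypothetical helpers, not part of licensecheck.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Latin1."""
    return text.encode("utf-8").decode("latin-1")


def repair(mojibake: str) -> str:
    """Undo the misreading: recover the original bytes, decode as UTF-8."""
    return mojibake.encode("latin-1").decode("utf-8")


# Sample: HIRAGANA LETTER KI (U+304D), whose UTF-8 bytes are E3 81 8D.
sample = "\u304d"

garbled = misread_as_latin1(sample)

# Each UTF-8 byte has become its own character: U+00E3 (a-tilde),
# U+0081 and U+008D (C1 control characters).
code_points = [f"U+{ord(ch):04X}" for ch in garbled]

# Writing the garbled string out as UTF-8 gives the "encoded twice" bytes.
double_encoded = garbled.encode("utf-8")

if __name__ == "__main__":
    print(code_points)
    print(double_encoded.hex(" "))
    print(repair(garbled) == sample)
```

The a-tilde, U+0081 and U+008D code points in the result are the same
ones that show up in the reported output, which is consistent with the
input being read as Latin1 rather than utf8.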

