It seems the odd characters do in fact have a valid UTF-8 representation,
but for some reason they have been encoded incorrectly in these files. I
was able to fix them as follows:

iconv -c -f utf-8 -t utf-8 /var/lib/dpkg/status > /tmp/status.fixed
iconv -c -f utf-8 -t utf-8 /var/lib/dpkg/available > /tmp/available.fixed
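
To illustrate what the -c option does in these commands, here is a small
sketch on a made-up input string. The helper name strip_invalid_utf8 is
hypothetical, and the sample bytes are only an assumption about what a
broken maintainer field might look like, not data from the actual files:

```shell
# Illustration only: strip_invalid_utf8 is a hypothetical helper name.
strip_invalid_utf8() {
    # -c asks iconv to drop input it cannot convert instead of stopping.
    # iconv may still exit non-zero after dropping bytes, so ignore the status.
    iconv -c -f utf-8 -t utf-8 || true
}

# A lone 0xC3 byte (octal 303) is not valid UTF-8 and gets dropped;
# the pair 0xC3 0xB6 (octal 303 266) is a valid "ö" and is kept.
printf 'No\303l K\303\266the\n' | strip_invalid_utf8
```

Note that this only removes invalid bytes; it does not restore the
character that was originally intended.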

Now you still have to replace the originals with the fixed copies. In my
case, there were about 100 offending packages:

hwdata ("Noël Köthe" -> "Noël Köthe")
shared-mime-info ("Sebastian Dröge" -> "Sebastian Dröge")
glines
...
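
The replacement step mentioned above could look roughly like this. It is
only a sketch: replace_with_fixed is a hypothetical helper, and the .bak
suffix for the backup is my own choice, not something dpkg expects:

```shell
#!/bin/sh
# Sketch only: replace_with_fixed is a hypothetical helper, not part of dpkg.
# Usage: replace_with_fixed ORIGINAL FIXED_COPY
replace_with_fixed() {
    orig=$1
    fixed=$2
    # Refuse to continue if the fixed copy is missing or empty.
    [ -s "$fixed" ] || { echo "no usable fixed copy: $fixed" >&2; return 1; }
    # Show which lines changed; diff exits 1 when the files differ,
    # so its exit status is deliberately ignored.
    diff "$orig" "$fixed" || true
    # Keep a backup of the original, then install the fixed copy.
    cp -p "$orig" "$orig.bak" && cp "$fixed" "$orig"
}

# Example (run as root, using the paths from the iconv commands):
# replace_with_fixed /var/lib/dpkg/status /tmp/status.fixed
# replace_with_fixed /var/lib/dpkg/available /tmp/available.fixed
```

The diff output also gives a quick way to see which package entries were
affected before anything is overwritten.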

I have the impression there is a structural root cause for this; it is
not just a matter of one rare, obscure package with a rogue character.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1053749

Title:
  UnicodeDecodeError from broken package descriptions

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/1053749/+subscriptions
