Hi,
On 8/15/25 1:53 PM, Adrian Bunk wrote:
On Fri, Aug 15, 2025 at 01:23:57PM +0200, Philipp Kern wrote:
On 8/15/25 10:33 AM, Adrian Bunk wrote:
bunk@wuiet:~$ wb info libarchive . hurd-i386
Traceback (most recent call last):
  File "/usr/local/bin/wb", line 435, in <module>
    main()
  File "/usr/local/bin/wb", line 81, in main
    do_command(args)
  File "/usr/local/bin/wb", line 278, in do_command
    info = get_wb_info(pkg, dist, arch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/wb", line 374, in get_wb_info
    for line in p.stdout:
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 1991:
invalid continuation byte
bunk@wuiet:~$
wanna-build=> SELECT * FROM packages_public WHERE package = 'libarchive' AND
architecture = 'hurd-i386';
ERROR: invalid byte sequence for encoding "UTF8": 0xc3 0x2d
wanna-build=>
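
The traceback above shows strict UTF-8 decoding of the subprocess output
failing on a single bad byte. A minimal sketch of a tolerant decode, in
case it helps; decode_tolerant is a hypothetical helper and not part of
the actual wb script, and it assumes the output is available as bytes:

```python
def decode_tolerant(raw: bytes) -> list[str]:
    """Decode output line by line, escaping bytes that are not valid UTF-8."""
    lines = []
    for chunk in raw.splitlines():
        # backslashreplace keeps each offending byte visible as \xNN
        # instead of raising UnicodeDecodeError.
        lines.append(chunk.decode("utf-8", errors="backslashreplace"))
    return lines
```

With this, a line containing 0xc3 followed by 0x2d would still be
printed, with the stray byte shown as \xc3.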
Thanks for reporting. old_failed was broken.
...
Thanks.
Is this a more widespread issue?
wanna-build=> SELECT * FROM packages_public;
ERROR: invalid byte sequence for encoding "UTF8": 0xa7
wanna-build=>
I narrowed it down to another row for hurd-i386 in sid, but I don't know
which row it is. I assume some email reply with invalid UTF-8 made it
into the table. It's hilariously complicated to figure out which row it
is because the postgres client just explodes when it sees invalid
content.
`\encoding sql_ascii' in the client makes it work.
If you can spot which row is at fault I'm happy to fix it. And maybe we
can recover the content this time and track back what happened.
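
Once the rows can be read as raw bytes (for example with the client
encoding set to sql_ascii), spotting the bad ones could look roughly
like the sketch below. find_invalid_utf8 and preview_latin1 are
hypothetical helper names, and how the (key, bytes) pairs get fetched
from the database is deliberately left open:

```python
def find_invalid_utf8(rows):
    """Yield (key, offset, bad_byte) for each value that is not valid UTF-8.

    rows: iterable of (key, raw_bytes) pairs.
    """
    for key, raw in rows:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            yield key, exc.start, raw[exc.start]


def preview_latin1(raw: bytes) -> str:
    # Latin-1 maps every byte to a code point, so this never raises.
    # Whether Latin-1 is the real original encoding is only a guess.
    return raw.decode("latin-1")
```

The Latin-1 preview is only meant for eyeballing what the content might
have been before deciding how to fix the row.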
Kind regards
Philipp Kern