On Fri, Mar 23, 2018 at 02:10:07PM +0100, Bill Allombert wrote:
> > Probably. Is the format of that file documented somewhere?
> This is a list of key/value pair in RFC822 style.
> See /usr/share/doc/popularity-contest/examples/bin/README.examples
> for the format of the Package line.

I have a few questions:

How is the package name separated from the integer fields? It does not
look like a fixed-width field:

  Package: abev-form-obhgepi-fpk-nav 0 0 0 2
  Package: abev-form-obhgepi-fpk-nav-egyeb 0 0 0 2

If it is space-separated instead: so far I have not seen a package name
that contains spaces, but is there a guarantee that package names never
contain spaces? Alternatively, should the parsing be done by splitting
on \s+ from the right, with a maximum of 4 splits?

Some package names seem to be truncated, like this one:

  Package: apache-openoffice-4.1.4-linux-x86-install-rpm-de 0 0 0 1

Is the character set guaranteed to be UTF-8, or should I parse the file
as binary and drop all lines that do not decode as UTF-8, or even all
lines that are not strictly 7-bit ASCII, like this one?

  Package: li37sp©y 0 0 0 1


Enrico

-- 
GPG key: 4096R/634F4BD1E7AD5568 2009-05-08 Enrico Zini <enr...@enricozini.org>
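
For concreteness, the right-split option raised in the questions above
could look roughly like the sketch below. It is only an illustration
under stated assumptions, not a description of how the file is actually
specified: it assumes every Package line ends in exactly four integer
fields, reads each line as bytes, and drops lines that do not decode as
UTF-8 (or, optionally, that are not pure ASCII). The function name
parse_package_line and the ascii_only flag are made up for this example.

```python
from typing import Optional, Tuple


def parse_package_line(
    raw: bytes, ascii_only: bool = False
) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Parse one 'Package:' line into (name, four integers).

    Returns None for lines that are rejected. Assumption: the line
    always ends with exactly four whitespace-separated integers.
    """
    try:
        line = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # drop lines that are not valid UTF-8
    if ascii_only and not line.isascii():
        return None  # stricter policy: drop anything beyond 7-bit ASCII

    prefix = "Package:"
    if not line.startswith(prefix):
        return None

    # Split on runs of whitespace from the right, at most 4 times, so
    # any spaces inside the package name stay in the first element.
    fields = line[len(prefix):].strip().rsplit(None, 4)
    if len(fields) != 5:
        return None

    name, *numbers = fields
    try:
        counts = tuple(int(n) for n in numbers)
    except ValueError:
        return None  # one of the trailing fields was not an integer
    return name, counts
```

Splitting from the right means the parser does not depend on the answer
to the "can names contain spaces" question, as long as the number of
trailing integer fields is fixed. It does nothing about truncated names,
which cannot be recovered from the line itself.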