[Reproducible-builds] Bug#813052: Bug#813052: diffoscope takes more than an hour on foreign arch libc6

2016-02-17 Thread Steven Chamberlain
Jérémy Bobbio wrote:
> [...] It missed another bit. Thanks for double-checking, I hadn't
> tested the other change properly.

And thanks for fixing this!  The changes from diffoscope/48 to 49
have made it 26x faster for this particular test case.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

[Reproducible-builds] Bug#813052: Bug#813052: diffoscope takes more than an hour on foreign arch libc6

2016-02-17 Thread Steven Chamberlain
Jérémy Bobbio wrote:
> Steven Chamberlain:
> > But it will still stat() everything in the containing directory,
> > looking for .debs.  It also opens some files and reads them - even
> > decompressing random .gz files along the way!
> 
> Are you sure that it is actually decompressing files and not just
> identifying them?

Ah okay, it reads just one block to check its file magic, I think.

> Anyway, I've just pushed another patch to filter by filenames before
> looking at content. This should further improve the situation.

I don't think it worked?  It's still doing as before, looking at
text, gzip files and validating the sha1sums in a .buildinfo:

| DEBUG Looking for a dbgsym package for Build Id 4bfc8175c9c53156a7e20d0216bc9fff0d25ae2a (debuglink: fc8175c9c53156a7e20d0216bc9fff0d25ae2a.debug)
| DEBUG Using TextFile for a/build/.bash_logout
| DEBUG Using TextFile for a/build/.bashrc
| DEBUG Using TextFile for a/build/.profile
| DEBUG Using DebFile for a/build/cpp-4.9-dbgsym_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/cpp-4.9_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/g++-4.9-dbgsym_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/g++-4.9-multilib_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/g++-4.9_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-base_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-dbgsym_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-locales_4.9.3-11_all.deb
| DEBUG Using DebFile for a/build/gcc-4.9-multilib_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-plugin-dev-dbgsym_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-plugin-dev_4.9.3-11_kfreebsd-amd64.deb
| DEBUG Using DebFile for a/build/gcc-4.9-source_4.9.3-11_all.deb
| DEBUG Using GzipFile for a/build/gcc-4.9_4.9.3-11.diff.gz
| DEBUG Using DotDscFile for a/build/gcc-4.9_4.9.3-11.dsc
| DEBUG Using DotBuildinfoFile for a/build/gcc-4.9_4.9.3-11_kfreebsd-amd64.buildinfo
| DEBUG validating sha1 checksums

and I'm pretty sure I used current Git master:

| grep -n -C2 irrelevant /usr/lib/python3/dist-packages/diffoscope/comparators/deb.py
| 40-for member_name, member in container.get_members().items():
| 41-    # Let's assume the name will end with .deb to avoid looking at
| 42:    # too many irrelevant files
| 43-    if not member_name.endswith('.deb'):
| 44-        continue
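
For illustration only, the idea behind that check (test the name first, look at content afterwards) can be sketched outside diffoscope as follows. `find_deb_candidates` and the `identify` callback are hypothetical names invented for this sketch, not diffoscope APIs, and the sample file names are made up. The only outside fact relied on is that a .deb is an ar archive and so starts with the `!<arch>` magic.

```python
import os
import tempfile


def find_deb_candidates(directory, identify):
    """Return names in `directory` that end in .deb and pass `identify`.

    `identify` stands in for a more expensive content check (hypothetical);
    it is only called for names that already passed the cheap suffix test.
    """
    candidates = []
    for name in sorted(os.listdir(directory)):
        # Cheap test first: skip anything not named like a Debian package.
        if not name.endswith('.deb'):
            continue
        path = os.path.join(directory, name)
        if identify(path):
            candidates.append(name)
    return candidates


def demo():
    inspected = []  # records which files had their content read

    def identify(path):
        inspected.append(os.path.basename(path))
        with open(path, 'rb') as f:
            # .deb files are ar archives, which begin with this magic.
            return f.read(8) == b'!<arch>\n'

    samples = [
        ('.bashrc', b'# rc\n'),
        ('gcc.diff.gz', b'\x1f\x8b'),
        ('cpp.deb', b'!<arch>\ndebian-binary'),
    ]
    with tempfile.TemporaryDirectory() as d:
        for name, data in samples:
            with open(os.path.join(d, name), 'wb') as f:
                f.write(data)
        return find_deb_candidates(d, identify), inspected
```

With a filter of this shape, the text and gzip files are never opened; only the file whose name ends in .deb reaches the content check.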

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org



[Reproducible-builds] Bug#813052: Bug#813052: diffoscope takes more than an hour on foreign arch libc6

2016-01-29 Thread Helmut Grohne
Hi Lunar,

On Fri, Jan 29, 2016 at 03:11:55PM +0100, Jérémy Bobbio wrote:
> Helmut Grohne:
> > Even though I cannot reproduce the issue at hand, I think that the code
> > adding automatic debug symbols looks fishy to me. It appears to recurse
> > over /tmp here and that looks very wrong to me.
> 
> I don't understand what you mean by that. Could you provide me (at least
> some of) the `--debug` output?

What I mean is that diffoscope takes the directory that contains the
first debian package and then recursively looks at all contained files.
If that tree happens to be big, bad things can happen.
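
A minimal sketch of the difference, assuming nothing about diffoscope's actual code: `files_recursive` and `files_flat` are hypothetical helpers, and `libc6_example.deb` is a made-up file name. The directory layout mirrors the one used in the recipe that follows.

```python
import os
import tempfile


def files_recursive(top):
    """Every regular file below `top`, at any depth."""
    found = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(found)


def files_flat(top):
    """Only the regular files directly inside `top`."""
    with os.scandir(top) as entries:
        return sorted(e.name for e in entries if e.is_file())


def demo():
    with tempfile.TemporaryDirectory() as top:
        # A deeply nested file that a package comparison has no reason to read.
        deep = os.path.join(top, 'diffoscope', 'should', 'not', 'be', 'looking')
        os.makedirs(deep)
        with open(os.path.join(deep, 'here'), 'w') as f:
            f.write('no\n')
        # Placeholder standing in for the package being compared.
        with open(os.path.join(top, 'libc6_example.deb'), 'wb') as f:
            f.write(b'')
        return files_recursive(top), files_flat(top)
```

The recursive listing includes the nested `here` file, while the flat listing only sees the package itself; on a large build tree that difference is every file in the tree.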

So I finally managed to reproduce that bit and I'll give a recipe and a
--debug log.

chroot into a fresh sid bootstrap.

# apt-get --no-install-recommends install diffoscope binutils-multiarch
$ cd /tmp
$ mkdir -p buildd/diffoscope/should/not/be/looking snapshot
$ echo no > buildd/diffoscope/should/not/be/looking/here
$ cd buildd

Now obtain a full set of binary packages from an arch-only glibc build.

$ cd ../snapshot

Obtain the corresponding libc6_*.deb from snapshot.d.o.

$ cd ../buildd
$ diffoscope --debug --text out ./libc6_*.deb /tmp/snapshot/libc6_*.deb 2>debug

The run finishes quickly (< 3 minutes) and the debug log contains:

| DEBUG Looking for a dbgsym package for Build Id ...
| DEBUG Using TextFile for ./diffoscope/should/not/be/looking/here

Now for the profitbricks node, what diffoscope looks at is a build tree
for glibc. Recursively.

Note that I said "./libc6_*.deb" above. If you drop the "./", diffoscope
doesn't look where it's not supposed to look.

Helmut


debug.xz
Description: Binary data

[Reproducible-builds] Bug#813052: Bug#813052: diffoscope takes more than an hour on foreign arch libc6

2016-01-29 Thread Helmut Grohne
Hi Holger,

On Fri, Jan 29, 2016 at 02:08:53AM +0100, Holger Levsen wrote:
> to be clear: there is nothing running on profitbricks-build4-amd64 except 
> Helmut's jobs, which I assume haven't changed on the 24th???

Correct. I'm watching top atm and see diffoscope exploding in memory.
100MB resident after a minute. 600MB resident after 4 minutes. 1.7G
after 6 minutes. 2.2G after 8 minutes. 2.7G after 10 minutes. So yeah,
diffoscope really is the culprit here.
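
As a rough check on those figures, the average growth between consecutive readings can be worked out as below. This is only arithmetic on the numbers quoted above; treating 1.7G, 2.2G and 2.7G as 1700, 2200 and 2700 MB is an approximation made for this sketch.

```python
# (minutes elapsed, resident memory in MB) as read from top in the message;
# 1.7G, 2.2G and 2.7G are taken as 1700, 2200 and 2700 MB (an approximation).
samples = [(1, 100), (4, 600), (6, 1700), (8, 2200), (10, 2700)]


def growth_rates(points):
    """Average MB per minute between each pair of consecutive samples."""
    return [
        (m1 - m0) / (t1 - t0)
        for (t0, m0), (t1, m1) in zip(points, points[1:])
    ]


rates = growth_rates(samples)
for (t0, _), (t1, _), rate in zip(samples, samples[1:], rates):
    print(f"{t0:>2} -> {t1:>2} min: {rate:6.1f} MB/min")
```

Between minutes 6 and 10 the readings grow by about 250 MB per minute, with no sign of levelling off inside the sampled window.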

Helmut
