Please use gzip "--no-name" flag on ftp.debian.org

2017-01-07 Thread Oren Tirosh
I submitted a bug against the ftp.debian.org pseudo-package, but it seems
to have been lost somehow.

On Saturday, 7 January 2017, Joerg Jaspert wrote:

> On 14536 March 1977, Oren Tirosh wrote:
> > The timestamp stored in the gzip file header results in a non-deterministic
> > output even when the input is identical. Many files (e.g. under indices/)
> > [...]
>
> > The quick fix is to add "--no-name" to the gzip command (or GZIP=-n to the
> > environment). A better fix would be to generate a temporary file, compare
> > it to the current file and replace the file only if not identical. This
> > will preserve the timestamp of the original file and should help some
> > mirroring protocols.
>
> I implemented the quick fix now in dak; simple and quick fixes rock.
> The next dinstall run will either break or should have this nicety, let's
> see.
>
> Thanks for the report!
>
> --
> bye, Joerg
>


Re: Please use gzip "--no-name" flag on ftp.debian.org

2017-01-07 Thread Joerg Jaspert
On 14536 March 1977, Oren Tirosh wrote:
> The timestamp stored in the gzip file header results in a non-deterministic
> output even when the input is identical. Many files (e.g. under indices/)
> [...] 

> The quick fix is to add "--no-name" to the gzip command (or GZIP=-n to the
> environment). A better fix would be to generate a temporary file, compare
> it to the current file and replace the file only if not identical. This
> will preserve the timestamp of the original file and should help some
> mirroring protocols.

I implemented the quick fix now in dak; simple and quick fixes rock.
The next dinstall run will either break or should have this nicety, let's
see.

Thanks for the report!

-- 
bye, Joerg



Re: Please use gzip "--no-name" flag on ftp.debian.org

2016-12-29 Thread Chris Lamb
Hi Oren,

> The timestamp stored in the gzip file header results in a non-deterministic
> output even when the input is identical.
[…]

Whoops. Can you re-file this email as a bug in the BTS against ftp.debian.org
so we can track it?

Many thanks.


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-



Please use gzip "--no-name" flag on ftp.debian.org

2016-12-29 Thread Oren Tirosh
The timestamp stored in the gzip file header results in a non-deterministic
output even when the input is identical. Many files (e.g. under indices/)
are frequently regenerated with identical contents but their compressed
versions end up slightly different. This unnecessarily inflates the number
of unique file hashes that snapshot.debian.org has to deal with, for
example. It may also make mirror updates less efficient.

I encountered this when I tried to find when certain changes were made by
comparing the checksum of indices/files/components/suite-stable.list.gz and
found that it changes on every update. This obviously applies to many other
compressed files.
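
To illustrate the effect described above, here is a minimal Python sketch. It
is not taken from dak or any Debian tooling; the helper names gzip_bytes and
header_mtime, the sample payload and the two timestamps are all made up for the
example. It compresses the same bytes twice with different header timestamps,
then twice with the timestamp zeroed and no stored file name, which is roughly
what "gzip --no-name" produces.

```python
import gzip
import io
import struct


def gzip_bytes(data: bytes, mtime: int) -> bytes:
    """Compress data into an in-memory gzip stream with the given header mtime."""
    buf = io.BytesIO()
    # filename="" keeps the original-name field out of the header;
    # mtime is stored in header bytes 4..7 as a little-endian integer.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def header_mtime(blob: bytes) -> int:
    """Return the MTIME field from a gzip header."""
    return struct.unpack("<I", blob[4:8])[0]


payload = b"identical index contents\n" * 100  # hypothetical sample data

# Same input, two different "regeneration times": the outputs differ.
run_a = gzip_bytes(payload, mtime=1483228800)
run_b = gzip_bytes(payload, mtime=1483232400)

# Same input, timestamp zeroed: the outputs are byte-for-byte identical.
fixed_a = gzip_bytes(payload, mtime=0)
fixed_b = gzip_bytes(payload, mtime=0)

print(run_a == run_b)      # False
print(fixed_a == fixed_b)  # True
print(header_mtime(run_a), header_mtime(fixed_a))
```

In this sketch the two timestamped outputs differ only in the four MTIME bytes
of the header, which is enough to change the checksum of the whole file.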

The quick fix is to add "--no-name" to the gzip command (or GZIP=-n to the
environment). A better fix would be to generate a temporary file, compare
it to the current file and replace the file only if not identical. This
will preserve the timestamp of the original file and should help some
mirroring protocols.

Oren
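
The "better fix" suggested in the message above (generate a temporary file,
compare, replace only if different) could look roughly like the following
Python sketch. This is an illustration under assumptions, not the change that
was made in dak: the function names same_contents and publish_if_changed are
hypothetical, and the temporary file is assumed to live on the same filesystem
as the target so that the rename is atomic.

```python
import os
import tempfile


def same_contents(path_a: str, path_b: str, chunk_size: int = 65536) -> bool:
    """Compare two files byte by byte, reading them in chunks."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached end-of-file together
                return True


def publish_if_changed(tmp_path: str, final_path: str) -> bool:
    """Move tmp_path over final_path only if the contents differ.

    Returns True if final_path was created or replaced, False if it was
    left untouched (in which case its timestamp is preserved and the
    temporary file is deleted).
    """
    if os.path.exists(final_path) and same_contents(tmp_path, final_path):
        os.unlink(tmp_path)
        return False
    os.replace(tmp_path, final_path)  # atomic on the same filesystem
    return True


# Small demonstration in a scratch directory (file names are examples only).
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "example.list.gz")
    for attempt, data in enumerate([b"v1", b"v1", b"v2"]):
        tmp = os.path.join(workdir, f"example.list.gz.new{attempt}")
        with open(tmp, "wb") as handle:
            handle.write(data)
        print(attempt, publish_if_changed(tmp, target))
```

Combined with a zeroed gzip timestamp, regenerating a file with unchanged
contents would then leave the published file, and its modification time,
untouched.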