Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-02-08 Thread James Cloos
> "BP" == Bob Proulx  writes:

BP> Using gzip is much less stressful on the cpu.  It only takes 1m30s to
BP> create and download a tar.gz file.  The gz is a larger file than the
BP> xz but the overall impact of the gz is less.

Yes.  It is possible that something like varnish in front of the https
server would cache the tar files so that repeated requests could be
served from RAM.
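
A minimal VCL sketch of that idea, assuming Varnish 4 or later sits in
front of the web server and that cgit is reachable on a local backend
port (both assumptions on my part):

  vcl 4.0;

  backend default {
      .host = "127.0.0.1";   # assumed local cgit/httpd backend
      .port = "8080";
  }

  sub vcl_backend_response {
      # cache generated snapshot tarballs so repeated requests are
      # served from memory instead of re-running tar|xz on the server
      if (bereq.url ~ "^/cgit/.*/snapshot/") {
          set beresp.ttl = 1h;
      }
  }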

Also, it should be possible to configure cgit to use -0, -1 or -2 when
invoking xz.
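
For a rough idea of the tradeoff, the levels are easy to compare
locally against an existing tar file (xz's default level is -6; the
file name here is just an example):

  time xz -6 -c emacs-master.tar > /dev/null   # xz's default level
  time xz -2 -c emacs-master.tar > /dev/null   # far less cpu, somewhat larger output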

Or you might be able to disable xz and have such requests redirect to an
error page suggesting a tar.gz instead.
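
Disabling it should just be a matter of the snapshots list in cgitrc;
a sketch, with the exact format list only as an illustration:

  # cgitrc: offer only the cheaper snapshot formats
  snapshots=tar.gz zip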

I know fdo uses varnish.  I wouldn't be surprised if it was for this
exact reason.

-JimC
-- 
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6



Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-02-08 Thread Bob Proulx
Eli Zaretskii wrote:
> James Cloos wrote:
> > It looks like there is a 60 second limit.

Yes.  There appeared to be a 60 second limit.

> > And the transmission is unnaturally slow.  My test averaged only 154KB/s
> > even though I ran it on a machine in a very well connected data center
> > near Boston which supports more than 1G incoming bandwidth.
> 
> I think the tarball is produced on the fly, so it isn't the bandwidth

Yes.  The tar file is produced on the fly and then compressed with
xz.  This is quite a cpu intensive operation.  It pegs one core at
100% cpu during the operation.  It takes 3 minutes on a well connected
machine to create and download a tar.xz file.

> that limits the speed, it's the CPU processing resources needed to
> xz-compress the files.  Try the same with .tar.gz, and you will see
> quite a different speed.

Using gzip is much less stressful on the cpu.  It only takes 1m30s to
create and download a tar.gz file.  The gz is a larger file than the
xz but the overall impact of the gz is less.
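
Anyone who wants to reproduce the comparison can time the two
downloads; a sketch, assuming the .tar.gz snapshot is offered at the
parallel URL, with output discarded so local disk speed does not
matter:

  time wget -O /dev/null http://git.savannah.gnu.org/cgit/emacs.git/snapshot/emacs-master.tar.xz
  time wget -O /dev/null http://git.savannah.gnu.org/cgit/emacs.git/snapshot/emacs-master.tar.gz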

> > The 60s limit needs to be much longer; I doubt that it should be any
> > less than ten minutes.

There is a read timeout that can be hit: the data must start
transferring before the timeout occurs or the web server decides the
process has failed.  In this case I think the transfer only starts
after the compression has finished.  Once data starts transferring,
reads continue and the read timeout resets.
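
One way to observe how long the server takes before the first byte
arrives, which is what that read timeout measures, is curl's timing
variable:

  curl -s -o /dev/null -w 'time to first byte: %{time_starttransfer}s\n' \
      http://git.savannah.gnu.org/cgit/emacs.git/snapshot/emacs-master.tar.xz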

> No, I think 3 min should be enough.  But I don't really understand why
> there has to be a limit.

There must be limits because otherwise the activity of the global
Internet hitting the server would drive it out of resources, creating
what is indistinguishable from a denial of service attack.  There must
be limits to prevent clients from consuming all server resources.
That is just a fact of life when running a busy public server.  You
never have enough resources for everything.  You can't, because there
are more clients on the net than you have server resources.  All it
takes is for someone to announce a new release, which synchronizes
many people to download at the same time, and the system becomes
overwhelmed.

In any case, I am coming back to this thread because we have just
moved git off of the old server and onto the new server.  We are just
now starting to tune the parameters on the new system.  If you try
this again you will find that the current read timeout for data to
start transferring is 300s.  Plus the new system should be faster than
the old one.  The combined effect should be much better.  But remember
that we can't make it unlimited.
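
For reference, if the frontend is Apache httpd (an assumption on my
part; other servers have equivalent knobs), the setting in question
looks roughly like this in the server configuration:

  # httpd.conf: wait up to 300 seconds for the response to make progress
  Timeout 300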

Frankly, from the server perspective I don't like the cgit dynamic tar
file creation; it has quite an impact on the server.  It is easier on
the server if people keep their own git clone updated and build the
release tar files on the local client system rather than on the server
system.  Then updates to the git repository are incremental, with much
less impact on the server.  Or maintainers can create the tar file
once and then simply serve that file out repeatedly from a download
server.
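
A sketch of that workflow for emacs, assuming the usual Savannah clone
URL (check the project page for the canonical one):

  git clone https://git.savannah.gnu.org/git/emacs.git
  cd emacs
  git pull --ff-only        # later updates are incremental
  git archive --format=tar --prefix=emacs-master/ master | xz > emacs-master.tar.xz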

Bob



Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-01-15 Thread James Cloos
JC>> It looks like there is a 60 second limit.

EZ> I think the tarball is produced on the fly, so it isn't the bandwidth
EZ> that limits the speed, it's the CPU processing resources needed to
EZ> xz-compress the files.  Try the same with .tar.gz, and you will see
EZ> quite a different speed.

Indeed.  I should have thought of that.

Grabbing http://git.savannah.gnu.org/cgit/emacs.git/snapshot/emacs-master.tar
across doccis works and takes 63 seconds, so the limit is more likely to
be an RLIMIT on the xz(1) process.  Perhaps an RLIMIT_CPU of 60 seconds
or an RLIMIT_AS or RLIMIT_DATA which hits after xz(1) spends around 60 s
compressing the emacs tar?
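
That theory would be easy to test locally; in bash, ulimit -t sets
RLIMIT_CPU (in seconds) for the subshell:

  ( ulimit -t 60; xz -c emacs-master.tar > emacs-master.tar.xz )
  echo $?    # non-zero if xz was killed when it ran out of cpu time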

-JimC
-- 
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6




[Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-01-15 Thread Eli Zaretskii
If I try this command:

  wget http://git.savannah.gnu.org/cgit/emacs.git/snapshot/emacs-master.tar.xz

I get only about 11MB worth of data, less than half of what I expect.
It sounds like some limitation kicks in, either on download time or on
something else?
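
One quick way to confirm that the file really is cut short, rather
than merely smaller than expected, is to let xz verify the stream:

  xz -t emacs-master.tar.xz && echo ok || echo truncated or corrupt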