Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-02-08 Thread James Cloos
> "BP" == Bob Proulx  writes:

BP> Using gzip is much less stressful on the cpu.  It only takes 1m30s to
BP> create and download a tar.gz file.  The gz is a larger file than the
BP> xz but the overall impact of the gz is less.

Yes.  It is possible something like varnish in front of the https server
would cache the tar files so that multiple requests can run from ram.

Also, it should be possible to configure cgit to use -0, -1 or -2 when
invoking xz.
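
Untested sketch of one way to get that effect: to my understanding cgit
execs the compressor as an external command, so an admin could interpose
a wrapper that forces a cheap preset.  All paths here are illustrative,
not Savannah's actual setup (the wrapper is written to a temp dir purely
for demonstration):

```shell
# Sketch only: wrap xz so any caller picks up a cheaper preset.
bindir=$(mktemp -d)
cat > "$bindir/xz" <<'EOF'
#!/bin/sh
# Force a cheap compression level; pass everything else through.
exec /usr/bin/xz -2 "$@"
EOF
chmod +x "$bindir/xz"

# Demonstrate: compress and round-trip some data through the wrapper.
printf 'hello savannah\n' > "$bindir/data"
"$bindir/xz" -k "$bindir/data"      # writes data.xz, keeps the original
xz -d -c "$bindir/data.xz"          # prints: hello savannah
```

On a real deployment the wrapper would have to appear earlier in the
invoking process's PATH than the system xz.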

Or you might be able to disable xz and have such requests redirect to an
error page suggesting a tar.gz instead.

I know fdo uses varnish.  I wouldn't be surprised if it was for this
exact reason.

-JimC
-- 
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6



Re: [Savannah-hackers-public] Testing new Savannah website infrastructure

2017-02-08 Thread Bob Proulx
> > I sync'd the file attachments from the old to the new.  [...]
> > 
> > frontend:/var/lib/savane/trackers_attachments# rsync -av \
> >     /var/lib/savane/trackers_attachments/ \
> >     frontend0:/var/lib/savane/trackers_attachments/
> 
> If it helps,
> I have a script to do this (and also copy the "new project registration" 
> uploads), in
>    vcs:/root/agn/sync-sv-uploads-to-frontend0.sh

I am rather hoping it is on frontend0 to stay and we don't need to do
this again. :-)

Bob



Re: [Savannah-hackers-public] Testing new Savannah website infrastructure

2017-02-08 Thread Assaf Gordon

> On Feb 8, 2017, at 15:59, Bob Proulx  wrote:
> 
> I sync'd the file attachments from the old to the new.  [...]
> 
> frontend:/var/lib/savane/trackers_attachments# rsync -av \
>     /var/lib/savane/trackers_attachments/ \
>     frontend0:/var/lib/savane/trackers_attachments/

If it helps,
I have a script to do this (and also copy the "new project registration" 
uploads), in
   vcs:/root/agn/sync-sv-uploads-to-frontend0.sh

regards,
 - assaf


Re: [Savannah-hackers-public] hgweb provides a broken file when trying to download an archive

2017-02-08 Thread Assaf Gordon

> On Feb 8, 2017, at 23:32, Bob Proulx  wrote:
>> If the files are truncated, they are truncated by just a tiny little bit.
> 
> we have a recent report of the same problem for the cgit interface.
> 
> [...]However note that at that time git and cgit were on vcs not vcs0.
> Which means these are both reports of truncated files but from two
> different systems with completely different software.

Indeed, it was apache on the old vcs (and wsgi or regular cgi, not fastcgi).
Also, based on your later email, it was due to a cpu-time limitation?

As for hgweb,
a voodoo attempt to disable and re-enable nginx's "gzip" mode did not help,
and neither did disabling "transfer-encoding: chunked".
So the search continues...

regards,
  - assaf


Re: [Savannah-hackers-public] bzr post-commit email hook (was: gsrc-commit auto messages stopped working)

2017-02-08 Thread Assaf Gordon
Hi Bob,

> On Feb 9, 2017, at 00:39, Bob Proulx  wrote:
> 
> Do you know how hook scripts with bzr work?  Perhaps you can help me
> help you.

Not sure if that's helpful, but last month I installed a missing 'email'
plugin:
  http://lists.gnu.org/archive/html/savannah-hackers-public/2017-01/msg00046.html
Clearly, just installing it was not enough to fix the problem.

Perhaps one of these files will provide a hint?

question for BZR experts: is this plugin even relevant for the email hooks?

===
vcs0:~$ dpkg -L bzr-email | grep 'py$'
/usr/share/pyshared/bzrlib/plugins/email/tests/__init__.py
/usr/share/pyshared/bzrlib/plugins/email/tests/testemail.py
/usr/share/pyshared/bzrlib/plugins/email/__init__.py
/usr/share/pyshared/bzrlib/plugins/email/emailer.py
/usr/lib/python2.7/dist-packages/bzrlib/plugins/email/tests/__init__.py
/usr/lib/python2.7/dist-packages/bzrlib/plugins/email/tests/testemail.py
/usr/lib/python2.7/dist-packages/bzrlib/plugins/email/__init__.py
/usr/lib/python2.7/dist-packages/bzrlib/plugins/email/emailer.py
===

regards,
 - assaf


Re: [Savannah-hackers-public] bzr post-commit email hook (was: gsrc-commit auto messages stopped working)

2017-02-08 Thread Bob Proulx
Hello Carl,

carl hansen wrote:
> still still not working

Sorry there have been delays.  We have been overwhelmed trying to
juggle too many things.

I would love to help fix your hook problem, but I enter this knowing
nothing about bzr; I've never used it before.  Poking into your bzr
repository to explore:

  vcs:/srv/bzr/gsrc# grep -rl gsrc-commit .
./trunk/.bzr/branch/branch.conf

  vcs:/srv/bzr/gsrc# cat ./trunk/.bzr/branch/branch.conf
last_revision_mailed = bran...@invergo.net-20130610174455-orqjagls9tgp1noq
post_commit_to = gsrc-com...@gnu.org
post_commit_body = ""
post_commit_subject = $nick r$revision: $message

Okay.  That seems to be the configuration for it.  But how does that
even do anything?  I was expecting to see a hooks directory or
something.  Searching the web for documentation on setting up bzr
hooks did not yield anything useful to me.

Do you know how hook scripts with bzr work?  Perhaps you can help me
help you.

Bob



Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.org

2017-02-08 Thread Bob Proulx
Eli Zaretskii wrote:
> James Cloos wrote:
> > It looks like there is a 60 second limit.

Yes.  There appeared to be a 60 second limit.

> > And the transmission is unnaturally slow.  My test averaged only 154KB/s
> > even though I ran it on a machine in a very well connected data center
> > near Boston which supports more than 1G incoming bandwidth.
> 
> I think the tarball is produced on the fly, so it isn't the bandwidth

Yes.  The tar file is produced on the fly and then compressed with
xz.  This is quite a cpu intensive operation.  It pegs one core at
100% cpu during the operation.  It takes 3 minutes on a well connected
machine to create and download a tar.xz file.

> that limits the speed, it's the CPU processing resources needed to
> xz-compress the files.  Try the same with .tar.gz, and you will see
> quite a different speed.

Using gzip is much less stressful on the cpu.  It only takes 1m30s to
create and download a tar.gz file.  The gz is a larger file than the
xz but the overall impact of the gz is less.

> > The 60s limit needs to be much longer; I doubt that it should be any
> > less than ten minutes.

There is a read timeout: the data must start transferring before the
timeout expires or the web server concludes the process has failed.  In
this case I think the transfer starts only after the compression has
finished.  After data starts transferring, reads continue and the read
timeout resets.
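
For illustration only, the kind of nginx knob involved; the directive
names are standard nginx, but the location block, backend socket, and
values here are assumptions rather than the actual Savannah config:

```
# nginx sketch: fastcgi_read_timeout governs the gap between successive
# reads from the backend, so it resets once data starts flowing.
location /cgit/ {
    fastcgi_pass unix:/run/fcgiwrap.socket;   # assumed backend
    fastcgi_read_timeout 300s;
    fastcgi_send_timeout 300s;
}
```

The nginx default for fastcgi_read_timeout is 60s, which would match
the originally observed 60 second limit.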

> No, I think 3 min should be enough.  But I don't really understand why
> there has to be a limit.

There must be limits because otherwise the activity of the global
Internet hitting the server would drive it out of resources, creating
something indistinguishable from a denial of service attack.  There
must be limits to prevent clients from consuming all server resources.
That is just a fact of life when running a busy public server.  You
never have enough resources for everything.  You can't.  Because there
are more clients on the net than you have server resources.  All it
takes is an announcement of a new release to synchronize many people
downloading at the same time, and the system becomes overwhelmed.

In any case, I am coming back to this thread because we have just
moved git off of the old server and onto the new server.  We are just
now starting to tune the parameters on the new system.  If you try
this again you will find the current read time limit for data to start
transferring to be 300s.  Plus the new system should be faster than
the old one.  The combined effect should be much better.  But remember
that we can't make it unlimited.

Frankly, from the server perspective I don't like the cgit dynamic tar
file creation on the server.  It has quite an impact on it.  It is
easier on the server if people keep their own copy of a git clone
updated and build the release tar files on the local client system
rather than on the server system.  Then updates to the git repository
are incremental.  Much less impact on the server.  Or to have
maintainers create the tar file once and then simply serve that file
out repeatedly from a download server.
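
The client-side alternative can be sketched like this, against a
throwaway repository (the project name and version prefix are examples):

```shell
# Sketch: keep a local clone and build the release tarball there with
# git archive, instead of asking cgit to build it on the server.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo 'hello' > README
git add README
git -c user.name=demo -c user.email=demo@example.org \
    commit -q -m 'add README'
# Build a tarball with a versioned top-level directory, as a release would.
git archive --format=tar.gz --prefix=example-1.0/ -o example-1.0.tar.gz HEAD
tar tzf example-1.0.tar.gz      # entries under example-1.0/
```

For a real project the clone would point at the hosted repository and
`git pull` would keep it incrementally updated between releases.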

Bob



Re: [Savannah-hackers-public] hgweb provides a broken file when trying to download an archive

2017-02-08 Thread Bob Proulx
Hi Assaf,

Assaf Gordon wrote:
> If the files are truncated, they are truncated by just a tiny little bit.

I don't know if this is somehow related or if it will be noise.  But
we have a recent report of the same problem for the cgit interface.

  http://lists.gnu.org/archive/html/savannah-hackers-public/2017-01/msg00040.html

However note that at that time git and cgit were on vcs not vcs0.
Which means these are both reports of truncated files but from two
different systems with completely different software.

Bob



Re: [Savannah-hackers-public] hgweb provides a broken file when trying to download an archive

2017-02-08 Thread Assaf Gordon
Follow-up:

> On Feb 8, 2017, at 22:31, Assaf Gordon  wrote:
> 
> Few more observations:
> 
> These consistently fail (truncated file):
>  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.bz2
>  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.gz
> 
> While this consistently succeeds:
>  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.zip

If the files are truncated, they are truncated by just a tiny little bit.

When I create the archives locally, they have very similar sizes:

locally on vcs0 (these archives are not truncated):

cd /srv/hg/octave
hg archive -r tip /tmp/tip.tar.gz
hg archive -r tip /tmp/tip.tar.bz2
hg archive -r tip /tmp/tip.zip

$ ls -log /tmp/tip.*
-rw-rw-r-- 1 5299911 Feb  8 22:51 /tmp/tip.tar.bz2
-rw-rw-r-- 1 6570179 Feb  8 22:51 /tmp/tip.tar.gz
-rw-rw-r-- 1 8971416 Feb  8 22:52 /tmp/tip.zip

and over the web:

wget http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.gz
wget http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.bz2
wget http://hg.savannah.gnu.org/hgweb/octave/archive/tip.zip

$ ls -log tip.*
-rw-r--r-- 1 5319138 Feb  8 22:52 tip.tar.bz2
-rw-r--r-- 1 6596892 Feb  8 22:52 tip.tar.gz
-rw-r--r-- 1 9099805 Feb  8 22:53 tip.zip

Note the file sizes are not expected to be equal, because in the web version
the subdirectory is 'octave-ac76a90f17ff' while in the local version it is 
simply 'tip'.

But the truncation is very small.

So perhaps it's an nginx/fastcgi/wsgi/python problem?

to be continued...
 - assaf


Re: [Savannah-hackers-public] hgweb provides a broken file when trying to download an archive

2017-02-08 Thread Assaf Gordon
Hello,

> On Feb 8, 2017, at 15:13, Bob Proulx  wrote:
> 
>> in the following Bug report someone tried to download a specific octave
>> revision from the hgweb interface, but got a corrupted file(files in the
>> archive were missing).
>> [...]
>> http://savannah.gnu.org/bugs/index.php?50246

This bug was closed - somewhat confusing...
I guess it was closed because the OP found an alternative download link,
not because this was solved?

> Reading the bug ticket I agree this looks like a problem with the hg
> web interface.

Indeed.

> will need to debug it.  This
> is an upgrade of the hg web interface from the old to the new.  We
> will need to understand it and debug it.  None of us are hg experts.

Few more observations:

These consistently fail (truncated file):
  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.bz2
  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.tar.gz

While this consistently succeeds:
  http://hg.savannah.gnu.org/hgweb/octave/archive/tip.zip

Looking at the running processes, it seems archiving is
done internally by mercurial's 'hgwebdir' module
( "from mercurial.hgweb.hgwebdir_mod import hgwebdir" in our script ).
It does not fork and does not run tar/gzip/bzip2.

A cursory Google search did not turn up anything about hgwebdir
truncating archives, but I'll continue to look.

comments welcome,
 - assaf


Re: [Savannah-hackers-public] Fwd: post-receive-email problem when committing to savannah grep

2017-02-08 Thread Jim Meyering
On Wed, Feb 8, 2017 at 2:15 PM, Bob Proulx  wrote:
> Hi Jim, Paul,
>
> Jim Meyering wrote:
>> Savannah is undergoing some changes as it switches to new hardware, so
>> forwarding your message to savannah-hackers-public, where the guys
>> (Bob and Assaf) doing all the work hang out.
>
> Eventually we will either whack-a-mole all of these or we will have a
> systematic search and destroy of all of the hook problems. :-/
>
>> Paul just pushed a commit to the grep repository and saw the
>> following, which suggests the regular mail-mirroring hook is not
>> working:
>
> Thank you for the report.  It wasn't working.
>
>> remote: run-parts: failed to stat component
>> hooks/post-receive.d/post-receive-email: No such file or directory
>> To git.sv.gnu.org:/srv/git/grep.git
>    6e98364..6e4c872  master -> master
>
> Fixed this.  Manually pushed the commit notification.  It should show
> up in the commit mailing list.

Thanks, Bob!



Re: [Savannah-hackers-public] Fwd: post-receive-email problem when committing to savannah grep

2017-02-08 Thread Bob Proulx
Hi Jim, Paul,

Jim Meyering wrote:
> Savannah is undergoing some changes as it switches to new hardware, so
> forwarding your message to savannah-hackers-public, where the guys
> (Bob and Assaf) doing all the work hang out.

Eventually we will either whack-a-mole all of these or we will have a
systematic search and destroy of all of the hook problems. :-/

> Paul just pushed a commit to the grep repository and saw the
> following, which suggests the regular mail-mirroring hook is not
> working:

Thank you for the report.  It wasn't working.

> remote: run-parts: failed to stat component
> hooks/post-receive.d/post-receive-email: No such file or directory
> To git.sv.gnu.org:/srv/git/grep.git
>    6e98364..6e4c872  master -> master

Fixed this.  Manually pushed the commit notification.  It should show
up in the commit mailing list.

Bob



[Savannah-hackers-public] old vcs access update

2017-02-08 Thread Bob Proulx
Bob Proulx wrote:
> Which means I want to say that all of the version control systems from
> vcs are migrated now.  (Since tla arch is hosted on download.)
> Therefore I think I will set up an iptables block for other services
> in order to force finding any unknown issues.

Which I have done just now.  I aggressively created a copy of the
iptables firewall.  I removed all access ports for the version control
systems and for web access.  I blocked ssh access from the world
(needed for the version control access) but allowed it from the
standard list of local systems such as fencepost and mgt0 and the FSF
admins' VPN network.  I did this as a temporary change from the command
line.  A reboot would restore operation to the previous rules.
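
The shape of such rules, purely as an illustration: ADMIN_NET is a
placeholder and the port list is a guess, not the rules actually
applied on vcs.

```
# Illustrative only: allow ssh from the admin networks, drop it and the
# version-control/web service ports for everyone else.
iptables -A INPUT -p tcp --dport 22 -s ADMIN_NET -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j DROP
iptables -A INPUT -p tcp -m multiport --dports 80,443,9418,2401 -j DROP
```

Applying rules from the command line without saving them gives exactly
the described behavior: a reboot restores the previous ruleset.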

In theory nothing from the outside world is using the vcs server for
any version control or web access or any other access.  This should
enforce that theory and cause anything using it to be blocked.

Note that vcs is still very much a required system.  It is hosting the
data storage by NFS onto the new system vcs0.  NFS access to the data
is still very much required for every operation.  Plus we admins need
shell access for repository maintenance actions using local root
access.  NFS root_squash is in effect, as desired, and vcs0 has no
root access to the nfs mounted file system.

As usual please report any problems.

Bob



Re: [Savannah-hackers-public] hgweb provides a broken file when trying to download an archive

2017-02-08 Thread Bob Proulx
Hello Martin,

casti...@uni-bremen.de wrote:
> in the following Bug report someone tried to download a specific octave
> revision from the hgweb interface, but got a corrupted file(files in the
> archive were missing).

Ouch!

> I tried it, too, with .gz and .bz2 and got both times an ~6MB file (instead
> of about 20MB), which was broken.  Looking near the end, I found a python
> error message and stack trace in html.
> 
> http://savannah.gnu.org/bugs/index.php?50246

Reading the bug ticket I agree this looks like a problem with the hg
web interface.

> I don't know, if you are the right people to write to.
> I will give more information, if necessary. I'm interested in the cause of
> this error. If you can resolve it, please tell me.

Yes.  This is the right place to report the bug.  I, personally, know
little about hg however.  We, in the "royal we" of the Savannah folks,
meaning in this case probably Assaf :-), will need to debug it.  This
is an upgrade of the hg web interface from the old to the new.  We
will need to understand it and debug it.  None of us are hg experts.
Unfortunately.

Thank you for reporting this problem.

Bob



Re: [Savannah-hackers-public] git status update

2017-02-08 Thread Bob Proulx
Bob Proulx wrote:
> But the cgit web server was returning a proxy error.  In order to
> avoid needing to debug that on the spot I proxied it over to the old
> server temporarily.  That is running okay this way for the moment.
> We can debug the local native server at our leisure.

Thanks to Assaf for debugging the native cgit problem and obviating
the need to proxy over to the old server.  This is now fixed and back
running natively on the new server again.

Thanks Assaf!
Bob



[Savannah-hackers-public] Chef client hitting emacs cgit eterm-color?

2017-02-08 Thread Bob Proulx
Anyone know why "Chef Client" is repeatedly hitting this?  (I know
that Chef is a Puppet-like tool.)

  http://git.savannah.gnu.org/cgit/emacs.git/plain/etc/e/eterm-color

I have been staring at the activity and for some reason I am seeing a
lot of "Chef Client" accessing two particular emacs cgit pages
repeatedly.  I think some kind of example recipe has escaped into the
wild and is now causing a lot of traffic.  It isn't to the point of it
breaking anything but it definitely seems unusual.

How many hits in the last day?

  grep /cgit/emacs.git/plain/etc/e/eterm-color /var/log/nginx/access.log | wc -l
12835

How many unique addresses are hitting that URL?

  grep /cgit/emacs.git/plain/etc/e/eterm-color /var/log/nginx/access.log \
    | awk '{print $1}' | sort -u | wc -l
38

Ranking them by unique IP address:

  grep /cgit/emacs.git/plain/etc/e/eterm-color /var/log/nginx/access.log \
    | awk '{print $1}' | sort | uniq -c | sort -nr | awk '{print $1}' | head
 3138
  278
  276
  270
  270
  269
  268
  268
  268
  268
...

There is one particular IP address that is hitting that link *a lot*
and quite a few others that are hitting it often.  Does anyone know
anything about this particular issue?  Does anyone reading this have a
Chef installation that they could look at to see if one of the
example scripts may have escaped and be doing this accidentally?

Bob



Re: [Savannah-hackers-public] git over https

2017-02-08 Thread Bob Proulx
Bob Proulx wrote:
> This isn't the final configuration though since this uses the package
> installed git, which should be fine but doesn't support shallow clones
> with --depth 1 yet as of the OS Trisquel 7 release with git 1.9.1.
> Therefore we are using git from a Debian Jessie Stable chroot with git
> 2.1.4.  The reason to use the chroot is to get a stable release with a
> community supported security release stream.  Therefore I need to
> point it to the custom version from the chroot.  Then everything
> should be working.

Done.  I tinkered this into working using the custom git and with that
shallow clones with --depth 1 work with http and https now too.
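
A quick way to confirm shallow-clone support, sketched against a
throwaway local repository rather than the real server:

```shell
# Sketch: --depth 1 should fetch only the most recent commit.
set -e
src=$(mktemp -d)
git init -q "$src"
for i in 1 2 3; do
    git -C "$src" -c user.name=demo -c user.email=demo@example.org \
        commit -q --allow-empty -m "commit $i"
done
# file:// goes through the transport layer, so --depth is honored.
git clone -q --depth 1 "file://$src" "$src-shallow"
git -C "$src-shallow" rev-list --count HEAD    # prints: 1
```

Against the server the equivalent check would be a --depth 1 clone of
any hosted repository over http or https.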

I think I will let things simmer for a bit yet before making any wide
announcements.  Just in case there is a more serious problem and we
have to revert.  That way people won't lose functionality.

Bob