On Wed, Jul 24, 2019 at 11:59 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Tue, Jul 16, 2019 at 12:21 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > In the meantime, we've had *lots* of buildfarm failures in the
> > added pg_stat_all_tables query, which indicate that indeed the
> > stats collector mechanism isn't terribly reliable.  But that
> > doesn't directly prove anything about the original problem,
> > since the planner doesn't look at stats collector data.
>
> I noticed that if you look at the list of failures of this type, there
> are often pairs of animals belonging to Andres that failed at the same
> time.  I wonder if he might be running a bunch of animals on one
> kernel, and need to increase net.core.rmem_max and
> net.core.rmem_default (or maybe the write side variants, or both, or
> something like that).

In further support of that theory, here are the counts of 'stats'
failures (excluding bogus reports due to crashes) for the past 90
days:

          owner          |    animal    | count
-------------------------+--------------+-------
 andres-AT-anarazel.de   | desmoxytes   |     5
 andres-AT-anarazel.de   | dragonet     |     9
 andres-AT-anarazel.de   | flaviventris |     1
 andres-AT-anarazel.de   | idiacanthus  |     5
 andres-AT-anarazel.de   | komodoensis  |    11
 andres-AT-anarazel.de   | pogona       |     1
 andres-AT-anarazel.de   | serinus      |     3
 andrew-AT-dunslane.net  | lorikeet     |     1
 buildfarm-AT-coelho.net | moonjelly    |     1
 buildfarm-AT-coelho.net | seawasp      |    17
 clarenceho-AT-gmail.com | mayfly       |     2

Andres's animals report the same hostname and run at the same time, so
it'd be interesting to know what net.core.rmem_max is set to and
whether these problems go away if it's cranked up 10x higher or
something.  In a quick test I can see that make installcheck is
capable of sending a *lot* of 936-byte messages in the same
millisecond.
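
For anyone who wants to check their own machine, here is a rough sketch
(not anything from the PostgreSQL tree) that reads the effective receive
buffer of a fresh UDP socket, asks for 10x more, and estimates how many
936-byte messages could sit in the buffer at once.  The helper names are
made up for illustration, and the estimate is only an optimistic upper
bound because the kernel also charges per-packet bookkeeping against the
buffer.  On Linux, a request above net.core.rmem_max is silently capped,
so if the "after" number doesn't grow much, the limit is probably what is
holding it down.

```python
import socket

MSG_SIZE = 936  # bytes per stats message, as measured in the email above


def udp_rcvbuf_bytes(requested=None):
    """Return the effective SO_RCVBUF of a fresh UDP socket.

    If `requested` is given, ask the kernel for that size first.  On Linux
    the kernel caps the request at net.core.rmem_max and reports a value
    that includes its own bookkeeping allowance.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        if requested is not None:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def max_queued_messages(buf_bytes, msg_size=MSG_SIZE):
    """Optimistic upper bound on messages that fit in the buffer.

    Ignores per-packet kernel overhead, so the real number is lower.
    """
    return buf_bytes // msg_size


if __name__ == "__main__":
    before = udp_rcvbuf_bytes()
    after = udp_rcvbuf_bytes(before * 10)
    print("default SO_RCVBUF:", before, "bytes, at most",
          max_queued_messages(before), "messages")
    print("after asking 10x: ", after, "bytes, at most",
          max_queued_messages(after), "messages")
```

If several animals share one kernel and all run at once, each collector
socket still only gets its own buffer, so a burst larger than that bound
would be expected to drop messages.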

-- 
Thomas Munro
https://enterprisedb.com

