2016-08-08 23:55:01+0000 [-] P4 poll failed on atx-p4-buildproxy.rsi.global:1666, //starcitizen/
	Traceback (most recent call last):
	  File "/usr/local/lib/python2.7/dist-packages/buildbot/changes/base.py", line 65, in doPoll
	    d = defer.maybeDeferred(self.poll)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 150, in maybeDeferred
	    result = f(*args, **kw)
	  File "/usr/local/lib/python2.7/dist-packages/buildbot/changes/p4poller.py", line 162, in poll
	    d = self._poll()
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1274, in unwindGenerator
	    return _inlineCallbacks(None, gen, Deferred())
	--- <exception caught here> ---
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
	    result = g.send(result)
	  File "/usr/local/lib/python2.7/dist-packages/buildbot/changes/p4poller.py", line 232, in _poll
	    result = yield self._get_process_output(args)
	  File "/usr/local/lib/python2.7/dist-packages/buildbot/changes/p4poller.py", line 170, in _get_process_output
	    d = utils.getProcessOutput(self.p4bin, args, env)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/utils.py", line 128, in getProcessOutput
	    reactor)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/utils.py", line 28, in _callProtocolWithDeferred
	    reactor.spawnProcess(p, executable, (executable,)+tuple(args), env, path)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 340, in spawnProcess
	    processProtocol, uid, gid, childFDs)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/process.py", line 731, in __init__
	    self._fork(path, uid, gid, executable, args, environment, fdmap=fdmap)
	  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/process.py", line 405, in _fork
	    self.pid = os.fork()
	exceptions.OSError: [Errno 12] Cannot allocate memory
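The bottom frame is the telling one: every poll runs the p4 binary via Twisted's reactor.spawnProcess, which calls os.fork(), and fork must be able to account for a copy of the master's whole address space. A bloated master process can therefore hit ENOMEM at fork time even while free memory looks fine. A quick way to check that theory on the master host is the following diagnostic sketch (plain procfs reads, not Buildbot API; MASTER_PID is a placeholder for the real master pid):

    # Diagnostic sketch: see whether fork-time memory accounting
    # explains the ENOMEM in the traceback above.
    MASTER_PID = 12345  # placeholder: pid of the buildbot master

    def read(path):
        with open(path) as f:
            return f.read()

    # VmSize is what fork() must account for under strict overcommit;
    # VmRSS is what the master actually holds resident.
    for line in read('/proc/%d/status' % MASTER_PID).splitlines():
        if line.startswith(('VmSize', 'VmRSS')):
            print(line)

    # 0 = heuristic overcommit, 1 = always allow, 2 = strict accounting
    print('vm.overcommit_memory = ' +
          read('/proc/sys/vm/overcommit_memory').strip())

    # Under strict accounting, Committed_AS approaching CommitLimit means
    # new reservations (fork included) start failing with ENOMEM.
    for line in read('/proc/meminfo').splitlines():
        if line.startswith(('MemAvailable', 'CommitLimit', 'Committed_AS')):
            print(line)

If the numbers point that way, adding swap, relaxing the overcommit policy, or slimming the master process are the usual ways out.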
I'll run one day with the p4 poller disabled and see how it goes.

On Tue, Aug 2, 2016 at 7:28 PM, Francesco Di Mizio <[email protected]> wrote:

> Just one. Here is what the poller looks like:
>
>     s = changes.P4Source(
>         p4port=config.p4_server,
>         p4user=config.p4_user,
>         p4passwd=config.p4_password,
>         p4base='//XXXXXXX/',
>         pollInterval=10,
>         pollAtLaunch=False,
>         split_file=lambda branchfile: branchfile.split('/', 1),
>         encoding='cp437'
>     )
>
> On Tue, Aug 2, 2016 at 7:24 PM, Pierre Tardy <[email protected]> wrote:
>
>> How many projects are you polling? I'll see if I can make a PoC of a
>> builder which runs statprof.
>>
>> On Tue, Aug 2, 2016 at 6:53 PM, Francesco Di Mizio <[email protected]> wrote:
>>
>>> Thanks for the kind replies, both of you.
>>>
>>> @Pierre:
>>> Not sure I get what you mean. Given the context, for a step to be
>>> CPU-demanding it would have to be a master-side step, right? I happen
>>> to not have any.
>>> What would you be profiling with statprof?
>>> I'd really appreciate it if you could elaborate on your idea.
>>>
>>> Really, all I can think of is the poller. I'll keep looking into it.
>>>
>>> On Tue, Aug 2, 2016 at 6:36 PM, Dan Kegel <[email protected]> wrote:
>>>
>>>> With gitpoller, it was easy to see; whenever the number of git
>>>> sessions from the poller went over 0 or so, web GUI performance was
>>>> poor.
>>>> And if it went over 10, well, you could kiss the GUI goodbye for
>>>> several minutes.
>>>>
>>>> One countermeasure was to randomize the polling intervals, a la:
>>>>
>>>>     interval = 6  # minutes
>>>>     self['change_source'].append(
>>>>         # Fuzz the interval to avoid slamming the git server and
>>>>         # hitting the MaxStartups or MaxSessions limits. If you hit
>>>>         # them, twistd.log will have lots of
>>>>         # "ssh_exchange_identification: Connection closed by remote
>>>>         # host" errors. See http://trac.buildbot.net/ticket/2480
>>>>         changes.GitPoller(repourl, branches=branchnames,
>>>>                           workdir='gitpoller-workdir-' + name,
>>>>                           pollinterval=interval * 60 + random.uniform(-10, 10)))
>>>>
>>>> That made life just barely bearable, at least while the number of
>>>> projects polled was under 50 or so.
>>>> What really helped was not using pollers anymore and switching to
>>>> GitLab's webhooks.
>>>> We're at 190 projects now, of which 57 are still using gitpoller, and
>>>> it's almost OK. (I really have to move the last 57 onto GitLab. Or,
>>>> well, since they're not critical, increase the polling interval...)
>>>>
>>>> On Tue, Aug 2, 2016 at 9:13 AM, Pierre Tardy <[email protected]> wrote:
>>>> > Hi,
>>>> >
>>>> > Pollers indeed usually don't scale, as they, hmm, poll.
>>>> > What you are describing here hints that the Twisted reactor thread
>>>> > is always busy, which should not happen if you only start 10 builds.
>>>> > You might have some custom steps which are doing something heavily
>>>> > CPU-bound in the main thread.
>>>> > What I usually do is use statprof:
>>>> > https://pypi.python.org/pypi/statprof/
>>>> >
>>>> > in order to know what the CPU is doing.
>>>> > You could create a builder which you can trigger whenever you need,
>>>> > and which would start the profiling, wait a few minutes, and then
>>>> > save the profile to a file.
>>>> >
>>>> > On Tue, Aug 2, 2016 at 5:53 PM, Francesco Di Mizio <[email protected]> wrote:
>>>> >>
>>>> >> Hey Dan,
>>>> >>
>>>> >> I am using a p4 poller. Maybe it's suffering from the same problems?
>>>> >>
>>>> >> On Tue, Aug 2, 2016 at 5:45 PM, Francesco Di Mizio
>>>> >> <[email protected]> wrote:
>>>> >>>
>>>> >>> I'd like to provide a bit more context. Right after restarting the
>>>> >>> master and kicking off 10 builds, CPU was at 110-120%. This made
>>>> >>> the UI unusable, and basically all the services were stuck,
>>>> >>> including the REST API.
>>>> >>> After 3-4 minutes like this, and WITH all 10 builds still running,
>>>> >>> CPU usage went down to 5%, stayed there for 5 minutes, and all was
>>>> >>> smooth and quick again. From then on it kept oscillating; I've
>>>> >>> seen spikes of 240% :(
>>>> >>>
>>>> >>> On Tue, Aug 2, 2016 at 4:12 PM, Francesco Di Mizio
>>>> >>> <[email protected]> wrote:
>>>> >>>>
>>>> >>>> Sometimes it goes up to 140%. I was not able to relate this to a
>>>> >>>> particular build condition - it seems like it can happen at any
>>>> >>>> time and is not related to how many builds are going on.
>>>> >>>>
>>>> >>>> I usually realize the server got into this state because the web
>>>> >>>> UI gets stuck. As soon as the CPU% goes back to normal values
>>>> >>>> (2-3% most times), the web UI finishes loading just instantly.
>>>> >>>>
>>>> >>>> Any pointers as to what might be causing this? The only reason I
>>>> >>>> can think of is too many people trying to access the web UI
>>>> >>>> simultaneously - could I be right?
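To make Pierre's statprof-builder idea concrete, here is a rough sketch of a master-side step written against the Buildbot nine new-style step API; the class name, sampling duration, and output path are invented for illustration, and it assumes statprof has been pip-installed on the master:

    import statprof
    from twisted.internet import defer, reactor

    from buildbot.process.buildstep import BuildStep
    from buildbot.process.results import SUCCESS

    class StatprofStep(BuildStep):
        """Sample the master's main thread for a while, then dump a report."""
        name = 'statprof-sample'

        def __init__(self, seconds=180, outfile='/tmp/master-profile.txt',
                     **kwargs):
            self.seconds = seconds
            self.outfile = outfile
            BuildStep.__init__(self, **kwargs)

        @defer.inlineCallbacks
        def run(self):
            statprof.start()  # start SIGPROF-based statistical sampling
            # Let the reactor keep servicing builds while samples accumulate.
            d = defer.Deferred()
            reactor.callLater(self.seconds, d.callback, None)
            yield d
            statprof.stop()
            # Write the hot-spot report to a file for later inspection.
            with open(self.outfile, 'w') as f:
                statprof.display(f)
            defer.returnValue(SUCCESS)

Hook it to a throwaway builder behind a force scheduler and trigger it while the CPU is pegged; the report then shows which Python frames the reactor thread is burning time in.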
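As for Dan's suggestion of replacing pollers with webhooks: in Buildbot nine the GitLab change hook is enabled in the www section of master.cfg, roughly as below (a sketch; the port and plugin list are just examples). GitLab's webhook is then pointed at http://<master>:8010/change_hook/gitlab, and changes arrive with no polling at all:

    # master.cfg fragment: accept pushes from GitLab instead of polling.
    c['www'] = dict(
        port=8010,
        plugins=dict(waterfall_view={}, console_view={}),
        # Exposes POST /change_hook/gitlab for GitLab webhook deliveries.
        change_hook_dialects=dict(gitlab=True),
    )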
_______________________________________________
users mailing list
[email protected]
https://lists.buildbot.net/mailman/listinfo/users
