On 04/03/2015 12:47 PM, Hausmann Simon wrote:
> Hi,
> 
> I believe what we are seeing is caused by instability in the network that 
> connects the Jenkins service with the Jenkins slave machines. Occasionally 
> network connectivity between the slaves and the master is lost, causing the 
> running build as a whole to abort - all other still running builds are 
> aborted and the results from builds that had already finished are discarded. 
> In an attempt to recover, a whole new integration with builds for all 
> configurations is started.
> 
> We have observed that this scenario repeats itself several times, causing 
> overall integration times of many hours.

I think this is documented here:

http://code.qt.io/cgit/qt/qtqa.git/tree/scripts/jenkins/qt-jenkins-integrator.pl#n439

Once $MAX_ATTEMPTS is reached, somebody needs to manually restart the
integrator. CI admins should be notified with an email like
http://lists.qt-project.org/pipermail/ci-reports/2015-April/038140.html
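To illustrate the behaviour described above, here is a minimal sketch of a bounded retry loop. It is not the code from qt-jenkins-integrator.pl (which is Perl); the names run_with_retries, build and notify are hypothetical, and the limit of 8 is only the figure mentioned in this thread.

```python
from typing import Callable

# Assumed limit: the thread mentions the integrator giving up after 8 attempts.
MAX_ATTEMPTS = 8


def run_with_retries(
    build: Callable[[int], bool],
    notify: Callable[[str], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> bool:
    """Call build(attempt) until it succeeds or max_attempts is reached.

    build and notify are hypothetical stand-ins for "start a full
    integration" and "send an email to the CI admins".
    """
    for attempt in range(1, max_attempts + 1):
        if build(attempt):
            return True
    # Every attempt failed: stop retrying and ask a human to step in.
    notify(f"giving up after {max_attempts} attempts; manual restart needed")
    return False
```

The point of the sketch is only that each failed attempt triggers a complete new run, and that nothing is reported to a human until the last attempt has failed.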

> As part of the work on the new CI system, we have observed similar network 
> connectivity related symptoms. We are treating them more gracefully by not 
> discarding otherwise successful results. Nevertheless it is a major annoyance.
> 
> Based on rumors and observation of symptoms it is a theory of Frederik and I 
> that there is a firewall service centrally installed in this virtual network. 
> It shows symptoms of connection tracking and - more importantly - signs of 
> being able to handle only an insufficient amount of traffic or connections. 
> Beyond that limit, connection attempts time out and existing connections 
> become "spotty".
> 
> I would like to get to the bottom of this at some point, because it severely 
> affects the efficiency of the current ci system as well.
> 
> Tony, do you happen to have any more details about this?
> 
> I'll see about filing a ticket with IT next week unless we conclude anything 
> different.
> 
> Simon
> 
>   Original Message
> From: Thiago Macieira
> Sent: Friday, April 3, 2015 07:11
> To: [email protected]
> Subject: [Development] Why are qtbase integrations taking so long?
> 
> 
> qtbase integrations used to take around 3 hours as recently as two weeks ago.
> 
> In the past week, I've caught several integrations lasting more than 6 hours.
> The one currently running is integrating a single commit and has been running
> for 6h30. I've seen one for 12 hours.
> 
> Is this a timeout not caught by the coordinator?
> 
> http://testresults.qt.io/ci/status/ says that it is in state
> "monitor-jenkins-build" and "build_attempt: 6". For attempt 5, the only stage
> not to be at SUCCESS was
> linux-g++_developer-build_qtnamespace_qtlibinfix_RHEL65_x64. The same for
> attempts 3 and 4.

I think that the integrator (coordinator) gives up after 8 attempts. So
if a qtbase run takes around 3 hrs and it is attempted 8 times, you
could easily wait 24 hrs (worst case) if no action is taken:
<http://code.qt.io/cgit/qt/qtqa.git/tree/scripts/jenkins/qt-jenkins-integrator.pl#n1056>
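The worst-case arithmetic, as a rough sketch. The 3 hours per run and the limit of 8 attempts are the approximate figures from this thread, not measured values, and the function names are made up for illustration.

```python
def worst_case_hours(hours_per_attempt: float, max_attempts: int) -> float:
    """Total wait if every attempt runs to completion and fails."""
    return hours_per_attempt * max_attempts


def remaining_worst_case_hours(
    hours_per_attempt: float, max_attempts: int, current_attempt: int
) -> float:
    """Worst-case wait still ahead, assuming current_attempt has just started."""
    attempts_left = max_attempts - current_attempt + 1
    return hours_per_attempt * attempts_left


# Figures assumed from the thread: about 3 hrs per qtbase run, 8 attempts.
print(worst_case_hours(3, 8))
# An integration that has just entered attempt 6 of 8:
print(remaining_worst_case_hours(3, 8, 6))
```

Under these assumptions an integration that has just entered attempt 6 could still have three full runs ahead of it.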

-- 
Sergio Ahumada
[email protected]

_______________________________________________
Development mailing list
[email protected]
http://lists.qt-project.org/mailman/listinfo/development
