Thank you, Chris, for clarifying.
Those log files are created by the Jenkins Maven support[1], one for
each Maven module of the project that's being built, and Camel has
840 Maven modules. I don't think this approach scales to such a large
number of Maven modules.
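
For anyone who wants to check this against the lsof capture, here is a
rough sketch that counts the open per-module build logs for each job.
It assumes the path layout seen in Chris's sample
(.../jobs/<job>/modules/<module>/builds/<number>/log) and that each
lsof record sits on one line; the helper name `count_module_logs` is
made up for illustration.

```python
import re
from collections import Counter

# Path layout taken from the lsof sample in this thread (an assumption
# about how the Maven job type lays out its per-module build logs):
#   .../jobs/<job>/modules/<module>/builds/<number>/log
MODULE_LOG = re.compile(r"/jobs/([^/]+)/modules/([^/]+)/builds/(\d+)/log$")


def count_module_logs(lsof_lines):
    """Count open per-module build logs per Jenkins job (hypothetical helper)."""
    counts = Counter()
    for line in lsof_lines:
        match = MODULE_LOG.search(line.strip())
        if match:
            # group(1) is the job name, e.g. Camel.trunk.notest.java9
            counts[match.group(1)] += 1
    return counts
```

Run over a capture such as camel-20180215.lsof, for example with
`count_module_logs(open("camel-20180215.lsof"))`, it should show
whether the open log count tracks the number of Maven modules.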

So, Camel devs, it seems we need to switch to a different Jenkins job
type. I propose that we switch to the Jenkins Pipeline job type. I
experimented with that at one point[2], and I can lead this effort.

I wonder, though, whether the same issue will pop up with JUnit test
report archiving in the Pipeline job: roughly the same number of files
will still be transferred to the master. It could be that there has
been some work on the Jenkins side to optimize that for Pipeline.

zoran

[1] https://github.com/jenkinsci/maven-plugin/blob/
[2] https://github.com/zregvart/camel/blob/jenkinsfile/Jenkinsfile

On Fri, Feb 16, 2018 at 6:39 PM, Chris Lambertus <c...@apache.org> wrote:
> Here is a small sample of the types of logs we’re seeing:
>
> cml@jenkins-master:~$ cat camel-20180215.lsof | grep camel | grep log | wc -l
> 2277
>
>
> java    18713 jenkins 4931w      REG               8,17         0  215356692 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-twitter-salesforce/builds/1328/log
> java    18713 jenkins 4932w      REG               8,17         0  215356694 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-twitter-websocket/builds/1328/log
> java    18713 jenkins 4933w      REG               8,17         0  215356696 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-twitter-websocket-blueprint/builds/1328/log
> java    18713 jenkins 4934w      REG               8,17         0  215356698 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-validator-spring-boot/builds/1328/log
> java    18713 jenkins 4935w      REG               8,17         0  215356701 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-widget-gadget-cdi/builds/1328/log
> java    18713 jenkins 4936w      REG               8,17         0  215356706 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-widget-gadget-java/builds/1328/log
> java    18713 jenkins 4937w      REG               8,17         0  215356709 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-widget-gadget-xml/builds/1328/log
> java    18713 jenkins 4938w      REG               8,17         0  215356714 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-zipkin/builds/1328/log
> java    18713 jenkins 4939w      REG               8,17         0  215356725 
> /x1/jenkins/jenkins-home/jobs/Camel.trunk.notest.java9/modules/org.apache.camel.example$camel-example-zipkin-client/builds/1328/log
>
>
>
>> On Feb 16, 2018, at 3:56 AM, Zoran Regvart <zo...@regvart.com> wrote:
>>
>> Hi Chris,
>> thank you for troubleshooting this, can you clarify one thing for me,
>> when you mention log files, are these the `.log` files generated
>> during the test phase of the build or the XML/TXT files with JUnit
>> reports?
>>
>> I would think that the job type being Maven and the automatic
>> gathering of JUnit test reports is the culprit but would like a
>> confirmation.
>>
>> If it is so, I think one possible solution is to migrate to
>> Pipeline/Freestyle job type as, as far as I'm aware, there is no way
>> to prevent Maven job type from gathering JUnit reports.
>>
>> zoran
>>
>> On Fri, Feb 16, 2018 at 2:37 AM, Chris Lambertus <c...@apache.org> wrote:
>>>
>>> Hi Camel PMC,
>>>
>>> We have been having an ongoing problem with Jenkins for quite some time, 
>>> where the CPU usage and IOPS skyrocket on the master. Each time this has 
>>> happened, the jenkins build nodes lose all of their associated labels, and 
>>> all new builds are unable to start.
>>>
>>> In the times I’ve been able to investigate this, there has in each case 
>>> been several Camel builds running, and in each case, the builds are opening 
>>> somewhere between 1500 and 2200 log files, which seems to be killing the 
>>> jenkins master. For comparison, some very large build jobs for Hadoop only 
>>> open ~15 or so log files.
>>>
>>> I have had to take the rather drastic step of disabling the Camel Jenkins 
>>> jobs (many of which have been failing for awhile now) while we continue to 
>>> investigate this issue. Before we re-enable the jobs, we’re going to have 
>>> to figure out how to get your builds to open a sane number of log files — 
>>> the current situation where the builds are creating thousands of log files 
>>> is not sustainable, and we believe this may be one of the causative factors 
>>> of the ongoing jenkins outages. While I cannot say with any certainty that 
>>> this is what’s been killing the master, it’s far enough out of the norm 
>>> that I need to rule it out.
>>>
>>> Please do not re-enable any of the disabled builds until we have had a 
>>> chance to work on this together. Can you please identify someone from the 
>>> project to act as a liaison with Infra to troubleshoot the issues with 
>>> these builds?
>>>
>>> Since I have sent this to a private list, you have my permission to forward 
>>> this message on to your devs or other public lists as you deem appropriate.
>>>
>>>
>>> Thanks,
>>>
>>> -Chris
>>> ASF Infra
>>>
>>
>>
>>
>> --
>> Zoran Regvart
>



-- 
Zoran Regvart
