Pierre i am not using SQLite but pg hosted in an other container on the same box. I'd say the issue isn't which dB engine i am running as it also happens with MySQL.
Thanks, Francesco On Dec 13, 2016 12:40 AM, "Pierre Tardy" <[email protected]> wrote: > Hi Francesco, > I would say it is normal that the CPU goes to 100% while the log are being > compressed. As soon as anything else is not blocked, for me this is fine. > I can see you are using sqlite. The thing is with sqlite the number of > parallel access to the db is restricted to one by default in order to > protect against weird parallel behaviour we got with sqlite. > I think this explains your behaviour. > I would recommend as an easy fix to switch to pg or mysql. I'll try to > think of some workarounds like compressing the log only when there is a low > load average. This is already something that we though about with my > teammates. Log compression is important, but not urgent, so this is fine to > delay until there is not so much to do. > > Pierre > > > > Le lun. 12 déc. 2016 à 12:40, Francesco Di Mizio < > [email protected]> a écrit : > >> Hey Pierre, >> >> first of all thanks for taking a look. I never ever have more than 9-10 >> builds going on at the same time. Like I said many times the system can >> happily be sitting on a nice 10-15% CPU usage when even 8 builds are >> running, but the CPU spikes can happen at any time with any number of >> builds going on. Things seem to go worse when the system runs for a few >> days with no reboot. Just now I had 7 builds running and the cpu keeps >> oscillating between 10% and 120%, now it's contantly over 100%. I wonder >> if we can find a way for you to access the system so you can take a look >> yourself (of course if you are fine with it)? >> >> I will def be looking at multimaster in the future, just have no time atm. >> >> Thanks again, >> Francesco >> >> On Mon, Dec 12, 2016 at 11:18 AM, Pierre Tardy <[email protected]> wrote: >> >> Hi Francesco, >> I finally got the time to look at your profile. >> >> I dont see anything other than normal load. Buildbot is hardworking at >> compressing the log in the db, managing the build and steps, etc. >> How many build in parallel do you have for your master? Maybe its time to >> start thinking about multimaster. >> >> Regards, >> Pierre >> >> Le lun. 28 nov. 2016 à 11:30, Francesco Di Mizio < >> [email protected]> a écrit : >> >> Reviving this old thread. According to things that were said on this >> thread, I reduced as much as I could all the logs produced by all steps. >> Bigger has now 300 lines. >> Despite that I am still seeing cpu spiked. I managed to grab a hopefully >> useful json from the profiler which you'll find attached. Toards the end >> you can clearly see the load goes through the roof. >> >> If anybody could take a look I'd really appreciate! >> >> Thanks, >> Francesco >> >> On Fri, Aug 26, 2016 at 6:06 PM, Neil Gilmore <[email protected]> >> wrote: >> >> Hi Pierre, >> >> Well, my previous message was really saying that I think buildbot can >> handle logs a lot bigger then 12K lines. >> >> It seems unlikely that it's stuck on log compression. The step says it's >> been running for 9.5 hours, and previous runs of the entire build have >> durations in the 1-3 hour range. The particular step is a ShellCommand that >> actually does the build. The place where the log stops is during an svn up >> operation. >> >> I don't find any exceptions in the log for this, though that's not >> definitive (current number of log files is 122, and sometimes finding >> things is tricky). I do see the build cancellation in the log, and it also >> shows up in the UI. >> >> I'm betting on a lost deferred. So far, I've never seen an exception >> relating to that. >> >> On the bright side, this seems to be the only worker whose builders are >> stuck (some of the rest are waiting for locks, which this build has). >> Usually, there's more. >> >> Neil Gilmore >> grammatech.com >> >> >> On 8/26/2016 10:35 AM, Pierre Tardy wrote: >> >> Hum, >> It might not be stuck, actually, but just spending very long time to >> compress the log. >> >> In theory, the log compression is not waited for, though. >> >> Note that if you stop a build that is waiting for a lost deferred, this >> will have no effect as you describe. >> >> You got absolutly no exception in twisted.log? >> >> Pierre >> >> >> Le ven. 26 août 2016 à 17:06, Neil Gilmore <[email protected]> a >> écrit : >> >> I'm currently looking at a step with 3 logs: >> 81531 lines >> 489285 lines >> 489311 lines >> >> An earlier successful run would have that first log at 244080 lines. >> >> This particular build is stuck, though. :( (which is why I'm looking at >> it.) (A bit off-topic, but I tried stopping it. The last step is marked >> cancelled, but that's the only effect.) >> >> >> Neil Gilmore >> grammatech.com >> >> >> On 8/26/2016 6:25 AM, Pierre Tardy wrote: >> >> >> The ram looks like sufficient, it might be a good test to try and >> increase the number of cpu for that VM. >> In your trace, I can see the use of up to 7 threads at the same time, so >> you might gain by going 8 CPUs >> >> Also, make sure that your VM host is not overbooked. In my experience of >> using VMware VMs provided by IT, overbooking has been a source of >> inexplicable performance issues. >> >> 12k lines is a lot, but buildbot shall support this kind of load without >> issue. >> >> Le ven. 26 août 2016 à 12:05, Francesco Di Mizio < >> [email protected]> a écrit : >> >> It's a vmware virtual machine with 4 GIGs RAM and 4 CPUs@3Ghz. It runs, >> among other marginal things, 2 docker containers - one for the buildbot and >> one for the postgres db. >> >> The most beefy logs have around 12K lines. Is it too much? >> Also some other logs are read from the worker's filesystem and added as >> additional logs. >> >> On Fri, Aug 26, 2016 at 11:51 AM, Pierre Tardy <[email protected]> wrote: >> >> Cool! >> I can indeed see 3 spikes. >> >> Looks related to logs and logs compression. >> >> What is the HW spec of your master machine? >> How much log does your build generate? >> >> Pierre >> >> Le ven. 26 août 2016 à 11:42, Francesco Di Mizio < >> [email protected]> a écrit : >> >> Pierre, >> >> I enabled it, waited 1 min and saw the spike, then stopped after a few >> secs. Attached the json. >> Awesome tool btw, work wonders! >> >> On Thu, Aug 25, 2016 at 1:14 PM, Pierre Tardy <[email protected]> wrote: >> >> You can try to hit the button in the morning with a two hours gather >> period, and hope that you see the spike during that period.. >> >> >> Le jeu. 25 août 2016 à 12:17, Francesco Di Mizio < >> [email protected]> a écrit : >> >> Thanks a lot! Pierre I will def will give it a shot. I am not sure I'll >> be able to smash that 'start recording' button as the UI isusuallystuck >> when the CPU spikes. Updates to come! >> >> On Thu, Aug 25, 2016 at 10:45 AM, Pierre Tardy <[email protected]> wrote: >> >> Hi Francesco, >> >> I spent some time in order to implement a profiler plugin for buildbot >> >> You can give it a look, and send your profile.json file if you need more >> analysis from me. >> https://github.com/tardyp/buildbot_profiler >> >> Regards, >> Pierre >> >> >> Le mer. 24 août 2016 à 22:43, Francesco Di Mizio < >> [email protected]> a écrit : >> >> I've tried and it's not an easy task because of my Win into Vagrant into >> Docker setup. >> I'll try again soon when I get a Linux box! >> >> On Fri, Aug 19, 2016 at 5:54 PM, Vasily <[email protected]> wrote: >> >> Umm, no. VTune has Python support starting 2017 Beta, and, well, it was >> my team (at Intel) work actually :-) >> >> P.S. I'm from Intel, too. >> >> Thanks, >> Vasily >> 19 авг. 2016 г. 18:17 пользователь "Francesco Di Mizio" < >> [email protected]> написал: >> >> I had thought you were making fun of Intel somehow ;) >> >> On Aug 19, 2016 5:07 PM, "Pierre Tardy" <[email protected]> wrote: >> >> ahah >> >> I though this was a taunt on me being employed by Intel. >> I actually had mitigated experience with vtune few years ago, and didn't >> know they had python support until then. >> Being an opensource guy, I usually neglegate to look at proprietary stuff. >> >> Pierre >> >> Le ven. 19 août 2016 à 12:18, Vasily <[email protected]> a écrit : >> >> I'm again suggesting to look into Python profiling capabilities of Intel® >> VTune™ Amplifier. It could run statistical profiling for a long time and >> display CPU usage over time, so the developer can look at specific time >> range where CPU usage was too high and see which functions were executed. >> >> Thanks, >> Vasily >> 19 авг. 2016 г. 11:57 пользователь "Pierre Tardy" <[email protected]> >> написал: >> >> Hi Francesco, >> >> Your described setup looks sane to me. >> >> The problems we are trying to catch are cpu spikes, as far as I >> understand, which does not happen for very long, but are very annoying for >> users, as it is blocking the reactor. >> >> This problem is not easy to see in the profile you sent, as this profile >> is over long time, so we see the average of each method during the day and >> not the spikes. >> >> What would really be needed is a on-demand profiler which would detect >> cpu spikes and only log the stack traces during those times. >> >> Here is a nice blog pst explaining why statistic profiling is cool and >> easy to implement in python. >> https://nylas.com/blog/performance >> >> For 0.9.1 I want to concentrate on scalability, and write a debugging ui >> plugin based on those ideas (and probably code) >> >> That would be great if your team can help on that matter. >> >> Regards, >> Pierre >> >> >> >> >> >> >> >> >> _______________________________________________ >> users mailing >> [email protected]https://lists.buildbot.net/mailman/listinfo/users >> >> _______________________________________________ >> users mailing list >> [email protected] >> https://lists.buildbot.net/mailman/listinfo/users >> >> >> >> _______________________________________________ >> users mailing list >> [email protected] >> https://lists.buildbot.net/mailman/listinfo/users >> >> >> >>
_______________________________________________ users mailing list [email protected] https://lists.buildbot.net/mailman/listinfo/users
