Re: [Sugar-devel] git.sugarlabs.org down for unplanned maintenance
On 04/12/2014 02:07 AM, Sebastian Silva wrote:
> Here I just got home. Sorry for the inconvenience I might have caused.
> Bernie, do you know which log was/is growing out of hand?

Both access.log and node.sugarlabs.org.log. I discarded the first and compressed the second (it compresses very well). You can still examine it by doing:

    xzless access.log-20140411.xz | tail

You'll see lines like this one:

    node.sugarlabs.org:80 181.65.159.107 - - [11/Apr/2014:19:51:36 -0400] GET /?cmd=subscribe HTTP/1.1 200 232 - python-requests/1.2.1 CPython/2.7.0 Linux/2.6.35.13_xo1.5-20120508.1139.olpc.eb0c7a8

The problem seems to be that laptops retry the connection to /context.atom and /feedback.atom quickly. It's probably near the end of the file though. Don't try to uncompress the whole file because it's over 2GB.

--
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel
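A minimal sketch of how the compressed log mentioned above could be summarized without ever decompressing the whole file. It assumes the line layout of the single sample quoted in the message (virtual host, client address, timestamp, request, status); the function names and the regular expression are illustrative and not part of any Sugar Labs tooling.

```python
import lzma
import re
from collections import Counter

# Assumed layout, taken from the one sample line in the message:
# vhost:port client - - [timestamp] METHOD path PROTOCOL status size ...
# Quotes around the request are optional because the archived sample has none.
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"?(?P<method>[A-Z]+) (?P<path>\S+) \S+"? (?P<status>\d{3}) '
)


def count_requests(lines):
    """Count requests per path and per client from an iterable of log lines."""
    by_path, by_client = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that does not look like an access-log entry
        by_path[match["path"]] += 1
        by_client[match["client"]] += 1
    return by_path, by_client


def summarize(xz_path, top=10):
    """Stream an .xz log line by line and return the busiest paths and clients."""
    # lzma.open in text mode decompresses incrementally, so memory use stays
    # small even when the uncompressed log is several gigabytes.
    with lzma.open(xz_path, mode="rt", encoding="utf-8", errors="replace") as log:
        by_path, by_client = count_requests(log)
    return by_path.most_common(top), by_client.most_common(top)
```

Calling `summarize("access.log-20140411.xz")` would return two lists of `(value, count)` pairs, which should make it easy to see whether a few paths or a few client addresses dominate the traffic.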
Re: [Sugar-devel] git.sugarlabs.org down for unplanned maintenance
Here I just got home. Sorry for the inconvenience I might have caused. Bernie, do you know which log was/is growing out of hand?

Here's a report on everything I know about the issue.

We've been experiencing some performance degradation and also some downtime in Sugar Network services (this is documented at http://tareas.somosazucar.org/hxp/issue71 ).

We've seen a burst in users since the deployment of OS images with Sugar Network features (user_total at http://network.sugarlabs.org/stats-viewer/ is growing pretty fast).

There is a notification feature that polls the Sugar Network node service. This was causing the allocation and exhaustion of resources (open files). Crashes reached a frequency of about one per hour.

It's code I don't understand really well, but I went ahead and patched the Sugar Network with:

http://tareas.somosazucar.org/hxp/file66/sn_disable_notifications.patch

This made the SN much snappier and it stopped crashing. However, the logs were recording a traceback several times per second. I thought I had contained the log issue, but apparently I missed some other logs (I guess Apache logs, but they seem clean now).

I took a glance at jita and could not find the growing log.

Let me know where I can help with mitigation.

Regards,
Sebastian
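One common way to keep polling clients from hammering a struggling server is exponential backoff with jitter. The sketch below is only an illustration of that general technique; it is not the Sugar Network client code, and the names `backoff_delays` and `poll_with_backoff`, the default delays, and the choice of `OSError` as the retryable error are all assumptions.

```python
import random


def backoff_delays(base=1.0, cap=300.0, factor=2.0, rng=random.random):
    """Yield retry delays in seconds: exponential growth, capped, with jitter."""
    ceiling = base
    while True:
        # Full jitter: a random delay in [0, ceiling) keeps many clients that
        # failed at the same moment from all retrying at the same instant.
        yield rng() * ceiling
        ceiling = min(cap, ceiling * factor)


def poll_with_backoff(fetch, sleep, max_attempts=8):
    """Call fetch() until it succeeds, waiting longer after each failure."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(next(delays))
```

Here `fetch` stands for whatever function performs one poll and `sleep` would normally be `time.sleep`; passing it in as a parameter keeps the retry logic easy to test.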
Re: [Sugar-devel] git.sugarlabs.org down for unplanned maintenance
I know nothing about our infrastructure, but is it possible to run projects in development, like Sugar Network, on a different server/VM than critical services like git?

Gonzalo

--
Gonzalo Odiard
SugarLabs - Software for children learning
Re: [Sugar-devel] git.sugarlabs.org down for unplanned maintenance
If it is necessary it could be moved. Sugar Network at this point is in production and we do have a process for deployment (there are testing and devel instances). Alsroot has had a pretty good track record of keeping git up and running. I think it's a shame people are moving to GitHub.

What we should be asking, I think, is how we can provide better service for our users/developers (for example, having more people monitoring services and reacting when things crash).

Regards,
Sebastian
[Sugar-devel] git.sugarlabs.org down for unplanned maintenance
I was notified that git.sugarlabs.org was showing errors. After some head scratching I realized that the root filesystem on jita was full.

I looked around and found giant request logs containing millions of requests apparently originating from XOs located in Peru. We've been DDoSed by our own creature :-)

Anyway, the machine also had a giant, very fragmented MySQL database that I'm currently cleaning up. Gitorious will be back online in less than 1 hour.

Contact me on IRC if this is blocking your work; I can postpone the maintenance.

--
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team
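For anyone chasing a similar full-disk problem, a rough sketch of the two checks described above: how full a filesystem is, and which files under a directory are largest. The function names and any directory you pass in (for example a log directory) are assumptions for illustration; this is not the procedure actually used on jita.

```python
import heapq
import os
import shutil


def percent_used(path="/"):
    """Return how full the filesystem containing path is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, count=10):
    """Return up to count (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return heapq.nlargest(count, sizes)
```

Running `percent_used("/")` first and then `largest_files` on the suspect directory gives a quick picture of what is filling the disk.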
Re: [Sugar-devel] git.sugarlabs.org down for unplanned maintenance
Thanks Bernie :)

--
Gonzalo Odiard
SugarLabs - Software for children learning