Re: [Pharo-project] [Seaside] my site is completely dead..

Sven Van Caekenberghe Wed, 30 Jan 2013 00:34:13 -0800

On 29 Jan 2013, at 19:29, Gastón Dall' Oglio <[email protected]> wrote:


> Hi Sergio.
> 
> Some weeks ago I had to deal with an image that was working normally while a 
> Seaside app within it was not responding (until then, whenever an app was not 
> responding it was always because the image was hung).
> 
> I did some forensic analysis on this image :), and I saw several zombie 
> forked processes, each actually waiting on a semaphore with a very long 
> timeout. See the screenshot: on the left side I inspected the Semaphore to 
> see the DelayWaitTimeout of a sane forked process, while the right side shows 
> the same for a broken one. Also note that these broken processes don't die if 
> you stop and start the Seaside server adaptor.
> 
> To check for this in your image, open a Process Browser (and turn on 
> auto-update) and see if there are several "ZnManagingMultiThreadedServer HTTP 
> worker" processes. If so, terminate some of them and see if the site begins to 
> respond. My app began to respond after I terminated several of them.
> 
> I guess that this problem occurred when I saved AND QUIT the image while 
> those forked processes existed.
> 
> What made the image respond again was manually killing (terminating) all 
> "ZnManagingMultiThreadedServer HTTP worker" processes. In the future I will 
> make sure that no workers are running when I save the image.
> 
> I don't know if it is a bug. If we think it is, I can give more data about 
> the context (my image, package versions, OS, …).

Some clarifications: ZnManagingMultiThreadedServer has one server process 
listening for and accepting incoming requests, forking a worker process each 
time. Such a worker process will loop over HTTP 1.1 request/response cycles 
until the other end closes or something goes wrong. There is currently no 
timeout as such but of course the socket connection dies eventually, so that is 
almost the same thing.
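
To make that process structure concrete, here is a minimal sketch in Python 
rather than Smalltalk. It is not Zinc's code: threads stand in for Smalltalk 
processes, a line-based echo stands in for HTTP 1.1 request/response cycles, 
and the names serve, worker and start are made up for the illustration.

```python
import socket
import threading


def worker(conn):
    """Loop over request/response cycles on one connection until it ends."""
    with conn, conn.makefile("rb") as reader:
        while True:
            try:
                request = reader.readline()  # one line = one "request"
                if not request:
                    break  # the other end closed the connection
                conn.sendall(b"echo: " + request)  # the "response"
            except OSError:
                break  # something went wrong on the socket


def serve(listener):
    """The single server loop: accept, then fork one worker per connection."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:
            break  # the listening socket is no longer usable
        threading.Thread(target=worker, args=(conn,), daemon=True).start()


def start(port=0):
    """Listen on localhost (port 0 = any free port) in a background thread."""
    listener = socket.create_server(("127.0.0.1", port))
    threading.Thread(target=serve, args=(listener,), daemon=True).start()
    return listener
```

As in the description above, the worker has no timeout of its own: it only 
ends when the peer closes or the socket fails.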

The 'Managing' aspect means that the server keeps track of all open connections 
or socket streams. When the server is stopped, all the connections will be 
closed. The idea is that all the worker processes using these connections (a 
one to one mapping) will eventually get an exception that is then handled by 
cleaning up and finally stopping.

This last mechanism, the closing of a socket stream from another process 
resulting in an exception in a process using that connection, does not work 
identically or equally well on the different platforms (Mac, Windows, Linux) 
because these have completely different socket implementations in the VM. 
Saving an image interacts with this in various subtle ways.
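
The 'managing' idea can be sketched as follows, again in Python and not as 
Zinc's actual implementation. The class ManagingServer and all its method 
names are hypothetical. The server remembers every open connection, and stop 
ends them so that each worker wakes up, cleans up and finishes. One assumption 
worth flagging: the sketch calls shutdown on each socket instead of only 
closing it, because a plain close from another thread is not guaranteed to 
wake a blocked reader on every platform, which is the same kind of platform 
dependence described above.

```python
import socket
import threading


class ManagingServer:
    """Toy 'managing' server: tracks open connections so stop() can end them."""

    def __init__(self, port=0):
        self.listener = socket.create_server(("127.0.0.1", port))
        self.listener.settimeout(0.2)  # lets the accept loop see the stop flag
        self.stopping = threading.Event()
        self.lock = threading.Lock()
        self.connections = set()  # open sockets, one per worker
        self.workers = []         # worker threads, one per connection
        self.acceptor = threading.Thread(target=self._accept_loop, daemon=True)
        self.acceptor.start()

    def _accept_loop(self):
        while not self.stopping.is_set():
            try:
                conn, _addr = self.listener.accept()
            except socket.timeout:
                continue  # nothing arrived, check the stop flag again
            except OSError:
                return  # the listening socket failed
            thread = threading.Thread(
                target=self._worker, args=(conn,), daemon=True)
            with self.lock:
                self.connections.add(conn)
                self.workers.append(thread)
            thread.start()

    def _worker(self, conn):
        try:
            while True:
                data = conn.recv(4096)
                if not data:
                    break  # peer closed, or stop() shut the connection down
                conn.sendall(data)  # echo stands in for handling a request
        except OSError:
            pass  # the connection broke underneath us
        finally:
            with self.lock:
                self.connections.discard(conn)
            conn.close()

    def stop(self, timeout=5.0):
        self.stopping.set()
        self.acceptor.join()  # no new connections after this point
        self.listener.close()
        with self.lock:
            open_connections = list(self.connections)
            workers = list(self.workers)
        for conn in open_connections:
            try:
                conn.shutdown(socket.SHUT_RDWR)  # wakes a worker blocked in recv
            except OSError:
                pass  # its worker already closed it
        for thread in workers:
            thread.join(timeout)
```

A worker that never notices the shutdown would survive stop(), which matches 
the stuck workers reported earlier in this thread.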

On my main development platform, Mac, I see no problems. In my production 
deploys on Linux things are fine too. But I do various things to minimise 
problems:

- my images hold no 'running' server(s), these are always created and started 
freshly using a startup script
- I never save images after that
- all the images are controlled by init.d scripts to start automatically with 
the machine
- all my images are controlled by monit so that they restart automatically when 
they stop working
- most of the time, I have multiple images under a load balancer, stateful or 
stateless, to improve availability and capacity
- the load balancer also functions as a sanitizer and controller of incoming 
requests protecting the images
- the load balancer can handle static resources directly, offloading work from 
the images
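
The "restart automatically when they stop working" point can be illustrated 
with a small watchdog. This is a Python stand-in for the idea only: monit has 
its own configuration format, which is not shown here, and the function names, 
the failure threshold and the restart callback are assumptions made for the 
sketch. In a real deploy the callback would invoke whatever restarts the 
image, such as the init.d script.

```python
import http.client
import time
import urllib.request


def is_responding(url, timeout=5.0):
    """Return True if an HTTP GET on url succeeds with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and HTTP error statuses all arrive as
        # OSError subclasses; HTTPException covers malformed responses.
        return False


def watch(url, restart, max_failures=3, interval=10.0, rounds=None):
    """Probe url every interval seconds and call restart() after max_failures
    consecutive failed probes. rounds=None means keep watching forever.
    Returns the number of restarts triggered."""
    failures = 0
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        if is_responding(url):
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        done += 1
        time.sleep(interval)
    return restarts
```

Requiring several consecutive failures before restarting avoids killing an 
image that was only briefly busy.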

http://zn.stfx.eu/zn/index.html#livedemo
http://stfx.eu/pharo-server/

Yes, like any computer program, a Smalltalk VM+image combination has limits: 
there is some maximum number of processes and connections that can be running 
and open at the same time, and there are general memory limits. I am pretty 
sure that with a setup like the one I described above, production systems 
handling hundreds to thousands of requests per second are possible.

Sven

> 2013/1/29 sergio_101 <[email protected]>
> i think i need to bring the image local, and see what's going on.. i am 
> moving it to a new server this week anyway..
> 
> thanks!
> 
> 
> On Tue, Jan 29, 2013 at 8:18 AM, sergio_101 <[email protected]> wrote:
> hey, dale.. it seems like lately, i am seeing this problem at least once a 
> week. there were times when i would run problem free for months, but not 
> lately..
> 
> 
> On Tue, Jan 29, 2013 at 1:24 AM, Dale Henrichs <[email protected]> wrote:
> Sergio,
> 
> Most of my experience is from working with GemStone, which is a different 
> animal, so take what I say with a grain of salt.
> 
> If the running image is completely frozen, then you don't have much choice 
> but to kill it and restart ... hopefully you haven't lost any data ...
> 
> If after restart you see the problem again, then you might be able to debug 
> the issue by copying the image to a local machine and bringing it up ... If 
> the problem doesn't reproduce, I'd still be inclined to take a copy of the 
> image and attempt to understand the particular problem.
> 
> It's hard to tell from the screen shot what the thread is doing or even which 
> thread it is ... it's not likely that the thread is a seaside application 
> thread because those are normally forked and will sit around with an open 
> debugger, but not necessarily affect the image itself. So I can't really 
> guess what operation is causing trouble ...
> 
> If you're lucky you can reproduce the problem on your local machine ... If 
> you search the pharo bug list you might find a bug in this area and from that 
> we might be able to figure out which thread is the bad boy and there might 
> even be a fix ..
> 
> You mentioned stability...are you seeing this particular problem occur often 
> or are you seeing different issues?
> 
> Dale
> 
> ----- Original Message -----
> | From: "sergio t. ruiz" <[email protected]>
> | To: "discussion" <[email protected]>
> | Sent: Monday, January 28, 2013 9:59:11 PM
> | Subject: [Seaside] my site is completely dead..
> |
> |
> | my site completely died today. i tried logging in with vnc, and it
> | seems just stuck.. i can't do anything to it..
> |
> | anyone have any ideas?  i really need this thing to run consistently
> | ..
> |
> | here is a screenshot of its current state:
> |
> | http://db.tt/eVxJX6lr
> |
> | thanks!
> |
> |
> | ----
> | peace,
> | sergio
> | photographer, journalist, visionary
> |
> | http://www.ThoseOptimizeGuys.com
> | http://www.CodingForHire.com
> | http://www.coffee-black.com
> | http://www.painlessfrugality.com
> | http://www.twitter.com/sergio_101
> | http://www.facebook.com/sergio101
> |
> |
> |
> | _______________________________________________
> | seaside mailing list
> | [email protected]
> | http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
> |

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill
