Hi all,

recently I was investigating an OOM reported by a user who was simply using OpenLayers with tiles and meta-tiling on a single machine (so just one user connected to GeoServer).
The result of the investigation is not completely new, but it's worrisome anyway. The user was moving around a lot in OL, panning and zooming, and the VM was running with default settings, which on that platform meant a heap of only 64MB.

Each request resulted in the building of a 3x3 meta tile, though of course not all requests triggered that, as the code prevents the same meta tile from being computed in parallel by more than one thread. I added some machinery to count the requests being processed concurrently, and usually the count was 6 (the default Firefox max connections). But if someone starts zooming around while OL is still asking for the tiles of the current level, boom, one can easily get up to 30-40 concurrent requests and the OOM is pretty much guaranteed.

The thing is, Firefox gives up on the older requests, but GeoServer does not know that until it actually tries to write something to the response, which happens only after the rendering is fully done. Given that each meta tile uses 2+MB of memory, it does not take much to fill up a 64MB heap, especially since a good part of it is already taken by the HSQL EPSG database cache, around 19MB (hopefully switching to H2 will give us some breathing room in the future).

We really need to find a way to make GeoServer stop working on requests that the client has dropped. I've looked around a bit, and here is what I've found.

Apache in CGI mode kills the CGI process as soon as the connection is dropped. In Java we cannot do that: we're using threads, the threads share resources, and one cannot kill a thread without bad consequences. I looked into the servlet API but could find no "supported" way to tell whether the client connection is still alive; it seems one actually has to try and write something to the output.
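To make the "try and write something to the output" idea a bit more concrete, here is a minimal sketch using only plain java.io streams, so it runs without a servlet container. ClientProbe, looksAlive and runWhileAlive are made-up names for illustration, not existing GeoServer or servlet API. It also assumes that a failed flush shows up as an IOException; whether a real container touches the socket on an empty flush, and what that does to the response state, is something I have not verified.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper for illustration only; not GeoServer or servlet API. */
final class ClientProbe {
    private ClientProbe() {}

    /**
     * Returns false if flushing the stream fails, which we take as a hint
     * that the client is gone. A successful flush does not prove the client
     * is still there, it only means no error was reported.
     */
    static boolean looksAlive(OutputStream out) {
        try {
            out.flush();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /**
     * Runs the steps in order, probing the stream before each one and
     * stopping early once the probe fails. Returns the number of steps run.
     */
    static int runWhileAlive(OutputStream out, Runnable[] steps) {
        int done = 0;
        for (Runnable step : steps) {
            if (!looksAlive(out)) {
                break; // client seems gone, skip the remaining work
            }
            step.run();
            done++;
        }
        return done;
    }
}
```

In a renderer the "steps" would be natural checkpoints, for example one per layer, so that a dropped request stops consuming memory at the next checkpoint instead of at the very end.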
I asked on the Sun J2EE servlet forum and got a couple of answers:
http://forums.sun.com/thread.jspa?threadID=5408542

The idea of trying to flush() periodically seems to be a good one. I've read in other places that flushing the output stream should not turn the response into committed status:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4362880

The reason it's important that flush() does not commit the response is that once the response is committed the headers are fixed and cannot be modified, and our dispatch system sets them only after the response object has been created (the fully rendered image in our case). Since we want to call flush() periodically during the rendering, we would be in trouble if it did commit, because the headers are set only after that.

Alternatively, or in parallel to this, we could make sure no more than X threads are rendering at any time. This could be done with a concurrent queue limited in size: each rendering action tries to push a token into it and ends up waiting if the queue is full. This would solve the OOM, but would make all the new requests wait until the older ones have been rendered and dropped, basically making GS WMS unusable for a while.

Failing everything else, this may not be such a bad idea. With a little generalization we could apply this at the dispatcher level and allow the administrator to limit the number of requests GS serves concurrently for each service (typically you can serve many more WFS requests in parallel than WMS ones).

Another option that comes to mind is to get our hands dirty and write plugins that leverage container-specific APIs to check whether the connection is still alive. The downside is that it would work only for specific versions of specific containers, and I haven't checked whether such an API exists at all.

Well, does anybody have experience with this? Suggestions?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.
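For reference, the size-limited queue idea from above could look roughly like the following. This is only a sketch under assumptions: RenderLimiter, render and active are hypothetical names, nothing like this exists in GeoServer today, and the limit would have to come from some configuration that is not shown. A java.util.concurrent.Semaphore would do the same job; the bounded queue is used here just because it matches the description.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical limiter for illustration only; not existing GeoServer code. */
final class RenderLimiter {
    private static final Object TOKEN = new Object();
    private final BlockingQueue<Object> slots;

    /** maxConcurrent must be at least 1 (ArrayBlockingQueue rejects smaller capacities). */
    RenderLimiter(int maxConcurrent) {
        this.slots = new ArrayBlockingQueue<>(maxConcurrent);
    }

    /** Blocks until a slot is free, runs the task, then frees the slot. */
    <T> T render(Supplier<T> task) {
        try {
            slots.put(TOKEN); // waits while maxConcurrent tasks hold a slot
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for a render slot", e);
        }
        try {
            return task.get();
        } finally {
            slots.poll(); // give the slot back, even if the task failed
        }
    }

    /** Number of tasks currently holding a slot. */
    int active() {
        return slots.size();
    }
}
```

With one limiter per service, something like new RenderLimiter(6) for WMS and a larger value for WFS would cap the number of meta tiles held in memory at once, at the price of making newer requests wait as described above.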
_______________________________________________
Geoserver-devel mailing list
https://lists.sourceforge.net/lists/listinfo/geoserver-devel
