On Sat, Apr 28, 2012 at 4:29 PM, Andrea Aime <[email protected]> wrote:
> Doh, forgot one last bit that I've found interesting. Citing Gabriel:
>
> ----------------------------------------------------------------------------
>
> If starting up geoserver in seconds instead of minutes, loading the
> home page almost instantly instead waiting for seconds or even
> minutes, under-second response times for the layers page list with
> thousands of layers, including filtering and paging; not going OOM or
> getting timeouts when doing a GetCapabilities request under
> concurrency and/or low heap size, but instead streaming out as quickly
> as possible, using as little memory as possible, and gracefully
> degrading under load; are not ways of exercising the new API, then I'm
> lost.
>
> ----------------------------------------------------------------------------
>
> All good stuff that I did not see mentioned in the proposal, though
> I can hardly imagine a GS going OOM under concurrent load of
> GetCapabilities unless... well, maybe it has 200k layers and
> works off just 256M of memory (gut feeling estimate).
>
> In the past OOM in case of many layers is caused by leaks in the
> DescribeFeatureType subsystem... did you measure how much memory
> does it take for the keep the catalog in memory and do GetCapabilities?
>
> Pardon the very rough way of assessing it, but if I we take as reference
> the release directory we have 19 layers, with quite a bit of stores that
> could be avoided (single shapefiles instead of directory stores), and
> running inside "workspaces" the following gives me:
>
> du -csh `find . -name "*.xml"`
> 256K total
>
> which means, on average, 13KB of xml per layer (which still pads it quite
> a bit since the service configuration is shared and normally you don't
> have so many stores). And then it's XML, would you agree that the
> in memory representation should be something like 5 times more compact?
> This would give us a rough estimate of 3KB per layer.
> If I have 200k layers it means 600MB of in memory storage.
> Which is a lot, I'm not denying it, but if you are handling that many layers
> you do also want to have some beefy hardware, 600MB should be peanuts.
>
> I'm not trying to deny the scalability advantages of secondary storage, it
> just seems to me the OOM reports may be a bit exaggerated.
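
For reference, the back-of-envelope estimate quoted above can be reproduced with a few lines of shell. This is only a sketch of the quoted arithmetic: the variable names are made up for illustration, and the 5x compaction factor is the guess from the quoted message, not a measurement.

```shell
#!/bin/sh
# Inputs taken from the quoted message; variable names are illustrative only.
xml_total_kb=256       # "256K total" reported by du
layers=19              # layers in the release data directory
compaction=5           # guessed XML-to-heap compaction factor (an assumption)
target_layers=200000   # hypothetical catalog size

# Average XML per layer, rounded to the nearest KB.
xml_per_layer_kb=$(awk -v t="$xml_total_kb" -v n="$layers" \
    'BEGIN { printf "%.0f", t / n }')

# Estimated in-memory size per layer, rounded to the nearest KB.
mem_per_layer_kb=$(awk -v x="$xml_per_layer_kb" -v c="$compaction" \
    'BEGIN { printf "%.0f", x / c }')

# Total for the target catalog in MB (1 MB taken as 1000 KB, to match
# the round figures used in the quoted message).
total_mb=$(awk -v m="$mem_per_layer_kb" -v n="$target_layers" \
    'BEGIN { printf "%.0f", m * n / 1000 }')

echo "xml per layer: ${xml_per_layer_kb}KB"
echo "memory per layer: ${mem_per_layer_kb}KB"
echo "total for ${target_layers} layers: ${total_mb}MB"
```

With the quoted inputs this gives 13KB, 3KB and 600MB. Setting target_layers to 25000 gives about 75MB by the same rough figures.
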
Actually it is not exaggerated. I can make it go OOM with 25K layers (not
200K), 2GB heap size, and 20 concurrent GetCapabilities requests.

Try the following for yourself, from the GSIP69 branch
<https://github.com/groldan/geoserver/tree/GSIP69> (finally managed to redo
it with squashed commits where the system builds ok on each one, the old
branch is still there for reference, called GSIP69_old):

1- Check out the code at a point where the getcapabilities processing
doesn't use the new API (commit "GSIP-69: add catalog bulk copy tool...")

2- Run against the release data directory, with -Xmx1024m
-XX:MaxPermSize=128m -XX:+UseCompressedOops (Oracle Java 6, 64bit, Linux),
but without the -P jdbcconfig profile, in order to use the default catalog.

3- Disable GWC's "Automatically configure a GeoWebCache Layer for every new
Layer and LayerGroup"

4- Use the catalog bulk load tool and add 25k copies of
topp:tasmania_water_bodies

5- Shut down, check out the master branch so you're sure no GSIP69 code gets
in the middle, and restart geoserver (I'm doing all this through eclipse).

6- Connect jconsole to the java process

7- Run

    curl -v "http://localhost:8080/geoserver/ows?service=wms&version=1.3.0&request=GetCapabilities" > caps.xml

Then

    grep "<Layer" caps.xml | wc -l

gives something like 25122, all right.

8- Check jconsole, memory usage should be around 227M

9- Hit "Perform GC", memory should go down to around 130M. Clearing the
resource cache again takes it up to over 300M, hit "Free memory" and it
should get it back to ~130M again.

10- Run

    ab -n 10 -c 10 "http://localhost:8080/geoserver/ows?service=wms&version=1.3.0&request=GetCapabilities"

11- Go check memory consumption in jconsole. Memory fills up almost
completely. Both the old gen and the eden memory pools are up to the top.
Jconsole says ~950M are in use. And this is only 10 concurrent
GetCapabilities requests. "ab" reports a mean response time of 184.2
seconds. Other times ab times out after 10 minutes or so. Most of the time
is spent on GC. Running with a 2GB heap shows GeoServer is happy with
1.17 GB to serve 10 getcaps requests, with a mean of 23 seconds. But with 20
concurrent requests, it eats up to 1.8GB of heap and then the GC has a hard
time avoiding an OOM, which finally happens (*) after about 750 seconds.

12- Shut down GeoServer and check out the commit "GSIP-69: port WMS
GetCapabilities 1.3 to extended Catalog API"

13- Start up GeoServer (it's gonna take a while) and repeat the process from
6) to 11).

This time ab reports (on my system) a mean response time of 14.794 seconds.
Memory usage in jconsole peaks at about 370M, then goes down to about 180M
without explicitly calling the GC. And back to 130M if clearing the resource
cache.

Doing it with 100 concurrent requests instead of 10, memory usage barely
exceeds 550M, and mean response time is about 130.9 seconds. Pretty much
linearly scaling. And this is with the default catalog, which has to do
sorting in-memory.

If doing the same with 100 concurrent requests, but without the getcaps
transformer ported to the new API (same commit as in 1), the 1G heap fills
up and ab times out:

    "Benchmarking localhost (be patient)...apr_poll: The timeout specified has expired (70007)"

If adding a bigger timeout (add -t 3600 to the ab arguments),

(*)
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at sun.misc.FloatingDecimal.dtoa(FloatingDecimal.java:659)
    at sun.misc.FloatingDecimal.<init>(FloatingDecimal.java:440)
    at java.lang.Double.toString(Double.java:179)
    at java.lang.String.valueOf(String.java:2973)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.handleBBox(Capabilities_1_3_0_Transformer.java:1174)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.handleLayer(Capabilities_1_3_0_Transformer.java:834)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.handleLayerTree(Capabilities_1_3_0_Transformer.java:791)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.handleLayers(Capabilities_1_3_0_Transformer.java:658)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.handleCapability(Capabilities_1_3_0_Transformer.java:437)
    at org.geoserver.wms.capabilities.Capabilities_1_3_0_Transformer$Capabilities_1_3_0_Translator.encode(Capabilities_1_3_0_Transformer.java:252)

_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel
