For the record, I have split this conversation further into two
separate threads: "GSIP 69 - Catalog scalability enhancements - OOM"
and "GSIP 69 - Catalog scalability enhancements - fast startup"


On Sat, Apr 28, 2012 at 4:29 PM, Andrea Aime
<[email protected]> wrote:
> Doh, forgot one last bit that I've found interesting. Citing Gabriel:
>
> ----------------------------------------------------------------------------
>
> If starting up geoserver in seconds instead of minutes, loading the
> home page almost instantly instead of waiting for seconds or even
> minutes, under-second response times for the layers page list with
> thousands of layers, including filtering and paging; not going OOM or
> getting timeouts when doing a GetCapabilities request under
> concurrency and/or low heap size, but instead streaming out as quickly
> as possible, using as little memory as possible, and gracefully
> degrading under load; are not ways of exercising the new API, then I'm
> lost.
>
> ----------------------------------------------------------------------------
>
> All good stuff that I did not see mentioned in the proposal, though
> I can hardly imagine a GS going OOM under concurrent load of
> GetCapabilities unless... well, maybe it has 200k layers and
> works off just 256M of memory (gut feeling estimate).
>
> In the past, OOM with many layers was caused by leaks in the
> DescribeFeatureType subsystem... did you measure how much memory
> it takes to keep the catalog in memory and do GetCapabilities?
>
> Pardon the very rough way of assessing it, but if we take the release
> directory as a reference, we have 19 layers, with quite a few stores that
> could be avoided (single shapefiles instead of directory stores), and
> running the following inside "workspaces" gives me:
>
>  du -csh `find . -name "*.xml"`
> 256K total
>
> which means, on average, 13KB of xml per layer (which still pads it quite
> a bit since the service configuration is shared and normally you don't
> have so many stores). And then it's XML, would you agree that the
> in memory representation should be something like 5 times more compact?
> This would give us a rough estimate of 3KB per layer.
> If I have 200k layers it means 600MB of in memory storage.
> Which is a lot, I'm not denying it, but if you are handling that many layers
> you do also want to have some beefy hardware, 600MB should be peanuts.
>
> I'm not trying to deny the scalability advantages of secondary storage, it
> just seems to me the OOM reports may be a bit exaggerated.
>
> The other thing that raises my interest and worries me is "starting up in
> seconds".
> My understanding is that the current startup slowness is mostly due to the
> validation checks we do on startup to see if a layer/feature type/store is
> working and valid, which results in opening all stores, computing the
> feature types and so on.
>
> Maybe the jdbc config is that much faster because it's not loading
> everything up front and thus those listeners are not being called?
> If so we have a problem at hand, since the listeners are there to prevent
> the caps documents from failing miserably with a service exception at the
> first sign of trouble.
>
> You may say that's a design issue in the caps generator and I would agree.
> Justin and I discussed it a bit during the FOSS4G-NA code sprint: we can
> basically remove those checks if we can make the XML document generation
> "transactional" in some way, that is, put a mark on the output stream,
> generate the XML for a layer, if it's ok push it out, in case of exception
> throw away the buffer and start back from the mark, and so on.
> This would have to be done for all caps documents, for the rest-config parts
> that do list resources, and for all Describe* calls (since they need to
> accept the lack of an identifier as a request to describe everything you
> have in the server).
>
> The above would be very welcome, but we need to make sure it's there
> before un-plugging the listeners that keep GeoServer caps generation sane.
>
> Cheers
> Andrea
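
For reference, the back-of-the-envelope estimate quoted above can be written out as a small calculation. This is only a sketch of that arithmetic: the class and method names are made up for illustration, and the 5x compaction factor is the gut-feeling figure from the message, not a measurement.

```java
/** Hypothetical helper that reproduces the rough estimate from the quoted message. */
final class CatalogMemoryEstimate {

    /** KB of in-memory catalog data per layer, given the on-disk XML size. */
    static double inMemoryKbPerLayer(double totalXmlKb, int layers, double compaction) {
        // 256 KB of XML over 19 layers is about 13 KB per layer on disk;
        // dividing by the assumed compaction factor gives the in-memory guess.
        return totalXmlKb / layers / compaction;
    }

    /** Total in MB (decimal, 1 MB = 1000 KB) for a given number of layers. */
    static double totalMb(double kbPerLayer, int layerCount) {
        return kbPerLayer * layerCount / 1000.0;
    }
}
```

With the numbers from the message (256 KB, 19 layers, factor 5) the per-layer figure rounds to 3 KB, and 200k layers at 3 KB each come to 600 MB. Without the rounding the total is closer to 540 MB.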

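The "transactional" generation idea quoted above (mark the output, encode one layer, push it out if it worked, throw it away otherwise) could look roughly like the following. This is a minimal sketch under stated assumptions, not GeoServer code: BufferedLayerWriter and LayerEncoder are hypothetical names, and a per-layer in-memory buffer stands in for a real mark on the output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are GeoServer API. */
final class BufferedLayerWriter {

    /** Hypothetical callback that encodes one layer and may fail half way. */
    interface LayerEncoder {
        void encode(String layerName, OutputStream out) throws Exception;
    }

    /**
     * Encodes each layer into a scratch buffer and copies it to the real
     * stream only if encoding succeeded. Returns the names of skipped layers.
     */
    static List<String> writeLayers(List<String> layers, LayerEncoder encoder, OutputStream out)
            throws IOException {
        List<String> skipped = new ArrayList<>();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (String layer : layers) {
            buffer.reset(); // back to the "mark": the scratch buffer is empty again
            try {
                encoder.encode(layer, buffer);
            } catch (Exception e) {
                skipped.add(layer); // broken layer: its partial output is never sent
                continue;
            }
            buffer.writeTo(out); // layer encoded completely, push it out
        }
        out.flush();
        return skipped;
    }
}
```

Only one layer is buffered at a time, so the rest of the document can still be streamed. A layer that throws simply does not show up in the output instead of turning the whole response into a service exception.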


-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel