Hi Andrea,

for the record, here's the branch with the catalog loader optimization [1].
I still need to add some docs before
proposing it as a community module, but it's working OK in a vanilla
GeoServer deployment with ~80k layers, ~4k workspaces,
and WMS/WFS/WCS/WMTS services configured for each workspace individually,
which surprisingly was a big perf offender.

That said, it won't help with the home page combos at all.

My proposal would be to use progressive loading instead of preemptively
loading all workspaces/layers. The downside is that you need to
know at least a couple of letters of what you're looking for, but IMO
it's a good compromise. On the catalog side, it'd only perform
well if there's an actual full-text search engine backing the search. Back
in the day I had a prototype of an in-memory Lucene index
serving the UI's full-text searches that worked like a
charm, but IIRC it lacked a good way to update the index whenever
something changes. That could be worth some research.
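
To make the progressive loading idea concrete, here is a minimal sketch,
not the Lucene prototype mentioned above and not existing GeoServer code.
It swaps the full-text index for a plain sorted map that only supports
prefix matching, refuses prefixes shorter than two characters, and caps
the number of results. The class name LayerNameIndex, its methods, and the
two-character minimum are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical helper (not a GeoServer class): prefix lookup over layer names. */
final class LayerNameIndex {

    /** Assumed minimum: "a couple of letters" before any lookup is attempted. */
    static final int MIN_PREFIX_LENGTH = 2;

    // Lower-cased name -> original name. Sorted, so all names sharing a prefix
    // are contiguous. Names that differ only by case would collide here.
    private final ConcurrentSkipListMap<String, String> names =
            new ConcurrentSkipListMap<>();

    /** To be called when a layer is added (the index must track catalog changes). */
    void add(String layerName) {
        names.put(layerName.toLowerCase(Locale.ROOT), layerName);
    }

    /** To be called when a layer is removed. */
    void remove(String layerName) {
        names.remove(layerName.toLowerCase(Locale.ROOT));
    }

    /** Returns at most {@code limit} names starting with {@code prefix}, ignoring case. */
    List<String> search(String prefix, int limit) {
        List<String> matches = new ArrayList<>();
        if (prefix == null || prefix.length() < MIN_PREFIX_LENGTH || limit <= 0) {
            return matches; // prefix too short: load nothing
        }
        String key = prefix.toLowerCase(Locale.ROOT);
        // tailMap starts at the first key >= prefix; stop at the first
        // non-matching key or once the limit is reached.
        for (Map.Entry<String, String> e : names.tailMap(key, true).entrySet()) {
            if (!e.getKey().startsWith(key) || matches.size() >= limit) {
                break;
            }
            matches.add(e.getValue());
        }
        return matches;
    }
}
```

With names such as "topp:states" and "topp:tasmania_roads" indexed,
search("TOPP:", 10) would return both in sorted order, while search("t", 10)
returns nothing because the prefix is too short. The add/remove calls are
where the "update the index whenever something changes" problem shows up:
something has to invoke them on every catalog modification.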

[1]
https://github.com/groldan/geoserver/tree/catalog/perf/data_directory_loader


On Thu, 24 Nov 2022 at 07:18, Andrea Aime <andrea.a...@geosolutionsgroup.com>
wrote:

> Hi all,
> I've just got a report from a customer that they tried to upgrade to
> 2.22.0, but had to quickly revert back to 2.21.x, as the GeoServer home
> page was unreachable.
>
> What is interesting about that deployment is the number of layers, well
> above 20k. Not the largest I've seen, but large. Also, all the layers are
> sourced from an Oracle database.
> In their case, the home page takes several minutes to load.
>
> Locally I have an oddball test data directory with 40k layers, but with an
> easing factor, it's a "many times copy" of the GeoServer demo layers,
> meaning it's all shapefiles.
> The landing page for me displays quick enough (few seconds), but then
> the browser is on its knees, completely unresponsive, for 10+ seconds.
> After that, trying to use the workspace/layer dropdown also incurs a
> severe slowdown, with the browser blocked for several seconds.
> Chrome reports that one tab with the home page is using 776MB of memory,
> too.
>
> Considering I've seen installations with up to 1 million layers (a case
> where they actually had 3 million, and split them across 3 different data
> directories), this is a serious problem...
>
> I have also seen Gabriel experiment with large geoserver-cloud deployments
> with a lot of workspaces (tens of thousands? more?) but I cannot find the
> relevant branch anymore (believe it was about better parallelizing data
> directory loading, cannot find the commit anymore).
>
> How to address it though? Throwing in a couple of ideas:
>
>    - Make the functionality opt-in or opt-out via a flag or UI
>    configuration. The flag might be hard to discover, but the UI setting could
>    be hard to reach if one cannot get to the home page to start with...
>    - Automatically disable the dropdowns after a certain threshold of
>    workspaces/layers is reached, with the threshold being configurable? Say
>    1000 for example? However, it might still cause issues for data sources that
>    are slow to connect to (I'm guessing part of the slowness is due to some
>    data type verification that requires an actual connection to the data source,
>    based on the fact that Oracle seems a lot slower to just generate the page).
>
> Any other ideas?
>
> Cheers
> Andrea
>
>
> Ing. Andrea Aime
> @geowolf
> Technical Lead
>
> GeoSolutions Group
> phone: +39 0584 962313
>
> fax:     +39 0584 1660272
>
> mob:   +39  339 8844549
>
> https://www.geosolutionsgroup.com/
>
> http://twitter.com/geosolutions_it
>
> _______________________________________________
> Geoserver-devel mailing list
> Geoserver-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geoserver-devel
>


-- 
Gabriel Roldán