I am happy to merge this in for the beta; indeed I would like to see this
kind of work receive the slightly wider testing a beta and release
candidate can provide.

--
Jody Garnett

On 19 February 2017 at 03:08, Andrea Aime <[email protected]>
wrote:

> Hi,
> I managed to get a "light" parallelization during the catalog load by
> rolling a background
> thread pool that would take care of loading the xml files into memory,
> without having
> to parallelize the code that runs in the spring startup.
>
> Basically, the existing code does a number of "scan this directory for xml
> files and load them
> one by one", that's the part I parallelized by creating an async iterator
> doing the loading,
> and giving the main thread a byte[] for each file, to decode and put into
> the catalog.
>
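
A minimal sketch of the idea described above, not the code in the pull request: a background pool reads each file into a byte[], and an iterator hands those arrays to the calling thread in submission order. The class name AsyncFileIterator and its constructor are hypothetical, and for simplicity every read is submitted up front, so nothing bounds how many files sit in memory at once.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch: reads files on a pool, returns their bytes in submission order. */
class AsyncFileIterator implements Iterator<byte[]> {
    private final Queue<Future<byte[]>> pending = new ArrayDeque<>();

    AsyncFileIterator(List<Path> files, ExecutorService pool) {
        // Queue all reads immediately so disk IO overlaps with decoding in the caller.
        for (Path file : files) {
            Callable<byte[]> read = () -> Files.readAllBytes(file);
            pending.add(pool.submit(read));
        }
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty();
    }

    @Override
    public byte[] next() {
        Future<byte[]> future = pending.poll();
        if (future == null) {
            throw new NoSuchElementException();
        }
        try {
            // Blocks only when this particular file has not finished loading yet.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for file contents", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Failed to read file", e.getCause());
        }
    }
}
```

The calling thread would loop over the iterator and decode each byte[] itself, so the parsing and catalog code stays single threaded while the reads run ahead of it.
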
> The benefit of doing so is visible because IO is the bottleneck.
> So, let's see some numbers, but first, a reminder of the data dirs
> involved:
>
>    - "Many states": 1 workspace, 1 store, 10k layers, 10k cached layers
>    - "Large": 1001 workspaces, 11000 stores (a mix of shapefiles,
>    postgis, directory of shapefile, single tiff, arcgrid, mosaics), 42000
>    layers and 42000 associated tile layers
>
> Here is a comparison of the last loading times I have vs the ones with
> parallel IO loading.
>
>    - Many states, cold startup. Before: 68s. After: 30s
>    - Many states, warm startup. Before: 29s. After: 21s
>    - Large, cold startup. Before: 230s. After: 107s
>    - Large, warm startup. Before: 45s. After: 45s (weird?)
>
> As expected the benefit shows up mostly on cold startups, where actual IO
> against the disk happens (btw, the
> IO is likely happening against the SSD cache of my hybrid HD, a pure
> spinning disk drive will likely have worse timings).
>
> Pull request available here, please let me know if anybody wants to review
> or if you have reservations
> against a merge (plan B, land it after the freeze and wait until September to
> have it in a release):
> https://github.com/geoserver/geoserver/pull/2116
>
> Cheers
> Andrea
>
> PS: the IO rate I see is still quite a bit below the potential of my local
> drive, so a true
> parallelization of the catalog loading (including the CPU bound part) is
> likely to reap extra
> benefits, but as said previously, it's harder to implement. This is a more
> modest approach that
> still manages to provide a speedup.
>
> Ing. Andrea Aime
> @geowolf
> Technical Lead
>
> GeoSolutions S.A.S.
> Via di Montramito 3/A
> 55054  Massarosa (LU)
> phone: +39 0584 962313
> fax: +39 0584 1660272
> mob: +39 339 8844549
>
> http://www.geo-solutions.it
> http://twitter.com/geosolutions_it
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel
