Jody / Andrea,
Thanks for the detailed responses. I was not aware of the work to decouple
the GeoServer configuration from the filesystem - that would certainly
facilitate making a change away from the file system as a first class
configuration storage mechanism.
I had briefly investigated both the jdbc-config and jms approaches. The
primary issue we had with anything but the file-system backed config was
that every alternative seemed to lack complete configuration support. We
make heavy use of image mosaics and a fair number of FTL template files,
and both of these require configuration that falls outside what the
community plugins manage. This may have changed, but at the time it was a
deal breaker (my initial investigation was back around 2.6 or 2.7). Jody, I
also noticed that the Resource API plan doc on the wiki says the
TemplateLoader isn't handled yet in the transition. Is that accurate or out
of date? I assume this is where changes would be made to support FTL files.
https://github.com/geoserver/geoserver/wiki/Resource-API-Transition-Plan#file-events
Ultimately, I'd like configuring a different backend to be as simple as
pointing a service at Zookeeper: set an environment variable on GeoServer
startup, such as "ZK=zk://zk-1.zk:2181,zk-2.zk:2181,zk-3.zk:2181/geoserver",
and have it behave just like the file-system backed configuration. This
would require separating any actual data within the "data" directory from
the config. I've always personally avoided mixing the two: short of the demo
files, I've always referenced image data from a location outside the "data"
tree, as that seemed like a best practice and made it possible to put the
directory under SCM if desired.
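To make the idea concrete, here is a minimal sketch of how such a variable
could be interpreted. Nothing like this exists in GeoServer today as far as
I know: the "ZK" variable name comes from my example above, while
parse_zk_url and ZkLocation are hypothetical names, and falling back to
port 2181 (Zookeeper's default client port) is my assumption.

```python
import os
from typing import List, NamedTuple, Optional, Tuple


class ZkLocation(NamedTuple):
    """Hypothetical holder for a parsed zk:// configuration location."""
    hosts: List[Tuple[str, int]]  # (hostname, port) for each ensemble member
    path: str                     # znode path that would hold the config


def parse_zk_url(url: str) -> ZkLocation:
    """Split 'zk://host:port,host:port/path' into hosts and a znode path."""
    prefix = "zk://"
    if not url.startswith(prefix):
        raise ValueError("expected a zk:// URL, got %r" % url)
    ensemble, slash, path = url[len(prefix):].partition("/")
    if not ensemble:
        raise ValueError("no Zookeeper hosts in %r" % url)
    hosts = []
    for entry in ensemble.split(","):
        host, colon, port = entry.rpartition(":")
        if not colon:
            # No explicit port given: assume Zookeeper's default client port.
            host, port = entry, "2181"
        hosts.append((host, int(port)))
    return ZkLocation(hosts, "/" + path if slash else "/")


def location_from_env(name: str = "ZK") -> Optional[ZkLocation]:
    """Return the parsed location, or None so a caller can fall back to
    the regular file-system data directory when the variable is unset."""
    value = os.environ.get(name)
    return parse_zk_url(value) if value else None
```

With the example value above this yields three (host, port) pairs and the
path "/geoserver"; when the variable is unset the caller would simply keep
using the data directory.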
If complete configuration coordination (FTL and Image Mosaic config) is, to
your knowledge, available with the JMS plugin, I may look into that again.
Doing a complete configuration reload works reasonably well for us at
present, as we have a large amount of data in a fairly small number of
layers (~100) / stores (~10), but I can imagine that in a scenario with
thousands or more this would be more problematic.
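For reference, the "reload everything" step can be sketched roughly as
below. This is only an illustration, not our actual tooling: it relies on
the REST "reload" endpoint (POST to /rest/reload), and the instance URLs,
credentials and function names are placeholders I made up.

```python
import base64
import urllib.request
from typing import Iterable, List


def build_reload_requests(base_urls: Iterable[str], user: str,
                          password: str) -> List[urllib.request.Request]:
    """Prepare one POST <base>/rest/reload request per GeoServer instance."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    requests = []
    for base in base_urls:
        req = urllib.request.Request(base.rstrip("/") + "/rest/reload",
                                     data=b"", method="POST")
        req.add_header("Authorization", "Basic " + token)
        requests.append(req)
    return requests


def reload_all(base_urls: Iterable[str], user: str, password: str) -> None:
    """Ask every instance to re-read its configuration from the data dir."""
    for req in build_reload_requests(base_urls, user, password):
        # urlopen raises HTTPError / URLError if an instance refuses or is down.
        with urllib.request.urlopen(req, timeout=60) as response:
            print(req.full_url, response.status)


if __name__ == "__main__":
    # Hypothetical instance URLs and credentials, for illustration only.
    reload_all(["http://gs-1.example:8080/geoserver",
                "http://gs-2.example:8080/geoserver"],
               "admin", "geoserver")
```

Each call reloads the whole catalog on that instance, which is why this
scales with the number of layers rather than with the size of the change.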
Thanks,
Jonathan Meyer
Sr. Software Engineer
Applied Information Sciences
On Wed, Jul 12, 2017 at 2:36 AM Andrea Aime <[email protected]>
wrote:
> On Wed, Jul 12, 2017 at 12:19 AM, Jonathan Meyer <[email protected]> wrote:
>
>> As background, I'm aware of various efforts / documentation on how to
>> coordinate GeoServer configuration between multiple instances:
>> http://docs.geoserver.org/latest/en/user/community/jms-cluster/index.html
>>
>> https://boundlessgeo.com/2013/04/geoserver-in-a-clustered-configuration-part-1/
>>
>> https://2016.foss4g-na.org/sites/default/files/slides/High%20Performance%20Geoserver%20Clusters_0.pdf
>>
>>
>> In developing a GeoServer package for Apache Mesos via DC/OS, I went down
>> a similar path to Derek Kern as identified in his 2016 FOSS4G-NA talk
>> (linked above) - mounted network storage to share GeoServer data
>> configuration across multiple machines.
>>
>
> The overall approach of the presentation is a bit... old? It represents
> the "state of the art" as of 2010; as Jody remarked, other
> avenues have been considered since then. To be fair, that approach is
> still simple and viable if you have a master driven, low change
> rate configuration (though the separate front-end GWC is something I have
> not seen in a while), and with the speedups to
> file system catalog loading in 2.11 it's viable even if you have a
> large-ish number of layers.
> At the same time, shared data dir and reload is the only approach that you
> can take if you restrict yourself to supported modules
> (both jms and jdbc-config are community modules and thus unsupported).
>
>
>> While this solution is functional, it enforces a requirement on
>> consistent mounted data across a cluster, as well as requiring an external
>> coordination service to monitor configuration directory and force instances
>> to reload from disk. My preferred approach would be to either directly
>> coordinate between GeoServers or use a cluster native coordination system
>> (such as Zookeeper) for configuration. I have considered looking into using
>> the GeoServer backup/restore plugin that was recently developed to push
>> configuration to all other GeoServer instances within a cluster.
>>
>
> The backup/restore module has been developed for "full/slow"
> backup/restore operations, not on the fly change notification.
> Something based on Zookeeper would be interesting. I'd also like to play
> with/develop a distributed in memory configuration based on
> Hazelcast (or something similar) to see how it works. Nowadays the
> jdbcconfig module takes a significant performance hit
> due to the many queries it makes to the config db per request, slowing down
> each OGC request (Niels showed interest in
> improving that, but I haven't heard about it since).
>
> Ideally, I'd like to see something easier to set up than JMS clustering,
> with a performance comparable to in memory config storage
> and not requiring changes to the database when the configuration object
> properties change, or queries towards the catalog change
> (something that jdbcconfig nowadays requires, making it hard to upgrade
> [1]).
>
> That said, the configuration needs to be stored somewhere (to support full
> cluster restart at the very least). As Jody said,
> there are indirections in the code nowadays to allow storage on something
> other than a file system, and there is a community (unsupported)
> module allowing storage on a relational database, to be used alongside
> jdbcconfig.
>
>
>> Does anyone else have experience or opinions in this domain? I'm just
>> brainstorming and would love to discuss this in more detail.
>>
>
> Been playing with all options above, yep.
>
> Regards,
>
> Andrea Aime
>
>
> [1]: This is one annoying issue in jdbcconfig imho.
>
> Basically, jdbcconfig stores XML blobs and maps out interesting attributes
> in a separate table for indexing searches.
>
> So, if a new property pops up that needs to be searchable for whatever
> reason, one has to go and change the jdbcconfig mappings to map it out.
> Failing to do so will make jdbcconfig go and de-serialize the XML blobs
> from the db every time a search based on the offending property happens.
>
> Another issue happens if the code querying the catalog starts issuing
> queries against properties that are already in the stored XML, but have
> not been mapped out to be indexed. There is no tooling to add the mapping
> and extract them from the XML blobs; the only approach I've found is to
> re-import from a file system based data dir... which is not possible once
> you have been using jdbcconfig for a while and it got out of sync with
> the fs based data dir.
>
> Hopefully one has used a DBMS with XML/XPath extraction support, to set up
> mass extraction queries to re-align the db.
>
>
>
> Ing. Andrea Aime
> @geowolf
> Technical Lead
>
> GeoSolutions S.A.S.
> Via di Montramito 3/A
> 55054 Massarosa (LU)
> phone: +39 0584 962313
> fax: +39 0584 1660272
> mob: +39 339 8844549
>
> http://www.geo-solutions.it
> http://twitter.com/geosolutions_it
_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel