Hi Daniel,

Thank you very much for this complete answer. It is extremely helpful and very much appreciated!
I've updated go-geos to only lock the first context in
https://github.com/twpayne/go-geos/commit/05d44c727f8d75b5bb420a557b956a52de8fd538 .

Best regards,
Tom

On Thu, 9 Jan 2025 at 14:41, Daniel Baston <dbas...@gmail.com> wrote:

> Hi Tom,
>
> With the caveat that I did not develop the "_r" API, here's my
> understanding. The implementation of GEOS contexts is quite short and can
> be found near the top of geos_ts_c.cpp. From looking at the implementation
> we can see that a context has pretty limited functionality:
>
> - registering error and notice handlers
> - passing error messages to these handlers using a context-local buffer
> - storing properties of a global WKB writer
> - storing a single point that is mutated by a small set of functions such
>   as GEOSPreparedContainsXY that want to avoid instantiating a point on each
>   call
>
> The "_r" functions are a C API construct only and do not indicate anything
> about the thread-safety of the underlying algorithms. The parts of GEOS
> that actually _do_ things are implemented in C++ and have no notion of a
> "context."
>
> With regard to the situation you describe in your email:
>
>     GEOSUnion_r(g.context.handle, g.geom, other.geom)
>
> GEOSUnion_r only accesses a context for the purpose of error reporting,
> and the only context it will access is the first argument. It does not
> matter whether "g" or "other" were created using this context, or whether
> "g" and "other" have the same context. If you were to call GEOSUnion_r
> simultaneously from multiple threads using the same first argument, AFAIK
> the worst thing that could happen is that simultaneously-emitted error
> messages would be garbled. But you can prevent that by locking
> the context provided in the first argument.
>
> If the context of "g" and "other" is irrelevant, can you use them safely
> from multiple threads? Not necessarily. The programming style used in GEOS
> lends itself to thread-safety (objects are generally immutable, etc.)
> although the use of lazy initialization sometimes gets in the way and the
> thread-safety of various operations is not well-documented or tested. The
> pattern you describe of locking the contexts of both "g" and "other" will
> protect against thread-safety issues but is unnecessarily broad; the same
> benefit could be achieved with locks scoped to the geometries themselves.
> Even this more limited locking is unnecessary for many operations, but
> unfortunately the GEOS docs don't give any guidance about which ones.
> Adding more tests and documentation around GEOS thread-safety would be a
> big improvement for the library.
>
> Dan
>
> PS
>
> There has been some discussion about giving contexts additional
> responsibility, such as managing interrupts:
> https://github.com/libgeos/geos/pull/803
>
> You might also find some useful discussion within the georust project, e.g.
> https://github.com/georust/geos/pull/164
>
> On Wed, Jan 8, 2025 at 9:36 PM Tom Payne <twpa...@gmail.com> wrote:
>
>> Gentle ping on this. I would really like to understand the requirements
>> for combining geometries from different GEOS contexts.
>>
>> If my question is unclear, missing information, stupid, answered
>> elsewhere, or there is any other problem with it then please tell me. I
>> would really like to understand this.
>>
>> Regards,
>> Tom
>>
>> On Fri, 22 Nov 2024 at 10:50, Tom Payne <twpa...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> tl;dr when calling a function which takes multiple geometries, like
>>> GEOSUnion_r
>>> <https://libgeos.org/doxygen/geos__c_8h.html#afd3d82a6d039ab5637e0a8a066694b7d>,
>>> where the two geometries are associated with different contexts, do I
>>> have to ensure that both geometries' contexts are used exclusively?
>>>
>>> Background:
>>>
>>> I maintain the Go bindings for GEOS <https://github.com/twpayne/go-geos>,
>>> which exclusively use the thread-safe *_r functions. Every created geometry
>>> is associated with a context.
>>> Every context has a mutex to ensure that it
>>> is only accessed from a single thread at a time.
>>>
>>> For functions that take multiple geometries I check if the geometries
>>> are from different contexts, and if so, lock both mutexes. Here
>>> <https://github.com/twpayne/go-geos/blob/c9ed31526fa2ee3599ffe0fdf4556a6cf9c0b204/geommethods.go#L865-L875>
>>> is an example:
>>>
>>> // Union returns the union of g and other.
>>> func (g *Geom) Union(other *Geom) *Geom {
>>>     g.mustNotBeDestroyed()
>>>     g.context.Lock()
>>>     defer g.context.Unlock()
>>>     if other.context != g.context {
>>>         other.context.Lock()
>>>         defer other.context.Unlock()
>>>     }
>>>     return g.context.newGeom(C.GEOSUnion_r(g.context.handle, g.geom, other.geom), nil)
>>> }
>>>
>>> However, there is a potential deadlock if there are two geometries A and
>>> B owned by different contexts and A.Union(B) and B.Union(A) are called
>>> simultaneously from different threads. In practice this pattern is unlikely
>>> to occur, but I would like to guard against it.
>>>
>>> I checked the documentation on GEOS's C API
>>> <https://libgeos.org/usage/c_api/>, the GEOS developer notes
>>> <https://github.com/libgeos/geos/blob/main/DEVELOPER-NOTES.md>, did a
>>> superficial search of the GitHub issues
>>> <https://github.com/search?q=repo%3Alibgeos%2Fgeos+context&type=issues>,
>>> and a superficial search of the geos-devel
>>> <https://www.google.com/search?q=site%3Alists.osgeo.org+%22%5Bgeos-devel%5D%22+context>
>>> archives, and could not find an answer to this question.
>>>
>>> Many thanks for any insight,
>>> Tom
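
For illustration, here is a minimal Go sketch of the narrower locking discussed in this thread, written so that it runs without cgo. It takes per-geometry locks in a fixed order (ascending ID), which rules out the A.Union(B) / B.Union(A) deadlock, and then locks only the context passed as the first argument. Context, Geom, NewGeom, lockPair and fakeUnion are hypothetical stand-ins, not go-geos or GEOS names. fakeUnion replaces the C.GEOSUnion_r call, and the per-geometry mutex is only an assumption about what "locks scoped to the geometries themselves" might look like.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// nextID hands out a unique, increasing ID per Geom so that geometry locks
// can always be acquired in the same global order.
var nextID uint64

// Context is a hypothetical stand-in for a binding-level GEOS context: a
// mutex guarding the handle passed as the first argument to *_r functions.
type Context struct {
	mu sync.Mutex
}

// Geom is a hypothetical geometry wrapper. wkt replaces the C geometry
// pointer so that this sketch needs no cgo.
type Geom struct {
	id  uint64
	mu  sync.Mutex
	ctx *Context
	wkt string
}

// NewGeom creates a geometry owned by ctx with a fresh ID.
func NewGeom(ctx *Context, wkt string) *Geom {
	return &Geom{id: atomic.AddUint64(&nextID, 1), ctx: ctx, wkt: wkt}
}

// fakeUnion stands in for C.GEOSUnion_r(ctx.handle, a.geom, b.geom).
func fakeUnion(a, b string) string {
	return "UNION(" + a + ", " + b + ")"
}

// lockPair locks a and b in ascending ID order (once if they are the same
// geometry) and returns a function that releases the locks.
func lockPair(a, b *Geom) (unlock func()) {
	if a == b {
		a.mu.Lock()
		return a.mu.Unlock
	}
	first, second := a, b
	if second.id < first.id {
		first, second = second, first
	}
	first.mu.Lock()
	second.mu.Lock()
	return func() {
		second.mu.Unlock()
		first.mu.Unlock()
	}
}

// Union locks both geometries in a fixed order, then only the context that
// is actually passed to the (stand-in) C call. other's context is never
// locked, so no two context mutexes are ever held at once.
func (g *Geom) Union(other *Geom) *Geom {
	unlock := lockPair(g, other)
	defer unlock()
	g.ctx.mu.Lock()
	defer g.ctx.mu.Unlock()
	return NewGeom(g.ctx, fakeUnion(g.wkt, other.wkt))
}
```

If the per-geometry locks turn out to be unnecessary for a given operation, the lockPair call can be dropped and only the first context's mutex kept, which matches the go-geos change mentioned at the top of this thread.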