This is great. Thank you for the detailed information. Here are some follow ups:
> All other 6 dependencies (geoapi, sis-utility, etc.) are transitive dependencies brought back automatically. This is true, except we have special gradle plugins that disable transitive dependency resolution (`transitive = false`) because of the environments we operate in. We can't use gradle's "behind the scenes" to silently pull in dependencies w/o explicit consent (think CVE risk mitigation). My comment was more around the fact that, transitive or not, some sensitive environments make it more difficult to bring in a "new" project when it has so many dependencies that require special security scanning (a concept we put into practice when working on Lucene core). > Can you try to invoke the following method: org.apache.sis.system.Shutdown.stop(YourClassOnlyForLoggingPurpose.class); > Do you mean that you already tried above-cited Shutdown.stop method and it didn't shutdown the Derby database? If yes, this is a bug. I tried that method and the derby database does get shutdown but the EPSGFactory thread does not. Here is the code: @After @Override public void tearDown() throws Exception { Shutdown.stop(ReprojectionProcessorFactoryTests.class); super.tearDown(); } Here is the exception: 1 thread leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests: 1) Thread[id=64, name=Timer-1, state=WAITING, group=TGRP-ReprojectionProcessorFactoryTests] at java.base/java.lang.Object.wait0(Native Method) at java.base/java.lang.Object.wait(Object.java:378) at java.base/java.lang.Object.wait(Object.java:352) at java.base/java.util.TimerThread.mainLoop(Timer.java:543) at java.base/java.util.TimerThread.run(Timer.java:522) com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests: 1) Thread[id=64, name=Timer-1, state=WAITING, group=TGRP-ReprojectionProcessorFactoryTests] at java.base/java.lang.Object.wait0(Native Method) at java.base/java.lang.Object.wait(Object.java:378) at java.base/java.lang.Object.wait(Object.java:352) at java.base/java.util.TimerThread.mainLoop(Timer.java:543) at java.base/java.util.TimerThread.run(Timer.java:522) at __randomizedtesting.SeedInfo.seed([9200700DD47A228]:0) > The embedded Derby database is currently the easiest approach. We could provide alternatives if there are easier ones, but I don't know yet which one it could be. A SQL database is needed Thanks for that confirmation. This is where I'm looking at creating my own jar'd up Lucene index and using simple Lucene APIs to query the EPSG database instead. We did this with maxmind's GeoIP database and it resulted in a much smaller database because Lucene can highly compress the data (its designed for text search use cases like this after all). > Indeed, the embedded Derby database targets one specific database engine... The rational was that if an application already has a database, it would be unfortunate to add another database on top of it. This makes sense and is definitely reasonable so I'm not knocking the approach. We have a use case that's more cloud native, thus needs to be lightweight and not carry all the sidecar SQL database overhead. > But do you have a list of the permissions that are causing problems? Here are the required permissions: grant { permission java.lang.RuntimePermission "createClassLoader"; permission java.lang.RuntimePermission "getClassLoader"; permission java.lang.RuntimePermission "getStackWalkerWithClassReference"; permission java.lang.RuntimePermission "shutdownHooks"; permission org.apache.derby.shared.common.security.SystemPermission "engine", "usederbyinternals"; permission java.io.FilePermission "derby.log", "read,write"; }; I wouldn't necessarily classify all of them as "problems" per se. This one, though, definitely causes concerns: ... permission java.lang.RuntimePermission "shutdownHooks"; ... The concern is that we have a "plugin" framework that enables users to write extension points. I have to look at limiting this permission to just the codebase needing it so users don't write nefarious shutdown hooks in their plugin that could kill the process. > SIS could be used when centimetric precision is desired, or for n-dimensional operations. This is why I'm spending time migrating from Proj4j to SIS as our default reprojection provider. In fact, the code I'm showing here is just small snippet in a larger SPI framework that enables the user to pick whatever Projection library they want ("proj4j", "sis", "my-projection-implementation"). If I'm not mistaken, SIS has the ability to support the OGC dynamic CRS spec as well? Thanks again for all the help and information and let me know if you have any ideas how to resolve the remaining leaked thread. Nicholas Knize, Ph.D., GISP Chief Technology Officer | Lucenia <https://lucenia.io> Apache Lucene PMC Member and Committer nkn...@apache.org On Wed, Dec 4, 2024 at 11:23 AM Martin Desruisseaux < martin.desruisse...@geomatys.com> wrote: > Hello Nicholas > > Le 2024-12-04 à 16 h 31, Nicholas Knize a écrit : > > > (…snip…) Then the test harness goes to shutdown the derby memory > > database and SIS threads and I'm informed of the following leaked > threads: > > > In the list of threads that follow, I can identify the following: > > * From Apache SIS: > o id=54, name=ReferenceQueueConsumer > o id=63, name=DelayedExecutor > * Apparently from Derby: > o id=57, name=derby.rawStoreDaemon > * Apparently from Java itself, maybe through the use of java.util.Timer: > o id=56, name=Timer-0 > > > > (…snip about Derby shutdown…) Here I get the following exception > > thrown: java.sql.SQLNonTransientConnectionException: Database > > classpath:SIS_DATA/Databases/spatial-metadata' shutdown. > > > Yes this is normal. Derby shutdown always throws a SQLException as per > Derby specification. It sound surprising, but it is specified that way > in their method contract. > > > > Which gets me further, but I still have a lingering timer thread that > > I can only trace to the *EPSGFactory* class. > > > Can you try to invoke the following method: > > > org.apache.sis.system.Shutdown.stop(YourClassOnlyForLoggingPurpose.class); > > The above is not in public API, and its intend was to be invoked from > Servlet container or OSGi. There is classes below starting to do that in > an incubator module. They may be used as a source of inspiration: > > > https://github.com/apache/sis/tree/main/incubator/src/org.apache.sis.webapp/main/org/apache/sis/services > > > > Do I really need this derby shutdown gymnastics? If so, why doesn't > > SIS Shutdown class handle all this for me? > > > Do you mean that you already tried above-cited Shutdown.stop method and > it didn't shutdown the Derby database? If yes, this is a bug. > > > > 2. How can I gracefully shutdown the EPSGFactory timer thread? Are > > there more code acrobats needed to add a special Shutdown hook to > > handle this zombie thread? > > > Same as above. If the above-cited Shutdown.stop method was tried and > didn't worked, this is a bug that I will investigate. > > > > (…snip…) Is there a more simple approach to using the EPSG database > > that I'm missing > > > The embedded Derby database is currently the easiest approach. We could > provide alternatives if there are easier ones, but I don't know yet > which one it could be. A SQL database is needed (a list of CRS > definitions in a properties file is not sufficient), because the EPSG > registry is not only about CRS definitions. EPSG is also about > coordinate operations, and SIS refers to that database extensively every > times that a transform between a pair of CRS is requested. > > > > so I'm considering just investing the time in creating a jar that > > contains a simple Lucene index with the entire EPSG database so I can > > query it in a disconnected deployment without having to carry this > > Derby in memory SQL DB ball and chain around. > > > Indeed, the embedded Derby database targets one specific database > engine, and then the `sis-epsg` module provides the script for creating > the database in other environments. The rational was that if an > application already has a database, it would be unfortunate to add > another database on top of it. The `sis-epsg` module is aimed to allow > developers to use their own database engine. It could have been Lucene, > but can Lucene execute SQL scripts? > > > > *Here are some suggestions:* > > > > (…snip…) There are quite a few security permissions needed just to get > > SIS working for a "simple" use case of reprojecting a coordinate to > WGS84. > > > In a previous version, we were providing a security policy file, and SIS > also had some `catch (SecurityException e)` statements for execution in > security-constrained environment. Those statements have been removed in > SIS 1.4 (I do not remember if they were present in SIS 1.3). I way try > to check tomorrow. But do you have a list of the permissions that are > causing problems? > > > > Same goes for the *eight dependencies* needed for a "simple" > reprojection. > > > We already plan to avoid the Jakarta dependency (or make it optional), > but it may take a year. > > For the OpenGIS GeoAPI dependency, actually it could be used at Lucene's > advantage. That dependency is aimed to provide an implementation-neutral > API for coordinate operations, like JDBC provides an > implementation-neutral API for databases. If the code can use only > org.opengis interfaces as much as possible, it should be possible to not > depend explicitly on SIS, but give the choice to users. Proj4J do not > implement those interfaces yet, but it would be easy to do. Proj4J has > limited capabilities, but it may be sufficient for 2-dimensional > operations when errors up to 3 kilometers does not matter. SIS could be > used when centimetric precision is desired, or for n-dimensional > operations. > > Another part of the dependency is the metadata package, which is indeed > as large as the referencing part. But it would be worth to put some > parts of it to user's attention. In particular, the following code: > > return CRS.findOperation(fromCRS, toCRS, bbox).getMathTransform(); > > Should not invoke getMathTransform() so fast and paid some attention to > the CoordinateOperation object. In particular, the domain of validity > and the accuracy should, if possible, be reported to the user in some way. > > Martin > >