This is great. Thank you for the detailed information. Here are some follow
ups:

> All other 6 dependencies (geoapi, sis-utility, etc.) are transitive
dependencies brought back automatically.

This is true, except we have special gradle plugins that disable transitive
dependency resolution  (`transitive = false`) because of the environments
we operate in. We can't use gradle's "behind the scenes" to silently pull
in dependencies w/o explicit consent (think CVE risk mitigation).
My comment was more around the fact that, transitive or not, some sensitive
environments make it more difficult to bring in a "new" project when it has
so many dependencies that require special security scanning (a concept we
put into practice when working on Lucene core).

> Can you try to invoke the following method:
org.apache.sis.system.Shutdown.stop(YourClassOnlyForLoggingPurpose.class);
> Do you mean that you already tried above-cited Shutdown.stop method and
it didn't shutdown the Derby database? If yes, this is a bug.

I tried that method and the derby database does get shutdown but the
EPSGFactory thread does not. Here is the code:


@After
@Override
public void tearDown() throws Exception {

    Shutdown.stop(ReprojectionProcessorFactoryTests.class);

    super.tearDown();
}

Here is the exception:


1 thread leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   1) Thread[id=64, name=Timer-1, state=WAITING,
group=TGRP-ReprojectionProcessorFactoryTests]
        at java.base/java.lang.Object.wait0(Native Method)
        at java.base/java.lang.Object.wait(Object.java:378)
        at java.base/java.lang.Object.wait(Object.java:352)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:543)
        at java.base/java.util.TimerThread.run(Timer.java:522)
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked
from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   1) Thread[id=64, name=Timer-1, state=WAITING,
group=TGRP-ReprojectionProcessorFactoryTests]
        at java.base/java.lang.Object.wait0(Native Method)
        at java.base/java.lang.Object.wait(Object.java:378)
        at java.base/java.lang.Object.wait(Object.java:352)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:543)
        at java.base/java.util.TimerThread.run(Timer.java:522)
        at __randomizedtesting.SeedInfo.seed([9200700DD47A228]:0)


> The embedded Derby database is currently the easiest approach. We could
provide alternatives if there are easier ones, but I don't know yet which
one it could be. A SQL database is needed

Thanks for that confirmation. This is where I'm looking at creating my own
jar'd up Lucene index and using simple Lucene APIs to query the EPSG
database instead. We did this with maxmind's GeoIP database and it resulted
in a much smaller database because Lucene can highly compress the data (its
designed for text search use cases like this after all).

>  Indeed, the embedded Derby database targets one specific database
engine... The rational was that if an application already has a database,
it would be unfortunate to add another database on top of it.

This makes sense and is definitely reasonable so I'm not knocking the
approach. We have a use case that's more cloud native, thus needs to be
lightweight and not carry all the sidecar SQL database overhead.

> But do you have a list of the permissions that are causing problems?

Here are the required permissions:

grant {
  permission java.lang.RuntimePermission "createClassLoader";
  permission java.lang.RuntimePermission "getClassLoader";
  permission java.lang.RuntimePermission "getStackWalkerWithClassReference";
  permission java.lang.RuntimePermission "shutdownHooks";
  permission org.apache.derby.shared.common.security.SystemPermission
"engine", "usederbyinternals";
  permission java.io.FilePermission "derby.log", "read,write";
};


I wouldn't necessarily classify all of them as "problems" per se. This one,
though, definitely causes concerns:

...
  permission java.lang.RuntimePermission "shutdownHooks";
...


The concern is that we have a "plugin" framework that enables users to
write extension points. I have to look at limiting this permission to just
the codebase needing it so users don't write nefarious shutdown hooks in
their plugin that could kill the process.

> SIS could be used when centimetric precision is desired, or for
n-dimensional operations.

This is why I'm spending time migrating from Proj4j to SIS as our default
reprojection provider. In fact, the code I'm showing here is just small
snippet in a larger SPI framework that enables the user to pick whatever
Projection library they want ("proj4j", "sis",
"my-projection-implementation").

If I'm not mistaken, SIS has the ability to support the OGC dynamic CRS
spec as well?

Thanks again for all the help and information and let me know if you have
any ideas how to resolve the remaining leaked thread.


Nicholas Knize, Ph.D., GISP
Chief Technology Officer  |  Lucenia <https://lucenia.io>
Apache Lucene PMC Member and Committer
nkn...@apache.org


On Wed, Dec 4, 2024 at 11:23 AM Martin Desruisseaux <
martin.desruisse...@geomatys.com> wrote:

> Hello Nicholas
>
> Le 2024-12-04 à 16 h 31, Nicholas Knize a écrit :
>
> > (…snip…) Then the test harness goes to shutdown the derby memory
> > database and SIS threads and I'm informed of the following leaked
> threads:
> >
> In the list of threads that follow, I can identify the following:
>
>    * From Apache SIS:
>       o id=54, name=ReferenceQueueConsumer
>       o id=63, name=DelayedExecutor
>   * Apparently from Derby:
>       o id=57, name=derby.rawStoreDaemon
>   * Apparently from Java itself, maybe through the use of java.util.Timer:
>       o id=56, name=Timer-0
>
>
> > (…snip about Derby shutdown…) Here I get the following exception
> > thrown: java.sql.SQLNonTransientConnectionException: Database
> > classpath:SIS_DATA/Databases/spatial-metadata' shutdown.
> >
> Yes this is normal. Derby shutdown always throws a SQLException as per
> Derby specification. It sound surprising, but it is specified that way
> in their method contract.
>
>
> > Which gets me further, but I still have a lingering timer thread that
> > I can only trace to the *EPSGFactory* class.
> >
> Can you try to invoke the following method:
>
>
> org.apache.sis.system.Shutdown.stop(YourClassOnlyForLoggingPurpose.class);
>
> The above is not in public API, and its intend was to be invoked from
> Servlet container or OSGi. There is classes below starting to do that in
> an incubator module. They may be used as a source of inspiration:
>
>
> https://github.com/apache/sis/tree/main/incubator/src/org.apache.sis.webapp/main/org/apache/sis/services
>
>
> > Do I really need this derby shutdown gymnastics? If so, why doesn't
> > SIS Shutdown class handle all this for me?
> >
> Do you mean that you already tried above-cited Shutdown.stop method and
> it didn't shutdown the Derby database? If yes, this is a bug.
>
>
> > 2. How can I gracefully shutdown the EPSGFactory timer thread? Are
> > there more code acrobats needed to add a special Shutdown hook to
> > handle this zombie thread?
> >
> Same as above. If the above-cited Shutdown.stop method was tried and
> didn't worked, this is a bug that I will investigate.
>
>
> > (…snip…) Is there a more simple approach to using the EPSG database
> > that I'm missing
> >
> The embedded Derby database is currently the easiest approach. We could
> provide alternatives if there are easier ones, but I don't know yet
> which one it could be. A SQL database is needed (a list of CRS
> definitions in a properties file is not sufficient), because the EPSG
> registry is not only about CRS definitions. EPSG is also about
> coordinate operations, and SIS refers to that database extensively every
> times that a transform between a pair of CRS is requested.
>
>
> > so I'm considering just investing the time in creating a jar that
> > contains a simple Lucene index with the entire EPSG database so I can
> > query it in a disconnected deployment without having to carry this
> > Derby in memory SQL DB ball and chain around.
> >
> Indeed, the embedded Derby database targets one specific database
> engine, and then the `sis-epsg` module provides the script for creating
> the database in other environments. The rational was that if an
> application already has a database, it would be unfortunate to add
> another database on top of it. The `sis-epsg` module is aimed to allow
> developers to use their own database engine. It could have been Lucene,
> but can Lucene execute SQL scripts?
>
>
> > *Here are some suggestions:*
> >
> > (…snip…) There are quite a few security permissions needed just to get
> > SIS working for a "simple" use case of reprojecting a coordinate to
> WGS84.
> >
> In a previous version, we were providing a security policy file, and SIS
> also had some `catch (SecurityException e)` statements for execution in
> security-constrained environment. Those statements have been removed in
> SIS 1.4 (I do not remember if they were present in SIS 1.3). I way try
> to check tomorrow. But do you have a list of the permissions that are
> causing problems?
>
>
> > Same goes for the *eight dependencies* needed for a "simple"
> reprojection.
> >
> We already plan to avoid the Jakarta dependency (or make it optional),
> but it may take a year.
>
> For the OpenGIS GeoAPI dependency, actually it could be used at Lucene's
> advantage. That dependency is aimed to provide an implementation-neutral
> API for coordinate operations, like JDBC provides an
> implementation-neutral API for databases. If the code can use only
> org.opengis interfaces as much as possible, it should be possible to not
> depend explicitly on SIS, but give the choice to users. Proj4J do not
> implement those interfaces yet, but it would be easy to do. Proj4J has
> limited capabilities, but it may be sufficient for 2-dimensional
> operations when errors up to 3 kilometers does not matter. SIS could be
> used when centimetric precision is desired, or for n-dimensional
> operations.
>
> Another part of the dependency is the metadata package, which is indeed
> as large as the referencing part. But it would be worth to put some
> parts of it to user's attention. In particular, the following code:
>
>     return CRS.findOperation(fromCRS, toCRS, bbox).getMathTransform();
>
> Should not invoke getMathTransform() so fast and paid some attention to
> the CoordinateOperation object. In particular, the domain of validity
> and the accuracy should, if possible, be reported to the user in some way.
>
>      Martin
>
>

Reply via email to