Let me see if I can keep this simple with enough helpful information.

I'm finally getting around to migrating some Lucene logic from Proj4j to
SIS for EPSG reprojection, using the sis-embedded-data 1.4 release (our
customers run in disconnected environments, so I need to ship our product
with the EPSG database).

I've isolated a thread leak through a very simple unit test that contains
the following line:

assertEquals("EPSG:WGS 84 / Pseudo-Mercator",
    ((DefaultProjectedCRS) processor.getCrsHandler().getToCRS()).getName().toString());


Here are the build.gradle dependencies:

implementation "org.apache.sis.core:sis-referencing:${versions.sis}"
implementation "org.apache.sis.core:sis-utility:${versions.sis}"
implementation "org.opengis:geoapi:${versions.geoapi}"
runtimeOnly "org.apache.sis.non-free:sis-embedded-data:${versions.sis}"

implementation 'org.apache.derby:derby:10.15.2.0'
implementation 'org.apache.derby:derbytools:10.15.2.0'
implementation 'org.apache.derby:derbyshared:10.15.2.0'

implementation 'jakarta.xml.bind:jakarta.xml.bind-api:4.0.0'


Here is a code snippet for the reprojection calls:

public MathTransform getTransform(CoordinateReferenceSystem fromCRS,
        CoordinateReferenceSystem toCRS, Object... extraArgs) throws Exception {
    GeographicBoundingBox bbox = (extraArgs != null && extraArgs.length > 0
            && extraArgs[0] instanceof GeographicBoundingBox)
        ? (GeographicBoundingBox) extraArgs[0]
        : null;
    return CRS.findOperation(fromCRS, toCRS, bbox).getMathTransform();
}


public CoordinateReferenceSystem getCRS(final String crsString) throws Exception {
    return CRS.forCode(crsString);
}

public void reproject(final double[] from, double[] to, final double tolerance)
        throws Exception {
    reusableFrom.setLocation(from[0], from[1]);
    transform.transform(reusableFrom, reusableTo);
    to[0] = reusableTo.getX();
    to[1] = reusableTo.getY();
}



The test case passes successfully.

Then the test harness goes to shut down the Derby memory database and the
SIS threads, and I'm informed of the following leaked threads:

4 threads leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   1) Thread[id=57, name=derby.rawStoreDaemon, state=TIMED_WAITING, group=derby.daemons]
        at java.base/java.lang.Object.wait0(Native Method)
        at java.base/java.lang.Object.wait(Object.java:378)
        at org.apache.derby.impl.services.daemon.BasicDaemon.rest(BasicDaemon.java:579)
        at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:393)
        at java.base/java.lang.Thread.run(Thread.java:1575)
   2) Thread[id=56, name=Timer-0, state=WAITING, group=TGRP-ReprojectionProcessorFactoryTests]
        at java.base/java.lang.Object.wait0(Native Method)
        at java.base/java.lang.Object.wait(Object.java:378)
        at java.base/java.lang.Object.wait(Object.java:352)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:543)
        at java.base/java.util.TimerThread.run(Timer.java:522)
   3) Thread[id=54, name=ReferenceQueueConsumer, state=WAITING, group=Daemons]
        at java.base/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
        at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:4021)
        at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3967)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1712)
        at java.base/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:75)
        at java.base/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:166)
        at java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:246)
        at org.apache.sis.system.ReferenceQueueConsumer.run(ReferenceQueueConsumer.java:111)
   4) Thread[id=63, name=DelayedExecutor, state=TIMED_WAITING, group=Daemons]
        at java.base/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1763)
        at java.base/java.util.concurrent.DelayQueue.take(DelayQueue.java:255)
        at java.base/java.util.concurrent.DelayQueue.take(DelayQueue.java:100)
        at org.apache.sis.system.DelayedExecutor.run(DelayedExecutor.java:122)
com.carrotsearch.randomizedtesting.ThreadLeakError: 4 threads leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   [same four threads and stack traces as listed above]
        at __randomizedtesting.SeedInfo.seed([45CCC261A46DFA96]:0)



So tracing the issue further leads to LocalDataSource.java#L288
<https://github.com/apache/sis/blob/1.4/endorsed/src/org.apache.sis.metadata/main/org/apache/sis/metadata/sql/util/LocalDataSource.java#L288>,
which is reached through *Shutdown.stop()* (seems logical / graceful):

case DERBY: {
    source.getClass().getMethod("setShutdownDatabase", String.class)
          .invoke(source, "shutdown");
    source.getConnection().close();             // Does the actual shutdown.
    break;
}


Here I get the following exception thrown:

*java.sql.SQLNonTransientConnectionException: Database
'classpath:SIS_DATA/Databases/spatial-metadata' shutdown.*

So I have to add some shutdown gymnastics to the test to get around these
leaked Derby threads:

@After
@Override
public void tearDown() throws Exception {
    try {
        connection = DriverManager.getConnection(
                "jdbc:derby:classpath:SIS_DATA/Databases/spatial-metadata");
        if (connection != null && !connection.isClosed()) {
            connection.close();
        }
    } catch (SQLException e) {
        // Ignore the expected SQLState for a Derby database shutdown.
        if (!"08006".equals(e.getSQLState())) {
            e.printStackTrace();
        }
    }
    super.tearDown();
}

@AfterClass
public static void shutdownDerby() throws Exception {
    try {
        DriverManager.getConnection("jdbc:derby:;shutdown=true");
    } catch (SQLException e) {
        // Ignore the expected SQLState for a Derby engine shutdown.
        if (!"XJ015".equals(e.getSQLState())) {
            throw new RuntimeException("Unexpected error during Derby shutdown", e);
        }
    }
    Shutdown.stop(ReprojectionProcessorFactoryTests.class);
}
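
For reference, Derby documents that a successful shutdown is always
reported as an SQLException: SQLState 08006 for a single database and
XJ015 for the whole engine. The two checks above could be folded into one
helper along these lines. This is only a sketch: the class and method names
are hypothetical, and it relies on java.sql alone, not on any Derby or SIS
API.

```java
import java.sql.SQLException;
import java.util.Set;

// Hypothetical helper (not part of Derby or SIS): recognizes the SQLStates
// that Derby uses to signal a *successful* shutdown.
final class DerbyShutdownStates {
    // 08006 = single database shut down, XJ015 = whole Derby engine shut down.
    private static final Set<String> EXPECTED = Set.of("08006", "XJ015");

    private DerbyShutdownStates() {
    }

    // Returns true if the exception, or any exception chained to it through
    // getNextException(), carries one of the expected shutdown SQLStates.
    static boolean isExpectedShutdown(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            String state = cur.getSQLState();
            // Guard against null: Set.of(...) rejects null lookups.
            if (state != null && EXPECTED.contains(state)) {
                return true;
            }
        }
        return false;
    }
}
```

Each catch block would then reduce to
"if (!DerbyShutdownStates.isExpectedShutdown(e)) { ... }".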


Which gets me further, but I still have a lingering timer thread that I can
only trace to the *EPSGFactory* class.

1 thread leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   1) Thread[id=64, name=Timer-1, state=WAITING,
group=TGRP-ReprojectionProcessorFactoryTests]
        at java.base/java.lang.Object.wait0(Native Method)
        at java.base/java.lang.Object.wait(Object.java:378)
        at java.base/java.lang.Object.wait(Object.java:352)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:543)
        at java.base/java.util.TimerThread.run(Timer.java:522)
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at io.test.ReprojectionProcessorFactoryTests:
   [same thread and stack trace as listed above]
        at __randomizedtesting.SeedInfo.seed([9200700DD47A228]:0)
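
Some background on the lingering Timer thread: with a plain
java.util.Timer, the backing thread is started in the constructor and
normally keeps running until cancel() is called on that Timer. The sketch
below shows only this general JDK behaviour, plus a small polling helper
that a teardown could use to wait for a named thread to exit. It assumes
nothing about how EPSGFactory creates or owns its timer, and the class and
method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic helper using only the JDK.
final class ThreadLeakProbe {
    private ThreadLeakProbe() {
    }

    // Returns true if a live thread with exactly this name currently exists.
    static boolean isAlive(String threadName) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().equals(threadName)) {
                return true;
            }
        }
        return false;
    }

    // Polls until no live thread has the given name or the timeout elapses.
    // Returns true if the thread is gone, false on timeout or interruption.
    static boolean awaitTermination(String threadName, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (isAlive(threadName)) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

For example, after creating new java.util.Timer("demo-timer", true),
isAlive("demo-timer") reports true, and once timer.cancel() has been
called, awaitTermination("demo-timer", 5000) should report that the thread
has exited.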



*Here are my questions:*

1. Do I really need these Derby shutdown gymnastics? If so, why doesn't the
SIS Shutdown class handle all this for me?
2. How can I gracefully shut down the EPSGFactory timer thread? Are more
code acrobatics needed to add a special shutdown hook to handle this
zombie thread?
3. I know there are alternatives to Derby (PostgreSQL, Oracle, etc.), but all
of those are even more burdensome (sidecar another DB for this?). Is there
a simpler approach to using the EPSG database that I'm missing (I've
scoured the docs and can't find anything)? The "easy button" doesn't seem to
exist here, so I'm considering just investing the time in creating a jar
that contains a simple Lucene index with the entire EPSG database so I can
query it in a disconnected deployment without having to carry this Derby
in-memory SQL DB ball and chain around.

*Here are some suggestions:*

1. Our codebase aggressively uses Java Security Manager. Yes, I know that's
deprecated and soon going away. But we haven't migrated to JSM sandboxes
because our architecture is shifting to a different design anyway and...
umm large codebase + minimal time == choices made. There are quite a few
security permissions needed just to get SIS working for a "simple" use case
of reprojecting a coordinate to WGS84. This could cause an issue in the
environment we're deploying to, which may require us to back out SIS and go
back to Proj4j altogether. That would be unfortunate and a large tick
against the SIS project. It's possible this may have already impacted
adoption.
2. Same goes for the *eight dependencies* needed for a "simple"
reprojection.

Thanks in advance for all the help!

Nicholas Knize, Ph.D., GISP
Chief Technology Officer  |  Lucenia <https://lucenia.io>
Apache Lucene PMC Member and Committer
nkn...@apache.org
