Copilot commented on code in PR #243:
URL: https://github.com/apache/uima-uimaj/pull/243#discussion_r3298270124


##########
uimaj-core/src/test/java/org/apache/uima/jcas/impl/LoadingBuiltinAnnotationBeforeCasTest.java:
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.uima.jcas.impl;
+
+import static java.lang.System.getProperty;
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.assertj.core.api.Assertions.fail;
+
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.nio.charset.StandardCharsets;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.uima.jcas.tcas.Annotation;
+import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
+import org.apache.uima.util.CasCreationUtils;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Reproduces <a href="https://github.com/apache/uima-uimaj/issues/234";>issue 
#234</a>: when a
+ * built-in Annotation cover class is touched before any CAS exists, the type 
system records a wrong
+ * (zero) jcasType for it and Annotation's feature callsites are never 
updated. Subsequent
+ * {@code getCasType} / {@code setBegin} calls then fail.
+ *
+ * <p>
+ * The bug only manifests in a fresh JVM where nothing has yet initialized 
{@link Annotation}. Class
+ * initialization is once-per-JVM, so running this in the shared Surefire JVM 
(where other tests
+ * have already created CASes and thereby initialized Annotation normally) 
would not reproduce it.
+ * The test therefore forks a dedicated JVM running {@link #main(String[])} 
and asserts on its exit
+ * code.
+ */
+public class LoadingBuiltinAnnotationBeforeCasTest {
+
+  /**
+   * Reproducer body. Must run in a fresh JVM (see the {@code @Test} below). 
Throws on failure so
+   * the JVM exits non-zero.
+   */
+  public static void main(String[] args) throws Exception {
+    // Force Annotation's <clinit> to start now, before any TypeSystemImpl / 
CAS exists.
+    // Annotation's super (AnnotationBase) references TypeSystemImpl in its 
own <clinit>, which
+    // kicks off a recursive init storm that reads Annotation.typeIndexID 
while it's still 0 --
+    // caching a broken JCasClassInfo and skipping Annotation's _FC_begin / 
_FC_end callsite update.
+    Class.forName(Annotation.class.getName());
+
+    var tsd = new TypeSystemDescription_impl();
+    var jcas = CasCreationUtils.createCas(tsd, null, null).getJCas();
+
+    // (1) Type system lookup by JCas type id. With the bug, slot 
Annotation.type of
+    // jcasRegisteredTypes is empty (the JCCI got registered at slot 0 
instead).
+    jcas.getCasType(Annotation.type);
+
+    // (2) setDocumentText eventually calls Annotation.setBegin, which 
dispatches through the
+    // _FC_begin callsite. With the bug, that callsite still returns -1 and 
the offset lookup
+    // throws ArrayIndexOutOfBoundsException.
+    jcas.setDocumentText("hello");
+  }
+
+  @Test
+  void thatLoadingAnnotationBeforeCasDoesNotBreakTypeSystemManagement() throws 
Exception {
+    var windows = getProperty("os.name").toLowerCase().contains("win");
+    var java = getProperty("java.home") + File.separator + "bin" + 
File.separator
+            + (windows ? "java.exe" : "java");
+
+    var pb = new ProcessBuilder(java, "-cp", getProperty("java.class.path"), 
getClass().getName());
+    pb.redirectErrorStream(true);
+    var p = pb.start();
+
+    var buf = new ByteArrayOutputStream();
+    p.getInputStream().transferTo(buf);
+
+    // Bound the wait so a hypothetical class-init deadlock in the subprocess 
can't hang CI.
+    if (!p.waitFor(60, TimeUnit.SECONDS)) {

Review Comment:
   `p.getInputStream().transferTo(buf)` is executed before the timed `waitFor`. 
Since `transferTo` blocks until EOF, a deadlocked/hung subprocess will block 
here indefinitely and the 60s timeout will never be reached. Consider consuming 
the subprocess output asynchronously (e.g., a separate thread) or waiting with 
a timeout first and only then reading the completed output (or redirecting 
output to a file) so the test cannot hang CI.



##########
uimaj-core/src/test/java/org/apache/uima/jcas/impl/LoadingBuiltinAnnotationBeforeCasTest.java:
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.uima.jcas.impl;
+
+import static java.lang.System.getProperty;
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.assertj.core.api.Assertions.fail;
+
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.nio.charset.StandardCharsets;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.uima.jcas.tcas.Annotation;
+import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
+import org.apache.uima.util.CasCreationUtils;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Reproduces <a href="https://github.com/apache/uima-uimaj/issues/234";>issue 
#234</a>: when a
+ * built-in Annotation cover class is touched before any CAS exists, the type 
system records a wrong
+ * (zero) jcasType for it and Annotation's feature callsites are never 
updated. Subsequent
+ * {@code getCasType} / {@code setBegin} calls then fail.
+ *
+ * <p>
+ * The bug only manifests in a fresh JVM where nothing has yet initialized 
{@link Annotation}. Class
+ * initialization is once-per-JVM, so running this in the shared Surefire JVM 
(where other tests
+ * have already created CASes and thereby initialized Annotation normally) 
would not reproduce it.
+ * The test therefore forks a dedicated JVM running {@link #main(String[])} 
and asserts on its exit
+ * code.
+ */
+public class LoadingBuiltinAnnotationBeforeCasTest {
+
+  /**
+   * Reproducer body. Must run in a fresh JVM (see the {@code @Test} below). 
Throws on failure so
+   * the JVM exits non-zero.
+   */
+  public static void main(String[] args) throws Exception {
+    // Force Annotation's <clinit> to start now, before any TypeSystemImpl / 
CAS exists.
+    // Annotation's super (AnnotationBase) references TypeSystemImpl in its 
own <clinit>, which
+    // kicks off a recursive init storm that reads Annotation.typeIndexID 
while it's still 0 --
+    // caching a broken JCasClassInfo and skipping Annotation's _FC_begin / 
_FC_end callsite update.
+    Class.forName(Annotation.class.getName());
+
+    var tsd = new TypeSystemDescription_impl();
+    var jcas = CasCreationUtils.createCas(tsd, null, null).getJCas();
+
+    // (1) Type system lookup by JCas type id. With the bug, slot 
Annotation.type of
+    // jcasRegisteredTypes is empty (the JCCI got registered at slot 0 
instead).
+    jcas.getCasType(Annotation.type);
+
+    // (2) setDocumentText eventually calls Annotation.setBegin, which 
dispatches through the
+    // _FC_begin callsite. With the bug, that callsite still returns -1 and 
the offset lookup
+    // throws ArrayIndexOutOfBoundsException.
+    jcas.setDocumentText("hello");
+  }
+
+  @Test
+  void thatLoadingAnnotationBeforeCasDoesNotBreakTypeSystemManagement() throws 
Exception {
+    var windows = getProperty("os.name").toLowerCase().contains("win");

Review Comment:
   Windows detection via `os.name.toLowerCase().contains("win")` can mis-detect 
non-Windows OS names containing "win" (e.g. "Darwin"). Use a stricter check 
such as `startsWith("windows")` (or similar) to avoid building a non-existent 
`java.exe` path.
   



##########
uimaj-core/src/main/java/org/apache/uima/cas/impl/TypeSystemImpl.java:
##########
@@ -2867,11 +2867,40 @@ public List<TypeImpl> getAllTypes() {
   }
 
   public static final TypeSystemImpl staticTsi = new TypeSystemImpl();
-  static {
-    TypeSystemImpl tsi = staticTsi.commit(); // needed to assign adjusted 
offsets to the builtins
-    if (tsi != staticTsi) {
-      Misc.internalError();
+
+  // staticTsi was previously committed inside this class's <clinit>. That 
triggered a recursive
+  // load-cover-classes storm: a built-in JCas cover class's <clinit> could be 
the very thing that
+  // caused TypeSystemImpl to load (e.g. via createCallSiteForBuiltIn). The 
commit then asked the
+  // JVM to fully initialize that same cover class while it was still mid-init 
on the current
+  // thread, producing a zero/default typeIndexID and broken callsites (issue 
#234). Commit is now
+  // deferred until first real demand via committedStaticTsi().
+  //
+  // Note: under deferred commit, another TypeSystemImpl with the same 
builtin-only structure may
+  // be committed before staticTsi is. In that case staticTsi.commit() 
consolidates to that other
+  // instance via committedTypeSystems (and staticTsi itself stays 
unfinalized). We therefore cache
+  // and return whatever commit() returns -- that is the instance with the 
correctly computed
+  // offsets.
+  private static volatile TypeSystemImpl committedStaticTsi;
+
+  /**
+   * Returns a committed builtin-only {@link TypeSystemImpl} -- either {@link 
#staticTsi} itself if
+   * it was the first one to commit, or whatever structurally-equivalent 
committed instance it was
+   * consolidated to. Commit is performed lazily and exactly once across all 
threads. Code that
+   * needs commit-dependent state (e.g. {@link TypeImpl#getAdjOffset(String)}) 
must go through this
+   * accessor rather than reading {@link #staticTsi} directly.
+   */
+  public static TypeSystemImpl committedStaticTsi() {
+    TypeSystemImpl result = committedStaticTsi;
+    if (result == null) {
+      synchronized (TypeSystemImpl.class) {
+        result = committedStaticTsi;
+        if (result == null) {
+          result = staticTsi.commit();
+          committedStaticTsi = result;

Review Comment:
   The new lazy-commit design explicitly allows `staticTsi.commit()` to 
consolidate to a different committed instance, leaving `staticTsi` itself 
unfinalized. However, other code (notably `createCallSiteForBuiltIn(...)`) 
still uses `staticTsi` directly to read `jcasRegisteredTypes` and compute 
`TypeImpl#getAdjOffset`, which can yield `null`/`-1` and leave builtin call 
sites permanently incorrect when the commit already happened on a different 
instance. Consider updating builtin callsite creation to use the 
already-committed instance (`committedStaticTsi` if non-null) for offset 
lookups, and avoid forcing a commit during cover-class initialization; also, 
the existing `if (staticTsi == null)` guard in `createCallSiteForBuiltIn` is 
ineffective since `staticTsi` is `static final`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to