derrickaw commented on code in PR #36073:
URL: https://github.com/apache/beam/pull/36073#discussion_r2337648968


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJsonUtils.java:
##########
@@ -63,14 +67,41 @@ public static void increaseDefaultStreamReadConstraints(int newLimit) {
   }
 
   static {
-    increaseDefaultStreamReadConstraints(100 * 1024 * 1024);
+    increaseDefaultStreamReadConstraints(MAX_STRING_LENGTH);
+  }
+
+  /**
+   * Creates a thread-safe JsonFactory with custom stream read constraints.
+   *
+   * <p>This method encapsulates the logic to increase the default jackson-databind stream read
+   * constraint to 100MB. This functionality was introduced in Jackson 2.15 causing string > 20MB
+   * (5MB in <2.15.0) parsing failure. This has caused regressions in its dependencies including
+   * Beam. Here we create a streamReadConstraints minimum size limit set to 100MB and exposing the
+   * factory to higher limits. If needed, call this method during pipeline run time, e.g. in
+   * DoFn.setup. This avoids a data race caused by modifying the global default settings.
+   */
+  public static JsonFactory createJsonFactory(int sizeLimit) {
+    sizeLimit = Math.max(sizeLimit, MAX_STRING_LENGTH);
+    JsonFactory jsonFactory = new JsonFactory();
+    try {
+      // Check if StreamReadConstraints is available (Jackson 2.15+)
+      Class.forName("com.fasterxml.jackson.core.StreamReadConstraints");
+      com.fasterxml.jackson.core.StreamReadConstraints streamReadConstraints =
+          com.fasterxml.jackson.core.StreamReadConstraints.builder()
+              .maxStringLength(sizeLimit)
+              .build();
+      jsonFactory.setStreamReadConstraints(streamReadConstraints);
+    } catch (ClassNotFoundException e) {
+      // If the class is not found (i.e., Jackson version < 2.15), do nothing.
+    }
+    return jsonFactory;
   }

Review Comment:
   done, thanks
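
   For readers following the thread, the two guards in `createJsonFactory` (the `Math.max` floor and the `Class.forName` probe) can be sketched without a Jackson dependency. This is only a minimal illustration, not the PR's code: `LimitSketch`, `effectiveLimit`, and `isClassAvailable` are hypothetical names, and the 100MB value for `MAX_STRING_LENGTH` is an assumption taken from the Javadoc and from the literal it replaced in the static block.

```java
// Hypothetical sketch: these names are illustrative, not Beam or Jackson API.
final class LimitSketch {
  // Assumed floor of 100MB, matching the literal replaced in the static block.
  static final int MAX_STRING_LENGTH = 100 * 1024 * 1024;

  private LimitSketch() {}

  /** Returns the requested limit, raised to the assumed 100MB floor if it is lower. */
  static int effectiveLimit(int requested) {
    return Math.max(requested, MAX_STRING_LENGTH);
  }

  /** Returns true if the named class can be loaded from the current classpath. */
  static boolean isClassAvailable(String className) {
    try {
      Class.forName(className);
      return true;
    } catch (ClassNotFoundException e) {
      // Class is absent (for example, an older library version): report it, do not fail.
      return false;
    }
  }
}
```

   Under these assumptions, a caller asking for a limit below the floor still gets 100MB, while a larger request passes through unchanged. The probe returns false rather than throwing when the class is missing, which mirrors the "do nothing" branch in the diff.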


