gemini-code-assist[bot] commented on code in PR #36073: URL: https://github.com/apache/beam/pull/36073#discussion_r2337632022
########## sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJsonUtils.java: ########## @@ -63,14 +67,41 @@ public static void increaseDefaultStreamReadConstraints(int newLimit) { } static { - increaseDefaultStreamReadConstraints(100 * 1024 * 1024); + increaseDefaultStreamReadConstraints(MAX_STRING_LENGTH); + } + + /** + * Creates a thread-safe JsonFactory with custom stream read constraints. + * + * <p>This method encapsulates the logic to increase the default jackson-databind stream read + * constraint to 100MB. This functionality was introduced in Jackson 2.15 causing string > 20MB + * (5MB in <2.15.0) parsing failure. This has caused regressions in its dependencies including + * Beam. Here we create a streamReadConstraints minimum size limit set to 100MB and exposing the + * factory to higher limits. If needed, call this method during pipeline run time, e.g. in + * DoFn.setup. This avoids a data race caused by modifying the global default settings. + */ + public static JsonFactory createJsonFactory(int sizeLimit) { + sizeLimit = Math.max(sizeLimit, MAX_STRING_LENGTH); + JsonFactory jsonFactory = new JsonFactory(); + try { + // Check if StreamReadConstraints is available (Jackson 2.15+) + Class.forName("com.fasterxml.jackson.core.StreamReadConstraints"); + com.fasterxml.jackson.core.StreamReadConstraints streamReadConstraints = + com.fasterxml.jackson.core.StreamReadConstraints.builder() + .maxStringLength(sizeLimit) + .build(); + jsonFactory.setStreamReadConstraints(streamReadConstraints); + } catch (ClassNotFoundException e) { + // If the class is not found (i.e., Jackson version < 2.15), do nothing. + } + return jsonFactory; } Review Comment:  For performance and robustness, the check for `StreamReadConstraints` availability should be done only once, and the code that depends on it should be isolated to prevent `NoClassDefFoundError`. The current implementation calls `Class.forName` on every invocation of `createJsonFactory`, which is inefficient. A better approach is to perform this check once in a static context and cache the result. Additionally, using a static nested helper class to encapsulate the logic that uses `StreamReadConstraints` is a robust pattern for handling optional dependencies. This ensures the helper class is only loaded when the dependency is confirmed to be present. I've provided a suggestion that implements these improvements. ```java private static final boolean STREAM_READ_CONSTRAINTS_AVAILABLE = streamReadConstraintsAvailable(); private static boolean streamReadConstraintsAvailable() { try { Class.forName("com.fasterxml.jackson.core.StreamReadConstraints"); return true; } catch (ClassNotFoundException e) { return false; } } private static class StreamReadConstraintsHelper { static void setStreamReadConstraints(JsonFactory jsonFactory, int sizeLimit) { com.fasterxml.jackson.core.StreamReadConstraints streamReadConstraints = com.fasterxml.jackson.core.StreamReadConstraints.builder() .maxStringLength(sizeLimit) .build(); jsonFactory.setStreamReadConstraints(streamReadConstraints); } } /** * Creates a thread-safe JsonFactory with custom stream read constraints. * * <p>This method encapsulates the logic to increase the default jackson-databind stream read * constraint to 100MB. This functionality was introduced in Jackson 2.15 causing string > 20MB * (5MB in <2.15.0) parsing failure. This has caused regressions in its dependencies including * Beam. Here we create a streamReadConstraints minimum size limit set to 100MB and exposing the * factory to higher limits. If needed, call this method during pipeline run time, e.g. in * DoFn.setup. This avoids a data race caused by modifying the global default settings. */ public static JsonFactory createJsonFactory(int sizeLimit) { sizeLimit = Math.max(sizeLimit, MAX_STRING_LENGTH); JsonFactory jsonFactory = new JsonFactory(); if (STREAM_READ_CONSTRAINTS_AVAILABLE) { StreamReadConstraintsHelper.setStreamReadConstraints(jsonFactory, sizeLimit); } return jsonFactory; } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org