dhruvaggarwal2000 opened a new pull request, #1716: URL: https://github.com/apache/commons-lang/pull/1716
This issue is tracked as [LANG-1824](https://issues.apache.org/jira/browse/LANG-1824). Repro: ```java final String text = "axb"; final String search = "x"; final String replacement = StringUtils.repeat('y', 33_554_433); Strings.CS.replace(text, search, replacement, 64); // → NegativeArraySizeException ``` The pre-allocation estimate inside replace is (replacement.length() - search.length()) * multiplier, where multiplier is Math.min(max, 64) for non-negative max and 16 otherwise. With the inputs above, that product is exactly Integer.MAX_VALUE + 1, wraps negative in int, and the subsequent new StringBuilder(text.length() + increase) is handed a negative capacity. Cause: The pre-allocation estimate runs entirely in `int`: ```java int increase = Math.max(replacement.length() - replLength, 0); increase *= max < 0 ? 16 : Math.min(max, 64); final StringBuilder buf = new StringBuilder(text.length() + increase); ``` The × multiplier step alone can exceed Integer.MAX_VALUE. Once increase wraps negative, text.length() + increase produces a value new StringBuilder(int) rejects. There was no clamp against the JVM's array-size limit either, so even a correctly-computed int capacity could exceed what the constructor accepts on some JVMs. Fix: Fix: Compute the capacity in `long` instead of `int`, and clamp the result to `ArrayUtils.SAFE_MAX_ARRAY_LENGTH` before casting back. The math moves into a small package-private helper, `computeInitialCapacity`, so the overflow paths are directly unit-testable. The multipliers (`16` when `max < 0`, otherwise `Math.min(max, 64)`) are unchanged — any input that didn't overflow before allocates the same capacity as before. <details> <summary><b>Compatibility report</b> (<code>mvn package japicmp:cmp</code> against 3.20.0)</summary> ### `org.apache.commons.lang3.Strings` - [X] Binary-compatible - [X] Source-compatible - [X] Serialization-compatible | Status | Modifiers | Type | Name | Extends | JDK | Serialization | Compatibility Changes | |----------|---------------------|-------|-----------|------------|-------|---------------------|-----------------------| | Modified | `public` `abstract` | Class | `Strings` | [`Object`] | JDK 8 | ![Not serializable] | ![No changes] | The class is reported as Modified only because of the internal refactor — no method signatures, fields, or supertypes changed. All three compatibility axes are clean. </details> - [x] Read the contribution guidelines for this project. - [x] Read the ASF Generative Tooling Guidance. - AI used: Claude (Anthropic, claude-opus-4-7) via the Claude Code CLI. Assisted with Javadoc drafting for the helper. I reviewed every change, validated the arithmetic by hand for each overflow path, and confirmed the existing Strings.replace callers are unaffected. Per ASF guidance: Anthropic's terms place no restrictions inconsistent with the OSD; no third-party material is reproduced in the output. - [x] Run a successful build using the default Maven goal with mvn. - [x] Write unit tests that match behavioral changes. - [x] Pull request description detailed enough to understand what the PR does, how, and why. - [x] Each commit has a meaningful subject line. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
