Akanksha-kedia opened a new pull request, #18892:
URL: https://github.com/apache/pinot/pull/18892

   ## Description
   
   Refactors the 4-argument `splitPart(String, String, int limit, int index)` 
overload to avoid allocating a full `String[]` array via 
`StringUtils.splitByWholeSeparator()` on every invocation. Instead, the 
implementation now scans the input string directly and extracts only the 
requested field.
   
   ## Related Issue
   
   Addresses #17585
   
   ## Changes Made
   
   - Replaced the array-based implementation with index-based forward scanning 
that extracts only the target field without materializing all split parts
   - New `splitPartLimitedForward` method scans forward, skipping leading 
separators, collapsing consecutive separators, and handling trailing separators 
(one empty trailing token) — matching `splitByWholeSeparator` semantics exactly
   - New `countFieldsLimited` method counts total fields without allocating 
String objects (used for negative index resolution)
   - Falls back to the array-based path only for null/empty delimiters 
(whitespace splitting) where the tokenization rules are complex
   - Improved Javadoc on the public method
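
For reviewers who want a picture of the forward scan described above, here is a minimal, self-contained sketch. It is not the code in this PR: the class name `SplitPartSketch` and the method `fieldAt` are hypothetical, a zero-based `index` is assumed, and returning Java `null` for an out-of-range index is an illustration choice rather than the function's actual contract. It only aims to mirror the tokenization rules listed in the bullets (leading and consecutive separators skipped, one empty trailing token, `limit <= 0` meaning no limit).

```java
public final class SplitPartSketch {
  private SplitPartSketch() {
  }

  /**
   * Hypothetical helper: returns the zero-based field {@code index} of {@code input}
   * split on the non-empty {@code sep}, or null when the field does not exist.
   * A {@code limit <= 0} means "no limit"; otherwise the limit-th field keeps the
   * unsplit remainder of the input.
   */
  static String fieldAt(String input, String sep, int limit, int index) {
    if (index < 0 || input.isEmpty()) {
      return null; // negative indices need a field count first; not handled here
    }
    int sepLen = sep.length();
    int beg = 0;   // start of the candidate field
    int field = 0; // zero-based number of the next non-empty field
    while (true) {
      int end = input.indexOf(sep, beg);
      if (end < 0) {
        // No separator left: the remainder is the final field. It is empty when
        // the input ends with a separator (the single empty trailing token).
        return field == index ? input.substring(beg) : null;
      }
      if (end > beg) {
        boolean last = field + 1 == limit; // limit reached: field runs to the end
        if (field == index) {
          return last ? input.substring(beg) : input.substring(beg, end);
        }
        if (last) {
          return null; // requested field lies beyond the limit
        }
        field++;
      }
      // end == beg means a leading or consecutive separator: skip it.
      beg = end + sepLen;
    }
  }
}
```

Only the one requested substring is ever created; fields before it are skipped by moving `beg` past each separator.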
   
   ### Performance characteristics
   
   - **Positive index**: single forward scan that stops at the requested 
field, so the work is proportional to the input scanned up to that field; zero 
intermediate String allocations
   - **Negative index**: two forward passes (count + extract), still no 
`String[]` allocation
   - Reduces GC pressure in hot query paths where `splitPart` is invoked per-row
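
The negative-index case can be sketched in the same spirit: count the fields without creating any substrings, then turn the negative index into a forward one. The names `NegativeIndexSketch`, `countFields` and `resolveIndex` are hypothetical and are not the methods added by this PR, and `-1` as the "no such field" result is an assumption made for illustration.

```java
public final class NegativeIndexSketch {
  private NegativeIndexSketch() {
  }

  /**
   * Hypothetical helper: counts the fields a split on the non-empty {@code sep}
   * would produce, without allocating any String. {@code limit <= 0} means no limit.
   */
  static int countFields(String input, String sep, int limit) {
    if (input.isEmpty()) {
      return 0;
    }
    int sepLen = sep.length();
    int beg = 0;
    int count = 0;
    while (true) {
      int end = input.indexOf(sep, beg);
      if (end < 0) {
        return count + 1; // the remainder (possibly empty) is the final field
      }
      if (end > beg) {
        count++;
        if (count == limit) {
          return count; // the limit-th field swallows the rest of the input
        }
      }
      beg = end + sepLen; // also skips leading and consecutive separators
    }
  }

  /** Maps a possibly negative index to a forward index, or -1 when out of range. */
  static int resolveIndex(int index, int fieldCount) {
    if (index >= 0) {
      return index < fieldCount ? index : -1;
    }
    // fieldCount >= 0 and index < 0, so this sum cannot overflow,
    // even for Integer.MIN_VALUE; negating the index instead could.
    int resolved = fieldCount + index;
    return resolved >= 0 ? resolved : -1;
  }
}
```

Adding the count to the negative index, rather than negating the index, is one way to stay safe for `Integer.MIN_VALUE` without a special case.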
   
   ## Testing Done
   
   - [x] All 172 existing unit tests pass (including comprehensive edge cases 
for trailing delimiters, consecutive delimiters, multi-char delimiters, 
negative indices, and Integer.MIN_VALUE guard)
   - [x] Randomized fuzz test (10,000 iterations) passes, confirming behavioral 
equivalence with the array-based reference implementation
   - [x] `ScalarTransformFunctionWrapperTest` (90 tests) passes
   - [x] Spotless formatting check passes
   - [x] Checkstyle validation passes (0 violations)
   - [x] License header check passes
   - [x] No new compiler warnings introduced
   
   ## Checklist
   
   - [x] Code follows project style guidelines
   - [x] Self-review completed
   - [x] No new warnings introduced
   - [x] Backward compatible — public contract unchanged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

