RAVINARAYAN SINGH created NIFI-15025:
----------------------------------------
Summary: LookupFailureException with large HTTP responses due to
BufferedInputStream mark/reset limitation
Key: NIFI-15025
URL: https://issues.apache.org/jira/browse/NIFI-15025
Project: Apache NiFi
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: RAVINARAYAN SINGH
Assignee: RAVINARAYAN SINGH
Fix For: 2.7.0
When RestLookupService wraps a large HTTP response body in a
BufferedInputStream, NiFi record readers can fail with the following error:
{code:java}
org.apache.nifi.lookup.LookupFailureException: java.io.IOException: Resetting to invalid mark
Caused by: java.io.IOException: Resetting to invalid mark
{code}
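This happens because BufferedInputStream only keeps a mark valid within a
bounded window: the read limit passed to mark() or the internal buffer size
(8 KB by default), whichever is larger. Once the reader has consumed more bytes
than that since mark(), the mark is discarded and reset() throws the
IOException shown above.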
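The failure can be reproduced outside NiFi. The following is a minimal sketch, not NiFi code: the class name MarkResetDemo, the helper canReset, and the read limit of 1024 are assumptions chosen for illustration, and the read limit that NiFi record readers actually pass to mark() may differ.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {

    // Marks the stream, reads bytesToRead bytes one at a time, then calls reset().
    // Returns true if reset() succeeded, false if it threw an IOException.
    static boolean canReset(int bufferSize, int readLimit, int bytesToRead) throws IOException {
        byte[] payload = new byte[bytesToRead + 1];
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(payload), bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return true;
            } catch (IOException e) {
                // BufferedInputStream reports "Resetting to invalid mark" here.
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Default-sized buffer (8 KB), small read limit, large read: the mark is lost.
        System.out.println(canReset(8192, 1024, 100_000));       // false
        // Same buffer, but a read limit that covers the bytes read: the mark survives.
        System.out.println(canReset(8192, 200_000, 100_000));    // true
        // Larger buffer that holds everything read since mark(): the mark survives.
        System.out.println(canReset(256 * 1024, 1024, 100_000)); // true
    }
}
```

The third case corresponds to the configurable buffer size option: a larger buffer moves the failure threshold but does not remove it.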
Code from [RestLookupService.java|https://github.com/apache/nifi/blob/60330769f668abf963f8a32202841cafa10f1885/nifi-extension-bundles/nifi-standard-services/nifi-lookup-services-bundle/nifi-lookup-services/src/main/java/org/apache/nifi/lookup/RestLookupService.java#L383-L388]:
{code:java}
final Record record;
try (final InputStream is = responseBody.byteStream();
     final InputStream bufferedIn = new BufferedInputStream(is)) {
    record = handleResponse(bufferedIn, responseBody.contentLength(), context);
}
{code}
h3. *Proposed Fix / Solutions*
# *Remove BufferedInputStream*
Use responseBody.byteStream() directly if mark/reset is not required by the
reader.
# *Configurable Buffer Size*
Introduce a NiFi property to configure the buffer size for streams that require
buffering. The default could remain 8 KB, and users could increase it for
larger payloads.
# *Spooling InputStream Wrapper (Recommended)*
Provide a robust InputStream wrapper that preserves streaming while supporting
unlimited mark/reset via spooling to disk:
** mark() records the current absolute position.
** reset() replays bytes starting from the marked position.
** Bytes beyond the replay window are read from the underlying stream and
spooled transparently.
** The temporary spool file is deleted automatically when the stream is closed.
This ensures NiFi processors handle large HTTP payloads correctly without
running into mark/reset limits or heap issues.
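The spooling wrapper could look roughly like the sketch below. This is an illustration under stated assumptions, not an existing NiFi class: the name SpoolingInputStream, the temp file prefix, and the choice to spool every byte from the start of the stream (rather than only after mark() is called) are all hypothetical simplifications.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper: copies every byte read from the source into a temp
// file so that reset() can replay from any previously marked position.
public class SpoolingInputStream extends InputStream {
    private final InputStream source;
    private final Path spoolFile;
    private final RandomAccessFile spool;
    private long position;      // absolute offset of the next byte returned to the caller
    private long spooled;       // number of bytes copied from the source into the spool file
    private long markPosition;  // absolute offset recorded by mark(); 0 until mark() is called

    public SpoolingInputStream(InputStream source) throws IOException {
        this.source = source;
        this.spoolFile = Files.createTempFile("spool-", ".tmp");
        this.spool = new RandomAccessFile(spoolFile.toFile(), "rw");
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (position < spooled) {
            // Replay window: serve bytes that were already spooled to disk.
            spool.seek(position);
            int n = spool.read(b, off, (int) Math.min(len, spooled - position));
            position += n;
            return n;
        }
        // Past the replay window: read from the source and append to the spool.
        int n = source.read(b, off, len);
        if (n > 0) {
            spool.seek(spooled);
            spool.write(b, off, n);
            spooled += n;
            position += n;
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public synchronized void mark(int readLimit) {
        // The read limit is ignored: the spool file makes the replay window unbounded.
        markPosition = position;
    }

    @Override
    public synchronized void reset() {
        position = markPosition;
    }

    @Override
    public void close() throws IOException {
        try {
            source.close();
        } finally {
            try {
                spool.close();
            } finally {
                Files.deleteIfExists(spoolFile);
            }
        }
    }
}
```

With a wrapper of this shape, the try-with-resources block in RestLookupService would construct the wrapper in place of BufferedInputStream. A production version would likely keep a small in-memory buffer in front of the file and only spool to disk once a size threshold is exceeded.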