[
https://issues.apache.org/jira/browse/NIFI-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458036#comment-17458036
]
Sam Williams commented on NIFI-9463:
------------------------------------
[~joewitt] - I think the property you're talking about is
nifi.web.request.timeout, which adjusts how long Jetty waits for a request to
complete before giving up. The main issue here is that it appears as though
the content of the flow file is being replicated to a node before it is
served to the browser. This replication can take minutes depending on the
file size, which means Jetty will time out on larger files. The simple solution
may be to stream the file from the node that physically holds it rather
than replicate the request across the cluster.
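
To illustrate the difference between the two approaches, here is a minimal
sketch in plain Java. It is not NiFi code: the class name, method names and
buffer size are hypothetical, and it only contrasts copying content in
fixed-size chunks with reading the whole payload into memory before sending.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; none of these names exist in NiFi.
public class StreamingCopySketch {

    // Arbitrary illustrative chunk size (8 KiB).
    private static final int BUFFER_SIZE = 8192;

    /**
     * Copies the source to the sink chunk by chunk. Memory use stays at
     * BUFFER_SIZE regardless of content length, and the first bytes reach
     * the sink as soon as they are read.
     */
    public static long streamContent(InputStream source, OutputStream sink) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            sink.write(buffer, 0, read);
            total += read;
        }
        sink.flush();
        return total;
    }

    /**
     * Reads the entire source into memory before writing anything. Memory
     * use grows with content length, and nothing is sent until the whole
     * payload has arrived. Requires Java 9+ for readAllBytes().
     */
    public static long bufferThenSend(InputStream source, OutputStream sink) throws IOException {
        byte[] all = source.readAllBytes();
        sink.write(all);
        sink.flush();
        return all.length;
    }

    public static void main(String[] args) throws IOException {
        byte[] content = new byte[1_000_000]; // stand-in for flow file content
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = streamContent(new ByteArrayInputStream(content), out);
        System.out.println("streamed bytes: " + copied);
    }
}
```

With the chunked copy, the client starts receiving data immediately, so a
request timeout would be less likely to fire while a large payload is still
being transferred between nodes.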
> Large file downloads timeout
> ----------------------------
>
> Key: NIFI-9463
> URL: https://issues.apache.org/jira/browse/NIFI-9463
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.12.1, 1.15.0
> Environment: Centos 7, Docker, 3-node cluster, SSL, certificate
> authentication, JVM Heap 4GB
> Reporter: Sam Williams
> Priority: Minor
>
> When attempting to download large files (greater than 500MB) from a queue or
> from provenance, the request will time out and the file will not download. The
> HTTP response from NiFi is:
>
> {code:java}
> HTTP ERROR 503: Service Unavailable
> URI: /nifi-api/flowfile-queues/<queue-id>/flowfiles/<flowfile-id>/content
> STATUS: 503
> MESSAGE: Service Unavailable
> SERVLET: jerseySpring
> {code}
>
>
>
> {code:java}
> nifi-app.log:
> <DTG> WARN [Replicate Request Thread-1337]
> o.a.n.c.c.h.r.ThreadPoolRequestReplicator
> java.net.SocketTimeoutException: timeout
> <...>
> {code}
>
> {code:java}
> nifi.properties:
> nifi.cluster.node.connection.timeout=120 secs
> nifi.cluster.node.read.timeout=120 secs
> nifi.web.request.timeout=120 secs
> {code}
>
> As I have been increasing the timeout values and the JVM heap size, I have
> managed to download larger and larger files, but download time does not seem
> to scale linearly with file size (e.g. 500MB might take ~30sec, while 600MB
> will take ~90sec to download).
> This has been happening since at least 1.12.0, and I believe it relates to
> the implementation of the Jersey client (NIFI-5112, "Inefficiency in
> replicating requests across cluster").
> My guess is that the flowfile content is being streamed back to the node
> serving the UI, which buffers the content in memory and then streams it to
> the client.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)