Pat,
It appears to be hard-coded in JettyServer (full path is
nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java).
Line 294 calls the gzip method, which looks like:

    private Handler gzip(final Handler handler) {
        final GzipHandler gzip = new GzipHandler();
        gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
        gzip.setHandler(handler);
        return gzip;
    }
We probably would want to add a "gzip.setExcludedPaths()" call to exclude
anything that goes to the site-to-site path.
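A sketch of what that might look like. The path spec here is a guess for illustration; the actual path spec for the HTTP site-to-site endpoints would need to be confirmed, and this assumes Jetty's `setExcludedPaths(String...)` on `GzipHandler`:

```java
import org.eclipse.jetty.server.Handler;
import org.eclipse.jetty.server.handler.gzip.GzipHandler;

    private Handler gzip(final Handler handler) {
        final GzipHandler gzip = new GzipHandler();
        gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
        // Hypothetical path spec: exclude the HTTP site-to-site transfer
        // endpoints so their payloads pass through uncompressed.
        gzip.setExcludedPaths("/nifi-api/data-transfer/*");
        gzip.setHandler(handler);
        return gzip;
    }
```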
Thanks
-Mark
On Feb 14, 2019, at 11:46 AM, Joe Witt <[email protected]> wrote:
...interesting. I don't have an answer but will initiate some research.
Hopefully someone else replies if they know off-hand.
Thanks
On Thu, Feb 14, 2019 at 11:43 AM Pat White <[email protected]> wrote:
Hi Folks,
Could someone point me at the correct way to modify NiFi's embedded Jetty
configuration settings? Specifically, I'd like to turn off Jetty's automatic
compression of the payload.
Reason for asking: I think I've found my performance issue. Uncompressed input to
Jetty is getting automatically compressed, by Jetty, causing very small and
fragmented packets to be sent, which pegs the CPU receive thread recombining
and uncompressing the incoming packets. I'd like to verify by turning off
auto-compress.
This is what I'm seeing: app-layer compressed data (NiFi output port
compression=on) is accepted by Jetty as-is and sent over as large, complete TCP
packets, which the receiver is able to keep up with (I do not see receive net
buffers fill up). With app-layer uncompressed data (NiFi output port
compression=off), Jetty automatically wants to compress, and sends the payload
as many small fragmented packets; this causes high CPU load on the receiver and
fills up the net buffers, causing a great deal of throttling and backoff to the
sender. This is consistent in Wireshark traces: the good case shows no
throttling, the bad case shows constant throttling with backoff.
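Not NiFi code, but a small stdlib illustration of the effect being described: when a gzip stream is sync-flushed per small chunk (as a server streaming compressed output would), the output is emitted as many small deflate blocks and ends up larger than compressing the same bytes in one pass:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

public class GzipFlushDemo {
    // Compress `chunks` copies of `chunk`, optionally forcing a sync-flush
    // after each chunk, and return the total compressed size in bytes.
    static int gzipSize(byte[] chunk, int chunks, boolean flushEach) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(bos, true); // syncFlush on
            for (int i = 0; i < chunks; i++) {
                gz.write(chunk);
                if (flushEach) {
                    gz.flush(); // emits a deflate sync-flush block per chunk
                }
            }
            gz.close();
            return bos.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] chunk = new byte[1024];
        Arrays.fill(chunk, (byte) 'x');
        int streamed = gzipSize(chunk, 100, true);
        int whole = gzipSize(chunk, 100, false);
        System.out.println("flushed per chunk: " + streamed
                + " bytes, single pass: " + whole + " bytes");
    }
}
```

The per-chunk-flushed stream is both larger and arrives as many small writes, which matches the small fragmented packets seen on the wire.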
I've checked the User and Admin guides, as well as JettyServer and
web/webdefault.xml, for such controls, but I'm clearly missing something;
changes have no effect on the server behavior. Appreciate any help on how to
set Jetty configs properly, thank you.
patw
On Tue, Feb 5, 2019 at 9:07 AM Pat White <[email protected]> wrote:
Hi Mark, thank you very much for the feedback, and the JettyServer reference;
I will take a look at that code.
I'll update the thread if I get any more info. Very strange issue, and hard to
see what's going on in the stream due to HTTPS encryption.
Our use case is fairly basic: get/put flows using HTTPS over S2S. I'd expect
folks would have hit this if it is indeed an issue, so I tend to suspect my
install or config; however, the behavior is very consistent across multiple
clean installs, with small files as well as larger files (tens of MB vs.
GB-sized files).
Thanks again.
patw
On Mon, Feb 4, 2019 at 5:18 PM Mark Payne <[email protected]> wrote:
Hey Pat,
I saw this thread but have not yet had a chance to look into it. So thanks for
following up!
The embedded server is handled in the JettyServer class [1]. I can imagine that
it may automatically turn on GZIP. When pushing data, though, the client would
be the one supplying the stream of data, so the client is not GZIP'ing the
data. But when requesting from Jetty, it may well be that Jetty is compressing
the data. If that is the case, I would imagine that we could easily update the
Site-to-Site client to add an Accept-Encoding header of None.
I can't say for sure, off the top of my head, that it will be as simple a fix
as I'm hoping :)
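For what it's worth, the standard HTTP token for declining compression is `Accept-Encoding: identity` rather than `None`. As a sketch of the idea using the JDK's own HTTP client (the URL is hypothetical, and the real Site-to-Site client goes through Apache HttpClient, so the actual change would be made in its request setup):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class NoCompressionRequest {
    // Build a GET request that asks the server not to compress the body.
    public static HttpRequest build(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                // "identity" = send the response body with no content coding
                .header("Accept-Encoding", "identity")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; substitute the node's real S2S URL.
        HttpRequest req = build("https://nifi-node:8443/nifi-api/site-to-site");
        System.out.println(req.headers().firstValue("Accept-Encoding").orElse("(none)"));
    }
}
```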
Thanks
-Mark
[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
On Feb 4, 2019, at 5:58 PM, Pat White <[email protected]> wrote:
This looks like thrashing behavior in compress/decompress. I found that if I
enable compression in the output port of the receiver's RPG, the issue goes
away; throughput becomes just as good as for the sender's flow. Again, though,
I believe I have compression off for all flows and components. The only thing I
can think of is that Jetty is enforcing compression and has an issue with an
uncompressed stream, but I'm not sure why only in one direction.
Could someone point me to where NiFi's embedded Jetty configuration code is, or
equivalent controls?
patw
On Fri, Feb 1, 2019 at 4:13 PM Pat White <[email protected]> wrote:
Hi Folks,
I'm trying to track a very odd performance issue. This is on 1.6.0 using S2S,
and I would like to ask if there are any known issues like this, or if my flow
configuration is broken. From the point of view of the RPG, receiving takes
~15x longer to transfer the same 1.5 GB file than a send from that RPG. I've
set up two simple flows and see this behavior consistently. I also duplicated
the flows between two single-node instances to verify whether the behavior
follows the transfer direction or the node; the behavior follows the direction
of transfer, i.e., a receive on both nodes is much slower than a send.
Flows are:
FlowA: GetFile_nodeA > OutputPort_nodeA > RPG_nodeB > PutFile_nodeB
FlowB: GetFile_nodeB > RPG_nodeB > InputPort_nodeA > PutFile_nodeA
For the same 1.5 GB file, FlowA will consistently transfer at ~3.5 MB/s and
FlowB at ~52.0 MB/s. This is leaving default values for all processors,
connections, and the RPG, with the exception that the RPG uses HTTPS (instead
of raw); the nodes are running secure. The same policy values were applied on
both nodes to both flows.
Aside from the latency difference, the transfers appear to work fine with no
anomalies that I can find; the file transfers correctly in both directions. The
one anomaly I do see is that in the slow case, the destination node's CPU goes
to 100% for the majority of the 6 to 7 minutes it takes to transfer the file.
From a jstack on the thread that's using 99%+ of the CPU, it looks like this
thread is spending a lot of time in nifi.remote.util.SiteToSiteRestApiClient.read
doing LazyDecompressingInputStream/InflaterInputStream, which puzzles me quite
a bit because all of the ports have compression turned off; there should be no
compress/decompress activity, as far as I can tell.
Example stack for that thread:
"Timer-Driven Process Thread-6" #90 prio=5 os_prio=0 tid=0x00007f4c48002000 nid=0xdb38 runnable [0x00007f4c734f5000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:259)
    - locked <0x00007f55d891cf50> (a java.util.zip.ZStreamRef)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
    at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
    at org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:58)
    at org.apache.nifi.remote.util.SiteToSiteRestApiClient$3.read(SiteToSiteRestApiClient.java:722)
    at java.io.InputStream.read(InputStream.java:179)
    at org.apache.nifi.remote.io.InterruptableInputStream.read(InterruptableInputStream.java:57)
    at org.apache.nifi.stream.io.ByteCountingInputStream.read(ByteCountingInputStream.java:51)
    at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:82)
    at org.apache.nifi.stream.io.LimitingInputStream.read(LimitingInputStream.java:88)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:57)
    at org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:53)
    at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:62)
    at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35)
    at org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:744)
    at org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2990)
    at org.apache.nifi.remote.StandardRemoteGroupPort.receiveFlowFiles(StandardRemoteGroupPort.java:419)
    at org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:286)
    at org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Has anyone seen this behavior or symptoms like this?
patw