I admit I haven't read this thread in detail, but is it a requirement for you to use HTTP site-to-site?
I would think you could avoid this issue by using traditional raw site-to-site, which goes over a direct socket and does not hit Jetty. If you do want to modify Jetty's configuration, you would have to modify this part of the code and create a custom build of NiFi:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java#L294

Probably just remove that call to the gzip method that wraps all the handlers.

On Thu, Feb 14, 2019 at 11:46 AM Joe Witt <[email protected]> wrote:
>
> ...interesting. I don't have an answer but will initiate some research.
> Hopefully someone else replies if they know off-hand.
>
> Thanks
>
> On Thu, Feb 14, 2019 at 11:43 AM Pat White <[email protected]> wrote:
>>
>> Hi Folks,
>>
>> Could someone point me at the correct way to modify NiFi's embedded Jetty
>> configuration settings? Specifically, I'd like to turn off Jetty's automatic
>> compression of the payload.
>>
>> Reason for asking: I think I've found my performance issue. Uncompressed
>> input to Jetty is getting automatically compressed, by Jetty, causing very
>> small and fragmented packets to be sent, which pegs the CPU receive thread
>> recombining and uncompressing the incoming packets. I'd like to verify by
>> turning off auto-compress.
>>
>> This is what I'm seeing: app-layer compressed data (NiFi output port
>> compression=on) is accepted by Jetty as-is and sent over as large, complete
>> TCP packets, which the receiver is able to keep up with (I do not see the
>> receive net buffers fill up). With app-layer uncompressed data (NiFi output
>> port compression=off), Jetty automatically compresses and sends the payload
>> as many small fragmented packets; this causes high CPU load on the receiver
>> and fills up the net buffers, causing a great deal of throttling and
>> backoff to the sender.
>> This is consistent in Wireshark traces: the good case shows no throttling,
>> the bad case shows constant throttling with backoff.
>>
>> I've checked the User and Admin Guides, as well as looking at JettyServer
>> and web/webdefault.xml for such controls, but I'm clearly missing
>> something; changes have no effect on the server behavior. I'd appreciate
>> any help on how to set Jetty configs properly, thank you.
>>
>> patw
>>
>> On Tue, Feb 5, 2019 at 9:07 AM Pat White <[email protected]> wrote:
>>>
>>> Hi Mark, thank you very much for the feedback, and the JettyServer
>>> reference; I will take a look at that code.
>>>
>>> I'll update the thread if I get any more info. It's a very strange issue,
>>> and hard to see what's going on in the stream due to HTTPS encryption.
>>> Our use case is fairly basic: get/put flows using HTTPS over S2S. I'd
>>> expect folks would have hit this if it is indeed an issue, so I tend to
>>> suspect my install or config; however, the behavior is very consistent,
>>> across multiple clean installs, with small files as well as larger files
>>> (10s of MB vs GB sized files).
>>>
>>> Thanks again.
>>>
>>> patw
>>>
>>> On Mon, Feb 4, 2019 at 5:18 PM Mark Payne <[email protected]> wrote:
>>>>
>>>> Hey Pat,
>>>>
>>>> I saw this thread but have not yet had a chance to look into it. So
>>>> thanks for following up!
>>>>
>>>> The embedded server is handled in the JettyServer class [1]. I can
>>>> imagine that it may automatically turn on GZIP. When pushing data,
>>>> though, the client would be the one supplying the stream of data, so the
>>>> client is not GZIP'ing the data. But when requesting from Jetty, it may
>>>> well be that Jetty is compressing the data. If that is the case, I would
>>>> imagine that we could easily update the Site-to-Site client to add an
>>>> Accept-Encoding header of None.
>>>> I can't say for sure, off the top of my head, though, that it will be as
>>>> simple a fix as I'm hoping :)
>>>>
>>>> Thanks
>>>> -Mark
>>>>
>>>> [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
>>>>
>>>> On Feb 4, 2019, at 5:58 PM, Pat White <[email protected]> wrote:
>>>>
>>>> This looks like thrashing behavior in compress/decompress. I found that
>>>> if I enable compression in the output port of the receiver's RPG, the
>>>> issue goes away; throughput becomes just as good as for the sender's
>>>> flow. Again, though, I believe I have compression off for all flows and
>>>> components. The only thing I can think of is that Jetty is enforcing
>>>> compression and has an issue with an uncompressed stream, but I'm not
>>>> sure why only in one direction.
>>>>
>>>> Could someone point me to where NiFi's embedded Jetty configuration code
>>>> is, or the equivalent controls?
>>>>
>>>> patw
>>>>
>>>> On Fri, Feb 1, 2019 at 4:13 PM Pat White <[email protected]> wrote:
>>>>>
>>>>> Hi Folks,
>>>>>
>>>>> I'm trying to track a very odd performance issue, on 1.6.0 using S2S,
>>>>> and would like to ask if there are any known issues like this or if my
>>>>> flow configuration is broken. From the point of view of the RPG,
>>>>> receiving the same 1.5 GB file takes ~15x longer than a send from that
>>>>> RPG. I've set up two simple flows and see this behavior consistently.
>>>>> I also duplicated the flows between two single-node instances to verify
>>>>> whether the behavior follows the transfer direction or the node; the
>>>>> behavior follows the direction of transfer, i.e. a receive on both
>>>>> nodes is much slower than a send.
>>>>>
>>>>> Flows are:
>>>>>
>>>>> FlowA: GetFile_nodeA > OutputPort_nodeA > RPG_nodeB > PutFile_nodeB
>>>>> FlowB: GetFile_nodeB > RPG_nodeB > InputPort_nodeA > PutFile_nodeA
>>>>>
>>>>> For the same 1.5 GB file, FlowA will consistently transfer at ~3.5 MB/s
>>>>> while FlowB transfers at ~52.0 MB/s. This is with default values for
>>>>> all processors, connections, and the RPG, with the exception that the
>>>>> RPG uses HTTPS (instead of raw); the nodes are running secure. The same
>>>>> policy values were applied on both nodes to both flows.
>>>>>
>>>>> Aside from the latency difference, the transfers appear to work fine,
>>>>> with no anomalies that I can find; the file transfers correctly in both
>>>>> directions. The one anomaly I do see is that in the slow case, the
>>>>> destination node's CPU goes to 100% for the majority of the 6 to 7
>>>>> minutes it takes to transfer the file. From a jstack on the thread
>>>>> that's using 99%+ of CPU, it looks like this thread is spending a lot
>>>>> of time in nifi.remote.util.SiteToSiteRestApiClient.read doing
>>>>> LazyDecompressingInputStream/InflaterInputStream, which puzzles me
>>>>> quite a bit because all of the ports have compression turned off; there
>>>>> should be no compress/decompress activity, as far as I can tell.
>>>>>
>>>>> Example stack for that thread:
>>>>>
>>>>> "Timer-Driven Process Thread-6" #90 prio=5 os_prio=0 tid=0x00007f4c48002000 nid=0xdb38 runnable [0x00007f4c734f5000]
>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>         at java.util.zip.Inflater.inflateBytes(Native Method)
>>>>>         at java.util.zip.Inflater.inflate(Inflater.java:259)
>>>>>         - locked <0x00007f55d891cf50> (a java.util.zip.ZStreamRef)
>>>>>         at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
>>>>>         at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
>>>>>         at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
>>>>>         at org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:58)
>>>>>         at org.apache.nifi.remote.util.SiteToSiteRestApiClient$3.read(SiteToSiteRestApiClient.java:722)
>>>>>         at java.io.InputStream.read(InputStream.java:179)
>>>>>         at org.apache.nifi.remote.io.InterruptableInputStream.read(InterruptableInputStream.java:57)
>>>>>         at org.apache.nifi.stream.io.ByteCountingInputStream.read(ByteCountingInputStream.java:51)
>>>>>         at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:82)
>>>>>         at org.apache.nifi.stream.io.LimitingInputStream.read(LimitingInputStream.java:88)
>>>>>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>>>>         at org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:57)
>>>>>         at org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:53)
>>>>>         at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:62)
>>>>>         at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35)
>>>>>         at org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:744)
>>>>>         at org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2990)
>>>>>         at org.apache.nifi.remote.StandardRemoteGroupPort.receiveFlowFiles(StandardRemoteGroupPort.java:419)
>>>>>         at org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:286)
>>>>>         at org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
>>>>>         at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
>>>>>         at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>         at java.lang.Thread.run(Thread.java:748)
>>>>>
>>>>> Has anyone seen this behavior or symptoms like this?
>>>>>
>>>>> patw
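The jstack above shows the receive side inflating a GZIP stream end to end, which matches the compress/decompress thrashing theory discussed in this thread. A small self-contained demonstration (stdlib only, not NiFi code) of why GZIP'ing a payload that is already compressed, or otherwise incompressible, only burns CPU:

```java
import java.io.ByteArrayOutputStream;
import java.security.SecureRandom;
import java.util.zip.GZIPOutputStream;

public class GzipOverheadDemo {
    // GZIP-compress a buffer and return the compressed size in bytes.
    public static int gzipSize(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.size();
    }

    public static void main(String[] args) throws Exception {
        byte[] incompressible = new byte[1 << 20];     // 1 MiB of random bytes:
        new SecureRandom().nextBytes(incompressible);  // a stand-in for already-compressed payload
        byte[] compressible = new byte[1 << 20];       // 1 MiB of zeros: compresses extremely well

        System.out.println("random bytes -> " + gzipSize(incompressible) + " compressed bytes");
        System.out.println("zero bytes   -> " + gzipSize(compressible) + " compressed bytes");
        // For the random buffer, the "compressed" stream comes out slightly
        // LARGER than the input (deflate falls back to stored blocks plus
        // GZIP framing), so a second compression pass costs CPU on both ends
        // for negative gain -- consistent with the receive thread pegging 100%.
    }
}
```

This is why the thread's observation makes sense: when the output port already compresses (compression=on), Jetty's extra GZIP layer accomplishes nothing but still has to be inflated on the receive path shown in the stack trace.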

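Following up on the suggestion earlier in the thread that the Site-to-Site client could send an Accept-Encoding header to suppress response compression: a minimal sketch of that request change, using the JDK's own java.net.http for illustration. NiFi's SiteToSiteRestApiClient actually sits on Apache HttpComponents rather than this API, and the endpoint URL below is hypothetical; note also that the standard token for "no content coding" is `identity`, not `None`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class NoCompressionRequest {
    // Build a GET request that asks the server not to apply any content
    // coding to the response ("identity" = send the bytes untransformed).
    public static HttpRequest build(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept-Encoding", "identity")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical site-to-site endpoint, for illustration only.
        HttpRequest req = build("https://nifi.example.com/nifi-api/data-transfer/ports");
        System.out.println(req.headers().firstValue("Accept-Encoding").orElse("(none)"));
        // prints "identity"
    }
}
```

With Apache HttpClient 4.4+, the equivalent client-side switch would be disabling content compression on the RequestConfig (setContentCompressionEnabled(false)), if I recall the API correctly; a server that honors Accept-Encoding would then stop GZIP'ing the stream without any Jetty rebuild.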