Sounds great Koji, thank you for looking into that.

I'm running some tests with changes to GzipHandler's included methods, and
will update if I have any useful info from that.

patw


On Fri, Feb 15, 2019 at 3:39 AM Koji Kawamura <[email protected]>
wrote:

> Hi Pat,
>
> Thanks for sharing your insights.
> I will try benchmarking before and after "gzip.setExcludedPath()" that
> Mark has suggested if it helps improving S2S HTTP throughput.
>
> Koji
>
> On Fri, Feb 15, 2019 at 9:31 AM Pat White <[email protected]>
> wrote:
> >
> > Hi Andy,
> >
> > My requirement is to use HTTPS with a minimum of TLS v1.2, HTTPS
> > being an approved protocol.
> > I haven't looked at WebSockets, though; I need to do that. Thank you
> > for the suggestion.
> >
> > patw
> >
> >
> >
> > On Thu, Feb 14, 2019 at 12:24 PM Andy LoPresto <[email protected]>
> wrote:
> >>
> >> Pat,
> >>
> >> Just to clarify: must your connection be HTTPS, or must it just be
> >> secure? What about WebSockets over TLS (wss://)?
> >>
> >> Andy LoPresto
> >> [email protected]
> >> [email protected]
> >> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >>
> >> On Feb 14, 2019, at 9:56 AM, Pat White <[email protected]>
> wrote:
> >>
> >> Thanks very much folks, definitely appreciate the feedback.
> >>
> >> Right, we're required to use TLS/HTTPS connections for S2S, so raw
> >> is not an option for me.
> >>
> >> Will look further at JettyServer and setIncludedMethods, thanks again.
> >>
> >> patw
> >>
> >> On Thu, Feb 14, 2019 at 11:07 AM Mark Payne <[email protected]>
> wrote:
> >>>
> >>> Pat,
> >>>
> >>> It appears to be hard-coded, in JettyServer (full path is
> >>> nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java)
> >>>
> >>> Line 294 calls the gzip method, which looks like:
> >>>
> >>> private Handler gzip(final Handler handler) {
> >>>     final GzipHandler gzip = new GzipHandler();
> >>>     gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
> >>>     gzip.setHandler(handler);
> >>>     return gzip;
> >>> }
> >>>
> >>>
> >>> We probably would want to add a "gzip.setExcludedPath()" call to
> exclude anything that goes to the site-to-site path.
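
[Editor's note: Jetty 9's GzipHandler exposes setExcludedPaths(String...)
for exactly this. A configuration sketch of what the exclusion might look
like; the "/nifi-api/data-transfer/*" path spec is a hypothetical
placeholder for illustration, not verified against NiFi's actual
site-to-site endpoints.]

```java
import org.eclipse.jetty.server.Handler;
import org.eclipse.jetty.server.handler.gzip.GzipHandler;

// Sketch only: same shape as the gzip() method quoted above, with a
// hypothetical exclusion for site-to-site transfer paths added.
private Handler gzip(final Handler handler) {
    final GzipHandler gzip = new GzipHandler();
    gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
    // Assumed path spec -- the real site-to-site URLs would need checking.
    gzip.setExcludedPaths("/nifi-api/data-transfer/*");
    gzip.setHandler(handler);
    return gzip;
}
```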
> >>>
> >>> Thanks
> >>> -Mark
> >>>
> >>>
> >>> On Feb 14, 2019, at 11:46 AM, Joe Witt <[email protected]> wrote:
> >>>
> >>> ...interesting. I don't have an answer but will initiate some
> >>> research. Hopefully someone else replies if they know off-hand.
> >>>
> >>> Thanks
> >>>
> >>> On Thu, Feb 14, 2019 at 11:43 AM Pat White <[email protected]>
> wrote:
> >>>>
> >>>> Hi Folks,
> >>>>
> >>>> Could someone point me at the correct way to modify NiFi's
> >>>> embedded Jetty configuration settings? Specifically, I'd like to
> >>>> turn off Jetty's automatic compression of the payload.
> >>>>
> >>>> Reason for asking: I think I've found my performance issue.
> >>>> Uncompressed input to Jetty is being automatically compressed by
> >>>> Jetty, producing very small, fragmented packets, which pegs the
> >>>> CPU on the receive thread as it recombines and decompresses the
> >>>> incoming packets. I'd like to verify this by turning off
> >>>> auto-compression.
> >>>>
> >>>> This is what I'm seeing: app-layer compressed data (NiFi output
> >>>> port compression=on) is accepted by Jetty as-is and sent as large,
> >>>> complete TCP packets, which the receiver is able to keep up with
> >>>> (the receive net buffers do not fill up). With app-layer
> >>>> uncompressed data (NiFi output port compression=off), Jetty
> >>>> automatically compresses and sends the payload as many small,
> >>>> fragmented packets; this causes high CPU load on the receiver and
> >>>> fills up the net buffers, causing a great deal of throttling and
> >>>> backoff to the sender. This is consistent in Wireshark traces: the
> >>>> good case shows no throttling, the bad case constant throttling
> >>>> with backoff.
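
[Editor's note: a rough stdlib illustration of the overhead described
above; this is not NiFi or Jetty code, just java.util.zip. When a gzip
stream is flushed per chunk, the way a container flushes each response
chunk onto the wire, incompressible data comes out larger and split across
many small sync blocks, whereas a single unflushed stream stays close to
the input size.]

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

public class GzipFlushDemo {
    // Gzip all chunks into one buffer, optionally sync-flushing per chunk.
    public static long gzipSize(byte[][] chunks, boolean flushEach) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // syncFlush=true makes flush() emit a deflate sync block right away,
        // analogous to a server pushing each chunk onto the network.
        GZIPOutputStream gz = new GZIPOutputStream(buf, true);
        for (byte[] c : chunks) {
            gz.write(c);
            if (flushEach) {
                gz.flush();
            }
        }
        gz.close();
        return buf.size();
    }

    public static void main(String[] args) throws IOException {
        Random rnd = new Random(42);
        byte[][] chunks = new byte[1000][64];
        for (byte[] c : chunks) {
            rnd.nextBytes(c); // random bytes stand in for pre-compressed data
        }
        System.out.println("single stream:   " + gzipSize(chunks, false) + " bytes");
        System.out.println("flush per chunk: " + gzipSize(chunks, true) + " bytes");
    }
}
```

The flushed variant is strictly larger, and on a real socket each flush
would also become its own small write, matching the fragmented packets seen
in the traces.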
> >>>>
> >>>> I've checked the User and Admin guides, as well as looking at
> >>>> JettyServer and web/webdefault.xml for such controls, but I'm
> >>>> clearly missing something; my changes have no effect on the
> >>>> server's behavior. I'd appreciate any help on how to set Jetty
> >>>> configs properly, thank you.
> >>>>
> >>>> patw
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 5, 2019 at 9:07 AM Pat White <[email protected]>
> wrote:
> >>>>>
> >>>>> Hi Mark, thank you very much for the feedback, and the JettyServer
> reference, will take a look at that code.
> >>>>>
> >>>>> I'll update the thread if I get any more info. It's a very
> >>>>> strange issue, and hard to see what's going on in the stream due
> >>>>> to the HTTPS encryption.
> >>>>> Our use case is fairly basic, get/put flows using HTTPS over S2S;
> >>>>> I'd expect folks would have hit this if it were indeed an issue,
> >>>>> so I tend to suspect my install or config. However, the behavior
> >>>>> is very consistent, across multiple clean installs, with small
> >>>>> files as well as larger files (10s of MB vs GB-sized files).
> >>>>>
> >>>>> Thanks again.
> >>>>>
> >>>>> patw
> >>>>>
> >>>>>
> >>>>> On Mon, Feb 4, 2019 at 5:18 PM Mark Payne <[email protected]>
> wrote:
> >>>>>>
> >>>>>> Hey Pat,
> >>>>>>
> >>>>>> I saw this thread but have not yet had a chance to look into it. So
> thanks for following up!
> >>>>>>
> >>>>>> The embedded server is handled in the JettyServer class [1]. I
> >>>>>> can imagine that it may automatically turn on GZIP. When pushing
> >>>>>> data, though, the client is the one supplying the stream of
> >>>>>> data, so the client is not GZIP'ing the data. But when
> >>>>>> requesting from Jetty, it may well be that Jetty is compressing
> >>>>>> the data. If that is the case, I would imagine that we could
> >>>>>> easily update the Site-to-Site client to add an Accept-Encoding
> >>>>>> header of None. I can't say for sure, off the top of my head,
> >>>>>> that it will be as simple a fix as I'm hoping :)
> >>>>>>
> >>>>>> Thanks
> >>>>>> -Mark
> >>>>>>
> >>>>>> [1]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
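
[Editor's note: the standard Accept-Encoding token for "no encoding" is
"identity" rather than "None" (RFC 7231). A minimal stdlib sketch of the
idea, using a toy com.sun.net.httpserver server that gzips unless the
client asks for identity; none of this is NiFi code, and where the header
would actually be set in the real Site-to-Site client is left open.]

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.zip.GZIPOutputStream;

public class AcceptEncodingDemo {
    // Toy server: gzips the response body unless the client asked for identity.
    public static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/data", exchange -> {
            byte[] body = "some site-to-site payload".getBytes("UTF-8");
            String accept = exchange.getRequestHeaders().getFirst("Accept-Encoding");
            if (accept != null && accept.contains("identity")) {
                // Client opted out of compression: send the body as-is.
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } else {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                try (OutputStream gz = new GZIPOutputStream(buf)) {
                    gz.write(body);
                }
                exchange.getResponseHeaders().set("Content-Encoding", "gzip");
                exchange.sendResponseHeaders(200, buf.size());
                exchange.getResponseBody().write(buf.toByteArray());
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Returns the Content-Encoding the server chose for a given request header.
    public static String fetchEncoding(int port, String acceptEncoding) throws IOException {
        URL url = new URL("http://localhost:" + port + "/data");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept-Encoding", acceptEncoding);
        conn.getInputStream().close(); // drain and discard the body
        String enc = conn.getHeaderField("Content-Encoding");
        return enc == null ? "identity" : enc;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start();
        int port = server.getAddress().getPort();
        System.out.println("asking for gzip:     " + fetchEncoding(port, "gzip"));
        System.out.println("asking for identity: " + fetchEncoding(port, "identity"));
        server.stop(0);
    }
}
```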
> >>>>>>
> >>>>>>
> >>>>>> On Feb 4, 2019, at 5:58 PM, Pat White <[email protected]>
> wrote:
> >>>>>>
> >>>>>> This looks like thrashing behavior in compress/decompress. I
> >>>>>> found that if I enable compression on the output port of the
> >>>>>> receiver's RPG, the issue goes away; throughput becomes just as
> >>>>>> good as for the sender's flow. Again, though, I believe I have
> >>>>>> compression off for all flows and components. The only thing I
> >>>>>> can think of is that Jetty is enforcing compression and has an
> >>>>>> issue with an uncompressed stream, but I'm not sure why only in
> >>>>>> one direction.
> >>>>>>
> >>>>>> Could someone point me to where NiFi's embedded Jetty
> >>>>>> configuration code is, or equivalent controls?
> >>>>>>
> >>>>>> patw
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Feb 1, 2019 at 4:13 PM Pat White <[email protected]>
> wrote:
> >>>>>>>
> >>>>>>> Hi Folks,
> >>>>>>>
> >>>>>>> I'm trying to track a very odd performance issue, on 1.6.0
> >>>>>>> using S2S, and would like to ask whether there are any known
> >>>>>>> issues like this or whether my flow configuration is broken.
> >>>>>>> From the point of view of the RPG, a receive takes ~15x longer
> >>>>>>> to transfer the same 1.5 GB file than a send from that RPG
> >>>>>>> does. I've set up two simple flows and see this behavior
> >>>>>>> consistently; I also duplicated the flows between two
> >>>>>>> single-node instances to verify that the behavior follows the
> >>>>>>> transfer direction rather than the node. It does: a receive on
> >>>>>>> both nodes is much slower than a send.
> >>>>>>>
> >>>>>>> Flows are:
> >>>>>>>
> >>>>>>> FlowA:  GetFile_nodeA > OutputPort_nodeA > RPG_nodeB >
> PutFile_nodeB
> >>>>>>> FlowB:  GetFile_nodeB > RPG_nodeB > InputPort_nodeA > PutFile_nodeA
> >>>>>>>
> >>>>>>> For the same 1.5 GB file, FlowA consistently transfers at
> >>>>>>> ~3.5 MB/s and FlowB at ~52.0 MB/s, leaving default values for
> >>>>>>> all processors, connections, and the RPG, with the exception
> >>>>>>> that the RPG uses HTTPS (instead of raw); the nodes are running
> >>>>>>> secure. The same policy values were applied to both flows on
> >>>>>>> both nodes.
> >>>>>>>
> >>>>>>> Aside from the latency difference, the transfers appear to work
> >>>>>>> fine, with no anomalies that I can find; the file transfers
> >>>>>>> correctly in both directions. The one anomaly I do see is that
> >>>>>>> in the slow case, the destination node's CPU goes to 100% for
> >>>>>>> the majority of the 6 to 7 minutes it takes to transfer the
> >>>>>>> file. From a jstack on the thread that's using 99%+ of the CPU,
> >>>>>>> it looks like the thread is spending a lot of time in
> >>>>>>> nifi.remote.util.SiteToSiteRestApiClient.read doing
> >>>>>>> LazyDecompressingInputStream/InflaterInputStream work, which
> >>>>>>> puzzles me quite a bit, because all of the ports have
> >>>>>>> compression turned off; there should be no compress/decompress
> >>>>>>> activity, as far as I can tell.
> >>>>>>>
> >>>>>>> Example stack for that thread:
> >>>>>>> "Timer-Driven Process Thread-6" #90 prio=5 os_prio=0
> tid=0x00007f4c48002000 nid=0xdb38 runnable [0x00007f4c734f5000]
> >>>>>>>    java.lang.Thread.State: RUNNABLE
> >>>>>>>         at java.util.zip.Inflater.inflateBytes(Native Method)
> >>>>>>>         at java.util.zip.Inflater.inflate(Inflater.java:259)
> >>>>>>>         - locked <0x00007f55d891cf50> (a java.util.zip.ZStreamRef)
> >>>>>>>         at
> java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
> >>>>>>>         at
> java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
> >>>>>>>         at
> java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
> >>>>>>>         at
> org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:58)
> >>>>>>>         at
> org.apache.nifi.remote.util.SiteToSiteRestApiClient$3.read(SiteToSiteRestApiClient.java:722)
> >>>>>>>         at java.io.InputStream.read(InputStream.java:179)
> >>>>>>>         at org.apache.nifi.remote.io
> .InterruptableInputStream.read(InterruptableInputStream.java:57)
> >>>>>>>         at org.apache.nifi.stream.io
> .ByteCountingInputStream.read(ByteCountingInputStream.java:51)
> >>>>>>>         at
> java.util.zip.CheckedInputStream.read(CheckedInputStream.java:82)
> >>>>>>>         at org.apache.nifi.stream.io
> .LimitingInputStream.read(LimitingInputStream.java:88)
> >>>>>>>         at
> java.io.FilterInputStream.read(FilterInputStream.java:133)
> >>>>>>>         at org.apache.nifi.stream.io
> .MinimumLengthInputStream.read(MinimumLengthInputStream.java:57)
> >>>>>>>         at org.apache.nifi.stream.io
> .MinimumLengthInputStream.read(MinimumLengthInputStream.java:53)
> >>>>>>>         at org.apache.nifi.controller.repository.io
> .TaskTerminationInputStream.read(TaskTerminationInputStream.java:62)
> >>>>>>>         at org.apache.nifi.stream.io
> .StreamUtils.copy(StreamUtils.java:35)
> >>>>>>>         at
> org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:744)
> >>>>>>>         at
> org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2990)
> >>>>>>>         at
> org.apache.nifi.remote.StandardRemoteGroupPort.receiveFlowFiles(StandardRemoteGroupPort.java:419)
> >>>>>>>         at
> org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:286)
> >>>>>>>         at
> org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
> >>>>>>>         at
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
> >>>>>>>         at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> >>>>>>>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>>>>>>         at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> >>>>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> >>>>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> >>>>>>>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>>>>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>>>>         at java.lang.Thread.run(Thread.java:748)
> >>>>>>>
> >>>>>>> Has anyone seen this behavior or symptoms like this?
> >>>>>>>
> >>>>>>> patw
> >>>>>>
> >>>>>>
> >>>
> >>
>
