If I run two downloads concurrently:

1,073,741,824 46.1MB/s in 22s
1,073,741,824 51.3MB/s in 22s

So it isn't a limitation of the Knox gateway's total bandwidth, but somehow a
per-connection limitation.

Kevin Risden

On Tue, Oct 9, 2018 at 12:24 PM Kevin Risden <kris...@apache.org> wrote:

> So I was able to reproduce a slowdown with SSL with a pseudo-distributed
> HDFS setup on a single node with Knox running on the same node. This was
> set up in VirtualBox on my laptop.
>
> Rough timings with wget for a 1GB random file:
>
> - directly to webhdfs - 1,073,741,824 252MB/s in 3.8s
> - knox no ssl - 1,073,741,824 264MB/s in 3.6s
> - knox ssl - 1,073,741,824 54.3MB/s in 20s
>
> There is a significant decrease with Knox SSL for some reason.
>
> Kevin Risden
>
>
> On Sun, Sep 23, 2018 at 8:53 PM larry mccay <lmc...@apache.org> wrote:
>
>> SSL handshake will likely happen at least twice.
>> Once for the request through Knox to the NN, then the redirect from the NN
>> to the DN goes all the way back to the client.
>> So they have to follow the redirect and do the handshake to the DN.
>>
>>
>> On Sun, Sep 23, 2018 at 8:30 PM Kevin Risden <kris...@apache.org> wrote:
>>
>>> So I found this in the Knox issues list in JIRA:
>>>
>>> https://issues.apache.org/jira/browse/KNOX-1221
>>>
>>> It sounds familiar in terms of a slowdown when going through Knox.
>>>
>>> Kevin Risden
>>>
>>>
>>> On Sat, Sep 15, 2018 at 10:17 PM Kevin Risden <kris...@apache.org>
>>> wrote:
>>>
>>>> Hmmm yea curl for a single file should do the handshake once.
>>>>
>>>> What are the system performance statistics during the SSL vs non-SSL
>>>> testing? CPU/memory/disk/etc? Ambari metrics with Grafana would help here
>>>> if using that. Otherwise watching top may be helpful. It would help to
>>>> determine whether Knox is working harder during the SSL transfer.
>>>>
>>>> Kevin Risden
>>>>
>>>>
>>>> On Wed, Sep 12, 2018 at 2:52 PM Guang Yang <k...@uber.com> wrote:
>>>>
>>>>> I'm just using curl to download a single large file. So I suspect SSL
>>>>> handshake just happens once?
>>>>>
>>>>> On Tue, Sep 11, 2018 at 12:02 PM Kevin Risden <kris...@apache.org> wrote:
>>>>>
>>>>>> What client are you using to connect to Knox? Is this for a single
>>>>>> file or a bunch of files?
>>>>>>
>>>>>> The SSL handshake can be slow if the client doesn't keep the
>>>>>> connection open.
>>>>>>
>>>>>> Kevin Risden
>>>>>>
>>>>>> On Tue, Sep 11, 2018, 14:51 Guang Yang <k...@uber.com> wrote:
>>>>>>
>>>>>>> Thanks Larry. But the only difference is this part in my
>>>>>>> gateway-site.xml.
>>>>>>>
>>>>>>> <property>
>>>>>>>   <name>ssl.enabled</name>
>>>>>>>   <value>false</value>
>>>>>>>   <description>Indicates whether SSL is enabled.</description>
>>>>>>> </property>
>>>>>>>
>>>>>>> On Tue, Sep 11, 2018 at 11:42 AM, larry mccay <lmc...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I really don't think that kind of difference should be expected
>>>>>>>> from merely SSL overhead.
>>>>>>>> I don't however have any metrics to contradict it either since I do
>>>>>>>> not run Knox without SSL.
>>>>>>>>
>>>>>>>> Given the above, I am struggling to come up with a meaningful
>>>>>>>> response to this. :(
>>>>>>>> I don't think you should see a 10-fold increase in speed by
>>>>>>>> disabling SSL though.
>>>>>>>>
>>>>>>>> On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:
>>>>>>>>
>>>>>>>>> Any idea guys?
>>>>>>>>>
>>>>>>>>> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks guys! The issue seems to be exactly what David pointed out,
>>>>>>>>>> which is that the data is encrypted over SSL.
>>>>>>>>>>
>>>>>>>>>> Without Knox, the download speed can reach *400M/s* if I call
>>>>>>>>>> Namenode directly. And with SSL disabled, the speed can reach
>>>>>>>>>> *~400M/s* as well through Knox. But with SSL, the speed drops
>>>>>>>>>> significantly to *~40M/s*. I know it's because of the encryption,
>>>>>>>>>> but such a difference did surprise me. Is it normal from your
>>>>>>>>>> perspective?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Guang
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <
>>>>>>>>>> dvillarr...@hortonworks.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Guang,
>>>>>>>>>>>
>>>>>>>>>>> Keep in mind the data is being encrypted over SSL. If you
>>>>>>>>>>> disable SSL you will most likely see a very significant boost in
>>>>>>>>>>> throughput. Some people have used more powerful computers to make
>>>>>>>>>>> encryption quicker.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>>>>>>>>>
>>>>>>>>>>> Guang – This is somewhat to be expected.
>>>>>>>>>>>
>>>>>>>>>>> When you talk to WebHDFS directly, the client can distribute the
>>>>>>>>>>> request across many data nodes. Also, you are getting data
>>>>>>>>>>> directly from the source.
>>>>>>>>>>>
>>>>>>>>>>> With Knox, all traffic goes through the single Knox host. Knox
>>>>>>>>>>> is responsible for fetching from the datanodes and consolidating
>>>>>>>>>>> to send to you. This means overhead as it’s acting as a middle
>>>>>>>>>>> man, and lower network capacity since only 1 host is serving data
>>>>>>>>>>> to you.
>>>>>>>>>>>
>>>>>>>>>>> Also, if running on a cloud provider, the Knox host may be a
>>>>>>>>>>> smaller instance size with lower network capacity.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Sean Roberts
>>>>>>>>>>>
>>>>>>>>>>> *From: *Guang Yang <k...@uber.com>
>>>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>>> *Subject: *WebHDFS performance issue in Knox
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download
>>>>>>>>>>> a file through WebHDFS in Knox, the download speed is just about
>>>>>>>>>>> 11M/s. However, if we download directly from a datanode, the
>>>>>>>>>>> speed is at least 40M/s.
>>>>>>>>>>>
>>>>>>>>>>> Are you guys aware of this problem? Any suggestions?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Guang
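
For anyone wanting to redo the arithmetic behind the "per-connection" conclusion in the most recent message of this thread, here is a minimal Python sketch. The rates are copied from the wget output quoted in the thread. The helper name `mib_per_s` and the variable names are only illustrative, and the sketch assumes wget's "MB/s" means 2^20 bytes per second. It adds up the two concurrent SSL downloads and compares that total with the single SSL download and with the non-SSL download through Knox.

```python
# Rough cross-check of the wget timings quoted in this thread.
# Assumption: wget's "MB/s" is 2**20 bytes per second.

GIB = 1024 ** 3  # 1,073,741,824 bytes, the test file size from the thread


def mib_per_s(num_bytes, seconds):
    """Average throughput in MiB/s for a transfer of num_bytes in seconds."""
    return num_bytes / seconds / (1024 ** 2)


# Rates reported by wget in the thread (MB/s).
knox_no_ssl = 264.0
knox_ssl_single = 54.3
knox_ssl_concurrent = [46.1, 51.3]

# Total throughput through Knox while both SSL downloads were running.
aggregate_ssl = sum(knox_ssl_concurrent)

# How much more the gateway moved with two SSL connections than with one.
gain_from_second_connection = aggregate_ssl / knox_ssl_single

# How far a single SSL connection falls behind a single non-SSL connection.
ssl_slowdown = knox_no_ssl / knox_ssl_single

if __name__ == "__main__":
    print(f"1 GiB in 22 s averages {mib_per_s(GIB, 22):.1f} MiB/s")
    print(f"two SSL connections together: {aggregate_ssl:.1f} MB/s")
    print(f"gain over one SSL connection: {gain_from_second_connection:.2f}x")
    print(f"non-SSL vs single SSL connection: {ssl_slowdown:.2f}x")
```

Each of the two concurrent connections runs at close to the single-connection SSL rate, so the total nearly doubles. That is consistent with the bottleneck being per connection rather than a cap on the gateway's total bandwidth. The wget durations are rounded, so treat the figures as approximate.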