Hi Mohammad -

I have played around with this a bit and haven't been able to reproduce
your results.

My environment is a sandbox VM download and the Apache Knox 0.10.0 test
instance running on the host machine.
I put an ~8.5 GB file in hdfs and OPENed it with and without Knox.

With Knox:
100 8470M    0 8470M    0     0   9.9M      0 --:--:--  0:14:09 --:--:--  9.9M

Direct to WebHDFS:
100 8470M    0 8470M    0     0  13.6M      0 --:--:--  0:10:20 --:--:-- 14.9M

While we are certainly not speeding things up, it isn't too bad.
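
To put rough numbers on that, here is a small sketch that divides the direct
average speed by the Knox average speed from curl's Dload column. The
slowdown helper is a hypothetical name, not part of any tool, and it assumes
curl's k and M suffixes are 1024-based, so 4811k is treated as roughly 4.698M.

```shell
# Hypothetical helper: ratio of direct to Knox average speed.
# Both arguments must be in the same unit (here: M per second).
slowdown() {
  awk -v direct="$1" -v knox="$2" 'BEGIN { printf "%.2f\n", direct / knox }'
}

slowdown 13.6 9.9      # my sandbox run
slowdown 99.6 4.698    # reported run, assuming 4811k = 4811/1024 M
```

On these figures my run is well under 2x slower through Knox, while the
reported run is a bit over 20x slower.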
I believe that there is still room for some optimization in our rewrite
process as has been discussed a bit on [1].

That would probably bring the numbers even closer together.
However, even that won't make up the difference that you are seeing.

I wonder what your test environment looks like where you are getting 99.6M
avg speed direct and 4.8M from Knox.
If the KNOX_HOST and WEBHDFS_HOST are different machines, maybe you should
try the direct curl command from the KNOX_HOST and see whether the network
or something similar is introducing the difference.
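
A minimal sketch of that check, meant to be run on the KNOX_HOST itself. The
open_url and measure helpers are hypothetical names; the hosts, ports and
path are the placeholders from the commands quoted below, and any header your
Knox topology needs (such as the X-Auth-Params-Email one) would have to be
added to the curl call.

```shell
# Build an OPEN URL from host:port, service prefix and file path.
open_url() {
  printf 'http://%s/%s/%s?op=OPEN\n' "$1" "$2" "$3"
}

# Download to /dev/null and print curl's own average speed and total time.
measure() {
  curl -s -L -o /dev/null -w '%{speed_download} bytes/s, %{time_total} s\n' "$1"
}

# Run both on KNOX_HOST after setting the variables:
# measure "$(open_url "$WEBHDFS_HOST:50070" webhdfs/v1 "$FILE_PATH")"
# measure "$(open_url "$KNOX_HOST:8445" gateway/sandbox/webhdfs/v1 "$FILE_PATH")"
```

If the direct number drops sharply when measured from the KNOX_HOST, the
network path between the two machines is the more likely cause than Knox.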

thanks,

--larry

[1] https://issues.apache.org/jira/browse/KNOX-767



On Sat, Nov 5, 2016 at 6:41 PM, larry mccay <[email protected]> wrote:

> Hi Mohammad -
>
> Thanks for reporting this.
>
> That is a big difference.
> Let me play around with it and see what I can reproduce.
>
> thanks,
>
> --larry
>
> On Sat, Nov 5, 2016 at 5:52 PM, Mohammad Islam <[email protected]> wrote:
>
>> Hi,
>> I did a very basic comparison of download speed. I used similar "curl .."
>> commands to download a large file (13.6 GB) and gathered the numbers.
>>
>> Looks like WebHDFS with Knox is very slow (at least 20x slower). I ran
>> it twice with similar numbers. For Knox, I turned off SSL, and in both
>> cases I used an unsecured (non-Kerberos) cluster.
>>
>> Download with Knox took nearly 49 minutes whereas direct download took 2
>> mins. The download speed was *4811k* for Knox and  *99.6M* for direct
>> download.
>>
>> I'm sure I have done something wrong. Do you see any such performance?
>> Any help will be really appreciated.
>>
>> Regards,
>> Mohammad
>>
>>
>>
>>
>>
>>
>> Interactions:
>> curl -o t2.direct -L http://<WEBHDFS_HOST>:50070/webhdfs/v1/<FILE_PATH>?op=OPEN
>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>                                  Dload  Upload   Total   Spent    Left  Speed
>>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
>> 100 13.5G  100 13.5G    0     0  *99.6M*      0  0:02:19  *0:02:19* --:--:--  117M
>>
>>
>>
>>
>> curl -H X-Auth-Params-Email: [email protected] -o t2 -L http://<KNOX_HOST>:8445/gateway/sandbox/webhdfs/v1/<FILE_PATH>?op=OPEN
>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>                                  Dload  Upload   Total   Spent    Left  Speed
>>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
>>   0     0    0 13.5G    0     0  *4811k*      0 --:--:--  *0:49:12* --:--:-- 6121k
>>
>>
>
>
