Hi Mohammad - I have played around with this a bit and haven't been able to reproduce your results.
My environment is a sandbox VM download and the Apache Knox 0.10.0 test instance running on the host machine.
I put an ~8.5 GB file in hdfs and OPENed it with and without Knox.

With Knox:
100 8470M    0 8470M    0     0  9.9M      0 --:--:--  0:14:09 --:--:-- 9.9M

Direct to WebHDFS:
100 8470M    0 8470M    0     0  13.6M      0 --:--:--  0:10:20 --:--:-- 14.9M

While we are certainly not speeding things up it isn't too bad.
I believe that there is still room for some optimization in our rewrite process as has been discussed a bit on [1].
This would get the numbers even closer together probably.

However, even that won't make up the difference that you are seeing.
I wonder what your test environment looks like where you are getting 99.6M avg speed direct and 4.8M from Knox.

If the KNOX_HOST and WEBHDFS_HOST are different machines maybe you should try the direct curl command from the KNOX_HOST and see if there is a difference being introduced by the network or something like that.

thanks,

--larry

[1] https://issues.apache.org/jira/browse/KNOX-767

On Sat, Nov 5, 2016 at 6:41 PM, larry mccay <[email protected]> wrote:

> Hi Mohammad -
>
> Thanks for reporting this.
>
> That is a big difference.
> Let me play around with it and see what I can reproduce.
>
> thanks,
>
> --larry
>
> On Sat, Nov 5, 2016 at 5:52 PM, Mohammad Islam <[email protected]> wrote:
>
>> Hi,
>> I did a very basic comparison of download speed. I used similar "curl .."
>> command to download a large file (13.6 GB) and gathered the numbers.
>>
>> Looks like WebHDFS with Knox is very slow ( at least 20x slower). I ran
>> it twice with similar numbers. For Knox, I turned off SSL and both cases I
>> used unsecured (non-Kerberos) cluster.
>>
>> Download with Knox took nearly 49 minutes whereas direct download took 2
>> mins. The download speed was *4811k* for Knox and *99.6M* for direct
>> download.
>>
>> I'm sure I have done something wrong. Do you see any such performance?
>> Any help will be really appreciated.
>>
>> Regards,
>> Mohammad
>>
>>
>> Interactions:
>> curl -o t2.direct -L http://<WEBHDFS_HOST>:50070/webhdfs/v1/<FILE_PATH>?op=OPEN
>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>                                  Dload  Upload   Total   Spent    Left  Speed
>>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
>> 100 13.5G  100 13.5G    0     0  *99.6M*     0  0:02:19 *0:02:19* --:--:--  117M
>>
>>
>> curl -H X-Auth-Params-Email: [email protected] -o t2 -L http://<KNOX_HOST>:8445/gateway/sandbox/webhdfs/v1/<FILE_PATH>?op=OPEN
>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>                                  Dload  Upload   Total   Spent    Left  Speed
>>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
>>   0     0    0 13.5G    0     0  *4811k*     0 --:--:-- *0:49:12* --:--:-- 6121k
>>
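
For anyone comparing the two sets of numbers in this thread, the curl summary lines can be sanity-checked with a little arithmetic. The following is only a rough sketch, not part of either test run: the helper names are made up for illustration, and it assumes curl's k/M/G suffixes are 1024-based and that the "Time Spent" column is h:mm:ss.

```python
# Hypothetical helpers for checking the curl summaries quoted in this thread.
# Assumption: curl's size suffixes (k, M, G) are powers of 1024.

def parse_size_mib(text):
    """Convert a curl size such as '8470M', '13.5G' or '4811k' to MiB."""
    units = {"k": 1.0 / 1024, "M": 1.0, "G": 1024.0}
    return float(text[:-1]) * units[text[-1]]


def parse_elapsed_s(text):
    """Convert curl's 'Time Spent' column (h:mm:ss) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def avg_mib_per_s(size, elapsed):
    """Average throughput implied by the total size and the time spent."""
    return parse_size_mib(size) / parse_elapsed_s(elapsed)


def slowdown(slow_elapsed, fast_elapsed):
    """How many times longer the slow run took for the same file."""
    return parse_elapsed_s(slow_elapsed) / parse_elapsed_s(fast_elapsed)


if __name__ == "__main__":
    # Sandbox VM run: 8470M file, with Knox vs. direct to WebHDFS.
    print("sandbox knox   MiB/s:", round(avg_mib_per_s("8470M", "0:14:09"), 2))
    print("sandbox direct MiB/s:", round(avg_mib_per_s("8470M", "0:10:20"), 2))
    print("sandbox slowdown    :", round(slowdown("0:14:09", "0:10:20"), 2))
    # Mohammad's run: 13.5G file, with Knox vs. direct to WebHDFS.
    print("cluster slowdown    :", round(slowdown("0:49:12", "0:02:19"), 2))
```

With the figures quoted above this gives a slowdown of roughly 1.4x for the sandbox run and roughly 21x for the cluster run, which is consistent with the averages curl itself reported in both cases.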
