Updates:
Root cause: The log level was DEBUG. As soon as I moved it to INFO for all,
the performance became very comparable.
Data: I ran each download 7 times and averaged the results. The two approaches are
very close to each other. I ran the tests from a third machine, NOT the machine where
Knox was running. Overall, the average download speed for direct WebHDFS was nearly
384 MB/s. With the Knox proxy, it was about 383 MB/s.


Data size: ~14 GB

| Iteration | Direct time (sec) | Direct download speed (MB/s) | Knox time (sec) | Knox download speed (MB/s) |
| 1 | 42 | 325 | 44 | 310 |
| 2 | 31 | 444 | 29 | 467 |
| 3 | 44 | 314 | 51 | 270 |
| 4 | 38 | 359 | 36 | 382 |
| 5 | 73 | 188 | 39 | 350 |
| 6 | 25 | 536 | 28 | 489 |
| 7 | 26 | 523 | 33 | 410 |
| Average | 39.86 | 384.14 | 37.14 | 382.57 |
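
As a quick cross-check of the averages in the table above, here is a minimal Python sketch. The numbers are copied from the table; the names `direct`, `knox`, and `summarize` are only illustrative and are not part of the original test setup.

```python
from statistics import mean

# (time in seconds, download speed in MB/s) for each of the 7 iterations,
# copied from the table above.
direct = [(42, 325), (31, 444), (44, 314), (38, 359), (73, 188), (25, 536), (26, 523)]
knox = [(44, 310), (29, 467), (51, 270), (36, 382), (39, 350), (28, 489), (33, 410)]


def summarize(runs):
    """Return (mean time in sec, mean speed in MB/s) for a list of runs."""
    times, speeds = zip(*runs)
    return mean(times), mean(speeds)


direct_time, direct_speed = summarize(direct)
knox_time, knox_speed = summarize(knox)

# Relative drop in mean speed when going through Knox instead of direct.
overhead_pct = (direct_speed - knox_speed) / direct_speed * 100

print(f"Direct: {direct_time:.2f} s, {direct_speed:.2f} MB/s")
print(f"Knox:   {knox_time:.2f} s, {knox_speed:.2f} MB/s")
print(f"Knox mean speed is {overhead_pct:.2f}% lower than direct")
```

With only 7 runs and a large spread between iterations (188 to 536 MB/s for direct), a difference this small is within the noise.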

 

    On Saturday, November 5, 2016 7:14 PM, Mohammad Islam <[email protected]> 
wrote:
 

 Thanks Larry for sharing your findings. Your numbers look much better than mine. I
tried with 0.9.1. Should I upgrade to 0.10?
If possible, can you please share your exact command with its various options? Did
you try with SSL on? My two hosts were different, and I ran the test from the KNOX_HOST
box.
Any other ideas on how I can get better numbers?
Regards,
Mohammad


 

    On Saturday, November 5, 2016 6:55 PM, larry mccay <[email protected]> 
wrote:
 

 Hi Mohammad -
I have played around with this a bit and haven't been able to reproduce your
results.
My environment is a sandbox VM download and the Apache Knox 0.10.0 test
instance running on the host machine. I put an ~8.5 GB file in HDFS and OPENed
it with and without Knox.
With Knox:
100 8470M    0 8470M    0     0   9.9M      0 --:--:--  0:14:09 --:--:--  9.9M
Direct to WebHDFS:
100 8470M    0 8470M    0     0  13.6M      0 --:--:--  0:10:20 --:--:-- 14.9M
While we are certainly not speeding things up, it isn't too bad. I believe that
there is still room for some optimization in our rewrite process, as has been
discussed a bit in [1].
That would probably bring the numbers even closer together. However, even that
won't make up the difference that you are seeing.
I wonder what your test environment looks like where you are getting 99.6M avg
speed direct and 4.8M from Knox. If the KNOX_HOST and WEBHDFS_HOST are different
machines, maybe you should try the direct curl command from the KNOX_HOST and
see if there is a difference being introduced by the network or something like
that.
thanks,
--larry
[1] https://issues.apache.org/jira/browse/KNOX-767
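
For repeating the same comparison from several machines, a small timing helper may be handy. The following is only a sketch in Python, not the command used in this thread: `measure_throughput` is a hypothetical helper that reads any file-like object to the end and reports the rate. An in-memory buffer stands in for the response body here, so the snippet runs without a cluster.

```python
import io
import time


def measure_throughput(stream, chunk_size=1024 * 1024):
    """Read `stream` to the end and return (total_bytes, elapsed_sec, MiB/s)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


# Stand-in for a real response body (assumption: against a live cluster one
# could pass the object returned by urllib.request.urlopen(url) instead,
# since it also exposes read(n)).
fake_body = io.BytesIO(b"\0" * (8 * 1024 * 1024))
total_bytes, seconds, mib_per_sec = measure_throughput(fake_body)
print(f"{total_bytes} bytes in {seconds:.4f} s ({mib_per_sec:.1f} MiB/s)")
```

Running the same helper once against the direct URL and once against the gateway URL, from the same machine, would keep the network path constant between the two measurements.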


On Sat, Nov 5, 2016 at 6:41 PM, larry mccay <[email protected]> wrote:

Hi Mohammad -
Thanks for reporting this.
That is a big difference. Let me play around with it and see what I can
reproduce.

thanks,
--larry
On Sat, Nov 5, 2016 at 5:52 PM, Mohammad Islam <[email protected]> wrote:

Hi,
I did a very basic comparison of download speed. I used similar "curl .."
commands to download a large file (13.6 GB) and gathered the numbers.
It looks like WebHDFS through Knox is very slow (at least 20x slower). I ran it
twice with similar numbers. For Knox, I turned off SSL, and in both cases I used
an unsecured (non-Kerberos) cluster.
The download with Knox took nearly 49 minutes, whereas the direct download took
about 2 minutes.
The download speed was 4811k for Knox and 99.6M for the direct download.
I'm sure I have done something wrong. Do you see any such performance? Any help
will be really appreciated.
Regards,
Mohammad
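
The "at least 20x" figure can be checked from the numbers quoted in this message and the curl output. The sketch below is in Python and assumes that curl's `k` and `M` suffixes are 1024-based. The helper names `parse_curl_speed` and `parse_hms` are illustrative only.

```python
def parse_curl_speed(text):
    """Convert a curl progress-meter speed such as '4811k' or '99.6M' to bytes/sec.

    Assumption: the suffixes are 1024-based.
    """
    units = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if text[-1] in units:
        return float(text[:-1]) * units[text[-1]]
    return float(text)


def parse_hms(text):
    """Convert an 'H:MM:SS' duration to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Average speeds and elapsed times reported by curl for the two downloads.
direct_bps = parse_curl_speed("99.6M")
knox_bps = parse_curl_speed("4811k")
speed_ratio = direct_bps / knox_bps
time_ratio = parse_hms("0:49:12") / parse_hms("0:02:19")

print(f"Speed ratio (direct / Knox): {speed_ratio:.1f}x")
print(f"Time ratio  (Knox / direct): {time_ratio:.1f}x")
```

Both ratios come out a little above 21, which is consistent with the "at least 20x slower" estimate.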





Interactions:

curl -o t2.direct -L http://<WEBHDFS_HOST>:50070/webhdfs/v1/<FILE_PATH>?op=OPEN
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 13.5G  100 13.5G    0     0  99.6M      0  0:02:19  0:02:19 --:--:--  117M



curl -H "X-Auth-Params-Email: [email protected]" -o t2 -L http://<KNOX_HOST>:8445/gateway/sandbox/webhdfs/v1/<FILE_PATH>?op=OPEN
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0 13.5G    0     0  4811k      0 --:--:--  0:49:12 --:--:-- 6121k






   

   
