ericbadger commented on pull request #2513:
URL: https://github.com/apache/hadoop/pull/2513#issuecomment-743467026


   I've been able to run the tool locally, and in my initial testing it seems 
to work as designed. However, it runs quite a bit slower than the 
docker-to-squash Python script. I ran both tools on a fairly large image 
(14.8 GB, 32 layers): this tool took 38:13, while docker-to-squash took 21:50, 
so this tool took roughly 1.75x as long. I'm currently trying to figure out 
which differences account for the extra time.
   
   My first thought is that this tool appears to download layers sequentially, 
while docker-to-squash downloads them in parallel (since it uses docker 
pull). The next step is converting the layers, where this tool uses its 
internal implementation and docker-to-squash uses mksquashfs. It's possible 
that mksquashfs is simply faster there; I'll need to do more analysis. The 
last step is the layer upload. I know that this tool uploads both the sqsh 
image and the tgz file, so that's about double the work, which would explain 
why it takes longer.
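   
   To make the sequential vs. parallel download point concrete, here is a 
minimal Python sketch. It is not code from either tool: `fetch_layer` is a 
hypothetical stand-in that sleeps instead of downloading, and the digests and 
worker count are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_layer(digest, delay=0.05):
    """Hypothetical stand-in for one layer download; sleeps instead of doing I/O."""
    time.sleep(delay)
    return f"{digest}.tgz"


def fetch_sequential(digests):
    # One layer at a time: total time is roughly the sum of all delays.
    return [fetch_layer(d) for d in digests]


def fetch_parallel(digests, workers=4):
    # Several layers in flight at once: total time approaches the slowest
    # single download when there are at least as many workers as layers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map() returns results in input order.
        return list(pool.map(fetch_layer, digests))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    layers = ["layer-a", "layer-b", "layer-c", "layer-d"]  # made-up digests
    _, seq_s = timed(fetch_sequential, layers)
    _, par_s = timed(fetch_parallel, layers)
    print(f"sequential: {seq_s:.2f}s, parallel: {par_s:.2f}s")
```

   Both functions return the same files in the same order; only the wall-clock 
time differs. Whether real downloads would see a similar gain depends on 
network and registry limits, which this sketch does not model.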
   
   Anyway, I'm still looking into the performance, but @insideo, feel free to 
post your insights.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


