volmasoft commented on issue #22: URL: https://github.com/apache/accumulo-proxy/issues/22#issuecomment-623729390
@madrob I would usually liken multi-stage builds to reducing the size of a compilation-based tool, where you need a tonne of libraries installed to compile but the compiled binary is standalone. This change was more about being smart about how we build each cached layer: by splitting the download and untar across two commands, we essentially doubled the size.

I'm happy to look at a multi-stage build approach if you have a good idea of where to draw the line. I couldn't spot an easy split that made sense. The only idea I came up with is using a builder stage to acquire the binaries (hadoop, accumulo, zookeeper, accumulo-proxy) and then using the second image definition to grab these binaries. I welcome your thoughts though, as I'm no expert in multi-stage builds. I've used them at work and on home projects a few times, but mostly for making consistent compilation environments, e.g. with GCC or test tools that aren't needed for running the app.

If it helps, I did push a branch working on this ticket (https://github.com/apache/accumulo-proxy/pull/23), but I still need to verify something on it. I got very distracted going down a rabbit hole of seeing whether I could get accumulo-proxy to run on an Alpine Linux-backed JDK to reduce the size further. Have a gander and see what you think. Given I'd like to take the same approach for the main accumulo-docker image (it suffers from the same problem), it'd be good to get eyes on this.
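
A rough sketch of the builder idea described above, for discussion only; it is not the Dockerfile from the PR. The image tags, the `TARBALL_URL` build argument, the `/opt/app` path, and the assumption that the tarball has a single top-level directory are all placeholders chosen for illustration.

```dockerfile
# Sketch only. Image tags, TARBALL_URL, and /opt/app are placeholders,
# not values taken from the accumulo-proxy Dockerfile.

# Stage 1: throwaway builder that fetches and unpacks a release tarball.
FROM debian:bookworm-slim AS builder
ARG TARBALL_URL
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
# Download, extract, and delete in ONE instruction so the archive is never
# committed to a layer. Done as separate RUN steps, the tarball would stay
# in its own layer even if a later step removed it.
# --strip-components=1 assumes the archive has a single top-level directory.
RUN mkdir -p /opt/app \
 && curl -fsSL -o /tmp/pkg.tar.gz "$TARBALL_URL" \
 && tar -xzf /tmp/pkg.tar.gz -C /opt/app --strip-components=1 \
 && rm /tmp/pkg.tar.gz

# Stage 2: runtime image that only receives the unpacked files.
FROM eclipse-temurin:11-jre
COPY --from=builder /opt/app /opt/app
```

This would be built with something like `docker build --build-arg TARBALL_URL=<url> .`. With a builder stage, the builder's layers never reach the final image, so the single-`RUN` trick matters most when everything stays in one stage.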
