Jeff, I looked a brief look, this looks great! 1. Is the API reasonable, more or less?
Yeah, it looks reasonable to me. 2. Is the name of the API reasonable? Andy suggested namespace archive, > so you could do something like "archive::extract", but unfortunately > libarchive uses that with struct archive *. Any better name suggestions? Assuming we'll be adding "compress" functionality in the future, any name that can allow both would be great. 3. I'd like to NOT support a specific format. SGTM! So, if the community wouldn't mind taking a look at this and giving me > initial feedback, I'd be most appreciative! Realize that this is only a > rough initial implementation; any and all comments would be welcome. Looks good. Looks like libarchive is in BSD license? If that's the case, we can bundle it in the repo (and also provide an option allowing using pre-installed version). Thanks for doing that! Very useful thing for the project! - Jie On Wed, Feb 28, 2018 at 11:37 AM, Jeff Coffler < jeff.coff...@microsoft.com.invalid> wrote: > Hi, > > We have a work item, https://issues.apache.org/jira/browse/MESOS-8064, > which discusses programmatically decoding .tar, .tar.gz, .zip, and other > common file compression schemes. > > I have an initial implementation for this (rough only), and I wanted to > reach out to the development community for thoughts/input before > proceeding. This is a fairly big change for Linux as well as Windows. > > My initial implementation makes use of libarchive. I implemented an > interface in stout with a set of unit tests as well. > > Initial implementation: https://github.com/jeffaco/mesos/blob/libarchive/ > 3rdparty/stout/include/stout/decompress.hpp > Unit tests: https://github.com/jeffaco/mesos/blob/libarchive/ > 3rdparty/stout/tests/decompress_tests.cpp > Commit chain that has relevant changes: https://github.com/jeffaco/ > mesos/commits/libarchive > > Basically, I'd like the community to look at a rough initial > implementation. You can see how the implementation is called with the unit > tests. One implemented, we'll change current command line usage for > utilities (tar, zip, etc) to use stout interface to libarchive rather than > Linux command line utilities. You can read more about libarchive here: > https://www.libarchive.org/. > > Outstanding issues and questions (to be solved before committing to > master): > > > 1. Is the API reasonable, more or less? > > 2. Is the name of the API reasonable? Andy suggested namespace archive, > so you could do something like "archive::extract", but unfortunately > libarchive uses that with struct archive *. Any better name suggestions? > > 3. I'd like to NOT support a specific format. That is, I'd like to just > call libarchive and have it determine the file format and handle it > "magically" (which it does), rather than restrict it to a very specific > format and error out if the file is not in that format. Right now you can > pass any supported format (.tar, .tar.gz, .tar.xv, .tar.bz2, zip, etc), as > you can see by the unit tests. Is there a need to pass a specific format > limiter, say, FORMAT_ZIP, and error if it's not a ZIP file? > > Given that libarchive supports multiple compression formats simultaneously > (i.e. foo.tar.bz2.gz), this could be overly restrictive anyway. But I > wanted community feedback on this point. > > 4. There's some TODOs to better utilize libarchive APIs to avoid > reading from stdin. These will be taken care of. > > 5. I'd like to programmatically be able to place extracted bits in a > specific location. Currently, it goes to the current directory, which will > work for fetcher cases. This will be investigated to see if we need it, and > how libarchive can/will support it. > > 6. There's a biggie TODO dealing with long path support and Unicode. > This will be investigated and taken care of. > > So, if the community wouldn't mind taking a look at this and giving me > initial feedback, I'd be most appreciative! Realize that this is only a > rough initial implementation; any and all comments would be welcome. > > /Jeff >