Serhiy Storchaka added the comment:

A patch for a similar issue in the glob module was rejected recently because it 
is easy to sort the result of glob.glob() (see issue30461). This issue looks 
similar, but there are differences. On the one hand, the command-line tar 
utility has no option for sorting file names and does not seem to sort them by 
default (I haven't checked). External sorting is possible with the tarfile 
module, just as with the tar utility: generate the list of all files and 
directories, sort it, and pass every item to TarFile.add() with 
recursive=False. On the other hand, this is not as easy as it is for 
glob.glob(), and the overhead of sorting is expected to be smaller than for 
glob.glob(). These points may be considered additional arguments for approving 
the patch.
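
For illustration, the external sorting described above could look roughly like 
the sketch below. The helper names sorted_paths() and add_sorted() are made up 
for this example and are not part of the tarfile module; only os.walk(), 
tarfile.open() and TarFile.add(..., recursive=False) are real APIs. It assumes 
plain lexicographic ordering of path strings is acceptable.

```python
import os
import tarfile


def sorted_paths(top):
    """Return `top` and every path below it, sorted lexicographically.

    Hypothetical helper. Symlinks to directories are listed but not
    followed (os.walk() default).
    """
    paths = [top]
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    # A directory is a string prefix of its children, so it always
    # sorts before them and is archived first.
    return sorted(paths)


def add_sorted(tar, top):
    """Add `top` to an open TarFile in a deterministic order."""
    for path in sorted_paths(top):
        # recursive=False: add only this entry, not the directory contents.
        tar.add(path, recursive=False)
```

A caller would open the archive as usual, for example 
tarfile.open("out.tar", "w"), and call add_sorted(tar, "some_dir") instead of 
tar.add("some_dir").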

If this approach is approved, it should also be applied to ZIP archives.

FYI, the order of archived files can affect the compression ratio of a 
compressed tar archive. For example, the 7-Zip archiver sorts files by 
extension, which increases the chance that files of the same type (text, 
multimedia, spreadsheets, executables, etc.) are grouped together and share a 
common dictionary for global compression. This isn't directly related to this 
issue; it is just material for a possible future enhancement.
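
As a rough sketch of that idea (not 7-Zip's actual algorithm, and not 
something the patch proposes), file paths could be grouped by extension with 
an ordinary sort key. The function name by_extension() and the sample paths 
are invented for this example, and it is meant for regular files only, since 
it does not keep directories ahead of their contents.

```python
import os


def by_extension(path):
    """Sort key: extension first (case-insensitive), then the full path."""
    _, ext = os.path.splitext(os.path.basename(path))
    return (ext.lower(), path)


# Made-up file names, for illustration only.
paths = ["docs/readme.txt", "img/logo.png", "notes.txt", "img/icon.png"]
ordered = sorted(paths, key=by_extension)
```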

----------
nosy: +lars.gustaebel, rhettinger, serhiy.storchaka
stage:  -> patch review
versions:  -Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30693>
_______________________________________