[issue30693] tarfile add uses random order

2022-03-06 Thread Roundup Robot


Change by Roundup Robot :


--
nosy: +python-dev
nosy_count: 8.0 -> 9.0
pull_requests: +29831
pull_request: https://github.com/python/cpython/pull/31713

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue30693] tarfile add uses random order

2018-02-06 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue30693] tarfile add uses random order

2018-02-06 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset 2c6f6682768f401c297c584ef106d48c78697f67 by Serhiy Storchaka 
(Miss Islington (bot)) in branch '3.7':
bpo-30693: Fix tarfile test cleanup on MSWindows (GH-5557) (GH-5567)
https://github.com/python/cpython/commit/2c6f6682768f401c297c584ef106d48c78697f67


--




[issue30693] tarfile add uses random order

2018-02-06 Thread miss-islington

Change by miss-islington :


--
pull_requests: +5388




[issue30693] tarfile add uses random order

2018-02-06 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset 4ad703b7ca463d1183539277dde90ffb1c808487 by Serhiy Storchaka 
(Bernhard M. Wiedemann) in branch 'master':
bpo-30693: Fix tarfile test cleanup on MSWindows (#5557)
https://github.com/python/cpython/commit/4ad703b7ca463d1183539277dde90ffb1c808487


--




[issue30693] tarfile add uses random order

2018-02-05 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann  added the comment:

Serhiy, can you test https://github.com/python/cpython/pull/5557 ?

--




[issue30693] tarfile add uses random order

2018-02-05 Thread Bernhard M. Wiedemann

Change by Bernhard M. Wiedemann :


--
keywords: +patch
pull_requests: +5379
stage: needs patch -> patch review




[issue30693] tarfile add uses random order

2018-02-04 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

Tests are failing on Windows.

==
ERROR: test_ordered_recursion (test.test_tarfile.Bz2WriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\unittest\mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1152, in test_ordered_recursion
    support.unlink(os.path.join(path, "1"))
  File "C:\py\cpython3.7\lib\test\support\__init__.py", line 394, in unlink
    _unlink(filename)
  File "C:\py\cpython3.7\lib\test\support\__init__.py", line 344, in _unlink
    _waitfor(os.unlink, filename)
  File "C:\py\cpython3.7\lib\test\support\__init__.py", line 341, in _waitfor
    RuntimeWarning, stacklevel=4)
RuntimeWarning: tests may fail, delete still pending for C:\py\cpython3.7\build\test_python_8504\@test_8504_tmp-tardir\directory\1

==
ERROR: test_directory_size (test.test_tarfile.GzipWriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1121, in test_directory_size
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'

==
ERROR: test_ordered_recursion (test.test_tarfile.GzipWriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\unittest\mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1137, in test_ordered_recursion
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'

==
ERROR: test_directory_size (test.test_tarfile.LzmaWriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1121, in test_directory_size
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'

==
ERROR: test_ordered_recursion (test.test_tarfile.LzmaWriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\unittest\mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1137, in test_ordered_recursion
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'

==
ERROR: test_directory_size (test.test_tarfile.WriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1121, in test_directory_size
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'

==
ERROR: test_ordered_recursion (test.test_tarfile.WriteTest)
--
Traceback (most recent call last):
  File "C:\py\cpython3.7\lib\unittest\mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "C:\py\cpython3.7\lib\test\test_tarfile.py", line 1137, in test_ordered_recursion
    os.mkdir(path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\py\\cpython3.7\\build\\test_python_8504\\@test_8504_tmp-tardir\\directory'
--

--
stage: patch review -> needs patch




[issue30693] tarfile add uses random order

2018-01-31 Thread Ned Deily

Ned Deily  added the comment:


New changeset 57750be4ad3fa2cfd3473b5be1f1e1a5d0fa9f50 by Ned Deily (Bernhard 
M. Wiedemann) in branch '3.7':
bpo-30693: zip+tarfile: sort directory listing (#2263)
https://github.com/python/cpython/commit/57750be4ad3fa2cfd3473b5be1f1e1a5d0fa9f50


--




[issue30693] tarfile add uses random order

2018-01-31 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann  added the comment:

@Serhiy IMHO, just because we fix one problem, we do not have to fix all other 
problems at the same time. You can still open a pull-request for the others, 
but I know too little about those to test them.
And having commits pending for 7 months is not exactly energizing either.

For my use-case I just needed a trivial 1 line fix in tarfile.py and already 
ended up with a diffstat of
 7 files changed, 39 insertions(+), 6 deletions(-)

--




[issue30693] tarfile add uses random order

2018-01-31 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

I requested additional changes in msg310337.

--




[issue30693] tarfile add uses random order

2018-01-31 Thread STINNER Victor

STINNER Victor  added the comment:

> We missed beta freeze deadline. :/

I merged the PR. We will have to create a cherry-pick request once the 3.7 
branch is created. If Ned rejects it, we have to change the version number 
in the documentation.

https://mail.python.org/pipermail/python-dev/2018-January/152012.html

IMHO the change is very safe to be merged into 3.7b2.

--




[issue30693] tarfile add uses random order

2018-01-31 Thread STINNER Victor

STINNER Victor  added the comment:


New changeset 84521047e413d7d1150aaa1c333580b683b3f4b1 by Victor Stinner 
(Bernhard M. Wiedemann) in branch 'master':
bpo-30693: zip+tarfile: sort directory listing (#2263)
https://github.com/python/cpython/commit/84521047e413d7d1150aaa1c333580b683b3f4b1


--




[issue30693] tarfile add uses random order

2018-01-31 Thread Christian Heimes

Christian Heimes  added the comment:

We missed beta freeze deadline. :/

Ned,
can we get this change into beta 2? It's a low-risk change that gives tarballs 
and other archives a stable sort order. We even considered backporting the 
change to 3.6 and 2.7.

--




[issue30693] tarfile add uses random order

2018-01-20 Thread STINNER Victor

STINNER Victor  added the comment:

I now agree to leave Python 2.7 and 3.6 unchanged.

--




[issue30693] tarfile add uses random order

2018-01-20 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

If you make this change, you need to make similar changes in other places that 
recursively add files to archives: shutil, zipapp, distutils, and maybe more.

--
versions:  -Python 2.7, Python 3.6




[issue30693] tarfile add uses random order

2018-01-17 Thread Ned Deily

Ned Deily  added the comment:

This doesn't seem appropriate to me for backporting to existing releases (3.6 
and 2.7).  AFAIK, the current file-system-order behavior has never been 
identified as a bug.  Unless there is a stronger case for changing the existing 
3.6.x behavior, I am -1 on backporting.

--
nosy: +ned.deily




[issue30693] tarfile add uses random order

2018-01-17 Thread Christian Heimes

Christian Heimes  added the comment:

The patch changes behavior. It's fine for 3.7 but not for 3.6/2.7. Somebody may 
depend on filesystem order.

--
versions:  -Python 3.5




[issue30693] tarfile add uses random order

2018-01-17 Thread STINNER Victor

STINNER Victor  added the comment:

The only guarantee is that TarFile.getmembers(), TarFile.getnames() and 
ZipFile.infolist() return members/names "in the same order as the members in 
the archive".

Currently, there is no guarantee when packing, only when unpacking.
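
The read-side ordering can be checked with a small sketch. It writes empty 
members to an in-memory archive in a chosen order and reads the names back. 
The helper name names_in_archive_order is hypothetical and used only for 
illustration; it is not part of tarfile.

```python
import io
import tarfile


def names_in_archive_order(names):
    """Write empty members called `names` to an in-memory tar, in that
    order, and return what getnames() reports when reading it back.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            # A TarInfo has size 0 by default, so only a header is written.
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()
```

Whatever order the members are written in is the order reported when reading, 
so any ordering guarantee has to be established by the code doing the packing.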

--
nosy: +vstinner




[issue30693] tarfile add uses random order

2018-01-17 Thread R. David Murray

R. David Murray  added the comment:

Ah, I was just going to ask about that.  I guess I'm -0 on the backport as 
well.  The other reproducible build stuff is only going to land in 3.7. 
However, this is in a more general category than the pyc stuff, so I can see 
the argument for backporting it.

--
nosy:  -vstinner
versions: +Python 3.5




[issue30693] tarfile add uses random order

2018-01-17 Thread STINNER Victor

STINNER Victor  added the comment:

Since we currently don't guarantee *anything* about ordering, I like the idea of 
*fixing* Python 2.7 and 3.6 as well. +1 for fixing it in 2.7, 3.6 and master.

--
nosy: +vstinner




[issue30693] tarfile add uses random order

2018-01-17 Thread Christian Heimes

Christian Heimes  added the comment:

PS: I'm -0 on backporting the change to 3.6 and 2.7. 3.5 is in security fix mode 
and therefore completely out of scope.

--
versions:  -Python 3.5




[issue30693] tarfile add uses random order

2018-01-17 Thread Christian Heimes

Christian Heimes  added the comment:

+1 from me

In my opinion it's a good idea both to leave the result of glob.glob() 
unsorted and to sort the contents written by the tar and zip modules. The glob 
module is low level and it makes sense to expose the file system sort order.

On the other hand tar and zip modules are on a higher level. Without sorting 
it's impossible to create reproducible archives. The performance impact is 
irrelevant. I/O and compression dominate performance.
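
For the zip side, a minimal sketch of writing a tree in a stable order with 
the existing zipfile API. The function name zip_tree_sorted and its 
parameters are hypothetical. It only fixes the member order; a fully 
reproducible archive would also need stable timestamps and other metadata, 
which this sketch does not handle.

```python
import os
import zipfile


def zip_tree_sorted(zip_path, top):
    """Write every regular file below `top` into a new ZIP at `zip_path`,
    visiting directories and files in sorted order.

    Hypothetical helper; `zip_path` should live outside `top`.
    """
    with zipfile.ZipFile(zip_path, "w") as zf:
        for dirpath, dirnames, filenames in os.walk(top):
            # Sorting in place controls the order os.walk descends in.
            dirnames.sort()
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                zf.write(full, arcname=os.path.relpath(full, top))
```

With this traversal the files of a directory come before the contents of its 
subdirectories, and both groups are sorted by name.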

--
nosy: +christian.heimes




[issue30693] tarfile add uses random order

2018-01-17 Thread R. David Murray

R. David Murray  added the comment:

Given the reproducible builds angle, I'd say this was worth doing.

--
nosy: +r.david.murray




[issue30693] tarfile add uses random order

2017-06-19 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann added the comment:

note: recent GNU tar versions (1.28?) added an option --sort=name

and the overhead of sorting (e.g. I measured 4ms for 1 files) is negligible 
compared to the other processing done on the files here.
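
A rough way to reproduce such a measurement, as a sketch only: time sorted() 
on a shuffled list of synthetic file names. The function time_sort, the name 
pattern and the parameter n are assumptions made for illustration; the 
numbers depend on the machine and are not the figures quoted above.

```python
import random
import timeit


def time_sort(n, repeat=5):
    """Return the best-of-`repeat` time in seconds needed to sort `n`
    synthetic file names. Hypothetical helper for illustration only."""
    names = ["file%07d.txt" % i for i in range(n)]
    random.shuffle(names)
    # sorted() returns a new list, so every run sorts the same shuffled input.
    return min(timeit.repeat(lambda: sorted(names), number=1, repeat=repeat))
```

Calling time_sort() with the number of entries in a typical source tree gives 
a figure to compare against the time spent reading and compressing the files.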

--




[issue30693] tarfile add uses random order

2017-06-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The patch for a similar issue with the glob module was rejected recently since it 
is easy to sort the result of glob.glob() (see issue30461). This issue looks 
similar, but there are differences. On one side, the command line tar utility 
doesn't have an option for sorting file names and doesn't seem to sort them by 
default (I didn't check). It is possible to use external sorting with the 
tarfile module as with the tar utility (generate the list of all files and 
directories, sort it, and pass every item to TarFile.add with the option 
recursive=False). But on the other side, this is not as easy as for glob.glob(). 
And the overhead of the sorting is expected to be smaller than for glob.glob(). 
These may be considered additional arguments for approving the patch.
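
The external-sorting workaround mentioned in this message could look roughly 
like the sketch below. The helper name add_tree_sorted is hypothetical; only 
os.walk, sorted and TarFile.add(..., recursive=False) are existing APIs.

```python
import os
import tarfile


def add_tree_sorted(tar, top):
    """Add `top` and everything below it to the open TarFile `tar`,
    one entry at a time, in sorted path order.

    Hypothetical helper for illustration only.
    """
    paths = [top]
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    for path in sorted(paths):
        # recursive=False: directories are stored as a single entry,
        # their contents are added by later iterations of this loop.
        tar.add(path, recursive=False)
```

Sorting the full path strings puts every directory before its contents, but 
the result can differ from sorting each directory listing separately (for 
example "dir-x" sorts between "dir" and "dir/a"). Either way the order no 
longer depends on the file system.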

If this approach is approved, it should also be applied to ZIP archives.

FYI the order of archived files can affect the compression ratio of the 
compressed tar archive. For example the 7-Zip archiver sorts files by 
extension, which increases the chance that files of the same type (text, 
multimedia, spreadsheet, executables, etc) are grouped together and use a 
common dictionary for global compression. This isn't directly related to this 
issue, just material for a possible future enhancement.
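
To illustrate the grouping idea only: a sort key that orders paths by 
extension first. The function by_extension is hypothetical and is not how 
7-Zip is implemented; it merely shows how such an order could be produced 
before passing paths to TarFile.add one at a time.

```python
import os


def by_extension(path):
    """Sort key: lower-cased extension first, then the full path as a
    tie-breaker so the order stays deterministic.

    Hypothetical helper for illustration only.
    """
    return (os.path.splitext(path)[1].lower(), path)
```

For example, sorted(paths, key=by_extension) places all ".py" files before 
all ".txt" files, with each group sorted by path.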

--
nosy: +lars.gustaebel, rhettinger, serhiy.storchaka
stage:  -> patch review
versions:  -Python 3.3, Python 3.4




[issue30693] tarfile add uses random order

2017-06-17 Thread Bernhard M. Wiedemann

Changes by Bernhard M. Wiedemann :


--
pull_requests: +2313




[issue30693] tarfile add uses random order

2017-06-17 Thread Bernhard M. Wiedemann

New submission from Bernhard M. Wiedemann:

Filesystems do not give any guarantees about the ordering of files returned in 
directory listings, so tarfile.add, which recurses using os.listdir, adds 
files in an arbitrary order.

See also https://reproducible-builds.org/docs/stable-inputs/ on that topic.
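
A small sketch of the underlying behaviour: os.listdir() returns entries in 
whatever order the file system provides, while sorted() gives the same result 
everywhere. The helper name listing_orders is hypothetical.

```python
import os
import tempfile


def listing_orders(names):
    """Create empty files called `names` in a temporary directory and
    return (raw os.listdir() order, sorted order).

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as root:
        for name in names:
            with open(os.path.join(root, name), "w"):
                pass
        raw = os.listdir(root)   # arbitrary, file-system-dependent order
    return raw, sorted(raw)       # sorted() is identical on every system
```

The raw listing may or may not match the sorted one on a given machine, which 
is why archives built from it can differ between otherwise identical builds.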

--
components: Library (Lib)
messages: 296251
nosy: bmwiedemann
priority: normal
severity: normal
status: open
title: tarfile add uses random order
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7
