[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-24 Thread Maciej Gol

Maciej Gol  added the comment:

This is a HUGE eye opener! I didn't know about that 'import' vs 'from x import
y' difference. Thanks a lot! Is it documented somewhere?

On Fri, 24 Jan 2020, 15:08, Serhiy Storchaka wrote:

>
> Serhiy Storchaka  added the comment:
>
> It is intended to support circular imports. Suppose foo.py contains "import
> bar" and bar.py contains "import foo". When you execute "import foo", the
> import machinery first creates an empty module foo, adds it to sys.modules,
> reads foo.py and executes it in the namespace of module foo. When the
> interpreter encounters "import bar" in foo.py, the import machinery creates
> an empty module bar, adds it to sys.modules, reads bar.py and executes it
> in the namespace of module bar. When the interpreter encounters "import
> foo" in bar.py, the import machinery takes the module foo from sys.modules.
> So you break an infinite cycle and can import modules with cyclic
> dependencies.
>
> You can argue that cyclic imports do not look like good practice, but
> they are actually a pretty common case when you import a submodule in a package.
> If foo/__init__.py contains "from .bar import Bar", the foo module must be
> imported before you import foo.bar, but is not completely initialized at
> that time yet.
>
> --
>
> ___
> Python tracker 
> <https://bugs.python.org/issue39430>
> ___
>
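
The sequence described in the quoted comment can be observed directly. The following is a minimal, single-threaded sketch, not code from the tracker: the module names cyc_foo and cyc_bar are hypothetical stand-ins for foo and bar, written to a temporary directory so the example is self-contained. It records what cyc_bar sees when it imports cyc_foo while cyc_foo is still being executed.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def demo_circular_import():
    """Import two hypothetical modules that import each other."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "cyc_foo.py").write_text(textwrap.dedent("""\
            import cyc_bar   # starts executing cyc_bar.py
            VALUE = 1        # assigned only after cyc_bar has finished
        """))
        Path(tmp, "cyc_bar.py").write_text(textwrap.dedent("""\
            import sys
            import cyc_foo   # taken from sys.modules, not executed again
            FOO_IN_SYS_MODULES = "cyc_foo" in sys.modules
            FOO_HAD_VALUE = hasattr(cyc_foo, "VALUE")
        """))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            cyc_foo = importlib.import_module("cyc_foo")
            cyc_bar = sys.modules["cyc_bar"]
            return (
                cyc_bar.FOO_IN_SYS_MODULES,   # cyc_foo was already registered
                cyc_bar.FOO_HAD_VALUE,        # ...but not yet fully initialized
                hasattr(cyc_foo, "VALUE"),    # complete once the import returns
            )
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("cyc_foo", None)
            sys.modules.pop("cyc_bar", None)


if __name__ == "__main__":
    print(demo_circular_import())
```

While cyc_bar runs, cyc_foo is present in sys.modules but has no VALUE attribute yet; the attribute appears only after the outer import completes.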

--

___
Python tracker 
<https://bugs.python.org/issue39430>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-24 Thread Maciej Gol


Maciej Gol  added the comment:

By the way, thanks a lot for the fix <3




[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-24 Thread Maciej Gol


Maciej Gol  added the comment:

> PR 18161 fixes race condition by using "from ... import ..."
> which waits until the module is completely initialized if the specified
> names are not set.

Correct me if I'm wrong, but I think the behavior of 'import lzma' in
this case (vulnerable to race conditions) is not as intended? Shouldn't we also
fix the 'import' statement itself?

In general, I understand that due to how dynamic Python is, it might not be
possible to load every single name at import time using `import`, and that using
`from x import y` brings more determinism (because we have a safeguard now).

But, looking at the stack trace from ipython, the problem lies in a sequence of
import statements, not dynamic Python code. Shouldn't the import mechanism
be more deterministic in such a case? For sure, it should not return an empty
module (which is what happens when the race condition occurs).

I think a race condition caused by simply using `import` statements
(not `from x import y`) is a big caveat in the statement itself and how python
imports work.

I haven't checked whether the cause is specific to how tarfile works or applies
to imports in general, though :-(
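
To make the difference between the two statement forms concrete, here is a minimal sketch under stated assumptions. It does not reproduce the race itself: it is single-threaded and places an empty stub module (the hypothetical name half_initialized_demo) in sys.modules to imitate what a second thread might observe while another thread is still executing the module. Because the stub is not marked as initializing, the "from ... import ..." form here fails immediately instead of waiting as described in the quoted comment.

```python
import sys
import types

# Hypothetical module name, used only for this illustration.
NAME = "half_initialized_demo"


def compare_import_styles():
    # Imitate a module that is registered in sys.modules but has not had
    # any of its names assigned yet.
    sys.modules[NAME] = types.ModuleType(NAME)
    try:
        # Plain import succeeds: the entry in sys.modules is returned as-is.
        import half_initialized_demo

        try:
            half_initialized_demo.LZMAFile
            plain = "ok"
        except AttributeError:
            # The failure surfaces later, at attribute access.
            plain = "AttributeError"

        try:
            from half_initialized_demo import LZMAFile  # noqa: F401
            from_style = "ok"
        except ImportError:
            # The failure surfaces at the import statement itself.
            from_style = "ImportError"

        return plain, from_style
    finally:
        del sys.modules[NAME]


if __name__ == "__main__":
    print(compare_import_styles())
```

With the plain form the problem escapes an `except ImportError` guard and shows up as an AttributeError further down; with the "from" form the missing name is reported as an ImportError, which such a guard can catch.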




[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-23 Thread Maciej Gol


Maciej Gol  added the comment:

Uploading fixed file (the former had a typo)

--
Added file: https://bugs.python.org/file48861/test.py




[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-23 Thread Maciej Gol


Change by Maciej Gol :


Removed file: https://bugs.python.org/file48860/test.py




[issue39430] tarfile.open(mode="r") race condition when importing lzma

2020-01-23 Thread Maciej Gol


New submission from Maciej Gol :

Hey guys,

We have a component that archives and unarchives multiple files in separate
threads, and it recently started to misbehave.

We have noticed a bunch of `AttributeError: module 'lzma' has no attribute
'LZMAFile'` errors. They are unexpected: our Python is not compiled with LZMA
support, so importing lzma should fail outright.

What is unfortunate is that, given the traceback:

Traceback (most recent call last):
  File "test.py", line 18, in <module>
    list(pool.map(test_lzma, range(100)))
  File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/opt/lang/python37/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/opt/lang/python37/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "test.py", line 14, in test_lzma
    tarfile.open(fileobj=buf, mode="r")
  File "/opt/lang/python37/lib/python3.7/tarfile.py", line 1573, in open
    return func(name, "r", fileobj, **kwargs)
  File "/opt/lang/python37/lib/python3.7/tarfile.py", line 1699, in xzopen
    fileobj = lzma.LZMAFile(fileobj or name, mode, preset=preset)
AttributeError: module 'lzma' has no attribute 'LZMAFile'


the last line of the traceback is right AFTER this block (tarfile.py:1694):

    try:
        import lzma
    except ImportError:
        raise CompressionError("lzma module is not available")


Importing lzma in ipython fails properly:

In [2]: import lzma
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 import lzma

/opt/lang/python37/lib/python3.7/lzma.py in <module>
     25 import io
     26 import os
---> 27 from _lzma import *
     28 from _lzma import _encode_filter_properties, _decode_filter_properties
     29 import _compression

ModuleNotFoundError: No module named '_lzma'

When trying to debug the problem, we noticed that it's not deterministic. In
order to reproduce it, we created a test Python script that repeatedly writes
an archive to BytesIO and then reads from it. Running it with 5 threads and
100 calls gives a very good chance of reproducing the issue. For us it
happened almost every time.

The race condition occurs on both Python 3.7.3 and 3.7.6.
The test script used to reproduce it is attached.

I know that the test script writes uncompressed archives and during opening
tries to guess the compression. But I guess this is a legitimate scenario and
should not matter in this case.

--
files: test.py
messages: 360551
nosy: Maciej Gol
priority: normal
severity: normal
status: open
title: tarfile.open(mode="r") race condition when importing lzma
type: crash
versions: Python 3.7
Added file: https://bugs.python.org/file48860/test.py
