[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

I managed to reproduce the bug.py crash with the main branch up to commit 
12c0012cf97d21bc637056983ede0eaf4c0d9c33.

--
Added file: 
https://bugs.python.org/file50700/bug.py_asyncio_cpustressed-crash.log


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

Bisecting main for bug.py with my local Python builds points to commit 
b127e70a8a682fe869c22ce04c379bd85a00db67 "bpo-46070: Fix asyncio initialization 
guard (GH-30423)" as the one that fixed bug.py most of the time.

At times I can still make bug.py segfault, be it on the 3.9, 3.10 or main 
branch. It is pretty hard (I can have a batch of 200 runs without an issue) but 
it seems easier to reproduce with the CPU stressed; then I can get two 
segfaults in a batch of 50 runs.

Bash:
for i in {1..50}; do ./python ../python-crash-kodi/bug.py; done

or sh:
for i in `seq 1 50`; do ./python ../python-crash-kodi/bug.py; done

with:
stress -c `nproc --all`
running at the same time.
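
The same experiment as a minimal Python driver (a sketch; the bug.py path and 
the presence of the `stress` utility are taken from the commands above):
---
import os
import subprocess

# Keep every core busy while the reproducer runs in a loop.
stress = subprocess.Popen(["stress", "-c", str(os.cpu_count())])
try:
    for i in range(50):
        proc = subprocess.run(["./python", "../python-crash-kodi/bug.py"])
        if proc.returncode < 0:  # killed by a signal, e.g. SIGSEGV = -11
            print(f"run {i}: crashed with signal {-proc.returncode}")
finally:
    stress.terminate()
    stress.wait()
---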

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

While bisecting the main branch I got not only segfaults but also exceptions, 
namely:

$ ./python ../python-crash-kodi/sqlite3_crash.py
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 81, in register_adapters_and_converters
    register_adapter(datetime.datetime, adapt_datetime)
KeyError: 'isoformat'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'timepart_full'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'day'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'month'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'year'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'timepart'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'datepart'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'convert_timestamp'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'convert_date'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'adapt_datetime'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'adapt_date'


The 3.10 branch bisect pointed to one commit that fixes the crash after 3.10.1, 
which is 72c260cf0c71eb01eb13100b751e9d5007d00b70 '[3.10] bpo-46006: Revert 
"bpo-40521: Per-interpreter interned strings (GH-20085)" (GH-30422) (GH-30425)'. 
That makes sense given the main branch logs. It is commit 
35d6540c904ef07b8602ff014e520603f84b5886 in the main branch.

What remains to be seen is why "bpo-46070: _PyGC_Fini() untracks objects 
(GH-30577)" looks fine in the main branch (though it has no effect on the 
import crash) and not in the 3.9 and 3.10 branches.
Mind that in the main branch, 'bpo-46006: Revert "bpo-40521: Per-interpreter 
interned strings (GH-20085)" (GH-30422)' was already applied when "bpo-46070: 
_PyGC_Fini() untracks objects (GH-30577)" went in, so it was also unrelated to 
the fix of the initial report.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

The 3.10 branch is fixed if I revert e6bb17fe29713368e1fd93d9ac9611017c4f570c 
"bpo-46070: _PyGC_Fini() untracks objects (GH-30577)", whether I revert it atop 
the current head 9006b4471cc4d74d0cbf782d841a330b0f46a3d0 or test the commit 
just before e6bb17fe29713368e1fd93d9ac9611017c4f570c was merged.

As it made no sense, given this bug report's history, that this fix broke the 
branch, I tried v3.10.1, which lacks this fix. Vanilla is broken too. Applying 
"_PyGC_Fini() untracks objects" on top of 3.10.1 does not fix the test case 
crash either.

I am puzzled. I will try to bisect for the commit that fixed the testcase in 
the 3.10 branch after 3.10.1 and before "_PyGC_Fini() untracks objects" was 
merged.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

I bisected the 3.9 branch, and commit 52937c26adc35350ca0402070160cf6dc838f359 
"bpo-46070: _PyGC_Fini() untracks objects (GH-30577) (GH-30580)" is the one 
that broke 3.9. While this commit is fine in my main branch builds, it is not 
in the 3.9 branch.

Proceeding with the 3.10 branch bisect for the first bad commit, and redoing 
the main branch bisect for the first good commit.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

By "It is fixed in main branch commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33 
." I mean that this commit is good not that this commit fixes the issue.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-03-24 Thread Alban Browaeys


Alban Browaeys  added the comment:

sqlite3_crash.py does not crash on Python 3.9 up to and including 3.9.9, nor 
on the main branch at commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33. I 
confirm it crashes on Python 3.9.10, 3.9.11, and the 3.10 branch at commit 
9006b4471cc4d74d0cbf782d841a330b0f46a3d0.
It is fixed in main branch commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33.

Currently bisecting both 3.9.9 to 3.9.10 and the 3.10 branch to the 3.11 main 
branch, from bad to good.

The patches in this bug report are already merged in the 3.10 branch, which 
crashes.

I cannot reproduce with win_py399_crash_reproducer.py, which I used as a basis 
for this test case.
The backtrace is the same as the ones from the crashes of the Kodi addons (for 
me, the Jellyfin Kodi addon), which is the initial report.
It looks like importing sqlite3 in threads plays badly.

I can reproduce on aarch64 (Odroid C2) LibreELEC and on builds of CPython on 
Debian stable x86_64 (the extensive testing of the broken interpreters is done 
on x86_64 Debian stable bullseye with a cpython clone, running from the 
builddir).

--
nosy: +prahal
Added file: https://bugs.python.org/file50699/sqlite3_crash.py


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-20 Thread STINNER Victor


Change by STINNER Victor :


--
Removed message: https://bugs.python.org/msg411050


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-20 Thread STINNER Victor


STINNER Victor  added the comment:

Another measure using the command:

PYTHONHASHSEED=0 ./python -X showrefcount -c pass

I had to run the command 20 times to get a stable value; I don't know why.

main branch: [21981 refs, 5716 blocks]
PR: [21887 refs, 5667 blocks]

=> the PR removes 94 references and 49 memory blocks.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-20 Thread jokot3


Change by jokot3 :


--
nosy: +jokot3


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

I created bpo-46368: "faulthandler: add the ability to dump all interpreters, 
not only the current interpreter".
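
For reference, this is the mechanism the reproducers in this thread rely on 
(a sketch; enabling it explicitly is equivalent to the PYTHONFAULTHANDLER=1 
environment variable used in bisect.py below):
---
import faulthandler

# On SIGSEGV and other fatal errors, dump the Python tracebacks of the
# threads of the current interpreter (hence the feature request above to
# also cover the other interpreters).
faulthandler.enable()
---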

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

pyobject_ob_interp.patch: quick & dirty patch that I wrote to add 
PyObject.ob_interp, which stores in which interpreter an object was created.

--
Added file: https://bugs.python.org/file50560/pyobject_ob_interp.patch


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

Victor:
> (*) I made the GC state per-interpreter: commit 
> 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

Eric Snow:
> FYI, this was done by me in an earlier commit which we ended up
> reverting. Later you basically un-reverted it.

Well, I recall that your change had to be reverted 2 or 3 times because there 
were many crashes on FreeBSD, and no one understood why it crashed. The root 
cause was bugs related to the GIL and daemon threads. It took me a while (and 
multiple commits) to identify and fix all of them:
https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html

I decided to split your work into smaller changes to better debug these 
crashes. bpo-36854 contains a few changes, but these changes are based on work 
that I pushed earlier.

For example, there was a tricky bug related to clearing a Python thread state:
https://github.com/python/cpython/commit/9da7430675ceaeae5abeb9c9f7cd552b71b3a93a

Also, once the GC was made per interpreter, we started to discover more and 
more tricky reference leaks:
https://vstinner.github.io/subinterpreter-leaks.html

I spent significant time reordering the code of Py_Finalize() and 
Py_EndInterpreter() to clear objects earlier or in a different order. Recently, 
I made sure that the free lists can no longer be used after they are cleared. 
I took some notes at:
https://pythondev.readthedocs.io/finalization.html

One of the hardest fixes was commit 9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6 
for bpo-19466. To make this change, I first had to fix a very old bug in 
PyThreadState_Clear() with commit 5804f878e779712e803be927ca8a6df389d82cdf 
(bpo-20526).

Well, it was a long journey and it's not done yet :-)

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

It would be nice to add some tests.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 52937c26adc35350ca0402070160cf6dc838f359 by Victor Stinner in 
branch '3.9':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577) (GH-30580)
https://github.com/python/cpython/commit/52937c26adc35350ca0402070160cf6dc838f359


--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread miss-islington


miss-islington  added the comment:


New changeset e6bb17fe29713368e1fd93d9ac9611017c4f570c by Miss Islington (bot) 
in branch '3.10':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577)
https://github.com/python/cpython/commit/e6bb17fe29713368e1fd93d9ac9611017c4f570c


--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

I manually tested my fix GH-30580 using:

* (1) the attached win_py399_crash_reproducer.py
* (2) the method from https://bugs.python.org/issue46070#msg410447

Without my fix, I can easily reproduce the crash with (1) and (2).

With my fix, I can no longer reproduce the crash with (1) or (2).

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28779
pull_request: https://github.com/python/cpython/pull/30580


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread miss-islington


Change by miss-islington :


--
pull_requests: +28778
pull_request: https://github.com/python/cpython/pull/30579


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread miss-islington


Change by miss-islington :


--
pull_requests: +28777
pull_request: https://github.com/python/cpython/pull/30578


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 1a4d1c1c9b08e75e88aeac90901920938f649832 by Victor Stinner in 
branch 'main':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577)
https://github.com/python/cpython/commit/1a4d1c1c9b08e75e88aeac90901920938f649832


--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread Eric Snow


Eric Snow  added the comment:

> (*) I made the GC state per-interpreter: commit 
> 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

FYI, this was done by me in an earlier commit, which we ended up
reverting.  Later you basically un-reverted it.

> The bug is that a C function object (_sre.compile) is created in an 
> interpreter, tracked by the GC list of this interpreter, and then it is 
> destroyed and untracked in another interpreter.

FWIW, at one point I had a branch that supported sharing read-only
Py_buffer data.  When the receiving interpreter was done with it, I'd
call Py_AddPendingCall() to schedule the cleanup in the "owner"
interpreter.  However, this only worked because I kept track of the
owner.  Adding that pointer to every object wouldn't be feasible, but I
suppose there are other things we could do that wouldn't be super
inefficient: don't worry about it for the main interpreter, use a
hash table (Victor's idea), borrow some of the bits of the PyObject
head to store a flag or even an index into an array (if there are only
a few interpreters), or even make the allocator per-interpreter and
then derive the interpreter from the object's address.

Regardless, it is still much simpler to make all objects per-interpreter.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

This issue has a complex history.

(*) I made the GC state per-interpreter: commit 
7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

(*) This change triggered a _PyImport_FixupExtensionObject() bug in 
sub-interpreter, I fixed it with commit 
82c83bd907409c287a5bd0d0f4598f2c0538f34d (Nov 22, 2019)

(*) My _PyImport_FixupExtensionObject() fix introduced bpo-44050 regression, it 
was fixed by commit b9bb74871b27d9226df2dd3fce9d42bda8b43c2b (Oct 5, 2021)

(*) A race condition in the _asyncio extension was identified and fixed by 
commit b127e70a8a682fe869c22ce04c379bd85a00db67 (Jan 7, 2022)

(*) I identified a race condition introduced by the per-interpreter GC state 
change: I proposed GH-30577 to fix it.


So far, the GC race condition has only been reproduced on Windows with Python 
3.9 and the _sre extension. On Python 3.10 and newer, it's harder to reproduce 
the crash using stdlib extensions since many of them have been ported to the 
multi-phase initialization API.

The GC race condition involves dangling pointers and depends on the memory 
allocator and on when GC collections are triggered.

The bug is that a C function object (_sre.compile) is created in one 
interpreter, tracked by the GC list of that interpreter, and then destroyed 
and untracked in another interpreter. The problem is that the object is 
untracked after the GC list has been destroyed, so the "prev" and "next" 
objects of the PyGC_Head structure *can* become dangling pointers.

It's unclear to me what the "prev" and "next" objects of the C function 
causing the crash (_sre.compile) are. At least, it seems like the function is 
also used by more than one interpreter: that should *not* happen, see 
bpo-40533.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

Oh. I managed to write a simple fix which doesn't require reverting the whole 
"per-interpreter GC" change: GH-30577.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28776
pull_request: https://github.com/python/cpython/pull/30577


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-13 Thread STINNER Victor


STINNER Victor  added the comment:

When the crash occurs, the _sre.compile function is not destroyed in the 
interpreter which created the function.



The crash is related to the _sre.compile function. This function is created in 
PyInit__sre(), called by "import _sre".

On Windows, the _sre module is not imported at startup, so it is imported 
first in a subinterpreter.

In Python 3.9, the _sre module doesn't use the multi-phase initialization API 
and has PyModuleDef.m_size = -1. When the module is imported, 
_PyImport_FixupExtensionObject() copies the module dictionary into 
PyModuleDef.m_copy.
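
A small sketch of the consequence of m_copy, assuming a Python 3.9 build where 
_sre is still single-phase (_testcapi.run_in_subinterp() runs source code in a 
fresh subinterpreter):
---
import _testcapi

code = "import _sre; print('id(_sre.compile) =', id(_sre.compile))"

# Both runs should print the same id on a single-phase build: the second
# import copies m_copy, so the PyCFunction object is shared across
# interpreters instead of being re-created.
_testcapi.run_in_subinterp(code)
_testcapi.run_in_subinterp(code)
---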

In Py_Finalize() and Py_EndInterpreter(), _PyImport_Cleanup() does two things:

* (1) set _sre.__dict__['compile'] to None -> kill the first reference to the 
function
* (2) call _PyInterpreterState_ClearModules(), which does 
Py_CLEAR(def->m_base.m_copy), clearing the cached copy of the _sre module dict 
-> kill the second reference

I modified Python to add an "ob_interp" member to PyObject to log in which 
interpreter an object is created. I also modified meth_dealloc() to log when 
_sre.compile function is deleted.

Extract of the reformatted output to see what's going on:
---
(...)

(1)
fixup: COPY _sre ModuleDef copy: def=7FFF19209810 interp=01EC1846F2A0

(2)
import: UPDATE(_sre ModuleDef copy): interp=01EC184AB790

(3)
_PyImport_Cleanup: interp=01EC1846F2A0
_PyInterpreterState_ClearModules: PY_CLEAR _sre ModuleDef m_copy: 
def=7FFF19209810 interp=01EC1846F2A0

(4)
_PyImport_Cleanup: interp=01EC184AB790
meth_dealloc(compile): m->ob_interp=01EC1846F2A0, 
interp=01EC184AB790

Windows fatal exception: access violation
(...)
---

Steps:

* (1)

  * interpreter #1 (01EC1846F2A0) creates the _sre.compile function
  * interpreter #1 (01EC1846F2A0) copies _sre module dict into 
PyModuleDef.m_copy
  * at this point, _sre.compile should have 2 references

* (2)

  * interpreter #2 (01EC184AB790) imports _sre: it creates a new module 
object and copies the function from PyModuleDef.m_copy
  * at this point, _sre.compile should have 3 references

* (3)

  * interpreter #1 exit: Py_EndInterpreter() calls _PyImport_Cleanup()
  * at this point, _sre.compile should have 1 reference

* (4)

  * interpreter #2 exit: Py_EndInterpreter() calls _PyImport_Cleanup()
  * the last reference to _sre.compile is deleted: 0 reference
  * meth_dealloc() is called

The first problem is that the function was created in interpreter #1 but 
deleted in interpreter #2.

The second problem is that the function is tracked by the GC and is part of 
the GC list of interpreter #1. When interpreter #2 destroys the function, the 
GC list of interpreter #1 has already been freed: PyGC_Head contains dangling 
pointers.
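
A pure-Python toy model of that second problem (not CPython code; names are 
illustrative): PyGC_Head keeps tracked objects in a per-interpreter doubly 
linked list, and untracking writes through the prev/next pointers.
---
class Node:
    """Stands in for PyGC_Head: a node of a doubly linked list."""
    def __init__(self, name):
        self.name = name
        self.prev = self.next = self

def track(head, node):
    # Link the node before the head, like _PyObject_GC_TRACK().
    node.prev, node.next = head.prev, head
    head.prev.next = node
    head.prev = node

def untrack(node):
    # Unlink the node, like _PyObject_GC_UNTRACK(): note that it writes
    # through both neighbours.
    node.prev.next = node.next
    node.next.prev = node.prev

gc_list = Node("interp #1 GC list head")
func = Node("_sre.compile")
track(gc_list, func)

del gc_list     # interpreter #1 exits; in C, the list memory is freed
untrack(func)   # in C, this writes through dangling prev/next pointers
---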

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


STINNER Victor  added the comment:

I wrote 3 scripts to reproduce the bug in a more reliable way. So I just have 
to type "bisect" and it runs the test 12 times.

(1) bisect.bat:
---
@"C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe" bisect.py
---


(2) bisect.py:
---
import subprocess
import os
import sys

BISECT = False

def run(*args):
    print("+ ", ' '.join(args))
    env = dict(os.environ)
    env['PYTHONFAULTHANDLER'] = '1'
    proc = subprocess.run(args, env=env)
    exitcode = proc.returncode
    if exitcode:
        print()
        print(f"COMMAND FAILED: {exitcode}")
        if BISECT:
            print()
            print("type: git bisect bad")
        sys.exit(exitcode)

python = sys.executable
#script = "win_py399_crash_reproducer.py"
script = "bug.py"
nrun = 12
for i in range(1, nrun+1):
    print(f"Run #{i}/{nrun}")
    if i % 2:
        run(python, script)
    else:
        run(python, "-X", "dev", script)

if BISECT:
    print()
    print("Not reproduced")
    print()
    run("git", "checkout", ".")
    run("git", "bisect", "good")
---


(3) win_py399_crash_reproducer.py (import "_sre"):
---
# When this program is run on windows using python 3.9.9 it crashes about 50%
# of the time.

import _testcapi
import threading

code = """
import _sre
print("exit subinterpreter")
"""

def doIt():
    _testcapi.run_in_subinterp(code)

tt = []

for i in range(16):
    t = threading.Thread(target=doIt)
    t.start()
    tt.append(t)

for t in tt:
    t.join()
print("exit main")
---


Example:
---
vstinner@DESKTOP-DK7VBIL C:\vstinner\python\3.9>bisect
Run #1/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe bug.py
Run #2/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe -X dev bug.py
Run #3/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe bug.py
Windows fatal exception: access violation
(...)
Current thread 0x0420 (most recent call first):
  File "C:\vstinner\python\3.9\bug.py", line 13 in doIt
  File "C:\vstinner\python\3.9\lib\threading.py", line 910 in run
  File "C:\vstinner\python\3.9\lib\threading.py", line 973 in _bootstrap_inner
  File "C:\vstinner\python\3.9\lib\threading.py", line 930 in _bootstrap
(...)
COMMAND FAILED: 3221225477
---

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


STINNER Victor  added the comment:

I modified PR 30565 (3.10) and PR 30566 (3.9) to fix the ABI. I added 
_PyGC_GetState(), which always uses PyInterpreterState.gc of the main 
interpreter.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread Eric Snow


Eric Snow  added the comment:

> adding a new "gc" member in the _PyRuntimeState structure also causes the ABI 
> CI check to fail.

What if you move it to the end of the struct?

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


STINNER Victor  added the comment:

I prepared 3 pull requests to revert the commit 
7247407c35330f3f6292f1d40606b7ba6afd5700:

* PR 30564: main branch
* PR 30565: 3.10 branch
* PR 30566: 3.9 branch

The problem is that the "Check if the ABI has changed" CI job fails in the 3.9 
and 3.10 branches.

I recently had the same issue for a different revert in bpo-46006: I decided 
to keep the "removed" member and mark it as "unused". See commit 
72c260cf0c71eb01eb13100b751e9d5007d00b70 in the 3.10 branch:

struct _Py_unicode_state {
    (...)
-   PyObject *interned;
+   // Unused member kept for ABI backward compatibility with Python 3.10.0:
+   // see bpo-46006.
+   PyObject *unused_interned;
    (...)
}

I can keep the "gc" member in PyInterpreterState, but adding a new "gc" member 
to the _PyRuntimeState structure also causes the ABI CI check to fail.

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28769
pull_request: https://github.com/python/cpython/pull/30566


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28768
pull_request: https://github.com/python/cpython/pull/30565


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28767
pull_request: https://github.com/python/cpython/pull/30564


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-07 Thread STINNER Victor


STINNER Victor  added the comment:

This bug is hard to reproduce for different reasons:

* It occurs randomly: I need between 1 and 50 attempts to reproduce the bug 
using win_py399_crash_reproducer.py.

* So far, the bug has only been reproduced on Windows.

* I failed to reproduce the crash on Linux. I tried PYTHONMALLOC=malloc and 
PYTHONMALLOC=malloc_debug, with and without 
LD_PRELOAD=/usr/lib64/libjemalloc.so.2 (the jemalloc memory allocator); see 
the sketch below.

* The _sre extension was converted to multi-phase init in Python 3.10, so 
"import _sre" is no longer enough to reproduce the crash on Python 3.10 and 
newer.
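
The allocator sweep described above, written out as a script (a sketch; the 
script name and attempt count are taken from the messages in this thread, and 
it assumes you run it from a CPython build directory):
---
import os
import subprocess

for alloc in ("pymalloc", "malloc", "malloc_debug"):
    env = dict(os.environ, PYTHONMALLOC=alloc, PYTHONFAULTHANDLER="1")
    for attempt in range(1, 51):
        proc = subprocess.run(["./python", "win_py399_crash_reproducer.py"],
                              env=env)
        if proc.returncode != 0:
            print(f"PYTHONMALLOC={alloc}: failed on attempt {attempt} "
                  f"(exit code {proc.returncode})")
            break
---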

--


[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

2022-01-07 Thread STINNER Victor


Change by STINNER Victor :


--
title: [subinterpreters] asyncio crash when importing _asyncio in 
subinterpreter (Python 3.8 regression) -> [subinterpreters] crash when 
importing _sre in subinterpreters in parallel (Python 3.9 regression)


Parallel python in the cloud

2014-05-24 Thread Charles Gagnon
We were happily using PiCloud for several long calculations and were very 
happy with it. With their realtime cores, we could take really large 
calculation sets and run through them fairly quickly.

Now that PiCloud is going away, we ran a few tests on Multyvac, but so far we 
are struggling to accomplish the same thing we had on PiCloud.

I have several pieces of the puzzle but can't seem to put them together. I've 
seen and tried StarCluster and also various parallel Python options, but all 
of them seem challenging to put together.

The goal is to mimic PiCloud, i.e. loop through a function:

def some_NP_func(x, y):
    ...
    return z

some_cloud.call(some_NP_func, a1, a2)

which computes the function on the cloud. We use this often in for loops with 
arrays of arguments. The other scenario is:

some_cloud.map(some_NP_intense_func, [...], [...])

which iterates through the arguments and returns results. We need to run a lot 
of this in batch from a scheduler, so I always try to avoid interactive 
environments (how does IPython parallel work in batch?).

What is the preferred approach or method right now for heavy parallel 
computation like this?


Regards,


Re: Parallel python in the cloud

2014-05-24 Thread Robert Kern

On 2014-05-24 07:46, Charles Gagnon wrote:

We were happily using PiCloud for several long calculations and we very happy 
with with it. With their realtime cores, we could take really large 
calculations set and run through fairly quickly.

Now that PiCloud is going away, we ran a few tests on Mutlyvac but so far, we 
are struggling to accomplish the same thing we had on PiCloud.

I have several pieces of my puzzle but can't seem to be able to put it 
together. I've seen and tried StarCluster and also various parallel python options but 
all options seem challenging to put together.

The goal is to mimic PiCloud, ie. loop through a function:

def some_NP_func(x, y):
...
return z

some_cloud.call(some_NP_func, a1, a2)

Which computes the function on the cloud. We use this often in for loops with 
arrays of arguments. The other scenario is:

some_cloud.map(some_NP_intense_func, [...], [...])

Which iterates through and returns results. We need to run a lot of this in 
batch from a scheduler so I always try to avoid interactive environment (how 
does iPython parallel work in batch?).


IPython parallel works just fine in batch. As far as your client code (i.e. 
what you wrote above) is concerned, it's just another library. E.g.


https://github.com/ipython/ipython/blob/master/examples/Parallel%20Computing/nwmerge.py
https://github.com/ipython/ipython/blob/master/examples/Parallel%20Computing/itermapresult.py

etc.
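
For a purely local stand-in of that call/map shape, the stdlib pool offers the 
same two operations (a sketch; some_NP_func is the placeholder name from the 
original post):
---
from multiprocessing import Pool

def some_NP_func(x, y):
    return x * y  # placeholder for the real computation

if __name__ == "__main__":
    with Pool() as pool:
        # cloud.call(f, a1, a2)  ->  pool.apply_async(f, (a1, a2))
        result = pool.apply_async(some_NP_func, (2, 3))
        print(result.get())
        # cloud.map(f, xs, ys)   ->  pool.starmap(f, zip(xs, ys))
        print(pool.starmap(some_NP_func, zip(range(5), range(5))))
---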

--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco



Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-27 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:42, Jurko Gospodnetić wrote:

   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall. That leaves us with a working
Python folder sans the start menu and registry items, both of which we
do not need for this. Everything I've played around with so far seems to
use the correct Python data depending on the interpreter executable
invoked, whether or not there is a regular Windows installation
somewhere on the same machine.

   We can use the script suggested by Ned Batchelder to temporarily
change the 'current installation' if needed for some external installer
package to correctly recognize where to install its content.

   I'm still playing around with this, and will let you know how it goes.


  Just wanted to let you know that the usage I described above seems to 
work in all the cases I tried out.


  I added some batch scripts for running a specific Python interpreter 
as a convenience and everything works 'naturally' in our development 
environment.


  Packages can be easily installed to a specific targeted environment 
using for example:

  py243 -m easy_install pip
  py332 -m pip install pytest
[not mentioning tweaks needed for specific ancient Python versions]

  Thank you all for all the suggestions.

  Best regards,
Jurko Gospodnetić




Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi all.

  I was wondering what is the best way to install multiple Python 
installations on a single Windows machine.


  Regular Windows installer works great as long as all your 
installations have a separate major.minor version identifier. However, 
if you want to have, let's say, 2.4.3 & 2.4.4 installed at the same time, 
it does not seem to work.


  I have not been able to find any prepackaged Python installation or 
really any solution to this. Most of the advice seems to boil down to 
'do not use such versions together, use only the latest'.


  We would like to run automated tests on one of our projects (packaged 
as a Python library) with different Python versions, and since our code 
contains workarounds for several problems with specific Python patch 
versions, we'd really like to be able to run the tests with those 
specific versions and with as little fuss as possible.


  Looking at what the Python installer does, the only problematic part 
for working around this manually seems to be the registry entries under 
'Software\Python\PythonCore\M.m', where 'M.m' is the major.minor version 
identifier. If the Python interpreter expects to always find its entries 
there, then I guess there is no way to do what we need without building 
customized Python executables. Is there a way to force a specific Python 
interpreter to not read this information, or to read it from an .ini file 
or something similar?


  Many thanks.

  Best regards,
Jurko Gospodnetić



Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Ned Batchelder
On Monday, November 25, 2013 7:32:30 AM UTC-5, Jurko Gospodnetić wrote:
 Hi all.
 
I was wondering what is the best way to install multiple Python 
 installations on a single Windows machine.
 
Regular Windows installer works great as long as all your 
 installations have a separate major.minor version identifier. However, 
 if you want to have let's say 2.4.3 & 2.4.4 installed at the same time 
 it does not seem to work.
 
I have not been able to find any prepackaged Python installation or 
 really any solution to this. Most of the advice seems to boil down to 
 'do not use such versions together, use only the latest'.
 
We would like to run automated tests on one of our projects (packaged 
 as a Python library) with different Python versions, and since our code 
 contains workarounds for several problems with specific Python patch 
 versions, we'd really like to be able to run the tests with those 
 specific versions and with as little fuss as possible.
 
Looking at what the Python installer does, the only problematic part 
 for working around this manually seems to be the registry entries under 
 'Software\Python\PythonCore\M.m' where 'M.n' is the major.minor version 
 identifier. If Python interpreter expects to always find its entries 
 there, then I guess there is no way to do what we need without building 
 customized Python executables. Is there a way to force a specific Python 
 interpreter to not read in this information, read it from an .ini file 
 or something similar?
 
Many thanks.
 
Best regards,
  Jurko Gospodnetić

IIRC, Python itself doesn't read those registry entries, except when installing 
pre-compiled .msi or .exe kits.  Once you have Python installed, you can move 
the directory someplace else, then install another version of Python.

If you need to use many different Pythons of the same version, this script 
helps manage the registry: 
http://nedbatchelder.com/blog/201007/installing_python_packages_from_windows_installers_into.html

--Ned.


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Mon, Nov 25, 2013 at 11:32 PM, Jurko Gospodnetić
jurko.gospodne...@pke.hr wrote:
 Most of the advice seems to boil down to 'do not use such versions together,
 use only the latest'.

   We would like to run automated tests on one of our projects (packaged as a
 Python library) with different Python versions, and since our code contains
 workarounds for several problems with specific Python patch versions, we'd
 really like to be able to run the tests with those specific versions and
 with as little fuss as possible.

What this says to me is that you're doing something very unusual here
- most people won't be doing that. So maybe you need an unusual
solution.

Is it possible to set up virtualization to help you out? Create a
virtual machine in something like VirtualBox, then clone it for every
Python patch you want to support (you could have one VM that handles
all the .0 releases and another that handles all the .1 releases, or
you could have a separate VM for every Python you want to test). You
could then have a centralized master that each VM registers itself
with, and it feeds out jobs to them. Assuming your tests can be fully
automated, this could work out fairly efficiently - each VM has a
script that establishes a socket connection to the master, the master
hands out a job, the VMs run the test suite, the master collects up a
series of Pass/Fail reports. You could run everything on a single
physical computer, even.
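
A bare-bones sketch of that master process (the port, protocol and job names 
are invented for illustration; each VM's script would connect, read a job and 
write back a pass/fail line):
---
import socket

jobs = ["2.4.3", "2.4.4", "3.3.1"]  # hypothetical Python versions to test
results = {}

with socket.create_server(("", 9999)) as server:
    while jobs:
        conn, addr = server.accept()
        with conn:
            job = jobs.pop()
            conn.sendall(job.encode() + b"\n")             # hand out the job
            results[job] = conn.recv(64).decode().strip()  # "pass"/"fail"
print(results)
---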

ChrisA


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Albert-Jan Roskam

On Mon, 11/25/13, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote:

 Subject: Parallel Python x.y.A and x.y.B installations on a single Windows 
machine
 To: python-list@python.org
 Date: Monday, November 25, 2013, 1:32 PM
 
   Hi all.
 
   I was wondering what is the best way to install
 multiple Python installations on a single Windows machine.
 
   Regular Windows installer works great as long as all
 your installations have a separate major.minor version
 identifier. However, if you want to have let's say 2.4.3
  2.4.4 installed at the same time it does not seem to
 work.
 
   I have not been able to find any prepackaged Python
 installation or really any solution to this. Most of the
 advice seems to boil down to 'do not use such versions
 together, use only the latest'.
 
   We would like to run automated tests on one of our
 projects (packaged as a Python library) with different
 Python versions, and since our code contains workarounds for
 several problems with specific Python patch versions, we'd
 really like to be able to run the tests with those specific
 versions and with as little fuss as possible.
 
   Looking at what the Python installer does, the only
 problematic part for working around this manually seems to
 be the registry entries under
 'Software\Python\PythonCore\M.m' where 'M.n' is the
 major.minor version identifier. If Python interpreter
 expects to always find its entries there, then I guess there
 is no way to do what we need without building customized
 Python executables. Is there a way to force a specific
 Python interpreter to not read in this information, read it
 from an .ini file or something similar?
 
HI Jurko,

Check out the following packages: virtualenv, virtualenvwrapper, tox.
virtualenv + wrapper make it very easy to switch from one Python version to 
another. Strictly speaking you don't need virtualenvwrapper, but it makes 
working with virtualenv a whole lot easier. Tox also uses virtualenv. You can 
configure it to sdist your package under different Python versions. Also, you 
can make it run nosetests for each Python version and/or implementation (pypy 
and jython are supported).

Albert-Jan 




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:04, Chris Angelico wrote:

Is it possible to set up virtualization to help you out? Create a
virtual machine in something like VirtualBox, then clone it for every
Python patch you want to support (you could have one VM that handles
all the .0 releases and another that handles all the .1 releases, or
you could have a separate VM for every Python you want to test).
...


  Thank you for the suggestion.

  Yup, we could do that, but at first glance it really smells like an 
overkill. Not to mention the potential licensing issues with Windows and 
an unlimited number of Windows installations. :-)


  So far all tests seem to indicate that things work out fine if we 
install to some dummy target folder, copy the target folder to some 
version specific location & uninstall. That leaves us with a working 
Python folder sans the start menu and registry items, both of which we 
do not need for this. Everything I've played around with so far seems to 
use the correct Python data depending on the interpreter executable 
invoked, whether or not there is a regular Windows installation 
somewhere on the same machine.


  We can use the script suggested by Ned Batchelder to temporarily 
change the 'current installation' if needed for some external installer 
package to correctly recognize where to install its content.


  I'm still playing around with this, and will let you know how it goes.

  Thank you again for replying!

  Best regards,
Jurko Gospodnetić




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Tue, Nov 26, 2013 at 12:42 AM, Jurko Gospodnetić
jurko.gospodne...@pke.hr wrote:
   Yup, we could do that, but at first glance it really smells like an
 overkill. Not to mention the potential licensing issues with Windows and an
 unlimited number of Windows installations. :-)

Ah, heh... didn't think of that. When I spin up arbitrary numbers of
VMs, they're always Linux, so licensing doesn't come into it :)

   So far all tests seem to indicate that things work out fine if we install
 to some dummy target folder, copy the target folder to some version specific
 location & uninstall. That leaves us with a working Python folder sans the
 start menu and registry items, both of which we do not need for this.
 Everything I've played around with so far seems to use the correct Python
 data depending on the interpreter executable invoked, whether or not there
 is a regular Windows installation somewhere on the same machine.

Okay! That sounds good. Underkill is better than overkill if you can
get away with it!

Good luck. You'll need it, if you're trying to support Python 2.4 and
all newer versions AND manage issues across patch releases...

ChrisA


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 13:46, Ned Batchelder wrote:

IIRC, Python itself doesn't read those registry entries, except when installing 
pre-compiled .msi or .exe kits.  Once you have Python installed, you can move 
the directory someplace else, then install another version of Python.

If you need to use many different Pythons of the same version, this script 
helps manage the registry: 
http://nedbatchelder.com/blog/201007/installing_python_packages_from_windows_installers_into.html


  Thank you for the information!

  As I mentioned in another reply, so far I think we can use this 
script to temporarily change the 'current installation' if needed for 
some external installer package to correctly recognize where to install 
its content.


  <bike-shedding>If we do use it, I'll most likely modify it to first 
make a backup copy of the original registry key and use that later on to 
restore the original registry state, instead of reconstructing its 
content to what the script assumes it should be.</bike-shedding>
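
A sketch of that backup/restore idea (the key path and version are 
illustrative, and only the key's values are handled, not its sub-keys):
---
import winreg

PATH = r"Software\Python\PythonCore\2.4"

def backup_values(path=PATH):
    """Return {name: (value, type)} for the values of the key."""
    saved = {}
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, path) as key:
        i = 0
        while True:
            try:
                name, value, vtype = winreg.EnumValue(key, i)
            except OSError:  # no more values
                return saved
            saved[name] = (value, vtype)
            i += 1

def restore_values(saved, path=PATH):
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, path) as key:
        for name, (value, vtype) in saved.items():
            winreg.SetValueEx(key, name, 0, vtype, value)
---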


  Best regards,
Jurko Gospodnetić




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:20, Albert-Jan Roskam wrote:

Check out the following packages: virtualenv, virtualenvwrapper, tox
virtualenv + wrapper make it very easy to switch from one python
version to another. Stricly speaking you don't need
virtualenvwrapper, but it makes working with virtualenv a whole lot
easier.Tox also uses virtualenv. You can configure it to sdist your
package under different python versions. Also, you can make it run
nosetests for each python version and/or implementation (pypy and
jython are supported)


  I'll look into using virtualenv and possibly tox once I get into 
issues with mismatched installed Python package versions, but for now 
I'm dealing with installing different Python interpreter versions and, 
unless I'm overlooking something here, virtualenv does not help with 
that. :-(


  Thanks for the suggestion though, I'm definitely going to read up on 
those packages soon. :-)


  Best regards,
Jurko Gospodnetić




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Albert-Jan Roskam

On Mon, 11/25/13, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote:

 Subject: Re: Parallel Python x.y.A and x.y.B installations on a single Windows 
machine
 To: python-list@python.org
 Date: Monday, November 25, 2013, 2:57 PM
 
   Hi.
 
 On 25.11.2013. 14:20, Albert-Jan Roskam wrote:
  Check out the following packages: virtualenv,
 virtualenvwrapper, tox
  virtualenv + wrapper make it very easy to switch from
 one python
  version to another. Stricly speaking you don't need
  virtualenvwrapper, but it makes working with virtualenv
 a whole lot
  easier.Tox also uses virtualenv. You can configure it
 to sdist your
  package under different python versions. Also, you can
 make it run
  nosetests for each python version and/or implementation
 (pypy and
  jython are supported)
 
   I'll look into using virtualenv and possibly tox once
 I get into issues with mismatched installed Python package
 versions, but for now I'm dealing with installing different
 Python interpreter versions and, unless I'm overlooking
 something here, virtualenv does not help with that. :-(
 
  Are you sure? 
http://stackoverflow.com/questions/1534210/use-different-python-version-with-virtualenv

Below is a little terminal session.  I often switch between python 3.3 and 
python 2.7. My virtualenv for python 3.3 is called python33. workon is a 
virtualenv wrapper command. And check out the envlist in tox.ini on 
http://tox.readthedocs.org/en/latest/example/basic.html

antonia@antonia-HP-2133 ~ $ workon python3.3
ERROR: Environment 'python3.3' does not exist. Create it with 'mkvirtualenv 
python3.3'.
antonia@antonia-HP-2133 ~ $ workon python33
(python33)antonia@antonia-HP-2133 ~ $ python
Python 3.3.2 (default, Sep  1 2013, 22:59:57) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
(python33)antonia@antonia-HP-2133 ~ $ deactivate
antonia@antonia-HP-2133 ~ $ python
Python 2.7.3 (default, Sep 26 2013, 16:38:10) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()





Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Tue, Nov 26, 2013 at 1:15 AM, Albert-Jan Roskam fo...@yahoo.com wrote:
 Below is a little terminal session.  I often switch between python 3.3 and 
 python 2.7. My virtualenv for python 3.3 is called python33. workon is a 
 virtualenv wrapper command. And check out the envlist in tox.ini on 
 http://tox.readthedocs.org/en/latest/example/basic.html

That's two different minor versions, though. Can you have 3.3.1 and
3.3.2 installed, by that method?

Incidentally, if this were on Linux, I would just build the different
versions in different directories, and then run them without
installing. But the OP seems to have a solution that works, and I
think it'll be the simplest.

ChrisA


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 15:15, Albert-Jan Roskam wrote:

   Are you sure? 
http://stackoverflow.com/questions/1534210/use-different-python-version-with-virtualenv


  Yup, I'm pretty sure by now (based on reading the docs, not trying it 
out though).


  Virtualenv allows you to set up different environments, each of them 
having a separate Python folder structure and each possibly connected to 
a separate Python interpreter executable. However, it does not solve the 
problem of getting those separate Python interpreter executables 
installed in the first place, which is the problem I was attacking. :-)


  Still playing around with my multiple installations setup here. Will 
let you know how it goes...


  So far, one important thing I noticed is that you need to run all 
your installations 'for the current user only'; otherwise the installer 
moves at least one DLL file (python24.dll) into a Windows system folder, 
and then the next installation deletes it from there and overwrites it 
with its own. :-( But I can live with that... :-)


  Best regards,
Jurko Gospodnetić




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Terry Reedy

On 11/25/2013 8:42 AM, Jurko Gospodnetić wrote:


   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall.


If the dummy folder had 3.3.0, you should not need to uninstall to 
install 3.3.1 on top of it. But it is easy and probably safest.



That leaves us with a working
Python folder sans the start menu and registry items, both of which we
do not need for this. Everything I've played around with so far seems to
use the correct Python data depending on the interpreter executable
invoked, whether or not there is a regular Windows installation
somewhere on the same machine.


Just a reminder: you can run one file or set of files with multiple 
Pythons by putting 'project.pth' containing the same 'path-to-project' 
in the Lib/site-packages of each Python directory. I do this to test one 
file with 2.7 and 3.3 (and just added 3.4) without copying the file.
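
A sketch of that tip as a script (the project path is illustrative; run it 
once under each Python version you want to test with):
---
import os
import site

PROJECT = r"C:\work\myproject"  # hypothetical path-to-project

for sp in site.getsitepackages():
    if sp.endswith("site-packages"):
        pth = os.path.join(sp, "project.pth")
        with open(pth, "w") as f:
            f.write(PROJECT + "\n")  # one absolute path per line
        print("wrote", pth)
---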


--
Terry Jan Reedy




Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 17:38, Terry Reedy wrote:

   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall.


If the dummy folder had 3.3.0, you should not need to uninstall to
install 3.3.1 on top of it. But it is easy and probably safest.


  Without the uninstall step you get stuck with invalid registry and 
start menu items referring to an invalid path, until you install another 
matching major.minor.X version.




Just a reminder: you can run one file or set of files with multiple
Pythons by putting 'project.pth' containing the same 'path-to-project'
in the Lib/site-packages of each Python directory. I do this to test one
file with 2.7 and 3.3 (and just added 3.4) without copying the file.


  Thanks for the tip. That might come in useful. At the moment I just 
run the pytest framework using different python interpreters, without 
having to install the package at all (possibly first running 'setup.py 
build' to get the sources converted to Python 3 format).


  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Parallel python + ??

2008-06-11 Thread Thor
Hi,

I am running a program using Parallel Python and I wonder if there is a
way/module to know which CPU/core the process is running on. Is that
possible?

Ángel
--
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel python + ??

2008-06-11 Thread Gerhard Häring

Thor wrote:

Hi,

I am running a program using Parallel Python and I wonder if there is a
way/module to know which CPU/core the process is running on. Is that
possible?


This is of course OS-specific. On Linux, you can parse the proc filesystem:

 open("/proc/%i/stat" % os.getpid()).read().split()[39]

You can use the taskset utility to query or set CPU affinity on Linux.
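
A slightly more robust sketch of the same idea (Linux-only; the helper
name is illustrative, and the field offset follows proc(5)):

import os

def current_cpu(pid=None):
    # 'processor' - the CPU the task last ran on - is field 39 in proc(5).
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split only after the closing parenthesis; fields[0] is field 3.
    pid = pid or os.getpid()
    data = open("/proc/%d/stat" % pid).read()
    fields = data[data.rindex(")") + 2:].split()
    return int(fields[36])

print "running on CPU", current_cpu()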

-- Gerhard

--
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel python + ??

2008-06-11 Thread Thor
Gerhard Häring wrote:

 This is of course OS-specific. On Linux, you can parse the proc
 filesystem:
 
   open("/proc/%i/stat" % os.getpid()).read().split()[39]
 
 You can use the taskset utility to query or set CPU affinity on Linux.
 
It is going to be on Linux (mainly). I was thinking about something like
this:

import Module

def process(self):
    print "I am running on processor", Module.cpu, "core", Module.core
  


Checking out taskset right now... :) Thanks.
--
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-07 Thread Thorsten Kampe
* bruce (Tue, 6 Nov 2007 13:43:10 -0800)
 if i have python 2.4.3 installed, it gets placed in the python2.4 dir.. if i
 don't do anything different, and install python 2.4.2, it too will get
 placed in the python2.4 tree... which is not what i want.
 
 i'm running rhel4/5...

So you're using rpm as a package manager. I suggest you RTFM to see if 
there are options for slots or dual installations.
 
 so.. i still need to know what to do/change in order to be able to run
 multiple versions of python, and to switch back/forth between the versions.

Unpack the Python rpm to ~/bin or compile Python yourself. And be more 
specific about what you mean by switching back/forth. 

On the other hand - as Gabriel pointed out: there is almost a 100% 
certainty that the problem you want to solve by having Python 2.4.2 
*and* 2.4.3 simultaneously exists only in your head or cannot be 
solved this way.

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


Parallel Python environments..

2007-11-06 Thread bruce
Hi..

If I wanted to be able to build/test/use parallel python versions, what
would I need to do/set (paths/libs/etc...) and where would I need to place
the 2nd python version, so as not to screw up my initial python dev env.

I'd like to be able to switch back/forth between the different versions if
possible. I know it should be, but I haven't been able to find what I'm
looking for via the 'net...

Any sites/pointers describing the process would be helpful. In particular,
any changes to the bashrc/profile/etc... files to allow me to accomplish
this would be helpful.

thanks

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Thorsten Kampe
* bruce (Tue, 6 Nov 2007 07:13:43 -0800)
 If I wanted to be able to build/test/use parallel python versions, what
 would I need to do/set (paths/libs/etc...)

nothing

 and where would I need to place the 2nd python version, so as not to
 screw up my initial python dev env.

Anywhere you like (probably ~/bin would be best)
 
 Any sites/pointers describing the process would be helpful. In particular,
 any changes to the bashrc/profile/etc... files to allow me to accomplish
 this would be helpful.

Nothing like that. Just change the shebang.
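
For example (the interpreter path below is illustrative):

#!/home/you/bin/python2.4

or, relying on PATH lookup:

#!/usr/bin/env python2.4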

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread [EMAIL PROTECTED]
In Gentoo Linux you can select between installed python versions using the
python-config script.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Diez B. Roggisch
bruce wrote:

 Hi..
 
 If I wanted to be able to build/test/use parallel python versions, what
 would I need to do/set (paths/libs/etc...) and where would I need to place
 the 2nd python version, so as not to screw up my initial python dev env.
 
 I'd like to be able to switch back/forth between the different versions if
 possible. I know it should be, but I haven't been able to find what I'm
 looking for via the 'net...
 
 Any sites/pointers describing the process would be helpful. In
 particular, any changes to the bashrc/profile/etc... files to allow me to
 accomplish this would be helpful.

Installation of several python versions is easy on at least Windows &
unixish platforms. For the latter, usually your package management offers
several versions. If yours doesn't, or you use Windows, just install as
required by python itself.

Only the default python version chosen when e.g. clicking *.py files in
Windows Explorer depends on which python version was installed last, and
it must be changed by the OS's own means if other behavior is
desired.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Gabriel Genellina
On Tue, 06 Nov 2007 18:43:10 -0300, bruce [EMAIL PROTECTED]  
wrote:

 if i have python 2.4.3 installed, it gets placed in the python2.4 dir..  
 if i
 don't do anything different, and install python 2.4.2, it too will get
 placed in the python2.4 tree... which is not what i want.

Any reason you want to keep 2.4.2 *and* 2.4.3 separate? The latter is only  
a bugfix over the 2.4 version - anything working on 2.4.2 should work on  
2.4.3. And 2.4.4, the latest bugfix on that series. Binaries, shared  
libraries, extensions, etc. targeted at 2.4.x should work with 2.4.4.
You may want to have separate directories for 2.4 and 2.5, yes; binaries,  
shared libraries and extensions do NOT work across versions changing the  
SECOND digit. But changes on the THIRD digit should not have compatibility  
problems.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
thorsten...

if i have python 2.4.3 installed, it gets placed in the python2.4 dir.. if i
don't do anything different, and install python 2.4.2, it too will get
placed in the python2.4 tree... which is not what i want.

i'm running rhel4/5...

so.. i still need to know what to do/change in order to be able to run
multiple versions of python, and to switch back/forth between the versions.

thanks


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of Thorsten Kampe
Sent: Tuesday, November 06, 2007 8:19 AM
To: python-list@python.org
Subject: Re: Parallel Python environments..


* bruce (Tue, 6 Nov 2007 07:13:43 -0800)
 If I wanted to be able to build/test/use parallel python versions, what
 would I need to do/set (paths/libs/etc...)

nothing

 and where would I need to place the 2nd python version, so as not to
 screw up my initial python dev env.

Anywhere you like (probably ~/bin would be best)

 Any sites/pointers describing the process would be helpful. In
particular,
 any changes to the bashrc/profile/etc... files to allow me to accomplish
 this would be helpful.

Nothing like that. Just change the shebang.

Thorsten
--
http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
i'm running rhel...

so there isn't a python-config script as far as i know..


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of [EMAIL PROTECTED]
Sent: Tuesday, November 06, 2007 8:26 AM
To: python-list@python.org
Subject: Re: Parallel Python environments..


In Gentoo Linux you can select between installed python versions using the
python-config script.

-- 
http://mail.python.org/mailman/listinfo/python-list
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
hi gabriel...

i have my reasons, for some testing that i'm doing on a project.

that said, i'm still trying to figure out how to make this occur...

thanks



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of Gabriel Genellina
Sent: Tuesday, November 06, 2007 2:07 PM
To: python-list@python.org
Subject: Re: Parallel Python environments..


On Tue, 06 Nov 2007 18:43:10 -0300, bruce [EMAIL PROTECTED]
wrote:

 if i have python 2.4.3 installed, it gets placed in the python2.4 dir..
 if i
 don't do anything different, and install python 2.4.2, it too will get
 placed in the python2.4 tree... which is not what i want.

Any reason you want to keep 2.4.2 *and* 2.4.3 separate? The latter is only
a bugfix over the 2.4 version - anything working on 2.4.2 should work on
2.4.3. And 2.4.4, the latest bugfix on that series. Binaries, shared
libraries, extensions, etc. targeted at 2.4.x should work with 2.4.4.
You may want to have separate directories for 2.4 and 2.5, yes; binaries,
shared libraries and extensions do NOT work across versions changing the
SECOND digit. But changes on the THIRD digit should not have compatibility
problems.

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-09 Thread parallelpython
Hi,

That is definitely possible!
To achieve the best performance split your calculation either into 128
equal parts or into more than 128 parts of any size (then load balancing will
spread the workload equally). Let us know the results; if you need any help
with parallelization feel free to request it here:
http://www.parallelpython.com/component/option,com_smf/Itemid,29/
Thank you!
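
For illustration, the splitting could look roughly like this (a sketch
assuming job_server = pp.Server() and the isprime helper from pp's
bundled examples; the range and chunk count are arbitrary):

def sum_primes_range(lo, hi):
    # sum of the primes in [lo, hi); isprime as in pp's examples
    return sum([n for n in xrange(lo, hi) if isprime(n)])

parts = 128
limit = 10000000
step = limit // parts
jobs = [job_server.submit(sum_primes_range, (i * step, (i + 1) * step),
                          (isprime,), ("math",))
        for i in range(parts)]
total = sum([job() for job in jobs])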

On Feb 7, 2:13 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of Parallel
 Python that it can run on SMP and clusters. But can it run on our multi-CPU
 server ?
 We are running an Origin 3800 server with 128 CPUs.

 Thanks.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-08 Thread azrael
no, not renting. i need one like that at home. when you say rent it sounds
like buying a hosting package. i need one to work all the time at max
power. cracking md5 needs power. :-D

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-07 Thread Nick Vatamaniuc
From www.parallelpython.com, the 'Features' section:

Features:
 *Parallel execution of python code on SMP and clusters
---

PP uses processes, and thus it will take advantage of multiple cores
for a CPU bound task.
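
For reference, the submission pattern looks like this (a minimal sketch;
the sum_primes and isprime helpers are assumed from pp's bundled examples):

import pp

job_server = pp.Server()   # autodetects the number of CPUs/cores
job = job_server.submit(sum_primes, (100000,), (isprime,), ("math",))
print "Sum of primes below 100000 is", job()   # job() blocks until done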

-Nick


On Feb 6, 9:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of
 Parallel Python
 that it can run on SMP and clusters. But can it run on our multi-CPU
 server ?
 We are running an Origin 3800 server with 128 CPUs.

 Thanks.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-07 Thread Martin P. Hellwig
[EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of 
 Parallel Python
 that it can run on SMP and clusters. But can it run on our multi-CPU 
 server ?
 We are running an Origin 3800 server with 128 CPUs.

 Thanks.
   
 I have tested that at least it could run sum_primes.py on our server.
 
 But it seems Parallel Python just launches one python process for
 each job, and if I let it use 12 CPUs for 8 jobs, Parallel Python
 launches 12 python processes, 4 of which just sleep until all 8 jobs
 are done.

I've just downloaded it, having a couple of M-CPU machines; it's quite 
interesting. At this moment I think that you should not see it as 
'magical distribution' of your processing pool but more like the way 
threads work. So every function will stay on its CPU while executing 
and will not use the processing power of another CPU if available.

So I guess you have to fine-grain your program to take advantage of 
multiple CPUs, just like you would do if you had 'real' threads.

-- 
mph
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-07 Thread azrael
On Feb 7, 3:13 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of
 Parallel Python
 that it can run on SMP and clusters. But can it run on our multi-CPU
 server ?
 We are running an Origin 3800 server with 128 CPUs.

 Thanks.

I see you got a problem. me too. how can i get such a little toy for my
own and how much do i have to spend on it.
i also want something like this. 128 CPUs, 64 GB rainbow tables, oh my
god. i want this. i am very close to having an orgasm.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-07 Thread Martin P. Hellwig
azrael wrote:
 On Feb 7, 3:13 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
 wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of
 Parallel Python
  that it can run on SMP and clusters. But can it run on our multi-CPU
  server ?
  We are running an Origin 3800 server with 128 CPUs.

 Thanks.
 
  I see you got a problem. me too. how can i get such a little toy for my
  own and how much do i have to spend on it.
  i also want something like this. 128 CPUs, 64 GB rainbow tables, oh my
  god. i want this. i am very close to having an orgasm.
 

Rent it :-) Well, you could do so at SARA; just google for sara teras ;-)

-- 
mph
-- 
http://mail.python.org/mailman/listinfo/python-list


Can Parallel Python run on a multi-CPU server ?

2007-02-06 Thread [EMAIL PROTECTED]
Hi all,

I'm interested in Parallel Python and I learned from the website of 
Parallel Python
that it can run on SMP and clusters. But can it run on our multi-CPU 
server ?
We are running an Origin 3800 server with 128 CPUs.

Thanks.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can Parallel Python run on a multi-CPU server ?

2007-02-06 Thread [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
 Hi all,

 I'm interested in Parallel Python and I learned from the website of 
 Parallel Python
 that it can run on SMP and clusters. But can it run on our multi-CPU 
 server ?
 We are running an Origin 3800 server with 128 CPUs.

 Thanks.
   
I have tested that at least it could run sum_primes.py on our server.

But it seems Parallel Python just launches one python process for
each job, and if I let it use 12 CPUs for 8 jobs, Parallel Python
launches 12 python processes, 4 of which just sleep until all 8 jobs
are done.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-02-04 Thread parallelpython
On Jan 12, 11:52 am, Neal Becker [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
  Has anybody tried to run parallel python applications?
  It appears that if your application is computation-bound using 'thread'
  or 'threading' modules will not get you any speedup. That is because
  python interpreter uses GIL (Global Interpreter Lock) for internal
  bookkeeping. The latter allows only one python byte-code instruction to
  be executed at a time even if you have a multiprocessor computer.
  To overcome this limitation, I've created ppsmp module:
  http://www.parallelpython.com
  It provides an easy way to run parallel python applications on smp
  computers.
  I would appreciate any comments/suggestions regarding it.
  Thank you!

 Looks interesting, but is there any way to use this for a cluster of
 machines over a network (not smp)?

There are 2 major updates regarding Parallel Python: http://
www.parallelpython.com

1) Now (since version 1.2) parallel python software could be used for
cluster-wide parallelization (or even Internet-wide). It's also
renamed accordingly: pp (module is backward compatible with ppsmp)

2) Parallel Python became open source (under BSD license): http://
www.parallelpython.com/content/view/18/32/

-- 
http://mail.python.org/mailman/listinfo/python-list


Distributed computation of jobs (was: Parallel Python)

2007-01-17 Thread A.T.Hofkamp
On 2007-01-12, robert [EMAIL PROTECTED] wrote:
 
 [1] http://www.python.org/pypi/parallel

 I'd be interested in an overview.
 For ease of use a major criterion for me would be a pure python 
 solution, which also does the job of starting and controlling the 
 other process(es) automatically right (by default) on common 
 platforms.

Let me add a few cents to the discussion with this announcement:

About three years ago, I wrote two Python modules, one called 'exec_proxy',
which uses ssh to run another exec_proxy instance at a remote machine, thus
providing light-weight transparent access to a machine across a network.

The idea behind this module was/is that by just using ssh you have network
transparency, much more light-weight than most other distributed modules where
you have to start daemons on all machines.
Recently, the 'rthread' module was announced which takes the same approach (it
seems from the announcement). I have not compared both modules with each other.


The more interesting Python module called 'batchlib' lies on top of the former
(or any other module that provides transparency across the network). It
handles distribution of computation jobs in the form of a 'start-computation'
and 'get-results' pair of functions.

That is, you give it a set of machines it may use, you say to the entry-point,
'compute for me this-and-this function with this-and-this parameters', and
batchlib does the rest.
(that is, it finds a free machine, copies the parameters over the network, runs
the job, the result is transported back, and you can get the result of a
computation by using the same (unique) identification given by you when the job
was given to batchlib.)
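
In spirit, the interface is something like the toy below (illustrative
only: batchlib's real names and semantics may differ, and the real thing
dispatches jobs to remote machines rather than local threads):

import threading

class ToyJobPool:
    def __init__(self):
        self._results = {}
        self._done = {}
    def start_computation(self, job_id, func, args=()):
        # hand the job out; a real pool would pick a free remote machine
        self._done[job_id] = threading.Event()
        def runner():
            self._results[job_id] = func(*args)
            self._done[job_id].set()
        threading.Thread(target=runner).start()
    def get_results(self, job_id):
        # retrieve by the same (unique) identification used at submission
        self._done[job_id].wait()
        return self._results.pop(job_id)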

We used it as computation backend for optimization problems, but since
'computation job' may mean anything, the module should be very generically
applicable.


Compared to most other parallel/distributed modules, I think that the other
modules more-or-less compare with exec_proxy (that is, they stop with
transparent network access), where exec_proxy was designed to have minimal
impact on the required infrastructure (i.e. just ssh or rsh, which is generally
already available) and thus comes without many of the features available from the
other modules.

Batchlib starts where exec_proxy ends, namely lifting network primitives to the
level of providing a simple way of doing distributed computations (in the case
of exec_proxy, without adding network infrastructure such as daemons).




Until now, both modules were used in-house, and it was not clear what we wanted
to do further with the software. Recently, we have decided that we have no
further use for this software (we think we want to move into a different
direction), clearing the way to release this software to the community.

You can get the software from my home page http://seweb.se.wtb.tue.nl/~hat
Both packages can be downloaded, and include documentation and an example.
The bad news is that I will not be able to do further development of these
modules. The code is 'end-of-life' for us.


Maybe you find the software useful,
Albert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Distributed computation of jobs (was: Parallel Python)

2007-01-17 Thread Paul Boddie
A.T.Hofkamp wrote:

 Let me add a few cents to the discussion with this announcement:

[Notes about exec_proxy, batchlib and rthread]

I've added entries for these modules, along with py.execnet, to the
parallel processing solutions page on the python.org Wiki:

http://wiki.python.org/moin/ParallelProcessing

Thanks for describing your work to us!

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-13 Thread parallelpython
 Looks interesting, but is there any way to use this for a cluster of
 machines over a network (not smp)?

Networking capabilities will be included in the next release of
Parallel Python software (http://www.parallelpython.com), which is
coming soon.


 Couldn't you just provide similar conveniences on top of MPI? Searching
 for Python MPI yields a lot of existing work (as does Python PVM),
 so perhaps someone has already done so.

Yes, it's possible to do it on top of any environment which
supports IPC.

 That's one more project... It seems that there is significant
 interest in parallel computing in Python. Perhaps we should start a
 special interest group? Not so much in order to work on a single
 project; I believe that at the current state of parallel computing we
 still need many different approaches to be tried. But an exchange of
 experience could well be useful for all of us.
Well, I may just add that everybody is welcome to start discussion
regarding any parallel python project or idea in this forum:
http://www.parallelpython.com/component/option,com_smf/Itemid,29/board,2.0

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
[EMAIL PROTECTED] wrote:

 The main difference between MPI python solutions and ppsmp is that with
 MPI you have to organize both computations
  {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
 data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
 you just submit a function with arguments to the execution server and
 retrieve the results later.

Couldn't you just provide similar conveniences on top of MPI? Searching
for "Python MPI" yields a lot of existing work (as does "Python PVM"),
so perhaps someone has already done so. Also, what about various grid
toolkits?
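
Indeed, a rank-based master/worker split already reads fairly conveniently
in mpi4py, for instance (a sketch; run it under mpiexec with at least two
processes):

from mpi4py import MPI

comm = MPI.COMM_WORLD
if comm.Get_rank() == 0:
    # master: send one work item to each worker, then collect the results
    for w in range(1, comm.Get_size()):
        comm.send(w * 100000, dest=w)
    results = [comm.recv(source=w) for w in range(1, comm.Get_size())]
    print "results:", results
else:
    n = comm.recv(source=0)
    comm.send(sum(xrange(n)), dest=0)   # stand-in for real work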

[...]

 Overall ppsmp is still work in progress and there are other interesting
 features which I would like to implement. This is the main reason why I
 do not open the source of ppsmp - to have better control of its future
 development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)

Despite various probable reactions from people who will claim that
they're comfortable with binary-only products from a single vendor, I
think more people would be inclined to look at your software if you did
distribute the source code, even if they then disregarded what you've
done. My own experience with regard to releasing software is that even
with an open source licence, most people are more likely to ignore your
projects than to suddenly jump on board and take control, and even if
your project somehow struck a chord and attracted a lot of interested
developers, would it really be such a bad thing? Many developers have
different experiences and insights which can only make your project
better, anyway.

Related to your work, I've released a parallel execution solution
called parallel/pprocess [1] under the LGPL and haven't really heard
about anyone really doing anything with it, let alone forking it and
showing my original efforts in a bad light. Perhaps most of the
downloaders believe me to be barking up the wrong tree (or just
barking) with the approach I've taken, but I think the best thing is to
abandon any fears of not doing things the best possible way and just be
open to improvements and suggestions.

Paul

[1] http://www.python.org/pypi/parallel

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Nick Maclaren

In article [EMAIL PROTECTED],
Paul Boddie [EMAIL PROTECTED] writes:
| [EMAIL PROTECTED] wrote:
| 
|  The main difference between MPI python solutions and ppsmp is that with
|  MPI you have to organize both computations
|  {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
|  data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
|  you just submit a function with arguments to the execution server and
|  retrieve the results later.
| 
| Couldn't you just provide similar conveniences on top of MPI? Searching
| for "Python MPI" yields a lot of existing work (as does "Python PVM"),
| so perhaps someone has already done so. 

Yes.  No problem.

| Also, what about various grid toolkits?

If you can find one that is robust enough for real work by someone who
is not deeply into developing Grid software, I will be amazed.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread robert
Paul Boddie wrote:
 [EMAIL PROTECTED] wrote:
 The main difference between MPI python solutions and ppsmp is that with
 MPI you have to organize both computations
  {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
 data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
 you just submit a function with arguments to the execution server and
 retrieve the results later.
 
 Couldn't you just provide similar conveniences on top of MPI? Searching
  for "Python MPI" yields a lot of existing work (as does "Python PVM"),
 so perhaps someone has already done so. Also, what about various grid
 toolkits?
 
 [...]
 
 Overall ppsmp is still work in progress and there are other interesting
 features which I would like to implement. This is the main reason why I
 do not open the source of ppsmp - to have better control of its future
 development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)
 
 Despite various probable reactions from people who will claim that
 they're comfortable with binary-only products from a single vendor, I
 think more people would be inclined to look at your software if you did
 distribute the source code, even if they then disregarded what you've
 done. My own experience with regard to releasing software is that even
  with an open source licence, most people are more likely to ignore your
 projects than to suddenly jump on board and take control, and even if
 your project somehow struck a chord and attracted a lot of interested
 developers, would it really be such a bad thing? Many developers have
 different experiences and insights which can only make your project
 better, anyway.
 
 Related to your work, I've released a parallel execution solution
 called parallel/pprocess [1] under the LGPL and haven't really heard
 about anyone really doing anything with it, let alone forking it and
 showing my original efforts in a bad light. Perhaps most of the
 downloaders believe me to be barking up the wrong tree (or just
 barking) with the approach I've taken, but I think the best thing is to
 abandon any fears of not doing things the best possible way and just be
 open to improvements and suggestions.
 
 Paul
 
 [1] http://www.python.org/pypi/parallel

I'd be interested in an overview.
For ease of use a major criterion for me would be a pure python 
solution, which also does the job of starting and controlling the 
other process(es) automatically right (by default) on common 
platforms.
Which of the existing (RPC) solutions are that nice?


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Neal Becker
[EMAIL PROTECTED] wrote:

 Has anybody tried to run parallel python applications?
 It appears that if your application is computation-bound using 'thread'
 or 'threading' modules will not get you any speedup. That is because
 python interpreter uses GIL (Global Interpreter Lock) for internal
 bookkeeping. The latter allows only one python byte-code instruction to
 be executed at a time even if you have a multiprocessor computer.
 To overcome this limitation, I've created ppsmp module:
 http://www.parallelpython.com
 It provides an easy way to run parallel python applications on smp
 computers.
 I would appreciate any comments/suggestions regarding it.
 Thank you!
 

Looks interesting, but is there any way to use this for a cluster of
machines over a network (not smp)?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
robert wrote:
 Paul Boddie wrote:
 
  [1] http://www.python.org/pypi/parallel

 I'd be interested in an overview.

I think we've briefly discussed the above solution before, and I don't
think you're too enthusiastic about anything using interprocess
communication, which is what the above solution uses. Moreover, it's
intended as a threading replacement for SMP/multicore architectures
where one actually gets parallel execution (since it uses processes).

 For ease of use a major criterion for me would be a pure python
 solution, which also does the job of starting and controlling the
 other process(es) automatically right (by default) on common
 platforms.
 Which of the existing (RPC) solutions are that nice?

Many people have nice things to say about Pyro, and there seem to be
various modules attempting parallel processing, or at least some kind
of job control, using that technology. See Konrad Hinsen's
ScientificPython solution for an example of this - I'm sure I've seen
others, too.

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Konrad Hinsen
On Jan 12, 2007, at 11:21, Paul Boddie wrote:

 done. My own experience with regard to releasing software is that even
 with an open source licence, most people are more likely to ignore your
 projects than to suddenly jump on board and take control, and even if

My experience is exactly the same. And looking into the big world of  
Open Source programs, the only case I ever heard of in which a  
project was forked by someone else is the Emacs/XEmacs split. I'd be  
happy if any of my projects ever reached that level of interest.

 Related to your work, I've released a parallel execution solution
 called parallel/pprocess [1] under the LGPL and haven't really heard
 about anyone really doing anything with it, let alone forking it and

That's one more project... It seems that there is significant  
interest in parallel computing in Python. Perhaps we should start a  
special interest group? Not so much in order to work on a single  
project; I believe that at the current state of parallel computing we  
still need many different approaches to be tried. But an exchange of  
experience could well be useful for all of us.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
Konrad Hinsen wrote:

 That's one more project... It seems that there is significant
 interest in parallel computing in Python. Perhaps we should start a
 special interest group? Not so much in order to work on a single
 project; I believe that at the current state of parallel computing we
 still need many different approaches to be tried. But an exchange of
 experience could well be useful for all of us.

I think a special interest group might be productive, but I've seen
varying levels of special interest in the different mailing lists
associated with such groups: the Web-SIG list started with enthusiasm,
produced a cascade of messages around WSGI, then dried up; the XML-SIG
list seems to be a sorry indication of how Python's XML scene has
drifted onto other matters; other such groups have also lost their
momentum.

It seems to me that a more useful first step would be to create an
overview of the different modules and put it on the python.org Wiki:

http://wiki.python.org/moin/FrontPage
http://wiki.python.org/moin/UsefulModules (a reasonable entry point)

If no-one beats me to it, I may write something up over the weekend.

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Konrad Hinsen
On Jan 12, 2007, at 15:08, Paul Boddie wrote:

 It seems to me that a more useful first step would be to create an
 overview of the different modules and put it on the python.org Wiki:

 http://wiki.python.org/moin/FrontPage
 http://wiki.python.org/moin/UsefulModules (a reasonable entry point)

 If no-one beats me to it, I may write something up over the weekend.

That sounds like a good idea. I won't beat you to it, but I'll have a  
look next week and perhaps add information that I have.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread mheslep

Konrad Hinsen wrote:
 Perhaps we should start a
 special interest group? Not so much in order to work on a single
 project; I believe that at the current state of parallel computing we
 still need many different approaches to be tried. But an exchange of
 experience could well be useful for all of us.
 
+ 1

-Mark

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
sturlamolden wrote:
 Nick Maclaren wrote:
 
 I wonder if too much emphasis is put on thread programming these days.
 Threads may be nice for programming web servers and the like, but not
 for numerical computing. Reading books about thread programming, one
 can easily get the impression that it is 'the' way to parallelize
 numerical tasks on computers with multiple CPUs (or multiple CPU


Most threads on this planet are not used for number crunching jobs, but for 
organization of execution.

Also, if one wants to exploit the speed of upcoming multi-core CPUs for all 
kinds of fine-grained programs, things need fast fine-grained communication - 
and, most important, huge data trees in memory have to be shared effectively.
CPU frequencies will not grow anymore in the future, but we will see 
multi-cores/SMP. How to exploit them in a manner as if we had really faster 
CPUs: threads and thread-like techniques.

Things like MPI and IPC are just for the area of small message, big job - 
typically scientific number crunching, where you collect the results at the 
end of the day. It's more of a slow network technique.

A most challenging example of this is probably games - not to discuss 
gaming here, but as a tech example to the point: would you do MPI, RPC, etc. 
while 30fps 3D and real-time physics simulation are going on?


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
Nick Maclaren wrote:
 In article [EMAIL PROTECTED],
 Paul Rubin http://[EMAIL PROTECTED] writes:
 |
 |  Yes, I know that it is a bit Irish for the best way to use a shared
 |  memory system to be to not share memory, but that's how it is.
 | 
 | But I thought serious MPI implementations use shared memory if they
 | can.  That's the beauty of it, you can run your application on SMP
 | processors getting the benefit of shared memory, or split it across
 | multiple machines using ethernet or infiniband or whatever, without
 | having to change the app code.
 
 They use it for the communication, but don't expose it to the
 programmer.  It is therefore easy to put the processes on different
 CPUs, and get the memory consistency right.
 

Thus communicated data is serialized - not directly used as with threads or 
with custom shared memory techniques like POSH object sharing.


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Sergei Organov
[EMAIL PROTECTED] (Nick Maclaren) writes:

 In article [EMAIL PROTECTED],
 Sergei Organov [EMAIL PROTECTED] writes:
 | 
 | Do you mean that POSIX threads are inherently designed and implemented
 | to stay idle most of the time?! If so, I'm afraid those guys that
 | designed POSIX threads won't agree with you. In particular, as far as I
 | remember, David R. Butenhof said a few times in comp.programming.threads
 | that POSIX threads were primarily designed to meet parallel programming
 | needs on SMP, or at least that was how I understood him.

 I do mean that, and I know that they don't agree.  However, the word
 designed doesn't really make a lot of sense for POSIX threads - the
 one I tend to use is perpetrated.

OK, then I don't think the POSIX threads were perpetrated to be idle
most of the time.

 The people who put the specification together were either unaware of
 most of the experience of the previous 30 years, or chose to ignore it.
 In particular, in this context, the importance of being able to control
 the scheduling was well-known, as was the fact that it is NOT possible
 to mix processes with different scheduling models on the same set of
 CPUs.  POSIX's facilities are completely hopeless for that purpose, and
 most of the systems I have used effectively ignore them.

I won't argue that. On the other hand, POSIX threads capabilities in the
field of I/O-bound and real-time threads are also limited, and that's
where the threads that are idle most of the time idiom comes from, I
think. What I argue is that POSIX threads were no more perpetrated to
support I/O-bound or real-time apps than to support parallel
calculations apps. Besides, pthreads real-time extensions came later
than pthreads themselves.

What I do see is that Microsoft designed their system so that it's
almost impossible to implement an interactive application without using
threads, and that fact leads to the current situation where threads are
considered to be beasts that are sleeping most of the time.

 I could go on at great length, and the performance aspects are not even
 the worst aspect of POSIX threads.  The fact that there is no usable
 memory model, and the synchronisation depends on C to handle the
 low-level consistency, but there are no CONCEPTS in common between
 POSIX and C's memory consistency 'specifications' is perhaps the worst.

I won't argue that either. However, I don't see how that makes POSIX
threads perpetrated to be idle most of the time.

 That is why many POSIX threads programs work until the genuinely
 shared memory accesses become frequent enough that you get some to the
 same location in a single machine cycle.

Sorry, I don't understand. Are you saying that it's inherently
impossible to write an application that uses POSIX threads and that
doesn't have bugs accessing shared state? I thought that pthreads
mutexes guarantee sequential access to shared data. Or do you mean
something entirely different? Lock-free algorithms maybe?

-- Sergei.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article [EMAIL PROTECTED],
Sergei Organov [EMAIL PROTECTED] writes:
| 
| OK, then I don't think the POSIX threads were perpetrated to be idle
| most of the time.

Perhaps I was being unclear.  I should have added In the case where
there are more threads per system than CPUs per system.  The reasons
are extremely obscure and are to do with the scheduling, memory access
and communication.

I am in full agreement that the above effect was not INTENDED.

|  That is why many POSIX threads programs work until the genuinely
|  shared memory accesses become frequent enough that you get some to the
|  same location in a single machine cycle.
| 
| Sorry, I don't understand. Are you saying that it's inherently
| impossible to write an application that uses POSIX threads and that
| doesn't have bugs accessing shared state? I thought that pthreads
| mutexes guarantee sequential access to shared data. Or do you mean
| something entirely different? Lock-free algorithms maybe?

I mean precisely the first.

The C99 standard uses a bizarre consistency model, which requires serial
execution, and its consistency is defined in terms of only volatile
objects and external I/O.  Any form of memory access, signalling or
whatever is outside that, and is undefined behaviour.

POSIX uses a different but equally bizarre one, based on some function
calls being thread-safe and others forcing consistency (which is
not actually defined, and there are many possible, incompatible,
interpretations).  It leaves all language aspects (including allowed
code movement) to C.

There are no concepts in common between C's and POSIX's consistency
specifications (even when they are precise enough to use), and so no
way of mapping the two standards together.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article [EMAIL PROTECTED],
robert [EMAIL PROTECTED] writes:
| 
| Most threads on this planet are not used for number crunching jobs,
| but for organization of execution.

That is true, and it is effectively what POSIX and Microsoft threads
are suitable for.  With reservations, even there.

| Things like MPI, IPC are just for the area of small message, big job
| - typically sci number crunching, where you collect the results at
| the end of day. Its more a slow network technique.

That is completely false.  Most dedicated HPC systems use MPI for high
levels of message passing over high-speed networks.

|  They use it for the communication, but don't expose it to the
|  programmer.  It is therefore easy to put the processes on different
|  CPUs, and get the memory consistency right.
| 
| Thus communicated data is serialized - not directly used as with
| threads or with custom shared memory techniques like POSH object
| sharing.

It is not used as directly with threads as you might think.  Even
POSIX and Microsoft threads require synchronisation primitives, and
threading models like OpenMP and BSP have explicit control.

Also, MPI has asynchronous (non-blocking) communication.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread sturlamolden

robert wrote:

 Thus communicated data is serialized - not directly used as with threads or 
 with custom shared memory techniques like POSH object sharing.

Correct, and that is precisely why MPI code is a lot easier to write
and debug than thread code. The OP used a similar technique in his
'parallel python' project.

This does not mean that MPI is inherently slower than threads, however,
as there is overhead associated with thread synchronization as well.
With 'shared memory' between threads, a lot more fine-grained
synchronization and scheduling is needed, which impairs performance and
often introduces obscure bugs.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Sergei Organov
[EMAIL PROTECTED] (Nick Maclaren) writes:
[...]
 I mean precisely the first.

 The C99 standard uses a bizarre consistency model, which requires serial
 execution, and its consistency is defined in terms of only volatile
 objects and external I/O.  Any form of memory access, signalling or
 whatever is outside that, and is undefined behaviour.

 POSIX uses a different but equally bizarre one, based on some function
 calls being thread-safe and others forcing consistency (which is
 not actually defined, and there are many possible, incompatible,
 interpretations).  It leaves all language aspects (including allowed
 code movement) to C.

 There are no concepts in common between C's and POSIX's consistency
 specifications (even when they are precise enough to use), and so no
 way of mapping the two standards together.

Ah, now I see what you mean. Even though I only partly agree with what
you've said above, I'll stop arguing as it gets too off-topic for this
group.

Thank you for explanations.

-- Sergei.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
sturlamolden wrote:
 robert wrote:
 
 Thus communicated data is serialized - not directly used as with threads 
 or with custom shared memory techniques like POSH object sharing.
 
 Correct, and that is precisely why MPI code is a lot easier to write
 and debug than thread code. The OP used a similar technique in his
 'parallel python' project.

Thus there are different levels of parallelization:

1 file/database based; multiple batch jobs
2 Message Passing, IPC, RPC, ...
3 Object Sharing 
4 Sharing of global data space (Threads)
5 Local parallelism / Vector computing, MMX, 3DNow,...

There are good reasons for all of these levels.
Yet parallel python to me pretends to be on level 3 or 4 (or even 5 :-) ), while 
it's just a level 2 system, where passing, remote, inter-process ... are 
the right vocabulary.

With all these fakes popping up, a GIL-free CPython is a major feature request 
for Py3K - a name at least promising to run on 3rd-millennium CPUs ...


 This does not mean that MPI is inherently slower than threads however,
 as there are overhead associated with thread synchronization as well.

Level 2 communication is slower; just for selected apps it won't matter a lot.

 With 'shared memory' between threads, a lot more fine grained
 synchronization ans scheduling is needed, which impair performance and
 often introduce obscure bugs.

It's a question of chances, costs and the nature of the application.
Yet one can easily restrict inter-thread communication to be as simple and 
modular as IPC, or even simpler. Search e.g. for Python CallQueue and 
BackgroundCall on Google.
Thread programming is less complicated than it seems. (It's just that Python's 
stdlib offers cumbersome 'non-functional' classes.)
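
For illustration, the basic idea behind such a background-call helper fits
in a few lines (a generic sketch, not the actual modules mentioned above):

import threading

def background_call(func, *args):
    # run func(*args) in a thread; the returned handle blocks on call
    # until the computation has finished and then yields its value
    box = {}
    done = threading.Event()
    def runner():
        box["value"] = func(*args)
        done.set()
    threading.Thread(target=runner).start()
    def handle():
        done.wait()
        return box["value"]
    return handle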


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article [EMAIL PROTECTED],
robert [EMAIL PROTECTED] writes:
| 
| Thus there are different levels of parallelization:
| 
| 1 file/database based; multiple batch jobs
| 2 Message Passing, IPC, RPC, ...
| 3 Object Sharing 
| 4 Sharing of global data space (Threads)
| 5 Local parallelism / Vector computing, MMX, 3DNow,...
| 
| There are good reasons for all of these levels.

Well, yes, but to call them levels is misleading, as they are closer
to communication methods of a comparable level.

|  This does not mean that MPI is inherently slower than threads however,
|  as there are overhead associated with thread synchronization as well.
| 
| level 2 communication is slower. Just for selected apps it won't matter a 
lot.

That is false.  It used to be true, but that was a long time ago.  The
reasons why what seems to be a more heavyweight mechanism (message
passing) can be faster than an apparently lightweight one (data sharing)
are both subtle and complicated.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Konrad Hinsen
On Jan 8, 2007, at 11:33, Duncan Booth wrote:

 The 'parallel python' site seems very sparse on the details of how  
 it is
 implemented but it looks like all it is doing is spawning some  
 subprocesses
 and using some simple ipc to pass details of the calls and results.  
 I can't
 tell from reading it what it is supposed to add over any of the other
 systems which do the same.

 Combined with the closed source 'no redistribution' license I can't  
 really
 see anyone using it.

I'd also like to see more details - even though I'd probably never  
use any Python module distributed in .pyc form only.

 From the bit of information there is on the Web site, the  
distribution strategy looks quite similar to my own master-slave  
distribution model (based on Pyro) which is part of ScientificPython.  
There is an example at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=08361040f00a;file=Examples/master_slave_demo.py

and the code itself can be consulted at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=bce321680116;file=Scientific/DistributedComputing/MasterSlave.py


The main difference seems to be that my implementation doesn't start  
compute jobs itself; it leaves it to the user to start any number he  
wants by any means that works for his setup, but it allows a lot of  
flexibility. In particular, it can work with a variable number of  
slave jobs and even handles disappearing slave jobs gracefully.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread parallelpython
sturlamolden wrote:
 [EMAIL PROTECTED] wrote:

 That's right. ppsmp starts multiple interpreters in separate
  processes and organizes communication between them through IPC.

 Thus you are basically reinventing MPI.

 http://mpi4py.scipy.org/
 http://en.wikipedia.org/wiki/Message_Passing_Interface

Thanks for bringing that into consideration.

I am well aware of MPI and have written several programs in C/C++ and
Fortran which use it.
I would agree that MPI is the most common solution to run software on a
cluster (computers connected by network). Although there is another
parallelization approach: PVM (Parallel Virtual Machine)
http://www.csm.ornl.gov/pvm/pvm_home.html. I would say ppsmp is more
similar to the latter.

By the way there are links to different python parallelization
techniques (including MPI) from PP site:
http://www.parallelpython.com/component/option,com_weblinks/catid,14/Itemid,23/

The main difference between MPI python solutions and ppsmp is that with
MPI you have to organize both computations
{MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
you just submit a function with arguments to the execution server and
retrieve the results later.
That makes transition from serial python software to parallel much
simpler with ppsmp than with MPI.

To make this point clearer here is a short example:
------ serial code (2 lines) ------
for input in inputs:
    print "Sum of primes below", input, "is", sum_primes(input)
----- parallel code (3 lines) -----
jobs = [(input, job_server.submit(sum_primes, (input,), (isprime,),
    ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()
-----------------------------------
In this example parallel execution was added at the cost of 1 line of
code!

The other difference with MPI is that ppsmp dynamically decides where
to run each given job. For example, if there are other active processes
running in the system, ppsmp will make greater use of the
processors which are free. Since in MPI the whole task is usually
divided between processors equally at the beginning, the overall
runtime will be determined by the slowest-running process (the one
which shares a processor with another running program). In this
particular case ppsmp will outperform MPI.

The third, probably less important, difference is that with MPI based
parallel python code you must have MPI installed in the system.

Overall ppsmp is still work in progress and there are other interesting
features which I would like to implement. This is the main reason why I
do not open the source of ppsmp - to have better control of its future
development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)

Best regards,
Vitalii

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread parallelpython

 Thus there are different levels of parallelization:

 1 file/database based; multiple batch jobs
 2 Message Passing, IPC, RPC, ...
 3 Object Sharing
 4 Sharing of global data space (Threads)
 5 Local parallelism / Vector computing, MMX, 3DNow,...

 There are good reasons for all of these levels.
  Yet parallel python to me pretends to be on level 3 or 4 (or even 5 :-) ), 
  while it's just a level 2
  system, where passing, remote, inter-process ... are the right vocabulary.
In one of the previous posts I've mentioned that ppsmp is based on
processes + IPC, which makes it a system with level 2 parallelization,
the same level as MPI.
This is also obvious from the fact that it's written completely in python,
as python objects cannot be shared across processes due to the GIL (POSH can
do sharing because it's an extension written in C).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread jairodsl
Hi,

You guys forgot pyMPI, http://pympi.sourceforge.net/ - it works fine !!!
Installation and configuration are a little hard, but it finally works !!!

Cordially,

Jairo Serrano
Bucaramanga, Colombia

[EMAIL PROTECTED] wrote:
 
  Thus there are different levels of parallelization:
 
  1 file/database based; multiple batch jobs
  2 Message Passing, IPC, RPC, ...
  3 Object Sharing
  4 Sharing of global data space (Threads)
  5 Local parallelism / Vector computing, MMX, 3DNow,...
 
  There are good reasons for all of these levels.
  Yet parallel python to me pretends to be on level 3 or 4 (or even 5 :-) ), 
  while it's just a level 2
  system, where passing, remote, inter-process ... are the right 
  vocabulary.
 In one of the previous posts I've mentioned that ppsmp is based on
 processes + IPC, which makes it a system with level 2 parallelization,
  the same level as MPI.
  This is also obvious from the fact that it's written completely in python,
  as python objects cannot be shared across processes due to the GIL (POSH
  can do sharing because it's an extension written in C).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread parallelpython
 I always thought that if you use multiple processes (e.g. os.fork) then
 Python can take advantage of multiple processors. I think the GIL locks
 one processor only. The problem is that one interpreter can be run on
 one processor only. Am I not right? Does your ppm module run the same
 interpreter on multiple processors? That would be very interesting, and
 something new.


 Or does it start multiple interpreters? Another way to do this is to
 start multiple processes and let them communicate through IPC or a local
 network.

   That's right. ppsmp starts multiple interpreters in separate
processes and organizes communication between them through IPC.

   Originally ppsmp was designed to speed up an existing application
which is written in pure python but is quite computationally expensive
(other ways to optimize it were used too). It was also required
that the application run out of the box on the most standard Linux
distributions (they all contain CPython).
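
The processes-plus-IPC idea itself is plain POSIX; a minimal sketch with
one child process and a pickled result sent back over a pipe:

import os, pickle

r, w = os.pipe()
if os.fork() == 0:                    # child: do the work, send it back
    os.close(r)
    result = sum(xrange(10 ** 6))     # stand-in for the real computation
    os.write(w, pickle.dumps(result))
    os._exit(0)
os.close(w)                           # parent: read until EOF, then unpickle
data = ""
while True:
    chunk = os.read(r, 4096)
    if not chunk:
        break
    data += chunk
os.wait()
print "child computed:", pickle.loads(data)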

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread sturlamolden

robert wrote:

 That's true. IPC through sockets or (somewhat faster) shared memory - cPickle 
 at least - is usually the maximum of such approaches.
 See 
 http://groups.google.de/group/comp.lang.python/browse_frm/thread/f822ec289f30b26a

 For tasks really requiring threading one can consider IronPython.
 The most advanced technique I've seen for CPython is POSH: 
 http://poshmodule.sourceforge.net/


In SciPy there is an MPI-binding project, mpi4py.

MPI is becoming the de facto standard for high-performance parallel
computing, both on shared memory systems (SMPs) and clusters. Spawning
threads or processes is not the recommended way to do numerical parallel
computing. Threading makes programming certain tasks more convenient
(particularly GUI and I/O, for which the GIL does not matter anyway),
but is not a good paradigm for dividing CPU-bound computations between
multiple processors. MPI is a high-level API based on a concept of
message passing, which allows the programmer to focus on solving the
problem instead of on irrelevant distractions such as thread management
and synchronization.

Although MPI has standard APIs for C and Fortran, it may be used with
any programming language. For Python, an additional advantage of using
MPI is that the GIL has no practical consequence for performance. The
GIL can lock a process but not prevent MPI from using multiple
processors as MPI is always using multiple processes. For IPC, MPI will
e.g. use shared-memory segments on SMPs and tcp/ip on clusters, but all
these details are hidden.

It seems like 'ppsmp' of parallelpython.com is just a reinvention of a
small portion of MPI.


http://mpi4py.scipy.org/
http://en.wikipedia.org/wiki/Message_Passing_Interface

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread sturlamolden

[EMAIL PROTECTED] wrote:

That's right. ppsmp starts multiple interpreters in separate
 processes and organizes communication between them through IPC.

Thus you are basically reinventing MPI.


http://mpi4py.scipy.org/
http://en.wikipedia.org/wiki/Message_Passing_Interface

-- 
http://mail.python.org/mailman/listinfo/python-list

