Re: [Python-Dev] C99
Guido van Rossum wrote:
> This feels close to a code of conduct violation. This kind of language
> may be okay on the Linux kernel list, but I don't see the point of it
> here.

Sorry, I should have found a more diplomatic formulation. But the
principle remains: build problems arising from a missing Xcode
installation should not be CPython's problem. It is OK to assume that
Xcode is always installed when CPython is built on OSX.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] C99
"Stephen J. Turnbull" wrote:
> I may be talking through my hat here, but Apple has been using LLVM
> for several major releases now. They seem to be keeping the GCC
> frontend stuck at 4.2.1, though. So just because we've been using GCC
> 4.2.1 on Mac, including on buildbots, doesn't mean that there is no
> C99 compiler available on Macs -- it's quite possible that the
> available clang frontend does already support the C99 features we
> would use. I suppose that might mean fiddling with the installer
> build process as well as the buildbots.

On OSX 10.8 and earlier, the default CC is llvm-gcc-4.2.1, available as
the gcc command. clang is also installed, so we can always

    $ export CC=clang
    $ export CXX=clang++

to remedy the problem. On OSX 10.9 and later, gcc is just a symlink to
clang.

Xcode must be installed to build anything on Mac. It is not optional.
Users who need to build Python without installing Xcode need to fix
their heads, because that is where their problem resides. There is no
remedy for stubbornness at the level of stupidity. Either they install
Xcode or they don't get to build anything.

Sturla
Re: [Python-Dev] C99
Ned Deily <n...@python.org> wrote:
> On Aug 6, 2016, at 01:16, Stephen J. Turnbull
> <turnbull.stephen...@u.tsukuba.ac.jp> wrote:
>> I may be talking through my hat here, but Apple has been using LLVM
>> for several major releases now. They seem to be keeping the GCC
>> frontend stuck at 4.2.1, though. So just because we've been using GCC
>> 4.2.1 on Mac, including on buildbots, doesn't mean that there is no
>> C99 compiler available on Macs -- it's quite possible that the
>> available clang frontend does already support the C99 features we
>> would use. I suppose that might mean fiddling with the installer
>> build process as well as the buildbots.
>
> Sorry, I wasn't clear; I did not mean to imply there is no C99 compiler
> available from Apple for OS X. On current OS X releases, clang is the
> default and only supported compiler. I was bringing up the example of
> the impact on building on older releases with the supported build tools
> for those releases where clang is either not available or was too
> immature to be usable. As I said, there are a number of solutions to
> that problem - building on newer systems with deployment targets,
> installing third-party compilers, etc.

Clang is also available (and installed) on OSX 10.8 and earlier,
although gcc 4.2.1 is the default frontend to LLVM. The easiest solution
to get C99 on those platforms is

    $ export CC=clang

Not very difficult, and highly recommended.

Sturla Molden
Re: [Python-Dev] C99
Matthias Klose wrote:
> GCC 5 and GCC 6 default to C11 (-std=gnu11), does the restriction to
> C99 mean that -std=gnu99 should be passed explicitly?

Also note that -std=c99 is not the same as -std=gnu99. The latter allows
GNU extensions like computed goto. Does the interpreter depend on those?
(Presumably it could be a benefit.)

Sturla
Re: [Python-Dev] C99
Nathaniel Smith wrote:
> No-one's proposing to use C99 indiscriminately;
> There's no chance that CPython is going to drop MSVC support in 3.6.

Stinner was proposing that by saying "Is it worth to support a compiler
that in 2016 doesn't support the C standard released in 1999, 17 years
ago?" As I read it, this is basically a suggestion to drop MSVC support.
That is never going to happen.

Sturla
Re: [Python-Dev] C99
Victor Stinner wrote:
> Is it worth to support a compiler that in 2016 doesn't support the C
> standard released in 1999, 17 years ago?

MSVC only supports C99 where it is needed for C++11 or some MS extension
to C. Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC
are the viable options on Windows (and perhaps Embarcadero, but I
haven't used C++ Builder for a very long time). Even MinGW does not
fully support C99, because it depends on Microsoft's CRT. If we think
MSVC and MinGW are worth supporting, we cannot just use C99
indiscriminately.
Re: [Python-Dev] C99
Guido van Rossum wrote:
> I'm not sure I meant that. But if I have a 3rd party extension that
> compiles with 3.5 headers using C89, then it should still compile with
> 3.6 headers using C99. Also if I compile it for 3.5 and it only uses
> the ABI it should still be linkable with 3.6.

Ok, but if third-party developers shall be free to use a C89 compiler
for their own code, we cannot have C99 in the include files. Otherwise
the include files will taint the C89 purity of their source code.

Personally I don't think we need to worry about compilers that don't
implement C99 features like inline functions in C. How long has the
Linux kernel used inline functions instead of macros? 20 years or more?

Sturla
Re: [Python-Dev] C99
wrote:
> I share Guido's priority there - source compatibility is more important
> than smoothing a few of C's rough edges. Maybe this should be
> considered for the next breaking-change release (Python 4000... Python
> 5000?)

I was simply pointing out that Guido's priority removes a lot of the
usefulness of C99 at the source level. I was not saying I disagreed. If
we have to keep the header files clean of C99, I think this proposal
just adds clutter.
Re: [Python-Dev] C99
Guido van Rossum <gu...@python.org> wrote:
> I'm talking about 3rd party extensions. Those may require source
> compatibility with older Python versions. All I'm asking for is to not
> require source-level use of C99 features.

This of course removes a lot of its usefulness. E.g. macros cannot be
replaced by inline functions, as header files must still be plain C89.

Sturla Molden
Re: [Python-Dev] Is there a reference manual for Python bytecode?
Brett Cannon wrote:
> Ned also neglected to mention his byterun project which is a pure
> Python implementation of the CPython eval loop:
> https://github.com/nedbat/byterun

I would also encourage you to take a look at Numba. It is an LLVM-based
JIT compiler for Python bytecode, written for hardcore numerical
algorithms in Python. It can often achieve the same performance as -O2
in C after a short burn-in while inferring the types of the arguments
and variables. Using it is mostly as easy as adding an @numba.jit
decorator to the function we want to accelerate. Numba is rapidly
becoming what Google's long-dead swallow should have been. :-)

Sturla
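On the thread's original question, the stdlib dis module is the closest
thing to a bytecode reference in practice. A minimal example (the exact
opcode names vary between CPython versions, which is one reason there is
no official cross-version manual):

```python
import dis

def add_one(x):
    return x + 1

# Human-readable disassembly; output differs across CPython versions.
dis.dis(add_one)

# Programmatic access to the instruction stream:
ops = [ins.opname for ins in dis.get_instructions(add_one)]
print(ops)
```

On any recent CPython the list ends with a RETURN_VALUE instruction,
though the arithmetic opcode in the middle has changed names over the
years (BINARY_ADD vs. BINARY_OP).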
Re: [Python-Dev] Computed Goto dispatch for Python 2
On 28/05/15 21:37, Chris Barker wrote:
> I think it's great for it to be used by end users as a system library /
> utility, i.e. like you would the system libc -- so if you can write a
> little python script that only uses the stdlib, you can simply deliver
> that script.

No it is not, because someone will be 'clever' and try to upgrade it
with pip or install packages into it.

Sturla
Re: [Python-Dev] Computed Goto dispatch for Python 2
Donald Stufft don...@stufft.io wrote:
> Honestly, I'm on an OS that *does* ship Python (OS X) and part of me
> hopes that they stop shipping it. It's very rare that someone ships
> Python as part of their OS without modifying it in some way, and those
> modifications almost always cause pain to some set of users (and since
> I work on pip, they tend to come to us with the weirdo problems). Case
> in point: Python on OS X adds some preinstalled software, but they put
> this pre-installed software before site-packages in sys.path, so pip
> can't upgrade those pre-installed software packages at all.

Many Unix tools need Python, so Mac OS X (like Linux distros and
FreeBSD) will always need a system Python. Yes, it would be great if it
could be called spython or something else than python. But the main
problem is that it is used by end users as well, not just the operating
system.

Anyone who uses Python on OSX should install their own Python. The
system Python should be left alone as it is. If the system Python needs
updating, it is the responsibility of Apple to distribute the upgrade.
Nobody should attempt to use pip to update the system Python; who knows
what side-effects it might have. Preferably pip should have a check for
it and bluntly refuse to do it.

Sturla
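Such a guard could be as simple as a prefix check. A hypothetical sketch
(pip has no such check; the framework path below is where Apple's system
Python actually lives on OS X):

```python
import sys

# Hypothetical guard, not real pip behavior.  Apple's system Python is
# installed under this framework prefix on OS X.
APPLE_SYSTEM_PREFIX = "/System/Library/Frameworks/Python.framework"

def refuse_system_python(prefix=None):
    """Return True (i.e. refuse to install) for Apple's system Python."""
    prefix = sys.prefix if prefix is None else prefix
    return prefix.startswith(APPLE_SYSTEM_PREFIX)

print(refuse_system_python(
    "/System/Library/Frameworks/Python.framework/Versions/2.7"))  # True
print(refuse_system_python("/usr/local/opt/python"))              # False
```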
Re: [Python-Dev] Status of C compilers for Python on Windows
Antoine Pitrou solip...@pitrou.net wrote:
> But you can compile OpenBLAS with one compiler and then link it to
> Python using another compiler, right? There is a single C ABI.

BLAS and LAPACK are actually Fortran, which does not have a single C
ABI. The ABI depends on the Fortran compiler; g77 and gfortran will
produce different C ABIs. This is a consistent source of PITA in any
scientific programming that combines C and Fortran. There is cblas,
though, which is a C API, but it does not include LAPACK.

Another thing is that the libraries are different. MSVC wants a .lib
file, but MinGW produces .a files like GCC does on Linux. Perhaps you
can generate a .lib file from a .a file, but I have never tried.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Sturla Molden sturla.mol...@gmail.com wrote:
> BLAS and LAPACK are actually Fortran, which does not have a single C
> ABI. The ABI depends on the Fortran compiler. g77 and gfortran will
> produce different C ABIs. This is a consistent source of PITA in any
> scientific programming that combines C and Fortran. There is cblas
> though, which is a C API, but it does not include LAPACK. Another thing
> is that libraries are different. MSVC wants a .lib file, but MinGW
> produces .a files like GCC does on Linux. Perhaps you can generate a
> .lib file from a .a file, but I have never tried.

And not to mention that the Fortran runtime depends on the C runtime...
What Carl Kleffner did for SciPy was to use a static libgfortran, which
is not linked against any specific CRT, so it could be linked with
msvcr90.dll when the Python extension is built. The vanilla
libgfortran.dll from MinGW is linked with msvcrt.dll. However, not
linking with msvcrt.dll broke the pthreads library, which in turn broke
OpenMP, so he had to patch the pthreads library for this...

This just shows some of the difficulties of trying to combine the GNU
and Microsoft compilers. There are many others, like different stack
alignment, different exception handling, and the MinGW runtime (which
causes segfaults when linked dynamically into MSVC executables). It's
not just getting the CRT right.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Antoine Pitrou solip...@pitrou.net wrote:
> It sounds like whatever MSVC produces should be the de facto standard
> under Windows.

Yes, and that is what Clang does on Windows. It is not as usable as
MinGW yet, but soon it will be. Clang also suffers from the lack of a
Fortran compiler, though.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Steve Dower steve.do...@microsoft.com wrote:
> Is there some reason the Fortran part can't be separated out into a
> DLL?

DLL hell, I assume. Using the Python extension module loader makes it
less of a problem. If we stick with .pyd files where everything is
statically linked, we can rely on the Python dev team to make sure that
DLL hell does not bite us.

Most of the contributors to projects like NumPy and SciPy are not
computer scientists. So the KISS principle is important, which is why
scientific programmers often use Fortran in the first place. Making sure
DLLs are resolved and loaded correctly, or using stuff like COM or .NET
to mitigate DLL hell, is just in a different league. That is for
computer engineers to take care of, but we are trained as physicists,
mathematicians, astronomers, chemists, biologists, or whatever... I am
sure that engineers at Microsoft could do this correctly, but we are not
the kind of guys you would hire :-)

OT: Contrary to common belief, there is no speed advantage to using
Fortran on a modern CPU, because the long pipeline and the hierarchical
memory alleviate the problems with pointer aliasing. C code tends to run
faster than Fortran, often 10 to 20 % faster, and C++ tends to be
slightly faster than C. In 2014, Fortran is only used because it is
easier to program for non-specialists. And besides, correctness is far
more important than speed, which is why we prefer Python or MATLAB in
the first place. If you ever see the argument that Fortran is used
because of pointer aliasing, please feel free to ignore it.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Nathaniel Smith n...@pobox.com wrote:
> You may want to get in touch with Carl Kleffner -- he's done a bunch of
> work lately on getting a mingw-based toolchain to the point where it
> can build numpy and scipy.

To build *Python extensions*, one can use Carl's toolchain or the VC9
compiler for Python 2.7 that Microsoft just released. To build *Python*
you need Visual Studio, Visual Studio Express, the Windows SDK, or
Cygwin, because there is no other build process available on Windows.
Python cannot be built with MinGW.

The official 64-bit Python installer from Python.org is built with the
Windows SDK compiler, not Visual Studio. The Windows SDK is a free
download. The 32-bit installer is built with Visual Studio.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Paul Moore p.f.mo...@gmail.com wrote:
> Having said that, I'm personally not interested in this, as I am happy
> with MSVC Express.

Python 3.5 will be using MSVC 14, where the Express edition supports
both 32 and 64 bit. If you build Python yourself, you can (more or less)
use whichever version of Visual Studio you want. There is nothing that
prevents you from building Python 2.7 or 3.4 with MSVC 14. But then you
have to build all Python extensions with this version of Visual Studio
as well.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Larry Hastings la...@hastings.org wrote:
> Just to make something clear that may not be clear to non-Windows
> developers: the C library is implicitly part of the ABI.

MacOS X also has this issue, but it is less known among Mac developers!
There tend to be multiple versions of the C library, one for each SDK
version. If you link the wrong one when building a Python extension it
can crash. For example, if you have a Python built with the 10.6 SDK
(e.g. Enthought) but have Xcode with the 10.9 SDK as default, you need
to build with the flag -mmacosx-version-min=10.6, and for C++ also
-stdlib=libstdc++. Not doing so will cause all sorts of mysterious
errors.

Two other ABI problems on Windows are the stack alignment and the MinGW
runtime: In 32-bit applications, MSVC uses 4-byte stack alignment
whereas MinGW uses 16-byte alignment. This is a common cause of
segfaults for Python extensions built with MinGW; most developers just
assume it is sufficient to link the same CRT as Python. Another problem
is the MinGW runtime (mingw32.a or mingw32.dll), which conflicts with
MSVC and can cause segfaults unless it is statically linked. The vanilla
MinGW distro defaults to dynamic linkage for this library. Because of
this a special MinGW toolchain was created for building SciPy on
Windows:

https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

Sturla
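For the OS X case above, the workaround looks roughly like this when
building an extension (a sketch; adjust the SDK version to match the
Python you are targeting):

```shell
# Building against a Python compiled with the 10.6 SDK on a machine
# whose default Xcode SDK is newer, using the flags described above.
export MACOSX_DEPLOYMENT_TARGET=10.6
export CFLAGS="-mmacosx-version-min=10.6"
export CXXFLAGS="-mmacosx-version-min=10.6 -stdlib=libstdc++"  # C++ extensions only
python setup.py build_ext
```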
Re: [Python-Dev] Status of C compilers for Python on Windows
Victor Stinner victor.stin...@gmail.com wrote:
> Is MinGW fully compatible with MSVS ABI? I read that it reuses the
> MSVCRT, but I don't know if it's enough.

Not out of the box. See:
https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Larry Hastings la...@hastings.org wrote:
> So as a practical matter I think I'd prefer if we continued to only
> support MSVC. In fact I'd prefer it if we removed support for other
> Windows compilers, instead asking those maintainers to publish their
> own patches / repos, in the way that Stackless does.

The scientific community needs to use MinGW or Intel compilers because
of Fortran. So some support for other compilers would be good, at least
for building C extensions.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Merlijn van Deen valhall...@arctus.nl wrote:
> VC++ 2008/2010 EE do not *bundle* a 64-bit compiler,

Actually it does, but it is not available from the UI. You can use it
from the command line, though.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Steve Dower steve.do...@microsoft.com wrote:
> I don't have any official confirmation, but my guess would be that the
> 64-bit compilers were omitted from VC 2008 Express to save space
> (bearing in mind that WinXP was the main target at that time, which had
> poor 64-bit support, and very few people cared about building 64-bit
> binaries) and were not available in the IDE for VC 2010 Express by
> mistake. For building extensions, the former is resolved by the package
> at http://aka.ms/vcpython27, and the latter works fine since the 64-bit
> compiler is there, just not exposed in the IDE. Neither of these will
> be an issue with VC14 - 64-bit is far too important these days.

The 64-bit compiler is in VC 2008 Express as well, just not exposed in
the IDE. I know this because when I got the Absoft Fortran compiler I
was told to download VC 2008 Express, because Absoft uses the VC9
linker. And indeed there was a 64-bit compiler in VC 2008 Express, just
not available from the IDE. If I remember correctly, some fiddling with
vcvars.bat was required to turn it on. I never tried to build Python
extensions with it, though.

In the beginning I thought Absoft had given me the wrong product,
because I had ordered a 64-bit Fortran compiler and I knew VC 2008
Express was only 32-bit. But they assured me the 64-bit VC9 compiler was
there as well, and indeed it was.

Sturla
Re: [Python-Dev] Status of C compilers for Python on Windows
Larry Hastings la...@hastings.org wrote:
> CPython doesn't require OpenBLAS. Not that I am not receptive to the
> needs of the numeric community... but, on the other hand, who in the
> hell releases a library with Windows support that doesn't work with
> MSVC?!

It uses AT&T assembly syntax instead of Intel assembly syntax.

Sturla
Re: [Python-Dev] Microsoft Visual C++ Compiler for Python 2.7
Christian Heimes christ...@python.org wrote:
> Is it possible to compile extensions from Python's numerical stack such
> as NumPy, SciPy and SciKit, too?

The official NumPy installer is currently built with VC9, so probably
yes. Other parts of the SciPy stack need a Fortran compiler as well, so
those might be more tricky. The limitation to Fortran 77 is now
considered lifted -- Fortran 90 and later will be allowed -- so g77 is
no longer an option. In practice you will need Intel ifort or a patched
MinGW gfortran.

Because of this the SciPy community has been creating a customized MinGW
toolchain (including gfortran) for building binary wheels on Windows. It
is patched to make sure that e.g. the MinGW runtime does not conflict
with the VC9 code in the official Python 2.7 installer and that
libgfortran uses the correct C runtime. The stack alignment is also
changed to make it VC9 compatible, and there was a customization of the
C++ exception handling. In addition, the MinGW runtime and libgfortran
are statically linked, so there are no extra runtime DLLs to install.

https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

The toolchain also contains a build of OpenBLAS to use as BLAS and
LAPACK when building NumPy and the SciPy stack. Intel MKL or ATLAS might
be preferred though, due to concerns about the maturity of OpenBLAS.

Sturla Molden
Re: [Python-Dev] Microsoft Visual C++ Compiler for Python 2.7
Steve Dower steve.do...@microsoft.com wrote:
> It'll help with the numerical stack, but only a little. The devs
> involved have largely figured it out already and I can't provide a good
> Fortran compiler or BLAS library, which is what they need.

We finally have a MinGW-based toolchain that can be used. Making sure it
was compatible with VC9 was actually worse than most would expect, as
there are subtle incompatibilities between vanilla MinGW and VC9, beyond
just linking the same C runtime DLL. But it was worked out:

https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

As for BLAS, the NumPy/SciPy devs have procured permission from Intel to
use MKL in binary wheels. But there will still be official binaries
linked with a free BLAS library available. Currently we use ATLAS, but
the plan is to use OpenBLAS (successor to GotoBLAS2) when it matures.
OpenBLAS is currently the fastest and most scalable BLAS library
available, actually better than MKL, but it is severely underfunded. It
is not a good situation for the industry that the only open BLAS library
with the performance of MKL is a Chinese student project in HPC. ATLAS
is unfortunately far less performant and scalable.

Apple and Cray solved the problem on their platforms by building
high-performance BLAS and LAPACK libraries into their operating systems
(Apple Accelerate Framework and Cray libsci). But AFAIK, Windows does
not have a BLAS library from Microsoft.

Sturla Molden
Re: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes
Julian Taylor jtaylor.deb...@googlemail.com wrote:
> The problem with this approach is that it is already difficult enough
> to handle memory in numpy.

I would not do this in a way that complicates memory management in
NumPy. I would just replace malloc and free with temporarily cached
versions. From the perspective of NumPy the API should be the same.

> Having a cache that potentially stores gigabytes of memory out of the
> users sight will just make things worse.

Buffers don't need to stay in cache forever, just long enough to allow
reuse within an expression. We are probably talking about delaying the
call to free by just a few microseconds.

We could e.g. have a setup like this:

NumPy thread on malloc:
- tries to grab memory off the internal heap
- calls system malloc on failure

NumPy thread on free:
- returns a buffer to the internal heap
- signals a condition

Background daemonic GC thread:
- wakes after sleeping on the condition
- sleeps for another N microseconds (N = magic number)
- flushes or shrinks the internal heap with system free
- goes back to sleeping on the condition

It can be implemented with the same API as malloc and free, and plugged
directly into the existing NumPy code. We would in total need two
mutexes, one condition variable, a pthread, and a heap.

Sturla
Re: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes
Nathaniel Smith n...@pobox.com wrote:
> The proposal in my initial email requires zero pthreads, and is
> substantially more effective. (Your proposal reduces only the alloc
> overhead for large arrays; mine reduces both alloc and memory access
> overhead for both large and small arrays.)

My suggestion prevents the kernel from zeroing pages in the middle of a
computation, which is an important part. It would also be an
optimization the Python interpreter could benefit from independently of
NumPy, by allowing reuse of allocated memory pages within CPU-bound
portions of the Python code. And no, the method I suggested does not
only work for large arrays.

If we really want to take out the memory access overhead, we need to
consider lazy evaluation. E.g. a context manager that collects a
symbolic expression and triggers evaluation on exit:

    with numpy.accelerate:
        x = expression
        y = expression
        z = expression
    # evaluation of x, y, z happens here

Sturla
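The context-manager idea can be sketched in pure Python with operator
overloading. Everything below is invented for illustration (there is no
numpy.accelerate), and it operates on plain numbers rather than arrays:

```python
# Inside the context manager, arithmetic on Lazy values builds an
# expression tree instead of computing; evaluation happens on exit.
class Lazy:
    def __init__(self, value=None, func=None, args=()):
        self.value, self.func, self.args = value, func, args

    def __add__(self, other):
        return Lazy(func=lambda a, b: a + b, args=(self, _wrap(other)))

    def __mul__(self, other):
        return Lazy(func=lambda a, b: a * b, args=(self, _wrap(other)))

    def evaluate(self):
        if self.func is None:           # leaf node
            return self.value
        return self.func(*(a.evaluate() for a in self.args))

def _wrap(x):
    return x if isinstance(x, Lazy) else Lazy(value=x)

class accelerate:
    """Collect Lazy results; trigger evaluation of all of them on exit."""
    def __init__(self):
        self.results = []
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        for node in self.results:
            node.value, node.func = node.evaluate(), None
        return False

x = Lazy(value=2)
with accelerate() as ctx:
    y = x * 3 + 4          # no arithmetic happens yet
    ctx.results.append(y)
# evaluation happened on exit:
print(y.value)             # 10
```

A real implementation would fuse the collected tree into a single loop
over the arrays, which is where the memory-access savings come from.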
Re: [Python-Dev] Moving Python 3.5 on Windows to a new compiler
Brett Cannon bcan...@gmail.com wrote:
> Nope. A new minor release of Python is a massive undertaking, which is
> why we have saved ourselves the hassle of doing a Python 2.8 or not
> giving a clear signal as to when Python 2.x will end as a language.

Why not just define Python 2.8 as Python 2.7, except with a newer
compiler? I cannot see why that would be a massive undertaking, if
changing the compiler for 2.7 is necessary anyway.

Sturla
Re: [Python-Dev] Moving Python 3.5 on Windows to a new compiler
Brian Curtin br...@python.org wrote:
> Adding features into 3.x is already not enough of a carrot on the stick
> for many users. Intentionally leaving 2.7 on a dead compiler is like
> beating them with the stick.

Those who want to build extensions on Windows will just use MinGW
(currently GCC 4.8.2) instead. NumPy and SciPy are planning a switch to
a GCC-based toolchain with static linkage of the MinGW runtime on
Windows. It is carefully configured to be binary compatible with VS2008
on Python 2.7. The major reason for this is to use gfortran also on
Windows. But the result will be a GCC-based toolchain that anyone can
use to build extensions on Windows.

Sturla
Re: [Python-Dev] Moving Python 3.5 on Windows to a new compiler
Brian Curtin br...@python.org wrote:
> Well we're certainly not going to assume such a thing. I know people do
> that, but many don't (I never have).

If Python 2.7 users are left with a dead compiler on Windows, they will
find a solution. For example, Enthought is already bundling their Python
distribution with GCC 4.8.1 on Windows.

Sturla
Re: [Python-Dev] Moving Python 3.5 on Windows to a new compiler
Eli Bendersky eli...@gmail.com wrote:
> While we're at it, Clang is nearing a stage where it can compile C and
> C++ on Windows *with ABI-compatibility to MSVC* (yes, even C++) -- see
> http://clang.llvm.org/docs/MSVCCompatibility.html for more details.
> Could this help?

Possibly. clang-cl is exciting and I hope distutils will support it one
day. Clang is not as well known among Windows users as it is among users
of Unix (Apple, Linux, FreeBSD, et al.). It would be even better if
Python were bundled with Clang on Windows.

The MinGW-based SciPy toolchain has ABI compatibility with MSVC only for
C (and Fortran), not C++. The differences from vanilla MinGW are mainly
static linkage of the MinGW runtime, different stack alignment (4 bytes
instead of 16), and linking with msvcr90.dll instead of msvcrt.dll.

Sturla
Re: [Python-Dev] Moving Python 3.5 on Windows to a new compiler
Brian Curtin br...@python.org wrote:
>> If Python 2.7 users are left with a dead compiler on Windows, they will find a solution. For example, Enthought is already bundling their Python distribution with gcc 4.8.1 on Windows.
>
> Again, not something I think we should depend on. A lot of people use python.org installers.

I am not talking about changing the python.org installers. Let them remain on VS2008 for Python 2.7. I am only suggesting we make it easier to find a free C compiler compatible with the python.org installers. The NumPy/SciPy dev team has taken on the burden of building a MinGW toolchain configured to be 100% ABI compatible with the python.org installer. I am only suggesting a link to it or something like that, perhaps even hosting it as a separate download. (It is GPL, so anyone can do that.) That way it would be easy to find a compatible C compiler.

We have to consider that VS2008 will be unobtainable abandonware long before the promised Python 2.7 support expires. When that happens, users of Python 2.7 will need to find another compiler to build C extensions. If python.org makes this easier, it would hurt less to have Python 2.7 remain on VS2008 forever.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes
Greg Ewing greg.ew...@canterbury.ac.nz wrote:
>> Julian Taylor wrote:
>> tp_can_elide receives two objects and returns one of three values:
>> * can work inplace, operation is associative
>> * can work inplace but not associative
>> * cannot work inplace
>
> Does it really need to be that complicated? Isn't it sufficient just to ask the object potentially being overwritten whether it's okay to overwrite it?

How can it know this without help from the interpreter?

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes
Nathaniel Smith n...@pobox.com wrote:
> with numpy.accelerate:
>     x = expression
>     y = expression
>     z = expression
>     # evaluation of x,y,z happens here

Using an alternative evaluation engine is indeed another way to optimize execution, which is why projects like numexpr, numba, theano, etc. exist. But this is basically switching to a different language on a different VM. I was not thinking of anything that complicated. Let us focus on what an unmodified CPython can do.

A compound expression with arrays can also be seen as a pipeline. Imagine what would happen if in NumPy 2.0 arithmetic operators returned coroutines instead of temporary arrays. That way an expression could be evaluated chunkwise, and the chunks would be small enough to fit in cache.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
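The pipeline idea above can be sketched with plain Python generators on lists (the helper names `stream`, `add` and `evaluate` are hypothetical; a real NumPy implementation would stream cache-sized array blocks in C):

```python
def stream(seq, chunksize=4):
    """Yield a sequence as a stream of small chunks."""
    for i in range(0, len(seq), chunksize):
        yield seq[i:i + chunksize]

def add(xs, ys):
    """A '+' that returns a lazy stream instead of a full temporary."""
    for cx, cy in zip(xs, ys):
        yield [a + b for a, b in zip(cx, cy)]

def evaluate(xs):
    """Force the pipeline, materializing the result only once."""
    out = []
    for chunk in xs:
        out.extend(chunk)
    return out

a, b, c = [list(range(8)) for _ in range(3)]
# (a + b) + c evaluated chunk by chunk, with no full-size temporaries:
result = evaluate(add(add(stream(a), stream(b)), stream(c)))
```

Each chunk can be sized to fit in L1 or L2 cache, which is the point of the exercise.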
Re: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes
On 05/06/14 22:51, Nathaniel Smith wrote:
> This gets evaluated as:
>     tmp1 = a + b
>     tmp2 = tmp1 + c
>     result = tmp2 / c
> All these temporaries are very expensive. Suppose that a, b, c are arrays with N bytes each, and N is large. For simple arithmetic like this, the costs are dominated by memory access. Allocating an N byte array requires the kernel to clear the memory, which incurs N bytes of memory traffic.

It seems to be the case that a large portion of the run-time in Python code using NumPy can be spent in the kernel zeroing pages (which the kernel does for security reasons). I think this can also be seen as a "malloc problem". It comes about because each new NumPy array starts with a fresh buffer allocated by malloc. Perhaps buffers could be reused?

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
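Buffer reuse can be illustrated with a toy free-list pool (the `BufferPool` class is hypothetical; a real implementation would live in C inside NumPy, handing already-zeroed allocations back out so the kernel never has to clear fresh pages):

```python
class BufferPool:
    """Recycle fixed-size buffers instead of allocating fresh ones."""

    def __init__(self):
        self._free = {}  # size -> list of released buffers

    def acquire(self, size):
        bucket = self._free.get(size)
        if bucket:
            return bucket.pop()    # reuse: no new allocation, no page zeroing
        return bytearray(size)     # fresh allocation (kernel zeroes the pages)

    def release(self, buf):
        self._free.setdefault(len(buf), []).append(buf)

pool = BufferPool()
tmp1 = pool.acquire(1 << 16)   # would hold 'a + b'
pool.release(tmp1)             # the temporary dies...
tmp2 = pool.acquire(1 << 16)   # ...and its buffer is reused for 'tmp1 + c'
```

Here `tmp2` is the very same object as `tmp1`, so the second "temporary" costs nothing to allocate.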
Re: [Python-Dev] Should standard library modules optimize for CPython?
Stefan Behnel stefan...@behnel.de wrote:
> Thus my proposal to compile the modules in CPython with Cython, rather than duplicating their code or making/keeping them CPython specific.

I think reducing the urge to reimplement something in C is a good thing. For algorithmic and numerical code, Numba has already proven that Python can be JIT-compiled to performance comparable to -O2 in C. For non-algorithmic code, the speed determinants are usually outside Python (e.g. the network connection). Numba is becoming what the dead swallow should have been. The question is rather: should the standard library use a JIT compiler like Numba? Cython is great for writing C extensions while avoiding all the details of the Python C API. But for speeding up algorithmic code, Numba is easier to use.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
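The "easier to use" claim is about the programming model: with Numba you decorate a plain Python function instead of writing a C extension. A minimal sketch of that pattern (with a no-op fallback decorator, an assumption added here so the example also runs where Numba is not installed):

```python
try:
    from numba import njit          # JIT-compile the function via LLVM
except ImportError:                 # fallback: plain Python, same semantics
    def njit(func):
        return func

@njit
def harmonic(n):
    # a tight numeric loop -- the kind of code Numba compiles to ~C speed
    acc = 0.0
    for k in range(1, n + 1):
        acc += 1.0 / k
    return acc

h = harmonic(10)
```

The function body stays ordinary Python either way; the decorator is the only change.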
Re: [Python-Dev] Should standard library modules optimize for CPython?
Stefan Behnel stefan...@behnel.de wrote:
> So the argument in favour is mostly a pragmatic one. If you can have 2-5x faster code essentially for free, why not just go for it?

It would be easier if the GIL, or Cython's use of it, were redesigned. Cython just grabs the GIL and holds on to it until it is manually released. The standard library cannot have packages that hold the GIL forever, as a Cython-compiled module would do. Cython has to start sharing access to the GIL the way the interpreter does.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python 2.7.7. on Windows
Mike Miller python-...@mgmiller.net wrote:
> The main rationale given (for not using the standard %ProgramFiles%) has been that the full path to python is too long to type, and ease of use is more important than the security benefits given by following Windows conventions.

C:\Program Files\Python27 contains a space in the path. If you want to randomly break build tools for C extensions, then go ahead and change it.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 465: A dedicated infix operator for matrix multiplication
Björn Lindqvist bjou...@gmail.com wrote:
> import numpy as np
> from numpy.linalg import inv, solve
>
> # Using dot function:
> S = np.dot((np.dot(H, beta) - r).T,
>            np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))
>
> # Using dot method:
> S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
>
> Don't keep your reader hanging! Tell us what the magical variables H, beta, r and V are. And why import solve when you aren't using it? Curious readers that aren't very good at matrix math, like me, should still be able to follow your logic. Even if it is just random data, it's better than nothing!

Perhaps. But you don't need to know matrix multiplication to see that those expressions are not readable. And by extension, you can still imagine that bugs can easily hide in unreadable code. Matrix multiplications are used extensively in everything from engineering to statistics to computer graphics (2D and 3D). This operator will be a good thing for a lot of us.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
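With PEP 465 the same expression would read `S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)`. The operator simply dispatches to a `__matmul__` method, which any class can define -- a minimal pure-Python sketch (a toy `Mat` class, not NumPy):

```python
class Mat:
    """Toy matrix wrapping a list of rows, supporting the @ operator."""

    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # plain triple-loop matrix product
        n, k, m = len(self.rows), len(other.rows), len(other.rows[0])
        return Mat([[sum(self.rows[i][j] * other.rows[j][c] for j in range(k))
                     for c in range(m)]
                    for i in range(n)])

A = Mat([[1, 2], [3, 4]])
B = Mat([[5, 6], [7, 8]])
C = A @ B   # calls A.__matmul__(B)
```

NumPy arrays gained exactly this hook in 1.10, which is what makes the readable spelling above possible.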
Re: [Python-Dev] Pyston: a Python JIT on LLVM
Kevin Modzelewski k...@dropbox.com wrote:
> Using optional type annotations is a really promising strategy and may eventually be added to Pyston, but our primary target right now is unmodified and untyped Python code

What I meant to say is that Numba has already done the boilerplate coding. Even if you use no type annotations, it is already a Python bytecode JIT compiler based on LLVM that is hooked up with CPython. You might have to add optimizations to it, yes, but it has the skeleton for a CPython LLVM-based JIT compiler set up and running. If you provide no type annotations, Numba's autojit decorator will do a data-guided specialization. The types will be inferred from running the code through the CPython interpreter, and then Numba will generate a specialization. This is somewhat similar to the information gathering that GCC does when we run profile-guided optimizations.

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pyston: a Python JIT on LLVM
Kevin Modzelewski k...@dropbox.com wrote:
> Since it's the question that I think most people will inevitably (and rightly) ask, why do we think there's a place for Pyston when there's PyPy and (previously) Unladen Swallow?

Have you seen Numba, the Python JIT that integrates with NumPy? http://numba.pydata.org It uses LLVM to compile Python bytecode. When I have tried it, I tend to get speed comparable to -O2 in C for numerical and algorithmic code. Here is an example, giving a 150 times speed boost to Python: http://stackoverflow.com/questions/21811381/how-to-shove-this-loop-into-numpy/21818591#21818591

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Start writing inlines rather than macros?
Brett Cannon br...@python.org wrote:
> The Visual Studio team has publicly stated they will never support C99, so dropping C89 blindly is going to alienate a big part of our user base unless we switch to C++ instead. I'm fine with trying to pull in C99 features, though, that we can somehow support in a backwards-compatible way with VS.

So you are saying that Python should use the C that Visual Studio supports? I believe Microsoft is not competent to define the C standard. If they cannot provide a compiler, that is their problem. There are plenty of other standard-compliant compilers we can use, including Intel, clang and gcc (MinGW).

Sturla

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Coverity Scan Spotlight Python
Do the numbers add up? 0.005 defects per 1,000 lines of code is one defect in every 200,000 lines of code. However, they also claim that "to date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects -- 860 of which have been fixed by the Python community."

Sturla

Sent from my iPad

On 30 Aug 2013, at 00:10, Christian Heimes christ...@python.org wrote:

> Hello,
>
> Coverity has published its Coverity Scan Spotlight Python a couple of hours ago. It features a summary of Python's ecosystem, an interview with me about Python core development and a defect report. The report is awesome. We have reached a defect density of 0.005 defects per 1,000 lines of code. In 2012 the average defect density of Open Source Software was 0.69.
>
> http://www.coverity.com/company/press-releases/read/coverity-finds-python-sets-new-level-of-quality-for-open-source-software
> http://wpcme.coverity.com/wp-content/uploads/2013-Coverity-Scan-Spotlight-Python.pdf
>
> The internet likes it, too.
> http://www.prnewswire.com/news-releases/coverity-finds-python-sets-new-level-of-quality-for-open-source-software-221629931.html
> http://www.securityweek.com/python-gets-high-marks-open-source-software-security-report
>
> Thank you very much to Kristin Brennan and Dakshesh Vyas from Coverity as well as everybody who has helped to fix the remaining issues!
> Christian

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The end of 2.7
On 07.04.2013 21:50, Martin v. Löwis wrote:
> So I believe that extension building is becoming more and more painful on Windows for Python 2.7 as time passes (and it is already way more painful than it is on Linux), and I see no way to do much about that. The stable ABI would have been a solution, but it's too late now for 2.7.

I think extension building for Python 2.7 on Windows is for this reason moving from VS2008 to GCC 4.7 (MinGW). When using VS, we are stuck with an old compiler (i.e. the .NET 3.5 SDK). With GCC, there is no such issue - we just link with whatever CRT is appropriate. Thus, providing link libraries for GCC/MinGW (both for the Python DLL and the CRT DLL) somewhat alleviates the problem, unless using VS is mandatory. A long-term solution might be to expose the CRT used by the Python 2.7 DLL through DLL forwarding. That way, linking with the Python DLL's import library would also link the correct CRT.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Slides from today's parallel/async Python talk
On 14 Mar 2013, at 23:23, Trent Nelson tr...@snakebite.org wrote:

> For the record, here are all the Windows calls I'm using that have no *direct* POSIX equivalent:
>
> Interlocked singly-linked lists:
> - InitializeSListHead()
> - InterlockedFlushSList()
> - QueryDepthSList()
> - InterlockedPushEntrySList()
> - InterlockedPushListSList()
> - InterlockedPopEntrySlist()
>
> Synchronisation and concurrency primitives:
> - Critical sections:
>   - InitializeCriticalSectionAndSpinCount()
>   - EnterCriticalSection()
>   - LeaveCriticalSection()
>   - TryEnterCriticalSection()
> - Slim read/writer locks (some pthread implementations have rwlocks)*:
>   - InitializeSRWLock()
>   - AcquireSRWLockShared()
>   - AcquireSRWLockExclusive()
>   - ReleaseSRWLockShared()
>   - ReleaseSRWLockExclusive()
>   - TryAcquireSRWLockExclusive()
>   - TryAcquireSRWLockShared()
> - One-time initialization:
>   - InitOnceBeginInitialize()
>   - InitOnceComplete()
> - Generic event, signalling and wait facilities:
>   - CreateEvent()
>   - SetEvent()
>   - WaitForSingleObject()
>   - WaitForMultipleObjects()
>   - SignalObjectAndWait()
>
> Native thread pool facilities:
> - TrySubmitThreadpoolCallback()
> - StartThreadpoolIo()
> - CloseThreadpoolIo()
> - CancelThreadpoolIo()
> - DisassociateCurrentThreadFromCallback()
> - CallbackMayRunLong()
> - CreateThreadpoolWait()
> - SetThreadpoolWait()
>
> Memory management:
> - HeapCreate()
> - HeapAlloc()
> - HeapDestroy()
>
> Structured Exception Handling (#ifdef Py_DEBUG):
> - __try/__except
>
> Sockets:
> - ConnectEx()
> - AcceptEx()
> - WSAEventSelect(FD_ACCEPT)
> - DisconnectEx(TF_REUSE_SOCKET)
> - Overlapped WSASend()
> - Overlapped WSARecv()
>
> Don't get me wrong, I grew up with UNIX and love it as much as the next guy, but you can't deny the usefulness of Windows' facilities for writing high-performance, multi-threaded IO code. It's decades ahead of POSIX. (Which is also why it bugs me when I see select() being used on Windows, or IOCP being used as if it were a poll-type generic IO multiplexor -- that's like having a Ferrari and speed limiting it to 5mph!)
> So, before any of this has a chance of working on Linux/BSD, a lot more scaffolding will need to be written to provide the things we get for free on Windows (threadpools being the biggest freebie).

Have you considered using OpenMP instead of the Windows API or POSIX threads directly? OpenMP gives you a thread pool and synchronization primitives for free as well, with no special code needed for Windows or POSIX. OpenBLAS (and GotoBLAS2) uses OpenMP to produce a thread pool on POSIX systems (and actually the Windows API on Windows). The OpenMP portion of the C code is wrapped so it looks like sending an async task to a thread pool; the C code is not littered with OpenMP pragmas. If you need something like Windows thread pools on POSIX, just look at the BSD-licensed OpenBLAS code. It is written to be scalable for the world's largest supercomputers (but also beautifully written and very easy to read). Cython has code to register OpenMP threads as Python threads, in case that is needed. So that problem is also solved.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] ctypes is not an acceptable implementation strategy for modules in the standard library?
On 05.11.2012 15:14, Xavier Morel wrote:
> Such as segfaulting the interpreter. I seem to reliably segfault everything every time I try to use ctypes.

You can do that with C extensions too, by the way. Apart from that, a dependency on the ABI is more annoying to maintain across platforms than a dependency on the API. Function calls with ctypes are also very slow. For C extensions in the stdlib, Cython might be a better choice than ctypes. ctypes might be a good choice if you are using a DLL on your own computer, because then you only have one ABI to worry about. Not so for Python's standard library.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
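The ABI-dependency point is visible in ctypes itself: the caller must describe argument and return types by hand, and getting them wrong corrupts the call at runtime rather than failing at compile time as a C extension would. A minimal sketch calling sqrt from the system C math library (assumes a Unix-like system where find_library can locate it):

```python
import ctypes
import ctypes.util

# Load the C math library by its conventional name "m".
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# We must spell out the ABI ourselves; wrong restype/argtypes here
# silently return garbage or segfault instead of failing to compile.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

root = libm.sqrt(2.0)
```

Without the `restype` line, the call would return a meaningless integer reinterpretation of the floating-point result -- exactly the class of silent breakage the email is talking about.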
[Python-Dev] possible bug in distutils (Mingw32CCompiler)?
Mingw32CCompiler in cygwincompiler.py emits the flag -mno-cygwin. This is used to make Cygwin's gcc behave as mingw. As of gcc 4.6 it is not recognized by the mingw gcc compiler itself, and causes a crash. It should be removed because it is never needed for mingw (in any version), only for cross-compilation to mingw from other gcc versions. Instead, those who use CygwinCCompiler or Linux GCC to cross-compile to plain Win32 can set -mno-cygwin manually. It also means -mcygwin should be removed from the output of CygwinCCompiler. I think...

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] GIL removal question
On 10.08.2011 13:43, Guido van Rossum wrote:
> They have a specific plan, based on Software Transactional Memory: http://morepypy.blogspot.com/2011/06/global-interpreter-lock-or-how-to-kill.html

Microsoft's experiment to use STM in .NET failed, though. And Linux got rid of the BKL without STM.

There is a similar but simpler paradigm called bulk synchronous parallel (BSP) which might work too. Threads work independently for a particular amount of time with private objects (e.g. copy-on-write memory), then enter a barrier; changes to global objects are synchronized and the GC collects garbage, after which the worker threads leave the barrier and the cycle repeats. To communicate changes to shared objects between synchronization barriers, Python code must use explicit locks and flush statements. But for the C code in the interpreter, BSP should give the same atomicity for Python bytecodes as the GIL (there is just one active thread inside the barrier). BSP is much simpler to implement than STM because of the barrier synchronization. BSP also cannot deadlock or livelock. And because threads in BSP work with private memory, there will be no thrashing (false sharing) from the reference counting GC.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
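The superstep structure described above maps directly onto what threading.Barrier provides today -- a toy sketch of one BSP cycle (an illustration of the model only, not an interpreter change; the barrier's action callable plays the role of the single-threaded synchronization step):

```python
import threading

NWORKERS = 4
partial = [0] * NWORKERS   # each slot is private to one worker
total = []

def synchronize():
    # Runs in exactly one thread while the others wait at the barrier,
    # standing in for the "synchronize globals and collect garbage" step.
    total.append(sum(partial))

barrier = threading.Barrier(NWORKERS, action=synchronize)

def worker(i):
    # Compute step: each thread touches only its private slot.
    partial[i] = sum(range(i * 100, (i + 1) * 100))
    barrier.wait()   # superstep boundary

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NWORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because no two workers write to the same data between barriers, there is nothing to deadlock on and no cache line ping-pong during the compute phase.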
Re: [Python-Dev] GIL removal question
On 13.08.2011 17:43, Antoine Pitrou wrote:
> These days we have PyGILState_Ensure(): http://docs.python.org/dev/c-api/init.html#PyGILState_Ensure

With the most recent Cython (0.15) we can just do:

    with gil:
        <suite>

to ensure holding the GIL. And similarly, from a thread holding the GIL:

    with nogil:
        <suite>

to temporarily release it. There is also some OpenMP support in Cython 0.15. OpenMP is much easier than messing around with threads manually (it moves all the hard parts of multithreading to the compiler). Now Cython almost makes it look Pythonic: http://docs.cython.org/src/userguide/parallelism.html

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] GIL removal question
On 12.08.2011 18:51, Xavier Morel wrote:
> * Erlang uses erlang processes, which are very cheap preempted *processes* (no shared memory). There have always been tens of thousands to millions of erlang processes per interpreter [...] contention within the interpreter (going back to pre-SMP by setting the number of schedulers per node to 1 can yield increased overall performances)

Technically, one can make threads behave like processes if they don't share memory pages (though they will still share address space). Erlang's use of "process" instead of "thread" does not mean an Erlang process has to be implemented as an OS process. With one interpreter per thread, and a malloc that does not let threads share memory pages (one heap per thread), Python could do the same.

On Windows, there is an API function called HeapAlloc, which lets us allocate memory from a dedicated heap. The common use case is to prevent threads from sharing memory, thus behaving like light-weight processes (except address space is shared). On Unix, it is more common to use fork() to create new processes instead, as processes are more light-weight there than on Windows.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] GIL removal question
On 12.08.2011 18:57, Rene Nejsum wrote:
> My two danish kroner on GIL issues... I think I understand the background and need for GIL. Without it Python programs would have been cluttered with lock/synchronized statements and C-extensions would be harder to write. Thanks to Sturla Molden for his explanation earlier in this thread.

It doesn't seem I managed to explain it :( Yes, C extensions would be cluttered with synchronization statements, and that is annoying. But that was not my point at all! Even with fine-grained locking in place, a system using reference counting will not scale on a multi-processor computer. Cache lines containing reference counts will become incoherent between the processors, causing a traffic jam on the memory bus. The technical term in the parallel computing literature is "false sharing".

> However, the GIL is also from a time, where single threaded programs running in single core CPU's was the common case. On a new MacBook Pro I have 8 cores and would expect my multithreaded Python program to run significantly faster than on a one-core CPU. Instead the program slows down to a much worse performance than on a one-core CPU.

A multi-threaded program can be slower on a multi-processor computer as well, if it suffers from extensive false sharing (which Python programs nearly always will do). That is, instead of doing useful work, the processors are stepping on each other's toes. So they spend the bulk of the time synchronizing cache lines with RAM instead of computing. On a computer with a single processor, there cannot be any false sharing. So even without a GIL, a multi-threaded program can often run faster on a single-processor computer. That might seem counter-intuitive at first. I have seen this inverse scaling blamed on the GIL many times, but it's dead wrong. Multi-threading is hard to get right, because the programmer must ensure that processors don't access the same cache lines.
This is one of the reasons why numerical programs based on MPI (multiple processes and IPC) are likely to perform better than numerical programs based on OpenMP (multiple threads and shared memory). As for Python, it means that it is easier to make a program based on multiprocessing scale well on a multi-processor computer than a program based on threading and releasing the GIL. And that has nothing to do with the GIL! Albeit, I'd estimate 99% of Python programmers would blame it on the GIL. It has to do with what shared memory does when cache lines are shared. Intuition about what affects the performance of a multi-threaded program is very often wrong. If one needs parallel computing, using multiple processes is much more likely to scale correctly. Threads are better reserved for things like non-blocking I/O. The problem with the GIL is merely what people think it does -- not what it actually does. It is so easy to blame a performance issue on the GIL, when it is actually the use of threads and shared memory per se that is the problem.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
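The multiprocessing style referred to above looks like this in its minimal form (a toy workload; each worker runs in its own address space, so there are no shared cache lines to fight over and the GIL is irrelevant):

```python
from multiprocessing import Pool

def work(n):
    # CPU-bound toy task; runs in a separate process
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # each input is handed to a worker process; results come back in order
    with Pool(processes=2) as pool:
        results = pool.map(work, [1000, 2000, 3000])
```

The workers communicate only through pickled arguments and results, which is exactly why they cannot false-share anything.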
Re: [Python-Dev] GIL removal question
On 09.08.2011 11:33, Марк Коренберг wrote:
> Probably I want to re-invent a bicycle. I want developers to tell me why we can not remove the GIL in this way:
> 1. Remove GIL completely with all current logic.
> 2. Add its own RW-locking to all mutable objects (like list or dict)
> 3. Add RW-locks to every context instance
> 4. Use RW-locks when accessing members of object instances
> The only reason I see not to do that is the performance of single-threaded applications. Why not reduce the locking functions in these 4 cases to stubs when only one thread is present?

This has been discussed to death before, and is probably OT for this list. There is another reason than the speed of single-threaded applications, but it is rather technical: as CPython uses reference counting for garbage collection, we would get false sharing of reference counts -- which would work as an invisible GIL (synchronization bottleneck) anyway. That is, if one processor writes to memory in a cache line shared by another processor, they must stop whatever they are doing to synchronize the dirty cache lines with RAM. Thus, updating reference counts would flood the memory bus with traffic and be much worse than the GIL. Instead of doing useful work, the processors would be stuck synchronizing dirty cache lines. You can think of it as a severe traffic jam.

To get rid of the GIL, CPython would either need (a) another GC method (e.g. similar to .NET or Java) or (b) another threading model (e.g. one interpreter per thread, as in Tcl, Erlang, or .NET app domains). As CPython has neither, we are better off with the GIL. Nobody likes the GIL; fork a project to write a GIL-free CPython if you can. But note that:

1. With Cython, you have full manual control over the GIL. IronPython and Jython do not have a GIL at all.

2. Much of the FUD against the GIL is plain ignorance: the GIL slows down parallel computational code, but any serious number crunching should use numerical performance libraries (i.e. C extensions) anyway.
Libraries are free to release the GIL or spawn threads internally. Also, the GIL does not matter for (a) I/O bound code such as network servers or clients and (b) background threads in GUI programs -- which are the two common use cases for threads in Python programs. If the GIL bites you, it is most likely a warning that your program is badly written, independent of the GIL issue.

There seems to be a common misunderstanding that Python threads work like fibers due to the GIL. They do not! Python threads are native OS threads and can do anything a thread can do, including executing library code in parallel. If one thread is blocking on I/O, the other threads can continue with their business. The only thing Python threads cannot do is access the Python interpreter concurrently. And the reason CPython needs that restriction is reference counting.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
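The I/O point is easy to verify: time.sleep, like most blocking calls, releases the GIL, so several "waiting" threads overlap instead of serializing (a toy demonstration; the timing threshold is a loose assumption, not a guarantee of any scheduler):

```python
import threading
import time

def io_task():
    time.sleep(0.2)   # blocking call; the GIL is released while waiting

threads = [threading.Thread(target=io_task) for _ in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# four 0.2 s waits overlap: elapsed is ~0.2 s, not the serial 0.8 s
```

If the threads really were GIL-bound fibers, the total would be close to 0.8 seconds.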
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 24.05.2011 11:55, Artur Siekielski wrote:
> PYRO/multiprocessing proxies isn't a comparable solution because of ORDERS OF MAGNITUDE worse performance. You compare here direct memory access vs serialization/message passing through sockets/pipes.

The bottleneck is likely the serialization, but only if you serialize large objects. IPC is always very fast, at least on localhost. Just out of curiosity, have you considered using a database? Sqlite and BSD DB can even be put in shared memory if you want. It sounds like you are trying to solve a database problem using os.fork, something which is more or less doomed to fail (i.e. you have to replicate all the effort put into scaling up databases). If a database is too slow, I am rather sure you need something other than Python as well.

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
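For instance, the standard library's sqlite3 module can hold an entire database in RAM, which often removes the perceived need for fork-based object sharing (a minimal sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # the whole database lives in RAM
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO kv VALUES (?, ?)",
                 [("a", 1), ("b", 2), ("c", 3)])
conn.commit()

rows = conn.execute("SELECT key, value FROM kv ORDER BY key").fetchall()
```

On Linux, a file-backed database placed under /dev/shm gets the same in-memory speed while also being shareable between processes.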
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 24.05.2011 13:31, Maciej Fijalkowski wrote:
> Not sure what scenario exactly are you discussing here, but storing reference counts outside of objects has (at least on a single processor) worse cache locality than inside objects.

Artur Siekielski is not talking about cache locality, but about copy-on-write fork() on Linux et al. When reference counts are updated after forking, memory pages marked copy-on-write are copied if they store reference counts. And then he quickly runs out of memory. He wants to put reference counts and PyObjects in different pages, so only the pages with reference counts get copied. I don't think he cares about cache locality at all, but the rest of us do :-)

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 24.05.2011 11:55, Artur Siekielski wrote:
> POSH might be good, but the project has been dead for 8 years. And this copy-on-write is nice because you don't need changes/restrictions to your code, or a special garbage collector.

Then I have a solution for you, one that is cheaper than anything else you are trying to do (taking work hours into account): BUY MORE RAM! RAM is damn cheap. You just need more of it. And 64-bit Python :-)

Sturla

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 24.05.2011 17:39, Artur Siekielski wrote:
> Disk access is about 1000x slower than memory access in C, and Python in the worst case is 50x slower than C, so there is still a huge win (not to mention that in the common case Python is only a few times slower).

You can put databases in shared memory (e.g. SQLite and Berkeley DB have options for this). On Linux you can also mount /dev/shm as a ramdisk. Also, why do you assume the database developers at Oracle et al. have not done sufficient optimization? Sturla
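As a concrete illustration of the in-memory option: SQLite can keep a database entirely in RAM and share it between connections in one process via its shared-cache URI syntax (the URI below is standard SQLite; the table and values are invented for the example):

```python
import sqlite3

# An in-memory database shared between connections in this process.
# The first connection must stay open, or the database vanishes.
uri = "file::memory:?cache=shared"
owner = sqlite3.connect(uri, uri=True)
owner.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v INTEGER)")
owner.execute("INSERT INTO kv VALUES ('answer', 42)")
owner.commit()

# A second connection sees the same data without ever touching the disk.
reader = sqlite3.connect(uri, uri=True)
value = reader.execute("SELECT v FROM kv WHERE k = 'answer'").fetchone()[0]
assert value == 42
```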
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 23.05.2011 06:59, Martin v. Löwis wrote:
> My expectation is that your approach would likely make the issues worse in a multi-CPU setting. If you put multiple reference counters into a contiguous block of memory, unrelated reference counters will live in the same cache line. Consequently, changing one reference counter on one CPU will invalidate the cached reference counters of that cache line on other CPUs, making your problem (a) actually worse.

In a multi-threaded setting with threads concurrently accessing reference counts, this would certainly worsen the situation. In a single-threaded setting, it would likely be an improvement. CPython, however, has a GIL, so there is only one concurrently active thread with access to reference counts. On a thread switch in the interpreter, I think the performance result will depend on the nature of the Python code: if threads share a lot of objects, it could help to reduce the number of dirty cache lines; if threads mainly work on private objects, it would likely have the effect you predict. Which will dominate is hard to tell. Instead, we could use multiple heaps: each Python thread could manage its own heap for malloc and free (cf. HeapAlloc and HeapFree on Windows). Objects local to one thread would reside only in the locally managed heap. When an object becomes shared by several Python threads, it would be moved from a local heap to the global heap of the process. Some objects, such as modules, would be stored directly on the global heap. This way, objects used by only one thread would never dirty cache lines used by other threads. This would also be a way to reduce CPython's dependency on the GIL: only the global heap would need to be protected by the GIL, whereas the local heaps would not need any global synchronization. (I am setting follow-up to the python-ideas list; this does not belong on python-dev.)
Sturla Molden
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
On 24.05.2011 00:07, Artur Siekielski wrote:
> Oh, and using explicit shared memory or mmap is much harder, because you have to map the whole object graph into bytes.

It sounds like you need PYRO, POSH or multiprocessing's proxy objects. Sturla
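For reference, a minimal sketch of the multiprocessing proxy-object approach mentioned above (the function and variable names are invented for the example): a Manager holds the real object in a server process and hands other processes proxies that forward each operation over IPC, so no object graph has to be mapped into bytes by hand.

```python
from multiprocessing import Manager, Process

def worker(shared):
    # 'shared' is a proxy; this append is forwarded to the manager process.
    shared.append(42)

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.list()   # the real list lives in the manager process
        p = Process(target=worker, args=(shared,))
        p.start()
        p.join()
        assert list(shared) == [42]
```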
Re: [Python-Dev] Bugs in thread_nt.h
On 10.03.2011 11:06, Scott Dial wrote:
> http://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt

The important part here (forgive me for being a pedant) is that (1) register allocation of the 'owned' field is actually unwanted, and (2) Microsoft specifies 'volatile' in the signatures of the Interlocked* API functions. I am sorry for misreading 32 bits (4 bytes) as 32 bytes. That is obviously very different. Still, if Microsoft's malloc is sufficient, why does MSDN tell us to use _aligned_malloc instead of malloc? The rest is a comment on the link and a bit OT: their argument is that (1) volatile suppresses optimization; (2) within a critical section, access to shared data is synchronized; and thus (3) volatile is superfluous there. That is true, to some extent. Volatile is not a memory barrier, nor a mutex. Volatile's main purpose is to prevent the compiler from storing a variable in a register, and it is easy to use incorrectly if we don't understand this. Obvious use cases for volatile are: implementation of a spinlock, where register allocation is detrimental; a buffer that is filled from the outside by some DMA mechanism; and real-time programs and games where order of execution and timing are critical, so optimization must be suppressed. Even though volatile is not needed for processing within a critical section, we still need the shared data to be re-loaded upon entering and re-written upon leaving. We can use a typecast to non-volatile within a critical section, and achieve both data consistency and compiler optimization. OpenMP has a better mechanism for this, where a flush directive (#pragma omp flush) forces a synchronization of shared data among threads (write and reload), so volatile is never needed for consistency of shared data. Sturla
[Python-Dev] Bugs in thread_nt.h
Atomic operations (InterlockedCompareExchange, et al.) are used on the field 'owned' in NRMUTEX. These functions require the memory to be aligned on 32-byte boundaries. They also require the volatile qualifier. Three small changes are therefore needed (see below). Regards, Sturla Molden

typedef struct NRMUTEX {
    volatile LONG owned;   /* Bugfix: remember volatile */
    DWORD thread_id;
    HANDLE hevent;
} NRMUTEX, *PNRMUTEX;

PNRMUTEX AllocNonRecursiveMutex(void)
{
    PNRMUTEX mutex = (PNRMUTEX)_aligned_malloc(sizeof(NRMUTEX), 32); /* Bugfix: align to 32 bytes */
    if (mutex && !InitializeNonRecursiveMutex(mutex)) {
        _aligned_free(mutex);
        mutex = NULL;
    }
    return mutex;
}

void FreeNonRecursiveMutex(PNRMUTEX mutex)
{
    if (mutex) {
        DeleteNonRecursiveMutex(mutex);
        _aligned_free(mutex); /* Bugfix: matching _aligned_free */
    }
}
Re: [Python-Dev] Bugs in thread_nt.h
On 10.03.2011 03:02, Mark Hammond wrote:
> These issues are best put in the tracker so they don't get lost - especially at the moment with lots of regulars at PyCon.

Ok, sorry :-)

> It would also be good to know if there is an actual behaviour bug caused by this (ie, what problems can be observed which are caused by the current code?)

None that I have observed, but this is required according to MSDN. Theoretically, an optimizing compiler could cache the 'owned' field if it is not declared volatile. It currently works because a wait on the lock is implemented with a WaitForSingleObject on a kernel event object when the waitflag is set. If the wait mechanism is changed to a much less expensive user-space spinlock, just releasing the time-slice by Sleep(0) on each iteration, it will certainly fail without a volatile qualifier. As for InterlockedCompareExchange et al., MSDN says this: "The parameters for this function must be aligned on a 32-bit boundary; otherwise, the function will behave unpredictably on multiprocessor x86 systems and any non-x86 systems. See _aligned_malloc." Well, it does not hurt to obey :-) Regards, Sturla
Re: [Python-Dev] mingw support?
> The problem really is that when people ask for MingW support, they mean all kinds of things.

Usually it means they want to build C or C++ extensions, don't have Visual Studio, don't know about the SDK compiler, and have misunderstood the CRT problem. As long as Python builds with the free Windows 7 SDK, I think it is sufficient that mingw is supported for extensions (and the only reasons for selecting mingw over Microsoft C/C++ on Windows are Fortran and C99 -- the Windows SDK compiler is a free download as well). Enthought (32-bit) ships with a mingw gcc compiler configured to build extensions. That might be something to consider for Python on Windows: it would prevent accidental linking with the wrong libraries (particularly the CRT). Sturla
Re: [Python-Dev] mingw support?
Cesare Di Mauro wrote:
> I like to use Windows because it's a comfortable and productive environment, certainly not because someone forced me to use it. Also, I have limited time, so I want to spend it as best I can, focusing on solving real problems. Setup, Next, Next, Finish, and I want it working without thinking about anything else. [...] Give users a better choice, and I don't see logical reasons why they'll not change their mind.

I use Windows too, even though I am a scientist and most people seem to prefer Linux for scientific computing. I do like to just click on the installer from Enthought and forget about building all the binaries and libraries myself. Maybe I am just lazy... But likewise, I think most Windows users don't care which C compiler was used to build Python. Nor do we/they care which compiler was used to build any other third-party software, as long as the MSI installers work and the binaries are free of malware. Also note that there are non-standard things on Windows that mingw does not support properly, such as COM and structured exception handling. Extensions like pywin32 depend on Microsoft C/C++ for that reason. So for Windows I think it is sufficient to support mingw for extension libraries. The annoying part is the CRT DLL hell, which is the fault of Microsoft. An easy fix would be a Python/mingw bundle, or a correctly configured mingw compiler from python.org. Or the Python devs could consider not using Microsoft's CRT at all on Windows, replacing it with a custom CRT or plain Windows API calls. Sturla
Re: [Python-Dev] mingw support?
David Cournapeau wrote:
> Autotools only help for posix-like platforms. They are certainly a big hindrance on the Windows platform in general.

That is why mingw has MSYS. mingw is not just a gcc port, but also a miniature GNU environment for Windows. MSYS' bash shell allows us to do things like:

$ ./configure
$ make
$ make install

Sturla
Re: [Python-Dev] mingw support?
Terry Reedy wrote:
> MingW has become less attractive in recent years by the difficulty in downloading and installing a current version and finding out how to do so. Some projects have moved on to the TDM packaging of MingW. http://tdm-gcc.tdragon.net/

MinGW has become a mess. Equation.com used to have a decent installer, but at some point they started to ship mingw builds with a Trojan. TDM looks OK for now. Building 32-bit Python extensions works with MinGW. 64-bit extensions are not possible due to lacking import libraries (no libmsvcr90.a and libpython26.a for amd64). It is not possible to build Python itself with mingw, only extensions. I think it is possible to build Python with Microsoft's SDK compiler, as it has nmake. The latest is the Windows 7 SDK for .NET 4, but we need the version for .NET 3.5 to maintain CRT compatibility with current Python releases. Python's distutils does not work with the SDK compiler, only Visual Studio, so building Python extensions with the SDK compiler is not as easy as it could (or should) be. One advantage of mingw for scientific programmers (who are frequent users of Python) is the gfortran compiler. Although it is not as capable as Absoft or Intel Fortran, it is still decent and can be used with f2py. This makes the lack of 64-bit support for Python extensions with mingw particularly annoying. Microsoft's SDK does not have a Fortran compiler, and commercial versions are very expensive (though I prefer to pay for Absoft anyway). I do not wish for a complete mingw build process, but support for 64-bit extensions with mingw and distutils support for Microsoft's SDK compiler would be nice. Sturla
Re: [Python-Dev] mingw support?
> Please understand that this very choice is there already.

That's great. Is that what the documentation refers to when it says:

"MSVCCompiler will normally choose the right compiler, linker etc. on its own. To override this choice, the environment variables DISTUTILS_USE_SDK and MSSdk must be both set. MSSdk indicates that the current environment has been setup by the SDK's SetEnv.Cmd script, or that the environment variables had been registered when the SDK was installed; DISTUTILS_USE_SDK indicates that the distutils user has made an explicit choice to override the compiler selection by MSVCCompiler."

That isn't particularly clear to me, but it may be to others more familiar with building on Windows. Ahh... "MSSdk must be set" typically means we must use the Windows 7 SDK command prompt. Without DISTUTILS_USE_SDK, the build fails:

C:\DEVELOPMENT\test-distutils> setup.py build_ext
running build_ext
building 'test' extension
Traceback (most recent call last):
  File "C:\DEVELOPMENT\test-distutils\setup.py", line 6, in <module>
    ext_modules=[Extension('test', ['test.c'])],
  File "C:\Python26\lib\distutils\core.py", line 152, in setup
    dist.run_commands()
  File "C:\Python26\lib\distutils\dist.py", line 975, in run_commands
    self.run_command(cmd)
  File "C:\Python26\lib\distutils\dist.py", line 995, in run_command
    cmd_obj.run()
  File "C:\Python26\lib\distutils\command\build_ext.py", line 340, in run
    self.build_extensions()
  File "C:\Python26\lib\distutils\command\build_ext.py", line 449, in build_extensions
    self.build_extension(ext)
  File "C:\Python26\lib\distutils\command\build_ext.py", line 499, in build_extension
    depends=ext.depends)
  File "C:\Python26\lib\distutils\msvc9compiler.py", line 449, in compile
    self.initialize()
  File "C:\Python26\lib\distutils\msvc9compiler.py", line 359, in initialize
    vc_env = query_vcvarsall(VERSION, plat_spec)
  File "C:\Python26\lib\distutils\msvc9compiler.py", line 275, in query_vcvarsall
    raise ValueError(str(list(result.keys())))
ValueError: [u'path', u'include', u'lib']

Now
let's do what the documentation says:

C:\DEVELOPMENT\test-distutils> set DISTUTILS_USE_SDK=1
C:\DEVELOPMENT\test-distutils> setup.py build_ext
running build_ext
building 'test' extension
creating build
creating build\temp.win-amd64-2.6
creating build\temp.win-amd64-2.6\Release
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IC:\Python26\include -IC:\Python26\PC /Tctest.c /Fobuild\temp.win-amd64-2.6\Release\test.obj
test.c
creating build\lib.win-amd64-2.6
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Bin\amd64\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:C:\Python26\libs /LIBPATH:C:\Python26\PCbuild\amd64 /EXPORT:inittest build\temp.win-amd64-2.6\Release\test.obj /OUT:build\lib.win-amd64-2.6\test.pyd /IMPLIB:build\temp.win-amd64-2.6\Release\test.lib /MANIFESTFILE:build\temp.win-amd64-2.6\Release\test.pyd.manifest
test.obj : warning LNK4197: export 'inittest' specified multiple times; using first specification
   Creating library build\temp.win-amd64-2.6\Release\test.lib and object build\temp.win-amd64-2.6\Release\test.exp
C:\Program Files\Microsoft SDKs\Windows\v7.0\Bin\x64\mt.exe -nologo -manifest build\temp.win-amd64-2.6\Release\test.pyd.manifest -outputresource:build\lib.win-amd64-2.6\test.pyd;2

:-D Sturla
Re: [Python-Dev] mingw support?
> At one point Mike Fletcher published a patch to make distutils use the SDK compiler. It would make a lot of sense if it were built into distutils as a further compiler choice.

> Please understand that this very choice is there already.

Yes, you are right. I did not know about DISTUTILS_USE_SDK. Sturla
Re: [Python-Dev] Reworking the GIL
Martin v. Löwis wrote:
> b) notice that, on Windows, minimum wait resolution may be as large as 15ms (e.g. on XP, depending on the hardware). Not sure what this means for WaitForMultipleObjects; most likely, if you ask for a 5ms wait, it waits until the next clock tick. It would be bad if, on some systems, a wait of 5ms would mean that it immediately returns.

Which is why one should use multimedia timers with QPC on Windows. To get a wait function with much better resolution than Windows' default, do this:

1. Set a high resolution with timeBeginPeriod.
2. Loop using a time-out of 0 for WaitForMultipleObjects, with a Sleep(0) in the loop so as not to burn the CPU. Call QPC to get a precise timing, and break the loop when the requested time-out has been reached.
3. When you are done, call timeEndPeriod to turn the multimedia timer off.

This is how you create usleep() on Windows as well: just loop on QPC and Sleep(0) after setting timeBeginPeriod(1).
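The same sleep-coarsely-then-spin idea can be sketched in portable Python (the function name and threshold are invented for the example): time.sleep plays the role of the coarse kernel wait, and time.perf_counter the role of QPC.

```python
import time

def precise_sleep(seconds, spin_threshold=0.002):
    """Sleep coarsely, then spin on a high-resolution clock until the deadline."""
    deadline = time.perf_counter() + seconds
    while True:
        remaining = deadline - time.perf_counter()
        if remaining <= 0:
            break
        if remaining > spin_threshold:
            # Coarse sleep, leaving a margin for the timer's granularity.
            time.sleep(remaining - spin_threshold)
        else:
            time.sleep(0)   # yield the rest of the time slice and re-check

start = time.perf_counter()
precise_sleep(0.005)
elapsed = time.perf_counter() - start
assert elapsed >= 0.005    # we never return before the deadline
```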
Re: [Python-Dev] Reworking the GIL
Martin v. Löwis wrote:
> Maybe you should study the code under discussion before making such a proposal.

I did, and it does nothing of what I suggested. I am sure I can make the Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more precise and efficient. This is the kind of code I was talking about, from ceval_gil.h:

r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);

I would turn on the multimedia timer (it is not on by default), and replace this call with a loop, approximately like this:

for (;;) {
    r = WaitForMultipleObjects(2, objects, TRUE, 0);
    /* blah blah blah */
    QueryPerformanceCounter(&cnt);
    if (cnt > timeout)
        break;
    Sleep(0);
}

The timeout in milliseconds would now be computed from querying the performance counter, instead of unreliably by the Windows NT kernel. Sturla
Re: [Python-Dev] Reworking the GIL
Sturla Molden wrote:
> I would turn on the multimedia timer (it is not on by default), and replace this call with a loop, approximately like this:
>
> for (;;) {
>     r = WaitForMultipleObjects(2, objects, TRUE, 0);
>     /* blah blah blah */
>     QueryPerformanceCounter(&cnt);
>     if (cnt > timeout)
>         break;
>     Sleep(0);
> }

And just so you don't ask: there should not just be a Sleep(0) in the loop, but a sleep that gets shorter and shorter until a lower threshold is reached, where it skips to Sleep(0). That way we avoid hammering on WaitForMultipleObjects and QueryPerformanceCounter more than we need to. And for all of that to work better than just giving a timeout to WaitForMultipleObjects, we need the multimedia timer turned on. Sturla
Re: [Python-Dev] Reworking the GIL
Sturla Molden wrote:
> And just so you don't ask: there should not just be a Sleep(0) in the loop, but a sleep that gets shorter and shorter until a lower threshold is reached, where it skips to Sleep(0). That way we avoid hammering on WaitForMultipleObjects and QueryPerformanceCounter more than we need to. And for all of that to work better than just giving a timeout to WaitForMultipleObjects, we need the multimedia timer turned on.

The important thing about the multimedia timer is that the granularity of Sleep() and WaitForMultipleObjects() is by default 10 ms, or at most 20 ms. But if we call timeBeginPeriod(1), the MM timer is on and the granularity becomes 1 ms, or at most 2 ms. We can get even more precise than that by hammering on Sleep(0) for timeouts of less than 2 ms; we then get a typical granularity on the order of 10 µs, with the occasional 100 µs now and then. I know this because I once used Windows 2000 to generate TTL signals on the LPT port, and watched them on an oscilloscope. ~15 ms granularity is the Windows default, but that is brain dead. By the way Antoine, if you think a granularity of 1-2 ms is sufficient, i.e. no need for µs precision, then just calling timeBeginPeriod(1) will be sufficient. Sturla
Re: [Python-Dev] Reworking the GIL
Antoine Pitrou wrote:
> It certainly is. But once again, I'm no Windows developer and I don't have a native Windows host to test on; therefore someone else (you?) has to try.

I'd love to try, but I don't have VC++ to build Python; I use GCC on Windows. Anyway, the first thing to try is to call timeBeginPeriod(1) once on startup, and leave the rest of the code as it is. If 2-4 ms is sufficient we can use timeBeginPeriod(2), etc. Microsoft claims Windows performs better with coarse timer granularity, which is why it is 10 ms by default. Sturla
Re: [Python-Dev] Reworking the GIL
Martin v. Löwis wrote:
>> I did, and it does nothing of what I suggested. I am sure I can make the Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more precise and efficient.
>
> Hmm. I'm skeptical that your code makes it more accurate, and I completely fail to see that it makes it more efficient (by what measurement of efficiency?) Also, why would making it more accurate make it better? IIUC, accuracy is completely irrelevant here, though efficiency (low overhead) does matter.
>
>> This is the kind of code I was talking about, from ceval_gil.h:
>>
>> r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);
>>
>> I would turn on the multimedia timer (it is not on by default), and replace this call with a loop, approximately like this:
>>
>> for (;;) {
>>     r = WaitForMultipleObjects(2, objects, TRUE, 0);
>>     /* blah blah blah */
>>     QueryPerformanceCounter(&cnt);
>>     if (cnt > timeout)
>>         break;
>>     Sleep(0);
>> }
>>
>> And the timeout in milliseconds would now be computed from querying the performance counter, instead of unreliably by the Windows NT kernel.
>
> Hmm. This creates a busy wait loop; if you add larger sleep values, then it loses accuracy.

Actually a usleep looks like this, and the call to the wait function must go into the for loop. But no, it is not a busy sleep:

static int inited = 0;
static __int64 hz;
static double dhz;
const double sleep_granularity = 2.0E-3;  /* seconds */

void usleep(long us)
{
    __int64 cnt, end;
    double diff;
    if (!inited) {
        timeBeginPeriod(1);
        QueryPerformanceFrequency((LARGE_INTEGER*)&hz);
        dhz = (double)hz;
        inited = 1;
    }
    QueryPerformanceCounter((LARGE_INTEGER*)&cnt);
    end = cnt + (__int64)(1.0E-6 * (double)us * dhz);
    for (;;) {
        QueryPerformanceCounter((LARGE_INTEGER*)&cnt);
        if (cnt >= end)
            break;
        diff = (double)(end - cnt) / dhz;
        if (diff > sleep_granularity)
            Sleep((DWORD)(1.0E3 * (diff - sleep_granularity)));  /* Sleep takes ms */
        else
            Sleep(0);
    }
}

> Why not just call timeBeginPeriod, and then rely on the higher clock rate for WaitForMultipleObjects?

That is what I suggested when Antoine said 1-2 ms was enough.
Sturla
Re: [Python-Dev] 2.7 Release? 2.7 == last of the 2.x line?
I'd just like to mention that the scientific community is highly dependent on NumPy. As long as NumPy is not ported to Py3k, migration is out of the question. Porting NumPy is not a trivial issue; it might take a complete rewrite of the whole C base using Cython. NumPy's ABI is not even PEP 3118 compliant. Changing the ABI for Py3k might break extension code written for NumPy using C, and scientists tend to write CPU-bound routines in languages like C and Fortran, not Python, so that is a major issue as well. If we port NumPy to Py3k, everyone using NumPy will have to port their C code to the new ABI. There are a lot of people stuck with Python 2.x for this reason. It does not just affect individual scientists, but also large projects like IBM and EPFL's Blue Brain and NASA's space telescope. So please, do not cancel 2.x support before we have ported NumPy, Matplotlib and most of their dependent extensions to Py3k. The community of scientists and engineers using Python is growing, but shutting down 2.x support might bring an end to that. Sturla Molden
[Python-Dev] Integer behaviour in Python 2.6.4
Why does this happen?

>>> type(2**31-1)
<type 'long'>

It seems to have broken NumPy's RNG on Win32.
Re: [Python-Dev] Integer behaviour in Python 2.6.4
Curt Hagenlocher wrote:
> Does that not happen on non-Windows platforms? 2**31 can't be represented as a 32-bit signed integer, so it's automatically promoted to a long.

Yes, you are right. I have now traced the problem down to an integer overflow in NumPy. It seems to have this Pyrex code:

cdef long lo, hi, diff
[...]
diff = hi - lo - 1

:-D Sturla
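The failure mode can be reproduced without Pyrex: a C long is 32 bits on Windows, so hi - lo - 1 wraps around. A sketch using ctypes to emulate the C arithmetic (the variable names mirror the Pyrex snippet; ctypes integer types do no overflow checking, which is exactly what we want here):

```python
import ctypes

# Typical bounds for a RNG over the full 32-bit signed range.
lo = -(2**31)
hi = 2**31 - 1

# In Python the difference is exact...
assert hi - lo - 1 == 2**32 - 2

# ...but stored in a 32-bit C long it wraps around to a negative value,
# which is what broke the 'diff' computation in the Pyrex code above.
diff = ctypes.c_int32(hi - lo - 1).value
assert diff == -2
```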
Re: [Python-Dev] Reworking the GIL
Antoine Pitrou wrote:
> - priority requests, which is an option for a thread requesting the GIL to be scheduled as soon as possible, and forcibly (rather than any other threads).

So Python threads become preemptive rather than cooperative? That would be great. :-) time.sleep should generate a priority request to re-acquire the GIL, and so should all other blocking standard library functions with a time-out. S.M.
Re: [Python-Dev] Reworking the GIL
Antoine Pitrou wrote:
> - priority requests, which is an option for a thread requesting the GIL to be scheduled as soon as possible, and forcibly (rather than any other threads).

Should a priority request for the GIL take a priority number?

- If two threads make priority requests for the GIL, the one with the higher priority should get the GIL first.
- If a thread with a low priority makes a priority request for the GIL, it should not be allowed to preempt (take the GIL away from) a higher-priority thread, in which case the priority request would be ignored.

Related issue: should Python threads have priorities? They are, after all, real OS threads. S.M.
Re: [Python-Dev] time.clock() on windows
Kristján Valur Jónsson wrote:
> Thanks, I'll take a look in that direction.

I have a suggestion, forgive me if I am totally ignorant. :-) Sturla Molden

#include <windows.h>

union __reftime {
    double us;
    __int64 bits;
};

static volatile union __reftime __ref_perftime, __ref_filetime;

double clock()
{
    __int64 cnt, hz, init;
    double us;
    union __reftime ref_filetime;
    union __reftime ref_perftime;
    for (;;) {
        ref_filetime.bits = __ref_filetime.bits;
        ref_perftime.bits = __ref_perftime.bits;
        if (!QueryPerformanceCounter((LARGE_INTEGER*)&cnt)) goto error;
        if (!QueryPerformanceFrequency((LARGE_INTEGER*)&hz)) goto error;
        us = ref_filetime.us + ((double)(100*cnt)/(double)hz - ref_perftime.us);
        /* verify that the reference values did not change while we read them */
        init = InterlockedCompareExchange64((LONGLONG*)&__ref_filetime.bits,
                   (LONGLONG)ref_filetime.bits, (LONGLONG)ref_filetime.bits);
        if (init != ref_filetime.bits) continue;
        init = InterlockedCompareExchange64((LONGLONG*)&__ref_perftime.bits,
                   (LONGLONG)ref_perftime.bits, (LONGLONG)ref_perftime.bits);
        if (init == ref_perftime.bits) break;
    }
    return us;
error:
    /* only if there is no performance counter */
    return -1; /* or whatever */
}

int periodic_reftime_check()
{
    /* call this function at regular intervals, e.g. once every second */
    __int64 cnt1, cnt2, hz;
    FILETIME systime;
    if (!QueryPerformanceFrequency((LARGE_INTEGER*)&hz)) goto error;
    if (!QueryPerformanceCounter((LARGE_INTEGER*)&cnt1)) goto error;
    GetSystemTimeAsFileTime(&systime);
    __ref_filetime.us = (double)(((((__int64)systime.dwHighDateTime) << 32)
                                  | ((__int64)systime.dwLowDateTime)) / 10);
    if (!QueryPerformanceCounter((LARGE_INTEGER*)&cnt2)) goto error;
    __ref_perftime.us = 50*(cnt1 + cnt2)/((double)hz);
    return 0;
error:
    /* only if there is no performance counter */
    return -1;
}
Re: [Python-Dev] time.clock() on windows
Sturla Molden wrote:
> I have a suggestion, forgive me if I am totally ignorant. :-)

Ah, damn... Since there is a GIL, we don't need any of that crappy synchronization. And my code does not correct for the 20 ms time jitter in GetSystemTimeAsFileTime. Sorry! S.M.
Re: [Python-Dev] GIL behaviour under Windows
Antoine Pitrou wrote:
> This number lacks the elapsed time. 61 switches in one second is probably enough; the same amount of switches in 10 or 20 seconds is too small (at least for threads needing good responsiveness, e.g. I/O threads). Also, "fair" has to take into account the average latency and its relative stability, which is why I wrote ccbench.

Since I am a scientist and statistics interests me, let's do this properly :-) Here is a suggestion: _Py_Ticker is a circular variable, so it can be transformed to an angle measured in radians, using:

a = 2 * pi * _Py_Ticker / _Py_CheckInterval

With simultaneous measurements of a, check interval count x, and time y (µs), we can fit the multiple regression:

y = b0 + b1*cos(a) + b2*sin(a) + b3*x + err

using a least-squares solver. We can then extract all the statistics we need on interpreter latencies for ticks with and without periodic checks. On a Python setup with many missed thread switches (pthreads, according to D. Beazley), we could just extend the model to take into account successful and unsuccessful check intervals:

y = b0 + b1*cos(a) + b2*sin(a) + b3*x1 + b4*x2 + err

with x1 being successful thread switches and x2 being missed thread switches. But at least on Windows we can use the simpler model. The reason why multiple regression is needed is that the record method of my GIL_Battle class is not called on every interpreter tick, so I cannot measure each latency precisely, which I could have done with a direct hook into ceval.c. So, statistics to the rescue. But on the bright side, it reduces the overhead of the profiler. Would that help? Sturla Molden
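A sketch of that fit, on synthetic data with invented coefficients (pure standard library so it stays self-contained): the model is linear in the coefficients b0..b3, so ordinary least squares via the normal equations (X^T X) b = X^T y suffices.

```python
import math
import random

# Synthetic measurements: angle a, check count x, latency y (µs),
# generated from known coefficients so we can verify the recovery.
TRUE_B = [5.0, 1.5, -0.8, 0.02]          # b0, b1, b2, b3 (invented)
random.seed(0)
rows, ys = [], []
for _ in range(400):
    a = random.uniform(0.0, 2.0 * math.pi)
    x = random.uniform(0.0, 1000.0)
    row = [1.0, math.cos(a), math.sin(a), x]
    rows.append(row)
    ys.append(sum(b * r for b, r in zip(TRUE_B, row)) + random.gauss(0, 0.01))

# Ordinary least squares via the normal equations, solved with
# Gaussian elimination and partial pivoting.
n = 4
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
for col in range(n):                      # forward elimination
    piv = max(range(col, n), key=lambda r: abs(xtx[r][col]))
    xtx[col], xtx[piv] = xtx[piv], xtx[col]
    xty[col], xty[piv] = xty[piv], xty[col]
    for r in range(col + 1, n):
        f = xtx[r][col] / xtx[col][col]
        for c in range(col, n):
            xtx[r][c] -= f * xtx[col][c]
        xty[r] -= f * xty[col]
b = [0.0] * n                             # back substitution
for r in range(n - 1, -1, -1):
    b[r] = (xty[r] - sum(xtx[r][c] * b[c] for c in range(r + 1, n))) / xtx[r][r]

# The fitted coefficients should be close to the ones we generated from.
assert all(abs(est - true) < 0.05 for est, true in zip(b, TRUE_B))
```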
Re: [Python-Dev] [Python-ideas] Remove GIL with CAS instructions?
Phillip Sitbon skrev:
> Some of this is more low-level. I did see higher performance when using
> non-Event objects, although I have not had time to follow up and do a
> deeper analysis. The GIL flashing problem with critical sections can
> very likely be rectified with a call to Sleep(0) or YieldProcessor()
> for those who are worried about it.

For those who don't know what Sleep(0) on Windows does: it returns the
remainder of the current time-slice to the system if a thread of equal
or higher priority is ready to run. Otherwise it does nothing.

GIL flashing is a serious issue if it happens often; with the current
event-based GIL on Windows, it never happens (61 cases of GIL flash in
100,000 periodic checks is as good as never).

S.M.
Re: [Python-Dev] GIL behaviour under Windows
Antoine Pitrou skrev:
>>> (*) http://svn.python.org/view/sandbox/trunk/ccbench/
>> I've run it twice on my dual core machine. It hangs every time, but
>> not in the same place:
>> D:\pydev\python\trunk\PCbuild>python.exe \tmp\ccbench.py
> Ah, you should report a bug then. ccbench is pure Python (and not
> particularly evil Python), it shouldn't be able to crash the
> interpreter.

It does not crash the interpreter, but it seems it can deadlock. Here is
what I get on a quad-core (Vista, Python 2.6.3).

D:\>ccbench.py

--- Throughput ---

Pi calculation (Python)
threads=1: 568 iterations/s.
threads=2: 253 ( 44 %)
threads=3: 274 ( 48 %)
threads=4: 283 ( 49 %)

regular expression (C)
threads=1: 510 iterations/s.
threads=2: 508 ( 99 %)
threads=3: 503 ( 98 %)
threads=4: 502 ( 98 %)

bz2 compression (C)
threads=1: 456 iterations/s.
threads=2: 892 ( 195 %)
threads=3: 1320 ( 289 %)
threads=4: 1743 ( 382 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 37 ms. (std dev: 21 ms.)
CPU threads=2: 379 ms. (std dev: 175 ms.)
CPU threads=3: 625 ms. (std dev: 310 ms.)
CPU threads=4: 718 ms. (std dev: 381 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 3 ms.)

D:\>ccbench.py

--- Throughput ---

Pi calculation (Python)
threads=1: 554 iterations/s.
threads=2: 400 ( 72 %)
threads=3: 273 ( 49 %)
threads=4: 231 ( 41 %)

regular expression (C)
threads=1: 508 iterations/s.
threads=2: 509 ( 100 %)
threads=3: 509 ( 100 %)
threads=4: 509 ( 100 %)

bz2 compression (C)
threads=1: 456 iterations/s.
threads=2: 897 ( 196 %)
threads=3: 1316 ( 288 %)
DEADLOCK

D:\>ccbench.py

--- Throughput ---

Pi calculation (Python)
threads=1: 559 iterations/s.
threads=2: 397 ( 71 %)
threads=3: 274 ( 49 %)
threads=4: 238 ( 42 %)

regular expression (C)
threads=1: 507 iterations/s.
threads=2: 499 ( 98 %)
threads=3: 505 ( 99 %)
threads=4: 495 ( 97 %)

bz2 compression (C)
threads=1: 455 iterations/s.
threads=2: 896 ( 196 %)
threads=3: 1320 ( 290 %)
threads=4: 1736 ( 381 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 34 ms. (std dev: 21 ms.)
CPU threads=2: 358 ms. (std dev: 174 ms.)
CPU threads=3: 619 ms. (std dev: 312 ms.)
CPU threads=4: 742 ms. (std dev: 382 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 6 ms. (std dev: 13 ms.)
Re: [Python-Dev] GIL behaviour under Windows
Sturla Molden skrev:
> It does not crash the interpreter, but it seems it can deadlock. Here
> is what I get on a quad-core (Vista, Python 2.6.3).

This is what I get with affinity set to CPU 3. There are deadlocks
happening at random locations in ccbench.py. It gets worse with affinity
set to one processor.

Sturla

D:\>start /AFFINITY 3 /B ccbench.py

D:\>--- Throughput ---

Pi calculation (Python)
threads=1: 554 iterations/s.
threads=2: 257 ( 46 %)
threads=3: 272 ( 49 %)
threads=4: 280 ( 50 %)

regular expression (C)
threads=1: 501 iterations/s.
threads=2: 505 ( 100 %)
threads=3: 493 ( 98 %)
threads=4: 507 ( 101 %)

bz2 compression (C)
threads=1: 455 iterations/s.
threads=2: 889 ( 195 %)
threads=3: 1309 ( 287 %)
threads=4: 1710 ( 375 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 40 ms. (std dev: 22 ms.)
CPU threads=2: 384 ms. (std dev: 179 ms.)
CPU threads=3: 618 ms. (std dev: 314 ms.)
CPU threads=4: 713 ms. (std dev: 379 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 3 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)
Re: [Python-Dev] GIL behaviour under Windows
Antoine Pitrou skrev:
> Kristján sent me a patch which I applied and is supposed to fix this.
> Anyway, thanks for the numbers. The GIL does seem to fare a bit better
> (zero latency with the Pi calculation in the background) than under
> Linux, although it may be caused by the limited resolution of
> time.time() under Windows.

My criticism of the GIL on python-ideas was partly motivated by this:

http://blip.tv/file/2232410

However, David Beazley is not talking about Windows. Since the GIL is
apparently not a mutex on Windows, it could behave differently. So I
wrote a small script that constructs a GIL battle and records how often
a check interval results in a thread switch or not. For monitoring check
intervals, I used a small C extension to read _Py_Ticker from ceval.c.
It is not declared static, so I could easily hack into it.

With two threads and a check interval of 100, only 61 of 100,000 check
intervals failed to produce a thread switch in the interpreter. I'd call
that rather fair. :-)

And in case someone asks, the nthreads=1 case is just for verification.

S.M.
D:\>test.py
check interval = 1
nthreads=1, switched=0, missed=100000
nthreads=2, switched=57809, missed=42191
nthreads=3, switched=91535, missed=8465
nthreads=4, switched=99751, missed=249
nthreads=5, switched=95839, missed=4161
nthreads=6, switched=100000, missed=0

D:\>test.py
check interval = 10
nthreads=1, switched=0, missed=100000
nthreads=2, switched=99858, missed=142
nthreads=3, switched=99992, missed=8
nthreads=4, switched=100000, missed=0
nthreads=5, switched=100000, missed=0
nthreads=6, switched=100000, missed=0

D:\>test.py
check interval = 100
nthreads=1, switched=0, missed=100000
nthreads=2, switched=99939, missed=61
nthreads=3, switched=100000, missed=0
nthreads=4, switched=100000, missed=0
nthreads=5, switched=100000, missed=0
nthreads=6, switched=100000, missed=0

D:\>test.py
check interval = 1000
nthreads=1, switched=0, missed=100000
nthreads=2, switched=99999, missed=1
nthreads=3, switched=100000, missed=0
nthreads=4, switched=100000, missed=0
nthreads=5, switched=100000, missed=0
nthreads=6, switched=100000, missed=0
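[For readers on modern CPython: the tick-based check interval probed
above was removed in Python 3.2 with the "new GIL"; thread switching is
now governed by a time-based switch interval. A minimal sketch of the
replacement API:]

```python
import sys

# Python 3.2+ replaced sys.setcheckinterval() (counted in ticks) with a
# time-based switch interval; 0.005 s is the default.
old = sys.getswitchinterval()
sys.setswitchinterval(0.001)   # ask for more frequent GIL handoffs
assert 0.0009 < sys.getswitchinterval() < 0.0011
sys.setswitchinterval(old)     # restore the previous setting
```

A smaller interval trades throughput for responsiveness, much like a
smaller check interval did in Python 2.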
Re: [Python-Dev] GIL behaviour under Windows
Sturla Molden skrev:
> However, David Beazley is not talking about Windows. Since the GIL is
> apparently not a mutex on Windows, it could behave differently. So I
> wrote a small script that constructs a GIL battle and records how often
> a check interval results in a thread switch or not. For monitoring
> check intervals, I used a small C extension to read _Py_Ticker from
> ceval.c. It is not declared static, so I could easily hack into it.

Anyway, if anyone wants to run a GIL battle, here is the code I used. If
it turns out the GIL is far worse with pthreads, as it is implemented
with a mutex, it might be a good idea to reimplement it with an event
object as it is on Windows.

Sturla Molden

In Python:

    from giltest import *
    from time import clock
    import threading
    import sys

    def thread(rank, battle, start):
        while not start.isSet():
            if rank == 0:
                start.set()
        try:
            while 1:
                battle.record(rank)
        except:
            pass

    if __name__ == '__main__':
        sys.setcheckinterval(1000)
        print "check interval = %d" % sys.getcheckinterval()
        for nthreads in range(1, 7):
            start = threading.Event()
            battle = GIL_Battle(100000)
            threads = [threading.Thread(target=thread, args=(i, battle, start))
                       for i in range(1, nthreads)]
            for t in threads:
                t.setDaemon(True)
                t.start()
            thread(0, battle, start)
            for t in threads:
                t.join()
            s, m = battle.report()
            print "nthreads=%d, switched=%d, missed=%d" % (nthreads, s, m)

In Cython or Pyrex:

    from exceptions import Exception

    cdef extern from *:
        ctypedef int vint "volatile int"
        vint _Py_Ticker

    class StopBattle(Exception):
        pass

    cdef class GIL_Battle:
        """Tests the fairness of the GIL"""

        cdef vint prev_tick, prev_rank, switched, missed
        cdef int trials

        def __cinit__(GIL_Battle self, int trials=100000):
            self.prev_tick = _Py_Ticker
            self.prev_rank = -1
            self.missed = 0
            self.switched = 0
            self.trials = trials

        def record(GIL_Battle self, int rank):
            if self.trials == self.switched + self.missed:
                raise StopBattle
            if self.prev_rank == -1:
                self.prev_tick = _Py_Ticker
                self.prev_rank = rank
            else:
                # _Py_Ticker counts down; a higher value than last time
                # means a periodic check (a switch opportunity) happened
                if _Py_Ticker > self.prev_tick:
                    if self.prev_rank == rank:
                        self.missed += 1
                    else:
                        self.switched += 1
                    self.prev_tick = _Py_Ticker
                    self.prev_rank = rank
                else:
                    self.prev_tick = _Py_Ticker

        def report(GIL_Battle self):
            return int(self.switched), int(self.missed)
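[The bookkeeping in GIL_Battle does not actually require the C
extension. Below is a pure-Python stand-in (the name SwitchRecorder is
made up for this sketch); note it is cruder than the Cython version: it
classifies every consecutive pair of record() calls, not only those
separated by an interpreter check, so it cannot distinguish ticks
without a check from missed switches.]

```python
class StopBattle(Exception):
    """Raised once the requested number of trials has been recorded."""

class SwitchRecorder:
    """Pure-Python approximation of the Cython GIL_Battle above."""

    def __init__(self, trials):
        self.trials = trials
        self.prev_rank = None
        self.switched = 0
        self.missed = 0

    def record(self, rank):
        if self.switched + self.missed == self.trials:
            raise StopBattle
        if self.prev_rank is not None:
            if rank == self.prev_rank:
                self.missed += 1     # same thread ran again
            else:
                self.switched += 1   # another thread got the GIL
        self.prev_rank = rank

    def report(self):
        return self.switched, self.missed

# Deterministic smoke test of the bookkeeping (no threads involved):
r = SwitchRecorder(trials=5)
for rank in (0, 1, 0, 0, 1, 1):
    r.record(rank)
print(r.report())  # (3, 2)
```

Driving it from several busy threads, as in the script above, gives a
rough picture of how consecutive bytecode bursts are distributed across
threads.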
Re: [Python-Dev] PEP 3142: Add a while clause to generator expressions
On 1/20/2009 4:45 PM, Gerald Britton wrote:
> OK, so your suggestion:
>
>     g = (n for n in range(100) if n*n < 50 or raiseStopIteration())
>
> really means "return n in the range 0-99 if n-squared is less than 50
> or the function raiseStopIteration() returns True". How would this get
> the generator to stop once n*n >= 50? It looks instead like the first
> time around, StopIteration will be raised and (presumably) the
> generator will terminate.

I still find it odd to invent new syntax for simple things like

    def quit():
        raise StopIteration

    gen = itertools.imap(lambda x: x if x*x < 50 else quit(),
                         (i for i in range(100)))

    for i in gen:
        print i

Sturla Molden
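[A note for modern readers: since PEP 479 (Python 3.7), a StopIteration
escaping into a generator is turned into RuntimeError, so the quit()
trick above no longer works inside generator expressions. The supported
spelling of "stop once the predicate fails" is itertools.takewhile:]

```python
from itertools import takewhile

# Stop yielding as soon as n*n reaches 50 (7*7 = 49 is the last survivor).
result = list(takewhile(lambda n: n * n < 50, range(100)))
print(result)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

This is also lazy, so it composes with other itertools the same way the
proposed "while clause" would.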
Re: [Python-Dev] PEP 3142: Add a while clause to generator expressions
On 1/19/2009 6:51 PM, Terry Reedy wrote:
> The other, posted by Steven Bethard, is that it fundamentally breaks
> the current semantics of abbreviating (except for iteration variable
> scoping) an 'equivalent' for loop.

The proposed syntax would suggest that this should be legal as well:

    for i in iterable while cond:
        blahblah

or perhaps:

    while cond for i in iterable:
        blahblah

A while-for or for-while loop would be a novel invention, not seen in
any other language that I know of. I seriously doubt its usefulness,
though...

Sturla Molden
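[The proposed for-while statement is already expressible with an
explicit break; a sketch of a helper in that style (the name for_while
is hypothetical, chosen to mirror the proposed syntax):]

```python
def for_while(iterable, cond):
    # Equivalent of the proposed "for i in iterable while cond(i): ..."
    for i in iterable:
        if not cond(i):
            break
        yield i

print(list(for_while(range(10), lambda i: i < 4)))  # [0, 1, 2, 3]
```

Which is essentially what itertools.takewhile already provides, and part
of why the new syntax buys so little.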
Re: [Python-Dev] The endless GIL debate: why not remove thread support instead?
On 12/12/2008 11:52 AM, Lennart Regebro wrote:
> The use of threads for load balancing should be discouraged, yes.

That is not what they are designed for. Threads are designed to allow
blocking processes to go on in the background without blocking the main
process. It seems that most programmers with Java or Windows experience
don't understand this; hence the everlasting GIL debate.

With multiple interpreters - one interpreter per thread - this could
still be accomplished. Let one interpreter block while another continues
to work; the result of the blocking operation is then messaged back.
Multi-threaded C libraries could be used in the same way. But there
would be no need for a GIL, because each interpreter would be a
single-threaded compartment. .NET has something similar in what it calls
'appdomains'.

I am not suggesting removal of threads as such, but rather of the Java
threading model. I just think it is a mistake to let multiple OS threads
touch the same interpreter.

Sturla Molden
[Python-Dev] The endless GIL debate: why not remove thread support instead?
Last month there was a discussion on Python-Dev regarding removal of
reference counting to remove the GIL. I hope you forgive me for
continuing the debate.

I think reference counting is a good feature. It prevents huge piles of
garbage from building up and makes the interpreter run more smoothly.
This is not just important for games and multimedia applications, but
also for servers under high load. Python does not pause to look for
garbage like Java or .NET; it only pauses to look for dead reference
cycles. That can be safely turned off temporarily, or turned off
completely if you do not create reference cycles. With Java and .NET, no
garbage is ever reclaimed except by the intermittent garbage collection.
Python always reclaims an object when the reference count drops to zero,
whether the GC is enabled or not. This makes Python programs
well-behaved. For this reason, I think removing reference counting is a
genuinely bad idea. Even if the GIL is evil, this remedy is even worse.

I am not a Python core developer; I am a research scientist who uses
Python because Matlab is (or used to be) a bad programming language,
albeit a good computing environment. As most people who have worked with
scientific computing know, there are better paradigms for concurrency
than threads. In particular, there are message-passing systems like MPI
and Erlang, and there are autovectorizing compilers for OpenMP and
Fortran 90/95. There are special LAPACK, BLAS and FFT libraries for
parallel computer architectures. There are fork-join systems like Cilk
and java.util.concurrent. Threads seem to be used only because mediocre
programmers don't know what else to use.

I genuinely think the use of threads should be discouraged. It leads to
code that is full of bugs and difficult to maintain - race conditions,
deadlocks, and livelocks are common pitfalls. Very few developers are
capable of implementing efficient load balancing by hand. Multi-threaded
programs tend to scale badly because they are badly written. If the GIL
discourages the abuse of threads, it serves a purpose, albeit being evil
like the Linux kernel's BKL.

Python could be better off doing what Tcl does: allow each process to
embed multiple interpreters, and run each interpreter in its own thread.
Implement a fast message-passing system between the interpreters (e.g.
copy-on-write by making communicated objects immutable), and Python
would be closer to Erlang than Java.

I thus think the main offenders are the thread and threading modules -
not the GIL. Without thread support in the interpreter, there would be
no threads. Without threads, there would be no need for a GIL. Both
sources of evil can be removed by just removing thread support from the
Python interpreter. In addition, it would make Python faster at
executing linear code. Just copy the concurrency model of Erlang instead
of Java and get rid of those nasty threads. In the meanwhile, I'll
continue to experiment with multiprocessing.

Removing reference counting to encourage the use of threads is like
shooting ourselves in the leg twice. That's my two cents on this issue.

There is another issue to note as well: if you can endure a 200x loss of
efficiency by using Python instead of Fortran, scalability on dual or
quad-core processors may not be that important. Just move the
bottlenecks out of Python and you are much better off.

Regards,
Sturla Molden