On Thu, Feb 12, 2009 at 12:42:37AM -0600, Robert Kern wrote:
It is implemented using threads, with Windows native threads on
Windows. I think Gaël really just meant threads there.
I guess so :). Once you reformulate my remark in proper terms, this is
indeed what comes out.
I guess all what it
Hi Brian,
On Thursday 12 February 2009, Brian Granger wrote:
Hi,
This is relevant for anyone who would like to speed up array based
codes using threads.
I have a simple loop that I have implemented using Cython:
def backstep(np.ndarray opti, np.ndarray optf,
int istart,
Brian Granger wrote:
I am curious: would you know what would be different in numpy's case
compared to matlab array model concerning locks ? Matlab, up to
recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
(or 7.4), it also uses multicore for mathematical functions (cos,
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
A quick digression:
It would be interesting to see how a spec would look for integrating
OpenMP natively into Cython for these kinds of purposes. Cython is
still flexible as a language after all.
That would be really nice indeed.
Francesc Alted wrote:
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
A quick digression:
It would be interesting to see how a spec would look for integrating
OpenMP natively into Cython for these kinds of purposes. Cython is
still flexible as a language after all.
Gregor Thalhammer wrote:
Recent Matlab versions use Intel's Math Kernel Library, which performs
automatic multi-threading - also for mathematical functions like sin
etc, but not for addition, multiplication etc.
It does if you have access to the parallel toolbox I mentioned earlier
in this
On 2/12/2009 7:15 AM, David Cournapeau wrote:
Since openmp also exists on windows, I doubt that it is required that
openmp uses pthread :)
On Windows, MSVC uses Win32 threads and GCC (Cygwin and MinGW) uses
pthreads. If you use OpenMP with MinGW, the executable becomes dependent
on
On 2/12/2009 11:30 AM, Dag Sverre Seljebotn wrote:
It would be interesting to see how a spec would look for integrating
OpenMP natively into Cython for these kinds of purposes. Cython is still
flexible as a language after all. Avoiding language bloat is also
important, but it is difficult to
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
FYI, I am one of the core Cython developers and can make such
modifications in Cython itself as long as there's consensus on how it
should look on the Cython mailing list. My problem is that I don't
really know OpenMP and have little
On Thursday 12 February 2009, Sturla Molden wrote:
OpenMP does not need to be a part of the Cython language. It can be
special comments in the code as in Fortran. After all, #pragma omp
parallel is a comment in Cython.
Hey! That's very nice to know. We already have OpenMP support in
On 2/12/2009 12:20 PM, David Cournapeau wrote:
It does if you have access to the parallel toolbox I mentioned earlier
in this thread (again, no experience with it, but I think it is
especially popular on clusters; in that case, though, it is not limited
to thread-based implementation).
As has
Sturla Molden wrote:
On 2/12/2009 12:20 PM, David Cournapeau wrote:
It does if you have access to the parallel toolbox I mentioned earlier
in this thread (again, no experience with it, but I think it is
especially popular on clusters; in that case, though, it is not limited
to
Francesc Alted wrote:
I don't know OpenMP well enough either, but I'd say there could be
some people on this list who could help.
At any rate, I really like the OpenMP approach and would much prefer
support for it in Cython over threading, MPI or whatever.
But the thing is: is
Sturla Molden wrote:
On 2/12/2009 12:20 PM, David Cournapeau wrote:
Hi,
It does if you have access to the parallel toolbox I mentioned earlier
in this thread (again, no experience with it, but I think it is
especially popular on clusters; in that case, though, it is not limited
to
Sturla Molden wrote:
On 2/12/2009 1:50 PM, Francesc Alted wrote:
Hey! That's very nice to know. We already have OpenMP support in
Cython for free (or apparently it seems so :-)
No we don't, as variable names are different in C and Cython. But
adding support for OpenMP would
I am curious: would you know what would be different in numpy's case
compared to matlab array model concerning locks ? Matlab, up to
recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
(or 7.4), it also uses multicore for mathematical functions (cos,
etc...). So at least
Yes, it is. You have to link against pthread (at least with Linux ;))
You have to write a single parallel region if you don't want this
overhead (which is not possible with Python).
Matthieu
2009/2/12 Gael Varoquaux gael.varoqu...@normalesup.org:
On Wed, Feb 11, 2009 at 11:52:40PM -0600,
2009/2/12 Sturla Molden stu...@molden.no:
On 2/12/2009 1:50 PM, Francesc Alted wrote:
Hey! That's very nice to know. We already have OpenMP support in
Cython for free (or apparently it seems so :-)
No we don't, as variable names are different in C and Cython. But
adding support for
Matthieu Brucher wrote:
No - I have never seen a deep explanation of the matlab model. The C api
is so small that it is hard to deduce anything from it (except that the
memory handling is not ref-counting-based, I don't know if it matters
for our discussion of speeding up ufunc). I would guess
On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote:
FYI, I am one of the core Cython developers and can make such
modifications in Cython itself as long as there's consensus on how it
should look on the Cython mailing list. My problem is that I don't
really know OpenMP and have little
Matthieu Brucher wrote:
Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;)
In C, if you have two arrays (two pointers), the compiler can't make
aggressive optimizations because they may intersect. With Fortran,
this is not possible. In this matter, Numpy behaves like C
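The aliasing concern raised here can be illustrated with a small sketch. It is not taken from the thread: it uses only the Python standard library (`array` and `memoryview`) to stand in for two pointers into the same buffer, and the helper name `shift_forward` is hypothetical. The point is only that an in-order element loop gives a different answer when source and destination overlap, which is why a compiler that cannot rule out overlap has to be conservative.

```python
from array import array


def shift_forward(dst, src):
    """Copy src into dst one element at a time, in index order,
    the way a naive C loop over two pointers would."""
    for i in range(len(src)):
        dst[i] = src[i]


# Case 1: dst and src are overlapping views of the same buffer
# (the situation a C compiler must assume is possible).
buf = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])
view = memoryview(buf)
shift_forward(view[1:], view[:-1])
aliased_result = buf.tolist()

# Case 2: src is an independent copy, so the loop cannot
# overwrite values it has not read yet.
buf2 = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])
src_copy = array("d", buf2[:-1])
shift_forward(memoryview(buf2)[1:], src_copy)
independent_result = buf2.tolist()

print(aliased_result)
print(independent_result)
```

In the overlapping case each write feeds the next read, so the first value is smeared across the buffer; with an independent source the data is simply shifted by one. Reordering or vectorizing the loop is only safe when the second situation is guaranteed.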
On 2/12/2009 1:44 PM, Sturla Molden wrote:
Here is an example of SciPy's ckdtree.pyx modified to use OpenMP.
It seems I managed to post an erroneous C file. :(
S.M.
/*
* Parallel query for faster kd-tree searches on SMP computers.
* This function will
David Cournapeau wrote:
Matthieu Brucher wrote:
For BLAS level 3, the MKL is parallelized (so matrix multiplication is).
Hi David,
Same for ATLAS: thread support is one focus in the 3.9 series, currently
in development.
ATLAS has had thread support for a long, long time. The 3.9 series
2009/2/12 David Cournapeau da...@ar.media.kyoto-u.ac.jp:
Matthieu Brucher wrote:
No - I have never seen a deep explanation of the matlab model. The C api
is so small that it is hard to deduce anything from it (except that the
memory handling is not ref-counting-based, I don't know if it matters
2009/2/12 David Cournapeau da...@ar.media.kyoto-u.ac.jp:
Matthieu Brucher wrote:
Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;)
In C, if you have two arrays (two pointers), the compiler can't make
aggressive optimizations because they may intersect. With Fortran,
On Thu, Feb 12, 2009 at 03:27:51PM +0100, Sturla Molden wrote:
The question is: Should OpenMP be comments in the Cython code (as they
are in C and Fortran), or should OpenMP be special objects?
My two cents: go for cython objects/statements. Not only does code in
comments look weird and like a
On 2/12/2009 5:24 PM, Gael Varoquaux wrote:
My two cents: go for cython objects/statements. Not only does code in
comments look weird and like a hack, but it also means you have to hack
the parser.
I agree with this. Particularly because Cython uses indentation as
syntax. With comments you
Sturla Molden wrote:
On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote:
FYI, I am one of the core Cython developers and can make such
modifications in Cython itself as long as there's consensus on how it
should look on the Cython mailing list. My problem is that I don't
really know
Dag Sverre Seljebotn wrote:
Hmm... yes. Care would need to be taken though because Cython might in
the future very well generate a while loop instead for such a
statement under some circumstances, and that won't work with OpenMP. One
should be careful with assuming what the C result will be
If your problem is evaluating vector expressions just like the above
(i.e. without using transcendental functions like sin, exp, etc...),
usually the bottleneck is on memory access, so using several threads is
simply not going to help you achieve better performance, but rather
the contrary
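A back-of-the-envelope model can make the memory-bound argument concrete. The sketch below is not from the thread, and every number in it is an assumption chosen for illustration (10 GB/s of shared memory bandwidth, 2 GFLOP/s per core, an expression like `a*b + c` that touches four arrays and costs two operations per element). The function name `estimate_seconds` is hypothetical. It also assumes that memory bandwidth is shared and does not grow with the number of threads, which is a simplification.

```python
def estimate_seconds(n_elements, n_threads=1, itemsize=8, n_streams=4,
                     flops_per_element=2, bandwidth=10e9, flops_per_core=2e9):
    """Crude lower bound on run time: the slower of moving the data
    and doing the arithmetic. All default values are illustrative."""
    # Bytes read and written: n_streams arrays of n_elements items each.
    bytes_moved = n_elements * itemsize * n_streams
    memory_time = bytes_moved / bandwidth
    # Arithmetic is assumed to scale perfectly with the thread count.
    compute_time = n_elements * flops_per_element / (flops_per_core * n_threads)
    return max(memory_time, compute_time)


n = 10_000_000
cheap_one = estimate_seconds(n, n_threads=1)
cheap_four = estimate_seconds(n, n_threads=4)

# A costlier per-element function (e.g. a transcendental), assumed
# here to cost 200 operations per element.
costly_one = estimate_seconds(n, n_threads=1, flops_per_element=200)
costly_four = estimate_seconds(n, n_threads=4, flops_per_element=200)

print(cheap_one, cheap_four)
print(costly_one, costly_four)
```

Under these assumptions the cheap expression is limited by memory traffic regardless of thread count, while the expensive per-element function is limited by arithmetic and benefits from extra threads. The model ignores caches and thread start-up cost, so it only indicates which regime a computation is likely to be in.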
Recent Matlab versions use Intel's Math Kernel Library, which performs
automatic multi-threading - also for mathematical functions like sin
etc, but not for addition, multiplication etc. It seems to me Matlab
itself does not take care of multi-threading. On
At any rate, I really like the OpenMP approach and would much prefer
support for it in Cython over threading, MPI or whatever.
But the thing is: is OpenMP stable and mature enough to be used on
most common platforms? I think that recent GCC compilers support
the latest
Wow, interesting thread. Thanks everyone for the ideas. A few more comments:
GPUs/CUDA:
* Even though there is a bottleneck between main memory and GPU
memory, as Nathan mentioned, the much larger memory bandwidth on a GPU
often makes GPUs great for memory bound computations...as long as you
Brian Granger wrote:
And a question:
With the new Numpy support in Cython, does Cython release the GIL if
it can when running through loops over numpy arrays? Does
Cython call into the C API during these sections?
You know, I thought of the exact same thing when reading your post. No,
you need the GIL currently, but that's something I'd like to fix.
Ideally, it would be something like this:
cdef int i, s = 0, n = ...
cdef np.ndarray[int] arr = ... # will require the GIL
with nogil:
for i in
Sturla Molden wrote:
IMO there's a problem with using literal variable names here, because
Python syntax implies that the value is passed. One shouldn't make
syntax where private=(i,) is legal but private=(f(),) isn't.
The latter would be illegal in OpenMP as well. OpenMP pragmas only take
On Wed, Feb 11, 2009 at 23:46, Brian Granger ellisonbg@gmail.com wrote:
Hi,
This is relevant for anyone who would like to speed up array based
codes using threads.
I have a simple loop that I have implemented using Cython:
def backstep(np.ndarray opti, np.ndarray optf,
int
Eric Jones tried to do this with pthreads in C some time ago. His work is
here:
http://svn.scipy.org/svn/numpy/branches/multicore/
The lock overhead makes it usually not worthwhile.
I was under the impression that Eric's implementation didn't use a
thread pool. Thus I thought the
On Thu, Feb 12, 2009 at 00:03, Brian Granger ellisonbg@gmail.com wrote:
Eric Jones tried to do this with pthreads in C some time ago. His work is
here:
http://svn.scipy.org/svn/numpy/branches/multicore/
The lock overhead makes it usually not worthwhile.
I was under the impression
On Wed, Feb 11, 2009 at 11:52:40PM -0600, Robert Kern wrote:
This seem like pretty heavy solutions though.
From a programmer's perspective, it seems to me like OpenMP is a much
lighter weight solution than pthreads.
From a programmer's perspective, because, IMHO, openmp is implemented
using
Robert Kern wrote:
Eric Jones tried to do this with pthreads in C some time ago. His work is
here:
http://svn.scipy.org/svn/numpy/branches/multicore/
The lock overhead makes it usually not worthwhile.
I am curious: would you know what would be different in numpy's case
compared to
Gael Varoquaux wrote:
From a programmer's perspective, because, IMHO, openmp is implemented
using pthreads.
Since openmp also exists on windows, I doubt that it is required that
openmp uses pthread :)
On linux, with gcc, using -fopenmp implies -pthread, so I guess it uses
pthread (can you be
I am curious: would you know what would be different in numpy's case
compared to matlab array model concerning locks ? Matlab, up to
recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
(or 7.4), it also uses multicore for mathematical functions (cos,
etc...). So at least
Good point. Is it possible to tell what array size it switches over
to using multiple threads?
Yes.
http://svn.scipy.org/svn/numpy/branches/multicore/numpy/core/threadapi.py
Sorry, I was curious about what Matlab does in this respect. But,
this is very useful and I will look at it.
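The idea discussed here, a reusable thread pool combined with an array-size threshold below which work stays on one thread, can be sketched as follows. This is not the code from the numpy multicore branch, which is not reproduced in this listing. The names `threaded_cos`, `_cos_chunk`, `THRESHOLD` and the pool size are all hypothetical, and the sketch uses plain Python lists with the standard library only. In CPython the inner loop holds the GIL, so this shows the dispatch structure rather than a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor

THRESHOLD = 10_000  # hypothetical cut-over size, to be tuned by measurement
_POOL = ThreadPoolExecutor(max_workers=4)  # created once and reused


def _cos_chunk(values, out, start, stop):
    """Fill out[start:stop] with the cosine of values[start:stop]."""
    for i in range(start, stop):
        out[i] = math.cos(values[i])


def threaded_cos(values, n_chunks=4, threshold=THRESHOLD):
    """Elementwise cosine; small inputs skip the pool entirely."""
    n = len(values)
    out = [0.0] * n
    if n < threshold:
        # Below the threshold the dispatch overhead is assumed to
        # outweigh any gain, so run serially.
        _cos_chunk(values, out, 0, n)
        return out
    step = max(1, -(-n // n_chunks))  # ceiling division, at least 1
    futures = [
        _POOL.submit(_cos_chunk, values, out, start, min(start + step, n))
        for start in range(0, n, step)
    ]
    for future in futures:
        future.result()  # wait for completion and re-raise any worker error
    return out


data = [0.1 * i for i in range(50)]
serial = threaded_cos(data, threshold=1000)  # takes the serial path
pooled = threaded_cos(data, threshold=10)    # takes the pooled path
```

Reusing one pool avoids paying thread creation cost on every call, which is the overhead a per-call thread spawn would incur. The right threshold depends on the machine and the operation, so it is left as a parameter here.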
Brian Granger wrote:
I am curious: would you know what would be different in numpy's case
compared to matlab array model concerning locks ? Matlab, up to
recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
(or 7.4), it also uses multicore for mathematical functions (cos,
45 matches