Re: [Numpy-discussion] Matlab vs. Python (Was: Re: [SciPy-Dev] Good-bye, sort of (John Hunter)) (Sturla Molden)

2010-08-15 Thread Ioan-Alexandru Lazar
Hi everyone,

I've been pretty happy with how Spyder worked when I tried it. I use
Emacs, but I think Spyder is perfectly usable for someone used to Matlab.
A few of my HPC-centric reasons (I've shamelessly copy-pasted this because
I'm lazy right now):

1. Python is an expressive, full-fledged, general-purpose application
language. There is slightly more boilerplate for math-related operations
(e.g. creating a matrix is not as simple as A = [1, 2, 3; 4, 5, 6] but
rather A = numpy.array([[1, 2, 3], [4, 5, 6]])). On the other hand,
everything other than math can be expressed without causing the user
severe nausea and vomiting.

2. Python's package and module structure gives me a significantly less
cluttered workspace.

3. There’s built-in support for automatically generating meaningful and
useful documentation.

4. The wonderful FFI support makes it easy to work with external C code.
Writing MEX functions is a total mess, and I hate it. In Python, it’s easy
for me to integrate with a custom-compiled version of UMFPACK or any other
solver, and wrappers can be generated automatically with SWIG with minimal
effort (a small ctypes sketch of this kind of wrapping follows the list).

5. There is a Matlab wrapper called mlabwrap, so legacy code written in
Matlab is not wasted effort.

6. I can use Emacs for my development rather than choosing between a)
working in a half-assed environment without code completion or b) working
with Matlab’s incredibly slow and sloppy user interface on Unix systems.

7. I have built-in support for primitives like linked lists, queues,
stacks and tuples.

8. I have standards-compliant support for MPI that does not look alien in
Python (there’s support for that in Matlab too, but it feels like you’re
coding on Mars in a library written somewhere in the Andromeda galaxy).
This is extremely important to me right now because the algorithm we’re
implementing will be presented at a conference in September, and I want an
implementation that uses standard tools available to every HPC developer
(a minimal mpi4py sketch also follows the list). If it turns out to be
necessary, I will be able to further optimize communication by using a
good distributed-object library like PyRO.
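
To make point 4 concrete, here is a minimal sketch of calling a
custom-compiled C solver through ctypes. The library name libmysolver.so
and the function solve_system are hypothetical placeholders made up for
illustration; only the ctypes mechanics themselves are meant literally.

===
import ctypes
import numpy

# Hypothetical custom-compiled solver library (placeholder name).
lib = ctypes.CDLL("libmysolver.so")

# Suppose the C side exposes:  int solve_system(int n, double *b, double *x)
lib.solve_system.restype = ctypes.c_int
lib.solve_system.argtypes = [ctypes.c_int,
                             ctypes.POINTER(ctypes.c_double),
                             ctypes.POINTER(ctypes.c_double)]

n = 10
b = numpy.ones(n)      # right-hand side
x = numpy.empty(n)     # solution vector to be filled in by the C code

status = lib.solve_system(
    n,
    b.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    x.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
)
===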
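
And for point 8, a minimal mpi4py sketch (run with something like
mpirun -np 4 python demo.py). The reduction below is just a toy stand-in to
show that the standard MPI idioms carry over directly; it is not the
algorithm we are implementing.

===
import numpy
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank contributes a local partial result; rank 0 collects the sum.
local = numpy.array([float(rank)])
total = numpy.zeros(1)
comm.Reduce(local, total, op=MPI.SUM, root=0)

if rank == 0:
    print("Sum over %d ranks: %g" % (size, total[0]))
===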

Some details about exactly what I'm using it for are on my blog at
http://zencoding.org/archives/137#more-137, although it's rather sketchy --
I plan to write a slightly more detailed document about my experience with
Python for HPC and how it compares to Matlab's PCS and DCS. I've been
banging my head against Matlab for a while, so I'll gladly write a thing or
two on the wiki if you think this sort of use case is relevant.

One other observation -- Matlab does have some support for parallel
processing via PCS, but you have to pay for it, and it's not very flexible.
There's also DCS for distributed computing. Some of my colleagues have been
using both and aren't too happy with their flexibility just yet. MathWorks
is also not very forthcoming about which parts of Matlab are parallelized
and which aren't, so we've been randomly stumbling upon parts that actually
were (e.g. UMFPACK is linked against a multithreaded BLAS) even though we
thought they weren't. Their documentation is detailed, but when it comes to
under-the-hood details and optimization, it's a bad joke. If you need
Matlab as a glorified handheld calculator or as a prototyping tool, it's
great, but writing full-fledged applications in it is painful.

Best regards,
Alexandru Lazar,

Numerical Modeling Laboratory, Politehnica University of Bucharest


[Numpy-discussion] (no subject)

2010-07-21 Thread Ioan-Alexandru Lazar
Hello everyone,

I'm currently planning to use a Python-based infrastructure for our HPC
project. I've previously used NumPy and SciPy for basic scientific
computing tasks, so performance hasn't really been an issue for me until
now. At the moment I'm not too sure what to do next, though, and I was
hoping that someone with more experience in performance-related issues
could point me to a way out of this.

The trouble lies in the following piece of code:

===
w = 2 * math.pi * f
M = A - (1j*w*E)
n = M.shape[1]
B1 = numpy.zeros(n)
B2 = numpy.zeros(n)
B1[n-2] = 1.0
B2[n-1] = 1.0
# --- slow part starts here ---
umfpack.numeric(M)
x1 = umfpack.solve(um.UMFPACK_A, M, B1, autoTranspose=False)
x2 = umfpack.solve(um.UMFPACK_A, M, B2, autoTranspose=False)
solution = scipy.array([[x1[n-2], x2[n-2]], [x1[n-1], x2[n-1]]])
return solution
===

This isn't really too much -- it's generating a small


[Numpy-discussion] UMFPACK interface is unexpectedly slow

2010-07-21 Thread Ioan-Alexandru Lazar
Hello everyone,

First of all, let me apologize for my earlier message; I made the mistake
of trying to indent my code using SquirrelMail's horrible interface -- and
pressing Tab and Space resulted in sending my (incomplete) e-mail to the
list. Cursed be Opera's keyboard shortcuts now :-).

I'm currently planning to use a Python-based infrastructure for our HPC
project. I've previously used NumPy and SciPy for basic scientific
computing tasks, so performance hasn't really been an issue for me until
now. At the moment I'm not too sure what to do next, though, and I was
hoping that someone with more experience in performance-related issues
could point me to a way out of this.

The trouble lies in the following piece of code:

===
w = 2 * math.pi * f
M = A - (1j*w*E)
n = M.shape[1]
B1 = numpy.zeros(n)
B2 = numpy.zeros(n)
B1[n-2] = 1.0
B2[n-1] = 1.0
# --- slow part starts here ---
umfpack.numeric(M)
x1 = umfpack.solve(um.UMFPACK_A, M, B1, autoTranspose=False)
x2 = umfpack.solve(um.UMFPACK_A, M, B2, autoTranspose=False)
solution = scipy.array([[x1[n-2], x2[n-2]], [x1[n-1], x2[n-1]]])
return solution
===

This isn't really much -- it generates a system matrix via operations
that take little time, as I expected. The trouble is that the solve part
takes significantly longer than it does in Octave -- about 4 times as long.
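
(For reference, here is a self-contained sketch of the same factor-once /
solve-twice pattern, using scipy.sparse.linalg.splu instead of the
UmfpackContext wrapper above; the matrices below are small random stand-ins
rather than my actual system, so only the structure of the code is meant to
match.)

===
import math
import numpy
import scipy.sparse
from scipy.sparse.linalg import splu

n = 1000
f = 50.0
# Random sparse stand-ins for the real A and E matrices.
A = scipy.sparse.rand(n, n, density=0.01, format='csc') + \
    scipy.sparse.identity(n, format='csc')
E = scipy.sparse.identity(n, format='csc')

w = 2 * math.pi * f
M = (A - 1j*w*E).tocsc()     # complex system matrix in CSC form

B1 = numpy.zeros(n, dtype=complex)
B2 = numpy.zeros(n, dtype=complex)
B1[n-2] = 1.0
B2[n-1] = 1.0

lu = splu(M)                 # factor once...
x1 = lu.solve(B1)            # ...and reuse the factorization for
x2 = lu.solve(B2)            #    both right-hand sides

solution = numpy.array([[x1[n-2], x2[n-2]],
                        [x1[n-1], x2[n-1]]])
===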

I'm using the stock version of UMFPACK in Ubuntu's repository; it's
compiled against standard BLAS, so it's fairly slow, but so is Octave --
so the problem isn't there.
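
(One quick way to check which BLAS/LAPACK NumPy itself was built against is
shown below; note that this says nothing about the BLAS that the stock
UMFPACK shared library links to -- inspecting that library with ldd is one
way to check the latter.)

===
import numpy

# Print the BLAS/LAPACK build configuration of this NumPy installation.
numpy.show_config()
===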

I'm obviously doing something wrong related to memory management here,
because memory consumption is also skyrocketing, but I'm not sure exactly
what I'm doing wrong. Could you point me towards some relevant
documentation on improving the performance, or give me a hint?

Best regards,
Alexandru Lazar


Re: [Numpy-discussion] UMFPACK interface is unexpectedly slow

2010-07-21 Thread Ioan-Alexandru Lazar
I hope I won't get identified as a spam bot :-). While I have not resolved
the problem itself, it turns out to be an issue I cannot reproduce on our
cluster. I wanted to get back with some actual timings from the real
hardware we are going to be using, and some details about the matrices, so
as not to chase ghosts -- and doing so turned out to be a headache saver.

It's still baffling, because on the cluster I also used stock packages
(albeit from Fedora, which is what our system administrator insists on
using) rather than my hand-compiled, optimized GotoBLAS and UMFPACK. In the
four hours I've been struggling with this, it didn't even occur to me to
try to reproduce it on another system, because I assumed that using stock
packages gave me the uniformity I needed. It seems I was wrong.
Nonetheless, I think it's safe to assume in this case that the problem is
not in NumPy or my code, and that it would be wiser to bring this up on
Ubuntu's bug tracker.

Thanks for your patience,
Alexandru

On Thu, July 22, 2010 4:10 am, Ioan-Alexandru Lazar wrote:
 Hello everyone,

 First of all, let me apologize for my earlier message; I made the mistake
 of trying to indent my code using SquirrelMail's horrible interface -- and
 pressing Tab and Space resulted in sending my (incomplete) e-mail to the
 list. Cursed be Opera's keyboard shortcuts now :-).

 I'm currently planning to use a Python-based infrastructure for our HPC
 project. I've previously used NumPy and SciPy for basic scientific
 computing tasks, so performance hasn't really been an issue for me until
 now. At the moment I'm not too sure what to do next, though, and I was
 hoping that someone with more experience in performance-related issues
 could point me to a way out of this.

 The trouble lies in the following piece of code:

 ===
 w = 2 * math.pi * f
 M = A - (1j*w*E)
 n = M.shape[1]
 B1 = numpy.zeros(n)
 B2 = numpy.zeros(n)
 B1[n-2] = 1.0
 B2[n-1] = 1.0
 # --- slow part starts here ---
 umfpack.numeric(M)
 x1 = umfpack.solve(um.UMFPACK_A, M, B1, autoTranspose=False)
 x2 = umfpack.solve(um.UMFPACK_A, M, B2, autoTranspose=False)
 solution = scipy.array([[x1[n-2], x2[n-2]], [x1[n-1], x2[n-1]]])
 return solution
 ===

 This isn't really much -- it generates a system matrix via operations
 that take little time, as I expected. The trouble is that the solve part
 takes significantly longer than it does in Octave -- about 4 times as
 long.

 I'm using the stock version of UMFPACK in Ubuntu's repository; it's
 compiled against standard BLAS, so it's fairly slow, but so is Octave --
 so the problem isn't there.

 I'm obviously doing something wrong related to memory management here,
 because memory consumption is also skyrocketing, but I'm not sure exactly
 what I'm doing wrong. Could you point me towards some relevant
 documentation on improving the performance, or give me a hint?

 Best regards,
 Alexandru Lazar

