[Numpy-discussion] Weighted covariance.

2015-04-29 Thread Charles R Harris
The weighted covariance function in PR #4960 is evolving to the following,
where frequency weights are `f` and reliability weights are `a`.

Assume that the observations are in the columns of the observation matrix
``m``. The steps to compute the weighted covariance are as follows::

>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(a * w)
>>> m -= np.sum(m * w, axis=1, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)

Note that when ``a == 1``, the normalization factor ``v1 / (v1**2 -
ddof * v2)`` goes over to ``1 / (np.sum(f) - ddof)``
as it should.
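
As an end-to-end sketch of the steps above (hypothetical data; the
comparison against ``np.cov`` assumes a NumPy version in which the PR's
``fweights``/``aweights`` support has landed)::

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 10))         # 3 variables, 10 observations in columns
f = rng.integers(1, 5, size=10)      # frequency weights (integer counts)
a = rng.uniform(0.5, 1.0, size=10)   # reliability weights
ddof = 1

w = f * a
v1 = np.sum(w)
v2 = np.sum(a * w)
mc = m - np.sum(m * w, axis=1, keepdims=True) / v1   # weighted centering
cov = np.dot(mc * w, mc.T) * v1 / (v1**2 - ddof * v2)

# Limiting case: with a == 1 the normalization factor should reduce
# to 1 / (sum(f) - ddof), i.e. the plain frequency-weighted estimate.
wf = f.astype(float)
mcf = m - np.sum(m * wf, axis=1, keepdims=True) / wf.sum()
cov_f = np.dot(mcf * wf, mcf.T) / (wf.sum() - ddof)
```

The second block checks the ``a == 1`` limiting case mentioned above.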

This is probably a good time for comments from all the kibitzers out there.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Julian Taylor
On 29.04.2015 17:50, Robert Kern wrote:
> Check to see if you have the "Transparent Hugepages" (THP) Linux kernel
> feature enabled on each cluster. You may want to try turning it off.

This issue has nothing to do with THP; it's a change to np.array in numpy
1.9. It is now as fast as vstack, while before it was really slow.
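
A quick way to see this on a given installation (a hypothetical
micro-benchmark; absolute times depend on the machine and NumPy version):

```python
import timeit
import numpy as np

# A list of equal-length 1-D arrays, the case that was slow before 1.9.
rows = [np.arange(1000.0) for _ in range(100)]

t_array = timeit.timeit(lambda: np.array(rows), number=200)
t_vstack = timeit.timeit(lambda: np.vstack(rows), number=200)
print(f"np.array: {t_array:.3f}s  np.vstack: {t_vstack:.3f}s")

# Both build the same 100x1000 array; on numpy >= 1.9 the timings
# should be comparable, while on 1.8 np.array is much slower.
```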

But the memory compaction is indeed awful, especially in the backport
Red Hat did for their Enterprise Linux.

Typically it is enough to disable only the automatic defragmentation on
allocation, not THP entirely, e.g. via
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
(on Red Hat backports it is a different path).

You still have khugepaged running defrags at times of low load and in a
limited fashion, and you can also manually trigger a defrag by writing to
/proc/sys/vm/compact_memory. Though khugepaged, which runs only
occasionally, should already do a good job.



Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Julian Taylor
numpy 1.9 makes array(list) performance similar to vstack; in 1.8 it is
very slow.

On 29.04.2015 17:40, simona bellavista wrote:
> on cluster A 1.9.0 and on cluster B 1.8.2



Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Robert Kern
On Wed, Apr 29, 2015 at 4:05 PM, simona bellavista wrote:

Check to see if you have the "Transparent Hugepages" (THP) Linux kernel
feature enabled on each cluster. You may want to try turning it off. I have
recently run into a problem with a large-memory multicore machine with THP
for programs that had many large numpy.array() memory allocations. Usually,
THP helps memory-hungry applications (you can Google for the reasons), but
it does require defragmenting the memory space to get contiguous hugepages.
The system can get into a state where the memory space is so fragmented
that trying to get each new hugepage requires a lot of extra work to
create the contiguous memory regions. In my case, a perfectly
well-performing program would suddenly slow down immensely during its
memory-allocation-intensive actions. When I turned THP off, it started
working normally again.

If you have root, try using `perf top` to see what C functions in user
space and kernel space are taking up the most time in your process. If you
see anything like `do_page_fault()`, this, or a similar issue, is your
problem.
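
One way to inspect the current THP configuration without root (a sketch;
Linux-only paths, which differ on Red Hat's backported implementation):

```python
from pathlib import Path

# Report the THP settings; the bracketed entry in each file is the
# active mode. Paths are for mainline kernels and may differ on
# Red Hat's backport.
base = Path("/sys/kernel/mm/transparent_hugepage")
for name in ("enabled", "defrag"):
    path = base / name
    if path.exists():
        print(name, "->", path.read_text().strip())
    else:
        print(name, "-> not available on this kernel")
```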

--
Robert Kern


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Sebastian Berg
There was a major improvement to np.array in some cases between 1.8 and
1.9.

You can probably work around this by using np.concatenate instead of
np.array in your case (it depends on the use case, but I would guess you
have code doing something like

np.array([arr1, arr2, arr3])

or similar). If your use case is different, you may be out of luck and
only an upgrade would help.
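
As a sketch of that workaround (hypothetical arrays; on numpy < 1.9,
stacking via np.concatenate avoids the slow np.array path):

```python
import numpy as np

arrs = [np.full(5, i, dtype=float) for i in range(3)]   # e.g. arr1, arr2, arr3

# Slow on numpy < 1.9:
stacked_slow = np.array(arrs)

# Same result via concatenate, fast on both 1.8 and 1.9; each array
# gains a leading axis and the pieces are joined along it.
stacked = np.concatenate([a[np.newaxis, :] for a in arrs], axis=0)
```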


On Mi, 2015-04-29 at 17:41 +0200, Nick Papior Andersen wrote:
> You could try and install your own numpy to check whether that
> resolves the problem.


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Nick Papior Andersen
You could try and install your own numpy to check whether that resolves the
problem.

2015-04-29 17:40 GMT+02:00 simona bellavista:

> on cluster A 1.9.0 and on cluster B 1.8.2


-- 
Kind regards Nick


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread simona bellavista
on cluster A 1.9.0 and on cluster B 1.8.2

2015-04-29 17:18 GMT+02:00 Nick Papior Andersen:

> Compile it yourself to know the limitations/benefits of the dependency
> libraries.
>
> Otherwise, have you checked which versions of numpy they are, i.e. are
> they the same version?


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Nick Papior Andersen
Compile it yourself to know the limitations/benefits of the dependency
libraries.

Otherwise, have you checked which versions of numpy they are, i.e. are they
the same version?

2015-04-29 17:05 GMT+02:00 simona bellavista:


-- 
Kind regards Nick


[Numpy-discussion] performance of numpy.array()

2015-04-29 Thread simona bellavista
I work on two distinct scientific clusters. I have run the same python code
on the two clusters and I have noticed that one is faster by an order of
magnitude than the other (1 min vs 10 min; this is important because I run
this function many times).

I have investigated with a profiler and I have found that the cause of this
is that (same code and same data) the function numpy.array is being called
10^5 times. On cluster A it takes 2 s in total, whereas on cluster B it
takes ~6 min.  As for the other functions, they are generally faster on
cluster A. I understand that the clusters are quite different, both in
hardware and installed libraries. It strikes me that on this particular
function the performance is so different. I would have thought that this is
due to a difference in the available memory, but actually, looking with
`top`, the memory seems to be used only at 0.1% on cluster B. In theory
numpy is compiled with atlas on cluster B; on cluster A it is not clear,
because numpy.__config__.show() returns NOT AVAILABLE for everything.

Does anybody have any insight on that, and whether I can improve the
performance on cluster B?


Re: [Numpy-discussion] ANN: numexpr 2.4.3 released

2015-04-29 Thread Neil Girdhar
Sorry for the late reply.   I will definitely consider submitting a pull
request to numexpr if it's the direction I decide to go.  Right now I'm
still evaluating all of the many options for my project.

I am implementing a machine learning algorithm as part of my thesis work.
I'm in the "make it work" phase, but quickly approaching the "make it fast"
part.

With research, you usually want to iterate quickly, and so whatever
solution I choose has to be automated.  I can't be coding things in an
intuitive, natural way, and then porting it to a different implementation
to make it fast.  What I want is for that conversion to be automated.  I'm
still evaluating how to best achieve that.
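
For context, the AST format in question is the one produced by Python's
standard ast module; a tool that accepted such trees could, in principle,
be glued directly into user code (a sketch of the idea, not numexpr's
actual API):

```python
import ast

# Parse an expression into Python's standard AST. A hypothetical
# numexpr.eval_ast(tree) entry point, as floated in this thread, would
# consume such a tree directly instead of re-parsing a string.
tree = ast.parse("2*a + 3*b", mode="eval")

# The top-level node is an addition whose operands are multiplications;
# the free variables can be collected by walking the tree.
top = tree.body
names = sorted(n.id for n in ast.walk(tree) if isinstance(n, ast.Name))
print(type(top).__name__, names)  # BinOp ['a', 'b']
```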

On Tue, Apr 28, 2015 at 6:08 AM, Francesc Alted  wrote:

> 2015-04-28 4:59 GMT+02:00 Neil Girdhar :
>
>> I don't think I'm asking for so much.  Somewhere inside numexpr it builds
>> an AST of its own, which it converts into the optimized code.   It would be
>> more useful to me if that AST were in the same format as the one returned
>> by Python's ast module.  This way, I could glue in the bits of numexpr that
>> I like with my code.  For my purpose, this would have been the more ideal
>> design.
>>
>
> I don't think implementing this for numexpr would be that complex. So for
> example, one could add a new numexpr.eval_ast(ast_expr) function.  Pull
> requests are welcome.
>
> At any rate, which is your use case?  I am curious.
>
> --
> Francesc Alted
>