Re: [Numpy-discussion] Numpy Overhead

2017-02-28 Thread Sebastian K
Thank you! That is the information I needed.

2017-03-01 0:18 GMT+01:00 Matthew Brett <matthew.br...@gmail.com>:

> Hi,
>
> On Tue, Feb 28, 2017 at 3:04 PM, Sebastian K
> <sebastiankas...@googlemail.com> wrote:
> > Yes, you are right. There is no need to add that line. I deleted it. But the
> > measured heap peak is still the same.
>
> You're applying the naive matrix multiplication algorithm, which is
> ideal for minimizing memory use during the computation, but terrible
> for speed-related stuff like keeping values in the CPU cache:
>
> https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm
>
> The Numpy version is likely calling into a highly optimized compiled
> routine for matrix multiplication, which can load chunks of the
> matrices at a time, to speed up computation. If you really need
> minimum memory heap usage and don't care about the order of
> magnitude(s) slowdown, then you might need to use the naive method,
> maybe implemented in Cython / C.
>
> Cheers,
>
> Matthew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
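To illustrate the point about loading chunks of the matrices at a time, here is a minimal Python sketch that contrasts the naive triple loop with a blocked (tiled) variant. It is not what NumPy or its BLAS backend actually runs. The names naive_matmul and blocked_matmul and the tile size `block` are hypothetical and chosen only for illustration.

```python
import numpy as np

def naive_matmul(A, B):
    """Textbook triple loop: one scalar accumulator, no temporary buffers."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p), dtype=np.float64)
    for i in range(n):
        for j in range(p):
            acc = 0.0
            for k in range(m):
                acc += A[i, k] * B[k, j]
            C[i, j] = acc
    return C

def blocked_matmul(A, B, block=32):
    """Multiply tile by tile so that a tile can be reused while it is loaded.

    `block` is an illustrative tile size, not a tuned value.
    """
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p), dtype=np.float64)
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Slices that run past the edge are clipped by NumPy.
                # Each tile product creates a small temporary array.
                C[i0:i0 + block, j0:j0 + block] += (
                    A[i0:i0 + block, k0:k0 + block]
                    @ B[k0:k0 + block, j0:j0 + block]
                )
    return C

rng = np.random.default_rng(0)
A = rng.random((40, 50))
B = rng.random((50, 30))
reference = np.dot(A, B)

print(np.allclose(naive_matmul(A, B), reference))
print(np.allclose(blocked_matmul(A, B, block=16), reference))
```

The blocked version needs scratch space for each tile product, which is the kind of extra working memory an optimized routine may trade for speed.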


Re: [Numpy-discussion] Numpy Overhead

2017-02-28 Thread Sebastian K
Yes you are right. There is no need to add that line. I deleted it. But the
measured heap peak is still the same.

2017-03-01 0:00 GMT+01:00 Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com>:

> For one thing, `C = np.empty(shape=(n,n), dtype='float64')` allocates 10^4
> extra elements that are immediately discarded.
>
> -Joe
>
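A minimal sketch of the point about the extra allocation, assuming n = 100 as in the measurements. Rebinding C to the result of np.dot leaves the np.empty buffer unused. If a preallocated buffer is wanted, np.dot accepts an `out` argument. The input values and the names unused_bytes and C_pre are illustrative only.

```python
import numpy as np

n = 100
A = np.full((n, n), 0.5)
B = np.full((n, n), 2.0)

# Pattern from the thread: a buffer is allocated, then the name C is rebound
# to the new array returned by np.dot, so the first buffer is never used.
C = np.empty(shape=(n, n), dtype='float64')
unused_bytes = C.nbytes  # n * n * 8 bytes for float64
C = np.dot(A, B)

# Alternative: let np.dot write into the preallocated buffer.
C_pre = np.empty(shape=(n, n), dtype='float64')
np.dot(A, B, out=C_pre)

print(unused_bytes)  # 80000
print(np.allclose(C, C_pre))
```

The discarded buffer is 80,000 bytes, which is small next to the reported heap peaks, so removing the line may not change the measured peak noticeably.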
> On Tue, Feb 28, 2017 at 5:57 PM, Sebastian K <sebastiankaster@googlemail.com> wrote:
>
>> Yes, it is true that the NumPy function executes much faster.
>>
>> The code for the NumPy version:
>>
>> import numpy as np
>>
>> def createMatrix(n):
>>     # Build an n x n float64 matrix with entries in [0.1, 1.1).
>>     Matrix = np.empty(shape=(n,n), dtype='float64')
>>     for x in range(n):
>>         for y in range(n):
>>             Matrix[x, y] = 0.1 + ((x*y)%1000)/1000.0
>>     return Matrix
>>
>>
>> if __name__ == '__main__':
>>     n = getDimension()  # helper not shown in this message
>>     if n > 0:
>>         A = createMatrix(n)
>>         B = createMatrix(n)
>>         C = np.empty(shape=(n,n), dtype='float64')
>>         C = np.dot(A,B)
>>
>>         #print(C)
>>
>> In the pure python version I am just implementing the multiplication with
>> three for-loops.
>>
>> Measured data with libmemusage:
>> dimension of matrix: 100x100
>> heap peak, pure Python 3: 1060565
>> heap peak, NumPy function: 4917180
>>
>>
>> 2017-02-28 23:17 GMT+01:00 Matthew Brett <matthew.br...@gmail.com>:
>>
>>> Hi,
>>>
>>> On Tue, Feb 28, 2017 at 2:12 PM, Sebastian K
>>> <sebastiankas...@googlemail.com> wrote:
>>> > Thank you for your answer.
>>> > For example, a very simple algorithm is matrix multiplication. I can see
>>> > that the heap peak is much higher for the NumPy version than for a
>>> > pure Python 3 implementation.
>>> > The heap is measured with libmemusage from libc:
>>> >
>>> >   heap peak
>>> >   Maximum of all size arguments of malloc(3), all products
>>> >   of nmemb*size of calloc(3), all size arguments of
>>> >   realloc(3), length arguments of mmap(2), and new_size
>>> >   arguments of mremap(2).
>>>
>>> Could you post the exact code you're comparing?
>>>
>>> I think you'll find that a naive Python 3 matrix multiplication method
>>> is much, much slower than the same thing with Numpy, with arrays of
>>> any reasonable size.
>>>
>>> Cheers,
>>>
>>> Matthew


Re: [Numpy-discussion] Numpy Overhead

2017-02-28 Thread Sebastian K
Yes, it is true that the NumPy function executes much faster.

The code for the NumPy version:

import numpy as np

def createMatrix(n):
    # Build an n x n float64 matrix with entries in [0.1, 1.1).
    Matrix = np.empty(shape=(n,n), dtype='float64')
    for x in range(n):
        for y in range(n):
            Matrix[x, y] = 0.1 + ((x*y)%1000)/1000.0
    return Matrix


if __name__ == '__main__':
    n = getDimension()  # helper not shown in this message
    if n > 0:
        A = createMatrix(n)
        B = createMatrix(n)
        C = np.empty(shape=(n,n), dtype='float64')
        C = np.dot(A,B)

        #print(C)

In the pure python version I am just implementing the multiplication with
three for-loops.
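The pure Python code was not posted in the thread, so the following is only a hypothetical stand-in for a three-loop version on lists of lists, compared against np.dot. The names create_rows and matmul_pure are assumptions. The timings depend on the machine, so no expected values are shown.

```python
import time
import numpy as np

def create_rows(n):
    """Same fill rule as createMatrix, but as a list of Python lists."""
    return [[0.1 + ((x * y) % 1000) / 1000.0 for y in range(n)]
            for x in range(n)]

def matmul_pure(A, B):
    """Hypothetical three-loop multiplication on lists of lists."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += A[i][k] * B[k][j]
            C[i][j] = total
    return C

n = 100
A = create_rows(n)
B = create_rows(n)

t0 = time.perf_counter()
C_pure = matmul_pure(A, B)
t1 = time.perf_counter()
# The NumPy timing below includes converting the lists to arrays.
C_numpy = np.dot(np.array(A), np.array(B))
t2 = time.perf_counter()

print("pure Python: %.4f s, np.dot: %.4f s" % (t1 - t0, t2 - t1))
print(np.allclose(C_pure, C_numpy))
```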

Measured data with libmemusage:
dimension of matrix: 100x100
heap peak, pure Python 3: 1060565
heap peak, NumPy function: 4917180
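As a rough check on these figures, the sketch below compares the raw array data with the reported peaks. It assumes the heap peak values are in bytes and that only the three 100x100 float64 arrays A, B and C hold matrix data. Both are assumptions, not statements from the measurements.

```python
import numpy as np

n = 100
bytes_per_array = n * n * np.dtype('float64').itemsize  # 80,000 bytes
array_data = 3 * bytes_per_array                         # A, B and C

# Figures reported above (libmemusage "heap peak").
peak_pure_python = 1060565
peak_numpy = 4917180

print(array_data)                         # 240000
print(peak_numpy - peak_pure_python)      # 3856615
print(round(array_data / peak_numpy, 3))  # 0.049
```

Under these assumptions the matrix data accounts for only about 5% of the NumPy peak, so most of the difference would have to come from something other than the arrays themselves. What that is cannot be determined from these numbers alone.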


2017-02-28 23:17 GMT+01:00 Matthew Brett <matthew.br...@gmail.com>:

> Hi,
>
> On Tue, Feb 28, 2017 at 2:12 PM, Sebastian K
> <sebastiankas...@googlemail.com> wrote:
> > Thank you for your answer.
> > For example, a very simple algorithm is matrix multiplication. I can see
> > that the heap peak is much higher for the NumPy version than for a
> > pure Python 3 implementation.
> > The heap is measured with libmemusage from libc:
> >
> >   heap peak
> >   Maximum of all size arguments of malloc(3), all products
> >   of nmemb*size of calloc(3), all size arguments of
> >   realloc(3), length arguments of mmap(2), and new_size
> >   arguments of mremap(2).
>
> Could you post the exact code you're comparing?
>
> I think you'll find that a naive Python 3 matrix multiplication method
> is much, much slower than the same thing with Numpy, with arrays of
> any reasonable size.
>
> Cheers,
>
> Matthew


Re: [Numpy-discussion] Numpy Overhead

2017-02-28 Thread Sebastian K
Thank you for your answer.
For example, a very simple algorithm is matrix multiplication. I can see
that the heap peak is much higher for the NumPy version than for a
pure Python 3 implementation.
The heap is measured with libmemusage from libc:


  heap peak
      Maximum of all size arguments of malloc(3)
      <http://man7.org/linux/man-pages/man3/malloc.3.html>, all products
      of nmemb*size of calloc(3)
      <http://man7.org/linux/man-pages/man3/calloc.3.html>, all size
      arguments of realloc(3)
      <http://man7.org/linux/man-pages/man3/realloc.3.html>, length
      arguments of mmap(2)
      <http://man7.org/linux/man-pages/man2/mmap.2.html>, and new_size
      arguments of mremap(2)
      <http://man7.org/linux/man-pages/man2/mremap.2.html>.
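For a view from inside Python, the standard-library tracemalloc module can report a peak as well. This is only a sketch of a different kind of measurement, not a replacement for libmemusage: tracemalloc sees allocations made through Python's allocators and those that NumPy reports to it, and it may miss buffers that a compiled library allocates on its own. The input arrays are illustrative.

```python
import tracemalloc
import numpy as np

n = 100

tracemalloc.start()
A = np.ones((n, n))
B = np.ones((n, n))
C = np.dot(A, B)
# current: traced bytes still allocated; peak: highest traced value so far.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current:", current, "bytes, peak:", peak, "bytes")
```

Because import numpy happens before tracemalloc.start(), the cost of loading NumPy itself is not part of these figures.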

Regards

Sebastian


On 28 Feb 2017 11:03 p.m., "Benjamin Root" <ben.v.r...@gmail.com> wrote:

> You are going to need to provide much more context than that. Overhead
> compared to what? And where (I/O, CPU, etc.)? What are the sizes of your
> arrays, and what sort of operations are you doing? Finally, how much
> overhead are you seeing?
>
> There can be all sorts of reasons for overhead, and some can easily be
> mitigated, and others not so much.
>
> Cheers!
> Ben Root
>
>
> On Tue, Feb 28, 2017 at 4:47 PM, Sebastian K <sebastiankas...@googlemail.com> wrote:
>
>> Hello everyone,
>>
>> I'm interested in the NumPy project and have experimented a lot with NumPy
>> arrays. I'm wondering what actually happens that causes so much overhead
>> when I call a function in NumPy. What is the reason?
>> Thanks in advance.
>>
>> Regards
>>
>> Sebastian Kaster
>>


[Numpy-discussion] Numpy Overhead

2017-02-28 Thread Sebastian K
Hello everyone,

I'm interested in the NumPy project and have experimented a lot with NumPy
arrays. I'm wondering what actually happens that causes so much overhead
when I call a function in NumPy. What is the reason?
Thanks in advance.

Regards

Sebastian Kaster