Re: [Numpy-discussion] Fortran order in recarray.

2017-02-22 Thread Robert McLeod
Just as a note, Appveyor supports uploading modules to "public websites":

https://packaging.python.org/appveyor/

The main issue I see with this is that PyPI requires my password to be stored
on my machine in a plain-text file.   I'm not sure whether there's a way to
provide Appveyor with an SSH key instead.

On Wed, Feb 22, 2017 at 4:23 PM, Alex Rogozhnikov <
alex.rogozhni...@yandex.ru> wrote:

> Hi Francesc,
> thanks a lot for your reply and for your impressive work on bcolz!
>
> Bcolz seems to put its stress on compression, which is not of much
> interest to me, but the *ctable* and chunked operations look very
> appropriate for me now. (Of course, I'll need to test it more before I can
> say this for sure; that's my current impression.)
>
> The strongest concern with bcolz so far is that it seems to be completely
> non-trivial to install on Windows systems, while pip provides numpy
> binaries for most (or all?) OSes.
> I haven't built pip binary wheels myself, but is it hard / impossible to
> cook up pip-installable binaries?
>
> You can change shapes of numpy arrays, but that usually involves copies
> of the whole container.
>
> sure, but this is OK for me, as I plan to organize column editing in
> 'batches', so copying should seldom be required.
> It would be nice to see an example to understand how deep I need to go
> inside numpy.
>
> Cheers,
> Alex.
>
>
>
>
> On 22 Feb 2017, at 17:03, Francesc Alted <fal...@gmail.com> wrote:
>
> Hi Alex,
>
> 2017-02-22 12:45 GMT+01:00 Alex Rogozhnikov <alex.rogozhni...@yandex.ru>:
>
>> Hi Nathaniel,
>>
>>
>> pandas
>>
>>
>> yup, the idea was to have a minimal pandas.DataFrame-like storage (which I
>> had been using for a long time),
>> but without the irritating problems of its row indexing and some other
>> problems like interaction with matplotlib.
>>
>> A dict of arrays?
>>
>>
>> that's what I started from and implemented, but at some point I decided
>> that I was reinventing the wheel and that numpy already had something. In
>> principle, I can ignore this 'column-oriented' storage requirement, but
>> potentially it may turn out to be quite slow if the dtype's size is large.
>>
>> Suggestions are welcome.
>>
>
> You may want to try bcolz:
>
> https://github.com/Blosc/bcolz
>
> bcolz is a columnar storage, basically as you require, but data is
> compressed by default even when stored in-memory (although you can disable
> compression if you want to).
>
>
>
>>
>> Another strange question:
>> in general, it is assumed that once a numpy array is created, its shape
>> does not change.
>> But if I want to keep the same recarray and change its dtype and/or
>> shape, is there a way to do this?
>>
>
> You can change shapes of numpy arrays, but that usually involves copies
> of the whole container.  With bcolz you can change the length and add/del
> columns without copies.  If your containers are large, it is better to
> inform bcolz of its final estimated size.  See:
>
> http://bcolz.blosc.org/en/latest/opt-tips.html
>
> Francesc
>
>
>>
>> Thanks,
>> Alex.
>>
>>
>>
>> On 22 Feb 2017, at 03:53, Nathaniel Smith <n...@pobox.com> wrote:
>>
>> On Feb 21, 2017 3:24 PM, "Alex Rogozhnikov" <alex.rogozhni...@yandex.ru>
>> wrote:
>>
>> Ah, got it. Thanks, Chris!
>> I thought a recarray could only be one-dimensional (like a table with
>> named columns).
>>
>> Maybe it's better to ask directly what I was looking for:
>> something that works like a table with named columns (but no labelling
>> for rows), and keeps data (of different dtypes) in a column-by-column way
>> (and this is numpy, not pandas).
>>
>> Is there such a magic thing?
>>
>>
>> Well, that's what pandas is for...
>>
>> A dict of arrays?
>>
>> -n
>>
>
>
> --
> Francesc Alted
>


-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ANN: NumExpr3 Alpha

2017-02-19 Thread Robert McLeod
Hi Juan,

A guy on reddit suggested looking at SymPy for just such a thing. I know
that Dask also represents its computations as a graph.

https://www.reddit.com/r/Python/comments/5um04m/numexpr3/

I'll think about it some more but it seems a little abstract still. To a
certain extent the NE3 compiler already works this way.  The compiler has a
dictionary in which keys are `ast.Node` types, and each value is a function
pointer, which knows how to handle that particular node. Providing an
external interface to this would be the most natural extension.
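
As a minimal sketch of that dispatch pattern (illustrative only; the handler
names here are hypothetical, not NE3's actual table):

import ast

def handle_binop(node):
    # a real handler would emit program words for the operation
    return 'binop:' + type(node.op).__name__

def handle_name(node):
    # a real handler would register the name as an input parameter
    return 'name:' + node.id

HANDLERS = {ast.BinOp: handle_binop, ast.Name: handle_name}

tree = ast.parse('a + b', mode='eval')
node = tree.body                      # the ast.BinOp node
print(HANDLERS[type(node)](node))     # -> binop:Add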

There are quite a few things to do before I would think about a functional
interface, namely the things I mentioned in my original mail: pickling of
the C-object so that it can be used within modules like `multiprocessing`;
having a pre-allocated shared-memory region shared among threads for
temporaries and parameters; and so on.  If someone else wants to dabble in
it they are welcome to.

Robert

On Sun, Feb 19, 2017 at 4:19 AM, Juan Nunez-Iglesias <jni.s...@gmail.com>
wrote:

> Hi everyone,
>
> Thanks for this. It looks absolutely fantastic. I've been putting off
> using numexpr but it looks like I don't have a choice anymore. ;)
>
> Regarding feature requests, I've always found it off putting that I have
> to wrap my expressions in a string to speed them up. Has anyone explored
> the possibility of using Python 3.6's frame evaluation API to do this? I
> remember a vague discussion on this list a while back but I don't know
> whether anything came of it.
>
> Thanks!
>
> Juan.
>
> On 18 Feb 2017, 3:42 AM +1100, Robert McLeod <robbmcl...@gmail.com>,
> wrote:
>
> Hi David,
>
> Thanks for your comments, reply below the fold.
>
> On Fri, Feb 17, 2017 at 4:34 PM, Daπid <davidmen...@gmail.com> wrote:
>
>> This is very nice indeed!
>>
>> On 17 February 2017 at 12:15, Robert McLeod <robbmcl...@gmail.com> wrote:
>> > * bytes and unicode support
>> > * reductions (mean, sum, prod, std)
>>
>> I use both a lot, maybe I can help you get them working.
>>
>> Also, regarding "Vectorization hasn't been done yet with cmath
>> functions for real numbers (such as sqrt(), exp(), etc.), only for
>> complex functions". What is the bottleneck? Is it in GCC or just
>> someone has to sit down and adapt it?
>
>
> I just haven't done it yet.  Basically I'm moving from Switzerland to
> Canada in a week so this was the gap to push something out that's usable if
> not perfect. Rather I just import cmath functions, which are inlined but I
> suspect what's needed is to break them down into their components. For
> example, the complex arccos function looks like this:
>
> static void
> nc_acos( npy_intp n, npy_complex64 *x, npy_complex64 *r)
> {
>     npy_complex64 a;
>     for( npy_intp I = 0; I < n; I++ ) {
>         a = x[I];
>         _inline_mul( x[I], x[I], r[I] );  /* r = x*x                    */
>         _inline_sub( Z_1, r[I], r[I] );   /* r = 1 - x*x                */
>         _inline_sqrt( r[I], r[I] );       /* r = sqrt(1 - x*x)          */
>         _inline_muli( r[I], r[I] );       /* r = i*sqrt(1 - x*x)        */
>         _inline_add( a, r[I], r[I] );     /* r = x + i*sqrt(1 - x*x)    */
>         _inline_log( r[I], r[I] );        /* r = log(x + i*sqrt(1-x*x)) */
>         _inline_muli( r[I], r[I] );       /* r = i*log(...)             */
>         _inline_neg( r[I], r[I] );        /* r = -i*log(...) = acos(x)  */
>     }
> }
>
> I haven't sat down and inspected whether the cmath versions get
> vectorized, but there's not a huge speed difference between NE2 and 3 for
> such a function on floats (but there is for complex), so my suspicion is
> they aren't.  Another option would be to add a library such as Yeppp! as
> LIB_YEPPP, or some other library that's faster than glibc.  For example the
> glibc function "fma(a,b,c)" is slower than doing "a*b+c" in NE3, and that's
> not how it should be.  Yeppp! is also built with Python generating C code,
> so it could either be very easy or very hard.
>
> On bytes and unicode, I haven't seen examples of how people use them, so
> I'm not sure where to start. Since there's practically no limitation on
> the number of operations now (the library is 1.3 MB now, compared to 1.2 MB
> for NE2 with gcc 5.4) the string functions could grow significantly from
> what we have in NE2.
>
> With regards to reductions, NumExpr never multi-threaded them, and could
> only do outer reductions, so in the end there was no speed advantage to be
> had compared to having NumPy do them on the result.  I suspect the primary
> value there was in PyTables and Pandas where the expression had to do
> everything.  One of the things I've moved away from in NE3 is doing output
> buffering (rather it pre-allocates the output array), so for reductions the
> understanding NumExpr has of broadcasting would have to be deeper.
>
> In any event contributions would certainly be welcome.
>
> Robert
>
> --
> Robert McLeod, Ph.D.
>

Re: [Numpy-discussion] ANN: NumExpr3 Alpha

2017-02-17 Thread Robert McLeod
Hi David,

Thanks for your comments, reply below the fold.

On Fri, Feb 17, 2017 at 4:34 PM, Daπid <davidmen...@gmail.com> wrote:

> This is very nice indeed!
>
> On 17 February 2017 at 12:15, Robert McLeod <robbmcl...@gmail.com> wrote:
> > * bytes and unicode support
> > * reductions (mean, sum, prod, std)
>
> I use both a lot, maybe I can help you get them working.
>
> Also, regarding "Vectorization hasn't been done yet with cmath
> functions for real numbers (such as sqrt(), exp(), etc.), only for
> complex functions". What is the bottleneck? Is it in GCC or just
> someone has to sit down and adapt it?


I just haven't done it yet.  Basically I'm moving from Switzerland to
Canada in a week so this was the gap to push something out that's usable if
not perfect. Rather I just import cmath functions, which are inlined but I
suspect what's needed is to break them down into their components. For
example, the complex arccos function looks like this:

static void
nc_acos( npy_intp n, npy_complex64 *x, npy_complex64 *r)
{
    npy_complex64 a;
    for( npy_intp I = 0; I < n; I++ ) {
        a = x[I];
        _inline_mul( x[I], x[I], r[I] );  /* r = x*x                    */
        _inline_sub( Z_1, r[I], r[I] );   /* r = 1 - x*x                */
        _inline_sqrt( r[I], r[I] );       /* r = sqrt(1 - x*x)          */
        _inline_muli( r[I], r[I] );       /* r = i*sqrt(1 - x*x)        */
        _inline_add( a, r[I], r[I] );     /* r = x + i*sqrt(1 - x*x)    */
        _inline_log( r[I], r[I] );        /* r = log(x + i*sqrt(1-x*x)) */
        _inline_muli( r[I], r[I] );       /* r = i*log(...)             */
        _inline_neg( r[I], r[I] );        /* r = -i*log(...) = acos(x)  */
    }
}

I haven't sat down and inspected whether the cmath versions get vectorized,
but there's not a huge speed difference between NE2 and 3 for such a
function on floats (but there is for complex), so my suspicion is they
aren't.  Another option would be to add a library such as Yeppp! as
LIB_YEPPP, or some other library that's faster than glibc.  For example the
glibc function "fma(a,b,c)" is slower than doing "a*b+c" in NE3, and that's
not how it should be.  Yeppp! is also built with Python generating C code,
so it could either be very easy or very hard.

On bytes and unicode, I haven't seen examples of how people use them, so I'm
not sure where to start. Since there's practically no limitation on the
number of operations now (the library is 1.3 MB now, compared to 1.2 MB for
NE2 with gcc 5.4) the string functions could grow significantly from what
we have in NE2.

With regards to reductions, NumExpr never multi-threaded them, and could
only do outer reductions, so in the end there was no speed advantage to be
had compared to having NumPy do them on the result.  I suspect the primary
value there was in PyTables and Pandas where the expression had to do
everything.  One of the things I've moved away from in NE3 is doing output
buffering (rather it pre-allocates the output array), so for reductions the
understanding NumExpr has of broadcasting would have to be deeper.

In any event contributions would certainly be welcome.

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ANN: NumExpr3 Alpha

2017-02-17 Thread Robert McLeod

* vectorize real functions (such as exp, sqrt, log) similar to the
complex_functions.hpp vectorization.
* Add a keyword (likely 'yield') to indicate that a token is intended to be
changed by a generator inside a loop with each call to NumExpr.run()

If you have any thoughts or find any issues, please don't hesitate to open
an issue at the GitHub repo. Although unit tests have been run over the
operation space, there are undoubtedly a number of bugs to squash.

Sincerely,

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] composing Euler rotation matrices

2017-02-01 Thread Robert McLeod
Instead of trying to decipher what someone wrote on a Wikipedia page, why
don't you look at a working piece of source code?

e.g.

https://github.com/3dem/relion/blob/master/src/euler.cpp

Robert
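
For what it's worth, a minimal sketch (assuming standard active rotation
matrices; an illustration, not a definitive answer) of a composition that
reproduces the Z1Y2X3 matrix defined in the script quoted below. Note that
the quoted script also applies np.radians twice, once to `angles` and once
inside `rotation_matrix`, which by itself will make the two matrices
disagree.

import numpy as np

def Rx(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

alpha, beta, gamma = np.radians([30.0, 20.0, 10.0])
# Z1Y2X3: the z rotation composes on the left, the x rotation on the right
composed = Rz(alpha) @ Ry(beta) @ Rx(gamma)
# composed matches the explicit matrix from the quoted script, i.e.
# np.allclose(composed, z1y2x3(alpha, beta, gamma)) is True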


On Wed, Feb 1, 2017 at 4:27 AM, Seb <splu...@gmail.com> wrote:

> On Tue, 31 Jan 2017 21:23:55 -0500,
> Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com> wrote:
>
> > Could you show what you are doing to get the statement "However, I
> > cannot reproduce this matrix via composition; i.e. by multiplying the
> > underlying rotation matrices.". I would guess something involving the
> > `*` operator instead of `@`, but guessing probably won't help you
> > solve your issue.
>
> Sure, although composition is not something I can take credit for, as
> it's a well-described operation for generating linear transformations.
> It is the matrix multiplication of two or more transformation matrices.
> In the case of Euler transformations, it's matrices specifying rotations
> around 3 orthogonal axes by 3 given angles.  I'm using `numpy.dot' to
> perform matrix multiplication on 2D arrays representing matrices.
>
> However, it's not obvious from the link I provided which particular
> rotation matrices are multiplied, and in what order (i.e. what
> composition), to arrive at the Z1Y2X3 rotation matrix shown.
> Perhaps I'm not understanding the conventions used therein.  This is one
> of my attempts at reproducing that rotation matrix via composition:
>
> ---<cut here---start--->---
> import numpy as np
>
> angles = np.radians(np.array([30, 20, 10]))
>
> def z1y2x3(alpha, beta, gamma):
>     """Z1Y2X3 rotation matrix given Euler angles"""
>     return np.array([[np.cos(alpha) * np.cos(beta),
>                       np.cos(alpha) * np.sin(beta) * np.sin(gamma) -
>                       np.cos(gamma) * np.sin(alpha),
>                       np.sin(alpha) * np.sin(gamma) +
>                       np.cos(alpha) * np.cos(gamma) * np.sin(beta)],
>                      [np.cos(beta) * np.sin(alpha),
>                       np.cos(alpha) * np.cos(gamma) +
>                       np.sin(alpha) * np.sin(beta) * np.sin(gamma),
>                       np.cos(gamma) * np.sin(alpha) * np.sin(beta) -
>                       np.cos(alpha) * np.sin(gamma)],
>                      [-np.sin(beta), np.cos(beta) * np.sin(gamma),
>                       np.cos(beta) * np.cos(gamma)]])
>
> euler_mat = z1y2x3(angles[0], angles[1], angles[2])
>
> ## Now via composition
>
> def rotation_matrix(theta, axis, active=False):
>     """Generate rotation matrix for a given axis
>
>     Parameters
>     ----------
>     theta: numeric, optional
>         The angle (degrees) by which to perform the rotation.  Default is
>         0, which means return the coordinates of the vector in the rotated
>         coordinate system, when rotate_vectors=False.
>     axis: int, optional
>         Axis around which to perform the rotation (x=0; y=1; z=2)
>     active: bool, optional
>         Whether to return active transformation matrix.
>
>     Returns
>     -------
>     numpy.ndarray
>         3x3 rotation matrix
>     """
>     theta = np.radians(theta)
>     if axis == 0:
>         R_theta = np.array([[1, 0, 0],
>                             [0, np.cos(theta), -np.sin(theta)],
>                             [0, np.sin(theta), np.cos(theta)]])
>     elif axis == 1:
>         R_theta = np.array([[np.cos(theta), 0, np.sin(theta)],
>                             [0, 1, 0],
>                             [-np.sin(theta), 0, np.cos(theta)]])
>     else:
>         R_theta = np.array([[np.cos(theta), -np.sin(theta), 0],
>                             [np.sin(theta), np.cos(theta), 0],
>                             [0, 0, 1]])
>     if active:
>         R_theta = np.transpose(R_theta)
>     return R_theta
>
> ## The rotations are given as active
> xmat = rotation_matrix(angles[2], 0, active=True)
> ymat = rotation_matrix(angles[1], 1, active=True)
> zmat = rotation_matrix(angles[0], 2, active=True)
> ## The operation seems to imply this composition
> euler_comp_mat = np.dot(xmat, np.dot(ymat, zmat))
> ---<cut here---end--->---
>
> I believe the matrices `euler_mat' and `euler_comp_mat' should be the
> same, but they aren't, so it's unclear to me what particular composition
> is meant to produce the matrix specified by this Z1Y2X3 transformation.
> What am I missing?
>
> --
> Seb
>

Re: [Numpy-discussion] How to use user input as equation directly

2016-10-28 Thread Robert McLeod
On Thu, Oct 27, 2016 at 11:35 PM, Benjamin Root <ben.v.r...@gmail.com>
wrote:

> Perhaps the numexpr package might be safer? Not exactly meant for this
> situation (meant for optimizations), but the evaluator is pretty darn safe.
>
>
It would not be able to evaluate something like 'np.arange(50)', for
example, since it only has a limited subset of numpy functionality. In the
example provided, that or linspace is likely the natural input for the
variable 't'.
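
For instance, a minimal sketch (the limited function set is what makes it
safer for user-supplied strings):

import numpy as np
import numexpr as ne

t = np.linspace(0, 1, 500)  # build the domain with numpy, outside the expression
y = ne.evaluate("sin(2 * pi * t)", local_dict={"t": t, "pi": np.pi})

# By contrast, ne.evaluate("arange(50)") raises an error, since arange is
# not part of numexpr's supported function set.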

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Robert McLeod
rejected on these
>> grounds [1]
>>
>> [1] https://github.com/numpy/numpy/issues/3485
>>
>>
> Yes it would make numpy GPL, but that is not a concern for a lot of users.
> Users for whom it is a problem can still use the non-GPL version.
> A more interesting debate is whether our binary wheels should then be GPL
> wheels by default or not. Probably not, but that is something that should
> be discussed when it's an actual issue.
>
> But to clarify what I said, it would be accepted if the value it provides
> is sufficient compared to the code maintenance it adds. Given that pyfftw
> already exists the value is probably relatively small, but personally I'd
> still be interested in code that allows switching the fft backend as that
> could also allow plugging e.g. gpu based implementations (though again this
> is already covered by other third party modules).
>



-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-05 Thread Robert McLeod
On Wed, Oct 5, 2016 at 1:11 PM, srean <srean.l...@gmail.com> wrote:

> Thanks Francesc, Robert for giving me a broader picture of where this fits
> in. I believe numexpr does not  handle slicing, so that might be another
> thing to look at.
>

Dereferencing would be relatively simple to add into numexpr, as it would
just be some getattr() calls.  Personally I will add that at some point
because it will clean up my code.

Slicing, maybe only for contiguous blocks in memory?

I.e.

imageStack[0,:,:]

would be possible, but

imageStack[:, ::2, ::2]

would not be trivial (I think...).  I seem to remember someone asked David
Cooke about slicing and he said something along the lines of, "that's what
Numba is for."  Perhaps NumPy back-ended by Numba is more what you are
looking for, as it hooks into the bytecode compiler? The main advantage of
numexpr is that a series of numpy operations can be enclosed in a single
ne.evaluate("...") call, which provides a big acceleration for little
programmer effort, but it's not nearly as sophisticated as Numba or
PyPy.
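
In the meantime, a minimal sketch of the usual workaround, i.e. slice with
numpy first and hand the view to numexpr:

import numpy as np
import numexpr as ne

imageStack = np.random.rand(4, 64, 64).astype(np.float32)

frame = imageStack[0, :, :]          # numpy does the slicing
result = ne.evaluate("sqrt(frame)")  # numexpr finds `frame` in local scope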



> On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod <robbmcl...@gmail.com>
> wrote:
>
>>
>> As Francesc said, Numexpr is going to get most of its power through
>> grouping a series of operations so it can send blocks to the CPU cache and
>> run the entire series of operations on the cache before returning the block
>> to system memory.  If it was just used to back-end NumPy, it would only
>> gain from the multi-threading portion inside each function call.
>>
>
> Is that so ?
>
> I thought numexpr also cuts down on the number of temporary buffers that
> get filled (in other words, copy operations) if the same expression were
> written as a series of operations. My understanding may be wrong, and I
> would appreciate correction.
>
> The 'out' parameter in ufuncs can eliminate extra temporaries but it's not
> composable. Right now I have to manually carry along the array where the
> in-place operations take place. I think the goal here is to eliminate that.
>

The numexpr virtual machine does create temporaries where needed when it
parses the abstract syntax tree for all the operations it has to do.  I
believe the main advantage is that the temporaries are created in the CPU
cache, and not in system memory. It's certainly true that numexpr doesn't
create a lot of OP_COPY operations; rather, it's optimized to minimize
them, so it's probably fewer ops than naive successive calls to numpy
within python, but I'm unsure if there's any difference in operation count
between hand-optimized numpy with out= set and numexpr.  Numexpr just does
it for you.
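
A small sketch of the three styles under discussion:

import numpy as np
import numexpr as ne

a = np.random.rand(int(1e6))
b = np.random.rand(int(1e6))

# naive numpy: temporaries are allocated for 2*a, 3*b, and their sum
r1 = 2*a + 3*b

# hand-optimized numpy: buffers are reused via out=, carried along manually
tmp = np.empty_like(a)
out = np.empty_like(a)
np.multiply(a, 2, out=out)
np.multiply(b, 3, out=tmp)
np.add(out, tmp, out=out)

# numexpr: the same expression, processed in cache-sized blocks
r2 = ne.evaluate("2*a + 3*b")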

This blog post from Tim Hochberg is useful for understanding the
performance advantages of blocking versus multithreading:

http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/


Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-05 Thread Robert McLeod
All,

On Wed, Oct 5, 2016 at 11:46 AM, Francesc Alted <fal...@gmail.com> wrote:

> 2016-10-05 8:45 GMT+02:00 srean <srean.l...@gmail.com>:
>
>> Good discussion, but I was surprised by the absence of numexpr in the
>> discussion, given how relevant it (numexpr) is to the topic.
>>
>> Is the goal to fold in the numexpr functionality (and beyond) into Numpy ?
>>
>
> Yes, the question about merging numexpr into numpy has been something that
> periodically shows up on this list.  I think mostly everyone agrees that
> it is a good idea, but things are not so easy, and so far nobody has
> provided a good patch for this.  Also, the fact that numexpr relies on
> grouping an expression by using a string (e.g. y = ne.evaluate("x**3 +
> tanh(x**2) + 4")) does not play well with the way in which numpy evaluates
> expressions, so something should be suggested to cope with this too.
>

As Francesc said, Numexpr is going to get most of its power through
grouping a series of operations so it can send blocks to the CPU cache and
run the entire series of operations on the cache before returning the block
to system memory.  If it was just used to back-end NumPy, it would only
gain from the multi-threading portion inside each function call. I'm not
sure how one would go about grouping successive numpy expressions without
modifying the Python interpreter?

I put a bit of effort into extending numexpr to use 4-byte word opcodes
instead of 1-byte.  Progress has been very slow, however, due to time
constraints, but I have most of the numpy data types (u[1-4], i[1-4],
f[4,8], c[8,16], S[1-4], U[1-4]).  On Tuesday I finished writing a Python
generator script that writes all the C-side opcode macros for opcodes.hpp.
Now I have about 900 opcodes, and this could easily grow into thousands if
more functions are added, so I also built a reverse lookup tree (based on
collections.defaultdict) for the Python-side of numexpr.
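
A toy sketch of such a reverse-lookup tree (the opcode values here are
invented for illustration; the real table is generated):

from collections import defaultdict

def _node():
    return defaultdict(_node)

# function name -> argument dtype signature -> opcode enum
OPTABLE = _node()
OPTABLE['add'][('f8', 'f8')] = 0x0101
OPTABLE['add'][('f4', 'f4')] = 0x0102
OPTABLE['sqrt'][('f8',)] = 0x0210

def lookup(func, dtype_sig):
    return OPTABLE[func][tuple(dtype_sig)]

print(hex(lookup('add', ('f8', 'f8'))))   # -> 0x101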

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using library-specific headers

2016-09-29 Thread Robert McLeod
Pavlyk,

NumExpr optionally includes MKL's VML at compile-time.  You may want to
look at its implementation.  From what I recall it relies on a function in
a bootstrapped __config__.py to determine if MKL is present.
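
For instance, the flag NumExpr ends up exposing can be checked at runtime:

import numexpr

# use_vml is fixed at build time, depending on whether numexpr was
# compiled against MKL/VML
print(numexpr.use_vml)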

Robert

On Thu, Sep 29, 2016 at 7:27 PM, Pavlyk, Oleksandr <
oleksandr.pav...@intel.com> wrote:

> Hi Julian,
>
> Thank you very much for the response. It appears to work.
>
> I work on "Intel Distribution for Python" at Intel Corp. This question was
> motivated by work needed to
> prepare pull requests with our changes/optimizations to numpy source code.
> In particular, the numpy.random_intel package
>
>https://mail.scipy.org/pipermail/numpy-discussion/2016-June/075693.html
>
> relies on MKL, but its potential inclusion in numpy should not break the
> build if MKL is unavailable.
>
> Also our benchmarking was pointing at Numpy's sequential memory copying as
> a bottleneck.
> I am working to open a pull request into the main trunk of numpy to take
> advantage of multithreaded
> MKL's BLAS dcopy function to do memory copying in parallel for
> sufficiently large sizes.
>
> Related to numpy.random_intel, I noticed that the randomstate package,
> which extends numpy.random, was not being made a part of numpy, but rather
> published on PyPI as a stand-alone module. Does that mean that the
> community decided against including it in numpy's codebase? If so, I
> would appreciate it if someone could elaborate on, or point me to, the
> reasoning behind that decision.
>
> Thank you,
> Oleksandr
>
>
>
> -----Original Message-----
> From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On
> Behalf Of Julian Taylor
> Sent: Thursday, September 29, 2016 8:10 AM
> To: numpy-discussion@scipy.org
> Subject: Re: [Numpy-discussion] Using library-specific headers
>
> On 09/27/2016 11:09 PM, Pavlyk, Oleksandr wrote:
> > Suppose I would like to take advantage of some functions from MKL in
> > numpy C source code, which would require to use
> >
> >
> >
> > #include "mkl.h"
> >
> >
> >
> > Ideally this include line must not break the build of numpy when MKL
> > is not present, so my initial approach was to use
> >
> >
> >
> > #if defined(SCIPY_MKL_H)
> >
> > #include "mkl.h"
> >
> > #endif
> >
> >
> >
> > Unfortunately, this did not work when building with gcc on a machine
> > where MKL is present on default LD_LIBRARY_PATH, because then the
> > distutils code was setting SCIPY_MKL_H preprocessor variable, even
> > though mkl headers are not on the C_INCLUDE_PATH.
> >
> >
> >
> > What is the preferred solution to include an external library header
> > to ensure that code-base continues to build in most common cases?
> >
> >
> >
> > One approach I can think of is to set a preprocessor variable, say
> > HAVE_MKL_HEADERS in numpy/core/includes/numpy/config.h, depending on the
> > outcome of building a simple _configtest.c using
> > config.try_compile(), like it is done in numpy/core/setup.py.
> >
> > Is there a simpler, or a better way?
> >
>
> hi,
> you could put the header into OPTIONAL_HEADERS in
> numpy/core/setup_common.py. This will define HAVE_HEADERFILENAME_H for
> you, but this will not check that the corresponding library actually
> exists and can be linked.
> For that SCIPY_MKL_H is probably the right macro, though its name is
> confusing as it does not check for the header's presence ...
>
> Can you tell us more about what from MKL you are attempting to add and for
> what purpose, e.g. is it something that should go into numpy proper or just
> for personal/internal use?
>
> cheers,
> Julian
>
>
>
>
>



-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: ifft padding

2016-05-26 Thread Robert McLeod
Allen,

Probably it needs to work in n-dimensions, like the existing
np.fft.fftshift function does, with an optional axis=tuple parameter. I
recall that fftshift is just an array indexing trick?  It would be helpful
to see which is faster: two fftshifts plus edge padding, or your
inter-padding.  Probably it's fastest to make a new zeros array of the
appropriate padded size and do 2*ndim copies?
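
On the indexing-trick point, a quick check (assuming an even-length axis):

import numpy as np

a = np.arange(8)
# on an even-length axis, fftshift is exactly a half-length roll
assert np.array_equal(np.fft.fftshift(a), np.roll(a, len(a) // 2))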

Robert

On Wed, May 25, 2016 at 9:35 PM, Allen Welkie <allen.wel...@gmail.com>
wrote:

> I'd like to get some feedback on my [pull request](
> https://github.com/numpy/numpy/pull/7593).
>
> This pull request adds a function `ifftpad` which pads a spectrum by
> inserting zeros where the highest frequencies would be. This is necessary
> because the padding that `ifft` does simply inserts zeros at the end of the
> array. But because of the way the spectrum is laid out, this changes which
> bins represent which frequencies and in general messes up the result of
> `ifft`. If you pad with the proposed `ifftpad` function, the zeros will be
> inserted in the middle of the spectrum and the time signal that results
> from `ifft` will be an interpolated version of the unpadded time signal.
> See the discussion in [issue #1346](
> https://github.com/numpy/numpy/issues/1346).
>
> The following is a script to demonstrate what I mean:
>
> ```
> import numpy
> from numpy import concatenate, zeros
> from matplotlib import pyplot
>
> def correct_padding(a, n, scale=True):
>     """ A copy of the proposed `ifftpad` function. """
>     spectrum = concatenate((a[:len(a) // 2],
>                             zeros(n - len(a)),
>                             a[len(a) // 2:]))
>     if scale:
>         spectrum *= (n / len(a))
>     return spectrum
>
> def plot_real(signal, label):
>     time = numpy.linspace(0, 1, len(signal) + 1)[:-1]
>     pyplot.plot(time, signal.real, label=label)
>
> def main():
>     spectrum = numpy.zeros(10, dtype=complex)
>     spectrum[-1] = 1 + 1j
>
>     signal = numpy.fft.ifft(spectrum)
>     signal_bad_padding = numpy.fft.ifft(10 * spectrum, 100)
>     signal_good_padding = numpy.fft.ifft(correct_padding(spectrum, 100))
>
>     plot_real(signal, 'No padding')
>     plot_real(signal_bad_padding, 'Bad padding')
>     plot_real(signal_good_padding, 'Good padding')
>
>     pyplot.legend()
>     pyplot.show()
>
>
> if __name__ == '__main__':
>     main()
> ```
>
>
>
>


-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Fwd: Numexpr-3.0 proposal

2016-02-16 Thread Robert McLeod
On Mon, Feb 15, 2016 at 10:43 AM, Gregor Thalhammer <
gregor.thalham...@gmail.com> wrote:

>
> Dear Robert,
>
> thanks for your effort on improving numexpr. Indeed, vectorized math
> libraries (VML) can give a large boost in performance (~5x), except for a
> couple of basic operations (add, mul, div), which current compilers are
> able to vectorize automatically. With recent gcc even more functions are
> vectorized, see https://sourceware.org/glibc/wiki/libmvec But you need
> special flags depending on the platform (SSE, AVX present?), runtime
> detection of processor capabilities would be nice for distributing
> binaries. Some time ago, since I lost access to Intels MKL, I patched
> numexpr to use Accelerate/Veclib on os x, which is preinstalled on each
> mac, see https://github.com/geggo/numexpr.git veclib_support branch.
>
> As you increased the opcode size, I could imagine providing a bit to
> switch (during runtime) between internal functions and vectorized ones,
> that would be handy for tests and benchmarks.
>

Dear Gregor,

Your suggestion to separate the opcode signature from the library used to
execute it is very clever. Based on your suggestion, I think that the
natural evolution of the opcodes is to specify them by function signature
and library, using a two-level dict, i.e.

numexpr.interpreter.opcodes['exp_f8f8f8'][gnu] = some_enum
numexpr.interpreter.opcodes['exp_f8f8f8'][msvc] = some_enum +1
numexpr.interpreter.opcodes['exp_f8f8f8'][vml] = some_enum + 2
numexpr.interpreter.opcodes['exp_f8f8f8'][yeppp] = some_enum +3

I want to procedurally generate opcodes.cpp and interpreter_body.cpp.  If I
do it the way you suggested, funccodes.hpp and all the many #define's
regarding function codes in the interpreter can hopefully be removed, which
would simplify the overall codebase. One could potentially take it a step
further and plan (optimize) each expression, similar to what FFTW does with
regard to matrix shape. That is, the basic way to control the library
would be with a singleton library argument, i.e.:

result = ne.evaluate( "A*log(foo**2 / bar**2)", lib=vml )

However, we could also permit a tuple to be passed in, where each element
of the tuple reflects the library to use for each operation in the AST tree:

result = ne.evaluate( "A*log(foo**2 / bar**2)", lib=(gnu,gnu,gnu,yeppp,gnu) )

In this case the ops are (mul,mul,div,log,mul).  The op-code picking is
done by the Python side, and this tuple could be potentially optimized by
numexpr rather than hand-optimized, by trying various permutations of the
linked C math libraries. The wisdom from the planning could be pickled and
saved in a wisdom file.  Currently Numexpr has cacheDict in util.py but
there's no reason this can't be pickled and saved to disk. I've done a
similar thing by creating wrappers for PyFFTW already.
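
A minimal sketch of what persisting that wisdom might look like (the file
name and helpers here are hypothetical, not part of numexpr):

import os
import pickle

WISDOM_FILE = os.path.expanduser('~/.numexpr3_wisdom.pkl')  # hypothetical path

def load_wisdom():
    # return the cached expression plans, or an empty dict on first run
    try:
        with open(WISDOM_FILE, 'rb') as fh:
            return pickle.load(fh)
    except (IOError, OSError):
        return {}

def save_wisdom(wisdom):
    with open(WISDOM_FILE, 'wb') as fh:
        pickle.dump(wisdom, fh)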

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numexpr-3.0 proposal

2016-02-16 Thread Robert McLeod
On Mon, Feb 15, 2016 at 7:28 AM, Ralf Gommers <ralf.gomm...@gmail.com>
wrote:

>
>
> On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod <robbmcl...@gmail.com>
> wrote:
>
>>
>> 4.) I took a stab at converting from distutils to setuptools but this
>> seems challenging with numpy as a dependency. I wonder if anyone has
>> tried monkey-patching so that setup.py build_ext uses distutils and then
>> passes the interpreter.pyd/so as a data file, or some other such
>> chicanery?
>>
>
> Not sure what you mean, since numexpr already uses setuptools:
> https://github.com/pydata/numexpr/blob/master/setup.py#L22. What is the
> real goal you're trying to achieve?
>
> This monkeypatching is a bad idea:
> https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both
> setuptools and numpy.distutils already do that, and that's already one too
> many, so you definitely don't want to add a third place. You can use the
> -j (--parallel) flag to numpy.distutils instead; see
> http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds
>
> Ralf
>

Dear Ralf,

Yes, this appears to be a bad idea.  I was just trying to think about
whether I could use the more object-oriented approach that I am familiar
with from setuptools to easily build wheels for PyPI.  Thanks for the
comments and links; I didn't know I could parallelize the numpy build.

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Numexpr-3.0 proposal

2016-02-14 Thread Robert McLeod
Hello everyone,

I've done some work on making a new version of Numexpr that would fix some
of the limitations of the original virtual machine with regards to data
types and operation/function count. Basically I re-wrote the Python and C
sides to use 4-byte words, instead of null-terminated strings, for
operations and passing types.  This means the number of operations and
types isn't significantly limited anymore.

Francesc Alted suggested I should come here and get some advice from the
community. I wrote a short proposal on the Wiki here:

https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview

One can see my branch here:

https://github.com/robbmcleod/numexpr/tree/numexpr-3.0

If anyone has any comments they'd be welcome. Questions from my side for
the group:

1.) Numpy casting: I downloaded the Numpy source and after browsing it
seems the best approach is probably to just use
numpy.core.numerictypes.find_common_type? (See the sketch after this list.)

2.) Can anyone foresee any issues with casting built-in Python types (i.e.
float and int) to their OS-dependent numpy equivalents? Numpy already
seems to do this.

3.) Is anyone enabling the Intel VML library? There are a number of
comments in the code that suggest it's not accelerating the code. It also
seems to cause problems with bundling numexpr with cx_freeze.

4.) I took a stab at converting from distutils to setuptools but this seems
challenging with numpy as a dependency. I wonder if anyone has tried
monkey-patching so that setup.py build_ext uses distutils and then passes
the interpreter.pyd/so as a data file, or some other such chicanery?
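
On point (1), a minimal sketch of the behavior (the same routine is exposed
as np.find_common_type; later numpy versions deprecate it in favor of
np.result_type):

import numpy as np

# mixed array dtypes promote to a common type
print(np.find_common_type([np.int32, np.float32], []))            # float64
# scalars of a lower kind do not upcast the array type
print(np.find_common_type([np.float32], [np.int64, np.float64]))  # float32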

(I was going to ask about attaching a debugger, but I just noticed:
https://wiki.python.org/moin/DebuggingWithGdb   )

Ciao,

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch>
robbmcl...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion