Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Robert Kern
On Mon, Jan 9, 2017 at 7:10 PM, Ilhan Polat wrote:
>
> Yes, that's precisely the case, but when we know the structure we can just
choose the appropriate solver anyhow, with a little bit of overhead. What I
mean is that, to my knowledge, FORTRAN routines for checking for
triangularity etc. are absent.

I'm responding to that. The reason they don't have those FORTRAN routines
for testing for structure inside a generic dense matrix is that in FORTRAN
it's more natural (and efficient) to just use the explicit packed storage
and the associated routines instead. You would only use a generic dense
matrix if you know that there isn't structure in the matrix. So there are
no routines for detecting that structure in generic dense matrices.
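
For instance, a minimal sketch of that workflow through SciPy's wrappers of
those same LAPACK routines (illustrative only: a tridiagonal system kept in
LAPACK band storage and handed straight to the banded driver, with the
structure declared by the caller rather than detected):

import numpy as np
from scipy.linalg import solve_banded

# Tridiagonal system in LAPACK band storage: row 0 holds the
# superdiagonal, row 1 the main diagonal, row 2 the subdiagonal.
ab = np.array([[0., 1., 1., 1.],   # superdiagonal (first entry unused)
               [4., 4., 4., 4.],   # main diagonal
               [1., 1., 1., 0.]])  # subdiagonal (last entry unused)
b = np.ones(4)

# The caller declares the structure up front ((1, 1) = one sub- and one
# superdiagonal); nothing is detected at runtime.
x = solve_banded((1, 1), ab, b)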

--
Robert Kern


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Ilhan Polat
Yes, that's precisely the case, but when we know the structure we can just
choose the appropriate solver anyhow, with a little bit of overhead. What I
mean is that, to my knowledge, FORTRAN routines for checking for
triangularity etc. are absent.

On Tue, Jan 10, 2017 at 2:29 AM, Robert Kern wrote:

> On Mon, Jan 9, 2017 at 5:09 PM, Ilhan Polat wrote:
>
> > So every test in the polyalgorithm is cheaper than the next one. I'm not
> exactly sure what might be the best strategy yet, hence the question. It's
> really interesting that LAPACK doesn't have this type of fast check.
>
> In Fortran LAPACK, if you have a special structured matrix, you usually
> explicitly use packed storage and call the appropriate function type on it.
> It's only when you go to a system that only has a generic, unstructured
> dense matrix data type that it makes sense to do those kinds of checks.
>
> --
> Robert Kern


Re: [Numpy-discussion] From Python to Numpy

2017-01-09 Thread Matthew Harrigan
I also have been stalking this email thread.  First, excellent book!

Regarding the vectorization example mentioned above, one thing to note is
that it increases the order of the algorithm relative to the pure Python
version.  The vectorized approach uses correlate, which requires ~(len(seq)
* len(sub)) FLOPs.  In the case where the first element in sub is not equal
to the vast majority of elements in seq, the basic approach requires
~len(seq) comparisons.  Note that this is the case in the SO answer.  One
fairly common thing I have seen in vectorized approaches is that the memory
or operations required scale worse than strictly necessary.  It may or may
not be an issue; that largely depends on the specifics of how it's used, but
it usually indicates a better approach exists.  That may be worth mentioning
here.
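
To make the complexity difference concrete, here is a minimal sketch of the
two approaches (hypothetical helper names, simplified from the SO answer):

import numpy as np

def find_sub_naive(seq, sub):
    # Short-circuiting scan: ~len(seq) comparisons when sub[0] rarely
    # matches, O(len(seq)*len(sub)) in the worst case.
    n, m = len(seq), len(sub)
    for i in range(n - m + 1):
        if seq[i] == sub[0] and np.array_equal(seq[i:i + m], sub):
            return i
    return -1

def find_sub_correlate(seq, sub):
    # Vectorized search: always ~len(seq)*len(sub) FLOPs via correlate.
    seq, sub = np.asarray(seq, float), np.asarray(sub, float)
    candidates = np.where(np.correlate(seq, sub) == np.dot(sub, sub))[0]
    # Verify candidates to rule out accidental dot-product matches.
    m = len(sub)
    for i in candidates:
        if np.array_equal(seq[i:i + m], sub):
            return i
    return -1

seq = np.zeros(10000)
seq[-3:] = [1., 2., 3.]
print(find_sub_naive(seq, [1., 2., 3.]))      # 9997
print(find_sub_correlate(seq, [1., 2., 3.]))  # 9997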

Given that, I tried to come up with an "ideal" approach.  stride_tricks can
be used to convert seq to a 2D array, and then ideally each row could be
compared to sub.  However I can't think of how to do that with numpy
function calls other than comparing each element in the 2D array, requiring
O(n_sub*n_seq) operations again; array_equal is an example of that.  Python
list equality scales better: for instance, if x = [0]*n and y = [1]*n, then
x == y is very fast and the time is independent of the value of n.
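
A minimal sketch of that windowed comparison (illustrative only; the
O(n_sub*n_seq) work and the boolean temporary are still there):

import numpy as np
from numpy.lib.stride_tricks import as_strided

def find_sub_windows(seq, sub):
    # View every length-m window of seq as a row of a 2D array
    # (no copy), then compare all rows to sub at once.
    seq = np.ascontiguousarray(seq)
    sub = np.asarray(sub)
    n, m = len(seq), len(sub)
    s = seq.strides[0]
    windows = as_strided(seq, shape=(n - m + 1, m), strides=(s, s))
    hits = np.where((windows == sub).all(axis=1))[0]
    return hits[0] if len(hits) else -1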

It seems a generalized ufunc "all_equal" with signature (i),(i)->() and
short-circuit logic once the first non-equal element is encountered would
be an important performance improvement.  In the ideal case it is
dramatically faster, and even if every element must be compared it is
still probably meaningfully faster since the boolean intermediate array
isn't created.  Even better would be to get the axis argument in place for
generalized ufuncs.  Then this problem could be vectorized in one line with
far better performance.  If others think this is a good idea I will post an
issue and attempt a solution.
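
As a sketch of the proposed semantics (using Numba's @guvectorize as a
stand-in for a real C inner loop; assumes Numba is available):

import numpy as np
from numba import guvectorize, boolean, float64

@guvectorize([(float64[:], float64[:], boolean[:])], '(i),(i)->()',
             nopython=True)
def all_equal(a, b, out):
    # Short-circuits at the first mismatch; no boolean intermediate
    # array is ever allocated.
    out[0] = True
    for k in range(a.shape[0]):
        if a[k] != b[k]:
            out[0] = False
            return

x = np.zeros((3, 5))
y = np.zeros((3, 5))
y[1, 4] = 1.0
print(all_equal(x, y))  # [ True False  True]

Combined with the strided window view above, all_equal(windows, sub) would
then be the one-liner.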


On Sat, Dec 31, 2016 at 5:23 AM, Nicolas P. Rougier <
nicolas.roug...@inria.fr> wrote:

>
> > I’ve seen vectorisation taken to the extreme, with negative consequences
> in terms of both speed and readability, in both Python and MATLAB
> codebases, so I would suggest some discussion / wisdom about when not to
> vectorise.
>
>
> I agree and there is actually a warning in the introduction about
> readability vs speed with an example showing a clever optimization (by
> Jaime Fernández del Río) that is hardly readable for the non-experts
> (including myself).
>
>
> Nicolas


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Robert Kern
On Mon, Jan 9, 2017 at 5:09 PM, Ilhan Polat wrote:

> So every test in the polyalgorithm is cheaper than the next one. I'm not
exactly sure what might be the best strategy yet, hence the question. It's
really interesting that LAPACK doesn't have this type of fast check.

In Fortran LAPACK, if you have a special structured matrix, you usually
explicitly use packed storage and call the appropriate function type on it.
It's only when you go to a system that only has a generic, unstructured
dense matrix data type that it makes sense to do those kinds of checks.

-- 
Robert Kern


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Ilhan Polat
Indeed, the generic case is the cheapest to discover, including the worst
case where only the last off-diagonal element is nonzero. As pseudo code:
first remove the diagonal and check the remaining parts for nonzeros, then
check the upper triangle, then the lower, then check for triangularity up
to a permutation from the zero structure if any, then for bandedness, and
so on. If you have access to matlab, you can set the sparse monitor to
verbose mode with " spparms('spumoni', 1) " and perform a backslash
operation on sparse matrices. It will print out what it does during the
checks.

A = sparse([0 2 0 1 0; 4 -1 -1 0 0; 0 0 0 3 -6; -2 0 0 0 2; 0 0 4 2 0]);
B = sparse([8; -1; -18; 8; 20]);
spparms('spumoni',1)
x = A\B

So every test in the polyalgorithm is cheaper than the next one. I'm not
exactly sure what might be the best strategy yet, hence the question. It's
really interesting that LAPACK doesn't have this type of fast check.
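
In NumPy terms, a rough sketch of that cheapest-first ordering could look
like this (illustrative only, not the actual PR code):

import numpy as np

def detect_structure(A):
    # Cheapest-first structure discovery for a square dense array;
    # the real thing would live at the C/FORTRAN level.
    upper = np.triu(A, 1)   # strictly upper part
    lower = np.tril(A, -1)  # strictly lower part
    if not upper.any() and not lower.any():
        return 'diagonal'
    if not lower.any():
        return 'upper triangular'
    if not upper.any():
        return 'lower triangular'
    if np.array_equal(A, A.T):
        return 'symmetric'
    return 'generic'

print(detect_structure(np.triu(np.ones((4, 4)))))  # upper triangular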

On Mon, Jan 9, 2017 at 8:30 PM, josef.pktd wrote:

>
>
> On Mon, Jan 9, 2017 at 6:27 AM, Ilhan Polat wrote:
>
>> > Note that you're proposing a new scipy feature (right?) on the numpy
>> list
>>
>> > This sounds like a good idea to me. As a former heavy Matlab user I
>> remember a lot of things to dislike, but "\" behavior was quite nice.
>>
>> Correct, I am not sure where this might go. It seemed like a NumPy
>> array operation (rapid element-touching checks could also serve
>> functionality other than solve), hence the NumPy list. But of course it
>> can be pushed as an exclusive SciPy feature. I'm not sure what the
>> outlook on np.linalg.solve is.
>>
>>
>> > How much is a noticeable slowdown? Note that we still have the current
>> interfaces available for users that know what they need, so a nice
>> convenience function that is, say, 5-10% slower would not be the end of
>> the world.
>>
>> The fastest case was around 150-400% slower, but of course it might be
>> that I'm not using the fastest methods. It was mostly shuffling things
>> around and using np.any on them in the pure Python 3 case. I will cook
>> up something again for the baseline as soon as I have time.
>>
>>
>>
> All these checks sound a bit expensive if we almost always have
> completely unstructured arrays that don't satisfy any special matrix
> pattern.
>
> In analogy to the type proliferation in Julia to handle those cases: is
> there a way to attach information to numpy arrays that, for example,
> signals that a 2d array is hermitian, banded, diagonal or ...?
>
> (On second thought: maybe completely unstructured is not too expensive to
> detect if the checks are short-circuited: one off-diagonal element
> nonzero - not diagonal, two opposite off-diagonal elements different -
> not symmetric, ...)
>
> Josef


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread josef . pktd
On Mon, Jan 9, 2017 at 6:27 AM, Ilhan Polat wrote:

> > Note that you're proposing a new scipy feature (right?) on the numpy
> list
>
> > This sounds like a good idea to me. As a former heavy Matlab user I
> remember a lot of things to dislike, but "\" behavior was quite nice.
>
> Correct, I am not sure where this might go. It seemed like a NumPy
> array operation (rapid element-touching checks could also serve
> functionality other than solve), hence the NumPy list. But of course it
> can be pushed as an exclusive SciPy feature. I'm not sure what the
> outlook on np.linalg.solve is.
>
>
> > How much is a noticeable slowdown? Note that we still have the current
> interfaces available for users that know what they need, so a nice
> convenience function that is, say, 5-10% slower would not be the end of
> the world.
>
> The fastest case was around 150-400% slower, but of course it might be
> that I'm not using the fastest methods. It was mostly shuffling things
> around and using np.any on them in the pure Python 3 case. I will cook up
> something again for the baseline as soon as I have time.
>
>
>
All these checks sound a bit expensive if we almost always have completely
unstructured arrays that don't satisfy any special matrix pattern.

In analogy to the type proliferation in Julia to handle those cases: is
there a way to attach information to numpy arrays that, for example,
signals that a 2d array is hermitian, banded, diagonal or ...?

(On second thought: maybe completely unstructured is not too expensive to
detect if the checks are short-circuited: one off-diagonal element nonzero
- not diagonal, two opposite off-diagonal elements different - not
symmetric, ...)
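
A minimal sketch of one way such a hint could be attached, via an ndarray
subclass (purely illustrative; nothing like this exists in numpy):

import numpy as np

class StructuredArray(np.ndarray):
    # ndarray carrying a structure hint such as 'hermitian', 'banded'
    # or 'diagonal'; a solver could dispatch on it and skip the checks.
    def __new__(cls, input_array, structure=None):
        obj = np.asarray(input_array).view(cls)
        obj.structure = structure
        return obj

    def __array_finalize__(self, obj):
        # Propagate the hint through views; a careful version would
        # drop it instead, since slicing can destroy the structure.
        self.structure = getattr(obj, 'structure', None)

A = StructuredArray(np.diag([1., 2., 3.]), structure='diagonal')
print(A.structure)  # diagonal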

Josef






[Numpy-discussion] ANN: Bokeh 0.12.4 Released

2017-01-09 Thread Bryan Van de Ven
Hi all,

On behalf of the Bokeh team, I am pleased to announce the release of version 
0.12.4 of Bokeh!

Please see the announcement post at:

https://bokeh.github.io/blog/2017/1/6/release-0-12-4/

which has more information as well as live demonstrations. 

If you are using Anaconda/miniconda, you can install it with conda:

conda install -c bokeh bokeh

Alternatively, you can also install it with pip:

pip install bokeh

Full information, including details about how to use and obtain BokehJS, is at:

http://bokeh.pydata.org/en/0.12.4/docs/installation.html

Issues, enhancement requests, and pull requests can be made on the Bokeh Github 
page: https://github.com/bokeh/bokeh

Documentation is available at http://bokeh.pydata.org/en/0.12.4

There are over 200 total contributors to Bokeh and their time and effort help 
make Bokeh such an amazing project and community. Thank you again for your 
contributions. 

Finally (as always): questions, requests for technical assistance, and
offers to contribute can be directed to the Bokeh mailing list:
bo...@continuum.io or the Gitter chat room: https://gitter.im/bokeh/bokeh

Thanks,

Bryan Van de Ven


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Ilhan Polat
> Note that you're proposing a new scipy feature (right?) on the numpy
list

> This sounds like a good idea to me. As a former heavy Matlab user I
remember a lot of things to dislike, but "\" behavior was quite nice.

Correct, I am not sure where this might go. It seemed like a NumPy array
operation (rapid element-touching checks could also serve functionality
other than solve), hence the NumPy list. But of course it can be pushed as
an exclusive SciPy feature. I'm not sure what the outlook on
np.linalg.solve is.


> How much is a noticeable slowdown? Note that we still have the current
interfaces available for users that know what they need, so a nice
convenience function that is, say, 5-10% slower would not be the end of the
world.

The fastest case was around 150-400% slower, but of course it might be
that I'm not using the fastest methods. It was mostly shuffling things
around and using np.any on them in the pure Python 3 case. I will cook up
something again for the baseline as soon as I have time.


Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve

2017-01-09 Thread Ralf Gommers
On Mon, Jan 9, 2017 at 4:17 AM, Ilhan Polat wrote:

>
> Hi everyone,
>
> I was stalking the "deprecating numpy.matrix" discussion on the other
> thread, and I wondered whether the mailing list might be a better place
> for something I've been meaning to ask the dev members. I thought mailing
> lists were something we stopped using together with ICQ and GeoCities,
> but apparently not :-)
>
> Anyways, first things first: I have been in need of the ill-conditioned
> warning behavior of matlab (and possibly other software suites) for my own
> work. So I looked around in the numpy issues and found
> https://github.com/numpy/numpy/issues/3755 some time ago. Then I learned
> from @rkern that there were C translations involved in the numpy source,
> and frankly I couldn't even find the entry point of how the project is
> structured, so I switched to the SciPy side where things are a bit more
> convenient. Besides teaching me more about the f2py machinery, this made
> me notice that the linear algebra module is a bit less polished than the
> usual level of SciPy, though that is definitely a personal opinion.
>
> So in order to get the ill-conditioning warning (or at least the
> condition number) I've wrapped up a PR using the expert routines of
> LAPACK (which I think is ready to merge), but it is still far from the
> convenience that contemporary software generally offers.
>
> https://github.com/scipy/scipy/pull/6775
>
> The "assume_a" keyword introduced here is hopefully modular enough that
> should there be any need for more structures we can simply keep adding to
> the list without any backwards compatibility. It will be at least offering
> more options than what we have currently. The part that I would like to
> discuss requires a bit of intro so please bear with me. Let me copy/paste
> the part from the old PR:
>
>   In many places online we can witness the rant about numpy/scipy not
> letting the users know about the conditioning, for example Mike
> Croucher's blog and numpy/numpy#3755.
>
>   Since we don't have any central backslash function that optimizes
> depending on the input data, should we create a function (let's name it
> linsolve, after the matlab equivalent, for the time being) such that it
> automatically calls the right solver? This way we can avoid writing new
> functions for each LAPACK driver. As a reference, here is a SO thread
> that summarizes the linsolve functionality.
>

Note that you're proposing a new scipy feature (right?) on the numpy
list

This sounds like a good idea to me. As a former heavy Matlab user I
remember a lot of things to dislike, but "\" behavior was quite nice.


> I'm sure you are aware, but just for completeness, the linear equation
> solvers are often built around the concept of a polyalgorithm, which is a
> fancy way of saying that the array is tested consecutively for certain
> structures, with the checks ordered such that the simpler structures are
> tested sooner. E.g. first check for a diagonal matrix, then for
> upper/lower triangular, then permuted triangular, then symmetric, and so
> on. Here is also another example from AdvanPix:
> http://www.advanpix.com/2016/10/07/architecture-of-linear-systems-solver/
>
> Now, according to what I have coded and optimized as much as I can, pure
> Python overhead during these checks is not acceptable. It would
> definitely be a noticeable slowdown if this were in place in the existing
> linalg.solve; however, I think this is certainly doable at the low-level
> C/FORTRAN level.
>

How much is a noticeable slowdown? Note that we still have the current
interfaces available for users that know what they need, so a nice
convenience function that is, say, 5-10% slower would not be the end of the
world.

Ralf



> Cython is certainly faster, but I think only a straight C/FORTRAN
> implementation would cut it. Note that we only need the discovery of the
> structure; then we can pass to the dedicated solver as is. Hence I'm not
> saying that we should recode the existing solve functionality. We already
> have the branching in place to the ?GE/SY/HE/POSVX routines.
>
> ---
>
> The second issue with the current linalg.solve function arises when
> trying to solve for the right inverse, e.g. xA = b. Again with some
> copy/paste: the right inversion is currently a bit annoying; that is to
> say, if we would like to compute, say, BA^{-1}, then the user has to
> solve the explicitly transposed equation and transpose back, to avoid
> using an explicit inv (whose use should be discouraged anyways):
> x = scipy.linalg.solve(A.T, B.T).T.
>
> Since expert drivers come with a trans switch that can internally handle
> whether to solve the transposed or the regular