Re: [Numpy-discussion] Move scipy.org docs to Github?

2017-03-16 Thread Pauli Virtanen
Thu, 16 Mar 2017 08:15:08 +0100, Didrik Pinte kirjoitti:
>> The advantage of something like github pages is that it's big enough
>> that it *does* have dedicated ops support.
>
> Agreed. One issue is that we are working with a lot of legacy. Github
> will more than likely be a great solution to host static web pages but
> the evaluation for the shift needs to get into all the funky legacy
> redirects/rewrites we have in place, etc. This is probably not a real
> issue for docs.scipy.org but would be for other services.

IIRC, there are not that many of them, so in principle it should be 
possible to cover them with redirects.

>> As long as we can fit under the 1 gig size limit then GH pages seems
>> like the best option so far... it's reliable, widely understood, and
>> all of the limits besides the 1 gig size are soft limits where they say
>> they'll work with us to figure things out.
>
> Another option would be to just host the content under S3 with
> Cloudfront.
> It will also be pretty simple as a setup, scale nicely and won't have
> many restrictions on sizing.

Some minor-ish disadvantages of this are that it brings a new set of 
credentials to manage, it will be somewhat less transparent, and the 
tooling will be less familiar to people (eg release managers) who have to 
deal with it.



Re: [Numpy-discussion] Move scipy.org docs to Github?

2017-03-15 Thread Pauli Virtanen
Wed, 15 Mar 2017 14:56:52 -0700, Nathaniel Smith kirjoitti:
[clip]
> As long as we can fit under the 1 gig size limit then GH pages seems
> like the best option so far... it's reliable, widely understood, and all
> of the limits besides the 1 gig size are soft limits where they say
> they'll work with us to figure things out.

The Scipy html docs weigh 60M apiece and numpy's 35M, so it can be done 
if only a limited number of releases are kept, with the rest and the 
auxiliary files provided as release downloads.

Bandwidth is probably not a problem either, as you can stick a CDN in 
front if the traffic gets too heavy.

Does that sound sensible? Certainly it's the lowest-effort approach, and 
it would simplify management of S3/etc origin site access permissions.

Pauli



Re: [Numpy-discussion] Move scipy.org docs to Github?

2017-03-15 Thread Pauli Virtanen
Wed, 15 Mar 2017 12:11:09 -0400, Marten van Kerkwijk kirjoitti:
> Astropy uses readthedocs quite happily (auto-updates on merges to master
> too).

AFAIK, scipy cannot be built on readthedocs.



Re: [Numpy-discussion] Ensuring one can operate on array-like argument in place

2016-11-13 Thread Pauli Virtanen
Sat, 12 Nov 2016 17:00:07 +, Pavlyk, Oleksandr kirjoitti:
[clip]
> if x_arr is not x:
>     in_place = 1  # a copy was made, so we can work in place.
>
> The logic of the last line turns out to be incorrect, because the
> input x can be a class with an array interface.

Please see:

https://github.com/scipy/scipy/blob/master/scipy/linalg/misc.py#L169

This probably can be translated to equivalent Numpy C API calls.
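
For reference, the helper at that location is essentially the following
check (a sketch from memory, not a verbatim copy):

    import numpy as np

    def _datacopied(arr, original):
        # Strict check for `arr` not sharing any data with `original`,
        # under the assumption that arr = np.asarray(original).
        if arr is original:
            return False
        if not isinstance(original, np.ndarray) and hasattr(original, '__array__'):
            # `original` only exposes an array interface; asarray() may
            # still hand back memory owned by it, so don't claim a copy.
            return False
        return arr.base is None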



Re: [Numpy-discussion] automatically avoiding temporary arrays

2016-10-03 Thread Pauli Virtanen
Mon, 03 Oct 2016 15:07:28 -0400, Benjamin Root kirjoitti:
> With regards to arguments about holding onto large arrays, I would like
> to emphasize that my original suggestion mentioned weakref'ed numpy
> arrays.
> Essentially, the idea is to claw back only the raw memory blocks during
> that limbo period between discarding the numpy array python object and
> when python garbage-collects it.

CPython AFAIK deallocates immediately when the refcount hits zero. It's 
relatively rare to have arrays hanging around waiting for cycle breaking 
by the gc, and if some are hanging around, I don't think it's possible 
to distinguish them from other arrays without running the gc.
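
The immediate deallocation is easy to see with a weak reference (a
minimal sketch; relies on CPython's reference counting):

    import weakref
    import numpy as np

    a = np.arange(10**6)
    r = weakref.ref(a)   # ndarrays support weak references
    del a                # refcount drops to zero: freed immediately here
    print(r())           # -> None, with no gc.collect() involved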

Note also that an "is this array still in use" check probably always 
requires Julian's stack-based hack, since you cannot rely on the refcount.

Pauli



Re: [Numpy-discussion] New iterator APIs (nditer / MapIter): Overlap detection in NumPy

2016-09-12 Thread Pauli Virtanen
Mon, 12 Sep 2016 11:31:07 +0200, Sebastian Berg kirjoitti:
>> * NPY_ITER_COPY_IF_OVERLAP, NPY_ITER_OVERLAP_NOT_SAME
>>   flags for NpyIter_New.
>> 
>> * New API function PyArray_MapIterArrayCopyIfOverlap,
>>   as ufunc.at needs to check overlaps for index arrays before
>>   constructing iterators, and the parsing is done in multiarray.
> 
> I think here Nathaniel's point might be right. It could be we can assume
> that copying is always fine; there are probably only one or two
> downstream projects using this function, plus it seems harder to create
> abusing structures that actually do something useful.
> It was only exposed for usage in `ufunc.at`, if I remember right. I know
> theano uses it though, but not sure about anyone else, maybe numba. On
> the other hand, it is not the worst API clutter in history.

Do you suggest that I break the PyArray_MapIterArray API?

One issue here is that the function doesn't make a distinction between 
read-only access and read-write access, so copying may give an 
unnecessary slowdown. The second thing is that it will result in somewhat 
uglier code, as I need to manage the overlap with the second operation 
in ufunc_at.

For NpyIter, I'd still be wary about copying by default, because it's not 
needed everywhere (the may_share_memory checks are better done earlier), 
and because the semantic change can break things inside Numpy.

Pauli



[Numpy-discussion] New iterator APIs (nditer / MapIter): Overlap detection in NumPy

2016-09-11 Thread Pauli Virtanen
Hi,

In the end, some further API additions turned out to be needed:

* NPY_ITER_COPY_IF_OVERLAP, NPY_ITER_OVERLAP_NOT_SAME
  flags for NpyIter_New.

* New API function PyArray_MapIterArrayCopyIfOverlap,
  as ufunc.at needs to check overlaps for index arrays 
  before constructing iterators, and the parsing is done 
  in multiarray.

Continuation here: https://github.com/numpy/numpy/pull/8043



Wed, 07 Sep 2016 18:02:59 +0200, Sebastian Berg kirjoitti:

> Hi all,
> 
> Pauli just opened a nice pull request [1] to add overlap detection to
> the new iterator, this means adding a new iterator flag:
> 
> `NPY_ITER_COPY_IF_OVERLAP`
> 
> If passed to the iterator (also exposed in python), the iterator will
> copy the operands such that reading and writing should only occur for
> identical operands. For now this is implemented by always copying the
> output/writable operand (this could be improved though, so I would not
> say its fixed API).
> 
> Since adding this flag is new API, please feel free to suggest other
> names/approaches or even decline the change ;).
> 
> 
> This is basically a first step, which should be easily followed by
> adding overlap detection to ufuncs, removing traps such as the well (or
> not so well) known `a += a.T`. Other parts of numpy may follow one by
> one.
> 
> The work is based on his older awesome new memory overlap detection
> implementation.
> 
> If there are no comments, I will probably merge it very soon, so we can
> look at the follow up things.
> 
> - Sebastian
> 
> 
> [1] https://github.com/numpy/numpy/pull/8026




Re: [Numpy-discussion] New iterator API (nditer): Overlap detection in NumPy

2016-09-07 Thread Pauli Virtanen
Wed, 07 Sep 2016 09:22:24 -0700, Nathaniel Smith kirjoitti:
[clip]
> I wonder if there is any way we can avoid the flag, and just make this
> happen automatically when appropriate? nditer has too many "unbreak-me"
> flags already.
> 
> Are there any cases where we *don't* want the copy-if-overlap behavior?
> Traditionally overlap has triggered undefined behavior, so there's no
> backcompat issue, right?

I didn't put it on by default, because of backward compatibility and side 
effects that break things.

On side effects: there are some bugs in ufunc code that need fixing if 
the flag is turned on (wheremask code breaks, and ufuncs write to wrong 
output arrays). Moreover, copying write operands with updateifcopy marks 
the original arrays as read-only, until the copied array is decrefed. 
There may also be other side effects that are not so obvious.

The PR is not mergeable with the flag on by default --- that would 
require inspecting all the uses of the iterator in the numpy codebase 
and making sure there's no weird stuff done. I'm not sure how much 3rd 
party code is using the iterator, but I'm also a bit worried that copies 
would break assumptions there.

It might be possible to turn it on by default for operands with COPY or 
UPDATEIFCOPY flags --- but I'm not sure if that's helpful (then you'd 
need to set the flags on all input operands).

-- 
Pauli Virtanen



Re: [Numpy-discussion] test_closing_fid (in test_io.py) on PyPy

2016-08-05 Thread Pauli Virtanen
Fri, 05 Aug 2016 10:06:02 +0300, Matti Picus kirjoitti:
[clip]
> I can submit a pull request to skip on pypy, or should this be solved in
> a more substantial way?

It should also be safe to just skip it on PyPy; the test checks that the 
wrong way of using np.load also happens to work on CPython.

-- 
Pauli Virtanen



Re: [Numpy-discussion] StackOverflow documentation

2016-07-21 Thread Pauli Virtanen
Thu, 21 Jul 2016 16:24:15 +0200, mail kirjoitti:
[clip]
> Since it is run by the community, perhaps it's not a bad idea to
> encourage people to share their examples.

I would perhaps rather encourage people to improve the "Numpy User Guide" 
or the main documentation. Of course, working on those requires a 
somewhat different level of commitment than editing what is essentially a 
Stack Exchange-provided wiki (where content gets relicensed with 
attribution clauses that require you to reference Stack Exchange rather 
than the original author directly).

-- 
Pauli Virtanen



Re: [Numpy-discussion] deterministic, reproducible matmul / __matmult_

2016-07-11 Thread Pauli Virtanen
Mon, 11 Jul 2016 13:01:49 -0400, Jason Newton kirjoitti:
> Does the ML have any ideas on how one could get a matmul that will not
> allow any funny business on the evaluation of the products?  Funny
> business here is something like changing the evaluation order additions
> of terms. I want strict IEEE 754 compliance - no 80 bit registers, and
> perhaps control of the rounding mode, no unsafe math optimizations.

If you link Numpy with BLAS and LAPACK libraries that have been 
compiled for this purpose, and turn on the compiler flags that enforce 
strict IEEE (and disable SSE) when compiling Numpy, you will probably get 
reproducible results. Numpy itself just offloads the dot computations to 
BLAS, so if your BLAS is reproducible, things should mostly be OK.

You may also need to turn off the SSE optimizations in Numpy, because 
these can make results depend on memory alignment --- not in dot 
products, but in other computations.

Out of curiosity, what is the application where this is necessary?
Maybe there is a numerically stable formulation?

-- 
Pauli Virtanen



[Numpy-discussion] Benchmark regression feeds

2016-06-24 Thread Pauli Virtanen
Hi,

In case someone is interested in getting notifications of performance
regressions in the Numpy and Scipy benchmarks, this is available as Atom
feeds at:

https://pv.github.io/numpy-bench/regressions.xml

https://pv.github.io/scipy-bench/regressions.xml

-- 
Pauli Virtanen


Re: [Numpy-discussion] Why does asarray() create an intermediate memoryview?

2016-03-27 Thread Pauli Virtanen
Sun, 27 Mar 2016 17:00:51 -0400, Alexander Belopolsky kirjoitti:
[clip]
> Why can't a.base be base?  What is the need for the intermediate
> memoryview object?

Implementation detail vs. life cycle management of buffer acquisitions.

The PEP3118 Py_buffer structure representing an acquired buffer is a C 
struct that is not safe to copy (!), and needs to sit in an allocated 
blob of memory whose life cycle has to be managed. The acquisition also 
needs to be released after use.

Python's memoryview object happens to be a convenient way to babysit this.

Rather than adding a new entry to the ArrayObject struct for a potential 
acquired buffer and inserting corresponding release calls, I picked a 
more localized solution where the acquisition is managed by the 
memoryview object rather than ndarray itself, and the life cycle works out 
via the pre-existing ndarray.base refcounting.
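
For illustration, the resulting structure looks like this (a small
sketch; the exact types shown depend on the numpy and Python 3 versions):

    import numpy as np

    buf = bytearray(b"spam")
    a = np.asarray(buf)
    print(type(a.base))       # <class 'memoryview'>: babysits the Py_buffer
    print(a.base.obj is buf)  # True: the original object stays reachable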

-- 
Pauli Virtanen



Re: [Numpy-discussion] ATLAS build errors

2016-03-27 Thread Pauli Virtanen
Sat, 26 Mar 2016 14:05:24 -0700, Matthew Brett kirjoitti:
> I'm working on building manylinux wheels for numpy, and I ran into
> unexpected problems with a numpy built against the ATLAS 3.8 binaries
> supplied by CentOS 5.
[clip]
> Does anyone recognize these?   How should I modify the build to avoid
> them?

Maybe the ATLAS binaries supplied were compiled with g77 instead of 
gfortran. If so, they should not be used with gfortran --- they need to 
be recompiled.

Also, in the past ATLAS binaries shipped by distributions had severe 
bugs. However, 3.8.x may be a new enough version.

-- 
Pauli Virtanen



Re: [Numpy-discussion] GSoC?

2016-03-04 Thread Pauli Virtanen
Thu, 11 Feb 2016 00:02:52 +0100, Ralf Gommers kirjoitti:
[clip]
> OK first version:
> https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas I kept some
> of the ideas from last year, but removed all potential mentors as the
> same people may not be available this year - please re-add yourselves
> where needed.
> 
> And to everyone who has a good idea, and preferably is willing to mentor
> for that idea: please add it to that page.

I probably don't have bandwidth for mentoring, but as the Numpy 
suggestions seem to be mostly "hard" problems, we can add another 
one:

## Dealing with overlapping input/output data

Numpy operations where output arrays overlap with 
input arrays can produce unexpected results.
A simple example is
```
x = np.arange(100*100).reshape(100,100)
x += x.T  # <- undefined result!
```
The task is to change Numpy so that the results
here become similar to as if the input arrays
overlapping with output were separate (here: `x += x.T.copy()`).
The challenge here lies in doing this without sacrificing 
too much performance or memory efficiency.

Initial steps toward solving this problem were taken in
https://github.com/numpy/numpy/pull/6166
where a simplest available algorithm for detecting
if arrays overlap was added. However, this is not yet
utilized in ufuncs. An initial attempt to sketch what 
should be done is at https://github.com/numpy/numpy/issues/6272
and issues referenced therein.
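
The detection routine from gh-6166 is exposed as `np.shares_memory`
in numpy 1.11, next to the older, cheaper `np.may_share_memory`;
a small illustration (assuming numpy >= 1.11):
```
x = np.arange(100*100).reshape(100,100)
np.shares_memory(x, x.T)             # True: exact (worst-case slow) check
np.may_share_memory(x[:10], x[90:])  # False: cheap bounds-only check
```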



Re: [Numpy-discussion] Numpy 1.11.0rc1 released.

2016-02-23 Thread Pauli Virtanen
23.02.2016, 22:40, Charles R Harris kirjoitti:
[clip]
> On all 32-bit platforms:
> 
> 
> ERROR: test_zeros_big (test_multiarray.TestCreation)
> 
> Traceback (most recent call last):
>   File "X:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py",
> line 594, in test_zeros_big
> d = np.zeros((30 * 1024**2,), dtype=dt)
> MemoryError
> 
> I would be much obliged if someone else could demonstrate it.

Memory fragmentation in the 2GB address space available? If dt == float64,
that requires 30 * 1024**2 * 8 bytes = 240 MiB of contiguous address space.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Numpy 1.11.0rc1 released.

2016-02-23 Thread Pauli Virtanen
23.02.2016, 03:47, Charles R Harris kirjoitti:
> I'm delighted to announce the release of Numpy 1.11.0rc1. Hopefully the
> issues discovered in 1.11.0b3 have been dealt with and this release can go
> on to become the official release. Source files and documentation can be
> found on Sourceforge
> , while
> source files and OS X wheels for Python 2.7, 3.3, 3.4, and 3.5 can be
> installed from Pypi. Please test thoroughly.

FWIW https://travis-ci.org/pv/testrig/builds/108384173



Re: [Numpy-discussion] NumPy 1.11.0b3 released.

2016-02-10 Thread Pauli Virtanen
10.02.2016, 04:09, Charles R Harris kirjoitti:
> I'm pleased to announce the release of NumPy 1.11.0b3. This beta contains
[clip]
> Please test, hopefully this will be that last beta needed.

FWIW, https://travis-ci.org/pv/testrig/builds/108384173




Re: [Numpy-discussion] Numpy 1.11.0b2 released

2016-02-05 Thread Pauli Virtanen
05.02.2016, 19:55, Nathaniel Smith kirjoitti:
> On Feb 5, 2016 8:28 AM, "Chris Barker - NOAA Federal" wrote:
>>
>>> An extra ~2 hours of tests / 6-way parallelism is not that big a deal
>>> in the grand scheme of things (and I guess it's probably less than
>>> that if we can take advantage of existing binary builds)
>>
>> If we set up a numpy-testing conda channel, it could be used to cache
>> binary builds for all he versions of everything we want to test
>> against.
>>
>> Conda-build-all could make it manageable to maintain that channel.
> 
> What would be the advantage of maintaining that channel ourselves instead
> of using someone else's binary builds that already exist (e.g. Anaconda's,
> or official project wheels)?

ABI compatibility. However, as I understand it, backward ABI-incompatible
changes in Numpy are not expected in the future.

If they were, note that by working in the same environment and enabling
ccache/f90cache, you can push repeated compilation times to nearly zero
compared to the time it takes to run the tests, with less configuration.




Re: [Numpy-discussion] Numpy 1.11.0b2 released

2016-02-04 Thread Pauli Virtanen
04.02.2016, 07:56, Nathaniel Smith kirjoitti:
[clip]
> Whoops, got distracted talking about the results and forgot to say --
> I guess we should think about how to combine these? I like the
> information on warnings, because it helps gauge the impact of
> deprecations, which is a thing that takes a lot of our attention. But
> your approach is clearly fancier in terms of how it parses the test
> results. (Do you think the fanciness is worth it? I can see an
> argument for crude and simple if the fanciness ends up being fragile,
> but I haven't read the code -- mostly I was just being crude and
> simple because I'm lazy :-).)

The fanciness is essentially a question of implementation language and
ease of writing the reporting code. At 640 SLOC it's probably not so bad.

I guess it's reasonably robust --- the test report formats are unlikely
to change, and pip/virtualenv will probably continue to work, especially
with a pinned pip version.

It should be simple to also extract the warnings from the test stdout.

I'm not sure if the order of test results is deterministic in
nose/py.test, so I don't know if just diffing the outputs always works.

Building downstream from source avoids future binary compatibility issues.

[clip]
> Maybe it should be uploading the reports somewhere? So there'd be a
> readable "what's currently broken by 1.x" page, plus with persistent
> storage we could get travis to flag if new additions to the release
> branch causes any new failures to appear? (That way we only have to
> remember to look at the report manually once per release, instead of
> constantly throughout the process.)

This is probably possible to implement, although I'm not sure how much
value it adds compared to a travis matrix, eg.
https://travis-ci.org/pv/testrig/

Of course, if the suggestion is that the results are generated
somewhere else than on travis, then that's a different matter.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Numpy 1.11.0b2 released

2016-02-02 Thread Pauli Virtanen
01.02.2016, 23:25, Ralf Gommers kirjoitti:
[clip]
> So: it would really help if someone could pick up the automation part of
> this and improve the stack testing, so the numpy release manager doesn't
> have to do this.

quick hack: https://github.com/pv/testrig

Not that I'm necessarily volunteering to maintain the setup, though. But
if it seems useful, move it under the numpy org.

-- 
Pauli Virtanen



[Numpy-discussion] Downstream integration testing

2016-01-31 Thread Pauli Virtanen
31.01.2016, 12:57, Julian Taylor kirjoitti:
[clip]
> Testing or delegating testing of at least our major downstreams should be
> the job of the release manager.
> Thus I also disagree with our more frequent releases. It puts too much
> porting and testing effort on our downstreams and it gets infeasible for
> a volunteer release manager to handle.
> I fear by doing this we will end up in a situation where more
> downstreams put upper bounds on their supported numpy releases like e.g.
> astropy already did.
> This has bad consequences like the subclass breaking of linspace that
> should have been caught months ago but was not, because upstreams were
> discouraging users from upgrading numpy because they could not keep up
> with porting.

I'd suggest that some automation could reduce the maintainer burden
here. Basically, I think downstream breakage could be detected without
too much manual intervention.

For example, automated test rig that does the following:

- run tests of a given downstream project version, against
  previous numpy version, record output

- run tests of a given downstream project version, against
  numpy master, record output

- determine which failures were added by the new numpy version

- make this happen with just a single command, eg "python run.py",
  and implement it for several downstream packages and versions.
  (Probably good to steal ideas from travis-ci dependency matrix etc.;
  a rough sketch follows below.)
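
Something along these lines (a minimal sketch; all helper names,
version pins and log-parsing details are hypothetical placeholders):

    import subprocess

    def test_log(env, numpy_spec, project="scipy", version="0.16.1"):
        pip = env + "/bin/pip"
        subprocess.check_call([pip, "install", numpy_spec])
        subprocess.check_call([pip, "install", project + "==" + version])
        # record the full test output for later comparison
        p = subprocess.Popen([env + "/bin/python", "-c",
                              "import {0}; {0}.test()".format(project)],
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        out, _ = p.communicate()
        return out.decode("utf-8", "replace")

    def failures(log):
        return set(line for line in log.splitlines()
                   if line.startswith(("FAIL:", "ERROR:")))

    old = test_log("env-old", "numpy==1.10.4")
    new = test_log("env-new", "git+https://github.com/numpy/numpy")
    for item in sorted(failures(new) - failures(old)):
        print(item)   # failures introduced by the new numpy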

This is probably too time-intensive and a waste of resources for
Travis-CI, but it could be run by the Numpy maintainer or someone else
during the release process, or periodically on some ad-hoc machine if
someone is willing to set it up.

Of course, understanding the cause of breakages would take some
understanding of the downstream package, but this would at least ensure
we are aware of stuff breaking --- provided it's covered by the
downstream test suite, of course.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Downstream integration testing

2016-01-31 Thread Pauli Virtanen
31.01.2016, 14:41, Daπid kirjoitti:
> On 31 Jan 2016 13:08, "Pauli Virtanen" <p...@iki.fi> wrote:
>> For example, automated test rig that does the following:
>>
>> - run tests of a given downstream project version, against
>>   previous numpy version, record output
>>
>> - run tests of a given downstream project version, against
>>   numpy master, record output
>>
>> - determine which failures were added by the new numpy version
>>
>> - make this happen with just a single command, eg "python run.py",
>>   and implement it for several downstream packages and versions.
>>   (Probably good to steal ideas from travis-ci dependency matrix etc.)
> 
> A simpler idea: build the master branch of a series of projects and run the
> tests. In case of failure, we can compare with Travis's logs from the
> project when they use the released numpy. In most cases, the master branch
> is clean, so an error will likely be a change in behaviour.

If you can assume the tests of a downstream project are in an OK state,
then you can skip the build against existing numpy.

But it's an additional and unnecessary burden for the Numpy maintainers
to compare the logs manually (and check the built versions are the same,
and that the difference is not due to difference in build environments).
I would also avoid depending on the other projects' Travis-CI
configurations, since these may change.

I think testing released versions of downstream projects is better than
testing their master versions here, as the master branch may contain
workarounds for Numpy changes and not be representative of what people
get on their computers after a Numpy release.

> This can be run automatically once a week, to not hog too much of Travis,
> and counting the costs in hours of work, is very cheap to set up, and free
> to maintain.

It may be that such a project could be run on Travis, if split into
per-project runs to work around the 50-minute timeout.

I'm not aware of Travis-CI having support for "automatically once per
week" builds.

Anyway, having any form of central automated integration testing would
be better than the current situation where it's mostly all-manual and
relies on the activity of downstream project maintainers.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Compilation problems npy_float64

2015-11-05 Thread Pauli Virtanen
Thu, 05 Nov 2015 16:26:18 +, Johan kirjoitti:
> Hello, I searched the forum, but couldn't find a post related to my
> problem.  I am installing scipy via pip in cygwin environment
[clip]
> /usr/include/math.h:263:15: note: previous declaration ‘double
> infinity()’
>  extern double infinity _PARAMS((void));
>^
[clip]

This looks like some Cygwin weirdness --- a variable called
"infinity" is apparently declared there by math.h, and is thus
a reserved name.

This was fixed by (but not for this reason)
https://github.com/scipy/scipy/commit/832baa20f0b5
so you may have better luck with the dev version.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver

2015-09-30 Thread Pauli Virtanen
Juha Jeronen <... at jyu.fi> writes:
> I recently developed a Cython-based, OpenMP-accelerated quartic (and 
> cubic, quadratic) polynomial solver to address a personal research need 
> for quickly solving a very large number of independent low-degree 
> polynomial equations for both real and complex coefficients.

My 2c in this context would be to also think about how this best fits
with how collections of polynomials are represented in Numpy.

AFAICS, Numpy's polynomial module supports evaluation of polynomial
collections (cf. numpy.polynomial.polynomial.polyval), 
but the corresponding root finding routine
(numpy.polynomial.polynomial.polyroots) only deals with one
polynomial at a time.

The present code in principle could be used to speed up the latter
routine after it is generalized to multiple polynomials. The general
case is probably doable just by coding up the companion matrix
approach using low-level routines (or maybe with the new vectorized
np.linalg routines).
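
As a rough illustration of that companion-matrix route (the function
name is hypothetical; assumes nonzero leading coefficients and the
stacked (..., M, M) support of np.linalg.eigvals from numpy 1.8+):

    import numpy as np

    def polyroots_many(c):
        # c: (..., deg+1) coefficients, low-to-high order as in
        # numpy.polynomial.polynomial; one polynomial per leading index.
        c = np.asarray(c)
        n = c.shape[-1] - 1
        ratio = -c[..., :-1] / c[..., -1:]   # -c_i / c_n
        comp = np.zeros(c.shape[:-1] + (n, n), dtype=ratio.dtype)
        idx = np.arange(n - 1)
        comp[..., idx + 1, idx] = 1          # ones on the subdiagonal
        comp[..., :, -1] = ratio             # last column
        return np.linalg.eigvals(comp)       # roots, shape (..., n)

    # x**2 - 1 and x**2 - 3*x + 2:
    print(polyroots_many([[-1, 0, 1], [2, -3, 1]]))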

Some relevant code elsewhere for similar purposes can be 
found in scipy.interpolate.PPoly/BPoly (and in the future BSpline).

https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.PPoly.html

However, since it's piecewise, there's purposefully support only
for real-valued roots.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015

2015-08-26 Thread Pauli Virtanen
26.08.2015, 14:14, Francesc Alted kirjoitti:
[clip]
> 2015-08-25 12:03 GMT+02:00 Nathaniel Smith n...@pobox.com:
>> Let's focus on evolving numpy as far as we can without major
>> break-the-world changes (no numpy 2.0, at least in the foreseeable
>> future).
>>
>> And, as a target for that evolution, let's change our focus from numpy
>> as "NumPy is the library that gives you the np.ndarray object (plus
>> some attached infrastructure)", to "NumPy provides the standard
>> framework for working with arrays and array-like objects in Python".
>
> Sorry to disagree here, but in my opinion NumPy *already* provides the
> standard framework for working with arrays and array-like objects in
> Python, as its huge popularity shows.  If what you mean is that there
> are too many efforts trying to provide other, specialized data
> containers (things like DataFrame in pandas, DataArray/Dataset in
> xarray or carray/ctable in bcolz just to mention a few), then let me
> say that I am of the opinion that there can't be a silver bullet for
> tackling all the problems that the PyData community is facing.

My reading of the above was that this was about multimethods, and
allowing different types of containers to interoperate beyond the array
interface and Python's builtin operator hooks.

The exact performance details of course vary, and an algorithm written
for in-memory arrays just fails for too large on-disk or distributed
arrays. However, a case for a minimal common API probably could be made,
esp. in algorithms mainly relying on linear algebra.

This is to a degree different from subclassing, as many of the
array-like objects you might want do not have a simple strided memory model.

Pauli




Re: [Numpy-discussion] py2/py3 pickling

2015-08-25 Thread Pauli Virtanen
25.08.2015, 01:15, Chris Laumann kirjoitti:
> Would it be possible then (in relatively short order) to create
> a py2 -> py3 numpy pickle converter?

You probably need to modify the pickle stream directly, replacing
*STRING opcodes with *BYTES opcodes when it comes to objects that are
needed for constructing Numpy arrays.

https://hg.python.org/cpython/file/tip/Modules/_pickle.c#l82

Or, use a custom pickler class that emits the new opcodes when it comes
to data that is part of Numpy arrays, as Python 2 pickler doesn't know
how to write bytes opcodes.

It's probably doable, although likely annoying to implement. The pickles
created won't be loadable on Py2, only Py3.
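
The inspection half of that job can lean on the stdlib pickletools
module; a minimal sketch (the rewriting itself is left out):

    import pickletools

    def string_opcode_positions(data):
        # Walk the pickle stream and report the *STRING opcodes that a
        # py2 -> py3 converter would rewrite as *BYTES opcodes.
        for opcode, arg, pos in pickletools.genops(data):
            if 'STRING' in opcode.name:  # STRING, BINSTRING, SHORT_BINSTRING
                yield pos, opcode.name, arg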

You'd need to find a volunteer who wants to work on this or just do it
yourself, though.




Re: [Numpy-discussion] py2/py3 pickling

2015-08-24 Thread Pauli Virtanen
24.08.2015, 01:02, Chris Laumann kirjoitti:
[clip]
> Is there documentation about the limits and workarounds for py2/py3
> pickle/np.save/load compatibility? I haven't found anything except
> developer bug tracking discussions (eg. #4879 in github numpy).

Not sure if it's written down somewhere but:

- You should consider pickles not portable between Py2/3.

- Setting encoding='bytes' or encoding='latin1' should produce correct
results for numerical data. However, neither is safe, because the
option also affects data other than numpy arrays that you may have
saved.

- np.save/np.load are portable, as long as you don't save object arrays
or anything that gets converted to one by np.array (these are saved by
pickling)
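
To illustrate the np.save/np.load caveat (behavior of numpy 1.9/1.10 as
far as I know):

    import numpy as np

    np.save("num.npy", np.arange(5))         # plain data: loads on py2 and py3
    obj = np.array([{"a": 1}], dtype=object)
    np.save("obj.npy", obj)                  # object array: saved via pickle,
                                             # hence not portable across py2/py3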




Re: [Numpy-discussion] Development workflow (not git tutorial)

2015-08-14 Thread Pauli Virtanen
14.08.2015, 20:45, Allan Haldane kirjoitti:
[clip]
> Related to this, does anyone know how to debug numpy in gdb with proper
> symbols/source lines, like I can do with other C extensions? I've tried
> modifying numpy distutils to try to add the right compiler/linker flags,
> without success.

runtests.py --help

gdb --args python runtests.py -g --python script.py

grep env runtests.py



Re: [Numpy-discussion] Development workflow (not git tutorial)

2015-08-14 Thread Pauli Virtanen
15.08.2015, 01:44, Chris Barker kirjoitti:
[clip]
> numpy doesn't use namespace packages, so develop mode works there.

The develop mode is mainly useful with a virtualenv.

Otherwise, you install a work-in-progress development version into your
~/.local, which then breaks everything else. In addition, "python
setupegg.py develop --uninstall" says "Note: you must uninstall or
replace scripts manually!", and since the scripts end up with the dev
version requirement hardcoded, you indeed have to delete them manually.

Virtualenvs are annoying to manage, and at least for me personally it's
easier to just deal with pythonpath, especially as runtests.py manages that.

Anyway, TIMTOWTDI



Re: [Numpy-discussion] Numpy, BLAS, and CBLAS questions

2015-07-13 Thread Pauli Virtanen
13.07.2015, 20:08, Nathaniel Smith kirjoitti:
[clip]
> Keep in mind that any solution needs to support weird systems too,
> including Windows. I'm not sure we can assume that all BLAS libraries are
> ABI compatible either. Debian/Ubuntu make sure that this is true for the
> ones they ship, but not all systems are so carefully managed. For LAPACK in
> particular I'm pretty sure there are multiple ABIs in active use that scipy
> supports via code in numpy.distutils. (Hopefully someone with more
> expertise will speak up.)

On the ABI issue: there are two ABI issues to solve:

1. BLAS/LAPACK ABI <-> Fortran compiler ABI

2. Fortran compiler ABI <-> C

Issue 1 is simple for users who have configured their compiler
environment so that the Fortran/BLAS/LAPACK chain is all ABI compatible;
ACML and MKL provide variants for different Fortran compilers. However,
on OSX the available Fortran compiler (usually gfortran) is not ABI
compatible with the system BLAS/LAPACK (Accelerate).

Issue 1+2 is something that is solved by the likes of CBLAS/LAPACKE.

Assuming an user who has properly configured
Fortran/BLAS/LAPACK/CBLAS/LAPACKE environment, the ABI issues are not a
problem that we would need to care about. However, especially on OSX,
providing a painless out-of-the-box experience is in conflict with this
assumption.

***

Examples of Fortran ABIs in the different libraries:

Eigen seems to assume gfortran ABI, not sure.
OSX Accelerate always uses g77 ABI (modulo bugs).
OpenBLAS can be either AFAIK depending on its configuration.
MKL provides several ABI variants for different compilers, with
different library names.
ACML seems to do so too.

Both BLAS and LAPACK have functions whose proper call convention in the
Fortran interface varies in g77 vs. gfortran. The part that doesn't vary
is often considered (eg. by f2py and many others) safe to call directly
from C modulo name mangling, although nothing strictly speaking
guarantees this.

Numpy assumes Fortran ABI <-> C compatibility is handled by CBLAS. Numpy
only uses LAPACK functions with safe calling conventions, so LAPACKE
is not really required.

If CBLAS/LAPACKE are not used, then you have to implement it yourself.
(i) One option is to rely on a Fortran compiler, assume it is ABI
compatible with the target BLAS/LAPACK (not necessarily true on OSX),
and then connect to C either by relying on the safe calling conventions
or by using Fortran 2003 features. (ii) Alternatively, you can detect
which BLAS/LAPACK ABI variant is active, and select a suitable C shim.

Scipy uses a combination of both approaches, using a shim layer for the
problematic functions in all code (C and Fortran). The shim is
configured based on which BLAS seems to be active at compile time. If
none of the recognized special cases (eg. OSX Accelerate) apply, then it
is assumed the Fortran ABI provided by the selected LAPACK/BLAS is
compatible with the selected Fortran compiler. (Communication to Fortran
is done via the safe calling convention subset.)



Re: [Numpy-discussion] Numpy, BLAS, and CBLAS questions

2015-07-13 Thread Pauli Virtanen
13.07.2015, 19:44, Eric Martin kirjoitti:
> It seems to me that a potentially better route than add code to Numpy to
> support BLAS library for each library is to make Numpy easy to configure
> to compile with an arbitrary BLAS library (like what I've been doing).

Does this work:

export ATLAS=None
export BLAS=/path/to/libblas.a
export LAPACK=/path/to/liblapack.a
python setup.py build




Re: [Numpy-discussion] Homu

2015-06-15 Thread Pauli Virtanen
15.06.2015, 12:00, Nathaniel Smith kirjoitti:
[clip]
>   http://homu.io/

One thing to consider is the disadvantage from a security POV: this gives
full write access to the Numpy repository to whoever is running the bot.
I don't see information on who this person (or these persons) is, or on
how access to the bot and the bot account is controlled.
(Travis-CI doesn't have that AFAIK, it can only change the
passed/not-passed icons.)

Pauli



Re: [Numpy-discussion] Verify your sourceforge windows installer downloads

2015-05-28 Thread Pauli Virtanen
28.05.2015, 20:35, Sturla Molden kirjoitti:
> Pauli Virtanen <p...@iki.fi> wrote:
>
>> Is it possible to host them on github? I think there's an option to add
>> release notes and (apparently) to upload binaries if you go to the
>> Releases section --- there's one for each tag.
>
> And then Sourceforge will put up tainted installers for the benefit of
> NumPy users. :)

Well, let them. They may already be tainted, who knows. It's phishing
and malware distribution at that point, and there are some ways to deal
with that (safe browsing, AV etc).




Re: [Numpy-discussion] Verify your sourceforge windows installer downloads

2015-05-28 Thread Pauli Virtanen
28.05.2015, 20:05, David Cournapeau kirjoitti:
[clip]
>> In any case I've always been surprised that NumPy is distributed
>> through SourceForge, which has been sketchy for years now. Could it
>> simply be hosted on PyPI?
>
> They don't accept arbitrary binaries like SF does, and some of our
> installer formats can't be uploaded there.

Is it possible to host them on github? I think there's an option to add
release notes and (apparently) to upload binaries if you go to the
Releases section --- there's one for each tag.

Pauli



Re: [Numpy-discussion] Verify your sourceforge windows installer downloads

2015-05-28 Thread Pauli Virtanen
28.05.2015, 21:52, Julian Taylor kirjoitti:
> there is no guarantee that github will not do this stuff in future too;
> also PyPI or self hosting do not necessarily help, as those resources can
> be compromised.
> The main thing that should be learned from this and the many similar
> incidents in the past is that binaries from the internet need to be
> verified to ensure they have not been modified from their original state,
> otherwise they cannot be trusted.

Indeed, but on the other hand, there's no reason for us to continue
cooperating with shady partners, especially when there are easy
alternatives. We can just quietly change the main binary distribution
channel and be done with it.




Re: [Numpy-discussion] Numpy compilation error

2015-04-12 Thread Pauli Virtanen
12.04.2015, 17:15, Peter Kerpedjiev kirjoitti:
[clip]
> numpy/random/mtrand/distributions.c:892:1: internal compiler error:
> Illegal instruction

An internal compiler error means your compiler (in this case, gcc) is
broken. The easiest solution is to use a newer version of the compiler,
assuming the compiler bug in question has been fixed. Here, it probably
has, since I have not seen similar error reports before from this code.



Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-03 Thread Pauli Virtanen
03.04.2015, 04:09, josef.p...@gmail.com kirjoitti:
[clip]
> I think numpy indexing is not too difficult and follows a consistent
> pattern, and I completely avoid mixing slices and index arrays with
> ndim > 2.
>
> I think it should be DOA, except as a discussion topic for numpy 3000.

If you change how Numpy indexing works, you need to scrap a nontrivial
amount of existing code, at which point everybody should just go back to
Matlab, which at least provides a stable API.



Re: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3

2015-03-07 Thread Pauli Virtanen
07.03.2015, 01:29, Julian Taylor kirjoitti:
> On 07.03.2015 00:20, Pauli Virtanen wrote:
>> 06.03.2015, 22:43, Eric Firing kirjoitti:
>>> On 2015/03/06 10:23 AM, Pauli Virtanen wrote:
>>>> 06.03.2015, 20:00, Benjamin Root kirjoitti:
>>>>> A slightly different way to look at this is one of sharing data. If I am
>>>>> working on a system with 3.4 and I want to share data with others who may
>>>>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much
>>>>> less attractive.
>>>>
>>>> pickle is used in npy files only if there are object arrays in them.
>>>> Of course, savez could just decline saving object arrays.
>>>
>>> Or issue a prominent warning.
>>
>> https://github.com/numpy/numpy/pull/5641
>
> I think the ship for a warning has long sailed. At this point its
> probably more an annoyance for python3 users and will not prevent many
> more python2 users from saving files that can't be loaded into python3.

How about an extra use_pickle=True kwarg that can be used to disable
using pickle altogether in these routines?

Another reason to do this is arbitrary code execution when loading
pickles: https://www.cs.jhu.edu/~s/musings/pickle.html

Easily demonstrated also with npy files (loading this file will only
print something unexpected, nothing more malicious):
http://pav.iki.fi/tmp/unexpected.npy
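
To make the proposal concrete (the kwarg is hypothetical, exactly as
suggested above; it does not exist in numpy as-is):

    import numpy as np

    arr = np.array([{"a": 1}], dtype=object)
    # Hypothetical interface sketch for the proposed kwarg:
    # np.save("data.npy", arr, use_pickle=False)  # would raise instead of
    #                                             # silently pickling
    # np.load("data.npy", use_pickle=False)       # would refuse to unpickle,
    #                                             # closing the code-execution hole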



Re: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3

2015-03-06 Thread Pauli Virtanen
06.03.2015, 22:23, Pauli Virtanen kirjoitti:
> 06.03.2015, 20:00, Benjamin Root kirjoitti:
>> A slightly different way to look at this is one of sharing data. If I am
>> working on a system with 3.4 and I want to share data with others who may
>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much
>> less attractive.
>
> pickle is used in npy files only if there are object arrays in them.
> Of course, savez could just decline saving object arrays.

np.load is missing the Py2-3 workaround flags that pickle.load has,
probably could be added:

https://github.com/numpy/numpy/pull/5640
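
With that in place, the py3 side would look something like this (a
sketch; the kwarg mirrors the one pickle.load has):

    import numpy as np

    # Load a py2-written file containing object arrays on py3,
    # decoding py2 str data as latin1:
    data = np.load("py2_data.npy", encoding="latin1")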




Re: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3

2015-03-06 Thread Pauli Virtanen
06.03.2015, 22:43, Eric Firing kirjoitti:
> On 2015/03/06 10:23 AM, Pauli Virtanen wrote:
>> 06.03.2015, 20:00, Benjamin Root kirjoitti:
>>> A slightly different way to look at this is one of sharing data. If I am
>>> working on a system with 3.4 and I want to share data with others who may
>>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much
>>> less attractive.
>>
>> pickle is used in npy files only if there are object arrays in them.
>> Of course, savez could just decline saving object arrays.
>
> Or issue a prominent warning.

https://github.com/numpy/numpy/pull/5641



Re: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3

2015-03-06 Thread Pauli Virtanen
Arnd Baecker <arnd.baecker at web.de> writes:
[clip]
> Still I would have thought that this should be working out-of-the-box,
> i.e. without the pickle.loads trick?

Pickle files should be considered incompatible between Python 2 and Python 3.

Python 3 interprets all bytes objects saved by Python 2 as str and attempts
to decode them under some character encoding. The default encoding is ASCII,
so it will simply fail in most cases if the files contain any binary data.

Failing by default is also the right thing to do, since the saved bytes
objects might actually represent strings in some encoding, and ASCII is the
safest guess.

This behavior is that of Python's pickle module, and does not depend on Numpy.
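
For concreteness, the Python 3 side looks like this (a sketch; whether
'latin1' would be correct depends on what the saved bytes actually encode):

    import pickle

    with open("py2.pkl", "rb") as f:
        # The default encoding='ASCII' fails on binary py2 str data;
        # encoding='bytes' loads py2 str objects as bytes instead.
        data = pickle.load(f, encoding="bytes")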



Re: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3

2015-03-06 Thread Pauli Virtanen
06.03.2015, 20:00, Benjamin Root kirjoitti:
> A slightly different way to look at this is one of sharing data. If I am
> working on a system with 3.4 and I want to share data with others who may
> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much
> less attractive.

pickle is used in npy files only if there are object arrays in them.
Of course, savez could just decline saving object arrays.




Re: [Numpy-discussion] GSoC'15 - mentors ideas

2015-02-25 Thread Pauli Virtanen
25.02.2015, 19:59, Pauli Virtanen kirjoitti:
> 25.02.2015, 07:11, Nathaniel Smith kirjoitti:
>> Not sure if this is a full GSoC but it would be good to get the benchmarks
>> into the numpy repository, so we can start asking people who submit
>> optimizations to submit new benchmarks as part of the PR (just like other
>> changes require tests).
>
> This may be relevant in this respect:
>
> https://github.com/scipy/scipy/pull/4501

Ok, I didn't read the thread. There don't seem to be that many vbench
benchmarks, and they could probably be ported to asv fairly quickly. The
bigger job is setting up and maintaining a host that runs them
periodically. Also, asv doesn't (yet) do branches.

Pauli



Re: [Numpy-discussion] Matrix Class

2015-02-11 Thread Pauli Virtanen
11.02.2015, 21:57, Alan G Isaac kirjoitti:
[clip]
> I think gains could be in lazy evaluation structures (e.g.,
> a KroneckerProduct object that never actually produces the product
> unless forced to.)

This sounds like an abstract linear operator interface. Several attempts
have been made in this direction in the Python world, but I think none of
them has really gained traction so far.

One is even in Scipy. Unfortunately, that one's design has grown
organically, and it's mostly suited just for specifying inputs to sparse
solvers etc. rather than abstract manipulations.

If there were a popular way to deal with these objects, it could become
even more popular reasonably quickly.
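
The Scipy one is scipy.sparse.linalg.LinearOperator. A lazy Kronecker
product in that interface might look like the following sketch (using
the identity kron(A, B) @ vec(X) == vec(A @ X @ B.T) for row-major vec):

    import numpy as np
    from scipy.sparse.linalg import LinearOperator

    def lazy_kron(A, B):
        # Applies kron(A, B) without ever forming the (n*p) x (m*q) matrix.
        n, m = A.shape
        p, q = B.shape
        matvec = lambda x: A.dot(x.reshape(m, q)).dot(B.T).ravel()
        return LinearOperator((n * p, m * q), matvec=matvec)

    A, B = np.random.rand(3, 3), np.random.rand(4, 4)
    v = np.random.rand(12)
    K = lazy_kron(A, B)
    print(np.allclose(K.matvec(v), np.kron(A, B).dot(v)))  # True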




[Numpy-discussion] ANN: Scipy 0.15.1

2015-01-18 Thread Pauli Virtanen

Dear all,

We are pleased to announce the Scipy 0.15.1 release.

Scipy 0.15.1 contains only bugfixes. The module
``scipy.linalg.calc_lwork`` removed in Scipy 0.15.0 is restored.
This module is not a part of Scipy's public API, and although it is
available again in Scipy 0.15.1, using it is deprecated and it may be
removed again in a future Scipy release.

Source tarballs, binaries, and full release notes are available at
https://sourceforge.net/projects/scipy/files/scipy/0.15.1/

Best regards,
Pauli Virtanen


==========================
SciPy 0.15.1 Release Notes
==========================

SciPy 0.15.1 is a bug-fix release with no new features compared to 0.15.0.

Issues fixed
------------

* `#4413 <https://github.com/scipy/scipy/pull/4413>`__: BUG: Tests too
  strict, f2py doesn't have to overwrite this array
* `#4417 <https://github.com/scipy/scipy/pull/4417>`__: BLD: avoid
  using NPY_API_VERSION to check not using deprecated...
* `#4418 <https://github.com/scipy/scipy/pull/4418>`__: Restore and
  deprecate scipy.linalg.calc_lwork


[Numpy-discussion] ANN: Scipy 0.15.0 release

2015-01-11 Thread Pauli Virtanen

Dear all,

We are pleased to announce the Scipy 0.15.0 release.

The 0.15.0 release contains bugfixes and new features, most important
of which are mentioned in the excerpt from the release notes below.

Source tarballs, binaries, and full release notes are available at
https://sourceforge.net/projects/scipy/files/scipy/0.15.0/

Best regards,
Pauli Virtanen


==========================
SciPy 0.15.0 Release Notes
==========================

SciPy 0.15.0 is the culmination of 6 months of hard work. It contains
several new features, numerous bug-fixes, improved test coverage and
better documentation.  There have been a number of deprecations and
API changes in this release, which are documented below.  All users
are encouraged to upgrade to this release, as there are a large number
of bug-fixes and optimizations.  Moreover, our development attention
will now shift to bug-fix releases on the 0.15.x branch, and on adding
new features on the master branch.

This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or
greater.


New features
============

Linear Programming Interface
----------------------------

The new function `scipy.optimize.linprog` provides a generic
linear programming interface, similar to the way `scipy.optimize.minimize`
provides a generic interface to nonlinear programming optimizers.
Currently the only method supported is *simplex*, which provides
a two-phase, dense-matrix-based simplex algorithm. Callback
functions are supported, allowing the user to monitor the progress
of the algorithm.

Differential evolution, a global optimizer
------------------------------------------

A new `scipy.optimize.differential_evolution` function has been added to
the ``optimize`` module.  Differential Evolution is an algorithm used
for finding the global minimum of multivariate functions. It is
stochastic in nature (does not use gradient methods), and can search
large areas of candidate space, but often requires larger numbers of
function evaluations than conventional gradient-based techniques.

``scipy.signal`` improvements
-----------------------------

The function `scipy.signal.max_len_seq` was added, which computes a
Maximum Length Sequence (MLS) signal.

``scipy.integrate`` improvements
--------------------------------

It is now possible to use `scipy.integrate` routines to integrate
multivariate ctypes functions, thus avoiding callbacks to Python and
providing better performance.

``scipy.linalg`` improvements
-----------------------------

The function `scipy.linalg.orthogonal_procrustes` for solving the
Procrustes linear algebra problem was added.

BLAS level 2 functions ``her``, ``syr``, ``her2`` and ``syr2`` are now
wrapped in ``scipy.linalg``.

``scipy.sparse`` improvements
-----------------------------

`scipy.sparse.linalg.svds` can now take a ``LinearOperator`` as its main
input.

``scipy.special`` improvements
------------------------------

Values of ellipsoidal harmonic (i.e. Lame) functions and associated
normalization constants can be now computed using ``ellip_harm``,
``ellip_harm_2``, and ``ellip_normal``.

New convenience functions ``entr``, ``rel_entr``, ``kl_div``,
``huber``, and ``pseudo_huber`` were added.

``scipy.sparse.csgraph`` improvements
-------------------------------------

Routines ``reverse_cuthill_mckee`` and ``maximum_bipartite_matching``
for computing reorderings of sparse graphs were added.

``scipy.stats`` improvements
----------------------------

Added a Dirichlet multivariate distribution, `scipy.stats.dirichlet`.

The new function `scipy.stats.median_test` computes Mood's median test.

The new function `scipy.stats.combine_pvalues` implements Fisher's
and Stouffer's methods for combining p-values.

`scipy.stats.describe` returns a namedtuple rather than a tuple, allowing
users to access results by index or by name.


Deprecated features
===================

The `scipy.weave` module is deprecated.  It was the only module never
ported to Python 3.x, and is not recommended to be used for new code -
use Cython instead.  In order to support existing code, ``scipy.weave``
has been packaged separately: https://github.com/scipy/weave.  It is a
pure Python package, and can easily be installed with ``pip install
weave``.

`scipy.special.bessel_diff_formula` is deprecated.  It is a private
function, and therefore will be removed from the public API in a
following release.

``scipy.stats.nanmean``, ``nanmedian`` and ``nanstd`` functions are
deprecated in favor of their numpy equivalents.


Backwards incompatible changes
==============================

scipy.ndimage
-------------

The functions `scipy.ndimage.minimum_positions`,
`scipy.ndimage.maximum_positions` and `scipy.ndimage.extrema` return
positions as ints instead of floats.

scipy.integrate
---------------

The format of banded Jacobians in `scipy.integrate.ode` solvers is
changed. Note that the previous documentation of this feature
[Numpy-discussion] ANN: Scipy 0.14.1 release

2014-12-30 Thread Pauli Virtanen

Dear all,

We are pleased to announce the Scipy 0.14.1 release.

The 0.14.1 release is a bugfix-only release, addressing the following
issues:

- gh-3630 NetCDF reading results in a segfault
- gh-3631 SuperLU object not working as expected for complex matrices
- gh-3733 Segfault from map_coordinates
- gh-3780 Segfault when using CSR/CSC matrix and uint32/uint64
- gh-3781 Fix omitted types in sparsetools typemaps
- gh-3802 0.14.0 API breakage: _gen generators are missing from
  scipy.stats.distributions API
- gh-3805 Ndimage test failures with numpy 1.10
- gh-3812 == sometimes wrong on csr_matrix
- gh-3853 Many scipy.sparse test errors/failures with numpy 1.9.0b2
- gh-4084 Fix exception declarations for Cython 0.21.1 compatibility
- gh-4093 Avoid a memory error in splev(x, tck, der=k)
- gh-4104 Workaround SGEMV segfault in Accelerate (maintenance 0.14.x)
- gh-4143 Fix ndimage functions for large data
- gh-4149 Bug in expm for integer arrays
- gh-4154 Ensure that the 'size' argument of PIL's 'resize' method is a
  tuple
- gh-4163 ZeroDivisionError in scipy.sparse.linalg.lsqr
- gh-4164 Remove use of deprecated numpy API in lib/lapack/ f2py wrapper
- gh-4180 PIL resize support tuple fix
- gh-4168 Address arpack test failures on windows 32 bits with numpy
  1.9.1
- gh-4203 Sparse matrix multiplication in 0.14.x slower compared to
  0.13.x
- gh-4218 Make ndimage interpolation compatible with numpy relaxed
  strides
- gh-4225 Off-by-one error in PPoly shape checks
- gh-4248 Fix issue with incorrect use of closure for slsqp

Source tarballs and binaries are available at
https://sourceforge.net/projects/scipy/files/SciPy/0.14.1/

Best regards,
Pauli Virtanen


Re: [Numpy-discussion] Context manager for seterr

2014-12-14 Thread Pauli Virtanen
15.12.2014, 02:12, Stefan van der Walt kirjoitti:
 Since the topic of context managers recently came up, what do you think
 of adding a context manager for seterr?
 
 with np.seterr(divide='ignore'):
     frac = num / denom

There's this:

with np.errstate(divide='ignore'):
    ...
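
For example, a minimal sketch:

    import numpy as np

    num = np.array([1.0, 2.0])
    denom = np.array([0.0, 4.0])

    with np.errstate(divide='ignore'):
        frac = num / denom   # no warning for the division by zero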





Re: [Numpy-discussion] Scipy 0.15.0 beta 1 release

2014-11-26 Thread Pauli Virtanen
Julian Taylor jtaylor.debian at googlemail.com writes:
[clip]
 There haven't been any real complaints from applications yet, only
 testsuite failure of scipy.
 Either the one thing that is broken in scipy isn't much used or windows
 32 users aren't using 1.9 yet.

What is broken is calculating eigenvalues of complex-valued sparse
matrices and iterative solution of complex-valued linear equations. 
I.e., nothing obscure.

A likely explanation is that win32 + Numpy 1.9 is a less common platform,
and users whose code started failing just infrequently do not report bugs
as easily...

 The majority of f2py should still be working, numpys own f2py testsuite
 passes on win32. 

Perhaps the arrays are aligned by chance? I don't think the test suite
repeats the complex valued intent(inout) parameter test many times.

[clip]
 I still don't know what exactly arpack is doing
 different but I also did not have time yet to look at the testcase david
 created.

David's test case is this:

import numpy as np

n = 4
x = np.zeros(n * 3, dtype='D')  # 'D' is the dtype character code for complex128
_dummy.zfoo(x, n)

where the argument is declared as double complex, dimension(3*n),
intent(inout) in f2py. The ARPACK stuff in Scipy also does pretty much
just this.

-- 
Pauli Virtanen



Re: [Numpy-discussion] [SciPy-Dev] ANN: Scipy 0.15.0 beta 1 release

2014-11-25 Thread Pauli Virtanen
25.11.2014, 21:14, Nathaniel Smith kirjoitti:
[clip]
 (I guess scipy could create an overallocated copy and then take a
 slice at the right offset, but asking scipy to use such hacks to work
 around our bugs is clearly wrong.)

Note that the issue is not just with Scipy, but with *all* f2py code
that is out there.

Everyone who uses double complex, intent(inout) in their f2py wrapped
code will start getting random exceptions on Windows. Users of double
complex, intent(in) pay a performance penalty.

-- 
Pauli Virtanen



[Numpy-discussion] ANN: Scipy 0.15.0 beta 1 release

2014-11-23 Thread Pauli Virtanen

Dear all,

We have finally finished preparing the Scipy 0.15.0 beta 1 release.
Please try it and report any issues on the scipy-dev mailing list,
and/or on Github.

If no surprises turn up, the final release is planned on Dec 20 in
three weeks.

Source tarballs and full release notes are available at
https://sourceforge.net/projects/scipy/files/SciPy/0.15.0b1/
Binary installers should also be up soon.

Best regards,
Pauli Virtanen


----------

SciPy 0.15.0 is the culmination of 6 months of hard work. It contains
several new features, numerous bug-fixes, improved test coverage and
better documentation.  There have been a number of deprecations and
API changes in this release, which are documented below.  All users
are encouraged to upgrade to this release, as there are a large number
of bug-fixes and optimizations.  Moreover, our development attention
will now shift to bug-fix releases on the 0.15.x branch, and on adding
new features on the master branch.

This release requires Python 2.6, 2.7 or 3.2-3.3 and NumPy 1.5.1 or
greater.


New features
============

Linear Programming Interface
----------------------------

The new function ``scipy.optimize.linprog`` provides a generic
linear programming interface similar to the way ``scipy.optimize.minimize``
provides a generic interface to nonlinear programming optimizers.
Currently the only method supported is *simplex*, which provides
a two-phase, dense-matrix-based simplex algorithm. Callback
functions are supported, allowing the user to monitor the progress
of the algorithm.
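
A small sketch of the interface (problem data made up):

    from scipy.optimize import linprog

    # maximize x0 + 2*x1 subject to x0 + x1 <= 4 and x0, x1 >= 0,
    # phrased as minimization of the negated objective
    res = linprog(c=[-1.0, -2.0], A_ub=[[1.0, 1.0]], b_ub=[4.0],
                  method='simplex')
    print(res.x, res.fun)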

Differential_evolution, a global optimizer
------------------------------------------

A new ``differential_evolution`` function is available in the
``scipy.optimize`` module.  Differential Evolution is an algorithm used
for finding the global minimum of multivariate functions. It is
stochastic in nature (does not use gradient methods), and can search
large areas of candidate space, but often requires larger numbers of
function evaluations than conventional gradient-based techniques.
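
For instance, minimizing the classic Rosenbrock test function:

    from scipy.optimize import differential_evolution

    def rosenbrock(x):
        return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

    result = differential_evolution(rosenbrock, bounds=[(-5, 5), (-5, 5)])
    print(result.x, result.fun)   # should be close to [1, 1] and 0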

``scipy.signal`` improvements
-----------------------------

The function ``max_len_seq`` was added, which computes a Maximum
Length Sequence (MLS) signal.
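
For example:

    from scipy.signal import max_len_seq

    seq, state = max_len_seq(4)   # a 2**4 - 1 = 15 sample MLS of 0s and 1s
    print(seq)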

``scipy.integrate`` improvements
--------------------------------

It is now possible to use ``scipy.integrate`` routines to integrate
multivariate ctypes functions, thus avoiding callbacks to Python and
providing better performance.
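
A rough sketch of the idea, assuming a Linux system where the C math
library can be loaded as 'libm.so.6' (the library name is
platform-specific):

    import ctypes
    from scipy import integrate

    libm = ctypes.CDLL('libm.so.6')          # platform-specific assumption
    libm.sin.restype = ctypes.c_double
    libm.sin.argtypes = (ctypes.c_double,)

    result, error = integrate.quad(libm.sin, 0.0, 2.0)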

``scipy.linalg`` improvements
-----------------------------

Add function ``orthogonal_procrustes`` for solving the procrustes
linear algebra problem.
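
For example (a sketch; the rotation to be recovered is random):

    import numpy as np
    from scipy.linalg import orthogonal_procrustes

    A = np.random.rand(5, 3)
    Q, _ = np.linalg.qr(np.random.rand(3, 3))   # a random orthogonal matrix
    R, scale = orthogonal_procrustes(A, A.dot(Q))
    # R approximately recovers Q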

``scipy.sparse`` improvements
-----------------------------

``scipy.sparse.linalg.svds`` can now take a ``LinearOperator`` as its
main input.
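
A minimal sketch, wrapping a dense array in a ``LinearOperator`` just
for illustration:

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, svds

    A = np.random.rand(20, 10)
    op = LinearOperator(A.shape, matvec=lambda v: A.dot(v),
                        rmatvec=lambda v: A.T.dot(v), dtype=A.dtype)
    u, s, vt = svds(op, k=3)   # the three largest singular triplets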

``scipy.special`` improvements
------------------------------

Values of ellipsoidal harmonic (i.e. Lame) functions and associated
normalization constants can be now computed using ``ellip_harm``,
``ellip_harm_2``, and ``ellip_normal``.

New convenience functions ``entr``, ``rel_entr``, ``kl_div``,
``huber``, and ``pseudo_huber`` were added.

``scipy.sparse.csgraph`` improvements
-------------------------------------

Routines ``reverse_cuthill_mckee`` and ``maximum_bipartite_matching``
for computing reorderings of sparse graphs were added.

``scipy.stats`` improvements
----------------------------

Added a Dirichlet multivariate distribution, ``scipy.stats.dirichlet``.

The new function ``scipy.stats.median_test`` computes Mood's median test.

The new function ``scipy.stats.combine_pvalues`` implements Fisher's
and Stouffer's methods for combining p-values.

``scipy.stats.describe`` returns a namedtuple rather than a tuple,
allowing users to access results by index or by name.

Deprecated features
===================

The ``scipy.weave`` module is deprecated.  It was the only module never
ported to Python 3.x, and it is not recommended for new code - use Cython
instead.  In order to support existing code, ``scipy.weave`` has been
packaged separately: https://github.com/scipy/weave.  It is a pure Python
package, and can easily be installed with ``pip install weave``.

``scipy.special.bessel_diff_formula`` is deprecated.  It is a private
function, and therefore will be removed from the public API in a
following release.


Backwards incompatible changes
==============================

scipy.ndimage
-------------

The functions ``scipy.ndimage.minimum_positions``,
``scipy.ndimage.maximum_positions`` and ``scipy.ndimage.extrema`` return
positions as ints instead of floats.

scipy.integrate
---------------

The format of banded Jacobians in ``scipy.integrate.ode`` solvers is
changed. Note that the previous documentation of this feature was
erroneous.



Re: [Numpy-discussion] Numpy 1.9.1, zeros and alignement

2014-11-18 Thread Pauli Virtanen
18.11.2014, 21:44, David Cournapeau kirjoitti:
 On Tue, Nov 18, 2014 at 7:05 PM, Julian Taylor 
 jtaylor.deb...@googlemail.com wrote:
 
 32 bit windows should not provide 16 byte alignment, at least it doesn't
 for me. That is typically a property of 64 bit OS.

 But that does not explain why normal double is not aligned for you, that
 only needs 4 bytes on i386 which even 32 bit OS should provide.
 
 Sorry for the confusion, doubles are aligned, only complex128 are not. But
 I see that on linux 32 bits, this is the same as on windows (zeros output
 not always aligned on D dtype), and yet I don't see the issues with f2py
 not being able to

The scipy ticket is here, btw:
https://github.com/scipy/scipy/issues/4168

The second question is whether f2py actually *needs* to check the
dtype-size alignment, or whether something like sizeof(double) is enough
for Fortran compilers. Fortran compilers however apparently do generate
code that crashes and burns if there's no alignment also on x86:
https://github.com/scipy/scipy/pull/2698  All this is probably
unspecified, as it's mostly just F77 code out there.

Apparently, everything has worked OK with the old Numpy behavior, or at
least, nobody managed to pinpoint a crash on Win32 because of this? Can
the F2py alignment checks be relaxed? Maybe it is enough to assume the
Fortran compiler is happy with whatever alignment the system malloc()
assures?

If not, the second option is a bit nasty, since I'd believe many people
have f2py code out there with complex inout arrays, and I think no-one
uses special aligned allocators...

-- 
Pauli Virtanen



Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Pauli Virtanen
12.10.2014, 20:19, Mads Ipsen kirjoitti:
 Is there any way for me to detect (on the Python side) that transpose() 
 has been invoked on the matrix, and thereby only do the copy operation 
 when it really is needed? 

The correct way to do this is to either:

In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
it is not. In addition, on the Python side, check for
`a.flags.c_contiguous` and make a copy if it is not.

OR

In your C code, get a handle to the array using PyArray_FromAny (or the
PyArray_FROM_OTF macro) with the NPY_ARRAY_C_CONTIGUOUS requirement set,
so that it makes a copy when necessary.
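
On the Python side, the first variant could look roughly like this
(a sketch; foo_c stands in for the actual C-implemented function):

    import numpy as np

    def foo(a):
        a = np.asarray(a)
        if not a.flags.c_contiguous:
            a = np.ascontiguousarray(a)   # copies only when needed
        return foo_c(a)                   # hypothetical C entry point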




Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Pauli Virtanen
12.10.2014, 22:16, Eric Firing kirjoitti:
 On 2014/10/12, 8:29 AM, Pauli Virtanen wrote:
 12.10.2014, 20:19, Mads Ipsen kirjoitti:
 Is there any way for me to detect (on the Python side) that transpose()
 has been invoked on the matrix, and thereby only do the copy operation
 when it really is needed?

 The correct way to do this is to, either:

 In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
 it is not. In addition, on the Python side, check for
 `a.flags.c_contiguous` and make a copy if it is not.

 OR

 In your C code, get a handle to the array using PyArray_FromAny (or the
 PyArray_FROM_OTF macro) with the NPY_ARRAY_C_CONTIGUOUS requirement set,
 so that it makes a copy when necessary.
 
 or let numpy handle it on the python side:
 
 foo(numpy.ascontiguousarray(a))

Yes, but the C code really should check that the input array is
C-contiguous, if it only works for C-contiguous inputs.

-- 
Pauli Virtanen



Re: [Numpy-discussion] @ operator

2014-09-10 Thread Pauli Virtanen
09.09.2014, 22:52, Charles R Harris kirjoitti:
   1. Should the operator accept array_like for one of the arguments?
   2. Does it need to handle __numpy_ufunc__, or will
   __array_priority__ serve?

I think the __matmul__ operator implementation should follow that of
__mul__.

[clip]
3. Do we want PyArray_Matmul in the numpy API?
4. Should a matmul function be supplied by the multiarray module?
 
 If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
 will __array_priority__ serve?

dot() function deals with __numpy_ufunc__, and the matmul() function
should behave similarly.

It seems dot() uses __array_priority__ for selection of the output
return subclass, so matmul() probably needs to do the same thing.

 Note that the type number operators, __add__ and such, currently use
 __numpy_ufunc__ in combination with __array_priority__, this in addition to
 the fact that they are by default using ufuncs that do the same. I'd rather
 that the __*__ operators simply rely on __array_priority__.

The whole business of __array_priority__ and __numpy_ufunc__ in the
binary ops is solely about when __op__ should yield the execution to
__rop__ of the other object.

The rule of operation currently is: __rmul__ before __numpy_ufunc__

If you remove the __numpy_ufunc__ handling, it becomes: __numpy_ufunc__
before __rmul__, except if __array_priority__ happens to be smaller than
that of the other class and your class is not an ndarray subclass.

The following binops also do not IIRC respect __array_priority__ in
preferring right-hand operand:

- in-place operations
- comparisons

One question here is whether it's possible to change the behavior of
__array_priority__ here at all, or whether changes are possible only in
the context of adding new attributes telling Numpy what to do.

-- 
Pauli Virtanen



Re: [Numpy-discussion] __numpy_ufunc__ and 1.9 release

2014-07-23 Thread Pauli Virtanen
23.07.2014, 20:37, Julian Taylor kirjoitti:
[clip: __numpy_ufunc__]
 So it's been a week and we got a few answers and new issues. To
 summarize:
 - to my knowledge no progress was made on the issues
 - scipy already has a released version using the current implementation
 - no very loud objections to delaying the feature to 1.10
 - I am still unfamiliar with the problematics of subclassing, but don't
   want to release something new which has unsolved issues.
 
 That scipy already uses it in a released version (0.14) is very 
 problematic. Can maybe someone give some insight if the potential 
 changes to resolve the remaining issues would break scipy?
 
 If so we have following choices:
 
 - declare what we have as final and close the remaining issues as
 'won't fix'. Any changes would have to have a new name
 __numpy_ufunc2__ or a somehow versioned the interface - delay the
 introduction, potentially breaking scipy 0.14 when numpy 1.10 is
 released.
 
 I would like to get the next (and last) numpy 1.9 beta out soon, so
 I would propose to make a decision until this Saturday the
 26.02.2014 however misinformed it may be.

It seems fairly unlikely to me that the `__numpy_ufunc__` interface
itself requires any changes. I believe the definition of the interface
is quite safe to consider as fixed --- it is a fairly straightforward
hook for Numpy ufuncs. (There are also no essential changes in it
since last year.)

For the binary operator overriding, Scipy sets the constraint that

ndarray * spmatrix

MUST call spmatrix.__rmul__ even if spmatrix.__numpy_ufunc__ is
defined. spmatrices are not ndarray subclasses, and various
subclassing problems do not enter here.

Note that this binop discussion is somewhat separate from the
__numpy_ufunc__ interface itself. The only information available about
it at the binop stage is `hasattr(other, '__numpy_ufunc__')`.

   ***

Regarding the blockers:

(1) https://github.com/numpy/numpy/issues/4753

This is a bug in the argument normalization --- output arguments are
not checked for the presence of __numpy_ufunc__ if they are passed
as keyword arguments (as a positional argument it works). It's a bug
in the implementation, but I don't think it is really a blocker.

Scipy sparse matrices will in practice seldom be used as output args
for ufuncs.

   ***

(2) https://github.com/numpy/numpy/pull/4815

The open question concerns the semantics of `__numpy_ufunc__` versus
Python operator overrides. When should ndarray.__mul__(other) return
NotImplemented?

Scipy sparse matrices are not subclasses of ndarray, so the code in
question in Numpy gets to run only for

ndarray * spmatrix

This provides a constraint to what solution we can choose in Numpy to
deal with the issue:

ndarray.__mul__(spmatrix)  MUST  continue to return NotImplemented

This is the current behavior, and cannot be changed: it is not
possible to defer this to __numpy_ufunc__(ufunc=np.multiply), because
sparse matrices define `*` as the matrix multiply, and not the
elementwise multiply. (This settles one line of discussion in the
issues --- ndarray should defer.)

How Numpy currently determines whether to return NotImplemented in
this case or to call np.multiply(self, other) is by comparing
`__array_priority__` attributes of `self` and `other`. Scipy sparse
matrices define an `__array_priority__` larger than ndarrays, which
then makes a NotImplemented be returned.

The idea in the __numpy_ufunc__ NEP was to replace this with
`hasattr(other, '__numpy_ufunc__') and hasattr(other, '__rmul__')`.
However, when both self and other are ndarray subclasses in a certain
configuration, both end up returning NotImplemented, and Python raises
TypeError.
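
In pseudo-Python, the NEP rule sketched above would read roughly as
follows (a simplified sketch, not the actual C implementation):

    import numpy as np

    def ndarray_mul(self, other):    # stand-in for ndarray.__mul__
        if (not isinstance(other, np.ndarray)
                and hasattr(other, '__numpy_ufunc__')
                and hasattr(other, '__rmul__')):
            return NotImplemented    # defer to other.__rmul__
        return np.multiply(self, other)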

The `__array_priority__` mechanism is also broken in some of the
subclassing cases: https://github.com/numpy/numpy/issues/4766

As far as I see, the backward compatibility requirement from Scipy
only rules out the option that ndarray.__mul__(other) should
unconditionally call `np.multiply(self, other)`.

We have some freedom how to solve the binop vs. subclass issues. It's
possible to e.g. retain the __array_priority__ stuff as a backward
compatibility measure as we do currently.

-- 
Pauli Virtanen




Re: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes

2014-07-19 Thread Pauli Virtanen
19.07.2014 11:04, Ralf Gommers kirjoitti:
[clip]
   1. bugfix PR sent to master by contributor
   2. maintainer decides it's backportable, so after review he doesn't merge
 PR but rebases it and sends a second PR. First one, with review content, is
 closed not merged.
   3. merge PR into maintenance branch.
   4. send third PR to merge back or forward port the fix to master, and
 merge that.
 (or some variation with merge bases which is even more involved)

The maintainer can just rebase on merge base, and then merge and push it
via git as usual, without having to deal with Github. If the pull
request happens to be already based on an OK merge base, it can be
merged via Github directly to master.

The only thing maintainer gains from sending additional pull request via
Github is that the code gets run by Travis-CI. However, the tests will
also run automatically after pushing the merge commits, so test failures
can be caught (although after the fact). This is also the case for
directly pushed cherry-picked commits.

-- 
Pauli Virtanen



Re: [Numpy-discussion] String type again.

2014-07-18 Thread Pauli Virtanen
18.07.2014 19:33, Chris Barker kirjoitti:
 On Fri, Jul 18, 2014 at 9:07 AM, Pauli Virtanen p...@iki.fi
 wrote:
 
 Another approach would be to add a new 1-byte unicode
 
 you can't do unicode in 1-byte -- so what does this mean, exactly?

The first 256 unicode code points, which happen to coincide with latin1.

 This also is not perfect, since array(['foo']) on Py2 should for 
 backward compatibility continue returning dtype='S'.
 
 yup. but we may be OK -- as bytes in py2 is the same as string
 anyway. But what do we do with null bytes? when going from 'S' to
 py2 string?

Changing the null chopping and preserving backward compat would
require yet another new dtype. This would then mean that the 'S' dtype
would become pretty much deprecated on Py3.
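
The null chopping in question (a small illustration):

    import numpy as np

    a = np.array([b'foo\x00'], dtype='S4')
    print(a[0])   # b'foo' -- trailing null bytes are chopped on output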

Forcing everyone to re-do their Python 3 ports would be somewhat
cleaner. However, this train may have left a couple of years ago.

-- 
Pauli Virtanen


Re: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes

2014-07-18 Thread Pauli Virtanen
18.07.2014 19:35, Julian Taylor kirjoitti:
 On Fri, Jul 18, 2014 at 6:23 PM, Nathaniel Smith n...@pobox.com
 wrote:
 On 18 Jul 2014 15:36, Julian Taylor
 jtaylor.deb...@googlemail.com wrote:
 
 git rebase --onto $(git merge-base master maintenance/1.9.x)
 HEAD^
 
 As a potential refinement, this might be simpler if we define a
 branch that points to this commit.
 
 
 we could do that, though the merge base changes to the last commit 
 that was merged in that way. The old merge base is still valid but 
 much older. I applied this method to some of my bugfixes so the 
 current merge base of master and 1.9 is a commit from yesterday
 not anymore the diverging point of master and 1.9. But I don't know
 if the newer merge base makes any difference to git.

Will the merge base actually ever change if you don't merge the
branches to each other?

***

The other well-known alternative to bugfixes is to first commit it in
the earliest maintenance branch where you want to have it, and then
merge that branch forward to the newer maintenance branches, and
finally into master.

Pauli



Re: [Numpy-discussion] `allclose` vs `assert_allclose`

2014-07-18 Thread Pauli Virtanen
18.07.2014 21:03, josef.p...@gmail.com kirjoitti:
[clip]
 Of course you can change it.
 
 But the testing functions are code and very popular code.
 
 And if you break backwards compatibility, then I wouldn't mind reviewing a
 pull request for statsmodels that adds 300 to 400 `atol=0` to the unit
 tests. :)

10c:

Scipy has 960 of those, and atol ~ 0 is required in some cases
(difficult to say in how big a percentage without a review). The default
of atol=1e-8 is pretty large.

There's ~60 instances of allclose(), most of which are in tests. About
half of those don't have atol=, whereas most have rtol.

Using allclose in non-test code without specifying both tolerances
explicitly is IMHO a sign of sloppiness, as the default tolerances are
both pretty big (and atol != 0 is not scale-free).

***

Consistency would be nice, especially in not having traps like

    assert_allclose(a, b, eps)
    vs.
    assert_(not np.allclose(a, b, eps))
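
For instance, with the current defaults (rtol=1e-7, atol=0 for
assert_allclose; rtol=1e-5, atol=1e-8 for allclose):

    import numpy as np
    from numpy.testing import assert_allclose

    a, b = 0.0, 1e-9
    print(np.allclose(a, b))   # True: atol=1e-8 absorbs the difference
    assert_allclose(a, b)      # raises: atol=0, and rtol is no help at zero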

Bumping the tolerances in assert_allclose() up to match allclose() will
probably not break code, but it can render some tests ineffective.

If the change is made, it needs to be noted in the release notes. I
think the number of project authors who relied on that the default was
atol=0 is not so big.

(In other news, we should discourage use of assert_almost_equal, by
telling people to use assert_allclose instead in the docstring at the
least. It has only atol= and it specifies it in a very cumbersome log10
basis...)

-- 
Pauli Virtanen


Re: [Numpy-discussion] `allclose` vs `assert_allclose`

2014-07-18 Thread Pauli Virtanen
18.07.2014 22:13, Chris Barker kirjoitti:
[clip]
 but an appropriate rtol would work there too. If only zero testing is
 needed, then atol=0 makes sense as a default. (or maybe atol=eps)

There's plenty of room below eps, but finfo(float).tiny ~ 2.2e-308 (or
some big multiple of it) is also reasonable in the scale-freeness sense.




Re: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes

2014-07-18 Thread Pauli Virtanen
18.07.2014 23:53, Julian Taylor kirjoitti:
 On 18.07.2014 19:47, Pauli Virtanen wrote:
[clip]
  The other well-known alternative to bugfixes is to first commit it in
  the earliest maintenance branch where you want to have it, and then
  merge that branch forward to the newer maintenance branches, and
  finally into master.
 
 wouldn't that still require basing bugfixes onto the point before the
 master and maintenance branch diverged?
 otherwise a merge from maintenance to master would include the commits
 that are only part of the maintenance branch (release commits,
 regression fixes etc.)

If I understand correctly, the idea is to manually revert the changes
that don't belong in, which needs to be only done once for each, as the
merge logic should deal with it in all subsequent merges.

I think there are in practice not so many commits that you want to have
only in the release branch. Version number bumping is one (and easily
addressed by a follow-up commit in master that bumps it again) --- what
else?

The bugfix-in-release-and-forward-port-to-master seems to be the
recommended practice for Mercurial:

http://mercurial.selenic.com/wiki/StandardBranching

https://docs.python.org/devguide/committing.html

I think there are also git guides that recommend using it.

The option of basing commits on last merge base is probably not really
feasible with Mercurial (I haven't seen git guides that propose it either).

 basing bugfixes on maintenance does allow cherry picking into master as
 you don't care too much about backward mergeability here, but you still
 lose a good git log and git branch --contains to check which bugfix is
 in which branch.

I don't disagree with this. Cherry picking is OK, but only as long as
the number of commits is not too large and you use a tool (e.g. my
git-cherry-tree) that tries to check which patches are in and which not.

Pauli




Re: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes

2014-07-18 Thread Pauli Virtanen
19.07.2014 01:49, Nathaniel Smith kirjoitti:
 On Fri, Jul 18, 2014 at 11:44 PM, Pauli Virtanen p...@iki.fi wrote:
 18.07.2014 23:53, Julian Taylor kirjoitti:
 On 18.07.2014 19:47, Pauli Virtanen wrote:
 [clip]
 The other well-known alternative to bugfixes is to first commit it in
 the earliest maintenance branch where you want to have it, and then
 merge that branch forward to the newer maintenance branches, and
 finally into master.

 wouldn't that still require basing bugfixes onto the point before the
 master and maintenance branch diverged?
 otherwise a merge from maintenance to master would include the commits
 that are only part of the maintenance branch (release commits,
 regression fixes etc.)

 If I understand correctly, the idea is to manually revert the changes
 that don't belong in, which needs to be only done once for each, as the
 merge logic should deal with it in all subsequent merges.

 I think there are in practice not so many commits that you want to have
 only in the release branch. Version number bumping is one (and easily
 addressed by a follow-up commit in master that bumps it again) --- what
 else?
 
 Presumably all the commits that we miss on the first pass and end up
 backporting the hard way later :-)

If those are just cherry-picked, they will generate merge conflicts the
next time things are merged back (or, the merge will be smart enough to
note the patch was already applied some time ago). This is then probably
not really a big problem.

Pauli




Re: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes

2014-07-18 Thread Pauli Virtanen
19.07.2014 02:10, Pauli Virtanen kirjoitti:
 19.07.2014 01:49, Nathaniel Smith kirjoitti:
 On Fri, Jul 18, 2014 at 11:44 PM, Pauli Virtanen p...@iki.fi wrote:
[clip]
 Presumably all the commits that we miss on the first pass and end up
 backporting the hard way later :-)
 
 If those are just cherry-picked, they will generate merge conflicts the
 next time things are merged back (or, the merge will be smart enough to
 note the patch was already applied some time ago). This is then probably
 not really a big problem.

NB. this is a bit playing devil's advocate --- I'm not advocating
porting bugfixes from merge branches, as using the merge base should
also work fine.




Re: [Numpy-discussion] __numpy_ufunc__ and 1.9 release

2014-07-17 Thread Pauli Virtanen
Hi,

15.07.2014 21:06, Julian Taylor kirjoitti:
[clip: __numpy_ufunc__]
 So I'm wondering if we should delay the introduction of this
 feature to 1.10 or is it important enough to wait until there is a
 consensus on the remaining issues?

My 10c:

The feature is not so much in hurry that it alone should delay 1.9.
Moreover, it's best for everyone that it is bug-free on the first go,
and it gets some real-world testing before the release. Better safe than
sorry.

I'd pull it out from 1.9.x branch, and iron out the remaining wrinkles
before 1.10.

Pauli



Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

2014-07-08 Thread Pauli Virtanen
07.07.2014 21:32, Chris Barker - NOAA Federal kirjoitti:
 If you are going to introduce this functionality, please don't call it
 np.arr.

It might be appropriate for pirate versions of Numpy.

***

Seriously though, having a variant of `mat` that returns arrays could be
useful, so weak +0. Preferably, the name should be quite short to type.

On the other hand, unlike r_ and c_, I haven't seen or used mat() in
real code.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Slightly off-topic - accuracy of C exp function?

2014-04-27 Thread Pauli Virtanen
Hi,

27.04.2014 01:37, Matthew Brett kirjoitti:
[clip]
 Take-home : exp implementation for mingw-w64 is exactly (floating
 point) correct 82% of the time, and one unit-at-the-last-place off for
 the rest [1].  OSX is off by 1 ULP only 0.2% of the time.
 
 Is mingw-w64 accurate enough?  Do we have any policy on this?

I think, as mentioned, the C standards don't specify accuracy
requirements. Errors of a couple of ULP should still be acceptable.

Re: powell test --- if this turns out to be complicated to deal with,
just go ahead and disable the trace test.

Optimization routines contain statements of the form `if a < b: ...`
with floating point numbers, so that the execution path can be sensitive
to rounding error if you're unlucky, and the chances go up as the
iteration count increases.

-- 
Pauli Virtanen



Re: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing

2014-04-27 Thread Pauli Virtanen
Hi,

25.04.2014 00:56, Matthew Brett kirjoitti:
 Thanks to Carl Kleffner's toolchain and some help from Clint Whaley
 (main author of ATLAS), I've built 64-bit windows numpy and scipy
 wheels for testing.

Where can I get your

numpy.patch
scipy.patch

and what's in them?

Cheers,
Pauli




Re: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing

2014-04-26 Thread Pauli Virtanen
25.04.2014 08:57, Sturla Molden kirjoitti:
[clip]
 On the positive side: Does this mean we finally can use gfortran on
 Windows? And if so, can we use Fortran versions beyond Fortran 77 in SciPy
 now? Or is Mac OS X a blocker?

Yes, Windows is the only platform on which Fortran was problematic. OSX
is somewhat saner in this respect.

-- 
Pauli Virtanen




Re: [Numpy-discussion] GSOC

2014-02-23 Thread Pauli Virtanen
23.02.2014 11:30, Ralf Gommers kirjoitti:
[clip]
 1. fix up ideas page with scipy/numpy descriptions, idea difficulty levels
 and preferably some more ideas.

Here's a start:

https://github.com/scipy/scipy/wiki/GSoC-project-ideas





Re: [Numpy-discussion] How exactly ought 'dot' to work?

2014-02-22 Thread Pauli Virtanen
23.02.2014 00:03, Nathaniel Smith kirjoitti:
 Currently numpy's 'dot' acts a bit weird for ndim > 2 or ndim < 1. In
 practice this doesn't usually matter much, because these are very
 rarely used. But, I would like to nail down the behaviour so we can
 say something precise in the matrix multiplication PEP. 

I'm not sure it's necessary to say much about this in the PEP. It should
in my view concentrate on arguing why the new binop is needed in the
Python language, and for that, restricting to 2D is good enough IMHO.

How exactly Numpy makes use of the capability for > 2-dim arrays is
something that should definitely be discussed.

But I think this is a problem mainly interesting for Numpy devs, and not
for CPython devs.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Automatic issue triage.

2014-02-21 Thread Pauli Virtanen
Charles R Harris charlesr.harris at gmail.com writes:
 After 6 days of trudging through the numpy issues and
 finally passing the half way point, I'm wondering if we
 can set up so that new defects get a small test that can
 be parsed out and run periodically to mark issues that might
 be fixed. I expect it can be done, but might be more trouble
 than it is worth to keep working.

Github has an API for accessing issue contents.

curl -i "https://api.github.com/repos/numpy/numpy/issues?state=open"

If some markup for test cases is devised, a tool can be written
that detects them.
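
A sketch of such a tool, using only the standard library (the
'test-case' marker is hypothetical; the real markup would need to be
agreed upon first):

    import json
    from urllib.request import urlopen

    url = 'https://api.github.com/repos/numpy/numpy/issues?state=open'
    issues = json.loads(urlopen(url).read().decode('utf-8'))

    for issue in issues:
        body = issue.get('body') or ''
        if '<!-- test-case' in body:     # hypothetical marker
            print(issue['number'], issue['title'])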

Alternatively, one could just add a separate git repository 
numpy/bugs.git for bug test cases, containing e.g. files
`gh-1234.py`. Such scripts need to be written anyway at some
point (or copypasted to Python shell). It would also be better
from security POV to use a separate repo for bug test cases.

This would also solve the issue of how to add attachments
to bug reports in one way.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Automatic issue triage.

2014-02-21 Thread Pauli Virtanen
Robert Kern robert.kern at gmail.com writes:
[clip]
 Seems like more trouble than it's worth to automate. We don't want
 just anyone with a Github account to add arbitrary code to our test
 suites, do we? The idea of an expected failure test suite is a good
 one, but it seems to me that it could be maintained by normal PR
 processes just fine.

Yes. However, using a separate repository might make this easier to
deal with. This also does not have the running-arbitrary-code problem.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Suggestions for GSoC Projects

2014-02-16 Thread Pauli Virtanen
16.02.2014 23:34, Jennifer stone kirjoitti:
[clip]
 Yeah, many of the known failures seem to revolve around hyp2f1. An 
 unexplained inclination towards hypergeometric functions really
 tempts me to plunge into this. If it's too risky, I can work on
 this after the summers, as I would have gained quite a lot of
 experience with the code here.

If you are interested in the hypergeometric numerical evaluation, it's
probably a good idea to take a look at this recent master's thesis
written on the problem:

http://people.maths.ox.ac.uk/porterm/research/pearson_final.pdf

This may give some systematic overview on the range of methods
available. (Note that for copyright reasons, it's not a good idea to
look closely at the source codes linked from that thesis, as they are
not available under a compatible license.)

It may well be that the best approach for evaluating these functions,
if accuracy in the whole parameter range is wanted, in the end turns
out to require arbitrary-precision computations.  In that case, it
would be a very good idea to look at how the problem is approached in
mpmath. There are existing multiprecision packages written in C, and
using one of them in scipy.special could bring better evaluation
performance even if the algorithm is the same.
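
For example, reference values can be generated along these lines:

    import mpmath

    mpmath.mp.dps = 50                         # 50 significant digits
    print(mpmath.hyp2f1('3/2', '5/2', '7/2', 0.9))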

-- 
Pauli Virtanen



Re: [Numpy-discussion] GSOC

2014-02-13 Thread Pauli Virtanen

13.02.2014 20:59, josef.p...@gmail.com kirjoitti:
[clip]
 I assume numpy/scipy will participate under the PSF umbrella. So
 this deadline is for the PSF. However, Terri, the organizer for the
 PSF, asked for links to Ideas pages to be able to show Google what
 interesting projects the PSF has.

Here's a shot at that (stolen from roadmap etc):

https://github.com/scipy/scipy/wiki/GSoC-project-ideas

Please update as you see fit.

Did we count the number of prospective mentors >= 3? Scipy is not yet
listed on the PSF GSoC 2014 project list, so I think if we are going
to participate, we should let them know.

Best,
Pauli



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen
Alan G Isaac alan.isaac at gmail.com writes:
[clip]
 Here, quacking is behaving like an ndarray (in your view,
 as I understand it) when asked.  But how do we ask?
 Your view (if I understand) is we ask via the operations
 supported by ndarrays.  But maybe that is the wrong way
 for the library to ask this question.

It is not a good thing that there is no well defined
domain specific language for matrix algebra in Python.

Rather, some code is written with one convention and other
code with a different convention. The conventions disagree
on how to express basic operations, such as matrix
multiplication.

Moreover, the ndarray is also lacking some useful things, as
you point out. But I think the right solution would be to stuff
the required additions into ndarray, rather than retaining the
otherwise incompatible np.matrix as a crutch.

 If so, then scipy libraries could ask an object
 to behave like an an ndarray by calling, e.g.,
 __asarray__ on it. It becomes the responsibility
 of the object to return something appropriate
 when __asarray__ is called. Objects that know how to do
 this will provide __asarray__ and respond
 appropriately.

Another way to achieve a similar thing as your suggestion is to add
a coercion function in the vein of scipy.sparse.linalg.aslinearoperator.
It could deal with known-failure cases (np.matrix, scipy.sparse matrices)
and for the rest just assume the object satisfies the ndarray API
and pass them through.
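
Such a helper might look roughly like this (a sketch; the name is made
up):

    import numpy as np
    import scipy.sparse

    def as_duck_array(a):
        if isinstance(a, np.matrix):
            return np.asarray(a)      # known-failure case: coerce
        if scipy.sparse.issparse(a):
            return a                  # known API: pass through
        return np.asanyarray(a)       # assume ndarray semantics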

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen
Sturla Molden sturla.molden at gmail.com writes:
 Pauli Virtanen pav at iki.fi wrote:
  It is not a good thing that there is no well defined
  domain specific language for matrix algebra in Python.
 
 Perhaps Python should get some new operators?

It might still be possible to advocate for this in core Python,
even though the ship has sailed long ago.

Some previous discussion:

[1] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html
[2] http://www.python.org/dev/peps/pep-0225/
[3] http://www.python.org/dev/peps/pep-0211/

(My own take would be that one extra operator is enough for most
purposes, and would be easier to push for.)

[clip]
 On the serious side, I don't think there really is a good solution to this
 problem at all.

This is true. However, I'd prefer to have one solution over several
conflicting ones.

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen
11.02.2014 21:20, alex kirjoitti:
[clip]
 In the spirit of offsetting this bias and because this thread is
 lacking in examples of projects that use numpy.matrix, here's another
 data point: cvxpy (https://github.com/cvxgrp/cvxpy) is a serious
 active project that supports the numpy.matrix interface, for example
 as in 
 https://github.com/cvxgrp/cvxpy/tree/master/cvxpy/interface/numpy_interface.

Here's some more data:

http://nullege.com/codes/search?cq=numpy.matrix

http://nullege.com/codes/search?cq=numpy.array

-- 
Pauli Virtanen



Re: [Numpy-discussion] Suggestions for GSoC Projects

2014-02-11 Thread Pauli Virtanen

Hi,

04.02.2014 20:30, jennifer stone kirjoitti:
 3. As stated earlier, we have spherical harmonic functions (with
 much scope
 for dev) we are yet to have elliptical and cylindrical harmonic
 function, which may be developed.
 
 This sounds very doable. How much work do you think would be
 involved?
 
 As Stefan so rightly pointed out, the spherical harmonic
 function, sph_harm, at present calls lpmn, thus evaluating
 all orders up to N. An initial glance at the code and the algorithm
 gives me a feeling that it would be very well possible to avoid
 that by maybe avoiding the dependence on lpmn.
 
 Further, we can introduce ellipsoidal harmonic functions of first
 kind and the second kind. I am confident about about the
 implementation of ellipsoidal H function of first kind but don't
 know much about the second kind. But I believe we can work it out
 in due course.And cylindrical harmonics can be carried out using
 Bessel functions.

It's not so often someone wants to work on scipy.special, so you'd be
welcome to improve it :)

The general structure of work on special functions goes as follows:

- Check if there is a license-compatible implementation that someone
  has already written. This is usually not the case.

- Find formulas for evaluating the function in terms of more primitive
  operations. (Ie. power series, asymptotic series, continued fractions,
  expansions in terms of other special functions, ...)

- Determine the parameter region where the expansions converge
  in a floating point implementation, and select algorithms
  appropriately.

  Here it helps if you find a research paper where the author has
  already thought about what sort of an approach works best.

- Life is usually made *much* easier thanks to Fredrik Johansson's
  prior work on arbitrary-precision arithmetic library mpmath

  http://code.google.com/p/mpmath/

  It can usually be used to check the true values of the functions.
  Also it contains implementations of algorithms for evaluating special
  functions, but because mpmath works with arbitrary precision numbers,
  these algorithms are not directly usable for floating-point
  calculations, as in floating point you cannot adjust the precision of
  the calculation dynamically.

  Moreover, the arbitrary-precision arithmetic can be slow compared
  to a more optimized floating point implementations.


Spherical harmonics might be a reasonable part of a GSoC proposal.
However, note that there exists also a *second* Legendre polynomial
function `lpmv`, which doesn't store the values of the previous N
functions. There's one numerical problem in the current way of
evaluation via ~Pmn(cos(theta)), which is that this approach seems to
lose relatively much precision at large orders for certain values of
theta. I don't recall now exactly how imprecise it becomes at large
orders, but it may be necessary to check.
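
For reference, the two evaluation routes mentioned here (arguments in
Scipy's order-then-degree convention):

    from scipy.special import sph_harm, lpmv

    y = sph_harm(2, 5, 0.3, 1.0)   # Y_5^2 at azimuth 0.3, colatitude 1.0
    p = lpmv(2, 5, 0.5)            # P_5^2(0.5), no table of lower orders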


Adding new special functions also sounds like an useful project. Here,
it helps if they are something that you expect you will need later on :)


There's also the case that several of the functions in Scipy have only
implementations for real-valued inputs, although the functions would
be defined on the whole complex plane. A list of the situation is here:

https://github.com/scipy/scipy/blob/master/scipy/special/generate_ufuncs.py#L85

Lowercase d correspond to real-valued implementations, uppercase D to
complex-valued. I'm not at the moment completely sure which would have
the highest priority --- whether you need this or not really depends
on the application.


If you want additional ideas about possible things to fix in
scipy.special, take a look at this file:

https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L648

The entries marked @knownfailure* have some undiagnosed issues in the
implementation, which might be useful to look into. However: most of
these have to do with corner cases in hypergeometric functions. Trying
to address those is likely a risky GSoC topic, as the multi-argument
hyp* functions are challenging to evaluate in floating point. (mpmath
and Mathematica can evaluate them in most parameter regimes, but AFAIK
both require arbitrary-precision methods for this.)


So I think there would be a large number of possible things to do
here, and help would be appreciated.

--
Pauli Virtanen



Re: [Numpy-discussion] Suggestions for GSoC Projects

2014-02-11 Thread Pauli Virtanen
Hi,

08.02.2014 06:16, Stéfan van der Walt kirjoitti:
 On 8 Feb 2014 04:51, Ralf Gommers ralf.gomm...@gmail.com wrote:

  Members of the dipy team would also be interested.

 That's specifically for the spherical harmonics topic right?
 
 Right. Spherical harmonics are used as bases in many of DiPy's
 reconstruction algorithms.

If help is needed with a GSoC project for scipy.special, I'm in
principle available to chip in co-mentoring, or just trying to help
answer questions.

Best,
Pauli



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
10.02.2014 22:23, Alan G Isaac kirjoitti:
[clip]
 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

What sparked this discussion (on Github) is that it is not possible to
write duck-typed code that works correctly for:

- ndarrays
- matrices
- scipy.sparse sparse matrixes

The semantics of all three are different; scipy.sparse is somewhere
between matrices and ndarrays with some things working randomly like
matrices and others not.

With some hyperbole added, one could say that from the developer point
of view, np.matrix is doing and has already done evil just by existing,
by messing up the unstated rules of ndarray semantics in Python.
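
A small demonstration of the mismatch:

    import numpy as np
    import scipy.sparse as sps

    a = np.array([[1, 2], [3, 4]])
    m = np.matrix(a)
    s = sps.csr_matrix(a)

    print((a * a)[0, 1])   # 4:  elementwise product
    print((m * m)[0, 1])   # 10: matrix product
    print((s * s)[0, 1])   # 10: matrix product again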

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
10.02.2014 23:13, Alan G Isaac kirjoitti:
 On 2/10/2014 4:03 PM, Pauli Virtanen wrote:
 What sparked this discussion (on Github) is that it is not
 possible to write duck-typed code that works correctly for:
 
 Do you mean one must start out with an 'asarray'? Or more than
 that?

Starting with asarray won't work: sparse matrices are not subclasses
of ndarray. Matrix-free linear operators are not such either.

In Python code, you usually very seldom coerce your inputs to a
specific type. The situation here is a bit as if there were two
different stream object types in Python, and their .write() methods
did completely different things, so that code doing I/O would need to
always be careful with which type of a stream was in question.

 As I detailed in past discussion, the one thing I really do not
 like about the `matrix` design is that indexing always returns a
 matrix. I speculate this is the primary problem you're running
 into?

The fact that reductions to 1D return 2D objects is a problem, and
the matrix multiplication vs. elementwise multiplication and division
distinction is also an issue.

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
10.02.2014 23:40, Alan G Isaac kirjoitti:
 On 2/10/2014 4:28 PM, Pauli Virtanen wrote:
 Starting with asarray won't work: sparse matrices are not
 subclasses of ndarray.
 
 I was focused on the `matrix` object. For this object, an initial
 asarray is all it takes to use array code. (Or ... not?)  And it is
 a view, not a copy.
 
 I don't have the background to know how scipy ended up with a
 sparse matrix object instead of a sparse array object. In any case,
 it seems like a different question.

I think this is a very relevant question, and I believe it is one of
the main motivations for the continuous reappearance of this discussion.
The existence of np.matrix messes up the general agreement on ndarray
semantics in Python. The meaning of very basic code such as

A * B
A.sum(0)
A[0]

where A and B are NxN matrices of some sort now depends on the types
of A and B. This makes writing duck typed code impossible when both
semantics are in play.

This is more of a community and ecosystem question rather than about
np.matrix and asarray().

I think the existence of np.matrix and its influence has set back the
development of a way to express generic linear algebra (dense, sparse,
matrix-free) algorithms in Python.

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
11.02.2014 00:17, Matthew Brett kirjoitti:
[clip]
 That is a very convincing argument.
 
 What would be the problems (apart from code compatibility) in making
 scipy.sparse use the ndarray semantics?

I'd estimate the effort it would take to convert scipy.sparse to ndarray
semantics is about a couple of afternoon hacks (normal, not
Ipython-size), so it should be doable.

Also, a shorthand for right-multiplication is probably necessary, as

A.T.dot(B.T).T

is unwieldy.

As far as backward compatibility goes: change from * to .dot would break
everyone's code. I suspect the rest of the changes have smaller impacts.

The code breakage is such that I don't think it can be easily done by
changing the behavior of csr_matrix. I've previously proposed adding
csr_array et al., and deprecating csr_matrix et al.. Not sure if the
*_matrix can ever be removed, but it would be useful to point new users
to use the interface with the ndarray convention.

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
11.02.2014 00:31, Alan G Isaac kirjoitti:
 On 2/10/2014 5:11 PM, Pauli Virtanen wrote:
 The existence of np.matrix messes up the general agreement on ndarray
 semantics in Python. The meaning of very basic code such as

  A * B
  A.sum(0)
  A[0]

 where A and B are NxN matrices of some sort now depends on the types
 of A and B. This makes writing duck typed code impossible when both
 semantics are in play.

 I'm just missing the point here; sorry.
 Why isn't the right approach to require that
 any object that wants to work with scipy
 can be called  by `asarray` to guarantee
 the core semantics? (And the matrix
 object passes this test.)  For some objects
 we can agree that `asarray` will coerce them.
 (E.g., lists.)
 
 I just do not see why scipy should care about
 the semantics an object uses for interacting
 with other objects of the same type.

I have a couple of points:

(A)

asarray() coerces the input to a dense array. This you do not want to do
to sparse matrices or matrix-free linear operators, as many linear
algebra algorithms don't need to know the matrix entries.

(B)

Coercing input types is something that is seldom done in Python code,
since it breaks duck typing.

Usually, the interface is specified by assumed semantics of the input
objects. The user is then free to pass in mock objects that fulfill the
necessary subsection of the assumed interface.

(C)

This is not only about Scipy, but also a language design question:

Suppose someone, who is not a Python expert, wants to implement a
linear algebra algorithm in Python.

Will they write it using matrix or ndarray? (Note: np.matrix is not
uncommon on stackoverflow.)

Will someone who reads the code easily understand what it does (does *
stand for elementwise or matrix product etc)?

Can they easily make it work both with sparse and dense matrices?
Matrix-free operators? Does it work both for ndarray and np.matrix inputs?

(D)

The presence of np.matrix invites people to write code using the
np.matrix semantics. This can further lead to the code spitting out
dense results as np.matrix, and then it becomes difficult to follow
what sort of an object you have.

(E)

Some examples of the above semantics diaspora on scipy.sparse:

* Implementation of GMRES et al in Scipy. The implementation reinvents
  yet another set of semantics that it uses internally.

* scipy.sparse has mostly matrix semantics, but not completely, and the
  return values vary between matrix and ndarray


-- 
Pauli Virtanen




Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen
11.02.2014 01:39, josef.p...@gmail.com kirjoitti:
[clip]
 Almost all the code in scipy.stats and statsmodels starts with np.asarray.
 The numpy doc standard has the term `array_like` to indicate things that
 can be converted to a usable object by np.asarray.
 
 ducktyping could be restricted to a very narrow category of ducks.

 What about masked arrays and structured dtypes?
 Because we cannot usefully convert them by asarray, we have to tell users
 that they don't work with a function.
 Our ducks that quack in the wrong way?

The issue here is semantics for basic linear algebra operations, such as
matrix multiplication, that work for different matrix objects, including
ndarrays.

What is there now in scipy.sparse is influenced by np.matrix, and this
is proving to be sub-optimal, as it is incompatible with ndarrays.

 How do you handle list and other array_likes in sparse?

if isinstance(t, (list, tuple)): asarray(...)

Sure, np.matrix can be dealt with as an input too.

But as said, I'm not arguing so much about asarray'in np.matrices as
input, but the fact that agreement on the meaning of * in linear
algebra code in Python is muddled. This should be fixed, and deprecating
np.matrix would point the way.

(I also suspect that this argument has been raised before, but as long
as there's no canonical write-up...)

-- 
Pauli Virtanen




Re: [Numpy-discussion] MKL and OpenBLAS

2014-01-26 Thread Pauli Virtanen
26.01.2014 14:44, Dinesh Vadhia kirjoitti:
 This conversation gets discussed often with Numpy developers but 
 since the requirement for optimized Blas is pretty common these 
 days, how about distributing Numpy with OpenBlas by default? People
 who don't want optimized BLAS or OpenBLAS can then edit the
 site.cfg file to add/remove.  I can never remember if Numpy comes
 with Atlas by default but either way, if using MKL is not feasible
 because of its licensing issues then Numpy has to be re-compiled
 with OpenBLAS (for example).  Why not make it easier for developers
 to use Numpy with an in-built optimized Blas.

The Numpy Windows binaries distributed in the numpy project at
sourceforge.net are compiled with ATLAS, which should count as an
optimized BLAS. I don't recall what the situation is with the OSX
binaries, but I'd believe they are built with ATLAS too.

If you are suggesting bundling OpenBLAS with Numpy source releases ---
arguments against:

OpenBLAS is big, and still rapidly moving. Moreover, bundling it with
Numpy does not really make it any easier to build.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-17 Thread Pauli Virtanen
Julian Taylor jtaylor.debian at googlemail.com writes:
[clip]
> - inconvenience in dealing with strings in python 3.
>
> bytes are not strings in python3 which means ascii data is either a byte
> array which can be inconvenient to deal with or 4 byte unicode which
> wastes space.
>
> A proposal to fix this would be to add a one or two byte dtype with a specific
> encoding that behaves similar to bytes but converts to string when outputting
> to python for comparisons etc.
>
> For backward compatibility we *cannot* change S. Maybe we could change
> the meaning of 'a' but it would be safer to add a new dtype, possibly
> 'S' can be deprecated in favor of 'B' when we have a specific encoding dtype.
>
> The main issue is probably: is it worth it and who does the work?

I don't think this is a good idea: the bytes vs. unicode separation in
Python 3 exists for a good reason. If unicode is not needed, why not just
use the bytes data type throughout the program?
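
For example, sticking to bytes consistently (Python 3; a quick
sketch):

    import numpy as np

    arr = np.array([b'foo', b'bar'], dtype='S3')
    arr == b'foo'   # bytes comparand works: array([ True, False])
    arr == 'foo'    # str comparand does not match ('S' is bytes);
                    # False or all-False, depending on numpy version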

(Also, assuming that ASCII is in general good for text-format data is
quite US-centric.)

Christopher Barker wrote:

> How do you spell the dtype that 'S' gives you?


'S' is bytes.

dtype='S', dtype=bytes, and dtype=np.bytes_ are all equivalent.
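
A quick interactive check:

    >>> import numpy as np
    >>> np.dtype('S') == np.dtype(bytes) == np.dtype(np.bytes_)
    True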

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-17 Thread Pauli Virtanen
Julian Taylor jtaylor.debian at googlemail.com writes:
[clip]
> For backward compatibility we *cannot* change S.
> Maybe we could change the meaning of 'a' but it would be safer
> to add a new dtype, possibly 'S' can be deprecated in favor
> of 'B' when we have a specific encoding dtype.

Note that the rename 'S' -> 'B' was not done in the Python 3 port,
because 'B' already denotes uint8:

>>> np.array([1], dtype='B')
array([1], dtype=uint8)

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-17 Thread Pauli Virtanen
17.01.2014 15:09, Aldcroft, Thomas kirjoitti:
[clip]
> I've been playing around with porting a stack of analysis libraries
> to Python 3 and this is a very timely thread and comment.  What I
> discovered right away is that all the string data coming from
> binary HDF5 files show up (as expected) as 'S' type, but that
> trying to make everything actually work in Python 3 without
> converting to 'U' is a big mess of whack-a-mole.
>
> Yes, it's possible to change my libraries to use bytestring
> literals everywhere, but the Python 3 user experience becomes
> horrible because to interact with the data all downstream
> applications need to use bytestring literals everywhere.  E.g.
> doing a simple filter like `string_array == 'foo'` doesn't work,
> and this will break all existing code when trying to run in Python
> 3.  And every time you try to print something it has this horrible
> b in front.  Ugly, and it just won't work well in the end.
[clip]

Ok, I see your point.

Having additional Unicode data types with smaller widths could be
useful. On Python 2, they would then be Unicode strings, right? Thanks
to Py2's automatic Unicode encoding/decoding, they might also be usable
in interactive and similar use on Py2.

Adding new data types in Numpy codebase takes some work, but it's
possible to do.

There's also an issue (as noted in the Github ticket) that
array([u'foo'], dtype=bytes) encodes silently via the ASCII codec.
This is probably not how it should be.
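
Concretely (Python 3): ASCII input is encoded silently, while
non-ASCII input raises instead:

    >>> np.array([u'foo'], dtype=bytes)
    array([b'foo'], dtype='|S3')
    >>> np.array([u'f\xf6\xf6'], dtype=bytes)
    Traceback (most recent call last):
      ...
    UnicodeEncodeError: 'ascii' codec can't encode characters ...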

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] -ffast-math

2013-11-29 Thread Pauli Virtanen
29.11.2013 22:15, Dan Goodman kirjoitti:
> Is it possible to get access to versions of ufuncs like sin and cos but
> compiled with the -ffast-math compiler switch?

You can recompile Numpy with -ffast-math in OPT environment variable.
Caveat emptor.
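
For instance (an untested sketch, assuming a from-source build with
the distutils-based machinery of this era):

    OPT="-O2 -ffast-math" python setup.py build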

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-23 Thread Pauli Virtanen
23.10.2013 20:10, Matthew Brett kirjoitti:
[clip]
> There's no need to prefer one group over the other - we just need to
> make sure that both groups have instructions and binaries they can
> recognize as being for their case.  As in:
>
> (Group 1): The easiest way to get 
> (Group 2): You can also install the stack from community-supported
> binaries, this is more complicated, but possible by ...

This is pretty much what the scipy.org/install.html page currently says.
What can be improved is adding more noticeable links to the binaries.

I'm convinced that relying on a Python distribution on Windows and OSX
is a good idea, and needs to be emphasized over the needs of advanced
users, who should have enough patience to read to the bottom of the page.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-23 Thread Pauli Virtanen
23.10.2013 17:51, Chris Barker - NOAA Federal kirjoitti:
[clip]
> But it sounds like the real problem is with the surrounding
> pages--that's the page you find when you try to figure out how to get
> numpy--if that page is about the stack, it should not be linked to
> directly from the numpy.org page without explanation.
>
> We do have a branding problem: scipy is a package, a stack and an
> ecosystem/community. It should be clear which one is being referred to
> when.

Yep, the scipy.org website has a navigation structure problem, in that
the scipy library and the scipy stack/community parts are not
separated clearly enough.

The navigation items for both sections are visible at the same time, the
graphical style is the same, numpy.org is on a different domain etc., so
it's a bit of a mess. Still an improvement over Moinmoin, though.

One option would be to separate the navigation tree of the scipy
library part from the entry page. This would likely make things much
clearer.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-23 Thread Pauli Virtanen
23.10.2013 21:06, Pauli Virtanen kirjoitti:
> 23.10.2013 17:51, Chris Barker - NOAA Federal kirjoitti:
> [clip]
>> But it sounds like the real problem is with the surrounding
>> pages--that's the page you find when you try to figure out how to get
>> numpy--if that page is about the stack, it should not be linked to
>> directly from the numpy.org page without explanation.
>>
>> We do have a branding problem: scipy is a package, a stack and an
>> ecosystem/community. It should be clear which one is being referred to
>> when.
>
> Yep, the scipy.org website has a navigation structure problem, in that
> the scipy library and the scipy stack/community parts are not
> separated clearly enough.

This may help:
https://github.com/scipy/scipy.org/pull/31


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-23 Thread Pauli Virtanen
23.10.2013 22:50, Chris Barker kirjoitti:
[clip]
 This makes me think: apparently there is an offical scipy stack --
 and I even found it with a quick google:
 
 http://www.scipy.org/stackspec.html

If you click "More information..." on the front page, or "About Scipy"
in the sidebar, it takes you to an explanation that says that the scipy
stack exists and what it is. A newcomer may possibly read that.

-- 
Pauli Virtanen


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-23 Thread Pauli Virtanen
23.10.2013 23:24, Pauli Virtanen kirjoitti:
> 23.10.2013 22:50, Chris Barker kirjoitti:
> [clip]
>> This makes me think: apparently there is an official scipy stack --
>> and I even found it with a quick google:
>>
>> http://www.scipy.org/stackspec.html
>
> If you click "More information..." on the front page, or "About Scipy"
> in the sidebar, it takes you to an explanation that says that the scipy
> stack exists and what it is. A newcomer may possibly read that.

The reason it's so obscure is probably that the discussion mostly took
place on the Numfocus mailing list rather than here, and I don't
remember it being announced at any point.

Oh well,

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] official binaries on web page.

2013-10-22 Thread Pauli Virtanen
22.10.2013 06:29, Chris Barker kirjoitti:
> If you go to numpy.org, and try to figure out how to install numpy,
> you are most likely to end up here:
>
> http://www.scipy.org/install.html
>
> where there is no mention of the binaries built by the numpy project
> itself, either Windows or Mac.

The links are there: http://www.scipy.org/install.html#custom

-- 
Pauli Virtanen


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is there a contributors agreement for numypy?

2013-10-21 Thread Pauli Virtanen
21.10.2013 21:00, Charles R Harris kirjoitti:
[clip]
> There is no agreement needed, but all numpy is released under the
> simplified BSD license and any contributions need to be compatible
> with that. I don't know that there is any special license for the
> documentation. Anyone?

I don't think the documentation has a separate license; also BSD.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is there a contributors agreement for numypy?

2013-10-21 Thread Pauli Virtanen
21.10.2013 22:36, Mounir E. Bsaibes kirjoitti:
> On Mon, 2013-10-21 at 21:23 +0300, Pauli Virtanen wrote:
>> 21.10.2013 21:00, Charles R Harris kirjoitti:
>> [clip]
>>> There is no agreement needed, but all numpy is released under the
>>> simplified BSD license and any contributions need to be compatible
>>> with that. I don't know that there is any special license for the
>>> documentation. Anyone?
>>
>> I don't think the documentation has a separate license; also BSD.
>
> How do the contributors know that their contributions would be released
> under BSD?

The project is BSD-licensed. Contributing implies agreement to the license.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

