Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Sturla Molden
Chris Barker - NOAA Federal  wrote:

> Turns out I was passing in numpy arrays that I had typed as "np.int".
> It worked OK two years ago when I was testing only on 32 bit pythons,
> but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is
> now a C long!

It has always been C long. It is the C long that varies between platforms.
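
A quick way to check this on a given machine, as a minimal sketch using only
the standard-library ctypes module (the helper name c_integer_sizes is made up
for illustration):

```python
import ctypes


def c_integer_sizes():
    """Return the size in bytes of a few C types on the current platform."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "long long": ctypes.sizeof(ctypes.c_longlong),
        "void*": ctypes.sizeof(ctypes.c_void_p),
    }


if __name__ == "__main__":
    # Print each type with its width in bits.
    for name, size in c_integer_sizes().items():
        print(f"{name:10s} {size * 8} bits")
```

On 64 bit Linux and OS X this typically reports a 64 bit long; on 64 bit
Windows the long stays at 32 bits.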

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Nathaniel Smith
On Jul 24, 2015 08:55, "Julian Taylor" wrote:
>
> On 07/23/2015 04:29 AM, Nathaniel Smith wrote:
> > Hi all,
> >
> > So one of the things exposed in the numpy namespace are objects called
> >np.int
> >np.float
> >np.bool
> > etc.
> >
> > These are commonly used -- in fact, just yesterday on another project
> > I saw a senior person reviewing a pull request instruct a more junior
> > person that they should use np.float instead of float or np.float64.
> > But AFAICT everyone who is actually using them is doing this based on
> > a very easy-to-fall-for misconception, i.e., that these objects have
> > something to do with numpy.
>
> I don't see the issue. They are just aliases so how is np.float worse
> than just float?

Because np.float systematically confuses people in a way that plain float
does not. Which is problematic given that we have a lot of users who aren't
expert programmers and are easily confused.

> To me this does not seem worth the bother of deprecation.
> An argument could be made for deprecating creating dtypes from python
> builtin types as they are ambiguous (C float != python float) and
> platform dependent. E.g. dtype=int is just an endless source of bugs.
> But this is also so invasive that the deprecation would never be
> completed and just be a bother to everyone.

Yeah, I don't see any way to ever make dtype=int an error, though I can see
an argument for making it unconditionally int64 or intp. That's a separate
discussion... but every step we can make to simplify these names makes it
easier to untangle the overall knot, IMHO. (E.g. if people have different
expectations about what int and np.int should mean -- as they obviously do
-- then changing the meaning of both of them is harder than deprecating one
and then changing the other, so this deprecation puts us in a better
position even if it doesn't immediately help much.)
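
To make the difference concrete, here is a small sketch (assuming NumPy is
installed; describe_int_dtypes is a hypothetical helper, not a NumPy function)
that compares the platform-dependent spelling with the explicit ones:

```python
import struct

import numpy as np


def describe_int_dtypes():
    """Map a few integer dtype spellings to their item size in bytes."""
    return {
        "dtype=int": np.dtype(int).itemsize,      # depends on platform and NumPy version
        "np.int64": np.dtype(np.int64).itemsize,  # always 8 bytes
        "np.intp": np.dtype(np.intp).itemsize,    # pointer-sized
    }


sizes = describe_int_dtypes()
print(sizes)
# struct format "P" is a native pointer, for comparison with np.intp.
print("pointer size:", struct.calcsize("P"))
```

Only the last two entries are the same by construction on every platform; the
first one is whatever the running NumPy decides dtype=int means.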

> So -1 from me.

Do you really mean this as a true veto? While some of the thread has gotten
a bit confused about how much of a change we're actually talking about,
AFAICT everyone else is very much in favor of this deprecation, including
testimony from multiple specific users who have gotten burned.

-n


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Chris Barker - NOAA Federal
So one more bit of anecdotal evidence:

I just today revived some Cython code I wrote a couple years ago and
haven't tested since.

It wraps a C library that uses a lot of "int" typed values.

Turns out I was passing in numpy arrays that I had typed as "np.int".
It worked OK two years ago when I was testing only on 32 bit pythons,
but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is
now a C long!

I really thought I knew better, even a couple years ago, but I guess
it's just too easy to slip up there.
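
For a C library that takes "int" values, a safer spelling is np.intc, which is
defined to match the C int. A minimal sketch, assuming NumPy is installed
(as_c_int_array is a hypothetical helper name, and this is not the code from
the project described above):

```python
import ctypes

import numpy as np


def as_c_int_array(values):
    """Return a C-contiguous array whose element type matches C int."""
    return np.ascontiguousarray(values, dtype=np.intc)


arr = as_c_int_array([1, 2, 3])
# The element size agrees with the C compiler's int, whatever the platform.
print(arr.dtype, arr.itemsize, ctypes.sizeof(ctypes.c_int))
```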

Yay to Cython for keeping types straight (I got a run-time error).
And yay to me for having at least some basic tests.

But Boo to numpy for a very easy to confuse type API.

-Chris


> On Jul 31, 2015, at 10:45 AM, Sturla Molden  wrote:
>
> Chris Barker  wrote:
>
> >> What about Fortran -- I've been out of that loop for ages -- does
>> semi-modern Fortran use well defined integer types?
>
> Modern Fortran is completely sane.
>
> INTEGER without kind number (Fortran 77) is the fastest integer on the CPU.
> On AMD64 that is 32 bit, because it is designed to use a 64 bit pointer
> with a 32 bit offset. (That is also why Microsoft decided to use a 32 bit
> long, because it by definition is the fastest integer of at least 32 bits.
> One can actually claim that the C standard is violated with a 64 bit long
> on AMD64.) Because of this we use a 32 bit integer in BLAS and LAPACK
> linked to NumPy and SciPy.
>
> The function KIND (Fortran 90) allows us to query the kind number of a
> given variable, e.g. to find out the size of INTEGER and REAL.
>
> The function SELECTED_INT_KIND (Fortran 90) returns the kind number of
> smallest integer with a specified range.
>
> The function SELECTED_REAL_KIND (Fortran 90) returns the kind number of
> smallest float with a given range and precision. The returned kind number
> can be used for REAL and COMPLEX.
>
> KIND, SELECTED_INT_KIND and SELECTED_REAL_KIND will all return compile-time
> constants, and can be used to declare other variables if the return value
> is stored in a variable with the attribute PARAMETER. This allows the
> programmer to get the REAL, COMPLEX or INTEGER the algorithm needs
> numerically, without thinking about how big they need to be in bits.
>
> ISO_C_BINDING is a Fortran 2003 module which contains kind numbers
> corresponding to all C types, including size_t and void*, C structs, an
> attribute for using pass-by-value semantics, controlling the C name to
> avoid name mangling, as well as functions for converting between C and
> Fortran pointers. It allows portable interop between C and Fortran (either
> calling C from Fortran or calling Fortran from C).
>
> ISO_FORTRAN_ENV is a Fortran 2003 and 2008 module. In F2003 it contains kind
> numbers for integers with specified size: INT8, INT16, INT32, and INT64. In
> F2008 it also contains kind numbers for IEEE floating point types: REAL32,
> REAL64, and REAL128. The kind numbers for floating point types can also be
> used to declare complex numbers.
>
> So with modern Fortran we have a completely portable and unambiguous type
> system.
>
> C11/C++11 is sane as well, but not quite as sane as that of modern Fortran.
>
>
> Sturla
>


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Jason Newton
On Fri, Jul 31, 2015 at 5:19 PM, Nick Papior  wrote:

> --
>
> Kind regards Nick Papior
> On 31 Jul 2015 17:53, "Chris Barker"  wrote:
> >
> > On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton  wrote:
> >>
> > >> This really needs changing though.  scientific researchers don't catch
> > >> this subtlety and expect it to be just like the c and matlab types they
> > >> know a little about.
> > >
> > >
> > > well, C types are a %&$ nightmare as well! In fact, one of the biggest
> > > issues comes from cPython's use of a C "long" for an integer -- which is
> > > not clearly defined. If you are writing code that needs any kind of binary
> > > compatibility, cross platform compatibility, and particularly if you want
> > > to be able to distribute pre-compiled binaries of extensions, etc, then
> > > you'd better use well-defined types.
>
There was some truth to this, but if you, like the majority of scientific
researchers, only produce code for x86 or x86_64 on Windows and Linux, then as
long as you aren't treating pointers as ints, everything behaves in
accordance with general expectations. The standards did and still do allow
for a bit of flux, but things like OpenCL [
https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/scalarDataTypes.html
] made this really strict so we stop writing ifdefs to deal with varying
bitwidths and just implement the algorithms - which is typically a
researcher’s top priority.

I'd say I use the strongly defined types (e.g. int/float32) whenever doing
protocol or communications work - it makes complete sense there. But often
for computation, especially when interfacing with c extensions it makes
more sense for the developer to use types/typenames that ought to match 1:1
with c in every case.
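
The two cases can be sketched with the standard-library struct module. This is
only an illustration: the header layout and the name pack_header are made up,
not taken from any real protocol.

```python
import struct


def pack_header(msg_id, payload_len):
    """Pack a hypothetical wire header: little-endian int32 id, uint64 length."""
    # "<" selects standard sizes and no padding, independent of the platform.
    return struct.pack("<iQ", msg_id, payload_len)


header = pack_header(7, 1024)
fixed_size = struct.calcsize("<iQ")  # same on every platform
native_long = struct.calcsize("@l")  # size of this platform's C long

print(len(header), fixed_size, native_long)
```

The "<" formats are what protocol work wants; the "@" (native) formats are the
ones that track the local C compiler, which is the 1:1 match with C that an
extension interface cares about.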

-Jason


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Nick Papior
--

Kind regards Nick Papior
On 31 Jul 2015 17:53, "Chris Barker"  wrote:
>
> On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton  wrote:
>>
>> This really needs changing though.  scientific researchers don't catch
>> this subtlety and expect it to be just like the c and matlab types they
>> know a little about.
>
>
> well, C types are a %&$ nightmare as well! In fact, one of the biggest
> issues comes from cPython's use of a C "long" for an integer -- which is
> not clearly defined. If you are writing code that needs any kind of binary
> compatibility, cross platform compatibility, and particularly if you want
> to be able to distribute pre-compiled binaries of extensions, etc, then
> you'd better use well-defined types.
>
> numpy has had well-defined types for ages, but it is a shame that it's so
> easy to use the poorly-defined ones.
>
>>  I can't even keep it straight in all circumstances, how can I expect
>> them to?  This makes all the newcomers face the same pain and introduce
>> more bugs into otherwise good code.
>
>
> indeed.
>
>>
>> +1 Change it now like ripping off a bandaid.  Match C11/C++11 types and
>> solve much pain past present and future in exchange for a few lashings for
>> the remainder of the year.
>
>
> Sorry -- I'm not sure what C11 types are -- is "int", "long", etc,
> deprecated? If so, then yes.
>
> What about Fortran -- I've been out of that loop for ages -- does
> semi-modern Fortran use well defined integer types?

Yes, this is much like the C equivalent: integer is int and real is float; for
long and double, constant casts are needed.

>
> Is it possible to deprecate a bunch of the built-in numpy dtypes? Without
> annoying the heck out of everyone -- because there is a LOT of code out there
> that just uses np.float, np.int, etc.
>
>
>>> An argument could be made for deprecating creating dtypes from python
>>> builtin types as they are ambiguous (C float != python float) and
>>> platform dependent. E.g. dtype=int is just an endless source of bugs.
>>> But this is also so invasive that the deprecation would never be
>>> completed and just be a bother to everyone.
>
>
> yeah, that is a big concern. :-(
>
> -Chris
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Sturla Molden
Chris Barker  wrote:

> What about Fortran -- I've been out of that loop for ages -- does
> semi-modern Fortran use well defined integer types?

Modern Fortran is completely sane.

INTEGER without kind number (Fortran 77) is the fastest integer on the CPU.
On AMD64 that is 32 bit, because it is designed to use a 64 bit pointer
with a 32 bit offset. (That is also why Microsoft decided to use a 32 bit
long, because it by definition is the fastest integer of at least 32 bits.
One can actually claim that the C standard is violated with a 64 bit long
on AMD64.) Because of this we use a 32 bit integer in BLAS and LAPACK
linked to NumPy and SciPy.

The function KIND (Fortran 90) allows us to query the kind number of a
given variable, e.g. to find out the size of INTEGER and REAL.

The function SELECTED_INT_KIND (Fortran 90) returns the kind number of
smallest integer with a specified range. 
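
For readers more at home in Python, the idea behind SELECTED_INT_KIND can be
sketched as below. This is an illustrative analogue only, not how any Fortran
compiler implements it; the name selected_int_bits and the candidate widths
(8, 16, 32, 64 bits) are assumptions made for the example.

```python
def selected_int_bits(r):
    """Smallest width in bits that holds every n with -10**r < n < 10**r.

    Returns None when no candidate width up to 64 bits is wide enough.
    """
    for bits in (8, 16, 32, 64):
        # A signed two's-complement integer of this width reaches 2**(bits-1) - 1.
        if 10**r - 1 <= 2 ** (bits - 1) - 1:
            return bits
    return None


for digits in (2, 4, 9, 10, 18, 19):
    print(digits, selected_int_bits(digits))
```

The caller states the numeric range the algorithm needs and gets back a
suitable width, instead of hard-coding a bit count.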

The function SELECTED_REAL_KIND (Fortran 90) returns the kind number of
smallest float with a given range and precision. The returned kind number
can be used for REAL and COMPLEX.

KIND, SELECTED_INT_KIND and SELECTED_REAL_KIND will all return compile-time
constants, and can be used to declare other variables if the return value
is stored in a variable with the attribute PARAMETER. This allows the
programmer to get the REAL, COMPLEX or INTEGER the algorithm needs
numerically, without thinking about how big they need to be in bits.

ISO_C_BINDING is a Fortran 2003 module which contains kind numbers
corresponding to all C types, including size_t and void*, C structs, an
attribute for using pass-by-value semantics, controlling the C name to
avoid name mangling, as well as functions for converting between C and
Fortran pointers. It allows portable interop between C and Fortran (either
calling C from Fortran or calling Fortran from C). 

ISO_FORTRAN_ENV is a Fortran 2003 and 2008 module. In F2003 it contains kind
numbers for integers with specified size: INT8, INT16, INT32, and INT64. In
F2008 it also contains kind numbers for IEEE floating point types: REAL32,
REAL64, and REAL128. The kind numbers for floating point types can also be
used to declare complex numbers.

So with modern Fortran we have a completely portable and unambiguous type
system.

C11/C++11 is sane as well, but not quite as sane as that of modern Fortran.


Sturla



Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Chris Barker
On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton  wrote:

> This really needs changing though.  scientific researchers don't catch
> this subtlety and expect it to be just like the c and matlab types they
> know a little about.
>

well, C types are a %&$ nightmare as well! In fact, one of the biggest
issues comes from cPython's use of a C "long" for an integer -- which is
not clearly defined. If you are writing code that needs any kind of binary
compatibility, cross platform compatibility, and particularly if you want
to be able to distribute pre-compiled binaries of extensions, etc, then
you'd better use well-defined types.

numpy has had well-defined types for ages, but it is a shame that it's so
easy to use the poorly-defined ones.

> I can't even keep it straight in all circumstances, how can I expect them
> to?  This makes all the newcomers face the same pain and introduce more
> bugs into otherwise good code.
>

indeed.


> +1 Change it now like ripping off a bandaid.  Match C11/C++11 types and
> solve much pain past present and future in exchange for a few lashings for
> the remainder of the year.
>

Sorry -- I'm not sure what C11 types are -- is "int", "long", etc,
deprecated? If so, then yes.

What about Fortran -- I've been out of that loop for ages -- does
semi-modern Fortran use well defined integer types?

Is it possible to deprecate a bunch of the built-in numpy dtypes? Without
annoying the heck out of everyone -- because there is a LOT of code out there
that just uses np.float, np.int, etc.


>> An argument could be made for deprecating creating dtypes from python
>> builtin types as they are ambiguous (C float != python float) and
>> platform dependent. E.g. dtype=int is just an endless source of bugs.
>> But this is also so invasive that the deprecation would never be
>> completed and just be a bother to everyone.
>>
>
yeah, that is a big concern. :-(

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-07-31 Thread Julian Taylor
On 31.07.2015 08:24, Jason Newton wrote:
> Been using numpy in its various forms since like 2005.  burned on int,
> int_ just today with boost.python / ndarray conversions and a number of
> times before that.  intc being C's int!? Didn't even know it existed
> till today.  This isn't the first time, esp with float.  Bool is
> actually expected for me and I'd prefer it stay 1 byte for storage
> efficiency - I'll use a long if I want it machine word wide.

A long is only machine word wide on posix; on Windows it's not. This
nonsense is unfortunately also in numpy. It also affects dtype=int.
The correct word size type is actually np.intp.

btw. if something needs deprecating it is np.float128; this is the most
confusing type name in all of numpy as its precision is actually 80
bits in most cases (x86), 64 bits sometimes (arm) and very rarely actually
128 bits (sparc).
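
The gap between storage size and real precision can be inspected at run time.
A minimal sketch, assuming NumPy is installed (longdouble_layout is a
hypothetical helper; np.longdouble is used because np.float128 does not exist
on every platform):

```python
import numpy as np


def longdouble_layout():
    """Report storage size and significand precision of np.longdouble."""
    info = np.finfo(np.longdouble)
    return {
        # Bits occupied in memory, including any alignment padding.
        "storage_bits": np.dtype(np.longdouble).itemsize * 8,
        # Significand precision in bits: fraction bits plus the leading bit.
        "significand_bits": info.nmant + 1,
    }


layout = longdouble_layout()
print(layout)
```

On x86 Linux this typically shows 128 storage bits but only a 64 bit
significand, which is the 80 bit extended format padded for alignment.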