Re: [Numpy-discussion] Verify your sourceforge windows installer downloads

2015-05-28 Thread Andrew Collette
Here is their lame excuse: https://sourceforge.net/blog/gimp-win-project-wasnt-hijacked-just-abandoned/ It probably means this: If NumPy installers are moved away from Sourceforge, they will set up a mirror and load the mirrored installers with all sorts of crapware. It is some sort of

Re: [Numpy-discussion] ANN: HDF5 for Python 2.5.0

2015-04-09 Thread Andrew Collette
Congrats! Also btw, you might want to switch to a new subject line format for these emails -- the mention of Python 2.5 getting hdf5 support made me do a serious double take before I figured out what was going on, and 2.6 and 2.7 will be even worse :-) Ha! Didn't even think of that. For our

[Numpy-discussion] ANN: HDF5 for Python 2.5.0

2015-04-09 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.5.0 The h5py team is happy to announce the availability of h5py 2.5.0. This release introduces experimental support for the highly-anticipated Single Writer Multiple Reader (SWMR) feature in the upcoming HDF5 1.10

[Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-06 Thread Andrew Collette
proprietary code? We'd like to avoid building NumPy ourselves if we can avoid it. Apologies if this is explained somewhere, but I couldn't find it. Thanks! Andrew Collette ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman

Re: [Numpy-discussion] String type again.

2014-07-18 Thread Andrew Collette
Hi Chris, A Latin-1 based 'a' type would have similar problems. Maybe not -- latin1 is fixed width. Yes, Latin-1 is fixed width, but the issue is that when writing to a fixed-width UTF8 string in HDF5, it will expand, possibly losing data. What I would like to avoid is a situation where a
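The expansion described above can be sketched in plain Python, without HDF5 or NumPy. The helper name `fit_utf8` and the 8-byte field width are assumptions made for illustration; this is not how h5py or HDF5 actually store strings.

```python
def fit_utf8(text: str, width: int) -> bytes:
    """Encode text as UTF-8 and cut it to at most `width` bytes.

    Hypothetical helper: a trailing partial multi-byte character is
    dropped rather than left as invalid UTF-8.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= width:
        return encoded
    # errors="ignore" discards an incomplete sequence at the cut point
    return encoded[:width].decode("utf-8", errors="ignore").encode("utf-8")


text = "caf\u00e9 ol\u00e9"            # 8 characters, two of them non-ASCII
latin1 = text.encode("latin-1")        # one byte per character
utf8 = text.encode("utf-8")            # each accented letter takes two bytes
stored = fit_utf8(text, len(latin1))   # squeeze into a field sized for Latin-1

print(len(latin1), len(utf8), stored.decode("utf-8"))
```

A field sized for the Latin-1 form is too small for the UTF-8 form of the same text, so the final character is lost.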

Re: [Numpy-discussion] String type again.

2014-07-18 Thread Andrew Collette
Hi Chris, Again, they shouldn't do that, they should be pushing a 10-character string into something -- and utf-8 is going to (possibly) truncate that. That's an HDF/utf-8 limitation that people are going to have to deal with. I think you're suggesting that numpy follow the HDF model, so that

Re: [Numpy-discussion] String type again.

2014-07-18 Thread Andrew Collette
Hi Chris, What it would do is push the problem from the HDF5-numpy interface to the python-numpy interface. I'm not sure that's a good trade off. Maybe I'm being too paranoid about the truncation issue. We already perform truncation when going from e.g. vlen to fixed-width strings in

Re: [Numpy-discussion] String type again.

2014-07-18 Thread Andrew Collette
Hi Chris, Actually, I agree about the truncation issue, but it's a question of where to put it -- I'm suggesting that I don't want it at the python-numpy interface. Yes, that's a good point. Of course, by using Latin-1 rather than UTF-8 we can't support all Unicode code points (hence the ?

Re: [Numpy-discussion] String type again.

2014-07-17 Thread Andrew Collette
Hi, good argument for ASCII, but utf-8 is a bad idea, as there is no 1:1 correspondence between length of string in bytes and length in characters -- as numpy needs to pre-allocate a defined number of bytes for a dtype, there is a disconnect between the user and numpy as to how long a string is
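The mismatch between character count and byte count is easy to see with the standard library alone. The function name `utf8_size` and the sample strings are illustrative assumptions.

```python
def utf8_size(text: str) -> tuple:
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


samples = [
    "numpy",                  # pure ASCII: 1 byte per character
    "na\u00efve",             # one 2-byte character
    "\u65e5\u672c\u8a9e",     # three 3-byte characters
    "\U0001F642",             # one 4-byte character
]

for s in samples:
    chars, nbytes = utf8_size(s)
    print(f"{chars} characters -> {nbytes} bytes")
```

Because a code point can take from one to four bytes, a dtype that preallocates N bytes cannot promise room for N characters, and a dtype that promises N characters would have to reserve 4N bytes.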

Re: [Numpy-discussion] String type again.

2014-07-15 Thread Andrew Collette
Hi Chuck, This note proposes to adapt the currently existing 'a' type letter, currently aliased to 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte internal representations for unicode strings, ascii and latin1. Ascii has the advantage that it is a subset of UTF-8,

[Numpy-discussion] ANN: HDF5 for Python 2.3.0

2014-04-22 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.3.0 The h5py team is happy to announce the availability of h5py 2.3.0 (final). Thanks to everyone who provided beta feedback! What's h5py? The h5py package is a Pythonic interface to the HDF5 binary data

Re: [Numpy-discussion] [Hdf-forum] ANN: HDF5 for Python 2.3.0

2014-04-22 Thread Andrew Collette
Hi, Good work! Small question : do you now have the interface to set alignment? Unfortunately this didn't make it in to 2.3. Pull requests are welcome for this and other MPI features! Andrew

[Numpy-discussion] ANN: HDF5 for Python 2.3.0 BETA

2014-03-14 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.3.0 BETA The h5py team is happy to announce the availability of h5py 2.3.0 beta. This beta release will be available for approximately two weeks. What's h5py? The h5py package is a Pythonic interface

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-22 Thread Andrew Collette
Hi Oscar, Is it fair to say that people should really be using vlen utf-8 strings for text? Is it problematic because of the need to interface with non-Python libraries using the same hdf5 file? The general recommendation has been to use fixed-width strings for exactly that reason; FORTRAN

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-21 Thread Andrew Collette
Hi Chris, Just stumbled on this discussion (I'm the lead author of h5py). We would be overjoyed if there were a 1-byte text type available in NumPy. String handling is the source of major pain right now in the HDF5 world. All HDF5 strings are text (opaque types are used for binary data), but

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-21 Thread Andrew Collette
Hi Chris, it looks from here: http://www.hdfgroup.org/HDF5/doc/ADGuide/WhatsNew180.html that HDF uses utf-8 for unicode strings -- so you _could_ roundtrip with a lot of calls to encode/decode -- which could be pretty slow, compared to other ways to dump numpy arrays into HDF-5 -- that may

[Numpy-discussion] ANN: HDF5 for Python 2.2.1

2013-12-09 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.2.1 The h5py team is happy, in a sense, to announce the availability of h5py 2.2.1. This release fixes a critical bug reported by Jim Parker on December 7th, which affects code using HDF5 compound types. We recommend

[Numpy-discussion] dtype metadata attribute

2013-09-05 Thread Andrew Collette
documentation on it anywhere; the metadata keyword doesn't even appear in the dtype docstring. Is this an officially supported feature in NumPy? I am mostly concerned about it going away in a future release. Thanks! Andrew Collette

[Numpy-discussion] ANN: HDF5 for Python (h5py) 2.2.0

2013-09-04 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.2.0 We are proud to announce that HDF5 for Python 2.2.0 is now available. Thanks to everyone who helped put together this release! The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you

[Numpy-discussion] ANN: HDF5 for Python (h5py) 2.2 BETA

2013-07-18 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.2.0 BETA We are proud to announce that HDF5 for Python 2.2.0 (beta) is now available. Because of the large number of new features in this release, we are actively seeking community feedback over the (2-week) beta

Re: [Numpy-discussion] Parameterised dtypes

2013-05-28 Thread Andrew Collette
Hi Richard, I'm in the process of defining some new dtypes to handle non-physical calendars (such as the 360-day calendar used in the climate modelling world). This is all going fine[*] so far, but I'd like to know a little bit more about how much is ultimately possible. The PyArray_Descr

Re: [Numpy-discussion] GSOC 2013

2013-03-05 Thread Andrew Collette
5. Currently dtypes are limited to a set of fixed types, or combinations of these types. You can't have, say, a 48 bit float or a 1-bit bool. This project would be to allow users to create entirely new, non-standard dtypes based on simple rules, such as specifying the length of the sign,

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-09 Thread Andrew Collette
Hi Nathaniel, Sure. But the only reason this is in 1.6 is that the person who made the change never mentioned it to anyone else, so it wasn't noticed until after 1.6 came out. If it had gone through proper review/mailing list discussion (like we're doing now) then it's very unlikely it would

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Andrew Collette
Hi, I think you are voting strongly for the current casting rules, because they make it less obvious to the user that scalars are different from arrays. Maybe this is the source of my confusion... why should scalars be different from arrays? They should follow the same rules, as closely as

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Andrew Collette
Hi Nathaniel, (Responding to both your emails) The problem is that the rules for arrays - and for every other part of numpy in general - are that we *don't* pick types based on values. Numpy always uses input types to determine output types, not input values. Yes, of course... array operations

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Andrew Collette
Hi Dag, So you are saying that, for an array x, you want x + random.randint(10) to produce an array with a random dtype? Under the proposed behavior, depending on the dtype of x and the value from random, this would sometimes add-with-rollover and sometimes raise ValueError. Under the

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Andrew Collette
Hi, Keep in mind that in the third option (current 1.6 behavior) the dtype is large enough to hold the random number, but not necessarily to hold the result. So for instance if x is an int16 array with only positive values, the result of this addition may contain negative values (or not,

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Andrew Collette
Hi Olivier, Yes, certainly. But in either the proposed or 1.5 behavior, if the values in x are close to the limits of the type, this can happen also. My previous email may not have been clear enough, so to be sure: in my above example, if the random number is 3, then the result may

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi Matthew, I realized when I thought about it, that I did not have a clear idea of your exact use case. How does the user specify the thing to add, and why do you need to avoid an error in the case that adding would overflow the type? Would you mind giving an idiot-level explanation?

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi Matthew, Just to be clear, you mean you might have something like this? def my_func(array_name, some_offset): arr = load_somehow(array_name) # dtype hitherto unknown return arr + some_offset ? And the problem is that it fails late? Is it really better that something bad

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi Matthew, In this case I think you'd probably agree it would be reasonable to raise an error - all other things being equal? No, I don't agree. I want there to be some default semantics I can rely on. Preferably, I want it to do the same thing it would do if some_offset were an array with

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi Dag, But the default float dtype is double, and default integer dtype is at least int32. So if you rely on NumPy-supplied default behaviour you are fine! As I mentioned, this caught my interest because people routinely save data in HDF5 as int8 or int16 to save disk space. It's not at

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi Matthew, Ah - well - I only meant that raising an error in the example would be no more surprising than raising an error at the python prompt. Do you agree with that? I mean, if the user knew that: np.array([1], dtype=np.int8) + 128 would raise an error, they'd probably expect your

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-07 Thread Andrew Collette
Hi, Taking 2) first, in this example: return self.f[dataset_name][...] + heightmap assuming it is not going to upcast, would you rather it overflow than raise an error? Why? The second seems more explicit and sensible to me. Yes, I think this (the 1.5 overflow behavior) was a bit

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-04 Thread Andrew Collette
Hi Olivier, A key difference is that with arrays, the dtype is not chosen just big enough for your data to fit. Either you set the dtype yourself, or you're using the default inferred dtype (int/float). In both cases you should know what to expect, and it doesn't depend on the actual numeric

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-04 Thread Andrew Collette
Hi, (sorry, no time for full reply, so for now just answering what I believe is the main point) Thanks for taking the time to discuss/explain this at all... I appreciate it. The evilness lies in the silent switch between the rollover and upcast behavior, as in the example I gave previously:

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-04 Thread Andrew Collette
Hi, In fact in 1.6 there is no assignment of a dtype to '1' which makes the way 1.6 handles it consistent with the array rules: I guess I'm a little out of my depth here... what are the array rules? # Ah-hah, it looks like '1' has a uint8 dtype: (np.ones(2, dtype=np.uint8) / np.ones(2,

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-03 Thread Andrew Collette
Consensus in that bug report seems to be that for array/scalar operations like: np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! we should raise an error, rather than either silently upcasting the result (as in 1.6 and 1.7) or silently downcasting the scalar (as in
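The three policies under discussion (roll over, raise, or upcast) can be modelled with plain Python integers. This is only a sketch of the arithmetic: the function names are hypothetical, and none of this claims to reproduce what any particular NumPy release does.

```python
INT8_MIN, INT8_MAX = -128, 127


def wrap_int8(value: int) -> int:
    """Roll the value over into the int8 range (two's-complement wrap)."""
    return (value + 128) % 256 - 128


def checked_int8(value: int) -> int:
    """Return the value unchanged, or raise if it does not fit in int8."""
    if not INT8_MIN <= value <= INT8_MAX:
        raise OverflowError(f"{value} does not fit in int8")
    return value


def smallest_signed_bits(value: int) -> int:
    """Width in bits of the smallest signed integer type that holds value."""
    for bits in (8, 16, 32, 64):
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return bits
    raise OverflowError(f"{value} does not fit in 64 bits")


total = 1 + 1000                      # the sum from the example above
print(wrap_int8(total))               # rollover policy
print(smallest_signed_bits(total))    # upcast policy
try:
    checked_int8(total)               # error policy
except OverflowError as exc:
    print("error:", exc)
```

The rollover result is a small negative number that bears no obvious relation to the inputs, which is why silent wrapping was considered the most surprising of the three options.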

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-03 Thread Andrew Collette
Hi Dag, If neither is objectively better, I think that is a very good reason to kick it down to the user. Explicit is better than implicit. I agree with you, up to a point. However, we are talking about an extremely common operation that I think most people (myself included) would not expect

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-03 Thread Andrew Collette
Hi Olivier, Another solution is to forget about trying to be smart and always upcast the operation. That would be my 2nd preferred solution, but it would make it very annoying to deal with Python scalars (typically int64 / float64) that would be upcasting lots of things, potentially breaking

Re: [Numpy-discussion] Status of the 1.7 release

2012-12-17 Thread Andrew Collette
other parts of the project. If I can be of any help just let me know how. Andrew Collette

Re: [Numpy-discussion] ValueError: low level cast function is for unequal type numbers

2012-12-13 Thread Andrew Collette
Hi, the following code using np.object_ data types works with numpy 1.5.1 but fails with 1.6.2. Is this intended or a regression? Other data types, np.float64 for example, seem to work. I am also seeing this problem; there was a change to how string types are handled in h5py 2.1.0 which

[Numpy-discussion] Attaching metadata to dtypes: what's the best way?

2012-12-13 Thread Andrew Collette
in NumPy. Is there a better way to add metadata to dtypes I'm not aware of? Note I'm *not* interested in creating a custom type; one of the advantages of the current system is that people deal with the resulting O object arrays like any other object array in NumPy. Andrew Collette

[Numpy-discussion] ANN: HDF5 for Python (h5py) 2.1.0-final

2012-10-04 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.1.0 We are proud to announce the availability of HDF5 for Python (h5py) 2.1.0! This release has been a long time coming. Thanks to everyone who contributed code and filed bug reports! What's new in h5py 2.1

Re: [Numpy-discussion] ANN: HDF5 for Python 1.1

2009-02-10 Thread Andrew Collette
. :) Andrew Collette On Mon, Feb 9, 2009 at 10:30 PM, Stephen Simmons m...@stevesimmons.com wrote: Hi Andrew, Do you have any plans to support LZO compression in h5py? I have lots of LZO-compressed datasets created with PyTables. There's a real barrier to using both h5py and PyTables if the fast

[Numpy-discussion] ANN: HDF5 for Python 1.1

2009-02-09 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 1.1 What is h5py? HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific

Re: [Numpy-discussion] ANN: HDF5 for Python 1.1

2009-02-09 Thread Andrew Collette
Thanks, Ondrej. For the record, h5py is designed to provide a NumPy-like interface to HDF5, along with a near-complete wrapping of the low-level HDF5 C API. It has none of the database-like features of PyTables. The FAQ entry has more info. Andrew Collette On Mon, Feb 9, 2009 at 1:06 PM

[Numpy-discussion] Array dtype problems

2009-02-04 Thread Andrew Collette
round trip; I can't even use astype as it complains about a shape mismatch. How can I create an array in a manner that preserves the dtype array information? Or is there a way to take an existing array of the correct shape and re-cast it to use an array dtype? Thanks, Andrew Collette

Re: [Numpy-discussion] ANN: Numexpr 1.1, an efficient array evaluator

2009-01-21 Thread Andrew Collette
Hi, I get identical results for both shapes now; I manually removed the numexpr-1.1.1.dev-py2.5-linux-i686.egg folder in site-packages and reinstalled. I suppose there must have been a stale set of files somewhere. Andrew Collette On Wed, Jan 21, 2009 at 3:41 AM, Francesc Alted fal

Re: [Numpy-discussion] ANN: Numexpr 1.1, an efficient array evaluator

2009-01-20 Thread Andrew Collette
...@pytables.org wrote: On Tuesday 20 January 2009, Andrew Collette wrote: Hi Francesc, Looks like a cool project! However, I'm not able to achieve the advertised speed-ups. I wrote a simple script to try three approaches to this kind of problem: 1) Native Python code (i.e. will try to do

Re: [Numpy-discussion] ANN: Numexpr 1.1, an efficient array evaluator

2009-01-19 Thread Andrew Collette
: 0.206279683113 Numexpr: 0.210431909561 Chunked: 0.182894086838 FYI, the current tar file (1.1-1) has a glitch related to the VERSION file; I added to the bug report at google code. Andrew Collette On Fri, Jan 16, 2009 at 4:00 AM, Francesc Alted fal...@pytables.org wrote

Re: [Numpy-discussion] checksum on numpy float array

2008-12-05 Thread Andrew Collette
#h5py.highlevel.Group.create_dataset Like other checksums, fletcher32 provides error-detection but not error-correction. You'll still need to throw away data which can't be read. However, I believe that you can still read sections of the dataset which aren't corrupted. Andrew Collette
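To illustrate detection without correction, here is a minimal sketch of the textbook Fletcher-32 checksum over little-endian 16-bit words. It is an assumption-laden teaching version: HDF5's own filter may differ in details such as byte order and padding, and the function name is only illustrative.

```python
def fletcher32(data: bytes) -> int:
    """Textbook Fletcher-32 over little-endian 16-bit words.

    Odd-length input is padded with one zero byte. This is a sketch,
    not HDF5's filter implementation.
    """
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


original = b"abcde"
corrupted = b"abcdf"    # last byte changed

stored_checksum = fletcher32(original)
# A mismatch says the block is damaged, but not which byte or how to fix it.
print(hex(stored_checksum), fletcher32(corrupted) != stored_checksum)
```

The checksum only answers "does this block match what was written?", so a reader can reject a damaged block but has nothing to rebuild it from.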

Re: [Numpy-discussion] ANN: HDF5 for Python 1.0

2008-12-02 Thread Andrew Collette
Just FYI, the Windows installer for 1.0 is now posted at h5py.googlecode.com after undergoing some final testing. Thanks for trying 0.3.0... too bad about matlab. Andrew On Mon, 2008-12-01 at 21:53 -0500, [EMAIL PROTECTED] wrote: Requires * UNIX-like platform (Linux or Mac OS-X);

Re: [Numpy-discussion] ANN: HDF5 for Python 1.0

2008-12-02 Thread Andrew Collette
lzo compression support, in addition to gzip? This is a feature I used a lot in PyTables. Andrew Collette wrote: Announcing HDF5 for Python (h5py) 1.0 What is h5py? HDF5 for Python (h5py

[Numpy-discussion] ANN: HDF5 for Python 1.0

2008-12-01 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 1.0 What is h5py? HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific