Hi all,

Sorry that this will again be a bit complicated :(. In brief:
* I would like to pass around scalars in some (partially new) C-API
  to implement value-based promotion.
* There are some subtle commutativity issues with promotion.
  Commutativity may change in that case (with respect to value-based
  promotion, normally probably for the better). [0]

In the past days, I have been looking into implementing value-based
promotion in the way I had previously done it for the prototype. The
idea was that NEP 42 allows for creating DTypes dynamically, which
does allow very powerful value-based promotion/casting. But I decided
there are too many quirks with creating type instances dynamically
(potentially very often) just to pass around one additional piece of
information.

That approach was far more powerful, but it is power and complexity
that we do not require, given that:

* Value-based promotion is only used for a mix of scalars and arrays
  (where "scalar" is annoyingly defined as 0-D at the moment).
* I assume it is only relevant for `np.result_type` and promotion in
  ufuncs (which often uses `np.result_type`). `np.can_cast` has such
  behaviour as well, but I think that case is easier [1]. We could
  implement more powerful "value based" logic, but I doubt it is
  worthwhile.
* This is already stretching the Python C-API beyond its limits.

So I will suggest this instead, which *must* modify some (poorly
defined) current behaviour:

1. We always evaluate concrete DTypes first in promotion. This means
   that in rare cases the non-commutativity of promotion may change
   the result dtype:

       np.result_type(-1, 2**16, np.float32)

   The same can also happen when you reorder the normal dtypes:

       np.result_type(np.int8, np.uint16, np.float32)
       np.result_type(np.float32, np.int8, np.uint16)

   In both cases the `np.float32` is moved to the front.

2. If we reorder the operation as above, we can define that we never
   promote two "scalar values". Instead, we convert both to a
   concrete dtype first. This makes it effectively like:

       np.result_type(np.array(-1).dtype, np.array(2**16).dtype)

   This means that we never have to deal with promoting two values.
   (A minimal sketch of this reduction order follows after this
   list.)

3. We need additional private API (we were always going to need some
   additional API); that API could become public:

   * Convert a single value into a concrete dtype. You could say this
     is the same as `self.common_dtype(None)`, but a dedicated
     function seems simpler. A dtype like this will never use
     `common_dtype()`.
   * `common_dtype_with_scalar(self, other, scalar)` (note that only
     one of the DTypes can have a scalar). As a fallback, this
     function can be implemented by converting to the concrete DType
     and retrying with the normal `common_dtype` (a sketch of this
     fallback also follows below).

   (At least the second slot must be made public if we are to allow
   value-based promotion for user DTypes. I expect we will, but it is
   not particularly important to me right now.)

4. Our public API (including new C-API) has to expose and take the
   scalar values. That means promotion in ufuncs will get DTypes and
   `scalar_values`, although those should normally be `NULL` (or
   None). In a future Python API, this is probably acceptable:

       np.result_type(*[t if v is None else v
                        for t, v in zip(dtypes, scalar_values)])

   In C, we need to expose a function below `result_type` which
   accepts both the scalar values and the DTypes explicitly.

5. For the future: As said many times, I would like to deprecate
   using value-based promotion for anything except Python core types.
   That just seems wrong and confusing. My only problem is that,
   while I can warn (possibly sometimes too often) when behaviour
   will change, I do not have a good idea about silencing that
   warning.
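To make points 1 and 2 concrete, here is a minimal, self-contained
Python sketch of the proposed reduction order. It is only my
illustration, not a committed implementation: the function name is
hypothetical, `np.asarray(v).dtype` stands in for the proposed
value-to-concrete-dtype slot, and `np.promote_types`/`np.result_type`
stand in for the ordinary and value-based promotion steps:

    import numpy as np

    def result_type_sketch(concrete_dtypes, scalar_values):
        # Sketch only: hypothetical helper, not existing API.
        if not concrete_dtypes:
            # Only scalar values: convert each to its concrete dtype
            # up front, so two "scalar values" are never promoted
            # with each other (point 2).
            concrete_dtypes = [np.asarray(v).dtype for v in scalar_values]
            scalar_values = []

        # Point 1: concrete DTypes are always evaluated first ...
        dtype = concrete_dtypes[0]
        for other in concrete_dtypes[1:]:
            dtype = np.promote_types(dtype, other)

        # ... and the scalar values are folded in afterwards, one at
        # a time (the value-based step).
        for value in scalar_values:
            dtype = np.result_type(dtype, value)
        return dtype

For example, `result_type_sketch([np.dtype(np.float32)], [-1, 2**16])`
promotes `np.float32` first, regardless of the order in which the
arguments were originally given.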
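For the fallback in point 3, a rough sketch (again, the stand-in
functions are mine and not existing API):

    import numpy as np

    def common_dtype_with_scalar(self_dtype, other_dtype, scalar):
        # Fallback sketch for the proposed slot; `scalar` belongs to
        # `other_dtype`.  Convert the value to a concrete dtype (the
        # first proposed slot; np.asarray stands in for it, so
        # other_dtype is rediscovered rather than used directly) and
        # retry the normal common-dtype step (np.promote_types stands
        # in for `common_dtype`).
        concrete = np.asarray(scalar).dtype
        return np.promote_types(self_dtype, concrete)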
Note that this affects NEP 42 (a little bit). NEP 42 currently makes
a nod towards the dynamic type creation, but falls short of actually
defining it. So these rules have to be incorporated, but IMO they do
not affect the general design choices in the NEP.

There is probably even more complexity to be found here, but for now
the above seems at least good enough to make headway...

Any thoughts, or remaining clarity that I can try to confuse? :)

Cheers,

Sebastian

[0] We could use the reordering trick also for concrete DTypes,
although that would require introducing some kind of priority... I do
not like that much as public API, but it might be something to look
at internally, or for types deriving from the builtin abstract
DTypes:

    * inexact
    * other

Just evaluating all `inexact` first would probably solve our
commutativity issues.

[1] NumPy uses `np.can_cast(value, dtype)` as well. For example:

    np.can_cast(np.array(1., dtype=np.float64), np.float32,
                casting="safe")

returns True. My working hypothesis is that `np.can_cast` as above is
just a side battle. I.e. we can either:

* Flip the switch on it (can-cast does no value-based logic; even
  though we use it internally, we do not need it there).
* Or, implement those cases of `np.can_cast` by using promotion.

The first one is tempting, but I assume we should go with the second,
since it preserves behaviour and is slightly more powerful. A rough
sketch of the second option follows below.
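Sketching those `np.can_cast` cases via promotion (my illustration
under the current value-based rules; the helper name is
hypothetical):

    import numpy as np

    def can_cast_value(value, to_dtype):
        # A value-based "safe" cast succeeds exactly when promoting
        # the value with the target dtype gives back the target
        # dtype unchanged.
        return np.result_type(value, to_dtype) == to_dtype

    # E.g. can_cast_value(np.array(1., dtype=np.float64), np.float32)
    # gives True, matching the example above.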