From the user's perspective I find the following pretty confusing:

In [1]: np.array([-128, 127], dtype=np.int8) * 2
Out[1]: array([ 0, -2], dtype=int8)

In [2]: np.array([-128, 127], dtype=np.int16) * 2
Out[2]: array([-256,  254], dtype=int16)

In my opinion, somewhere (perhaps at a higher level) we should
implicitly promote to a wider type so that the mathematically correct
results are returned.

ClickHouse, for example, performs this kind of type promotion.
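To illustrate, such implicit promotion could be sketched on top of NumPy like this (`multiply_promoted` is a hypothetical helper, not an existing API; the widening table only covers the signed types used here):

```python
import numpy as np

# Hypothetical mapping from each signed integer dtype to the next
# wider one, so a multiply can never wrap around.
_WIDER = {
    np.dtype(np.int8): np.int16,
    np.dtype(np.int16): np.int32,
    np.dtype(np.int32): np.int64,
}

def multiply_promoted(arr, scalar):
    # Cast to the wider type first, then multiply; the int8 example
    # above then yields [-256, 254] as int16 instead of [0, -2].
    return arr.astype(_WIDER[arr.dtype]) * scalar

result = multiply_promoted(np.array([-128, 127], dtype=np.int8), 2)
print(result, result.dtype)  # values [-256, 254], dtype int16
```

Note that this pushes the problem one level up: int64 inputs have no wider integer type to promote into, which is exactly the inconsistency Ben points out below.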

On Wed, Jun 3, 2020 at 5:29 PM Antoine Pitrou <[email protected]> wrote:
>
> On Wed, 3 Jun 2020 10:47:38 -0400
> Ben Kietzman <[email protected]> wrote:
> > https://github.com/apache/arrow/pull/7341#issuecomment-638241193
> >
> > How should arithmetic kernels handle integer overflow?
> >
> > The approach currently taken in the linked PR is to promote such that
> > overflow will not occur, for example `(int8, int8)->int16` and `(uint16,
> > uint16)->uint32`.
> >
> > I'm not sure that's desirable. For one thing, this leads to inconsistent
> > handling of 64 bit integer types, which are currently allowed to overflow
> > since we cannot promote further (NB: that means this kernel includes
> > undefined behavior for int64).
>
> I agree with you.  I would strongly advise against implicit promotion
> across arithmetic operations.  We initially did that in Numba and it
> quickly became a can of worms.
>
> The most desirable behaviour IMHO is to keep the original type, so:
> - (int8, int8) -> int8
> - (uint16, uint16) -> uint16
>
> Then the question is what happens when the actual overflow occurs.  I
> think this should be directed by a kernel option.  By default an error
> should probably be raised (letting errors pass and silently produce
> erroneous data is wrong), but we might want to allow people to bypass
> overflow checks for speed.
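The kernel option Antoine describes could look roughly like this (a minimal NumPy-based sketch; `checked_multiply` and its flag name are illustrative, not Arrow's actual API):

```python
import numpy as np

def checked_multiply(a, b, check_overflow=True):
    # Keep the input type; by default raise on overflow instead of
    # silently wrapping, with an opt-out for speed.
    result = a * b  # fixed-width integer multiply wraps on overflow
    if check_overflow:
        # Detect wrap-around by recomputing in a wider type and
        # comparing elementwise.
        if not np.array_equal(result, a.astype(np.int64) * b):
            raise OverflowError("integer overflow in multiply")
    return result
```

With this, `checked_multiply(np.array([-128, 127], dtype=np.int8), 2)` raises, while passing `check_overflow=False` returns the wrapped `[0, -2]`.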
>
> Note that even if overflow detection is enabled, it *should* be possible
> to enable vectorization, e.g. by making overflow detection a separate
> pass (itself vectorizable).
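A separate, vectorizable detection pass could avoid widening entirely by comparing against precomputed bounds (a sketch for multiplication by a positive scalar only; the helper name is made up):

```python
import numpy as np

def overflow_mask_mul(a, scalar):
    # Flags which elements of `a * scalar` would overflow a's dtype.
    # Purely elementwise comparisons, so it vectorizes well.
    # Assumes scalar > 0; a negative scalar needs different bounds.
    info = np.iinfo(a.dtype)
    return (a > info.max // scalar) | (a < info.min // scalar)
```

For int8 and scalar 2 the safe range is [-64, 63], so `overflow_mask_mul(np.array([-128, 63, 64], dtype=np.int8), 2)` flags the first and last elements. A kernel could run this pass first (or fused) and raise only if any element is flagged.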
>
> Regards
>
> Antoine.
>
>
