> > >> Agreed on float since it seems to trivially map to a double, but
> > >> I’m torn on int still. While I do want int type hints to work, it
> > >> doesn’t seem appropriate to map it to AtomicType.INT64, since it
> > >> has a completely different range of values.
> >>
> >> Let’s say we used native int for the runtime field type, not just as
> >> a schema declaration for numpy.int64. What is the real world fallout
> >> from this? Would there be data loss?
> >
> > I'm not sure I follow the question exactly, what is the interplay
> > between int and numpy.int64 in this scenario? Are you saying that
> > np.int64 is used in the schema declaration, but we just use native int
> > at runtime, and check the bit width when encoding?
> >
> > In any case, I don't think the real world fallout of using int is
> > nearly that dire. I suppose data loss is possible if a poorly designed
> > pipeline overflows an int64 and crashes,
>
> The primary risk is that it *won't* crash when overflowing an int64,
> it'll just silently give the wrong answer. That's much less safe than
> using a native int and then actually crashing in the case it's too
> large at the point one tries to encode it.
If the behavior of numpy.int64 is less safe than int, and both support
64-bit integers, and int is the more intuitive type to use, then that
seems to make a strong case for using int rather than numpy.int64.

-chad
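
P.S. For anyone who wants to see the difference concretely, here is a
rough standalone sketch of the two failure modes discussed above. It is
not SDK code; struct.pack is just a stand-in for whatever a row coder
would do with an int64 field at encoding time.

    # Illustration only: numpy.int64 wraps on overflow, native int does not.
    import struct

    import numpy as np

    big = 2 ** 62

    # numpy.int64 arithmetic wraps around on overflow. Depending on the
    # numpy version this may emit a RuntimeWarning, but it still returns
    # a wrong (wrapped) value rather than raising.
    wrapped = np.int64(big) + np.int64(big)
    print(wrapped)  # -9223372036854775808

    # Native int keeps the mathematically correct value, and only fails
    # loudly when it is packed into a 64-bit slot (struct.pack stands in
    # for the encoding step).
    correct = big + big  # 9223372036854775808, out of int64 range
    try:
        struct.pack("<q", correct)
    except struct.error as exc:
        print("encoding failed:", exc)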