> > >> Agreed on float since it seems to trivially map to a double, but
> > >> I’m torn on int still. While I do want int type hints to work, it
> > >> doesn’t seem appropriate to map it to AtomicType.INT64, since it
> > >> has a completely different range of values.
> >>
> >> Let’s say we used native int for the runtime field type, not just as
> >> a schema declaration for numpy.int64. What is the real world fallout
> >> from this? Would there be data loss?
> >
> > I'm not sure I follow the question exactly, what is the interplay
> > between int and numpy.int64 in this scenario? Are you saying that
> > np.int64 is used in the schema declaration, but we just use native int
> > at runtime, and check the bit width when encoding?
> >
> > In any case, I don't think the real world fallout of using int is
> > nearly that dire. I suppose data loss is possible if a poorly designed
> > pipeline overflows an int64 and crashes,
>
> The primary risk is that it *won't* crash when overflowing an int64,
> it'll just silently give the wrong answer. That's much less safe than
> using a native int and then actually crashing in the case it's too
> large at the point one tries to encode it.
If the behavior of numpy.int64 is less safe than int, and both support
64-bit integers, and int is the more intuitive type to use, then that
seems to make a strong case for using int rather than numpy.int64.

-chad
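
P.S. For anyone who wants to see the difference concretely, here is a
rough standalone sketch of the two failure modes discussed above. It is
not SDK code; struct.pack is just a stand-in for whatever a row coder
would do with an int64 field at encoding time.

    # Illustration only: numpy.int64 wraps on overflow, native int does not.
    import struct

    import numpy as np

    big = 2 ** 62

    # numpy.int64 arithmetic wraps around on overflow. Depending on the
    # numpy version this may emit a RuntimeWarning, but it still returns
    # a wrong (wrapped) value rather than raising.
    wrapped = np.int64(big) + np.int64(big)
    print(wrapped)  # -9223372036854775808

    # Native int keeps the mathematically correct value, and only fails
    # loudly when it is packed into a 64-bit slot (struct.pack stands in
    # for the encoding step).
    correct = big + big  # 9223372036854775808, out of int64 range
    try:
        struct.pack("<q", correct)
    except struct.error as exc:
        print("encoding failed:", exc)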