jorisvandenbossche commented on issue #35584: URL: https://github.com/apache/arrow/issues/35584#issuecomment-1562653549
Putting some responses to the comments at https://github.com/apache/arrow/issues/35563#issuecomment-1553502384 and below here:

@zfoobar I also answered this just above: I think the "design decision" here was just to be consistent with our C++ casting behaviour (i.e. the behaviour discussed in https://github.com/apache/arrow/issues/35563).

> Nowadays, I would say it's primarily a legacy implementation that eventually should be rewritten to take advantage of the compute kernels in C++ if possible.

The python_to_arrow.cc code is certainly not legacy code: it's the actual interface between Python land and Arrow (C++) land, and that's something we will always need to implement in pyarrow to be able to convert builtin Python objects to Arrow data. The C++ compute kernels don't deal with Python objects.

We can of course try to reuse as much as possible of those (cast) kernels in the python->arrow implementation (e.g. for `IntegerScalarToDoubleSafe` we could also extract the integer, convert it to an Arrow integer scalar, and then call the int->float cast kernel, so that we reuse the `safe` logic of the cast kernel). But that might be quite a big refactor, and we would also need to evaluate its performance impact.
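To make the idea of a "safe" integer-to-double conversion concrete, here is a minimal pure-Python sketch. It is not pyarrow's actual code: the names `integer_to_double_safe` and `convert_sequence` are hypothetical, and the sketch assumes that "safe" means the integer must be exactly representable as an IEEE 754 double (i.e. within ±2**53). In the refactor described above, a check of this kind would be delegated to the int->float cast kernel rather than living in the Python conversion layer.

```python
# Illustrative sketch only: a pure-Python stand-in, not pyarrow's implementation.

# Assumption: "safe" means the integer is exactly representable as a float64,
# which holds for every integer in the closed range [-2**53, 2**53].
DOUBLE_EXACT_LIMIT = 2**53


def integer_to_double_safe(value: int) -> float:
    """Hypothetical analogue of a 'safe' Python int -> float64 conversion."""
    if not -DOUBLE_EXACT_LIMIT <= value <= DOUBLE_EXACT_LIMIT:
        raise ValueError(
            f"Integer value {value} is outside of the range exactly "
            "representable by an IEEE 754 double precision value"
        )
    return float(value)


def convert_sequence(values):
    """Hypothetical per-element loop over builtin Python objects.

    None is passed through to stand in for a null entry.
    """
    return [None if v is None else integer_to_double_safe(v) for v in values]
```

For example, `convert_sequence([1, None, 2])` yields floats with the null preserved, while `integer_to_double_safe(2**53 + 1)` raises instead of silently rounding.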
