jorisvandenbossche commented on issue #38325: URL: https://github.com/apache/arrow/issues/38325#issuecomment-2021536856
> Assuming we make copies implicit via `__arrow_c_array__`, this makes the only copy we'll support handling is device --> CPU for now. What would addressing along more generic copy requests later look like?

I assume we could have something like `obj.__arrow_device_array__(requested_device=kCPU)`, and then it is up to the producer to see if they can provide the data on that device, and if not, either raise an error or return the data on its native device (depending on whether the requested device should be followed strictly; for `requested_schema` we decided this was only best effort).

> So perhaps we're overthinking this, and producers of non-CPU data should simply implement the C Data Interface protocol with implicit cross-device copies.

That means that passing such a non-CPU object, like a cudf DataFrame, to an interface that can consume data through this protocol (e.g. pandas or polars constructors, a duckdb query with an implicit variable, ...) would automatically do a potentially costly device copy of the full data structure. I am a bit hesitant to enable that implicitly; it might be unexpected in some cases (although maybe also convenient).

> If so, this begs the question: should there be a more robust mechanism to add optional arguments after some producer implementations have already been published?

I assume the simple but verbose way to do this is to put the onus on the _consumer_: if they want to use a newer keyword, they need to do that in a try/except, falling back on the version without the keyword, such that it works both for producers that support the new keyword and for those that do not yet. (We could in theory already add a catch-all `**kwargs` to the protocol methods, but that would then silently ignore new keywords if not yet supported by a certain producer, so I'm not sure that is better than raising an error.)
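
To make the consumer-side fallback concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the three producer classes, the `K_CPU` placeholder, the dict return values (real producers would return PyCapsules), and the `export_on_cpu` helper are made up for illustration, and `requested_device` is only the keyword floated above, not an agreed part of the protocol.

```python
K_CPU = 1  # placeholder standing in for the kCPU device type (assumption)


class OldProducer:
    """Hypothetical producer published before the new keyword existed."""

    def __arrow_device_array__(self, requested_schema=None):
        return {"device": "cuda", "honored_request": False}


class NewProducer:
    """Hypothetical producer that understands `requested_device`."""

    def __arrow_device_array__(self, requested_schema=None, requested_device=None):
        device = "cpu" if requested_device == K_CPU else "cuda"
        return {"device": device, "honored_request": requested_device is not None}


class KwargsProducer:
    """Hypothetical producer with a catch-all that silently drops unknown keywords."""

    def __arrow_device_array__(self, requested_schema=None, **kwargs):
        return {"device": "cuda", "honored_request": False}


def export_on_cpu(obj):
    """Try the newer keyword first; retry without it if the producer rejects it.

    Returns (exported_data, keyword_accepted).
    """
    try:
        return obj.__arrow_device_array__(requested_device=K_CPU), True
    except TypeError:
        # Producer predates the keyword: fall back to the original signature.
        return obj.__arrow_device_array__(), False


results = {
    type(p).__name__: export_on_cpu(p)
    for p in (OldProducer(), NewProducer(), KwargsProducer())
}
for name, (data, accepted) in results.items():
    print(name, data, "keyword accepted:", accepted)
```

With `KwargsProducer` the call succeeds, so the consumer believes the keyword was accepted even though the data is still on the original device; that is the silent-ignore concern with `**kwargs`. One caveat of the try/except approach is that a `TypeError` raised inside the producer for an unrelated reason would also trigger the fallback.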
