jorisvandenbossche commented on issue #38325:
URL: https://github.com/apache/arrow/issues/38325#issuecomment-2046859984
> It's moreso that we're talking about this topic because of the desire to
introduce something like a `requested_device` consumer provided parameter, so
it's not just a new struct, but a way for a consumer to provide information to
a producer in a standardized way at a C-API level as opposed to Python-API
level.
Sorry, I don't understand this paragraph. How is adding the
`requested_device` keyword or not (or ways to know if that keyword is
supported) something that plays out at the C-API level? The C API is the struct,
and there is nothing in there that allows passing options; that's all at the
Python level.
> @vyasr's proposal above is interesting, and we could possibly restrict it
even further in that we could document that all evolutions to the protocol that
introduce new parameters should have a default value of `None` that correspond
to the previous behavior. I.E. we could do something like:
I don't have a strong objection to adding that, but to be clear, this all still
means that the consumer has to do the work to check whether the keyword is
supported and whether it has been honored.
To use a concrete example, with the current proposal of no special
mechanism, and if we add a `requested_device` keyword in the future, the
consumer code could look like (for an example consumer that can only handle CPU
data):
```python
try:
    capsules = object.__arrow_c_device_array__(requested_device=kCPU)
except TypeError:
    capsules = object.__arrow_c_device_array__()
    # manually check the returned array capsule has a CPU device and
    # otherwise error
```
With the proposal above to add the `**kwargs` that raise if not None:
```python
try:
    capsules = object.__arrow_c_device_array__(requested_device=kCPU)
except NotImplementedError:
    capsules = object.__arrow_c_device_array__()
    # manually check the returned array capsule has a CPU device and
    # otherwise error
```
or if the `strict` keyword is used (and honored by the producer), this is a
bit simpler:
```python
capsules = object.__arrow_c_device_array__(
    requested_device=kCPU, strict=False
)
# manually check the returned array capsule has a CPU device and otherwise
# error
```
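To make the comparison between the first two variants concrete, here is a minimal, self-contained sketch. Everything in it is hypothetical and only for illustration: the `LegacyProducer` and `KwargsProducer` classes, the `export_for_cpu` helper, the tuple standing in for the real PyCapsules, and the `kCPU` value. It is not taken from any existing implementation.

```python
# Stand-in constant for illustration; a real consumer would take this from
# the device type enum of the Arrow C device data interface.
kCPU = 1


class LegacyProducer:
    """Hypothetical producer whose method accepts no keywords at all."""

    def __arrow_c_device_array__(self):
        # Plain strings stand in for the schema and array PyCapsules.
        return ("schema_capsule", "array_capsule")


class KwargsProducer:
    """Hypothetical producer following the `**kwargs` idea: raise on any
    keyword it does not know about, unless that keyword is None."""

    def __arrow_c_device_array__(self, **kwargs):
        unsupported = [name for name, value in kwargs.items() if value is not None]
        if unsupported:
            raise NotImplementedError(f"unsupported keywords: {unsupported}")
        return ("schema_capsule", "array_capsule")


def export_for_cpu(obj):
    """Hypothetical consumer helper.

    Returns (capsules, keyword_accepted). When the keyword was not accepted,
    the caller still has to check the device of the returned array itself.
    """
    try:
        return obj.__arrow_c_device_array__(requested_device=kCPU), True
    except (TypeError, NotImplementedError):
        # Either error type means "keyword not supported": retry without it.
        return obj.__arrow_c_device_array__(), False
```

With both producers the consumer ends up on the same fallback path; the only thing that differs is which exception triggered it. One caveat of the plain `TypeError` variant is that it can also swallow an unrelated `TypeError` raised inside the producer.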
> From a consumer perspective, passing any new parameter becomes purely
optional and they can handle unsupported parameters via `try` / `except`
handling.
Exactly the same is true if the specification says nothing explicit about
`**kwargs`; the only difference is the error type to catch.
> we could document that all evolutions to the protocol that introduce new
parameters should have a default value of `None` that correspond to the
previous behavior
I think that is something that we will do anyway in practice: if we add a
new keyword in the future, it should always have a default that matches the
previous behaviour, such that not specifying it preserves behaviour (otherwise
it would be a breaking change).
This is not exactly a "default value of `None`", but essentially comes down
to the same?
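As a rough sketch of what "a default that matches the previous behaviour" means, consider a producer before and after a keyword is added. The class names, the `NATIVE_DEVICE` value, and the tuple return value are all made up for illustration; a real producer would return PyCapsules and would actually move or reject data rather than just report a device.

```python
NATIVE_DEVICE = "cuda"  # hypothetical: the device this producer's data lives on


class ProducerBefore:
    """Hypothetical producer before the keyword exists."""

    def __arrow_c_device_array__(self):
        # Stand-in for (schema capsule, array capsule on NATIVE_DEVICE).
        return ("schema_capsule", NATIVE_DEVICE)


class ProducerAfter:
    """Same producer after adding the keyword with a None default."""

    def __arrow_c_device_array__(self, requested_device=None):
        # None means "no request": behave exactly as before the keyword existed.
        device = NATIVE_DEVICE if requested_device is None else requested_device
        return ("schema_capsule", device)
```

Existing consumers that call the method without arguments get the same result from both versions, which is what keeps the addition from being a breaking change.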
---
To summarize, if there is a strong desire to add the `**kwargs` and checking
of it, and potentially the `strict=False` option, to the specification, I can
certainly live with that.
But I do think that 1) it does not add that much compared to the base
situation of not specifying it (also in that case you can handle it with a
simple try/except), and 2) it adds some details to the specification that the
consumer wants to rely on and that producers can get wrong (i.e. the fact that
the producer will check the `kwargs`, the exact error type that the producer
will raise if they are present, that the producer will honor a `strict=False`
keyword). This is of course not hard to get right, but I am personally not sure
it is worth the small gain.