erratic-pattern commented on issue #5731:
URL: https://github.com/apache/arrow-rs/issues/5731#issuecomment-2123132350
Thank you @tustvold for the additional technical context on why `Any` is
useful here. I had forgotten that you can't catch decoding errors as a form of
message versioning, due to the possibility of false positives when decoding. I
think that puts me in favor of `Any` wrapping here.
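To make the false-positive point concrete, here is a minimal sketch using a hand-rolled subset of the protobuf wire format (length-delimited fields only). The message shapes `TicketV1 { string table = 1; }` and `QueryV2 { bytes plan = 1; }`, the `type.example/TicketV1` type URL, and all helper names are hypothetical and are not taken from the Flight SQL spec or any implementation. The `Any` layout (`type_url = 1`, `value = 2`) follows the standard `google.protobuf.Any` definition.

```python
from __future__ import annotations


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)  # more bytes follow
        else:
            out.append(b)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def encode_len_field(field_no: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    return encode_varint((field_no << 3) | 2) + encode_varint(len(payload)) + payload


def decode_len_fields(buf: bytes) -> dict[int, bytes]:
    """Decode a message consisting only of length-delimited fields."""
    fields: dict[int, bytes] = {}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        field_no, wire_type = tag >> 3, tag & 7
        if wire_type != 2:
            raise ValueError(f"unsupported wire type {wire_type}")
        length, pos = decode_varint(buf, pos)
        if pos + length > len(buf):
            raise ValueError("truncated field")
        fields[field_no] = buf[pos:pos + length]
        pos += length
    return fields


def unwrap_any(buf: bytes) -> tuple[str, bytes]:
    """Split an Any-style wrapper into (type_url, inner message bytes)."""
    fields = decode_len_fields(buf)
    return fields[1].decode("utf-8"), fields[2]


# Bytes produced by a sender that meant hypothetical TicketV1 { table = "orders" }.
raw = encode_len_field(1, b"orders")

# A receiver expecting hypothetical QueryV2 { bytes plan = 1; } decodes the same
# bytes without any error, so "did decoding fail?" says nothing about the type.
as_query_v2 = decode_len_fields(raw)

# Wrapping in Any carries the type explicitly alongside the payload.
wrapped = encode_len_field(1, b"type.example/TicketV1") + encode_len_field(2, raw)
type_url, inner = unwrap_any(wrapped)
```

In this sketch both schemas accept `raw`, because a string and a bytes field with the same field number are indistinguishable on the wire. Only the wrapped form lets a receiver dispatch on `type_url` before it touches the payload.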
The more important part in my mind is that the behavior is consistent and not
arbitrary per payload type. I don't think the specification should ever allow
per-payload inconsistency, temporarily or long term.
@matthewmturner:
> That being said, I think we need to distinguish what should be done now to
> get the different implementations in a state where they are compatible with
> each other (or explicitly deciding we are okay with a transitory state of being
> incompatible) and what the next steps are to update the spec to have better
> defined behavior with regards to Any / bytes encoding - which will require work
> from the whole arrow community.
> While we could hold off on making changes until there is agreement on the
> spec changes I think that leaves us in a quite bad spot as the current
> implementations are not compatible.
I am not sure whether or not we need a stopgap measure such as this. If the
change is as simple as updating the Go implementation and clarifying the
documentation in the spec, then what you're suggesting would actually be more
work than moving forward with the more ideal change. We would need to break the
conventions used by all other implementations, add a lot of temporary
documentation to the spec, and then remove it all in the "final" iteration.
I will spend some time looking across language implementations to see what
the predominant convention currently is, and that will give us a better idea
of the scope of the change that needs to be made here.
> My preference is to interpret the spec literally (i.e. Any is allowed
> where it is specified) and have all implementations align on this (even if it
> is not perfect). Then, update the spec to be explicit and the implementations
> can be updated to stay in conformance with that (at least we will be consistent
> with regards to following the spec to the letter). I dont think there should be
> room for implementations to make any assumptions about how to handle the spec
> (maybe thats unreasonable, im not sure).
I think the problem is that the spec is ambiguous, or simply silent, on what to
do here, which is why we have ended up with divergent
implementations. I think enforcing a literal interpretation of the spec, even
temporarily, could potentially do more harm than good, especially if there is a
predominant implied convention being used across implementations. In an ideal
world the spec would be the definitive source of truth, but that is rarely the
case. I think it is better to update the spec to document such implied
conventions as they are discovered. [The purpose of a system is what it
does](https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_what_it_does).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.