[ https://issues.apache.org/jira/browse/ARROW-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743443#comment-16743443 ]

David Li commented on ARROW-4213:
---------------------------------
Here's another incompatibility, this time with a C++ server and Java client. (I
discovered this in the process of putting together rough integration tests for
Flight.)

In C++, the serializer for IpcPayload always writes the body tag
(flight/server.cc:143), even when the message has no body. On the Java side,
this causes ArrowMessage to read the field and create an empty ArrowBuf object.
However, when ArrowMessage.asSchema is later called, an exception is raised,
because Java asserts that an ArrowMessage containing a schema has no body
objects. gRPC silently swallows the exception, causing the client to hang.

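
To illustrate the wire-level effect described above, here is a minimal sketch
of protobuf-style length-delimited encoding. It is not the code in
flight/server.cc. The field numbers and the function names
(EncodeAlwaysWithBody, HasBodyField) are assumptions made up for this example.
It shows that a zero-length body still puts a tag and a length on the wire, so
a reader sees a body field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Field numbers are illustrative assumptions, not read from Flight.proto.
constexpr uint32_t kHeaderField = 2;
constexpr uint32_t kBodyField = 1000;
constexpr uint32_t kLengthDelimited = 2;  // protobuf wire type for bytes

// Append `value` as a base-128 varint.
void WriteVarint(uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Append a length-delimited field: tag, length, then the raw bytes.
void WriteBytesField(uint32_t field, const std::string& data,
                     std::string* out) {
  WriteVarint((static_cast<uint64_t>(field) << 3) | kLengthDelimited, out);
  WriteVarint(data.size(), out);
  out->append(data);
}

// Read a varint starting at *pos and advance *pos past it.
uint64_t ReadVarint(const std::string& in, size_t* pos) {
  uint64_t result = 0;
  int shift = 0;
  while (*pos < in.size()) {
    assert(shift < 64);  // a 64-bit varint never needs a larger shift
    const uint8_t byte = static_cast<uint8_t>(in[(*pos)++]);
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) break;
    shift += 7;
  }
  return result;
}

// True if the message contains a body field at all, even an empty one.
// Simplification: assumes every field in `in` is length-delimited.
bool HasBodyField(const std::string& in) {
  size_t pos = 0;
  while (pos < in.size()) {
    const uint64_t tag = ReadVarint(in, &pos);
    const uint64_t len = ReadVarint(in, &pos);
    if ((tag >> 3) == kBodyField) return true;
    pos += len;  // skip this field's payload
  }
  return false;
}

// Mirrors the behavior described above: the body field is always written.
std::string EncodeAlwaysWithBody(const std::string& header,
                                 const std::string& body) {
  std::string out;
  WriteBytesField(kHeaderField, header, &out);
  WriteBytesField(kBodyField, body, &out);
  return out;
}
```

With an empty body, HasBodyField(EncodeAlwaysWithBody(header, "")) is still
true, which is the situation the Java reader ends up in.
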
In Java, ArrowMessage.asInputStream explicitly checks if the message represents
a schema, and if so, uses a different code path that does not write a body at
all. I think C++ should have a similar check, and perhaps C++ should also make
the same assertion during deserialization.

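
A rough sketch of what such a check could look like follows. It is not a patch
against Arrow: Payload, MessageKind, Serialize, and ValidateDecoded are
hypothetical stand-ins for the real types, and the field numbers are
assumptions. The serializer skips the body field for schema messages, and the
receiving side rejects a schema message that arrives with a body field.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; the real IpcPayload and ArrowMessage differ.
enum class MessageKind { kSchema, kRecordBatch };

struct Payload {
  MessageKind kind;
  std::string header;  // serialized message metadata
  std::string body;    // expected to be empty for schema messages
};

// Field numbers are illustrative assumptions, not read from Flight.proto.
constexpr uint32_t kHeaderField = 2;
constexpr uint32_t kBodyField = 1000;
constexpr uint32_t kLengthDelimited = 2;  // protobuf wire type for bytes

// Append `value` as a base-128 varint.
void WriteVarint(uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Append a length-delimited field: tag, length, then the raw bytes.
void WriteBytesField(uint32_t field, const std::string& data,
                     std::string* out) {
  WriteVarint((static_cast<uint64_t>(field) << 3) | kLengthDelimited, out);
  WriteVarint(data.size(), out);
  out->append(data);
}

// Serializer with the proposed check: no body field for schema messages.
std::string Serialize(const Payload& p) {
  assert(p.kind != MessageKind::kSchema || p.body.empty());
  std::string out;
  WriteBytesField(kHeaderField, p.header, &out);
  if (p.kind != MessageKind::kSchema) {
    WriteBytesField(kBodyField, p.body, &out);
  }
  return out;
}

// Deserialization-side assertion: a schema message must not carry a body.
void ValidateDecoded(MessageKind kind, bool saw_body_field) {
  if (kind == MessageKind::kSchema && saw_body_field) {
    throw std::invalid_argument("schema message must not carry a body");
  }
}
```

Throwing from ValidateDecoded is only one option. A real implementation would
more likely return an error status, so that the failure reaches the caller
instead of being swallowed.
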
> [Flight] C++ and Java implementations are incompatible
> ------------------------------------------------------
>
> Key: ARROW-4213
> URL: https://issues.apache.org/jira/browse/ARROW-4213
> Project: Apache Arrow
> Issue Type: Bug
> Components: FlightRPC
> Reporter: David Li
> Priority: Major
> Labels: flight
> Fix For: 0.13.0
>
>
> A C++ client cannot request streams from a Java service, nor can it decode
> the schema from GetFlightInfo.
> Schema: in Java, GetFlightInfo encodes the schema directly via flatbuffers.
> C++ expects it to be encoded as an IPC message. This isn't a problem in Java
> as a method exists to decode such schemas, but in C++ the API for reading
> such a schema isn't really exposed. I'm willing to submit a patch for this,
> but it's not clear to me which scheme is preferred.
> Streams: in Java, DoGet starts with an ArrowMessage containing a schema. C++
> does not expect this and segfaults when it tries to decode the message as a
> record batch. Based on the presentations I've seen, I think C++ is in the
> wrong here; I have a patch to fix this that I could clean up and submit.