This is getting rather off the original topic, so I changed the subject. This is the code in gRPC-Python, where incoming message data is copied into a Python bytearray: https://github.com/grpc/grpc/blob/b8b6df08ae6d9f60e1b282a659d26b8c340de5c9/src/python/grpcio/grpc/_cython/_cygrpc/operation.pyx.pxi#L165-L173
In fact, I think the `bytes(bytearray)` call at the end is an additional copy.

We do something similar in Flight-C++: https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/serialization-internal.cc#L105-L118

It's an open question whether we can get gRPC to avoid these copies.

Somewhat related, Flight-Java performance is hindered by this gRPC issue: https://github.com/grpc/grpc-java/issues/5433

Essentially, the backpressure signal in gRPC-Java is currently not related to actual network conditions at all. Alluxio implemented their own flow control for a 30% throughput improvement: https://github.com/Alluxio/alluxio/commit/6f02b41ea529b9f59c0c42de216f402b3b4c9882

Best,
David

On 7/29/19, Antoine Pitrou <anto...@python.org> wrote:
>
> On 29/07/2019 at 15:13, David Li wrote:
>> Ah, sorry, I was unclear - the performance issue is not with Flight at
>> all, but with putting Arrow over gRPC naively.
>>
>> At some point, we benchmarked gRPC-Python carrying Arrow data, and
>> found that it only achieved ~half the throughput of Flight-Python. So
>> implementing BigQuery-Flight would also avoid that performance
>> pitfall, assuming the client library for BigQuery-Arrow uses
>> gRPC-Python.
>>
>> The reason we found is that since gRPC technically does not require
>> Protobuf, it copies message payloads into a CPython bytestring, and
>> the Python code then turns around and hands that to Protobuf,
>> which then copies data into its data structures and gives it back to
>> Python
>
> gRPC shouldn't need to copy the payload into a CPython bytestring.
> Instead, it could instantiate a buffer-like Python object pointing to
> the original data. This is "easily" done in Cython, and gRPC-python
> already uses Cython:
> https://cython.readthedocs.io/en/latest/src/userguide/buffer.html
> https://docs.python.org/3/c-api/buffer.html
>
> Regards
>
> Antoine.
>
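
To make the difference concrete, here is a rough pure-Python sketch of the two approaches. It is not the gRPC-Python code: `assemble_message` and `view_message` are hypothetical names, and the real implementation works on gRPC byte buffers in Cython. The first function mimics the accumulate-then-convert pattern, which copies the payload twice; the second hands out a buffer-protocol view of existing memory, roughly the kind of object Antoine describes.

```python
def assemble_message(chunks):
    """Hypothetical stand-in for the copy pattern: two copies of the payload."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk      # copy 1: each chunk is copied into the bytearray
    return bytes(buf)     # copy 2: bytes() allocates a new immutable object


def view_message(buf):
    """Hypothetical zero-copy alternative: a view over the existing buffer."""
    return memoryview(buf)  # no payload copy; shares memory with buf


payload = bytearray(b"abcd")
copied = bytes(payload)        # independent copy of the data
view = view_message(payload)   # shares memory with payload

payload[0] = ord("z")          # mutate the original buffer in place
print(copied)                  # the copy is unaffected by the mutation
print(bytes(view))             # the view reflects the mutation
```

A consumer that accepts any buffer-like object could take the `memoryview` directly, which is where the copies would be saved.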