Looking at the published gRPC benchmark dashboard [1], streaming ping-pong throughput for gRPC in Python appears to be around 2k to 3k qps. That seems quite low compared to gRPC in other languages, e.g. 600k qps for Java and Go. Are we expected to run multiple sdk_worker processes to improve performance?
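
As a rough sanity check, the message rate can be related to the ~200KB/s figure reported further down the thread. The sketch below assumes one payload per streaming message and 1 KB = 1000 bytes. Both are assumptions, not something the benchmark or the Data API guarantees, and the function name is only illustrative.

```python
def implied_bytes_per_message(throughput_bytes_per_s: float,
                              messages_per_s: float) -> float:
    """Average payload size implied by a byte throughput and a message rate.

    Assumes every message carries one payload of roughly equal size
    (an assumption made for this estimate only).
    """
    return throughput_bytes_per_s / messages_per_s


# ~200KB/s observed between the runner and sdk_worker (1 KB = 1000 bytes assumed).
observed_bytes_per_s = 200 * 1000

# Message rates taken from the 2k to 3k qps range on the dashboard.
for qps in (2000, 3000):
    size = implied_bytes_per_message(observed_bytes_per_s, qps)
    print(f"{qps} msg/s -> about {size:.0f} bytes per message")
```

If the payloads per message are really that small, the per-message overhead rather than the byte volume would be the limit, and batching more data into each message might matter more than adding workers. This is only an estimate under the assumptions above.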

[1] https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5652536396611584&widget=713624174&container=1012810333&maximized

On Wed, Nov 7, 2018 at 11:14 AM Lukasz Cwik <[email protected]> wrote:

> gRPC folks provide a bunch of benchmarks for different scenarios:
> https://grpc.io/docs/guides/benchmarking.html
> You would be most interested in the streaming throughput benchmarks since
> the Data API is written on top of the gRPC streaming APIs.
>
> 200KB/s does seem pretty small. Have you captured any Python profiles[1]
> and looked at them?
>
> 1:
> https://lists.apache.org/thread.html/f8488faede96c65906216c6b4bc521385abeddc1578c99b85937d2f2@%3Cdev.beam.apache.org%3E
>
> On Wed, Nov 7, 2018 at 10:18 AM Hai Lu <[email protected]> wrote:
>
>> Hi,
>>
>> This is Hai from LinkedIn. I'm currently working on Portable API for
>> Samza Runner. I was able to make Python work with Samza container reading
>> from Kafka. However, I'm seeing severe performance issue with my set up,
>> achieving only ~200KB throughput between the Samza runner in the Java side
>> and the sdk_worker in the Python part.
>>
>> While I'm digging into this, I wonder whether some one has benchmarked
>> the data channel between Java and Python and had some results how much
>> throughput can be reached? Assuming single worker thread and single
>> JobBundleFactory.
>>
>> I might be missing some very basic and naive gRPC setting which leads to
>> this unsatisfactory results. So another question is whether are any good
>> articles or documentations about gRPC tuning dedicated to IPC?
>>
>> Thanks,
>> Hai
