kou commented on PR #28: URL: https://github.com/apache/arrow-flight-sql-postgresql/pull/28#issuecomment-1463240883
Thanks for sharing your thoughts!

> Do you have a good idea of the current bottlenecks?

I haven't profiled yet, but I think that the current shared-memory-based data exchange isn't optimal. There are two processes involved in an Apache Arrow Flight SQL execution:

1. Executor: executes the given SQL and writes the result to shared memory in the Apache Arrow IPC streaming format
2. Server: accepts Apache Arrow Flight SQL clients, passes the given SQL to an executor, reads the resulting Apache Arrow IPC streaming format data from shared memory and returns the data to a client

If the result data is larger than the shared memory, they work as follows:

1. The executor writes as much data as possible to shared memory, signals the server and waits
2. The server reads as much data as possible from shared memory, signals the executor and waits
3. The executor receives the signal and resumes writing as much data as possible to shared memory, and so on

This means that the executor and the server don't run in parallel.

> A log scale on the y-axis might make that graph a little easier to parse.

OK. I generated:

> If I understand right, we're passing serialized record batches over shared memory(?).

Right! Great!

> We could explore things like binary-format COPY to possibly speed up transfer from Postgres, right?

Right. It's an optimization idea.

> Also, a benchmark not over localhost may be interesting down the line.

OK. I'll add that pattern later.

> (since presumably we should be able to interleave conversion/wire transfer - the current structure seems like it should basically already do that due to the ring buffer)

Yes. The current implementation uses a shared ring buffer for internal data exchange and interleaves conversion and wire transfer as I mentioned above.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
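The write/signal/wait handoff described above can be sketched with two threads standing in for the two processes, and events standing in for the signals. This is an illustrative model only, not the extension's actual code: the buffer capacity, function names, and event-based signaling are all assumptions made for the sketch.

```python
import threading

CAPACITY = 3                     # stand-in for the shared-memory size
buf = []                         # stand-in for the shared-memory region
executor_turn = threading.Event()
server_turn = threading.Event()
done = threading.Event()
received = []

def executor(rows):
    i = 0
    while i < len(rows):
        executor_turn.wait()     # wait for our turn to write
        executor_turn.clear()
        while i < len(rows) and len(buf) < CAPACITY:
            buf.append(rows[i])  # "writes as much data as possible"
            i += 1
        server_turn.set()        # signal the server, then loop back and wait
    executor_turn.wait()         # wait until the last batch is consumed
    done.set()
    server_turn.set()

def server():
    while True:
        server_turn.wait()       # wait for data (or completion)
        server_turn.clear()
        if done.is_set() and not buf:
            break
        received.extend(buf)     # "reads as much data as possible"
        buf.clear()
        executor_turn.set()      # hand the turn back to the executor

rows = list(range(7))
executor_turn.set()              # the executor writes first
t1 = threading.Thread(target=executor, args=(rows,))
t2 = threading.Thread(target=server)
t1.start(); t2.start()
t1.join(); t2.join()
print(received)                  # prints [0, 1, 2, 3, 4, 5, 6]
```

Note that while one side holds its turn, the other side is blocked in `wait()`; that alternation is exactly the lack of parallelism being described, which a larger buffer or a true ring buffer with independent read/write positions would relax.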
