David,

Thanks, that’s very helpful. I had suspected as much as I began to dig into
the code. I’m rather weak with concurrency and would like to see how Arrow
Flight handles each request it receives. Are you saying that, even for
Arrow Flight, the concurrency happens under the hood and is specific to
gRPC? In other words, if I look through Apache Arrow’s source code, would I
find that the threading logic is delegated to the gRPC dependency?

Presumably, this means that if I have stateful variables on my running
server in Python, I need to manage my own locks to ensure my data
structures are thread safe, though the actual management of threads would
be much farther upstream?
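
To make the locking question concrete, here is a minimal sketch of what I have in mind. It uses only the standard library: RequestStats and handle_request are hypothetical names I made up, and the thread pool merely stands in for the threads that gRPC would use to call into Python. In a real server I assume the same pattern would sit inside a method of a pyarrow.flight.FlightServerBase subclass, such as do_get.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RequestStats:
    """Hypothetical shared state that several handler threads update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def record(self, client_id):
        # The read-modify-write below is not atomic, so guard it with the lock.
        with self._lock:
            self._counts[client_id] = self._counts.get(client_id, 0) + 1

    def snapshot(self):
        # Return a copy so callers never iterate the dict while it is mutated.
        with self._lock:
            return dict(self._counts)


def handle_request(stats, client_id):
    """Stand-in for an RPC handler body (assumption: not real Flight code)."""
    stats.record(client_id)


stats = RequestStats()
# The pool only simulates concurrent calls arriving from the server's threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    for i in range(1000):
        pool.submit(handle_request, stats, f"client-{i % 4}")

totals = stats.snapshot()
```

The lock is held only around the dictionary update, so handlers would not block each other while doing unrelated work.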

On Tue, Sep 7, 2021 at 4:18 PM David Li <[email protected]> wrote:

> Hey Michael,
>
> The key thing is that the threads are all managed by gRPC's C++
> implementation. On the server side, the C++ libraries underneath handle
> incoming requests, encoding/decoding responses, etc. all concurrently
> without calling into Python. Arrow calls into Python only for the actual
> RPC endpoint logic. This is all multithreaded and within a single process.
> (In fact, you probably should avoid fork() and things built on top of it,
> like the multiprocessing module - it will not play well with the C++-level
> libraries.) Threading is all managed by the C++ library, so there is no
> single place to look. Is there something specific you were looking for?
>
> Best,
> David
>
> On Tue, Sep 7, 2021, at 18:45, Michael Ark wrote:
>
> I am relatively new to Python and Arrow Flight. I want to understand how
> Arrow Flight works with multiple clients making multiple requests to a
> single server. It seems like Arrow Flight handles concurrency. Is it
> multithreaded, but single process? How are the threads managed? Where can I
> find this logic? When I try to track the threads in the server with
> logging, I get DummyThreads, so it’s not very helpful.
>
> #arrow-flight
>
> Thanks! Appreciate any help you can provide.
>
>
>
