amoeba opened a new issue, #38617:
URL: https://github.com/apache/arrow/issues/38617

   ### Describe the enhancement requested
   
   gRPC (and by extension FlightRPC) isn't fork-safe:
   
   > gRPC Python wraps gRPC core, which uses multithreading for performance, 
and hence doesn't support fork() 
[[Source](https://github.com/grpc/grpc/blob/master/doc/fork_support.md)]
   
   While library users are generally expected to consider thread and fork 
safety themselves, Python's 
[multiprocessing](https://docs.python.org/3/library/multiprocessing.html) 
package makes it very easy for users who are unfamiliar with these concepts to 
run into trouble. In the case of FlightRPC, users can get into trouble when 
they fork _after_ instantiating a `FlightClient` in a failed attempt to 
increase performance. Instead, they should fork before importing and using 
`pyarrow.flight`, or consider another approach altogether. When operators of 
FlightRPC servers encounter users making this type of mistake, debugging can be 
very difficult: the clients are essentially broken, so the errors seen on the 
server may make little sense.
   
   Improved documentation or, better, an explicit guard against using a client 
after a fork might make Flight easier for users to work with.
   
   There are a couple of different concerns here:
   
   - **Where to put such a prevention:** I think it makes more sense to put 
this into PyArrow Flight rather than C++ so that not all implementations need 
to pay any cost associated with it. Plus, it's a common pattern in the Python 
ecosystem to protect users from common mistakes.
   - **How it could be done:** Python has a [mechanism to register fork 
handlers](https://docs.python.org/3/library/os.html#os.register_at_fork), but I 
don't think fork is the only way to run into trouble. A PID check similar to 
how [fsspec does 
it](https://github.com/fsspec/filesystem_spec/pull/572) could be built into 
`FlightClient` and would let us offer the highest level of prevention.
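
To make the PID-check idea concrete, here is a rough sketch. It is not PyArrow or fsspec code: `PidGuarded`, `ForkGuardError`, and the `factory` argument are hypothetical names made up for illustration, and a real implementation inside `FlightClient` would likely look different. The wrapper records the PID of the process that created the resource and refuses to forward calls from any other process.

```python
import os


class ForkGuardError(RuntimeError):
    """Raised when a guarded object is used outside the process that created it."""


class PidGuarded:
    """Hypothetical wrapper: remember the creating PID and check it on every use."""

    def __init__(self, factory):
        # Record the owner PID before building the wrapped resource.
        self._owner_pid = os.getpid()
        self._resource = factory()

    def _check_pid(self):
        current = os.getpid()
        if current != self._owner_pid:
            raise ForkGuardError(
                f"object created in process {self._owner_pid} "
                f"but used in process {current}; create a new one in this process"
            )

    def __getattr__(self, name):
        # __getattr__ only runs for names not found on the wrapper itself.
        # Refuse private names so a half-initialized wrapper cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        self._check_pid()
        return getattr(self._resource, name)
```

Under these assumptions, a user would pass a zero-argument callable that builds the client (for example, one that constructs a `FlightClient`) as `factory`. Any method call made through the wrapper in a forked child would then fail immediately with a clear error on the client side, rather than producing confusing errors on the server.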
   
   I'm curious if others think this is a good idea or not and if anyone has 
thoughts on the approach.
   
   ### Component(s)
   
   FlightRPC, Python

