lidavidm commented on code in PR #13873:
URL: https://github.com/apache/arrow/pull/13873#discussion_r950219203


##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();

Review Comment:
   Is there a more realistic example of a setting you would want to customize?
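
   For illustration, a sketch of a couple of more realistic channel arguments
   (hedged: the string keys below are the values of the corresponding
   `GRPC_ARG_*` macros, and the numbers are purely illustrative):

   ```cpp
   #include "arrow/flight/client.h"

   // Hypothetical helper: tune message-size and keepalive behaviour on the channel.
   arrow::flight::FlightClientOptions MakeTunedOptions() {
     auto options = arrow::flight::FlightClientOptions::Defaults();
     // Accept record batches larger than gRPC's 4 MiB default receive limit
     // ("grpc.max_receive_message_length" == GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH).
     options.generic_options.emplace_back("grpc.max_receive_message_length",
                                          64 * 1024 * 1024);
     // Send HTTP/2 keepalive pings so dead connections are noticed sooner
     // ("grpc.keepalive_time_ms" == GRPC_ARG_KEEPALIVE_TIME_MS).
     options.generic_options.emplace_back("grpc.keepalive_time_ms", 30000);
     return options;
   }
   ```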



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take

Review Comment:
   Was it closing the clients that was the issue, or creating new ones?
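
   Either way, the actionable guidance is to construct the client once and share
   it; a rough sketch against the Result-returning client API (all names here
   are illustrative):

   ```cpp
   #include <iostream>
   #include <vector>

   #include "arrow/flight/client.h"
   #include "arrow/result.h"

   // Illustrative: connect once, then reuse the same client (and the same
   // underlying gRPC channel) for many calls instead of reconnecting per request.
   arrow::Status RunManyQueries(
       const arrow::flight::Location& location,
       const std::vector<arrow::flight::FlightDescriptor>& descriptors) {
     ARROW_ASSIGN_OR_RAISE(auto client, arrow::flight::FlightClient::Connect(location));
     for (const auto& descriptor : descriptors) {
       ARROW_ASSIGN_OR_RAISE(auto info, client->GetFlightInfo(descriptor));
       std::cout << info->descriptor().ToString() << std::endl;
     }
     return arrow::Status::OK();
   }
   ```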



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size
+--------------------------
+
+You may wish to limit the maximum size a client can submit to a server through
+DoPut, to prevent a request from taking up too much memory on the server. On
+the client-side, set `write_size_limit_bytes` on `FlightClientOptions`. On the 
+server-side, set the gRPC option `GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH`.
+The client-side option will return an error that can be retried with smaller 
batches,
+while the server-side limit will close out the connection. Setting both can be 
wise, since
+the former provides a better user experience but the latter may be necessary 
to defend
+against impolite clients.
+
+Closing unresponsive connections
+--------------------------------
+
+1. A stale connection can be closed using
+   :member:`arrow::flight::FlightClientOptions::stop_token`. This requires 
recording the
+   stop token at connection establishment time.
+
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+
+2. Use client timeout.

Review Comment:
   ```suggestion
   2. Use call timeouts. (This is a general gRPC best practice.)
   ```
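
   For reference, a minimal sketch of such a per-call timeout with the current
   C++ API (the five-second value is arbitrary):

   ```cpp
   #include <memory>

   #include "arrow/flight/client.h"

   // Illustrative: fail this one RPC if it takes longer than 5 seconds.
   arrow::Result<std::unique_ptr<arrow::flight::FlightInfo>> GetInfoWithTimeout(
       arrow::flight::FlightClient* client,
       const arrow::flight::FlightDescriptor& descriptor) {
     arrow::flight::FlightCallOptions call_options;
     call_options.timeout = arrow::flight::TimeoutDuration(5.0);  // seconds
     return client->GetFlightInfo(call_options, descriptor);
   }
   ```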



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.

Review Comment:
   ```suggestion
   
   For debugging, certain environment variables enable logging in gRPC. For
   example, ``env GRPC_VERBOSITY=info GRPC_TRACE=http`` will print the initial
   headers (on both sides) so you can see if gRPC established the connection or
   not. It will also print when a message is sent, so you can tell if the
   connection is open or not.
   
   ```
   
   And link to https://grpc.github.io/grpc/cpp/md_doc_environment_variables.html
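
   Relatedly, since gRPC may not surface connection errors until the first call,
   a cheap "probe" RPC right after connecting makes such errors visible at
   startup. A hedged sketch (ListFlights is used only because it is easy to
   issue; any RPC would do):

   ```cpp
   #include <memory>

   #include "arrow/flight/client.h"
   #include "arrow/result.h"

   // Illustrative: force gRPC to actually establish the connection so that
   // connection errors show up now rather than on the first real call.
   arrow::Result<std::unique_ptr<arrow::flight::FlightClient>> ConnectAndProbe(
       const arrow::flight::Location& location) {
     ARROW_ASSIGN_OR_RAISE(auto client, arrow::flight::FlightClient::Connect(location));
     ARROW_RETURN_NOT_OK(client->ListFlights().status());
     return client;
   }
   ```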



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.

Review Comment:
   ```suggestion
   A quick way of testing: attach to the process with a debugger and call
   malloc_trim, or call ReleaseUnused on the system pool. If memory usage
   drops, then likely, there is memory allocated by gRPC or by the
   application that the system allocator was holding on to. This can be
   adjusted in platform-specific ways; see an investigation in JIRA for
   an example of how this works on Linux/glibc. 
   ```
   
   and link the JIRA
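
   For completeness, both of those knobs can be poked programmatically as well;
   a minimal sketch (malloc_trim is glibc-specific, hence the guard):

   ```cpp
   #include "arrow/memory_pool.h"

   #if defined(__GLIBC__)
   #include <malloc.h>
   #endif

   // Illustrative: ask the Arrow system pool and glibc malloc to hand free
   // memory back to the OS, to check whether allocator caching explains RSS.
   void TryReleaseMemory() {
     arrow::system_memory_pool()->ReleaseUnused();
   #if defined(__GLIBC__)
     malloc_trim(0);
   #endif
   }
   ```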



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size

Review Comment:
   This is more like a Cookbook kind of thing IMO
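
   For context, the client-side half of that recipe is roughly the following
   (the 16 MiB figure is arbitrary):

   ```cpp
   #include "arrow/flight/client.h"

   // Illustrative: cap how large a single Flight/IPC message the client will
   // write, so oversized DoPut batches fail with a retryable error on the
   // client instead of being rejected (or worse) by the server.
   arrow::flight::FlightClientOptions MakeSizeLimitedOptions() {
     auto options = arrow::flight::FlightClientOptions::Defaults();
     options.write_size_limit_bytes = 16 * 1024 * 1024;
     return options;
   }
   ```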



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.

Review Comment:
   This might belong better in the discussion of memory management below
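
   Wherever it ends up, a hedged sketch of the glibc-specific mitigation that
   ARROW-16697 discusses (glibc-only; the arena cap is illustrative):

   ```cpp
   #include <cstdlib>

   #if defined(__GLIBC__)
   #include <malloc.h>
   #endif

   // Illustrative: limit the number of malloc arenas (per-thread caches) and
   // trim, so memory cached by gRPC's many handler threads is returned sooner.
   void TuneGlibcMalloc() {
   #if defined(__GLIBC__)
     mallopt(M_ARENA_MAX, 2);  // fewer arenas -> less per-thread caching
     malloc_trim(0);           // release currently-free memory back to the OS
   #endif
   }
   ```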



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.

Review Comment:
   Give an example?



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size
+--------------------------
+
+You may wish to limit the maximum size a client can submit to a server through
+DoPut, to prevent a request from taking up too much memory on the server. On
+the client-side, set `write_size_limit_bytes` on `FlightClientOptions`. On the 

Review Comment:
   General comment: reST uses double-backticks



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.

Review Comment:
   `:func:`-reference the appropriate API docs?



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those

Review Comment:
   I'll try to dig up the reference for this



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).

Review Comment:
   ```suggestion
   threads are not necessarily cleaned up (a "cached thread pool" in Java 
parlance).
   ```



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management

Review Comment:
   ```suggestion
   Memory management
   ```



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size
+--------------------------
+
+You may wish to limit the maximum size a client can submit to a server through
+DoPut, to prevent a request from taking up too much memory on the server. On
+the client-side, set `write_size_limit_bytes` on `FlightClientOptions`. On the 
+server-side, set the gRPC option `GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH`.
+The client-side option will return an error that can be retried with smaller 
batches,
+while the server-side limit will close out the connection. Setting both can be 
wise, since
+the former provides a better user experience but the latter may be necessary 
to defend
+against impolite clients.
+
+Closing unresponsive connections
+--------------------------------
+
+1. A stale connection can be closed using
+   :member:`arrow::flight::FlightClientOptions::stop_token`. This requires 
recording the
+   stop token at connection establishment time.
+
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+
+2. Use client timeout.
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+3. There is a long standing ticket for a per-write/per-read timeout instead of 
a per

Review Comment:
   This first sentence might be better as a `.. note` since it's not actually 
actionable



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size
+--------------------------
+
+You may wish to limit the maximum size a client can submit to a server through
+DoPut, to prevent a request from taking up too much memory on the server. On
+the client-side, set `write_size_limit_bytes` on `FlightClientOptions`. On the 
+server-side, set the gRPC option `GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH`.
+The client-side option will return an error that can be retried with smaller 
batches,
+while the server-side limit will close out the connection. Setting both can be 
wise, since
+the former provides a better user experience but the latter may be necessary 
to defend
+against impolite clients.
+
+Closing unresponsive connections
+--------------------------------
+
+1. A stale connection can be closed using
+   :member:`arrow::flight::FlightClientOptions::stop_token`. This requires 
recording the
+   stop token at connection establishment time.
+
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+
+2. Use client timeout.
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+3. There is a long standing ticket for a per-write/per-read timeout instead of 
a per
+   call timeout (ARROW-6062_), but this is not (easily) possible to implement 
with the
+   blocking gRPC API. For now one can also do something like set up a 
background thread

Review Comment:
   This sort of thing should definitely be a recipe.



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Limiting DoPut Batch Size
+--------------------------
+
+You may wish to limit the maximum size a client can submit to a server through
+DoPut, to prevent a request from taking up too much memory on the server. On
+the client-side, set `write_size_limit_bytes` on `FlightClientOptions`. On the 
+server-side, set the gRPC option `GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH`.
+The client-side option will return an error that can be retried with smaller 
batches,
+while the server-side limit will close out the connection. Setting both can be 
wise, since
+the former provides a better user experience but the latter may be necessary 
to defend
+against impolite clients.
+
+Closing unresponsive connections
+--------------------------------
+
+1. A stale connection can be closed using
+   :member:`arrow::flight::FlightClientOptions::stop_token`. This requires 
recording the
+   stop token at connection establishment time.
+
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+
+2. Use client timeout.
+   .. tab-set::
+
+      .. tab-item:: C++
+
+         .. code-block:: cpp
+
+      .. tab-item:: Java
+
+         .. code-block:: java
+
+      .. tab-item:: Python
+
+         .. code-block:: python
+
+3. There is a long standing ticket for a per-write/per-read timeout instead of 
a per
+   call timeout (ARROW-6062_), but this is not (easily) possible to implement 
with the
+   blocking gRPC API. For now one can also do something like set up a 
background thread
+   that calls cancel() on a timer and have the main thread reset the timer 
every time a
+   write operation completes successfully (that means one needs to use 
to_batches() +
+   write_batch and not write_table).

Review Comment:
   ```suggestion
   3. Client timeouts are not great for long-running streaming calls, where it 
may
      be hard to choose a timeout for the entire operation. Instead, what is 
often
      desired is a per-read or per-write timeout so that the operation fails if 
it
      isn't making progress. This can be implemented with a background thread 
that
      calls cancel() on a timer, with the main thread resetting the timer every 
time
      an operation completes successfully. For a fully-worked out example, see 
the
      Cookbook.
      
      .. note:: There is a long standing ticket for a per-write/per-read timeout
                instead of a per call timeout (ARROW-6062_), but this is not 
(easily)
                possible to implement with the blocking gRPC API. 
   ```
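
   To make that concrete, a rough sketch of the watchdog pattern described above
   (a sketch only; the tested version belongs in the Cookbook as noted):

   ```cpp
   #include <atomic>
   #include <chrono>
   #include <functional>
   #include <thread>

   // Illustrative watchdog: invokes the cancel callback if Reset() is not
   // called within `interval`, i.e. if the stream stops making progress.
   class ProgressWatchdog {
    public:
     ProgressWatchdog(std::chrono::seconds interval, std::function<void()> cancel)
         : interval_(interval), cancel_(std::move(cancel)), thread_([this] { Run(); }) {}

     ~ProgressWatchdog() {
       done_.store(true);
       thread_.join();
     }

     // Call after every successful read or write to push the deadline forward.
     void Reset() { last_progress_.store(Clock::now().time_since_epoch().count()); }

    private:
     using Clock = std::chrono::steady_clock;

     void Run() {
       while (!done_.load()) {
         std::this_thread::sleep_for(std::chrono::milliseconds(100));
         const auto last = Clock::time_point(Clock::duration(last_progress_.load()));
         if (Clock::now() - last > interval_) {
           cancel_();  // e.g. a lambda that calls reader->Cancel()
           return;
         }
       }
     }

     std::chrono::seconds interval_;
     std::function<void()> cancel_;
     std::atomic<Clock::rep> last_progress_{Clock::now().time_since_epoch().count()};
     std::atomic<bool> done_{false};
     std::thread thread_;
   };
   ```

   The read/write loop would construct this with a lambda that cancels the
   stream (e.g. FlightStreamReader::Cancel) and call Reset() after each
   successful operation.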



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+* bounded thread pool -> Reject connections / requests when under load, and 
have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.

Review Comment:
   I'd probably word this like ~gRPC's default behavior allows one server to 
accept many connections from many different clients, but if requests do a lot 
of work (as they may under Flight), the server may not be able to keep up. 
Configuring the server to limit the number of clients and reject requests more 
proactively, and configuring clients to retry with backoff (and potentially 
connect to a different node), would give more consistent quality of service.



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,110 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. code-block:: cpp
+
+   auto options = FlightClientOptions::Defaults();
+   // Set a very low limit at the gRPC layer to fail all calls
+   options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+Round robin balancing can cause every client to have an open connection to
+every server causing an unexpected number of open connections and a depletion
+of resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+Note: "connect" isn't really a connect and we’ve observed that gRPC does not
+give you the actual error until you first try to make a call. This can cause
+error being reported at unexpected times.
+
+Use ListFlights sparingly
+-------------------------
+
+ListFlights endpoint is largely just implemented as a normal GRPC stream
+endpoint and can hit transfer bottlenecks if used too much. To estimate data
+transfer bottleneck:
+5k schemas will serialize to about 1-5 MB/call. Assuming a gRPC localhost
+bottleneck of 3GB/s you can at best serve 600-3000 clients/s.
+
+ARROW-15764_ proposes a caching optimisation for server side, but it was not
+yet implemented.
+
+
+Memory cache client-side
+------------------------
+
+Flight uses gRPC allocator wherever possible.
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those
+threads are not necessarily cleaned up (cached thread pool in java parlance).
+glibc malloc clears some per thread state and the default tuning never clears
+caches in some workloads. But you can explicitly tell malloc to dump caches.
+See ARROW-16697_ as an example.
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+
+
+Excessive traffic
+-----------------
+
+There are basically two ways to handle excessive traffic:
+* unbounded thread pool -> everyone gets serviced, but it might take forever.
+This is what you are seeing now.
+bounded thread pool -> Reject connections / requests when under load, and have
+clients retry with backoff. This also gives an opportunity to retry with a
+different node. Not everyone gets serviced but quality of service stays 
consistent.
+
+Closing unresponsive connections
+--------------------------------
+
+1. A stale connection can be closed using
+   :member:`arrow::flight::FlightClientOptions::stop_token`. This requires 
recording the
+   stop token at connection establishment time.

Review Comment:
   I would much rather they be cookbook samples so that they actually get run 
in CI. It's probably good to still have a short snippet here, but I worry about 
those kinds of things bit-rotting without them actually getting exercised like 
the Cookbook does.
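
   For the snippet itself, a hedged sketch (note: in the current C++ API the stop
   token is supplied per call via FlightCallOptions::stop_token, so adjust to
   wherever the option actually lives in the version being documented):

   ```cpp
   #include "arrow/flight/client.h"
   #include "arrow/util/cancel.h"

   // Illustrative: keep the StopSource around so a stuck call can be cancelled
   // from another thread (e.g. a watchdog) by calling stop_source->RequestStop().
   arrow::Status GetInfoCancellable(arrow::flight::FlightClient* client,
                                    const arrow::flight::FlightDescriptor& descriptor,
                                    arrow::StopSource* stop_source) {
     arrow::flight::FlightCallOptions call_options;
     call_options.stop_token = stop_source->token();
     return client->GetFlightInfo(call_options, descriptor).status();
   }
   ```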



##########
docs/source/cpp/flight.rst:
##########
@@ -172,6 +172,154 @@ request/response. On the server, they can inspect 
incoming headers and
 fail the request; hence, they can be used to implement custom
 authentication methods.
 
+.. _flight-best-practices:
+
+Best practices
+==============
+
+gRPC
+----
+
+When using default gRPC transport options can be passed to it via
+:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
+
+.. tab-set::
+
+   .. tab-item:: C++
+
+      .. code-block:: cpp
+
+         auto options = FlightClientOptions::Defaults();
+         // Set a very low limit at the gRPC layer to fail all calls
+         options.generic_options.emplace_back(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, 4);
+
+   .. tab-item:: Python
+
+      .. code-block:: cpp
+
+         // Set a very low limit at the gRPC layer to fail all calls
+         generic_options = [("GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH", 4)]
+         client = pyarrow.flight.FlightClient(server_uri, generic_options=generic_options)
+
+Also see `best gRPC practices`_ and available `gRPC keys`_.
+
+Re-use clients whenever possible
+--------------------------------
+
+Closing clients causes gRPC to close and clean up connections which can take
+several seconds per connection. This will stall server and client threads if
+done too frequently. Client reuse will avoid this issue.
+
+Don’t round-robin load balance
+------------------------------
+
+`Round robin balancing`_ means every client can have an open connection to
+every server, causing an unexpected number of open connections and depleting
+server resources.
+
+Debugging
+---------
+
+Use netstat to see the number of open connections.
+For debug - env GRPC_VERBOSITY=info GRPC_TRACE=http will print the initial
+headers (on both sides) so you can see if grpc established the connection or
+not. It will also print when a message is sent, so you can tell if the
+connection is open or not.
+gRPC may not report connection errors until a call is actually made.
+Hence, to detect connection errors when creating a client, some sort
+of dummy RPC should be made.
+
+Memory cache management
+-----------------------
+
+Flight tries to reuse allocations made by gRPC to avoid redundant
+data copies.  However, this means that those allocations may not
+be tracked by the Arrow memory pool, and that memory usage behavior,
+such as whether free memory is returned to the system, is dependent
+on the allocator that gRPC uses (usually the system allocator).
+
+A quick way of testing: attach to the process with a debugger and call 
malloc_trim
+or call ReleaseUnused on the system pool.
+
+Excessive traffic
+-----------------
+
+gRPC will spawn an unbounded number of threads for concurrent clients. Those

Review Comment:
   This might not actually be true? e.g. 
`https://github.com/grpc/grpc/issues/14492`
   
   Ah, it's not actually _unbounded_, just a very large number: 
https://github.com/grpc/proposal/blob/master/L30-cpp-control-max-threads-in-SyncServer.md
   
   I think the threads are cleaned up now? So the next sentence may not be right
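
   If the docs do end up recommending a bound on server threads, one way to
   express it with the existing hooks might be the following (hedged sketch; the
   builder_hook receives a grpc::ServerBuilder* as a void*, and the thread count
   is illustrative):

   ```cpp
   #include <grpcpp/resource_quota.h>
   #include <grpcpp/server_builder.h>

   #include "arrow/flight/server.h"

   // Illustrative: cap the number of gRPC threads via a ResourceQuota, using
   // the FlightServerOptions::builder_hook escape hatch.
   arrow::flight::FlightServerOptions MakeBoundedServerOptions(
       const arrow::flight::Location& location) {
     arrow::flight::FlightServerOptions options(location);
     options.builder_hook = [](void* raw_builder) {
       auto* builder = reinterpret_cast<grpc::ServerBuilder*>(raw_builder);
       grpc::ResourceQuota quota("flight-server");
       quota.SetMaxThreads(64);
       builder->SetResourceQuota(quota);
     };
     return options;
   }
   ```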


