kou commented on code in PR #44279:
URL: https://github.com/apache/arrow/pull/44279#discussion_r2015810933


##########
cpp/src/arrow/flight/integration_tests/test_integration.cc:
##########
@@ -281,6 +282,133 @@ class MiddlewareScenario : public Scenario {
   std::shared_ptr<TestClientMiddlewareFactory> client_middleware_;
 };
 
+/// \brief The server used for testing FlightClient data alignment.
+///
+/// The server always returns the same data of various byte widths.
+/// The client should return data that is aligned according to the data type
+/// if FlightCallOptions.read_options.ensure_memory_alignment is true.
+///
+/// This scenario is passed only when the client returns aligned data.
+class AlignmentServer : public FlightServerBase {
+  Status GetFlightInfo(const ServerCallContext& context,
+                       const FlightDescriptor& descriptor,
+                       std::unique_ptr<FlightInfo>* result) override {
+    auto schema = BuildSchema();
+    std::vector<FlightEndpoint> endpoints{FlightEndpoint{{"foo"}, {}, 
std::nullopt, ""}};

Review Comment:
   Can we use meaningful ticket name for readability? Or we may be able to 
return `""` as ticket and not use ticket in `DoGet()` entirely.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -44,9 +47,13 @@ namespace {
 Type::type GetTypeForBuffers(const ArrayData& array) {
   Type::type type_id = array.type->storage_id();
   if (type_id == Type::DICTIONARY) {
-    return ::arrow::internal::checked_pointer_cast<DictionaryType>(array.type)
-        ->index_type()
-        ->id();
+    // return index type id, provided by the DictionaryType array.type or
+    // array.type->storage_type() if array.type is an ExtensionType
+    std::shared_ptr<DataType> dict_type = array.type;
+    if (array.type->id() == Type::EXTENSION) {
+      dict_type = 
checked_pointer_cast<ExtensionType>(array.type)->storage_type();
+    }

Review Comment:
   Can we avoid a needless reference count change?
   
   ```suggestion
       std::shared_ptr<DataType> dict_type;
       if (array.type->id() == Type::EXTENSION) {
         dict_type = 
checked_pointer_cast<ExtensionType>(array.type)->storage_type();
       } else {
         dict_type = array.type;
       }
   ```



##########
docs/source/cpp/flight.rst:
##########
@@ -239,7 +239,16 @@ Memory management
 -----------------
 
 Flight tries to reuse allocations made by gRPC to avoid redundant
-data copies. However, this means that those allocations may not
+data copies. Experience shows that Arrow data in those allocations are not
+found at aligned addresses. Some Flight client use cases may require `data 
alignment`_,

Review Comment:
   ```suggestion
   found at aligned addresses. Some Flight client use cases may require 
:ref:`data alignment<buffer-alignment-and-padding>`,
   ```
   
   with
   
   ```diff
   diff --git a/docs/source/format/Columnar.rst 
b/docs/source/format/Columnar.rst
   index e1603e8d8e..5a7ca963ce 100644
   --- a/docs/source/format/Columnar.rst
   +++ b/docs/source/format/Columnar.rst
   @@ -258,6 +258,8 @@ in whichever form is convenient for them. We handle 
metadata
    **serialization** in an implementation-independent way using
    `Flatbuffers`_, detailed below.
    
   +.. _buffer-alignment-and-padding:
   +
    Buffer Alignment and Padding
    ----------------------------
    
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to