bkietz commented on code in PR #37360:
URL: https://github.com/apache/arrow/pull/37360#discussion_r1307628219


##########
cpp/src/arrow/compute/function_internal.cc:
##########
@@ -83,7 +83,8 @@ Result<std::unique_ptr<FunctionOptions>> 
GenericOptionsType::Deserialize(
 
 Result<std::unique_ptr<FunctionOptions>> DeserializeFunctionOptions(
     const Buffer& buffer) {
-  io::BufferReader stream(buffer);
+  // Create a non-owned Buffer to avoid copying

Review Comment:
   This makes me slightly nervous since some function options have scalar or 
buffer DataMembers which *could* be zero copied (leaving that one with a buffer 
which dangles because it points into this non-owning one). I can't immediately 
identify where they are actually copied, but I wrote this ad hoc test to check 
that copying does happen somewhere
   
   ```c++
   TEST(TestCumulative, AdHoc) {
     CumulativeOptions options;
     options.start = MakeScalar("hello");
     ASSERT_OK_AND_ASSIGN(auto buf, options.options_type()->Serialize(options));
     ASSERT_OK_AND_ASSIGN(auto rt_start,
                          options.options_type()->Deserialize(*buf).Map([](auto 
rt) {
                            return *checked_cast<CumulativeOptions&>(*rt).start;
                          }));
     ASSERT_EQ(rt_start->type->id(), Type::STRING);
   
     auto prbuf = [](const Buffer& buf) {
       std::cout << "@" << reinterpret_cast<uintptr_t>(buf.data()) << "[" << 
buf.size()
                 << "]\n";
     };
     prbuf(*buf);
     prbuf(*checked_cast<const StringScalar&>(*rt_start).value);
   }
   ```
   
   I think we should add (a more formal version of) this test to ensure future 
refactoring will never admit production of dangling function options. 
Alternatively we could explicitly ensure deep copying here or in 
GenericFromScalar



##########
cpp/src/arrow/flight/sql/client.cc:
##########
@@ -519,13 +519,17 @@ arrow::Result<std::shared_ptr<PreparedStatement>> 
PreparedStatement::ParseRespon
 
   std::shared_ptr<Schema> dataset_schema;
   if (!serialized_dataset_schema.empty()) {
-    io::BufferReader dataset_schema_reader(serialized_dataset_schema);
+    // Create a non-owned Buffer to avoid copying

Review Comment:
   My above lifetime comment does not apply here since there's no way the 
deserialized `Schema` could reference memory in the buffer we're reading; zero 
copy lgtm



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to