thisisnic commented on issue #47957:
URL: https://github.com/apache/arrow/issues/47957#issuecomment-3475001679

   Hey, I did a bit of a session with Claude where we looked through what 
changed etc and here's the output. No longer looking into this myself as I have 
other tasks that need working on, but pasting here in case it's useful.
   
   It's a bit opinionated!
   
   _________________________________________________
   
   Investigation Summary
   
     After investigating the ARM64 macOS 14 substrait test failures, it looks 
like protobuf 33.0 is the likely culprit.
   
     Timeline
   
     - Oct 13, 2025: Tests passing with runner image 20251013.0032 (protobuf 32.1)
     - Oct 15, 2025: Homebrew updated protobuf from 32.1 to 33.0 (https://github.com/Homebrew/homebrew-core/commit/fd68e3781aa5cca3f377f5777d70b6dfdfe4b0f8)
     - Oct 20, 2025: Tests failing with runner image 20251020.0056 (protobuf 33.0)
   
     Technical Details
   
     Bug location: cpp/src/arrow/engine/substrait/expression_internal.cc:428-432
   
     The issue occurs during deserialization of user-defined literals:

   ```
     Status Visit(const IntegerType& type) {
       google::protobuf::UInt64Value value;
       if (ARROW_PREDICT_FALSE(!user_defined_->value().UnpackTo(&value))) {
         return FailedToUnpack("integer", "UInt64Value");
       }
       ARROW_ASSIGN_OR_RAISE(scalar_, MakeScalar(type.GetSharedPtr(), value.value()));
       return Status::OK();
     }
   ```
    
   `UnpackTo()` returns `true` (success), but `value.value()` then returns 0 
instead of the value that was serialized.
   
     Test failures: arrow-substrait-substrait-test - ArrowSpecificLiterals test 
in cpp/src/arrow/engine/substrait/serde_test.cc:607-614
     - UInt8Scalar(7) deserializes as 0
     - String "hello" deserializes as ""
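
   To check whether this reproduces outside Arrow, a standalone round trip 
through `google::protobuf::Any` might help. The following is only a sketch: 
`RoundTrips` is a hypothetical helper, not Arrow code, and it assumes the 
program is built and linked against the same Homebrew protobuf the CI runner 
uses. It mirrors the `UInt64Value` path quoted above and applies the same 
steps to `StringValue`.

   ```cpp
   #include <cstdint>
   #include <string>

   #include <google/protobuf/any.pb.h>
   #include <google/protobuf/wrappers.pb.h>

   // Hypothetical helper (not part of Arrow): packs a wrapper message into an
   // Any, sends it through the wire format, unpacks it again, and reports
   // whether the value survived unchanged.
   template <typename Wrapper, typename T>
   bool RoundTrips(const T& expected) {
     Wrapper in;
     in.set_value(expected);

     google::protobuf::Any packed;
     if (!packed.PackFrom(in)) return false;

     // Serialize and re-parse so the Any is rebuilt from bytes, as it would
     // be when a plan is deserialized.
     std::string wire;
     if (!packed.SerializeToString(&wire)) return false;
     google::protobuf::Any parsed;
     if (!parsed.ParseFromString(wire)) return false;

     Wrapper out;
     if (!parsed.UnpackTo(&out)) return false;

     // The failure described above would show up here: UnpackTo() succeeded
     // but the unpacked value is the default (0 or "").
     return out.value() == expected;
   }
   ```

   Calling `RoundTrips<google::protobuf::UInt64Value>(std::uint64_t{7})` and 
`RoundTrips<google::protobuf::StringValue>(std::string("hello"))` from a small 
`main` on an affected runner should return `false` if protobuf alone is at 
fault. If both return `true`, the problem more likely involves how Arrow 
builds or links against protobuf.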
   
     Platform Impact
   
     - ❌ ARM64 macOS 14: FAILING
     - ✅ AMD64 macOS 13: PASSING
     - ✅ Linux (all): PASSING
     - ✅ Windows: PASSING
   
   Since only the ARM64 macOS job fails, this appears to be an ARM64-specific 
regression in protobuf 33.0's `google::protobuf::Any::UnpackTo()` 
implementation, though that has not been confirmed.
   
   Attempted Verification
   
   I attempted a diagnostic PR (#48015) that pins protobuf to 32.1 on ARM64 
macOS to confirm the hypothesis, but ran into Homebrew dependency problems: 
gRPC requires protobuf 33.0, and there are no versioned formulas for 
protobuf 32.x.
   
     Recommendation
   
     Given the strong timeline correlation and platform-specific nature, I 
recommend:
     1. Reporting this as a suspected regression to the protobuf team
     2. Temporarily pinning Arrow's ARM64 macOS CI to protobuf 32.1 until 
resolved
     3. Monitoring for protobuf 33.x patches


