Copilot commented on code in PR #49771:
URL: https://github.com/apache/arrow/pull/49771#discussion_r3107520227


##########
cpp/src/arrow/ipc/reader.cc:
##########
@@ -1890,7 +1890,9 @@ class RecordBatchFileReaderImpl : public 
RecordBatchFileReader {
           const auto magic_start = buffer->data() + sizeof(int32_t);
           if (std::string_view(reinterpret_cast<const char*>(magic_start), 
kMagicSize) !=
               kArrowMagicBytes) {
-            return Status::Invalid("Not an Arrow file");
+            return Status::Invalid(
+                "Not an Arrow file. If this is an Arrow IPC Streaming format 
file, try "
+                "open_stream() instead.");

Review Comment:
   The error message suggests calling `open_stream()`, but this is not a C++ 
API entry point (C++ uses `RecordBatchStreamReader::Open` / stream reader 
types). Since this Status message is emitted from the C++ library and may be 
surfaced to C++ callers as well as bindings, consider making the guidance 
API-neutral (e.g., refer to the IPC stream reader) or mention the C++ API name 
explicitly.
   ```suggestion
                   "Not an Arrow file. If this is an Arrow IPC streaming format 
file, use "
                   "the IPC stream reader instead.");
   ```



##########
cpp/src/arrow/ipc/message.cc:
##########
@@ -565,6 +565,17 @@ Status DecodeMessage(MessageDecoder* decoder, 
io::InputStream* file) {
   auto metadata_length = decoder->next_required_size();
   ARROW_ASSIGN_OR_RAISE(auto metadata, file->Read(metadata_length));
   if (metadata->size() != metadata_length) {
+    // The first sizeof(int32_t) bytes of the Arrow file magic ("ARRO") may 
have been
+    // misread as metadata_length. Check if the remaining bytes complete the 
magic.
+    const auto remaining_magic = 
internal::kArrowMagicBytes.substr(sizeof(int32_t));
+    if (metadata->size() >= static_cast<int64_t>(remaining_magic.size()) &&
+        std::string_view(reinterpret_cast<const char*>(metadata->data()),
+                         remaining_magic.size()) == remaining_magic) {
+      return Status::Invalid("Expected to read ", metadata_length,
+                             " metadata bytes, but only read ", 
metadata->size(),
+                             ". This appears to be an Arrow IPC File format 
file. "
+                             "Try open_file() instead of open_stream().");

Review Comment:
   This error text is emitted from the C++ IPC decoder but recommends 
`open_file()` / `open_stream()`, which are binding-specific names (they don't 
exist in the C++ API). Consider API-neutral guidance (IPC file reader vs IPC 
stream reader) or include the C++ entry points (e.g., 
`RecordBatchFileReader::Open` / `RecordBatchStreamReader::Open`). Also, "Arrow 
IPC File format file" repeats "file"; reword to avoid the redundancy.
   ```suggestion
                                ". This appears to be an Arrow IPC file. "
                                "Try the IPC file reader instead of the IPC 
stream reader.");
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to