emkornfield commented on code in PR #48431:
URL: https://github.com/apache/arrow/pull/48431#discussion_r2714711626


##########
cpp/src/parquet/file_reader.cc:
##########
@@ -436,6 +437,34 @@ class SerializedFile : public ParquetFileReader::Contents {
     PARQUET_ASSIGN_OR_THROW(
         auto footer_buffer,
         source_->ReadAt(source_size_ - footer_read_size, footer_read_size));
+    if (properties_.read_metadata3()) {
+      // Try to extract flatbuffer metadata from footer
+      std::string flatbuffer_data;
+      auto result = ExtractFlatbuffer(footer_buffer, &flatbuffer_data);
+      if (result.ok()) {
+        int32_t required_or_consumed = *result;
+        if (required_or_consumed > static_cast<int32_t>(footer_buffer->size())) {
+          PARQUET_ASSIGN_OR_THROW(
+              footer_buffer,
+              source_->ReadAt(source_size_ - required_or_consumed, required_or_consumed));
+          footer_read_size = required_or_consumed;
+          result = ExtractFlatbuffer(footer_buffer, &flatbuffer_data);
+        }
+        // If successfully extracted flatbuffer data, parse it and return
+        if (result.ok() && *result > 0 && !flatbuffer_data.empty()) {
+          // Get flatbuffer metadata and convert to thrift
+          const format3::FileMetaData* fb_metadata =
+              format3::GetFileMetaData(flatbuffer_data.data());
+          auto thrift_metadata =
+              std::make_unique<format::FileMetaData>(FromFlatbuffer(fb_metadata));
+          file_metadata_ = FileMetaData::Make(

Review Comment:
   FileMetadata is already a wrapper around Thrift. Is there a reason we don't
   have a different implementation that is made purely from the FileMetadata?
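
   If the second FileMetadata above refers to the flatbuffer
   `format3::FileMetaData`, the alternative might look roughly like the
   sketch below. This is an illustration only, not code from the PR: every
   name in the `sketch` namespace (`MetadataBackend`, `ThriftBackend`,
   `FlatbufferBackend`, `FileMetaDataSketch`) is hypothetical, and the two
   structs are stand-ins for the generated Thrift and flatbuffer types. The
   idea is that the wrapper delegates to a backend, so the flatbuffer path
   never has to go through `FromFlatbuffer` and a Thrift copy.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

namespace sketch {

// Hypothetical stand-ins for the generated types; the real
// format::FileMetaData and format3::FileMetaData carry far more fields.
struct ThriftFileMetaData {
  int64_t num_rows = 0;
};

struct FlatbufferFileMetaData {
  int64_t num_rows = 0;
};

// Hypothetical interface the public wrapper would delegate to.
class MetadataBackend {
 public:
  virtual ~MetadataBackend() = default;
  virtual int64_t num_rows() const = 0;
};

// Backend that owns a Thrift struct (the existing behaviour, roughly).
class ThriftBackend : public MetadataBackend {
 public:
  explicit ThriftBackend(ThriftFileMetaData md) : md_(std::move(md)) {}
  int64_t num_rows() const override { return md_.num_rows; }

 private:
  ThriftFileMetaData md_;
};

// Backend that reads straight from the flatbuffer, with no Thrift conversion.
class FlatbufferBackend : public MetadataBackend {
 public:
  explicit FlatbufferBackend(const FlatbufferFileMetaData* fb) : fb_(fb) {
    assert(fb_ != nullptr);
  }
  int64_t num_rows() const override { return fb_->num_rows; }

 private:
  // Not owned: the underlying buffer must outlive this backend.
  const FlatbufferFileMetaData* fb_;
};

// Wrapper with a single accessor, to show that callers would not need to
// know which encoding sits underneath.
class FileMetaDataSketch {
 public:
  explicit FileMetaDataSketch(std::unique_ptr<MetadataBackend> backend)
      : backend_(std::move(backend)) {}
  int64_t num_rows() const { return backend_->num_rows(); }

 private:
  std::unique_ptr<MetadataBackend> backend_;
};

}  // namespace sketch
```

   The trade-off in a design like this is lifetime management: a
   flatbuffer-backed implementation has to keep the footer buffer alive,
   whereas the convert-to-Thrift approach in the diff can drop it after
   conversion.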



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
