mapleFU commented on code in PR #47185:
URL: https://github.com/apache/arrow/pull/47185#discussion_r2300843804


##########
cpp/src/arrow/array/builder_binary.cc:
##########
@@ -162,7 +164,13 @@ void FixedSizeBinaryBuilder::Reset() {
 
 Status FixedSizeBinaryBuilder::Resize(int64_t capacity) {
   RETURN_NOT_OK(CheckCapacity(capacity));
-  RETURN_NOT_OK(byte_builder_.Resize(capacity * byte_width_));
+  int64_t dest_capacity_bytes;

Review Comment:
   Perhaps the problem here is:
   
   ```c++
     Status ReadColumn(int i, const std::vector<int>& row_groups, ColumnReader* reader,
                       std::shared_ptr<ChunkedArray>* out) {
       BEGIN_PARQUET_CATCH_EXCEPTIONS
       // TODO(wesm): This calculation doesn't make much sense when we have repeated
       // schema nodes
       int64_t records_to_read = 0;
       for (auto row_group : row_groups) {
         // Can throw exception
         std::cout << "Numvalues:" << reader_->metadata()->RowGroup(row_group)->ColumnChunk(i)->num_values() << '\n';
         records_to_read +=
             reader_->metadata()->RowGroup(row_group)->ColumnChunk(i)->num_values();
       }
   ```
   
   1. `num_values()` is the `int64_t` maximum (`std::numeric_limits<int64_t>::max()`).
   2. Generally, the builder will throw an exception. For a fixed-size type, the value would exceed the bounds (presumably when the capacity is multiplied by the byte width).
   
   Perhaps, for the builder, the FLBA column is simply so huge that it causes the issue. The integer and float types might also have this problem.
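   
   To make the concern concrete, here is a minimal sketch of the two places where such a value could overflow: summing `num_values()` across row groups, and multiplying a capacity by the byte width. The helper names `CheckedAccumulate` and `CheckedByteCapacity` are hypothetical and are not Arrow APIs; this only illustrates the arithmetic and is not the fix proposed in this PR.
   
   ```c++
   #include <cstdint>
   #include <limits>
   #include <optional>
   
   // Hypothetical helper (not part of Arrow): adds `delta` to `*total` and
   // returns false, leaving `*total` unchanged, if `delta` is negative or the
   // sum would overflow int64_t.
   bool CheckedAccumulate(int64_t* total, int64_t delta) {
     if (delta < 0) return false;
     if (*total > std::numeric_limits<int64_t>::max() - delta) return false;
     *total += delta;
     return true;
   }
   
   // Hypothetical helper (not part of Arrow): returns capacity * byte_width,
   // or std::nullopt if either argument is negative or the product would
   // overflow int64_t.
   std::optional<int64_t> CheckedByteCapacity(int64_t capacity, int64_t byte_width) {
     if (capacity < 0 || byte_width < 0) return std::nullopt;
     if (byte_width != 0 &&
         capacity > std::numeric_limits<int64_t>::max() / byte_width) {
       return std::nullopt;
     }
     return capacity * byte_width;
   }
   ```
   
   With a `num_values()` of `int64_t` max, the accumulation fails as soon as a second non-empty row group is added, and the byte-capacity check fails for any byte width greater than 1. A width of 1 still passes, which is why an overflow check alone would not catch every oversized request.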


