This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 00439d1048 GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)
00439d1048 is described below

commit 00439d104804fca889016d054c03ff6ba9d5560f
Author: Antoine Pitrou <anto...@python.org>
AuthorDate: Thu Sep 25 22:24:53 2025 +0200

    GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)
    
    ### Rationale for this change
    
    On OSS-Fuzz, generating the Parquet seed corpus would trigger a multiplication overflow when converting an Arrow seconds timestamp column to a Parquet milliseconds timestamp column.
    
    ### What changes are included in this PR?
    
    Reduce range of input values when writing timestamps to the Parquet seed corpus.
    
    ### Are these changes tested?
    
    Manually.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #47655
    
    Authored-by: Antoine Pitrou <anto...@python.org>
    Signed-off-by: Antoine Pitrou <anto...@python.org>
---
 cpp/src/parquet/arrow/generate_fuzz_corpus.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
index acee0d0ff9..2be025471c 100644
--- a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
+++ b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
@@ -147,8 +147,13 @@ Result<std::shared_ptr<RecordBatch>> ExampleBatch1() {
       {name_gen(), gen.Decimal32(decimal32(7, 3), kBatchSize, kNullProbability)});
 
   // Timestamp
+  // (Parquet doesn't have seconds timestamps so the values are going to be
+  //  multiplied by 1000)
+  auto int64_timestamps_array =
+      gen.Int64(kBatchSize, -9000000000000000LL, 9000000000000000LL, kNullProbability);
   for (auto unit : TimeUnit::values()) {
-    ARROW_ASSIGN_OR_RAISE(auto timestamps, int64_array->View(timestamp(unit, "UTC")));
+    ARROW_ASSIGN_OR_RAISE(auto timestamps,
+                          int64_timestamps_array->View(timestamp(unit, "UTC")));
     columns.push_back({name_gen(), timestamps});
   }
   // Time32, time64
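
The bound chosen in the patch can be checked with a little arithmetic: converting seconds to milliseconds multiplies each value by 1000, so any input whose magnitude exceeds INT64_MAX / 1000 (about 9.22e15) overflows a signed 64-bit integer. The following is a minimal sketch of that check. `SecondsToMillisChecked` is a hypothetical helper written for illustration only; it is not part of Arrow or Parquet and is not how the library performs the conversion.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not an Arrow API): convert a seconds timestamp to
// milliseconds, returning std::nullopt if the product would not fit in int64_t.
std::optional<int64_t> SecondsToMillisChecked(int64_t seconds) {
  constexpr int64_t kFactor = 1000;  // milliseconds per second
  // Largest and smallest inputs that can be multiplied by kFactor safely.
  // Integer division truncates toward zero, so both bounds are conservative.
  constexpr int64_t kMaxInput = std::numeric_limits<int64_t>::max() / kFactor;
  constexpr int64_t kMinInput = std::numeric_limits<int64_t>::min() / kFactor;
  if (seconds > kMaxInput || seconds < kMinInput) {
    return std::nullopt;  // would overflow
  }
  return seconds * kFactor;
}
```

Under this sketch, the new generator limits of -9000000000000000 and 9000000000000000 both convert successfully, whereas an unrestricted int64 input such as INT64_MAX does not.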
