This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git

The following commit(s) were added to refs/heads/main by this push:
     new 00439d1048 GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)
00439d1048 is described below

commit 00439d104804fca889016d054c03ff6ba9d5560f
Author: Antoine Pitrou <anto...@python.org>
AuthorDate: Thu Sep 25 22:24:53 2025 +0200

    GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)

    ### Rationale for this change

    On OSS-Fuzz, generating the Parquet seed corpus would trigger a multiplication overflow when converting an Arrow seconds timestamp column to a Parquet milliseconds timestamp column.

    ### What changes are included in this PR?

    Reduce range of input values when writing timestamps to the Parquet seed corpus.

    ### Are these changes tested?

    Manually.

    ### Are there any user-facing changes?

    No.

    * GitHub Issue: #47655

    Authored-by: Antoine Pitrou <anto...@python.org>
    Signed-off-by: Antoine Pitrou <anto...@python.org>
---
 cpp/src/parquet/arrow/generate_fuzz_corpus.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
index acee0d0ff9..2be025471c 100644
--- a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
+++ b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
@@ -147,8 +147,13 @@ Result<std::shared_ptr<RecordBatch>> ExampleBatch1() {
       {name_gen(), gen.Decimal32(decimal32(7, 3), kBatchSize, kNullProbability)});
 
   // Timestamp
+  // (Parquet doesn't have seconds timestamps so the values are going to be
+  // multiplied by 10)
+  auto int64_timestamps_array =
+      gen.Int64(kBatchSize, -9000000000000000LL, 9000000000000000LL, kNullProbability);
   for (auto unit : TimeUnit::values()) {
-    ARROW_ASSIGN_OR_RAISE(auto timestamps, int64_array->View(timestamp(unit, "UTC")));
+    ARROW_ASSIGN_OR_RAISE(auto timestamps,
+                          int64_timestamps_array->View(timestamp(unit, "UTC")));
     columns.push_back({name_gen(), timestamps});
   }
   // Time32, time64
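
For illustration only, the arithmetic behind the reduced range can be sketched as below. This is not code from the Arrow tree: `SecondsToMillis`, `kMillisPerSecond` and `kPatchBound` are hypothetical names, and the sketch assumes the seconds-to-milliseconds conversion scales each value by 1000. Under that assumption, any input whose magnitude exceeds roughly 9.22e15 would overflow a signed 64-bit integer, whereas the +/-9000000000000000 bound used in the patch stays inside the safe range.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Assumption for this sketch: one second is scaled to 1000 milliseconds.
constexpr int64_t kMillisPerSecond = 1000;

// Hypothetical helper (not an Arrow API): converts seconds to milliseconds,
// returning std::nullopt when the product would not fit in an int64_t.
constexpr std::optional<int64_t> SecondsToMillis(int64_t seconds) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  // Compare against the limits divided by the factor, so the check itself
  // never performs an overflowing multiplication.
  if (seconds > kMax / kMillisPerSecond || seconds < kMin / kMillisPerSecond) {
    return std::nullopt;
  }
  return seconds * kMillisPerSecond;
}

// The bound passed to gen.Int64() in the patch above.
constexpr int64_t kPatchBound = 9000000000000000LL;

// Both ends of the reduced range survive the scaling...
static_assert(SecondsToMillis(kPatchBound).has_value());
static_assert(SecondsToMillis(-kPatchBound).has_value());
// ...while an unrestricted int64 input does not.
static_assert(!SecondsToMillis(std::numeric_limits<int64_t>::max()).has_value());
```

The check divides the limits rather than multiplying the input because signed overflow is undefined behavior in C++, which is the kind of issue sanitizer-enabled fuzzing builds report.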