mbutrovich opened a new issue, #3472: URL: https://github.com/apache/datafusion-comet/issues/3472
### Describe the bug

I've only seen this in the legacy Iceberg tests so far, and it started within the last day or so. My immediate suspicion is https://github.com/apache/datafusion-comet/commit/8724b769164ac59c3aa908093eed5dc5ca10a7b8, since that change involves the `concat` function, which appears in the failing stack trace.

https://github.com/apache/datafusion-comet/actions/runs/21873083010/job/63134355589?pr=3446

```
org.apache.iceberg.spark.extensions.TestStoragePartitionedJoinsInRowLevelOperations.testMergeOnReadUpdateWithoutShuffles(TestStoragePartitionedJoinsInRowLevelOperations.java:165)

    Caused by:
    org.apache.comet.CometNativeException: capacity overflow
        at comet::errors::init::{{closure}}(__internal__:0)
        at std::panicking::panic_with_hook(__internal__:0)
        at std::panicking::panic_handler::{{closure}}(__internal__:0)
        at std::sys::backtrace::__rust_end_short_backtrace(__internal__:0)
        at __rustc::rust_begin_unwind(__internal__:0)
        at core::panicking::panic_fmt(__internal__:0)
        at alloc::raw_vec::capacity_overflow(__internal__:0)
        at alloc::raw_vec::handle_error(__internal__:0)
        at arrow_array::builder::generic_bytes_builder::GenericByteBuilder<T>::with_capacity(__internal__:0)
        at arrow_select::concat::concat_bytes(__internal__:0)
        at arrow_select::concat::concat(__internal__:0)
        at arrow_select::concat::concat_batches(__internal__:0)
        at datafusion_physical_plan::sorts::sort::ExternalSorter::in_mem_sort_stream(__internal__:0)
        at <S as futures_core::stream::TryStream>::try_poll_next(__internal__:0)
        at <datafusion_physical_plan::stream::RecordBatchStreamAdapter<S> as futures_core::stream::Stream>::poll_next(__internal__:0)
        at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}::{{closure}}(__internal__:0)
        at tokio::runtime::runtime::Runtime::block_on(__internal__:0)
        at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(__internal__:0)
        at comet::errors::try_unwrap_or_throw(__internal__:0)
        at Java_org_apache_comet_Native_executePlan(__internal__:0)
        at <unknown>(__internal__:0)
        at app//org.apache.comet.Native.executePlan(Native Method)
        at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:150)
        at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:149)
        at app//org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
        at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:149)
        at app//org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
        at app//org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:147)
        at app//org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:203)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.cometcolumnartorow_nextBatch_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
        at app//org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at app//org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.smj_findNextJoinRows_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
        at app//org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at app//org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
        at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$5(WriteToDataSourceV2Exec.scala:446)
        at app//org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397)
        at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:491)
        at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:430)
        at app//org.apache.spark.sql.execution.datasources.v2.DeltaWithMetadataWritingSparkTask.run(WriteToDataSourceV2Exec.scala:531)
        at app//org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:393)
        at app//org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
        at app//org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
        at app//org.apache.spark.scheduler.Task.run(Task.scala:141)
        at app//org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
        at app//org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at app//org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at app//org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
        at app//org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
        at [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at [email protected]/java.lang.Thread.run(Thread.java:840)
```

### Steps to reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_
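For context on the panic message itself (this is a standalone sketch, not Comet or arrow-rs code): "capacity overflow" is the message Rust's standard allocator guard emits when a `Vec` is asked to reserve more than `isize::MAX` bytes. The trace shows `GenericByteBuilder::<T>::with_capacity` hitting `alloc::raw_vec::capacity_overflow`, which is consistent with `concat_bytes` computing an oversized (possibly overflowed) total byte length for the output buffer. A minimal reproduction of that panic path, using only the standard library:

```rust
use std::panic;

// Sketch: Vec::with_capacity panics with "capacity overflow" when the
// requested allocation exceeds isize::MAX bytes. arrow-rs's byte builders
// allocate a buffer sized to the summed byte length of their inputs, so a
// huge or overflowed size estimate would land on this same panic path
// (alloc::raw_vec::capacity_overflow in the stack trace above).
fn triggers_capacity_overflow() -> bool {
    panic::catch_unwind(|| {
        // usize::MAX elements of u8 is more than isize::MAX bytes.
        let _v: Vec<u8> = Vec::with_capacity(usize::MAX);
    })
    .is_err()
}

fn main() {
    assert!(triggers_capacity_overflow());
    println!("reproduced the capacity-overflow panic path");
}
```

This only demonstrates where the message comes from; whether Comet's `concat` usage actually overflows the summed capacity, or merely requests a genuinely enormous buffer, would need to be confirmed against the batches produced by the Iceberg test.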
