mbutrovich opened a new issue, #3472:
URL: https://github.com/apache/datafusion-comet/issues/3472

   ### Describe the bug
   
   I've only seen this in the legacy Iceberg tests so far, and it started within 
the last day or so. My immediate suspicion is 
https://github.com/apache/datafusion-comet/commit/8724b769164ac59c3aa908093eed5dc5ca10a7b8, 
since it touches the `concat` function.
   
   
https://github.com/apache/datafusion-comet/actions/runs/21873083010/job/63134355589?pr=3446
   
   ```
   org.apache.iceberg.spark.extensions.TestStoragePartitionedJoinsInRowLevelOperations.testMergeOnReadUpdateWithoutShuffles(TestStoragePartitionedJoinsInRowLevelOperations.java:165)

       Caused by:
       org.apache.comet.CometNativeException: capacity overflow
               at comet::errors::init::{{closure}}(__internal__:0)
               at std::panicking::panic_with_hook(__internal__:0)
               at std::panicking::panic_handler::{{closure}}(__internal__:0)
               at std::sys::backtrace::__rust_end_short_backtrace(__internal__:0)
               at __rustc::rust_begin_unwind(__internal__:0)
               at core::panicking::panic_fmt(__internal__:0)
               at alloc::raw_vec::capacity_overflow(__internal__:0)
               at alloc::raw_vec::handle_error(__internal__:0)
               at arrow_array::builder::generic_bytes_builder::GenericByteBuilder<T>::with_capacity(__internal__:0)
               at arrow_select::concat::concat_bytes(__internal__:0)
               at arrow_select::concat::concat(__internal__:0)
               at arrow_select::concat::concat_batches(__internal__:0)
               at datafusion_physical_plan::sorts::sort::ExternalSorter::in_mem_sort_stream(__internal__:0)
               at <S as futures_core::stream::TryStream>::try_poll_next(__internal__:0)
               at <datafusion_physical_plan::stream::RecordBatchStreamAdapter<S> as futures_core::stream::Stream>::poll_next(__internal__:0)
               at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}::{{closure}}(__internal__:0)
               at tokio::runtime::runtime::Runtime::block_on(__internal__:0)
               at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(__internal__:0)
               at comet::errors::try_unwrap_or_throw(__internal__:0)
               at Java_org_apache_comet_Native_executePlan(__internal__:0)
               at <unknown>(__internal__:0)
           at app//org.apache.comet.Native.executePlan(Native Method)
           at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:150)
           at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:149)
           at app//org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
           at app//org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:149)
           at app//org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
           at app//org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:147)
           at app//org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:203)
           at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.cometcolumnartorow_nextBatch_0$(Unknown Source)
           at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
           at app//org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
           at app//org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
           at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.smj_findNextJoinRows_0$(Unknown Source)
           at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
           at app//org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
           at app//org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
           at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$5(WriteToDataSourceV2Exec.scala:446)
           at app//org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397)
           at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:491)
           at app//org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:430)
           at app//org.apache.spark.sql.execution.datasources.v2.DeltaWithMetadataWritingSparkTask.run(WriteToDataSourceV2Exec.scala:531)
           at app//org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:393)
           at app//org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
           at app//org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
           at app//org.apache.spark.scheduler.Task.run(Task.scala:141)
           at app//org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
           at app//org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
           at app//org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
           at app//org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
           at app//org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
           at [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
           at [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
           at [email protected]/java.lang.Thread.run(Thread.java:840)
   ```
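   For context on the panic itself: the trace bottoms out in `alloc::raw_vec::capacity_overflow`, which is the panic raised when a `Vec` (here the data buffer inside `GenericByteBuilder::with_capacity`, called from `arrow_select::concat::concat_bytes`) is asked to reserve more than `isize::MAX` bytes, e.g. because a summed byte-capacity estimate across the input arrays was inflated or wrapped. This is a minimal sketch of that failure mode using plain `std` types, not the actual Comet/arrow-rs code path; `usize::MAX` stands in for whatever oversized capacity `concat_bytes` computed:
   
   ```rust
   use std::panic;
   
   fn main() {
       // Silence the default panic hook so the expected panic doesn't spam stderr.
       panic::set_hook(Box::new(|_| {}));
   
       // Requesting a capacity above isize::MAX bytes triggers the same
       // alloc::raw_vec::capacity_overflow panic seen in the stack trace.
       let result = panic::catch_unwind(|| Vec::<u8>::with_capacity(usize::MAX));
       assert!(result.is_err());
   
       // The panic payload is the literal message surfaced in CometNativeException.
       let payload = result.unwrap_err();
       let msg = payload.downcast_ref::<&str>().copied().unwrap_or("");
       assert_eq!(msg, "capacity overflow");
       println!("reproduced: {msg}");
   }
   ```
   
   In other words, this panic is deterministic given a bad capacity request, so the interesting question is what made the requested capacity so large (or negative/wrapped) in the suspected `concat` change.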
   
   ### Steps to reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

