Messages by Thread
-
[I] Spark SQL `maintenance` test fails intermittently [datafusion-comet]
via GitHub
-
Re: [PR] feat: Support Spark expression hours_of_time [datafusion-comet]
via GitHub
-
Re: [PR] Add `rust-required-checks` [datafusion]
via GitHub
-
Re: [PR] implement `preimage` for date_trunc [datafusion]
via GitHub
-
Re: [PR] OptimizeProjections: safely prune struct-only UNNEST when outputs are unused [datafusion]
via GitHub
-
Re: [PR] feat: Improve `partition_statistics()` for `AggregateExec` using `distinct_count` [datafusion]
via GitHub
-
[PR] test: add INT96 TimestampNTZ correctness tests [datafusion-comet]
via GitHub
-
[I] native_datafusion more permissive than Spark 3.x when reading Parquet TimestampNTZ columns [datafusion-comet]
via GitHub
-
[PR] feat: add array_normalize scalar function [datafusion]
via GitHub
-
[D] [datafusion-spark] Add physical implementations for functions that only have simplify() [datafusion]
via GitHub
-
[I] Native DataFusion scan silently returns wrong values reading INT96 as TimestampNTZ [datafusion-comet]
via GitHub
-
[PR] deps: Bump OpenDAL to 0.56.0 [datafusion-comet]
via GitHub
-
[PR] feat: support Parquet field ID matching in native_datafusion scan [datafusion-comet]
via GitHub
-
Re: [I] [native_datafusion] Add support for reading row index metadata columns [datafusion-comet]
via GitHub
-
Re: [I] `lower`, `upper` could be further optimized for ASCII-only inputs [datafusion]
via GitHub
-
Re: [I] Support dict encoded structs in `get_field` [datafusion]
via GitHub
-
[PR] feat: support AQE DPP broadcast reuse for Iceberg native scans [datafusion-comet]
via GitHub
-
Re: [I] Write a wikipedia article for Apache DataFusion [datafusion]
via GitHub
-
Re: [I] Unsupported aggregation mode PartialMerge [datafusion-comet]
via GitHub
-
Re: [PR] Add support for PostgreSQL's ORDER BY ... USING <operator> clause [datafusion-sqlparser-rs]
via GitHub
-
Re: [PR] test: extend SPARK-43402 plan-match to CometNativeScanExec and retag to #4042 [datafusion-comet]
via GitHub
-
[PR] fix: include per-column details in exportBatch row count mismatch error [datafusion-comet]
via GitHub
-
[PR] Map ProfileCredentialsProvider to profile credential chain [datafusion-comet]
via GitHub
-
[I] Support AWS ProfileCredentialsProvider in native S3 object store [datafusion-comet]
via GitHub
-
[I] Number of rows in each column should be the same, but got [ArrayBuffer(8192, 0)] [datafusion-comet]
via GitHub
-
[PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [datafusion]
via GitHub
-
Re: [PR] proto: serialize dynamic filters on Sort, Aggregate, HashJoin plan nodes [datafusion]
via GitHub
-
[PR] docs: document Spark version labels in bug triage guide [datafusion-comet]
via GitHub
-
[PR] perf: coalesce batches before sending to distributor channels in RepartitionExec [datafusion]
via GitHub
-
[PR] WIP: windows support [datafusion-comet]
via GitHub
-
[PR] build: Enable Spark SQL tests for Spark 4.2.0-preview4 [datafusion-comet]
via GitHub
-
Re: [PR] feat: Optimise convert_to_state for SUM and BIT_OR_XOR [datafusion]
via GitHub
-
[PR] fix: [Spark 4.1] preserve union output partitioning in CometUnionExec [datafusion-comet]
via GitHub
-
[PR] feat: add Spark commit audit process [datafusion-comet]
via GitHub
-
[I] Create `spark-latest` profile [datafusion-comet]
via GitHub
-
[PR] test: [Spark 4.1.1] unignore CachedBatchSerializerNoUnwrapSuite [datafusion-comet]
via GitHub
-
[I] Bug triage results: 2026-05-04 [datafusion-comet]
via GitHub
-
[PR] docs: start Spark 4.1 known-limitations section, seeded with #4199 [datafusion-comet]
via GitHub
-
[PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
-
[I] Extend `datafusion-spark` `sequence` function [datafusion]
via GitHub
-
[PR] chore(deps): bump graphviz-rust from 0.9.7 to 0.9.8 [datafusion-ballista]
via GitHub
-
[PR] chore(deps): bump ctor from 0.12.0 to 1.0.0 [datafusion-ballista]
via GitHub
-
[PR] chore(deps): bump github/codeql-action from 4.35.2 to 4.35.3 [datafusion-ballista]
via GitHub
-
[PR] bench: add to_char_array_date32 [datafusion]
via GitHub
-
Re: [PR] feat: Support Spark expression minutes_of_time [datafusion-comet]
via GitHub
-
Re: [PR] Use datafusion-spark SparkArrayContains for three-valued NULL semantics [datafusion-comet]
via GitHub
-
Re: [PR] chore(deps): bump tonic from 0.14.2 to 0.14.3 [datafusion-sandbox]
via GitHub
-
Re: [PR] Fix cast wrapped AggregateUDF during Substrait Conversion [datafusion]
via GitHub
-
Re: [PR] Add Parquet read pruning configuration for max elements in inList [datafusion]
via GitHub
-
Re: [PR] dictionary encoded group by's [datafusion]
via GitHub
-
Re: [PR] perf: Optimize regexp match and not match for `.*foo.*` cases [datafusion]
via GitHub
-
[PR] refactor: introduce CometPlanner to replace CometScanRule and CometExecRule [datafusion-comet]
via GitHub
-
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
-
[I] macOS aarch64 flake: SIGBUS in _pthread_tsd_cleanup after ParquetReadFromFakeHadoopFsSuite [datafusion-comet]
via GitHub
-
[I] Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical type [datafusion-comet]
via GitHub
-
Re: [I] Use `datafusion-spark` version of `date_add` and `date_sub` [datafusion-comet]
via GitHub
-
[PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
-
[PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
-
[PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.3 [datafusion-python]
via GitHub
-
Re: [PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.2 [datafusion-python]
via GitHub
-
Re: [I] Write documentation for working with Comet's spark-4.0 profile in IntelliJ [datafusion-comet]
via GitHub
-
[PR] test: relax bytesRead ratio assertion for Spark 4.1+ + docs [datafusion-comet]
via GitHub
-
[PR] fix: support Spark 4.1 BloomFilter V2 format and bit-scattering [datafusion-comet]
via GitHub
-
[PR] test: unignore SPARK-52921 union partitioning tests [datafusion-comet]
via GitHub
-
[I] Spark 4.1: native parquet reader returns wrong rows for user-defined struct schema [datafusion-comet]
via GitHub
-
[I] Spark 4.1: native_datafusion bytesRead task metric off by 6-14x vs Spark [datafusion-comet]
via GitHub
-
[I] Spark 4.1: bloom filter result mismatch (might_contain returns wrong answers) [datafusion-comet]
via GitHub
-
[I] Spark 4.1: SPARK-52921 union output partitioning tests fail because Comet replaces UnionExec/ShuffleExchangeExec [datafusion-comet]
via GitHub
-
[PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
-
[I] Breaking change detector cannot compile datafusion-common in some cases [datafusion]
via GitHub
-
[PR] fix: preserve parent struct nullness when all requested fields missing in Parquet [datafusion-comet]
via GitHub
-
[I] Native scan path doesn't honour Parquet field-ID matching when spark.sql.parquet.fieldId.read.enabled=true [datafusion-comet]
via GitHub
-
[PR] Add benchmarks for dictionary path of new_group_values [datafusion]
via GitHub
-
[PR] docs: add llms.txt ecosystem hub at site root [datafusion]
via GitHub
-
[PR] docs: add upstream sync process documentation [datafusion-python]
via GitHub
-
[PR] feat: default to sort-merge join [datafusion-ballista]
via GitHub
-
Re: [I] Comet doesn't support Spark BroadcastHashJoinExec if it is null-aware anti-join [datafusion-comet]
via GitHub
-
[PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
-
[I] Set up processing for auditing all Spark commits to assess impact on Comet [datafusion-comet]
via GitHub
-
Re: [PR] Update user documentation for AI agent skill usage [datafusion-python]
via GitHub
-
[PR] Add benchmark_runner with help and list commands [datafusion]
via GitHub
-
[PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
-
[PR] feat(parquet): row-group and row-range sampling on ParquetSource [datafusion]
via GitHub