github
Messages by Date
2026/05/04
[PR] perf: coalesce batches before sending to distributor channels in RepartitionExec [datafusion]
via GitHub
2026/05/04
Re: [I] Track Spark 4.2 test failures [datafusion-comet]
via GitHub
2026/05/04
Re: [I] Track Spark 4.2 test failures [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] test: [Spark 4.1.1] unignore CachedBatchSerializerNoUnwrapSuite [datafusion-comet]
via GitHub
2026/05/04
Re: [I] CachedBatchSerializerNoUnwrapSuite: Comet replaces WholeStageCodegenExec [datafusion-comet]
via GitHub
2026/05/04
Re: [I] CaseWhen does not work with custom implemented column expression [datafusion]
via GitHub
2026/05/04
Re: [PR] chore: fix spark ansi sum incompatibility message [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] chore: fix spark ansi sum incompatibility message [datafusion-comet]
via GitHub
2026/05/04
Re: [I] Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical type [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: add bug-triage Claude skill [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: add bug-triage Claude skill [datafusion-comet]
via GitHub
2026/05/04
[PR] WIP: windows support [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
2026/05/04
Re: [PR] feat(functions-nested): add array_filter higher-order function [datafusion]
via GitHub
2026/05/04
Re: [PR] docs: start Spark 4.1 known-limitations section, seeded with #4199 [datafusion-comet]
via GitHub
2026/05/04
Re: [I] Requirements for scalar UDF preimage [datafusion]
via GitHub
2026/05/04
Re: [PR] Fix fully matched row groups with null counts [datafusion]
via GitHub
2026/05/04
[PR] build: Enable Spark SQL tests for Spark 4.2.0-preview4 [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: Optimise convert_to_state for SUM and BIT_OR_XOR [datafusion]
via GitHub
2026/05/04
Re: [PR] Fix fully matched row groups with null counts [datafusion]
via GitHub
2026/05/04
[PR] fix: [Spark 4.1] preserve union output partitioning in CometUnionExec [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
[PR] feat: add Spark commit audit process [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
2026/05/04
[I] Create `spark-latest` profile [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
2026/05/04
Re: [PR] Add a memory bound FileStatisticsCache for the Listing Table [datafusion]
via GitHub
2026/05/04
[PR] test: [Spark 4.1.1] unignore CachedBatchSerializerNoUnwrapSuite [datafusion-comet]
via GitHub
2026/05/04
Re: [I] Set up process for auditing all Spark commits to assess impact on Comet [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] functions: Add dict support for get field [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: add bug-triage Claude skill [datafusion-comet]
via GitHub
2026/05/04
[I] Bug triage results: 2026-05-04 [datafusion-comet]
via GitHub
2026/05/04
Re: [I] Bug triage results: 2026-04-27 [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] feat: add bug-triage Claude skill [datafusion-comet]
via GitHub
2026/05/04
[PR] docs: start Spark 4.1 known-limitations section, seeded with #4199 [datafusion-comet]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/04
Re: [I] Breaking change detector: adding "auto detect api change" label on detection [datafusion]
via GitHub
2026/05/04
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
[PR] Optimize ClickBench q17 aggregate limit [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
2026/05/04
Re: [PR] Add reusable plan-time schema alignment helper and apply to RecursiveQueryExec [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/04
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
2026/05/04
Re: [PR] Feat/map sql extension types [datafusion]
via GitHub
2026/05/04
Re: [PR] Explicitly declare spill codec dependency in `physical-plan` [datafusion]
via GitHub
2026/05/04
Re: [PR] Explicitly declare spill codec dependency in `physical-plan` [datafusion]
via GitHub
2026/05/04
Re: [PR] chore(deps): bump graphviz-rust from 0.9.7 to 0.9.8 [datafusion-ballista]
via GitHub
2026/05/04
Re: [PR] feat: Support RIGHT/FULL joins in NLJ memory-limited execution [datafusion]
via GitHub
2026/05/04
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/04
Re: [I] Make spill codec availability an explicit contract of the spill stack [datafusion]
via GitHub
2026/05/04
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] bench: add to_char_array_date32 [datafusion]
via GitHub
2026/05/03
Re: [PR] bench: add to_char_array_date32 [datafusion]
via GitHub
2026/05/03
Re: [I] Support array_agg as a sliding window aggregate by implementing retract_batch [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
Re: [I] Extend `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
[I] Extend `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
[PR] chore(deps): bump graphviz-rust from 0.9.7 to 0.9.8 [datafusion-ballista]
via GitHub
2026/05/03
[PR] chore(deps): bump ctor from 0.12.0 to 1.0.0 [datafusion-ballista]
via GitHub
2026/05/03
[PR] chore(deps): bump github/codeql-action from 4.35.2 to 4.35.3 [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
[PR] bench: add to_char_array_date32 [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: Add dialect-aware SQL serialization with `ToSql` trait for ClickHouse PascalCase types [datafusion-sqlparser-rs]
via GitHub
2026/05/03
Re: [PR] feat: support url encode expression [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat: Support Spark expression minutes_of_time [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat: support url_decode expression via StaticInvoke [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] [WIP] feat: support binary lpad/rpad via StaticInvoke [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump sphinx from 8.2.3 to 9.1.0 in /docs [datafusion-sandbox]
via GitHub
2026/05/03
Re: [PR] Iceberg v3 support - enable and initial version [iceberg] [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] Acceleration : Iceberg table compaction [iceberg] [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] [WIP] feat: support charTypeWriteSideCheck and varcharTypeWriteSideCheck via StaticInvoke [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] Use datafusion-spark SparkArrayContains for three-valued NULL semantics [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump myst-parser from 4.0.1 to 5.0.0 in /docs [datafusion-sandbox]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump sphinx from 8.2.3 to 9.1.0 in /docs [datafusion-sandbox]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump myst-parser from 4.0.1 to 5.0.0 in /docs [datafusion-sandbox]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump tonic from 0.14.2 to 0.14.3 [datafusion-sandbox]
via GitHub
2026/05/03
Re: [PR] Consolidate ClickBench setup documentation for DataFusion [datafusion]
via GitHub
2026/05/03
Re: [PR] Various performance improvements [datafusion]
via GitHub
2026/05/03
Re: [PR] Fix cast wrapped AggregateUDF during Substrait Conversion [datafusion]
via GitHub
2026/05/03
Re: [PR] Add options to control hash join dynamic filter pushdown [datafusion]
via GitHub
2026/05/03
Re: [PR] fix: race condition in SpillPool caused by buffered stream [datafusion]
via GitHub
2026/05/03
Re: [PR] refactor(substrait): use Arrow extension types instead of type variation hacks [datafusion]
via GitHub
2026/05/03
Re: [PR] Add Parquet read pruning configuration for max elements in inList [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]
via GitHub
2026/05/03
Re: [PR] Optimize arrow bytes view map [datafusion]
via GitHub
2026/05/03
Re: [PR] Expose virtual columns from the Arrow Parquet reader in datasource-parquet [datafusion]
via GitHub
2026/05/03
Re: [PR] dictionary encoded group by's [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Optimize regexp match and not match for `.*foo.*` cases [datafusion]
via GitHub
2026/05/03
Re: [I] Add cargo-fuzz / OSS-Fuzz coverage for DataFusion's parser and analyzer [datafusion]
via GitHub
2026/05/03
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
2026/05/03
Re: [I] Add cargo-fuzz / OSS-Fuzz coverage for DataFusion's parser and analyzer [datafusion]
via GitHub
2026/05/03
[PR] refactor: introduce CometPlanner to replace CometScanRule and CometExecRule [datafusion-comet]
via GitHub
2026/05/03
Re: [I] Release DataFusion `54.0.0` (Apr 2026 / May 2026) [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
2026/05/03
Re: [PR] test: [Spark 4.1.1] unignore SPARK-52921 union partitioning tests [datafusion-comet]
via GitHub
2026/05/03
Re: [I] Spark 4.1: SPARK-52921 union output partitioning tests fail because Comet replaces UnionExec/ShuffleExchangeExec [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
2026/05/03
[I] macOS aarch64 flake: SIGBUS in _pthread_tsd_cleanup after ParquetReadFromFakeHadoopFsSuite [datafusion-comet]
via GitHub
2026/05/03
[I] Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical type [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
2026/05/03
Re: [I] Use `datafusion-spark` version of `date_add` and `date_sub` [datafusion-comet]
via GitHub
2026/05/03
[PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
2026/05/03
Re: [PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] docs: add custom table provider filter pushdown examples [datafusion]
via GitHub
2026/05/03
Re: [PR] fix(physical-plan): set column byte_size to 0 in FilterExec zero-row interval stats [datafusion]
via GitHub
2026/05/03
Re: [PR] perf: Cast entire Date32 array to Date64 on 1st failure [datafusion]
via GitHub
2026/05/03
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
2026/05/03
[PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
2026/05/03
Re: [I] CachedBatchSerializerNoUnwrapSuite: Comet replaces WholeStageCodegenExec [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.2 [datafusion-python]
via GitHub
2026/05/03
[PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.3 [datafusion-python]
via GitHub
2026/05/03
Re: [PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.2 [datafusion-python]
via GitHub
2026/05/03
Re: [I] Write documentation for working with Comet's spark-4.0 profile in IntelliJ [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
2026/05/03
[PR] test: relax bytesRead ratio assertion for Spark 4.1+ + docs [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] Add StatisticsContext parameter to partition_statistics [datafusion]
via GitHub
2026/05/03
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/03
Re: [PR] Add StatisticsContext parameter to partition_statistics [datafusion]
via GitHub
2026/05/03
Re: [PR] docs: add roadmap items for spillable hash join, UDF support, memory management, and 1.0.0 [datafusion-comet]
via GitHub
2026/05/03
Re: [I] Apply spark.comet.exec.strictFloatingPoint to RangePartitioning [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] chore(deps): bump cc from 1.2.60 to 1.2.61 in /native in the all-other-cargo-deps group [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
2026/05/03
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/03
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
2026/05/03
Re: [PR] docs: add Ballista TUI documentation [datafusion-ballista]
via GitHub
2026/05/03
[PR] fix: support Spark 4.1 BloomFilter V2 format and bit-scattering [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] ci: add `auto detected api change` label on breaking change detecting in the CI [datafusion]
via GitHub
2026/05/03
Re: [PR] fix(aggregate): show aliased expr in explain [datafusion]
via GitHub
2026/05/03
Re: [I] Should Ballista use sort-merge join rather than hash join by default? [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
2026/05/03
Re: [I] Breaking change detector cannot compile datafusion-common in some cases [datafusion]
via GitHub
2026/05/03
Re: [PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
2026/05/03
Re: [PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
2026/05/03
[PR] test: unignore SPARK-52921 union partitioning tests [datafusion-comet]
via GitHub
2026/05/03
Re: [I] Tracking: remaining Spark 4.1 CI failures [datafusion-comet]
via GitHub
2026/05/03
[I] Spark 4.1: native parquet reader returns wrong rows for user-defined struct schema [datafusion-comet]
via GitHub
2026/05/03
[I] Spark 4.1: native_datafusion bytesRead task metric off by 6-14x vs Spark [datafusion-comet]
via GitHub
2026/05/03
[I] Spark 4.1: bloom filter result mismatch (might_contain returns wrong answers) [datafusion-comet]
via GitHub
2026/05/03
[I] Spark 4.1: SPARK-52921 union output partitioning tests fail because Comet replaces UnionExec/ShuffleExchangeExec [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
2026/05/03
[PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
2026/05/03
[I] Breaking change detector cannot compile datafusion-common in some cases [datafusion]
via GitHub
2026/05/03
[PR] fix: preserve parent struct nullness when all requested fields missing in Parquet [datafusion-comet]
via GitHub
2026/05/03
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
2026/05/03
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
2026/05/03
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
2026/05/03
[I] Native scan path doesn't honour Parquet field-ID matching when spark.sql.parquet.fieldId.read.enabled=true [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat: Cache ballista clients on executor [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: move shuffle writer disk I/O off tokio worker threads [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: support regular BuildRight+LeftAnti hash join [datafusion-comet]
via GitHub
2026/05/03
Re: [PR] feat: move shuffle writer disk I/O off tokio worker threads [datafusion-ballista]
via GitHub
2026/05/03
[PR] Add benchmarks for dictionary path of new_group_values [datafusion]
via GitHub
2026/05/03
Re: [PR] docs: add llms.txt ecosystem hub at site root [datafusion]
via GitHub
2026/05/03
Re: [PR] feat: Cache ballista clients on executor [datafusion-ballista]
via GitHub
2026/05/03
Re: [PR] feat: Cache ballista clients on executor [datafusion-ballista]
via GitHub