github
Messages by Date
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/28
Re: [I] Ballista configs cannot be set in Python client [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
[I] Ballista configs cannot be set in Python client [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
[I] Add IO_BLOCK_TRANSPORT support for sort-based shuffle [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] ci: enable Spark 4.1 PR test matrix [datafusion-comet]
via GitHub
2026/04/28
[I] SPARK-53968 SQLViewSuite: decimal arithmetic returns ~10x smaller values through view CTE on Spark 4.1.1 [datafusion-comet]
via GitHub
2026/04/28
[I] Comet native sort lacks row-format support for Struct(Map(...)) sort keys [datafusion-comet]
via GitHub
2026/04/28
[I] EXCEPT ALL / INTERSECT ALL with GROUP BY return incorrect results on Spark 4.1.1 [datafusion-comet]
via GitHub
2026/04/28
[I] Comet native scan rejects invalid UTF-8 byte sequences in STRING column (hll.sql on Spark 4.1) [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
[I] Type annotation requires UDFs to return a type instead of an array [datafusion-python]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] fix: grouping separator for float and decimal [datafusion]
via GitHub
2026/04/28
Re: [PR] Add support for nested types to nullif. [datafusion]
via GitHub
2026/04/28
Re: [I] Add support for nested types to `nullif`. [datafusion]
via GitHub
2026/04/28
Re: [I] Support `EXPLAIN ANALYZE` in Ballista [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] feat: Support `EXPLAIN ANALYZE` in Ballista [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] feat(unparser): Keep inner join `Filter → TableScan` predicates to `WHERE` instead of moving to `JOIN ON` [datafusion]
via GitHub
2026/04/28
Re: [PR] Add StatisticsContext parameter to partition_statistics [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Scheduler config update [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] Intermediate result blocked approach to aggregation memory management [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Support Spark levenshtein expression in native execution [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] fix: fix elapsed_compute metric in ParquetSink to report encoding time only [datafusion]
via GitHub
2026/04/28
Re: [I] Potential Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [I] Potential Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Scheduler config update [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]
via GitHub
2026/04/28
Re: [PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]
via GitHub
2026/04/28
[PR] chore(deps): bump libloading from 0.8.9 to 0.9.0 [datafusion]
via GitHub
2026/04/28
Re: [PR] ci: pin JDK per Spark version in Iceberg workflow matrix [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
[PR] chore(deps): update pydata-sphinx-theme requirement from <1,>=0.17.0 to >=0.17.1,<1 in /docs [datafusion]
via GitHub
2026/04/28
[PR] chore(deps): bump pbjson-types from 0.8.0 to 0.9.0 in the proto group [datafusion]
via GitHub
2026/04/28
[PR] chore(deps): bump taiki-e/install-action from 2.75.18 to 2.75.23 [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [I] Blog post about 1000 distinct committers / history of the project [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: LogicalPlanningPipeline [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Scheduler config update [datafusion-ballista]
via GitHub
2026/04/28
Re: [I] Potential Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [I] [DRAFT, EPIC] Full lambda support [datafusion]
via GitHub
2026/04/28
[PR] Improved multiple column aggregation performance by using bitmasks rather than `Vec<bool>` [datafusion]
via GitHub
2026/04/28
Re: [PR] ci: pin JDK per Spark version in Iceberg workflow matrix [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/28
Re: [PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/28
Re: [PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Support Spark levenshtein expression in native execution [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] feat: Support Spark levenshtein expression in native execution [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] feat: Support `EXPLAIN ANALYZE` in Ballista [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/28
Re: [PR] feat: Support `EXPLAIN ANALYZE` in Ballista [datafusion-ballista]
via GitHub
2026/04/28
Re: [PR] feat: Support Spark levenshtein expression in native execution [datafusion-comet]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/28
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: Support column based partition write in comet IO [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: [iceberg] allow native Iceberg scans with non-identity transform residuals [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: add encode time tracking for shuffle operations [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: add encode time tracking for shuffle operations [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] ci: pin JDK per Spark version in Iceberg workflow matrix [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] docs: improve Python documentation structure [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] chore: fix spark ansi sum incompatibility message [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] chore: fix spark ansi sum incompatibility message [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: Cache remote shuffle reader clients on executor [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: Cache remote shuffle reader clients on executor [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: support binary arguments for StringConcat operator [datafusion]
via GitHub
2026/04/27
[PR] ci: pin JDK per Spark version in Iceberg workflow matrix [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: Cache remote shuffle reader clients on executor [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
Re: [PR] Add lambda support and array_transform udf [datafusion]
via GitHub
2026/04/27
[PR] fix(proto): correctly serialize FilterExec empty projection [datafusion]
via GitHub
2026/04/27
Re: [PR] chore(deps): bump reqwest from 0.13.2 to 0.13.3 [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] fix(optimizer): preserve filter execution order in PushDownFilter [datafusion]
via GitHub
2026/04/27
Re: [PR] fix(optimizer): preserve filter execution order in PushDownFilter [datafusion]
via GitHub
2026/04/27
Re: [PR] Add a memory bound FileStatisticsCache for the Listing Table [datafusion]
via GitHub
2026/04/27
Re: [PR] perf(spark): use 256-entry byte-pair table in hex encoding [datafusion]
via GitHub
2026/04/27
Re: [PR] fix(proto): correctly serialize `FilterExec` empty projection [datafusion]
via GitHub
2026/04/27
Re: [PR] fix(proto): correctly serialize `FilterExec` empty projection [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: suport spark compatible floor function [datafusion]
via GitHub
2026/04/27
Re: [PR] perf(spark): use 256-entry byte-pair table in hex encoding [datafusion]
via GitHub
2026/04/27
Re: [PR] build: Enable Spark SQL tests for Spark 4.1.1 [datafusion-comet]
via GitHub
2026/04/27
[PR] chore(deps): bump reqwest from 0.13.2 to 0.13.3 [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: suport spark compatible floor function [datafusion]
via GitHub
2026/04/27
[PR] Fix GH action permissions in `rust.yml` and `docs.yaml` workflows [datafusion]
via GitHub
2026/04/27
Re: [PR] refactor `array_remove` benchmarks & add nested benches [datafusion]
via GitHub
2026/04/27
Re: [PR] Fix some GH action permission issues identified by CodeQL [datafusion]
via GitHub
2026/04/27
Re: [PR] Fix some GH action permission issues identified by CodeQL [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: default to sort-based shuffle writer [WIP] [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: default to sort-based shuffle writer [WIP] [datafusion-ballista]
via GitHub
2026/04/27
Re: [I] Add vector distance, array math, and array aggregate functions [datafusion]
via GitHub
2026/04/27
[PR] build: add spark-4.2 Maven profile targeting 4.2.0-preview4 [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] refactor: consolidate identical spark-4.0 and spark-4.1 shims into spark-4.x [datafusion-comet]
via GitHub
2026/04/27
[PR] refactor: consolidate identical spark-4.0 and spark-4.1 shims into spark-4.x [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] fix: InterleaveExec fallback to UnionExec when children partitioning diverges [datafusion]
via GitHub
2026/04/27
Re: [PR] fix: InterleaveExec fallback to UnionExec when children partitioning diverges [datafusion]
via GitHub
2026/04/27
Re: [PR] build: add spark-4.1 Maven profile and shim sources [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: split disk_write_time from write_time in sort shuffle writer [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: default to sort-based shuffle writer [WIP] [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] Do not merge - Test df binding python [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: default to sort-based shuffle writer [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] build: add spark-4.1 Maven profile and shim sources [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] build: add spark-4.1 Maven profile and shim sources [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: [iceberg] Pass table master key ID to native scan [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] Run core benchmarks with CodSpeed [datafusion]
via GitHub
2026/04/27
Re: [PR] Fix:c Median() truncates integers by returning Float64 for integer inputs [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: Prune complex/nested predicates via statistics propagation [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: Add Spark-compatible `xxhash64` and `murmur3` hash functions [datafusion]
via GitHub
2026/04/27
Re: [PR] Refactor(optimizer): Simplify Stable ScalarUDFs using physical execution [datafusion]
via GitHub
2026/04/27
Re: [PR] perf: reduce read amplification for partitioned JSON file scanning [datafusion]
via GitHub
2026/04/27
Re: [PR] Fix: External sort OOM for single oversized batches by chunked spill fallback [datafusion]
via GitHub
2026/04/27
Re: [PR] [TEST] Filter pushdown dynamic [datafusion]
via GitHub
2026/04/27
Re: [PR] AggregateExec: incremental memory release before streaming merge [datafusion]
via GitHub
2026/04/27
Re: [PR] fix(explain): render aggregate expressions correctly when CSE is applied [datafusion]
via GitHub
2026/04/27
Re: [PR] Simplify case when all result expressions are identical [datafusion]
via GitHub
2026/04/27
Re: [PR] Remove from string column [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: add coerce_arguments flag to UDTFs to allow skipping automatic … [datafusion]
via GitHub
2026/04/27
Re: [PR] refactor: Consolidate async UDF handling in physical planner [datafusion]
via GitHub
2026/04/27
Re: [PR] feat: add LazyPartitioned mode for hash join to reduce RepartitionExec overhead [datafusion]
via GitHub
2026/04/27
Re: [PR] Preserve input field nullability in ArrayAgg return field [datafusion]
via GitHub
2026/04/27
Re: [PR] Changed clippy.toml and added std hashmap and hashset to disallowed t… [datafusion]
via GitHub
2026/04/27
Re: [PR] Gene.bordegaray/2026/02/partition index dynamic filters [datafusion]
via GitHub
2026/04/27
Re: [PR] [BENCH] Benchmark upgrade coalesce improvements [datafusion]
via GitHub
2026/04/27
Re: [PR] docs: add Docker-based workflow for building documentation [datafusion]
via GitHub
2026/04/27
Re: [PR] Support reverse page ordering in sort pushdown phase 1 [datafusion]
via GitHub
2026/04/27
Re: [PR] Add preimage optimization for `ceil` to rewrite predicates into range filters [datafusion]
via GitHub
2026/04/27
Re: [PR] [datafusion-spark] Implement map function [datafusion]
via GitHub
2026/04/27
Re: [PR] WIP: FFI query planner [datafusion]
via GitHub
2026/04/27
Re: [I] Use interleave_record_batch to avoid tiny batches in sort-based shuffle [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: Add standalone shuffle writer benchmark that shuffles real Parquet input [datafusion-ballista]
via GitHub
2026/04/27
Re: [I] Add support for Spark 4.1 `OneRowRelationExec` [datafusion-comet]
via GitHub
2026/04/27
[PR] feat: default to sort-based shuffle writer [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: Add standalone shuffle writer benchmark that shuffles real Parquet input [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: Add standalone shuffle writer benchmark that shuffles real Parquet input [datafusion-ballista]
via GitHub
2026/04/27
Re: [I] Replace shuffle implementation with version based on Comet [datafusion-ballista]
via GitHub
2026/04/27
[PR] fix: throw SchemaColumnConvertNotSupportedException from native_datafusion schema mismatch [datafusion-comet]
via GitHub
2026/04/27
[PR] feat: split disk_write_time from write_time in sort shuffle writer [datafusion-ballista]
via GitHub
2026/04/27
Re: [I] Dictionary coercion on min/max [datafusion]
via GitHub
2026/04/27
Re: [PR] Support Dictionary Arrays in MIN/MAX Aggregates [datafusion]
via GitHub
2026/04/27
Re: [PR] Support Dictionary Arrays in MIN/MAX Aggregates [datafusion]
via GitHub
2026/04/27
[I] Replace shuffle implementation with version based on Comet [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] feat: support `PartialMerge` aggregation mode [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] feat: support regular BuildRight+LeftAnti hash join [datafusion-comet]
via GitHub
2026/04/27
Re: [I] Release DataFusion `54.0.0` (Apr 2026 / May 2026) [datafusion]
via GitHub
2026/04/27
[PR] feat: Utf8View and BinaryView support [datafusion-comet]
via GitHub
2026/04/27
[PR] feat(aqe): support executor failure in AdaptiveExecutionGraph [datafusion-ballista]
via GitHub
2026/04/27
Re: [PR] ci: add Spark 4.0 / JDK 21 profile [datafusion-comet]
via GitHub
2026/04/27
Re: [I] Create new Comet logo that is consistent with other DataFusion Projects [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] docs: add Understanding Comet Plans user guide page [datafusion-comet]
via GitHub
2026/04/27
Re: [I] Improve documentation for configs related to explain/fallback [datafusion-comet]
via GitHub
2026/04/27
Re: [PR] docs: add Understanding Comet Plans user guide page [datafusion-comet]
via GitHub