github
Messages by Thread
[PR] chore(deps): bump graphviz-rust from 0.9.7 to 0.9.8 [datafusion-ballista]
via GitHub
[PR] chore(deps): bump ctor from 0.12.0 to 1.0.0 [datafusion-ballista]
via GitHub
[PR] chore(deps): bump github/codeql-action from 4.35.2 to 4.35.3 [datafusion-ballista]
via GitHub
[PR] bench: add to_char_array_date32 [datafusion]
via GitHub
Re: [PR] feat: Support Spark expression minutes_of_time [datafusion-comet]
via GitHub
Re: [PR] Use datafusion-spark SparkArrayContains for three-valued NULL semantics [datafusion-comet]
via GitHub
Re: [PR] chore(deps): bump tonic from 0.14.2 to 0.14.3 [datafusion-sandbox]
via GitHub
Re: [PR] Fix cast wrapped AggregateUDF during Substrait Conversion [datafusion]
via GitHub
Re: [PR] Add Parquet read pruning configuration for max elements in inList [datafusion]
via GitHub
Re: [PR] dictionary encoded group by's [datafusion]
via GitHub
Re: [PR] perf: Optimize regexp match and not match for `.*foo.*` cases [datafusion]
via GitHub
[PR] refactor: introduce CometPlanner to replace CometScanRule and CometExecRule [datafusion-comet]
via GitHub
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
Re: [PR] feat: Implement `datafusion-spark` `sequence` function [datafusion]
via GitHub
[I] macOS aarch64 flake: SIGBUS in _pthread_tsd_cleanup after ParquetReadFromFakeHadoopFsSuite [datafusion-comet]
via GitHub
[I] Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical type [datafusion-comet]
via GitHub
Re: [I] Use `datafusion-spark` version of `date_add` and `date_sub` [datafusion-comet]
via GitHub
[PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
Re: [PR] fix(metrics): avoid stage metrics inflation by tracking partition snapshots [datafusion-ballista]
via GitHub
[PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
Re: [PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
Re: [PR] docs: document Spark 4 IntelliJ setup [datafusion-comet]
via GitHub
[PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.3 [datafusion-python]
via GitHub
Re: [PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.2 [datafusion-python]
via GitHub
Re: [PR] build(deps): bump github/codeql-action from 4.32.5 to 4.35.2 [datafusion-python]
via GitHub
Re: [I] Write documentation for working with Comet's spark-4.0 profile in IntelliJ [datafusion-comet]
via GitHub
[PR] test: relax bytesRead ratio assertion for Spark 4.1+ + docs [datafusion-comet]
via GitHub
[PR] fix: support Spark 4.1 BloomFilter V2 format and bit-scattering [datafusion-comet]
via GitHub
[PR] test: unignore SPARK-52921 union partitioning tests [datafusion-comet]
via GitHub
Re: [PR] test: [Spark 4.1.1] unignore SPARK-52921 union partitioning tests [datafusion-comet]
via GitHub
[I] Spark 4.1: native parquet reader returns wrong rows for user-defined struct schema [datafusion-comet]
via GitHub
[I] Spark 4.1: native_datafusion bytesRead task metric off by 6-14x vs Spark [datafusion-comet]
via GitHub
[I] Spark 4.1: bloom filter result mismatch (might_contain returns wrong answers) [datafusion-comet]
via GitHub
[I] Spark 4.1: SPARK-52921 union output partitioning tests fail because Comet replaces UnionExec/ShuffleExchangeExec [datafusion-comet]
via GitHub
Re: [I] Spark 4.1: SPARK-52921 union output partitioning tests fail because Comet replaces UnionExec/ShuffleExchangeExec [datafusion-comet]
via GitHub
[PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
Re: [PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
Re: [PR] ci: use base repository branch for breaking change detector [datafusion]
via GitHub
[I] Breaking change detector cannot compile datafusion-common in some cases [datafusion]
via GitHub
Re: [I] Breaking change detector cannot compile datafusion-common in some cases [datafusion]
via GitHub
[PR] fix: preserve parent struct nullness when all requested fields missing in Parquet [datafusion-comet]
via GitHub
[I] Native scan path doesn't honour Parquet field-ID matching when spark.sql.parquet.fieldId.read.enabled=true [datafusion-comet]
via GitHub
[PR] Add benchmarks for dictionary path of new_group_values [datafusion]
via GitHub
[PR] docs: add llms.txt ecosystem hub at site root [datafusion]
via GitHub
Re: [PR] docs: add llms.txt ecosystem hub at site root [datafusion]
via GitHub
[PR] docs: add upstream sync process documentation [datafusion-python]
via GitHub
[PR] feat: default to sort-merge join [datafusion-ballista]
via GitHub
Re: [I] Comet doesn't support Spark BroadcastHashJoinExec if it is null-aware anti-join [datafusion-comet]
via GitHub
[PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
Re: [PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
Re: [PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
Re: [PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
Re: [PR] [WIP] Explore extensible range partitioning for dynamic filters [datafusion]
via GitHub
[I] Set up processing for auditing all Spark commits to assess impact on Comet [datafusion-comet]
via GitHub
Re: [I] Set up process for auditing all Spark commits to assess impact on Comet [datafusion-comet]
via GitHub
[PR] Add benchmark_runner with help and list commands [datafusion]
via GitHub
[PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
Re: [PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
Re: [PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
Re: [PR] feat: job plans optional rendering [datafusion-ballista]
via GitHub
[PR] feat(parquet): row-group and row-range sampling on ParquetSource [datafusion]
via GitHub
Re: [PR] feat: TABLESAMPLE SYSTEM end-to-end + row-group / row sampling on ParquetSource [datafusion]
via GitHub
Re: [PR] feat(pruning): add StatisticsSource trait with two-phase resolve/evaluate API [datafusion]
via GitHub
[PR] fix(physical-plan): set column byte_size to 0 in FilterExec zero-row interval stats [datafusion]
via GitHub
Re: [PR] fix(physical-plan): set column byte_size to 0 in FilterExec zero-row interval stats [datafusion]
via GitHub
Re: [I] Release 0.61.1 to resolve panics [datafusion-sqlparser-rs]
via GitHub
[I] Add cargo-fuzz / OSS-Fuzz coverage for DataFusion's parser and analyzer [datafusion]
via GitHub
Re: [I] Add cargo-fuzz / OSS-Fuzz coverage for DataFusion's parser and analyzer [datafusion]
via GitHub
Re: [I] Add cargo-fuzz / OSS-Fuzz coverage for DataFusion's parser and analyzer [datafusion]
via GitHub
Re: [PR] kill `linux-build-lib` from extended tests [datafusion]
via GitHub
Re: [PR] kill `linux-build-lib` from extended tests [datafusion]
via GitHub
Re: [PR] kill `linux-build-lib` from extended tests [datafusion]
via GitHub
[I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
[PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [PR] Query-aware statistics requests via ScanArgs / ScanResult (RFC for #21624) [datafusion]
via GitHub
Re: [I] Improve `InListExpr` types and expectations [datafusion]
via GitHub
Re: [PR] Add ExpressionAnalyzer for pluggable expression-level statistics estimation [datafusion]
via GitHub
Re: [PR] functions: Add dict support for get field [datafusion]
via GitHub
[PR] feat(aqe) remove eager plan stages split [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe) remove eager plan stages split [datafusion-ballista]
via GitHub
[PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
Re: [PR] Add group join physical optimizer [datafusion]
via GitHub
[PR] feat: Add Protobuf support for Explain node [datafusion]
via GitHub
Re: [PR] feat: Add Protobuf support for Explain node [datafusion]
via GitHub
Re: [PR] feat: Support basic Delta scans [datafusion-comet]
via GitHub
Re: [PR] perf: Add ReflectionCache for Iceberg serialization optimization [iceberg] [datafusion-comet]
via GitHub
Re: [PR] docs: add custom table provider filter pushdown examples [datafusion]
via GitHub
Re: [PR] docs: add custom table provider filter pushdown examples [datafusion]
via GitHub
[PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
Re: [PR] feat: type-keyed extensions map for PartitionedFile [datafusion]
via GitHub
[PR] fix(spark-expr): preserve scalar tag in WideDecimalBinaryExpr when both inputs are scalars [datafusion-comet]
via GitHub
Re: [PR] fix(spark-expr): preserve scalar tag in WideDecimalBinaryExpr when both inputs are scalars [datafusion-comet]
via GitHub
[PR] feat: memory-budget-aware SortMergeJoin to ShuffledHashJoin rewrite [datafusion-comet]
via GitHub
[I] [DISCUSSION] Extending Partitioning to Support More Variants [datafusion]
via GitHub
Re: [I] [DISCUSSION] Extending Partitioning to Support More Variants [datafusion]
via GitHub
Re: [I] [DISCUSSION] Extending Partitioning to Support More Variants [datafusion]
via GitHub
[PR] build(deps): bump arrow-schema from 58.1.0 to 58.2.0 [datafusion-python]
via GitHub
[PR] build(deps): bump arrow-array from 58.1.0 to 58.2.0 [datafusion-python]
via GitHub
Re: [PR] Spark dayname function implementation [datafusion]
via GitHub
Re: [I] Unnecessary RepartitionExec + SortPreservingMergeExec on single-partition sorted output [datafusion]
via GitHub
[I] Should Ballista use sort-merge join rather than hash join by default? [datafusion-ballista]
via GitHub
Re: [I] Should Ballista use sort-merge join rather than hash join by default? [datafusion-ballista]
via GitHub
Re: [I] Should Ballista use sort-merge join rather than hash join by default? [datafusion-ballista]
via GitHub
Re: [I] Should Ballista use sort-merge join rather than hash join by default? [datafusion-ballista]
via GitHub
[PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `repeat` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `reverse` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `reverse` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `reverse` using bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `reverse` using bulk-NULL string builders [datafusion]
via GitHub
[I] Optimize `reverse` with bulk-NULL string builders [datafusion]
via GitHub
Re: [I] Optimize `reverse` with bulk-NULL string builders [datafusion]
via GitHub
[PR] chore: update PMC/committer list [datafusion]
via GitHub
Re: [PR] chore: update PMC/committer list [datafusion]
via GitHub
[PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
Re: [PR] feat(scheduler): broadcast-style hash join for small-side joins [datafusion-ballista]
via GitHub
[PR] fix: `median` returns Float64 for integer inputs to avoid truncation [datafusion]
via GitHub
Re: [PR] fix: `median` returns Float64 for integer inputs to avoid truncation [datafusion]
via GitHub
Re: [PR] fix: `median` returns Float64 for integer inputs to avoid truncation [datafusion]
via GitHub
Re: [PR] fix: `median` returns Float64 for integer inputs to avoid truncation [datafusion]
via GitHub
[PR] test: add Scala test coverage for spark.sql.optimizer.nestedSchemaPruning.enabled [datafusion-comet]
via GitHub
[PR] [do-not-merge] Wide-schema parquet read perf — visibility for #21968 [datafusion]
via GitHub
Re: [PR] [do-not-merge] Wide-schema parquet read perf — visibility for #21968 [datafusion]
via GitHub
Re: [PR] feat: Replace current hash join implementation with Grace hash join [experimental] [datafusion-comet]
via GitHub
Re: [PR] feat: Replace current hash join implementation with Grace hash join [experimental] [datafusion-comet]
via GitHub
Re: [PR] feat: Replace current hash join implementation with Grace hash join [experimental] [datafusion-comet]
via GitHub
[PR] feat: Add experimental Grace Hash Join operator [WIP] [datafusion-comet]
via GitHub
Re: [PR] feat: Add experimental Grace Hash Join operator [datafusion-comet]
via GitHub
[PR] build(bench): rework docker-compose TPC-H stack and add manual benchmark workflow [datafusion-ballista]
via GitHub
Re: [PR] build(bench): rework docker-compose TPC-H stack and add manual benchmark workflow [datafusion-ballista]
via GitHub
Re: [PR] build(bench): rework docker-compose TPC-H stack and add manual benchmark workflow [datafusion-ballista]
via GitHub
Re: [PR] build(bench): rework docker-compose TPC-H stack [datafusion-ballista]
via GitHub
Re: [PR] build(bench): rework docker-compose TPC-H stack [datafusion-ballista]
via GitHub
[PR] test: add SQL test coverage for spark.sql.legacy.timeParserPolicy [datafusion-comet]
via GitHub
[PR] fix: Avoid spurious repartitioning with ScalarSubqueryExec [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
Re: [PR] fix: Avoid unnecessary input repartitioning with `ScalarSubqueryExec` [datafusion]
via GitHub
[PR] Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] feat: Making From<ConfigFileDecryptionProperties/ConfigFileEncryptionProperties> conversions fallible with `TryFrom` [datafusion]
via GitHub
Re: [PR] fix: generate integer keys instead of floats in TPC-DS data [datafusion-benchmarks]
via GitHub
Re: [PR] fix: generate integer keys instead of floats in TPC-DS data [datafusion-benchmarks]
via GitHub
[PR] test: add explicit negative-count cases for array_repeat [datafusion-comet]
via GitHub
Re: [PR] test: add explicit negative-count cases for array_repeat [datafusion-comet]
via GitHub
[PR] fix: propagate inner-field metadata through composite-type constructors [datafusion]
via GitHub
Re: [PR] fix: propagate inner-field metadata through composite-type constructors [datafusion]
via GitHub
Re: [PR] fix: propagate inner-field metadata through composite-type constructors [datafusion]
via GitHub
[PR] feat: Refactor NLJ into an extensible framework for specialized joins [datafusion]
via GitHub
Re: [PR] feat: Refactor NLJ into an extensible framework for specialized joins [datafusion]
via GitHub
Re: [PR] feat: Refactor NLJ into an extensible framework for specialized joins [datafusion]
via GitHub
[I] make_array / array_agg drop inner-Field metadata when constructing List<T> [datafusion]
via GitHub
Re: [I] make_array / array_agg drop inner-Field metadata when constructing List<T> [datafusion]
via GitHub