github
Thread
Date
Earlier messages
Later messages
Messages by Thread
Re: [I] Is there any more information about UDF framework? [datafusion-comet]
via GitHub
Re: [I] Is there any more information about UDF framework? [datafusion-comet]
via GitHub
[I] Add support for scalar UDFs that operate on Arrow data [datafusion-comet]
via GitHub
[PR] chore: Bump version to 0.62.0 and add changelog [datafusion-sqlparser-rs]
via GitHub
Re: [PR] chore: Bump version to 0.62.0 and add changelog [datafusion-sqlparser-rs]
via GitHub
Re: [PR] chore: Bump version to 0.62.0 and add changelog [datafusion-sqlparser-rs]
via GitHub
[I] TPC-H Q2 parallelism bottleneck: hash partitioning on low-cardinality keys causes severe data skew [datafusion-ballista]
via GitHub
[I] Make JVM-scalar-UDF dispatch responsive to task cancellation [datafusion-comet]
via GitHub
[I] Register CometArrowAllocator as a Spark MemoryConsumer for JVM-UDF dispatch [datafusion-comet]
via GitHub
[I] Add metrics and logging to JVM-scalar-UDF dispatch path [datafusion-comet]
via GitHub
[I] Tighten CometUDF API with input/return type validation at registration [datafusion-comet]
via GitHub
[I] JNI local references accumulate across executor JVM lifetime in native call sites [datafusion-comet]
via GitHub
[PR] docs: add roadmap items for spillable hash join, UDF support, memory management, and 1.0.0 [datafusion-comet]
via GitHub
[PR] Add EnsureRequirements: merged EnforceDistribution + EnforceSorting with idempotent pushdown_sorts [datafusion]
via GitHub
[PR] Js/remove snapshot method from physical expr [datafusion]
via GitHub
Re: [PR] Js/remove snapshot method from physical expr [datafusion]
via GitHub
Re: [PR] physical-expr-common: remove `PhysicalExpr::snapshot` method [datafusion]
via GitHub
Re: [PR] physical-expr-common: remove `PhysicalExpr::snapshot` method [datafusion]
via GitHub
[I] Make From<ConfigFileDecryptionProperties>/ConfigFileEncryptionProperties conversions fallible (TryFrom) [datafusion]
via GitHub
Re: [I] Make `From<ConfigFileDecryptionProperties>/ConfigFileEncryptionProperties` conversions fallible (TryFrom) [datafusion]
via GitHub
Re: [I] Make `From<ConfigFileDecryptionProperties>/ConfigFileEncryptionProperties` conversions fallible (TryFrom) [datafusion]
via GitHub
Re: [I] Make `From<ConfigFileDecryptionProperties>/ConfigFileEncryptionProperties` conversions fallible (TryFrom) [datafusion]
via GitHub
[I] [EPIC] Merge EnforceDistribution + EnforceSorting into a single EnsureRequirements rule for correctness and idempotency [datafusion]
via GitHub
Re: [I] [EPIC] Merge EnforceDistribution + EnforceSorting into a single EnsureRequirements rule for correctness and idempotency [datafusion]
via GitHub
Re: [I] [EPIC] Merge EnforceDistribution + EnforceSorting into a single EnsureRequirements rule for correctness and idempotency [datafusion]
via GitHub
[PR] feat: prototype JVM scalar UDF bridge for Java-compatible RLike [datafusion-comet]
via GitHub
Re: [PR] feat: Accelerate RLike with 100% Spark compatibility [datafusion-comet]
via GitHub
Re: [I] Comet DPP exchange/broadcast reuse fails under AQE [datafusion-comet]
via GitHub
Re: [I] [EPIC] Improve awslabs published results for Comet w/ TPC-DS [datafusion-comet]
via GitHub
[I] [TUI] Render Executor's system & process metrics [datafusion-ballista]
via GitHub
Re: [I] [TUI] Render Executor's system & process metrics [datafusion-ballista]
via GitHub
Re: [I] [TUI] Render Executor's system & process metrics [datafusion-ballista]
via GitHub
[I] Make session setup an extension point for the CLI [datafusion]
via GitHub
Re: [PR] perf: early termination for right semi/anti hash joins [datafusion]
via GitHub
Re: [PR] perf: early termination for right semi/anti hash joins [datafusion]
via GitHub
Re: [PR] perf: early termination for right semi/anti hash joins [datafusion]
via GitHub
[PR] feat(bench): show TPC-H query timings in seconds and add total time [datafusion-ballista]
via GitHub
Re: [PR] feat(bench): show TPC-H query timings in seconds and add total time [datafusion-ballista]
via GitHub
Re: [PR] feat(bench): show TPC-H query timings in seconds and add total time [datafusion-ballista]
via GitHub
Re: [PR] feat(bench): show TPC-H query timings in seconds and add total time [datafusion-ballista]
via GitHub
[PR] docs: refresh Gluten comparison with ANSI, Spark 4, and Iceberg coverage [datafusion-comet]
via GitHub
[PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [PR] perf: optimize scalar aggregate joins as semi joins [datafusion]
via GitHub
Re: [I] Conversion from FileDecryptionProperties to ConfigFileDecryptionProperties should be fallible [datafusion]
via GitHub
[PR] chore(deps): bump cc from 1.2.60 to 1.2.61 in /native in the all-other-cargo-deps group [datafusion-comet]
via GitHub
Re: [I] Implement group join [datafusion]
via GitHub
Re: [PR] fix: add parentheses in binary expr human_display to reflect precedence [datafusion]
via GitHub
[PR] Add wide-schema benchmark suite for measuring per-file metadata overhead [datafusion]
via GitHub
Re: [PR] Add wide-schema benchmark suite for measuring per-file metadata overhead [datafusion]
via GitHub
[PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
Re: [PR] feat(aqe): Support sort-based shuffle writer in AQE [datafusion-ballista]
via GitHub
[PR] fix(spark): align parse_url empty FILE path [datafusion]
via GitHub
Re: [PR] fix(spark): align parse_url empty FILE path [datafusion]
via GitHub
Re: [I] Add tests for spill file sizes [datafusion]
via GitHub
Re: [I] Add tests for spill file sizes [datafusion]
via GitHub
[PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
Re: [PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
Re: [PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
Re: [PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
Re: [PR] fix: honor strictFloatingPoint in RangePartitioning [datafusion-comet]
via GitHub
[PR] chore(deps): bump ctor from 0.11.1 to 0.12.0 [datafusion-ballista]
via GitHub
Re: [PR] chore(deps): bump ctor from 0.11.1 to 0.12.0 [datafusion-ballista]
via GitHub
[PR] feat(sort-shuffle): accept Option<Partitioning> [datafusion-ballista]
via GitHub
[I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
Re: [I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
Re: [I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
Re: [I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
Re: [I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
Re: [I] Add benchmarks for queries against wide schema parquet files [datafusion]
via GitHub
[PR] feat(executor): default memory pool to concurrent_tasks * 1 GiB [datafusion-ballista]
via GitHub
[PR] fix(sort-shuffle): bound writer memory with per-task spill threshold [datafusion-ballista]
via GitHub
Re: [PR] fix(sort-shuffle): bound writer memory with per-task spill threshold [datafusion-ballista]
via GitHub
Re: [PR] fix(sort-shuffle): bound writer memory with per-task spill threshold [datafusion-ballista]
via GitHub
Re: [PR] fix(sort-shuffle): bound writer memory with per-task spill threshold [datafusion-ballista]
via GitHub
Re: [PR] fix(sort-shuffle): bound writer memory with per-task spill threshold [datafusion-ballista]
via GitHub
Re: [I] plan_to_sql drops window expressions for Window(Aggregate) plans without Projection [datafusion]
via GitHub
Re: [PR] feat: Native Delta Lake scan via delta-kernel-rs [datafusion-comet]
via GitHub
Re: [PR] feat: Native Delta Lake scan via delta-kernel-rs [datafusion-comet]
via GitHub
[I] EXPLAIN ANALYZE row counts and elapsed_compute inflated by partition count [datafusion-ballista]
via GitHub
Re: [I] EXPLAIN ANALYZE row counts and elapsed_compute inflated by partition count [datafusion-ballista]
via GitHub
Re: [I] EXPLAIN ANALYZE row counts and elapsed_compute inflated by partition count [datafusion-ballista]
via GitHub
Re: [I] EXPLAIN ANALYZE row counts and elapsed_compute inflated by partition count [datafusion-ballista]
via GitHub
[PR] feat: add Spark-compatible xxhash64 function [datafusion]
via GitHub
Re: [PR] feat: add Spark-compatible xxhash64 function [datafusion]
via GitHub
Re: [PR] feat: add Spark-compatible xxhash64 function [datafusion]
via GitHub
[PR] feat(shuffle_bench): measure end-to-end write + read times [datafusion-ballista]
via GitHub
Re: [PR] feat(shuffle_bench): measure end-to-end write + read times [datafusion-ballista]
via GitHub
Re: [PR] feat: add PySpark validation script for datafusion-spark .slt tests [datafusion]
via GitHub
Re: [PR] feat: add PySpark validation script for datafusion-spark .slt tests [datafusion]
via GitHub
[PR] feat: format-agnostic serde hook for PhysicalExpr (Column + NotExpr) [datafusion]
via GitHub
[PR] fix: error on CREATE EXTERNAL TABLE with no files and no explicit schema [datafusion]
via GitHub
Re: [PR] fix: error on CREATE EXTERNAL TABLE with no files and no explicit schema [datafusion]
via GitHub
Re: [PR] fix: error on CREATE EXTERNAL TABLE with no files and no explicit schema [datafusion]
via GitHub
Re: [PR] Add support for lambda column capture [datafusion]
via GitHub
[PR] feat(sort-shuffle): disable sort-based shuffle by default [datafusion-ballista]
via GitHub
Re: [PR] feat(sort-shuffle): disable sort-based shuffle by default [datafusion-ballista]
via GitHub
Re: [PR] feat(sort-shuffle): disable sort-based shuffle by default [datafusion-ballista]
via GitHub
[I] tracking tpc-h sf=100 benchmark results [datafusion-ballista]
via GitHub
Re: [I] tracking tpc-h sf=100 benchmark results [datafusion-ballista]
via GitHub
Re: [I] tracking tpc-h sf=100 benchmark results [datafusion-ballista]
via GitHub
Re: [I] tracking tpc-h sf=100 benchmark results [datafusion-ballista]
via GitHub
Re: [I] tracking tpc-h sf=100 benchmark results [datafusion-ballista]
via GitHub
Re: [I] Replace true_count() and false_count() with has_true() and has_false() [datafusion]
via GitHub
[PR] docs: add benchmarking guide for contributors [datafusion-ballista]
via GitHub
[I] Update benchmark results in README [datafusion-ballista]
via GitHub
Re: [PR] proto: serialize and dedupe dynamic filters [datafusion]
via GitHub
Re: [PR] proto: serialize and dedupe dynamic filters [datafusion]
via GitHub
[PR] feat: add config to gate converting Spark shuffle to Comet shuffle when child is non-Comet plan [datafusion-comet]
via GitHub
Re: [PR] feat: add config to gate converting Spark shuffle to Comet shuffle when child is non-Comet plan [datafusion-comet]
via GitHub
Re: [PR] feat: add config to gate converting Spark shuffle to Comet shuffle when child is non-Comet plan [datafusion-comet]
via GitHub
[I] investigate slowdown in sort-based shuffle [datafusion-ballista]
via GitHub
Re: [I] investigate slowdown in sort-based shuffle [datafusion-ballista]
via GitHub
Re: [I] investigate slowdown in sort-based shuffle [datafusion-ballista]
via GitHub
Re: [I] investigate slowdown in sort-based shuffle [datafusion-ballista]
via GitHub
[PR] Fix no-op Transformed flags [datafusion]
via GitHub
Re: [PR] Fix no-op Transformed flags [datafusion]
via GitHub
Re: [PR] Fix no-op Transformed flags [datafusion]
via GitHub
Re: [PR] Fix no-op Transformed flags [datafusion]
via GitHub
Re: [PR] chore: Fix no-op Transformed flags [datafusion]
via GitHub
Re: [PR] chore: Fix no-op Transformed flags [datafusion]
via GitHub
[PR] Update `spark_expressions_support.md` doc [datafusion-comet]
via GitHub
Re: [PR] chore: Update `spark_expressions_support.md` doc [datafusion-comet]
via GitHub
Re: [PR] chore: Update `spark_expressions_support.md` doc [datafusion-comet]
via GitHub
Re: [PR] chore: Update `spark_expressions_support.md` doc [datafusion-comet]
via GitHub
Re: [PR] chore: Update `spark_expressions_support.md` doc [datafusion-comet]
via GitHub
[PR] feat: improve shuffle size estimation [experimental!] [datafusion-comet]
via GitHub
[PR] fix: broadcast exchange bypasses AQE partition coalescing [WIP] [datafusion-comet]
via GitHub
[PR] fix(rest): remove unwrap and return 404 if executor does not exist [datafusion-ballista]
via GitHub
Re: [PR] fix(rest): remove unwrap and return 404 if executor does not exist [datafusion-ballista]
via GitHub
Re: [PR] fix(rest): remove unwrap and return 404 if executor does not exist [datafusion-ballista]
via GitHub
Re: [PR] fix(rest): remove unwrap and return 404 if executor does not exist [datafusion-ballista]
via GitHub
[PR] docs: explain Java vs Rust regexp engine differences in compatibility guide [datafusion-comet]
via GitHub
[PR] chore: use DataFusion `substring` [datafusion-comet]
via GitHub
[PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
Re: [PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
Re: [PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
Re: [PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
Re: [PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
Re: [PR] chore: fix `datafusion-spark` substring [datafusion]
via GitHub
[I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
Re: [I] feat: make it user configurable to display plans as tree renderer [datafusion-ballista]
via GitHub
[PR] fix: track join_arrays memory in reservation after SMJ spill [datafusion]
via GitHub
Re: [PR] fix: track join_arrays memory in reservation after SMJ spill [datafusion]
via GitHub
Re: [PR] fix: track join_arrays memory in reservation after SMJ spill [datafusion]
via GitHub
[I] Pending PRs badge showing failed PRs [datafusion-comet]
via GitHub
Re: [I] Pending PRs badge showing failed PRs [datafusion-comet]
via GitHub
Re: [I] Pending PRs badge showing failed PRs [datafusion-comet]
via GitHub
Re: [I] Pending PRs badge showing failed PRs [datafusion-comet]
via GitHub
Re: [I] Pending PRs badge showing failed PRs [datafusion-comet]
via GitHub
Re: [PR] chore: update pending PR filter [datafusion-comet]
via GitHub
[PR] Respect DATA_DIR location for sql benchmarks [datafusion]
via GitHub
Re: [PR] Respect DATA_DIR location for sql benchmarks [datafusion]
via GitHub
Re: [PR] Respect DATA_DIR location for sql benchmarks [datafusion]
via GitHub
Re: [PR] feat: comet native scan improvements - Dynamic Partition Pruning [datafusion-comet]
via GitHub
Re: [PR] feat: comet native scan improvements - Dynamic Partition Pruning [datafusion-comet]
via GitHub
[PR] chore: pin sqlparser to apache/datafusion-sqlparser-rs main (9833c033) [datafusion]
via GitHub
[PR] feat(sort-shuffle): count spill events and log each spill [datafusion-ballista]
via GitHub
Re: [PR] perf(sort-shuffle): fix performance regression caused by datafusion upgrade [datafusion-ballista]
via GitHub
Re: [PR] perf(sort-shuffle): fix performance regression caused by datafusion upgrade [datafusion-ballista]
via GitHub
Re: [PR] feat: capture per-query output from CometSqlFileTestSuite via system property [datafusion-comet]
via GitHub
Re: [I] [Feature] Support Spark expression: seconds_to_timestamp [datafusion-comet]
via GitHub
[I] Some `datafusion-spark` expressions are missing the physical implementation [datafusion]
via GitHub
Re: [I] Some `datafusion-spark` expressions are missing the physical implementation [datafusion]
via GitHub
[PR] feat: add base64 expression [datafusion-comet]
via GitHub
Re: [PR] Null aware hash mark joins [datafusion]
via GitHub
Re: [PR] Null aware hash mark joins [datafusion]
via GitHub
Re: [PR] Null aware hash mark joins [datafusion]
via GitHub
[PR] Fix panic in EscapeQuotedString and parse_flush, clean up a few unwraps [datafusion-sqlparser-rs]
via GitHub
[PR] Fix output_rows_skew sqllogictest flake [datafusion]
via GitHub
Re: [PR] Fix output_rows_skew sqllogictest flake [datafusion]
via GitHub
Re: [PR] Fix output_rows_skew sqllogictest flake [datafusion]
via GitHub
[I] Support array_agg as a sliding window aggregate by implementing retract_batch [datafusion]
via GitHub
[PR] docs: add implement-comet-expression Claude skill [datafusion-comet]
via GitHub
Re: [PR] docs: add implement-comet-expression Claude skill [datafusion-comet]
via GitHub