Messages by Thread
-
[PR] perf: Optimize `translate` to use new bulk-NULL string builders [datafusion]
via GitHub
-
[PR] feat: detect Iceberg V2 writes and emit fall-back reasons [datafusion-comet]
via GitHub
-
[I] Writes to Apache Iceberg Tables [datafusion-comet]
via GitHub
-
[PR] feat(datetime): prototype JVM UDF path for Hour/Minute/Second (engine=java) [datafusion-comet]
via GitHub
-
[I] Optimize `translate` using `append_with` [datafusion]
via GitHub
-
[PR] feat(udf): account JVM-UDF Arrow allocations to the Spark task [datafusion-comet]
via GitHub
-
[PR] Add support for logical and physical codecs [datafusion-python]
via GitHub
-
[PR] Make use of Swatinem/rust-cache to make the CI workflows faster [datafusion-ballista]
via GitHub
-
[PR] docs: show child links on Expression Compatibility page [datafusion-comet]
via GitHub
-
[I] AbstractMethodError: CometBroadcastExchangeExec missing sparkContext() from BroadcastExchangeLike [datafusion-comet]
via GitHub
-
[I] Create Comet versioning policy [datafusion-comet]
via GitHub
-
Re: [PR] Add configurable UNION DISTINCT to FILTER rewrite optimization [datafusion]
via GitHub
-
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
-
Re: [PR] test: add test that validate partial reduce with different number of state fields [datafusion]
via GitHub
-
[PR] Support DISTINCT ON with aggregation and windows [datafusion]
via GitHub
-
Re: [PR] feat: rest api supports plan tree rendering [datafusion-ballista]
via GitHub
-
[PR] [TUI] Add a config setting for rendering job stage's plan as a tree [datafusion-ballista]
via GitHub
-
Re: [D] DISCUSSION: Apache DataFusion New York Meetup May 2026 [datafusion]
via GitHub
-
[D] DataFusion-Federation: Union Flattening Across Executors [datafusion]
via GitHub
-
[PR] feat(dataframe): add executeStream(allocator) for incremental batch iteration [datafusion-java]
via GitHub
-
[PR] fix: REST API does not show running jobs [datafusion-ballista]
via GitHub
-
[I] feat(dataframe): add executeStream(allocator) for incremental batch iteration [datafusion-java]
via GitHub
-
[I] CREATE TABLE AS not checking column unicity [datafusion]
via GitHub
-
[PR] Refactor Spark `format_string` numeric `%c` conversion dispatch [datafusion]
via GitHub
-
[PR] fix: reduce memory allocation overhead during partial aggregation ear… [datafusion]
via GitHub
-
[I] Extra memory allocated during partial aggregation early emit during OOM handling [datafusion]
via GitHub
-
[I] Refactor: Centralize numeric `%c` formatting dispatch in format_string.rs [datafusion]
via GitHub
-
[PR] Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip I/O [datafusion-site]
via GitHub
-
[PR] feat(builder): expose ConfigOptions.set/get as setOption / setOptions / getOption [datafusion-java]
via GitHub
-
Re: [PR] Preserve recursive CTE nullability across logical and physical planning [datafusion]
via GitHub
-
[I] feat: expose ConfigOptions.set as a generic SessionContextBuilder.setOption(key, value) [datafusion-java]
via GitHub
-
Re: [PR] Split proto serialization to encapsulate private state (#21835) [datafusion]
via GitHub
-
[PR] chore(deps): bump pytest from 9.0.2 to 9.0.3 in /python [datafusion-ballista]
via GitHub
-
[PR] Fix extension type metadata propagation through casts [datafusion]
via GitHub
-
[PR] Optimize away unused `UNNEST` under duplicate-insensitive aggregates [datafusion]
via GitHub
-
[PR] build(deps): bump pyjwt from 2.10.1 to 2.12.0 [datafusion-python]
via GitHub
-
[PR] feat(parquet): two-stage access-plan hooks with shared async reader [datafusion]
via GitHub
-
[PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
-
Re: [PR] feat: support optional threshold parameter for levenshtein function [datafusion]
via GitHub
-
[I] KEDA scaler `pending_jobs` metric appears insufficient for scaling due to rapid task assignment by scheduler [datafusion-ballista]
via GitHub
-
[PR] feat: add Java scalar UDF support [datafusion-java]
via GitHub
-
Re: [PR] test: add SQL test coverage for spark.sql.legacy.timeParserPolicy [datafusion-comet]
via GitHub
-
[PR] build(deps): bump pygments from 2.19.1 to 2.20.0 [datafusion-python]
via GitHub
-
[PR] chore(deps): bump pyjwt from 2.10.1 to 2.12.0 in /python [datafusion-ballista]
via GitHub
-
[PR] build(deps): bump requests from 2.32.3 to 2.33.0 [datafusion-python]
via GitHub
-
Re: [I] [Spark 4.0] Add string collation support [datafusion-comet]
via GitHub
-
[I] feat(dataframe): expose withColumn and unnestColumns [datafusion-java]
via GitHub
-
[I] feat(dataframe): expose introspection methods (schema, explain, cache, describe) [datafusion-java]
via GitHub
-
[I] design: DataFrame joins (join, joinOn) and the Java Expr question [datafusion-java]
via GitHub
-
[I] feat(dataframe): expose set operations (union, intersect, except) [datafusion-java]
via GitHub
-
[I] feat(dataframe): expose sort and repartition [datafusion-java]
via GitHub
-
[I] native_datafusion: ParquetSchemaConvert error does not include the file path [datafusion-comet]
via GitHub
-
[I] feat: add DataFrame.writeCsv with CsvWriteOptions [datafusion-java]
via GitHub
-
[I] feat: expose Avro reader via registerAvro and readAvro [datafusion-java]
via GitHub
-
[I] bug: SessionContext.close() / DataFrame.close() race with concurrent JNI calls (use-after-free) [datafusion-java]
via GitHub
-
[I] feat: add DataFrame.writeJson with JsonWriteOptions [datafusion-java]
via GitHub
-
[I] feat: expose JSON reader via registerJson and readJson [datafusion-java]
via GitHub
-
[I] feat: expose Arrow IPC reader via registerArrow and readArrow [datafusion-java]
via GitHub
-
[PR] docs: remove project-status checklist [datafusion-java]
via GitHub
-
[PR] build(deps): bump urllib3 from 2.3.0 to 2.7.0 [datafusion-python]
via GitHub
-
Re: [PR] feat: Native Delta Lake scan via delta-kernel-rs [datafusion-comet]
via GitHub
-
[I] Publish fat JAR with platform-specific native libraries to Maven Central [datafusion-java]
via GitHub
-
[PR] build(deps): bump pynacl from 1.5.0 to 1.6.2 [datafusion-python]
via GitHub
-
[PR] build: add examples module on a multi-module Maven build [datafusion-java]
via GitHub
-
[PR] docs: publish Javadoc as part of the User Guide [datafusion-java]
via GitHub
-
[PR] build(deps): bump cryptography from 44.0.0 to 46.0.7 [datafusion-python]
via GitHub
-
Re: [I] Automate breaking change detection [datafusion]
via GitHub
-
[PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
-
[PR] perf: reuse mask in `truncate_list_nulls` and avoid counting all true bits [datafusion]
via GitHub
-
[PR] Expose `ExecutionPlan` statistics across the FFI boundary [datafusion]
via GitHub
-
Re: [PR] Add internal repartition metrics [datafusion]
via GitHub
-
[PR] chore(deps): bump urllib3 from 2.6.3 to 2.7.0 in /python [datafusion-ballista]
via GitHub
-
Re: [I] chore: Publish specific documentation for each supported Spark version [datafusion-comet]
via GitHub
-
[PR] feat: Support Spark Expression Encode [datafusion-comet]
via GitHub
-
[PR] feat(dataframe): add limit, distinct, dropColumns, withColumnRenamed [datafusion-java]
via GitHub
-
[PR] refactor(parquet): split opener.rs into module + add ParquetAccessPlanOptimizer trait [datafusion]
via GitHub
-
[I] Add internal EXPLAIN ANALYZE metric level [datafusion]
via GitHub
-
Re: [I] Improve integration tests to test push scheduler mode as well [datafusion-ballista]
via GitHub