This is an automated email from the ASF dual-hosted git repository.
blaginin pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-sandbox.git
from e5c4c9702 Merge branch 'main' into sandbox-main
add c2747eb10 feat: Support log for Decimal32 and Decimal64 (#18999)
add cbf33d1ee Fix regression for negative-scale decimal128 in log (#19315)
add d493f3d44 Add Decimal support to Ceil and Floor (#18979)
add 8550010bd Fix input handling for encoding functions & various
refactors (#18754)
add 75d2473ba Remove SchemaAdapter (#19345)
add 887aa9f8c fix: preserve ListFilesCache TTL when not set in config
(#19401)
add 91cfb6990 feat(proto): Add protobuf serialization for HashExpr (#19379)
add 2e3707e38 fix: projection for `CooperativeExec` and
`CoalesceBatchesExec` (#19400)
add b3d2cb622 Fix ORDER BY positional reference regression with aliased
aggregates (#19412)
add 5419ff590 feat: hash partitioning satisfies subset (#19304)
add 8cc8c11de Optimize multi-column grouping with StringView/ByteView
(option 2) - 25% faster (#19413)
add 6fa9c1ad1 Optimize hashing for StringView and ByteView (15-70% faster)
(#19374)
add eb30c19b3 Implement disk spilling for all grouping ordering modes in
GroupedHashAggregateStream (#19287)
add 32e6fe887 feat: update FFI TableProvider and ExecutionPlan to use FFI
Session and TaskContext (#19281)
add d8e68a404 refactor: add ParquetOpenerBuilder to reduce test code
duplication (#19405)
add 4249e4ecd bench: add `range_and_generate_series` (#19428)
add 65a6bc423 chore: use extend instead of manual loop in multi group by
(#19429)
add 2c3566ce8 doc: add example for cache factory (#19139)
add 1acaf7a9a chore(deps): bump taiki-e/install-action from 2.64.0 to
2.64.2 (#19399)
add 9fe9ec744 fix: spark crc32 custom nullability (#19271)
add 9a9e4dd5c Add recursive protection on planner's `create_physical_expr`
(#19299)
add d9d55cfc6 chore(deps): bump aws-config from 1.8.11 to 1.8.12 (#19453)
add edc693f99 chore(deps): bump log from 0.4.28 to 0.4.29 (#19452)
add c7f9fdf90 chore(deps): bump taiki-e/install-action from 2.64.2 to
2.65.1 (#19451)
add 8e95627d3 chore(deps): bump sphinx-reredirects from 1.0.0 to 1.1.0 in
/docs (#19455)
add 5fedb8423 chore(deps): bump insta from 1.44.3 to 1.45.0 (#19454)
add a886b9eb4 added support for negative scale for log decimal32/64 and
power (#19409)
add 1e591640d Remove core dependency from ffi (#19422)
add bb9a4a7ea bench: increase in_list benchmark coverage (#19443)
add 48f5d0b72 fix: Fix skip aggregate test to cover regression (#19461)
add d0d93117b fix: [19450]Added flush for tokio file(substrait) write
(#19456)
add 258e18cf8 Use SortMergeJoinExec name consistently in physical plan
outputs (#19246)
add d844f8687 Add:arrow_metadata() UDF (#19435)
add 4a1f69f9b Update date_bin to support Time32 and Time64 data types
(#19341)
add 72f174616 feat: Add decimal support for round (#19384)
add e6faacbf5 Fix panic during spill to disk in clickbench query (#19421)
add 677c543ae Optimize memory footprint of view arrays from
`ScalarValue::to_array_of_size` (#19441)
add 33ac70dd6 minor: refactoring of some `ScalarValue` code (#19439)
add 0bd880931 fix: csv schema_infer_max_records set to 0 return null
datatype (#19432)
add 902d3b32b fix: Add custom nullability for Spark LIKE function (#19218)
add 67b526a62 Refactor Spark crc32 & sha1 to remove unnecessary scalar
argument check (#19466)
add 47ddd5035 Add link to arrow-rs ticket in comments (#19479)
add a405d3fe4 Support nested field access in `get_field` with multiple
path arguments (#19389)
add d2830b6be chore(deps): bump taiki-e/install-action from 2.65.1 to
2.65.2 (#19474)
add 6ce237492 Improve plan_to_sql handling of empty projections with
dialect-specific SELECT list support (#19221)
add ef2c1a30e examples: replace sql_dialect with custom_sql_parser example
(#19383)
add 03904e1b2 Replace custom merge operator with arrow-rs implementation
(#19424)
add ea2e22c74 Implement nested recursive CTEs (#18956)
add e586ff532 fix: implement custom nullability for spark abs function
(#19395)
add 058bcb001 fix: custom nullability for format_string (#19173) (#19190)
add 62740802f Update `to_unixtime` udf function to support a consistent
set of argument types (#19442)
add ed7af0b12 Add: PI upper/lower bound f16 constants to ScalarValue
(#19497)
add 853273157 chore: enforce clippy::allow_attributes for datafusion-ffi
crate (#19480)
add ae35177df Add CI check to ensure examples are documented in README
(#19371)
add e5ca510dc perf: Improve performance of `to_hex` (> 2x) (#19503)
add d20c5d68f fix : snapshot to the modern multiline format (#19517)
add 5b90ceef1 perf: improve performance of string repeat (#19502)
add 134be4ce5 chore(deps): bump taiki-e/install-action from 2.65.2 to
2.65.3 (#19499)
add d825e5f39 docs : clarify unused test utility (#19508)
add bb4e0eca2 perf: Optimize `starts_with` and `ends_with` for scalar
arguments (#19516)
add 85c696df4 Date / time / interval arithmetic improvements (#19460)
add 8246631bf fix: Implement `reset_state` for `LazyMemoryExec` (#19362)
add 6ac7b898e Preserve ORDER BY in Unparser for projection -> order by
pattern (#19483)
add 9eddf473f fix: CteWorkTable: properly apply TableProvider::scan
projection argument (#18993)
add 496028454 fix: Median() integer overflow (#19509)
add 10db6b371 Redesign the try_reverse_output to support more cases
(#19446)
add a95c7fc2c feat: fix matching for named parameters with non-lowercase
signatures (#19378)
add 83ed19235 refactor: Spark `ascii` signature away from `user_defined`
(#19513)
add 36df145e9 feat: Add per-expression evaluation timing metrics to
ProjectionExec (#19447)
add 3aa0ab78b Fix: SparkAscii nullability to depend on input nullability
(#19531)
add 1dbf9a6df chore(deps): bump tracing from 0.1.41 to 0.1.43 (#19543)
add c37db42ae chore(deps): bump substrait from 0.62.0 to 0.62.2 (#19542)
add d7e5190f7 chore(deps): bump taiki-e/install-action from 2.65.3 to
2.65.6 (#19541)
add bd10f2744 minor: run all examples by default (#19506)
add 94709dc02 perf: improve performance of string replace (#19530)
add 43567b468 perf: improve performance of levenshtein by reusing cache
buffer (#19532)
add 3f0b3425c feat: Improve sort memory resilience (#19494)
add 7c50448f5 perf: improve performance of translate by reusing buffers
(#19533)
add 673d7c93a Refactor TopKHashTable to use HashTable API (#19464)
add f9cdfea7f docs: Improve config tables' readability (#19522)
add 8ac500bf0 Revert Spark Elt nullability change (#19510)
add 8469aa1dc minor: implement more arms for `get_data_types()` for
`NativeType` (#19449)
add 1d2b38959 perf: Optimize `contains` for scalar search arg (#19529)
add 13f38435a Introduce `TypeSignatureClass::Any` (#19485)
add a6fd5cc84 Upgrade hashbrown to 0.16 (#19554)
add a51e3a079 minor : add crypto function benchmark (#19539)
add 3420a2d4a chore(deps): bump taiki-e/install-action from 2.65.6 to
2.65.8 (#19559)
add 34addca6b bugfix: preserve schema metadata for record batch in FFI
(#19293)
add d13d89129 feat: Add DELETE/UPDATE hooks to TableProvider trait and to
MemTable implementation (#19142)
add 1704d1e74 refactor: extract the data generate out of aggregate_topk
benchmark (#19523)
add 9690f958e perf: improve performance of lpad/rpad by reusing buffers
(#19558)
add 56a2be17d perf: optimize regexp_count to avoid String allocation when
start position is provided (#19553)
add 4e45c19d1 Enables DefaultListFilesCache by default (#19366)
add f1e5c94f3 Compute Dynamic Filters only when a consumer supports them
(#19546)
add 1ce4b51a4 Various refactors to string functions (#19402)
add 27de50d05 fix: Reverse row selection should respect the row group
index (#19557)
add 79f67b8ef feat: implement partition_statistics for WindowAggExec
(#18534)
add b818f9341 perf: Improve performance of `md5` (#19568)
add fd263216c feat: integrate batch coalescer with async fn exec (#19342)
add 8959b3d11 feat: output statistics for constant columns in projections
(#19419)
add db7b8cc4d Implement `partition_statistics` API for
`NestedLoopJoinExec` (#19468)
add cd12d5103 Replace deprecated structopt with clap in
datafusion-benchmarks (#19492)
add 818706ab7 feat: `to_time` function (#19540)
add 0db668bc9 Refactor duplicate code in `type_coercion/functions.rs`
(#19518)
add 90f5bfe30 feat: Implement Spark functions hour, minute, second (#19512)
add bc753c201 chore(deps): bump taiki-e/install-action from 2.65.8 to
2.65.10 (#19578)
add 195d3d64b perf: optimize strpos by eliminating double iteration for
UTF-8 (#19572)
add 9a9ff8d61 perf: Improve performance of hex encoding in spark
functions (#19586)
add 56fec71c7 Add left function benchmark (#19600)
add 987b94ca4 chore: Add TPCDS benchmark comparison for PR (#19552)
add 132006924 Fix typo in contributor guide architecture section (#19613)
add a29569859 chore(deps): bump taiki-e/install-action from 2.65.10 to
2.65.11 (#19601)
add 715962c80 perf: optimize factorial function performance (#19575)
add 8809dae28 perf: Improve performance of ltrim, rtrim, btrim (#19551)
add 2ac032b40 fix: emit empty RecordBatch for empty file writes (#19370)
add 70daf8825 feat: plan-time SQL expression simplifying (#19311)
add 09455f181 chore: bump testcontainers-modules to 0.14 and remove
testcontainers dep (#19620)
add 7fde30a8a fix: handle invalid byte ranges in calculate_range for
single-line files (#19607)
add 955fd41d8 docs: fix typos in PartitionEvaluator trait documentation
(#19631)
add 7e049749e feat: Implement Spark function `space` (#19610)
add e0b4e8d82 feat: Implement `partition_statistics` API for
`SortMergeJoinExec` (#19567)
add 45d4948b3 Validate parquet writer version (#19515)
add 418f62ae3 fix: NULL handling in arrow_intersect and arrow_union
(#19415)
add 52bbc8afc chore(deps): bump insta from 1.45.0 to 1.46.0 (#19643)
add 47df535d2 chore(deps): bump taiki-e/install-action from 2.65.11 to
2.65.13 (#19646)
add c8620129f chore(deps): bump tracing from 0.1.43 to 0.1.44 (#19644)
add fd7924163 chore(deps): bump syn from 2.0.111 to 2.0.113 (#19645)
add ada0923a3 Respect execution timezone in to_timestamp and related
functions (#19078)
add 9b2505ce6 fix(doc): close #19393, make upgrading guide match v51 api
(#19648)
add 2d5625389 fix(spark): Use wrapping addition/subtraction in
`SparkDateAdd` and `SparkDateSub` (#19377)
add ff38480f2 Refactor `percentile_cont` to clarify support input types
(#19611)
add aee5cd9f3 fix(functions): Make translate function postgres compatible
(#19630)
add adf00a649 Add a protection to release candidate branch 52 (#19660)
add 5c2ee3650 perf: optimize `HashTableLookupExpr::evaluate` (#19602)
add c3e1c3644 Downgrade aws-smithy-runtime, update `rust_decimal`, ignore
RUSTSEC-2026-0001 to get clean CI (#19657)
add 1037f0aa2 feat: add list_files_cache table function for
`datafusion-cli` (#19388)
add 924037ea0 perf: Improve performance of `split_part` (#19570)
add a2f02f069 fix: Return Int for Date - Date instead of duration (#19563)
add 7942e751c Update dependencies (#19667)
add ed01b67f2 Refactor PartitionedFile: add ordering field and
new_from_meta constructor (#19596)
add e8196f462 Remove coalesce batches rule and deprecate
CoalesceBatchesExec (#19622)
add 166ef8112 Perf: Optimize `substring_index` via single-byte fast path
and direct indexing (#19590)
add 1f654bbe6 feat: implement metrics for AsyncFuncExec (#19626)
add ce08307a4 refactor: Use `Signature::coercible` for isnan/iszero
(#19604)
add 680ddcc6c feat: split BatchPartitioner::try_new into hash and
round-robin constructors (#19668)
add 566bcde9e Parquet: Push down supported list predicates
(array_has/any/all) during decoding (#19545)
add 3a0ca4ef7 Remove dependency on `rust_decimal`, remove ignore of
`RUSTSEC-2026-0001` (#19666)
add 142f5972d Store example data directly inside the datafusion-examples
(#19141) (#19319)
add 35ff4ab0a Allow logical optimizer to be run without evaluating now() &
refactor SimplifyInfo (#19505)
add 646213ec7 feat: add Time type support to date_trunc function (#19640)
add 102caeb22 minor: More comments to `ParquetOpener::open()` (#19677)
add d18e670e7 feat: Allow log with non-integer base on decimals (#19372)
add 1d5d63c41 Feat: Allow pow with negative & non-integer exponent on
decimals (#19369)
add e6049de5a Make default ListingFilesCache table scoped (#19616)
add 5194fd5eb chore(deps): bump taiki-e/install-action from 2.65.13 to
2.65.15 (#19676)
add 0cf45cae9 Refactor cache APIs to support ordering information (#19597)
add b9a3b9f94 Record sort order when writing Parquet with WITH ORDER
(#19595)
add a55b77e7d fix: DynamicFilterPhysicalExpr violates Hash/Eq contract
(#19659)
add 62658cd62 implement var distinct (#19706)
add c98fa5616 perfect hash join (#19411)
add b7091c0d2 Optimize `Nullstate` / accumulators (#19625)
add 07e63edfa Fix TopK aggregation for UTF-8/Utf8View group keys and add
safe fallback for unsupported string aggregates (#19285)
add 8ba46466d docs: Fix two small issues in introduction.md (#19712)
add 209a0a2e8 fix: unnest struct field with an alias failed with internal
error (#19698)
add 20870da20 infer parquet file order from metadata and use it to
optimize scans (#19433)
add 5c2b1236b feat(spark): implement array_repeat function (#19702)
add 821d410fc feat(spark): Implement collect_list/collect_set aggregate
functions (#19699)
add 3087ca8a9 perf: optimize `NthValue` when `ignore_nulls` is true
(#19496)
add afc912106 Optimize `concat/concat_ws` scalar path by pre-allocating
memory (#19547)
add 45fb0b4b9 fix(accumulators): preserve state in evaluate() for window
frame queries (#19618)
add 458b49109 perf: optimize left function by eliminating double chars()
iteration (#19571)
add 4e0161d99 fix: Don't treat quoted column names as placeholder
variables in SQL (#19339)
add 013efb4fe docs: Refine Communication documentation to highlight
Discord (#19714)
add 41a0b85af Add support for additional numeric types in to_timestamp
functions (#19663)
add 9fa7500bb Fix internal error "Physical input schema should be the same
as the one converted from logical input schema." (#18412)
add 0c5c97b22 fix(functions-aggregate): drain CORR state vectors for
streaming aggregation (#19669)
add 30c6ff198 chore: bump dependabot PR limit for cargo from 5 to 15
(#19730)
add 84ea07029 chore(deps): bump maturin from 1.10.2 to 1.11.5 in /docs
(#19740)
add 2067324f1 chore(deps): bump taiki-e/install-action from 2.65.15 to
2.66.1 (#19741)
add c03606547 chore(deps): bump sqllogictest from 0.28.4 to 0.29.0 (#19744)
add 4ca82cfdb chore(deps): bump blake3 from 1.8.2 to 1.8.3 (#19746)
add 1ddc639fa chore(deps): bump libc from 0.2.179 to 0.2.180 (#19748)
add 1c3763811 chore(deps): bump async-compression from 0.4.36 to 0.4.37
(#19742)
add f9697c14e chore(deps): bump indexmap from 2.12.1 to 2.13.0 (#19747)
add d103d8886 chore: remove LZO Parquet compression (#19726)
add bd2f34805 Improve comment for predicate_cache_inner_records (#19762)
add 383e673ac Update 52.0.0 release version number and changelog (#19767)
add 2fb9fb3fb Update the upgrading.md (#19769)
add 278950a76 Fix dynamic filter is_used function (#19734)
add 7716cae50 chore: update copyright notice year (#19758)
add cb9ec127e slt: Add test for REE arrays in group by (#19763)
add d484c09ba perf: Optimize floor and ceil scalar performance (#19752)
add e8efd5920 chore(deps): Update sqlparser to 0.60 (#19672)
add 803cce881 feat: implement Spark size function for arrays and maps
(#19592)
add 36880d89f Fix run_tpcds data dir (#19771)
add ec974ee5f chore(deps): bump taiki-e/install-action from 2.66.1 to
2.66.2 (#19778)
add f60b68a28 Include .proto files in datafusion-proto distribution
(#19490)
add e076e59b2 Simplify `expr = L1 AND expr != L2` to `expr = L1` when `L1
!= L2` (#19731)
add 4e1bc79e0 fix: enhance CTE resolution with identifier normalization
(#19519)
add 4c67d0208 feat: Add null-aware anti join support (#19635)
add 6267feef8 chore(deps): bump flate2 from 1.1.5 to 1.1.8 (#19780)
add 617700d1b Upgrade DataFusion to arrow-rs/parquet 57.2.0 (#19355)
add c4011e61f Merge branch 'main' into sandbox-main
No new revisions were added by this update.
Summary of changes:
.github/dependabot.yml | 1 +
.github/workflows/audit.yml | 4 +-
.github/workflows/rust.yml | 21 +-
Cargo.lock | 1659 ++++++++------------
Cargo.toml | 103 +-
NOTICE.txt | 2 +-
benchmarks/Cargo.toml | 2 +-
benchmarks/README.md | 19 +-
benchmarks/bench.sh | 6 +-
benchmarks/compare.py | 20 +-
benchmarks/src/bin/dfbench.rs | 15 +-
benchmarks/src/bin/external_aggr.rs | 27 +-
benchmarks/src/bin/imdb.rs | 24 +-
benchmarks/src/bin/mem_profile.rs | 22 +-
benchmarks/src/cancellation.rs | 18 +-
benchmarks/src/clickbench.rs | 30 +-
benchmarks/src/h2o.rs | 26 +-
benchmarks/src/hj.rs | 422 +++--
benchmarks/src/imdb/convert.rs | 12 +-
benchmarks/src/imdb/run.rs | 22 +-
benchmarks/src/nlj.rs | 12 +-
benchmarks/src/smj.rs | 12 +-
benchmarks/src/sort_tpch.rs | 18 +-
benchmarks/src/tpcds/run.rs | 28 +-
benchmarks/src/tpch/run.rs | 28 +-
benchmarks/src/util/options.rs | 18 +-
ci/scripts/check_examples_docs.sh | 64 +
datafusion-cli/Cargo.toml | 3 +-
datafusion-cli/src/functions.rs | 185 ++-
datafusion-cli/src/main.rs | 113 +-
datafusion-cli/tests/cli_integration.rs | 8 +-
datafusion-examples/Cargo.toml | 15 +-
datafusion-examples/README.md | 21 +-
datafusion-examples/data/README.md | 25 +
.../data => datafusion-examples/data/csv}/cars.csv | 0
.../data/csv}/regex.csv | 0
.../examples/builtin_functions/function_factory.rs | 4 +-
.../examples/builtin_functions/main.rs | 2 +-
.../examples/builtin_functions/regexp.rs | 31 +-
.../examples/custom_data_source/csv_json_opener.rs | 32 +-
.../custom_data_source/csv_sql_streaming.rs | 19 +-
.../custom_data_source/default_column_values.rs | 7 +-
.../examples/custom_data_source/main.rs | 2 +-
datafusion-examples/examples/data_io/main.rs | 2 +-
.../examples/data_io/parquet_encrypted.rs | 32 +-
.../examples/data_io/parquet_exec_visitor.rs | 24 +-
.../examples/data_io/parquet_index.rs | 2 +-
.../examples/dataframe/cache_factory.rs | 229 +++
.../examples/dataframe/dataframe.rs | 84 +-
.../examples/dataframe/deserialize_to_struct.rs | 321 +++-
datafusion-examples/examples/dataframe/main.rs | 9 +-
.../examples/execution_monitoring/main.rs | 2 +-
.../examples/execution_monitoring/tracing.rs | 34 +-
.../examples/external_dependency/main.rs | 2 +-
.../ffi/ffi_example_table_provider/src/lib.rs | 7 +-
.../examples/ffi/ffi_module_interface/src/lib.rs | 4 +-
.../examples/ffi/ffi_module_loader/Cargo.toml | 1 +
.../examples/ffi/ffi_module_loader/src/main.rs | 11 +-
datafusion-examples/examples/flight/client.rs | 17 +-
datafusion-examples/examples/flight/main.rs | 2 +-
datafusion-examples/examples/flight/server.rs | 38 +-
datafusion-examples/examples/flight/sql_server.rs | 25 +-
datafusion-examples/examples/proto/main.rs | 2 +-
.../examples/query_planning/expr_api.rs | 13 +-
.../examples/query_planning/main.rs | 2 +-
.../examples/query_planning/parse_sql_expr.rs | 70 +-
.../examples/query_planning/plan_to_sql.rs | 80 +-
.../examples/query_planning/planner_api.rs | 23 +-
.../examples/query_planning/thread_pools.rs | 14 +-
.../examples/relation_planner/main.rs | 2 +-
.../examples/relation_planner/match_recognize.rs | 6 +-
.../examples/relation_planner/pivot_unpivot.rs | 10 +-
.../examples/relation_planner/table_sample.rs | 63 +-
.../examples/sql_ops/custom_sql_parser.rs | 420 +++++
datafusion-examples/examples/sql_ops/dialect.rs | 135 --
datafusion-examples/examples/sql_ops/main.rs | 14 +-
datafusion-examples/examples/sql_ops/query.rs | 64 +-
datafusion-examples/examples/udf/advanced_udaf.rs | 15 +-
datafusion-examples/examples/udf/advanced_udwf.rs | 49 +-
datafusion-examples/examples/udf/async_udf.rs | 3 +-
datafusion-examples/examples/udf/main.rs | 2 +-
datafusion-examples/examples/udf/simple_udtf.rs | 27 +-
.../mod.rs => datafusion-examples/src/lib.rs | 4 +-
datafusion-examples/src/utils/csv_to_parquet.rs | 245 +++
.../src/utils/datasets/cars.rs | 16 +-
datafusion-examples/src/utils/datasets/mod.rs | 139 ++
.../src/utils/datasets/regex.rs | 22 +-
.../src/utils}/mod.rs | 5 +-
datafusion/catalog-listing/src/config.rs | 86 +-
datafusion/catalog-listing/src/table.rs | 410 ++++-
datafusion/catalog/src/cte_worktable.rs | 38 +-
datafusion/catalog/src/memory/table.rs | 349 +++-
datafusion/catalog/src/table.rs | 25 +
datafusion/common/Cargo.toml | 2 +-
datafusion/common/src/config.rs | 87 +-
datafusion/common/src/dfschema.rs | 6 +
.../common/src/file_options/parquet_writer.rs | 34 +-
datafusion/common/src/hash_utils.rs | 136 +-
datafusion/common/src/lib.rs | 4 +-
datafusion/common/src/parquet_config.rs | 108 ++
datafusion/common/src/scalar/consts.rs | 12 +
datafusion/common/src/scalar/mod.rs | 238 ++-
datafusion/common/src/utils/mod.rs | 36 +-
datafusion/core/Cargo.toml | 5 +
.../core/benches/range_and_generate_series.rs | 90 ++
datafusion/core/benches/topk_aggregate.rs | 211 ++-
datafusion/core/src/dataframe/mod.rs | 2 +
.../core/src/datasource/file_format/arrow.rs | 93 ++
datafusion/core/src/datasource/file_format/csv.rs | 90 ++
datafusion/core/src/datasource/file_format/json.rs | 42 +
datafusion/core/src/datasource/file_format/mod.rs | 12 +-
.../core/src/datasource/file_format/parquet.rs | 22 +
datafusion/core/src/datasource/listing/table.rs | 22 +-
.../core/src/datasource/listing_table_factory.rs | 7 +-
datafusion/core/src/datasource/mod.rs | 66 +-
.../core/src/datasource/physical_plan/mod.rs | 135 --
.../core/src/datasource/physical_plan/parquet.rs | 94 +-
datafusion/core/src/execution/context/mod.rs | 35 +-
datafusion/core/src/execution/session_state.rs | 57 +-
datafusion/core/src/physical_planner.rs | 167 +-
datafusion/core/src/test_util/parquet.rs | 15 +-
datafusion/core/tests/core_integration.rs | 3 -
.../tests/custom_sources_cases/dml_planning.rs | 297 ++++
datafusion/core/tests/custom_sources_cases/mod.rs | 1 +
datafusion/core/tests/dataframe/mod.rs | 2 +-
.../core/tests/datasource/object_store_access.rs | 76 +-
datafusion/core/tests/execution/coop.rs | 59 +-
datafusion/core/tests/expr_api/mod.rs | 5 +-
datafusion/core/tests/expr_api/simplification.rs | 156 +-
datafusion/core/tests/fuzz_cases/join_fuzz.rs | 1 +
datafusion/core/tests/fuzz_cases/window_fuzz.rs | 4 +-
datafusion/core/tests/parquet/custom_reader.rs | 10 +-
.../parquet/{schema_adapter.rs => expr_adapter.rs} | 152 +-
datafusion/core/tests/parquet/mod.rs | 3 +-
datafusion/core/tests/parquet/ordering.rs | 103 ++
datafusion/core/tests/parquet/page_pruning.rs | 9 +-
.../physical_optimizer/aggregate_statistics.rs | 86 +
.../physical_optimizer/enforce_distribution.rs | 59 +-
.../tests/physical_optimizer/enforce_sorting.rs | 57 +-
.../physical_optimizer/filter_pushdown/mod.rs | 409 +++--
.../physical_optimizer/filter_pushdown/util.rs | 28 +-
.../tests/physical_optimizer/join_selection.rs | 10 +
.../tests/physical_optimizer/limit_pushdown.rs | 84 +-
.../physical_optimizer/partition_statistics.rs | 162 +-
.../physical_optimizer/projection_pushdown.rs | 46 +
.../core/tests/physical_optimizer/pushdown_sort.rs | 434 ++++-
.../replace_with_order_preserving_variants.rs | 229 ++-
.../tests/physical_optimizer/sanity_checker.rs | 6 +-
.../core/tests/physical_optimizer/test_utils.rs | 216 ++-
datafusion/core/tests/schema_adapter/mod.rs | 18 -
.../schema_adapter_integration_tests.rs | 752 ---------
datafusion/core/tests/sql/explain_analyze.rs | 1 -
datafusion/core/tests/sql/mod.rs | 1 +
datafusion/core/tests/sql/unparser.rs | 462 ++++++
.../core/tests/user_defined/relation_planner.rs | 16 +-
.../user_defined_async_scalar_functions.rs | 40 +-
.../user_defined/user_defined_scalar_functions.rs | 19 +-
datafusion/datasource-arrow/NOTICE.txt | 2 +-
datafusion/datasource-arrow/src/source.rs | 118 +-
.../datasource-avro/src/avro_to_arrow/schema.rs | 4 +-
datafusion/datasource-avro/src/source.rs | 17 -
datafusion/datasource-csv/src/file_format.rs | 50 +-
datafusion/datasource-csv/src/source.rs | 17 -
datafusion/datasource-json/src/source.rs | 17 -
datafusion/datasource-parquet/Cargo.toml | 7 +
.../benches/parquet_nested_filter_pushdown.rs | 238 +++
datafusion/datasource-parquet/src/file_format.rs | 92 +-
datafusion/datasource-parquet/src/metadata.rs | 142 +-
datafusion/datasource-parquet/src/metrics.rs | 10 +-
datafusion/datasource-parquet/src/mod.rs | 1 +
datafusion/datasource-parquet/src/opener.rs | 617 +++++---
datafusion/datasource-parquet/src/row_filter.rs | 487 +++++-
.../datasource-parquet/src/row_group_filter.rs | 9 +-
datafusion/datasource-parquet/src/sort.rs | 865 ++++++++--
datafusion/datasource-parquet/src/source.rs | 88 +-
.../datasource-parquet/src/supported_predicates.rs | 144 ++
datafusion/datasource/Cargo.toml | 2 +-
datafusion/datasource/src/display.rs | 9 +-
datafusion/datasource/src/file.rs | 87 +-
datafusion/datasource/src/file_format.rs | 76 +
datafusion/datasource/src/file_scan_config.rs | 96 +-
datafusion/datasource/src/mod.rs | 77 +-
datafusion/datasource/src/schema_adapter.rs | 1065 ++-----------
datafusion/datasource/src/test_util.rs | 18 -
datafusion/datasource/src/url.rs | 84 +-
datafusion/datasource/src/write/demux.rs | 17 +
datafusion/execution/Cargo.toml | 1 +
datafusion/execution/src/cache/cache_manager.rs | 284 +++-
datafusion/execution/src/cache/cache_unit.rs | 454 ++++--
.../execution/src/cache/file_metadata_cache.rs | 429 ++---
datafusion/execution/src/cache/list_files_cache.rs | 917 ++++++-----
datafusion/execution/src/cache/mod.rs | 48 +-
datafusion/execution/src/lib.rs | 1 -
datafusion/expr-common/src/accumulator.rs | 23 +-
datafusion/expr-common/src/signature.rs | 74 +-
datafusion/expr-common/src/type_coercion/binary.rs | 121 +-
.../src/type_coercion/binary/tests/arithmetic.rs | 8 +-
.../src/type_coercion/binary/tests/comparison.rs | 58 +-
datafusion/expr/src/arguments.rs | 433 ++++-
datafusion/expr/src/execution_props.rs | 16 +-
datafusion/expr/src/expr.rs | 3 +-
datafusion/expr/src/expr_schema.rs | 55 +-
datafusion/expr/src/function.rs | 8 +-
datafusion/expr/src/lib.rs | 4 +
datafusion/expr/src/logical_plan/builder.rs | 23 +
datafusion/expr/src/logical_plan/dml.rs | 8 +-
datafusion/expr/src/logical_plan/plan.rs | 24 +
datafusion/expr/src/logical_plan/tree_node.rs | 4 +
datafusion/expr/src/partition_evaluator.rs | 4 +-
datafusion/expr/src/planner.rs | 4 +-
datafusion/expr/src/simplify.rs | 120 +-
datafusion/expr/src/type_coercion/functions.rs | 304 ++--
datafusion/expr/src/udaf.rs | 2 +-
datafusion/expr/src/udf.rs | 14 +-
datafusion/expr/src/udwf.rs | 2 +-
datafusion/ffi/Cargo.toml | 15 +-
datafusion/ffi/src/arrow_wrappers.rs | 10 +-
datafusion/ffi/src/catalog_provider.rs | 95 +-
datafusion/ffi/src/catalog_provider_list.rs | 81 +-
datafusion/ffi/src/execution/task_ctx.rs | 1 -
datafusion/ffi/src/execution/task_ctx_provider.rs | 1 -
datafusion/ffi/src/execution_plan.rs | 98 +-
datafusion/ffi/src/expr/columnar_value.rs | 1 -
datafusion/ffi/src/expr/distribution.rs | 11 +-
datafusion/ffi/src/expr/expr_properties.rs | 3 -
datafusion/ffi/src/expr/interval.rs | 4 +-
datafusion/ffi/src/insert_op.rs | 3 +-
datafusion/ffi/src/lib.rs | 1 +
datafusion/ffi/src/physical_expr/mod.rs | 69 +-
datafusion/ffi/src/physical_expr/partitioning.rs | 7 +-
datafusion/ffi/src/physical_expr/sort.rs | 7 +-
datafusion/ffi/src/plan_properties.rs | 143 +-
.../ffi/src/proto/logical_extension_codec.rs | 39 +-
.../ffi/src/proto/physical_extension_codec.rs | 11 +-
datafusion/ffi/src/record_batch_stream.rs | 85 +-
datafusion/ffi/src/schema_provider.rs | 103 +-
datafusion/ffi/src/session/config.rs | 1 -
datafusion/ffi/src/session/mod.rs | 10 +-
datafusion/ffi/src/table_provider.rs | 259 +--
datafusion/ffi/src/table_source.rs | 7 +-
datafusion/ffi/src/tests/async_provider.rs | 64 +-
datafusion/ffi/src/tests/catalog.rs | 41 +-
datafusion/ffi/src/tests/mod.rs | 60 +-
datafusion/ffi/src/tests/sync_provider.rs | 11 +-
datafusion/ffi/src/tests/udf_udaf_udwf.rs | 70 +-
datafusion/ffi/src/tests/utils.rs | 8 +-
datafusion/ffi/src/udaf/accumulator.rs | 40 +-
datafusion/ffi/src/udaf/accumulator_args.rs | 122 +-
datafusion/ffi/src/udaf/groups_accumulator.rs | 47 +-
datafusion/ffi/src/udaf/mod.rs | 67 +-
datafusion/ffi/src/udf/mod.rs | 63 +-
datafusion/ffi/src/udf/return_type_args.rs | 16 +-
datafusion/ffi/src/udtf.rs | 158 +-
datafusion/ffi/src/udwf/mod.rs | 17 +-
datafusion/ffi/src/udwf/partition_evaluator.rs | 31 +-
.../ffi/src/udwf/partition_evaluator_args.rs | 108 +-
datafusion/ffi/src/udwf/range.rs | 1 -
datafusion/ffi/src/volatility.rs | 3 +-
datafusion/ffi/tests/ffi_catalog.rs | 14 +-
datafusion/ffi/tests/ffi_integration.rs | 11 +-
datafusion/ffi/tests/ffi_udaf.rs | 71 +-
datafusion/ffi/tests/ffi_udf.rs | 4 +-
datafusion/ffi/tests/ffi_udtf.rs | 8 +-
datafusion/ffi/tests/ffi_udwf.rs | 3 +-
datafusion/ffi/tests/utils/mod.rs | 43 +
.../src/aggregate/groups_accumulator/accumulate.rs | 214 ++-
.../src/aggregate/groups_accumulator/bool_op.rs | 2 +-
.../src/aggregate/groups_accumulator/prim_op.rs | 5 +-
datafusion/functions-aggregate/src/array_agg.rs | 4 +-
datafusion/functions-aggregate/src/average.rs | 21 +-
datafusion/functions-aggregate/src/correlation.rs | 62 +-
datafusion/functions-aggregate/src/count.rs | 4 +-
datafusion/functions-aggregate/src/median.rs | 22 +-
.../functions-aggregate/src/percentile_cont.rs | 505 +++---
datafusion/functions-aggregate/src/string_agg.rs | 13 +-
datafusion/functions-aggregate/src/variance.rs | 144 +-
datafusion/functions-nested/src/array_has.rs | 21 +-
datafusion/functions-nested/src/planner.rs | 3 +
datafusion/functions-nested/src/set_ops.rs | 19 +-
datafusion/functions-table/src/generate_series.rs | 47 +
datafusion/functions-window/Cargo.toml | 8 +
datafusion/functions-window/benches/nth_value.rs | 263 ++++
datafusion/functions-window/src/nth_value.rs | 182 ++-
datafusion/functions/Cargo.toml | 70 +-
datafusion/functions/benches/concat.rs | 100 +-
.../functions/benches/{concat.rs => concat_ws.rs} | 64 +-
datafusion/functions/benches/contains.rs | 185 +++
.../functions/benches/{upper.rs => crypto.rs} | 49 +-
datafusion/functions/benches/ends_with.rs | 185 +++
.../functions/benches/{uuid.rs => factorial.rs} | 35 +-
datafusion/functions/benches/floor_ceil.rs | 135 ++
datafusion/functions/benches/left.rs | 111 ++
datafusion/functions/benches/levenshtein.rs | 87 +
datafusion/functions/benches/pad.rs | 314 +++-
datafusion/functions/benches/regexp_count.rs | 118 ++
.../functions/benches/{repeat.rs => replace.rs} | 108 +-
datafusion/functions/benches/split_part.rs | 382 +++++
datafusion/functions/benches/starts_with.rs | 185 +++
datafusion/functions/benches/substr_index.rs | 124 +-
datafusion/functions/benches/to_hex.rs | 120 +-
datafusion/functions/benches/to_timestamp.rs | 24 +-
datafusion/functions/benches/translate.rs | 90 ++
datafusion/functions/benches/{ltrim.rs => trim.rs} | 200 ++-
datafusion/functions/src/core/arrow_cast.rs | 17 +-
datafusion/functions/src/core/arrow_metadata.rs | 160 ++
datafusion/functions/src/core/coalesce.rs | 4 +-
datafusion/functions/src/core/getfield.rs | 645 +++++---
datafusion/functions/src/core/mod.rs | 14 +
datafusion/functions/src/core/nvl.rs | 4 +-
datafusion/functions/src/core/nvl2.rs | 4 +-
datafusion/functions/src/core/union_extract.rs | 5 +-
datafusion/functions/src/crypto/basic.rs | 34 +-
datafusion/functions/src/datetime/common.rs | 153 +-
datafusion/functions/src/datetime/current_date.rs | 23 +-
datafusion/functions/src/datetime/current_time.rs | 65 +-
datafusion/functions/src/datetime/date_bin.rs | 314 +++-
datafusion/functions/src/datetime/date_trunc.rs | 284 +++-
datafusion/functions/src/datetime/mod.rs | 55 +-
datafusion/functions/src/datetime/now.rs | 18 +-
datafusion/functions/src/datetime/to_date.rs | 6 +-
datafusion/functions/src/datetime/to_time.rs | 252 +++
datafusion/functions/src/datetime/to_timestamp.rs | 1007 +++++++++---
datafusion/functions/src/datetime/to_unixtime.rs | 38 +-
datafusion/functions/src/encoding/inner.rs | 745 ++++-----
datafusion/functions/src/macros.rs | 29 +
datafusion/functions/src/math/ceil.rs | 206 +++
datafusion/functions/src/math/decimal.rs | 111 ++
datafusion/functions/src/math/factorial.rs | 78 +-
datafusion/functions/src/math/floor.rs | 206 +++
datafusion/functions/src/math/iszero.rs | 32 +-
datafusion/functions/src/math/log.rs | 230 ++-
datafusion/functions/src/math/mod.rs | 21 +-
datafusion/functions/src/math/monotonicity.rs | 48 -
datafusion/functions/src/math/nans.rs | 32 +-
datafusion/functions/src/math/nanvl.rs | 27 +-
datafusion/functions/src/math/power.rs | 368 ++++-
datafusion/functions/src/math/round.rs | 367 +++--
datafusion/functions/src/regex/regexpcount.rs | 12 +-
datafusion/functions/src/regex/regexplike.rs | 4 +-
datafusion/functions/src/string/btrim.rs | 2 +-
datafusion/functions/src/string/common.rs | 234 ++-
datafusion/functions/src/string/concat.rs | 11 +-
datafusion/functions/src/string/concat_ws.rs | 35 +-
datafusion/functions/src/string/contains.rs | 89 +-
datafusion/functions/src/string/ends_with.rs | 294 +++-
datafusion/functions/src/string/levenshtein.rs | 24 +-
datafusion/functions/src/string/ltrim.rs | 10 +-
datafusion/functions/src/string/repeat.rs | 33 +-
datafusion/functions/src/string/replace.rs | 85 +-
datafusion/functions/src/string/rtrim.rs | 10 +-
datafusion/functions/src/string/split_part.rs | 62 +-
datafusion/functions/src/string/starts_with.rs | 272 +++-
datafusion/functions/src/string/to_hex.rs | 217 ++-
datafusion/functions/src/string/uuid.rs | 2 +-
datafusion/functions/src/unicode/left.rs | 15 +-
datafusion/functions/src/unicode/lpad.rs | 33 +-
datafusion/functions/src/unicode/rpad.rs | 38 +-
datafusion/functions/src/unicode/strpos.rs | 39 +-
datafusion/functions/src/unicode/substrindex.rs | 70 +-
datafusion/functions/src/unicode/translate.rs | 69 +-
datafusion/functions/src/utils.rs | 153 +-
datafusion/macros/Cargo.toml | 2 +-
datafusion/optimizer/src/analyzer/type_coercion.rs | 185 ++-
datafusion/optimizer/src/decorrelate.rs | 10 +-
.../src/decorrelate_predicate_subquery.rs | 72 +-
datafusion/optimizer/src/eliminate_cross_join.rs | 3 +
datafusion/optimizer/src/eliminate_outer_join.rs | 1 +
.../optimizer/src/extract_equijoin_predicate.rs | 4 +
datafusion/optimizer/src/optimizer.rs | 28 +-
.../src/simplify_expressions/expr_simplifier.rs | 213 +--
.../optimizer/src/simplify_expressions/mod.rs | 3 +-
.../src/simplify_expressions/simplify_exprs.rs | 13 +-
.../src/simplify_expressions/simplify_literal.rs | 148 ++
.../src/simplify_expressions/unwrap_cast.rs | 22 +-
.../optimizer/src/simplify_expressions/utils.rs | 48 +
.../physical-expr-adapter/src/schema_rewriter.rs | 4 +-
datafusion/physical-expr-common/Cargo.toml | 3 +
datafusion/physical-expr-common/src/lib.rs | 1 +
.../src/metrics/baseline.rs | 4 +-
.../src/metrics/builder.rs | 2 +-
.../src/metrics/custom.rs | 2 +-
.../physical-expr-common/src/metrics/expression.rs | 88 ++
.../src/metrics/mod.rs | 4 +-
.../src/metrics/value.rs | 0
datafusion/physical-expr-common/src/sort_expr.rs | 21 +-
datafusion/physical-expr-common/src/utils.rs | 17 +-
datafusion/physical-expr/Cargo.toml | 4 +
datafusion/physical-expr/benches/in_list.rs | 77 +
datafusion/physical-expr/src/equivalence/class.rs | 8 +-
datafusion/physical-expr/src/expressions/binary.rs | 94 ++
datafusion/physical-expr/src/expressions/case.rs | 279 +---
.../src/expressions/dynamic_filters.rs | 189 ++-
.../physical-expr/src/expressions/in_list.rs | 302 +++-
datafusion/physical-expr/src/partitioning.rs | 586 ++++++-
datafusion/physical-expr/src/planner.rs | 33 +-
datafusion/physical-expr/src/projection.rs | 317 +++-
datafusion/physical-expr/src/scalar_function.rs | 8 +-
datafusion/physical-expr/src/utils/mod.rs | 2 +-
.../physical-optimizer/src/coalesce_batches.rs | 87 -
.../physical-optimizer/src/enforce_distribution.rs | 86 +-
.../physical-optimizer/src/join_selection.rs | 28 +-
datafusion/physical-optimizer/src/lib.rs | 1 -
datafusion/physical-optimizer/src/optimizer.rs | 4 -
.../physical-optimizer/src/sanity_checker.rs | 3 +-
.../physical-optimizer/src/topk_aggregation.rs | 21 +-
datafusion/physical-plan/Cargo.toml | 1 +
.../src/aggregates/group_values/mod.rs | 4 +-
.../group_values/multi_group_by/bytes_view.rs | 159 +-
.../aggregates/group_values/multi_group_by/mod.rs | 24 +-
.../src/aggregates/group_values/row.rs | 9 +-
.../group_values/single_group_by/boolean.rs | 3 +-
.../group_values/single_group_by/bytes.rs | 4 +-
.../group_values/single_group_by/bytes_view.rs | 4 +-
.../group_values/single_group_by/primitive.rs | 8 +-
datafusion/physical-plan/src/aggregates/mod.rs | 284 +++-
.../physical-plan/src/aggregates/row_hash.rs | 353 +++--
.../src/aggregates/topk/hash_table.rs | 416 +++--
.../physical-plan/src/aggregates/topk/heap.rs | 230 ++-
.../src/aggregates/topk/priority_map.rs | 129 +-
.../physical-plan/src/aggregates/topk_stream.rs | 14 +
datafusion/physical-plan/src/async_func.rs | 118 +-
datafusion/physical-plan/src/coalesce_batches.rs | 20 +
datafusion/physical-plan/src/coop.rs | 39 +-
datafusion/physical-plan/src/filter.rs | 5 +-
datafusion/physical-plan/src/joins/array_map.rs | 547 +++++++
datafusion/physical-plan/src/joins/chain.rs | 69 +
.../physical-plan/src/joins/hash_join/exec.rs | 1338 +++++++++++++---
.../physical-plan/src/joins/hash_join/mod.rs | 2 +-
.../src/joins/hash_join/partitioned_hash_eval.rs | 458 +++++-
.../src/joins/hash_join/shared_bounds.rs | 41 +-
.../physical-plan/src/joins/hash_join/stream.rs | 163 +-
.../physical-plan/src/joins/join_hash_map.rs | 117 +-
datafusion/physical-plan/src/joins/mod.rs | 29 +-
.../physical-plan/src/joins/nested_loop_join.rs | 33 +-
.../src/joins/sort_merge_join/exec.rs | 18 +-
.../src/joins/sort_merge_join/tests.rs | 74 +
.../physical-plan/src/joins/stream_join_utils.rs | 14 +-
datafusion/physical-plan/src/joins/test_utils.rs | 1 +
datafusion/physical-plan/src/joins/utils.rs | 30 +-
datafusion/physical-plan/src/memory.rs | 54 +
datafusion/physical-plan/src/metrics.rs | 6 +-
datafusion/physical-plan/src/projection.rs | 77 +-
datafusion/physical-plan/src/recursive_query.rs | 4 +-
datafusion/physical-plan/src/repartition/mod.rs | 128 +-
.../physical-plan/src/sorts/multi_level_merge.rs | 12 +-
datafusion/physical-plan/src/sorts/sort.rs | 465 +++++-
.../src/sorts/sort_preserving_merge.rs | 18 +-
.../physical-plan/src/spill/spill_manager.rs | 5 +-
datafusion/physical-plan/src/stream.rs | 187 ++-
datafusion/physical-plan/src/test.rs | 1 +
.../physical-plan/src/windows/window_agg_exec.rs | 38 +-
datafusion/physical-plan/src/work_table.rs | 103 +-
datafusion/proto-common/src/from_proto/mod.rs | 25 +-
datafusion/proto-common/src/to_proto/mod.rs | 2 +-
datafusion/proto/Cargo.toml | 3 -
datafusion/proto/proto/datafusion.proto | 12 +
datafusion/proto/src/generated/pbjson.rs | 225 +++
datafusion/proto/src/generated/prost.rs | 21 +-
datafusion/proto/src/logical_plan/file_formats.rs | 7 +-
datafusion/proto/src/physical_plan/from_proto.rs | 76 +-
datafusion/proto/src/physical_plan/mod.rs | 6 +
datafusion/proto/src/physical_plan/to_proto.rs | 16 +-
.../proto/tests/cases/roundtrip_logical_plan.rs | 15 +-
.../proto/tests/cases/roundtrip_physical_plan.rs | 53 +-
datafusion/spark/Cargo.toml | 5 +
datafusion/spark/benches/{char.rs => space.rs} | 25 +-
datafusion/spark/src/function/aggregate/collect.rs | 200 +++
datafusion/spark/src/function/aggregate/mod.rs | 19 +-
datafusion/spark/src/function/array/mod.rs | 9 +-
datafusion/spark/src/function/array/repeat.rs | 128 ++
datafusion/spark/src/function/collection/mod.rs | 13 +-
datafusion/spark/src/function/collection/size.rs | 162 ++
datafusion/spark/src/function/conditional/if.rs | 2 +-
datafusion/spark/src/function/datetime/date_add.rs | 25 +-
datafusion/spark/src/function/datetime/date_sub.rs | 25 +-
datafusion/spark/src/function/datetime/extract.rs | 268 ++++
datafusion/spark/src/function/datetime/mod.rs | 18 +
datafusion/spark/src/function/hash/crc32.rs | 43 +-
datafusion/spark/src/function/hash/sha1.rs | 27 +-
datafusion/spark/src/function/math/abs.rs | 78 +-
datafusion/spark/src/function/math/hex.rs | 27 +-
datafusion/spark/src/function/mod.rs | 1 +
datafusion/spark/src/function/null_utils.rs | 122 ++
datafusion/spark/src/function/string/ascii.rs | 77 +-
datafusion/spark/src/function/string/concat.rs | 110 +-
datafusion/spark/src/function/string/elt.rs | 67 +-
.../spark/src/function/string/format_string.rs | 63 +-
datafusion/spark/src/function/string/like.rs | 92 +-
datafusion/spark/src/function/string/mod.rs | 4 +
datafusion/spark/src/function/string/space.rs | 232 +++
datafusion/sql/src/expr/function.rs | 37 +-
datafusion/sql/src/expr/identifier.rs | 4 +-
datafusion/sql/src/expr/value.rs | 130 +-
datafusion/sql/src/planner.rs | 2 +-
datafusion/sql/src/query.rs | 1 +
datafusion/sql/src/relation/mod.rs | 8 +-
datafusion/sql/src/resolve.rs | 162 +-
datafusion/sql/src/select.rs | 8 +-
datafusion/sql/src/statement.rs | 192 ++-
datafusion/sql/src/unparser/ast.rs | 46 +-
datafusion/sql/src/unparser/dialect.rs | 45 +-
datafusion/sql/src/unparser/expr.rs | 44 +-
datafusion/sql/src/unparser/plan.rs | 36 +-
datafusion/sql/src/utils.rs | 25 +-
datafusion/sql/src/values.rs | 8 +-
datafusion/sql/tests/cases/plan_to_sql.rs | 80 +-
datafusion/sql/tests/common/mod.rs | 12 +-
datafusion/sql/tests/sql_integration.rs | 42 +-
datafusion/sqllogictest/Cargo.toml | 7 +-
datafusion/sqllogictest/bin/postgres_container.rs | 6 +-
datafusion/sqllogictest/src/engines/conversion.rs | 6 +-
.../src/engines/postgres_engine/mod.rs | 131 +-
.../src/engines/postgres_engine/types.rs | 45 -
datafusion/sqllogictest/src/test_context.rs | 165 +-
datafusion/sqllogictest/test_files/aggregate.slt | 501 +++++-
.../test_files/aggregate_skip_partial.slt | 17 +-
.../sqllogictest/test_files/aggregates_topk.slt | 88 ++
datafusion/sqllogictest/test_files/array.slt | 32 +
.../sqllogictest/test_files/arrow_typeof.slt | 5 +-
datafusion/sqllogictest/test_files/async_udf.slt | 12 +-
datafusion/sqllogictest/test_files/case.slt | 5 +-
datafusion/sqllogictest/test_files/cte.slt | 69 +-
.../test_files/cte_quoted_reference.slt | 70 +
.../test_files/datetime/arith_date_date.slt | 15 +
.../test_files/datetime/arith_date_integer.slt | 89 ++
.../test_files/datetime/arith_date_interval.slt | 37 +
.../test_files/datetime/arith_date_time.slt | 116 ++
.../test_files/datetime/arith_interval_double.slt | 41 +
.../datetime/arith_interval_interval.slt | 27 +
.../test_files/datetime/arith_negate_interval.slt | 13 +
.../test_files/datetime/arith_time_interval.slt | 70 +
.../test_files/datetime/arith_time_time.slt | 47 +
.../datetime/arith_timestamp_duration.slt | 147 ++
.../datetime/arith_timestamp_interval.slt | 36 +
.../datetime/arith_timestamp_timestamp.slt | 13 +
.../{ => datetime}/current_date_timezone.slt | 0
.../{ => datetime}/current_time_timezone.slt | 0
.../test_files/{expr => datetime}/date_part.slt | 0
.../test_files/{ => datetime}/dates.slt | 14 +-
.../test_files/{ => datetime}/interval.slt | 0
.../test_files/{ => datetime}/interval_mysql.slt | 0
.../test_files/{ => datetime}/timestamps.slt | 1245 ++++++++++++++-
datafusion/sqllogictest/test_files/decimal.slt | 173 +-
datafusion/sqllogictest/test_files/delete.slt | 16 +-
datafusion/sqllogictest/test_files/dml_delete.slt | 202 +++
datafusion/sqllogictest/test_files/dml_update.slt | 286 ++++
.../test_files/dynamic_filter_pushdown_config.slt | 326 ----
datafusion/sqllogictest/test_files/encoding.slt | 143 +-
datafusion/sqllogictest/test_files/errors.slt | 4 +-
datafusion/sqllogictest/test_files/explain.slt | 4 -
.../sqllogictest/test_files/explain_analyze.slt | 41 +
.../sqllogictest/test_files/information_schema.slt | 63 +-
datafusion/sqllogictest/test_files/join.slt.part | 8 +-
datafusion/sqllogictest/test_files/joins.slt | 13 +-
datafusion/sqllogictest/test_files/math.slt | 2 +-
datafusion/sqllogictest/test_files/metadata.slt | 54 +-
.../sqllogictest/test_files/named_arguments.slt | 3 +-
.../test_files/null_aware_anti_join.slt | 453 ++++++
datafusion/sqllogictest/test_files/order.slt | 11 +
datafusion/sqllogictest/test_files/parquet.slt | 4 +
.../test_files/parquet_filter_pushdown.slt | 111 ++
.../sqllogictest/test_files/repartition_scan.slt | 4 +
.../test_files/repartition_subset_satisfaction.slt | 526 +++++++
.../sqllogictest/test_files/run_end_encoded.slt | 57 +
datafusion/sqllogictest/test_files/scalar.slt | 130 +-
.../sqllogictest/test_files/schema_evolution.slt | 144 ++
.../sqllogictest/test_files/set_variable.slt | 18 +
.../sqllogictest/test_files/simplify_expr.slt | 18 +
.../sqllogictest/test_files/sort_merge_join.slt | 2 +-
.../sqllogictest/test_files/sort_pushdown.slt | 886 +++++++++++
.../test_files/spark/aggregate/collect.slt | 93 ++
.../test_files/spark/array/array_repeat.slt | 77 +-
.../test_files/spark/collection/size.slt | 132 ++
.../test_files/spark/datetime/date_add.slt | 12 +-
.../test_files/spark/datetime/hour.slt | 23 +-
.../test_files/spark/datetime/minute.slt | 23 +-
.../test_files/spark/datetime/second.slt | 23 +-
.../sqllogictest/test_files/spark/hash/crc32.slt | 6 +-
.../spark/string/{char.slt => space.slt} | 23 +-
.../test_files/string/string_query.slt.part | 10 +-
datafusion/sqllogictest/test_files/struct.slt | 210 ++-
.../sqllogictest/test_files/table_functions.slt | 24 +
.../test_files/to_timestamp_timezone.slt | 204 +++
.../test_files/tpch/plans/q16.slt.part | 11 +-
.../test_files/tpch/plans/q18.slt.part | 34 +-
.../sqllogictest/test_files/tpch/plans/q3.slt.part | 30 +-
datafusion/sqllogictest/test_files/unnest.slt | 30 +-
datafusion/sqllogictest/test_files/update.slt | 20 +-
datafusion/substrait/src/physical_plan/consumer.rs | 12 +-
datafusion/substrait/src/serializer.rs | 1 +
datafusion/wasmtest/src/lib.rs | 11 +-
dev/changelog/52.0.0.md | 745 +++++++++
docs/requirements.txt | 4 +-
docs/source/_static/theme_overrides.css | 18 +
docs/source/contributor-guide/architecture.md | 2 +-
docs/source/contributor-guide/communication.md | 68 +-
docs/source/index.rst | 3 +-
docs/source/library-user-guide/upgrading.md | 348 ++--
docs/source/user-guide/cli/functions.md | 50 +
docs/source/user-guide/configs.md | 253 +--
docs/source/user-guide/introduction.md | 5 +-
docs/source/user-guide/sql/format_options.md | 102 +-
docs/source/user-guide/sql/scalar_functions.md | 245 ++-
603 files changed, 41161 insertions(+), 14363 deletions(-)
create mode 100755 ci/scripts/check_examples_docs.sh
create mode 100644 datafusion-examples/data/README.md
copy {datafusion/core/tests/data => datafusion-examples/data/csv}/cars.csv (100%)
copy {datafusion/physical-expr/tests/data => datafusion-examples/data/csv}/regex.csv (100%)
create mode 100644 datafusion-examples/examples/dataframe/cache_factory.rs
create mode 100644 datafusion-examples/examples/sql_ops/custom_sql_parser.rs
delete mode 100644 datafusion-examples/examples/sql_ops/dialect.rs
copy datafusion/physical-expr/src/statistics/mod.rs => datafusion-examples/src/lib.rs (91%)
create mode 100644 datafusion-examples/src/utils/csv_to_parquet.rs
copy datafusion/spark/src/function/collection/mod.rs => datafusion-examples/src/utils/datasets/cars.rs (66%)
create mode 100644 datafusion-examples/src/utils/datasets/mod.rs
copy datafusion/spark/src/function/conditional/mod.rs => datafusion-examples/src/utils/datasets/regex.rs (67%)
copy {datafusion/physical-expr/src/statistics => datafusion-examples/src/utils}/mod.rs (90%)
create mode 100644 datafusion/common/src/parquet_config.rs
create mode 100644 datafusion/core/benches/range_and_generate_series.rs
create mode 100644 datafusion/core/tests/custom_sources_cases/dml_planning.rs
rename datafusion/core/tests/parquet/{schema_adapter.rs => expr_adapter.rs} (70%)
create mode 100644 datafusion/core/tests/parquet/ordering.rs
delete mode 100644 datafusion/core/tests/schema_adapter/mod.rs
delete mode 100644 datafusion/core/tests/schema_adapter/schema_adapter_integration_tests.rs
create mode 100644 datafusion/core/tests/sql/unparser.rs
create mode 100644 datafusion/datasource-parquet/benches/parquet_nested_filter_pushdown.rs
create mode 100644 datafusion/datasource-parquet/src/supported_predicates.rs
create mode 100644 datafusion/ffi/tests/utils/mod.rs
create mode 100644 datafusion/functions-window/benches/nth_value.rs
copy datafusion/functions/benches/{concat.rs => concat_ws.rs} (56%)
create mode 100644 datafusion/functions/benches/contains.rs
copy datafusion/functions/benches/{upper.rs => crypto.rs} (55%)
create mode 100644 datafusion/functions/benches/ends_with.rs
copy datafusion/functions/benches/{uuid.rs => factorial.rs} (52%)
create mode 100644 datafusion/functions/benches/floor_ceil.rs
create mode 100644 datafusion/functions/benches/left.rs
create mode 100644 datafusion/functions/benches/levenshtein.rs
create mode 100644 datafusion/functions/benches/regexp_count.rs
copy datafusion/functions/benches/{repeat.rs => replace.rs} (56%)
create mode 100644 datafusion/functions/benches/split_part.rs
create mode 100644 datafusion/functions/benches/starts_with.rs
create mode 100644 datafusion/functions/benches/translate.rs
rename datafusion/functions/benches/{ltrim.rs => trim.rs} (56%)
create mode 100644 datafusion/functions/src/core/arrow_metadata.rs
create mode 100644 datafusion/functions/src/datetime/to_time.rs
create mode 100644 datafusion/functions/src/math/ceil.rs
create mode 100644 datafusion/functions/src/math/decimal.rs
create mode 100644 datafusion/functions/src/math/floor.rs
create mode 100644 datafusion/optimizer/src/simplify_expressions/simplify_literal.rs
rename datafusion/{execution => physical-expr-common}/src/metrics/baseline.rs (98%)
rename datafusion/{execution => physical-expr-common}/src/metrics/builder.rs (99%)
rename datafusion/{execution => physical-expr-common}/src/metrics/custom.rs (98%)
create mode 100644 datafusion/physical-expr-common/src/metrics/expression.rs
rename datafusion/{execution => physical-expr-common}/src/metrics/mod.rs (99%)
rename datafusion/{execution => physical-expr-common}/src/metrics/value.rs (100%)
delete mode 100644 datafusion/physical-optimizer/src/coalesce_batches.rs
create mode 100644 datafusion/physical-plan/src/joins/array_map.rs
create mode 100644 datafusion/physical-plan/src/joins/chain.rs
copy datafusion/spark/benches/{char.rs => space.rs} (83%)
create mode 100644 datafusion/spark/src/function/aggregate/collect.rs
create mode 100644 datafusion/spark/src/function/array/repeat.rs
create mode 100644 datafusion/spark/src/function/collection/size.rs
create mode 100644 datafusion/spark/src/function/datetime/extract.rs
create mode 100644 datafusion/spark/src/function/null_utils.rs
create mode 100644 datafusion/spark/src/function/string/space.rs
delete mode 100644 datafusion/sqllogictest/src/engines/postgres_engine/types.rs
create mode 100644 datafusion/sqllogictest/test_files/cte_quoted_reference.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_date_date.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_date_integer.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_date_interval.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_date_time.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_interval_double.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_interval_interval.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_negate_interval.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_time_interval.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_time_time.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_timestamp_duration.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_timestamp_interval.slt
create mode 100644 datafusion/sqllogictest/test_files/datetime/arith_timestamp_timestamp.slt
rename datafusion/sqllogictest/test_files/{ => datetime}/current_date_timezone.slt (100%)
rename datafusion/sqllogictest/test_files/{ => datetime}/current_time_timezone.slt (100%)
rename datafusion/sqllogictest/test_files/{expr => datetime}/date_part.slt (100%)
rename datafusion/sqllogictest/test_files/{ => datetime}/dates.slt (98%)
rename datafusion/sqllogictest/test_files/{ => datetime}/interval.slt (100%)
rename datafusion/sqllogictest/test_files/{ => datetime}/interval_mysql.slt (100%)
rename datafusion/sqllogictest/test_files/{ => datetime}/timestamps.slt (82%)
create mode 100644 datafusion/sqllogictest/test_files/dml_delete.slt
create mode 100644 datafusion/sqllogictest/test_files/dml_update.slt
create mode 100644 datafusion/sqllogictest/test_files/null_aware_anti_join.slt
create mode 100644 datafusion/sqllogictest/test_files/repartition_subset_satisfaction.slt
create mode 100644 datafusion/sqllogictest/test_files/run_end_encoded.slt
create mode 100644 datafusion/sqllogictest/test_files/sort_pushdown.slt
create mode 100644 datafusion/sqllogictest/test_files/spark/aggregate/collect.slt
create mode 100644 datafusion/sqllogictest/test_files/spark/collection/size.slt
copy datafusion/sqllogictest/test_files/spark/string/{char.slt => space.slt} (80%)
create mode 100644 datafusion/sqllogictest/test_files/to_timestamp_timezone.slt
create mode 100644 dev/changelog/52.0.0.md