This is an automated email from the ASF dual-hosted git repository.

houqp pushed a change to branch arrow2
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git.


    from c0c9c72  Officially maintained Arrow2 branch (#1556)
     add 54da006  Stop merging avro schemas as it doesn't support list of lists
     add 97415ca  use the latest arrow2 with Chunk
     add b42ebe7  Clarify docs about `Accumulator::update` and 
`Accumulator::update_batch` (#1542)
     add b05feda  Mark ARRAY_AGG(DISTINCT ...) not implemented (#1534)
     add 06d147a  Add batch operations to stddev (#1547)
     add e1e7b86  Address clippy warnings (#1553)
     add 14176ff  Update to arrow-7.0.0 (#1523)
     add 794b92b  Remove unused `update` and `merge` implementations from 
Aggregates and supporting `ScalarValue` arithmetic (#1550)
     add cf76969  Make call SchedulerServer::new once in ballista-scheduler 
process (#1537)
     add b4c77e5  Add covar operators (#1551)
     add d7e465a  Initial MemoryManager and DiskManager APIs  for query 
execution + External Sort implementation (#1526)
     add 811bb51  Update to rust 1.58 (#1557)
     add 0bddfb7  support cast/try_cast for decimal: signed numeric to decimal 
(#1442)
     add 1c39f5c  support comparison for decimal data type and refactor the 
binary coercion rule (#1483)
     add b743610  add correlation function (#1561)
     add 1dae7e2  Rename sql integration tests from `mod` to `sql_integration` 
(#1575)
     add bbfc2c0  update reference to python and update readme (#1581)
     add 278e859  minor: improve the benchmark readme (#1567)
     add 438b417  Tests for support try_cast/cast decimal to numeric (#1465)
     add 6f7b2d2  implement Hash for various types and replace PartialOrd 
(#1580)
     add f027e5f  add from_slice trait to ease arrow2 migration (#1588)
     add 92a3e45  Consolidate `batch_size` configuration in `ExecutionConfig`, 
`RuntimeConfig` and `PhysicalPlanConfig` (#1562)
     add 30df911  support from_slice for binary, string, and boolean array 
types (#1589)
     add 059e52b  update nightly version (#1597)
     add 82e8003  remove update and merge (#1582)
     add c549d51  support mathematics operation for decimal data type (#1554)
     add fefbfc8  add test for decimal to decimal (#1603)
     add 8ebc94c  fix: casting Int64 to Float64 unsuccessfully caused tpch8 to 
fail (#1601)
     add 444c153  Add support show tables and show columns for ballista (#1593)
     add ad392fd  Fix comparison of dictionary arrays (#1606)
     add 345f727  Replace Datafusion Error with Generic Error for Object store 
(#1541)
     add eb51fae  consolidate binary_expr coercion rule code into 
`binary_rule.rs` module (#1607)
     add a96bb5e  Implement ARRAY_AGG(DISTINCT ...) (#1579)
     add d93cf79  Add roadmap to readme (#1616)
     add 2f702e4  fix: sql planner creates cross join instead of inner join 
from select predicates (#1566)
     add e92225d  feat: Support complex interval via IntervalMonthDayNano 
(#1615)
     add 03075d5  Fix null comparison for Parquet pruning predicate (#1595)
     add 3c5a679  fix dependabot (#1625)
     add 7d819d1  Consolidate sort and external_sort (#1596)
     add 62edddb  Optimize `SortPreservingMergeStream` to avoid `SortKeyCursor` 
sharing (#1624)
     add cc8f325  Update pyo3 requirement from 0.14 to 0.15 (#1627)
     add 67a598c  Update etcd-client requirement from 0.7 to 0.8 (#1626)
     add 0762bf0  Update hashbrown requirement from 0.11 to 0.12 (#1631)
     add af8786e  support hash decimal array and group by (#1640)
     add 1c63759  Add spill_count and spilled_bytes to baseline metrics, test 
sort with spill of metrics (#1641)
     add 15af24a  Add `DataFusionError` -> `ArrowError` conversion (#1643)
     add 9c5ccae  update md-5, sha2, blake2 (#1647)
     add 71757bb  Introduce push-based task scheduling for Ballista (#1560)
     add 4a2453a  fix a cte block with same name for many times (#1639)
     add deaa8ac  Handle merging of evolved schemas in ParquetExec (#1622)
     add 01b5244  refine match pattern related code (#1650)
     add 6ec18bb  Consolidate Schema and RecordBatch projection (#1638)
     add 741df36  Remove DataFusionError::into_arrow_external_error (#1645)
     add c63cfd4  Move AggregatedMetricsSet to metrics for further reuse (#1663)
     add 97f95b3  Make `MemoryManager` and `MemoryStream` public (#1664)
     add 618c1e8  feat: Support Substring(str [from int] [for int]) (#1621)
     add 2a9df64  [Ballista] Fix scheduler state mod bug (#1655)
     add 992624a  Fix predicate pushdown for outer joins (#1618)
     add 271b6ba  feat: Support quarter granularity in date_trunct fn (#1667)
     add 6c8d642  Update to arrow 8.0.0 (#1673)
     add bf68073  [Ballista] Add Decimal128, Date64, TimestampSecond, 
TimestampMillisecond, Interv… (#1659)
     add ee91c68  upgrade clap to version 3 (#1672)
     add 7153fac  Improve configuration and resource use of `MemoryManager` and 
`DiskManager` (#1668)
     add bffa5e4  Use NamedTempFile rather than `String` in DiskManager (#1680)
     add 48ad975  Add VegaFusion as project that uses DataFusion (#1683)
     add d297540  Add a new metric type: `Gauge` + `CurrentMemoryUsage` to 
metrics (#1682)
     add fdbd608  enhance arithmetic operation for array with scalar (#1552)
     add bf71577  refactor array_agg to not to have `update` and `merge` (#1681)
     add 2266474  Fix bug while merging `RecordBatch`, add 
`SortPreservingMerge` fuzz tester (#1678)
     add 63d24bf  Make `SortPreservingMergeStream` stable on input stream order 
(#1687)
     add 18918fa  Merge branch 'master' into i_arrow2
     add a7ec38e  resolve up to last datafusion issue with SortColumn using a 
reference that cannot be returned
     add eea061f  Fix other crates and address check warnings
     add ab48bb2  clippy
     add 98f98d1  test fix #1, errors and debug strings

No new revisions were added by this update.

Summary of changes:
 .env                                               |    2 +-
 .github/dependabot.yml                             |    6 +-
 .github/workflows/rust.yml                         |    2 +-
 Cargo.toml                                         |    4 +-
 README.md                                          |   68 +-
 ballista-examples/Cargo.toml                       |    2 +-
 ballista/rust/client/Cargo.toml                    |    4 +-
 ballista/rust/client/src/columnar_batch.rs         |    7 +-
 ballista/rust/client/src/context.rs                |  206 ++-
 ballista/rust/core/Cargo.toml                      |    8 +-
 ballista/rust/core/proto/ballista.proto            |  128 +-
 ballista/rust/core/src/client.rs                   |   11 +-
 ballista/rust/core/src/config.rs                   |   82 +-
 ballista/rust/core/src/error.rs                    |    2 +-
 .../core/src/execution_plans/distributed_query.rs  |    4 +-
 .../core/src/execution_plans/shuffle_reader.rs     |   10 +-
 .../core/src/execution_plans/shuffle_writer.rs     |   95 +-
 .../core/src/execution_plans/unresolved_shuffle.rs |    7 +-
 ballista/rust/core/src/lib.rs                      |    1 -
 ballista/rust/core/src/memory_stream.rs            |   92 --
 .../rust/core/src/serde/logical_plan/from_proto.rs |  111 +-
 ballista/rust/core/src/serde/logical_plan/mod.rs   |   52 +-
 .../rust/core/src/serde/logical_plan/to_proto.rs   |   88 +-
 ballista/rust/core/src/serde/mod.rs                |   49 +-
 .../core/src/serde/physical_plan/from_proto.rs     |   19 +-
 ballista/rust/core/src/serde/physical_plan/mod.rs  |    3 +-
 .../rust/core/src/serde/physical_plan/to_proto.rs  |   12 +-
 ballista/rust/core/src/serde/scheduler/mod.rs      |  141 +++
 ballista/rust/core/src/utils.rs                    |   22 +-
 ballista/rust/executor/Cargo.toml                  |    5 +-
 ballista/rust/executor/executor_config_spec.toml   |   13 +
 ballista/rust/executor/src/collect.rs              |   11 +-
 ballista/rust/executor/src/execution_loop.rs       |   38 +-
 ballista/rust/executor/src/executor.rs             |   22 +-
 ballista/rust/executor/src/executor_server.rs      |  291 +++++
 ballista/rust/executor/src/flight_service.rs       |   12 +-
 ballista/rust/executor/src/lib.rs                  |   39 +
 ballista/rust/executor/src/main.rs                 |   65 +-
 ballista/rust/executor/src/standalone.rs           |    2 +
 ballista/rust/scheduler/Cargo.toml                 |    4 +-
 ballista/rust/scheduler/scheduler_config_spec.toml |    9 +-
 ballista/rust/scheduler/src/lib.rs                 |  457 ++++++-
 ballista/rust/scheduler/src/main.rs                |   48 +-
 ballista/rust/scheduler/src/planner.rs             |    2 +-
 ballista/rust/scheduler/src/standalone.rs          |    6 +-
 ballista/rust/scheduler/src/state/mod.rs           |  235 +++-
 ballista/rust/scheduler/src/test_utils.rs          |    1 +
 benchmarks/.gitignore                              |    2 +-
 benchmarks/Cargo.toml                              |    4 +-
 benchmarks/README.md                               |    2 +-
 benchmarks/src/bin/nyctaxi.rs                      |   13 +-
 benchmarks/src/bin/tpch.rs                         |   24 +-
 ci/docker/linux-apt-lint.dockerfile                |    2 +-
 datafusion-cli/Cargo.toml                          |    6 +-
 datafusion-cli/Dockerfile                          |    2 +-
 datafusion-cli/src/command.rs                      |    3 +-
 datafusion-cli/src/exec.rs                         |    3 +-
 datafusion-cli/src/functions.rs                    |   15 +-
 datafusion-cli/src/main.rs                         |   34 +-
 datafusion-cli/src/print_format.rs                 |   23 +-
 datafusion-cli/src/print_options.rs                |    2 +-
 datafusion-examples/Cargo.toml                     |    6 +-
 .../examples/dataframe_in_memory.rs                |    3 +-
 datafusion-examples/examples/flight_client.rs      |    8 +-
 datafusion-examples/examples/flight_server.rs      |   14 +-
 datafusion-examples/examples/simple_udaf.rs        |   64 +-
 datafusion-examples/examples/simple_udf.rs         |    6 +-
 datafusion/Cargo.toml                              |   17 +-
 datafusion/benches/data_utils/mod.rs               |    4 +-
 datafusion/benches/filter_query_sql.rs             |    3 +-
 datafusion/benches/math_query_sql.rs               |    3 +-
 datafusion/benches/physical_plan.rs                |   14 +-
 datafusion/benches/sort_limit_query_sql.rs         |   10 +-
 datafusion/src/arrow_print.rs                      |    5 +-
 datafusion/src/avro_to_arrow/arrow_array_reader.rs |   15 +-
 datafusion/src/avro_to_arrow/reader.rs             |    4 +-
 datafusion/src/avro_to_arrow/schema.rs             |    1 +
 datafusion/src/cast.rs                             |   52 +
 datafusion/src/catalog/catalog.rs                  |   12 +
 datafusion/src/catalog/information_schema.rs       |    3 +-
 datafusion/src/catalog/mod.rs                      |    1 +
 datafusion/src/catalog/schema.rs                   |    7 +
 datafusion/src/dataframe.rs                        |    2 +-
 datafusion/src/datasource/datasource.rs            |    1 -
 datafusion/src/datasource/empty.rs                 |   15 +-
 datafusion/src/datasource/file_format/avro.rs      |   61 +-
 datafusion/src/datasource/file_format/csv.rs       |   49 +-
 datafusion/src/datasource/file_format/json.rs      |   30 +-
 datafusion/src/datasource/file_format/mod.rs       |    4 +-
 datafusion/src/datasource/file_format/parquet.rs   |  117 +-
 datafusion/src/datasource/listing/helpers.rs       |    3 +-
 datafusion/src/datasource/listing/table.rs         |   26 +-
 datafusion/src/datasource/memory.rs                |   62 +-
 datafusion/src/datasource/mod.rs                   |    6 +-
 datafusion/src/datasource/object_store/local.rs    |    3 +-
 datafusion/src/datasource/object_store/mod.rs      |   10 +-
 datafusion/src/error.rs                            |   84 +-
 datafusion/src/execution/context.rs                |  232 +++-
 datafusion/src/execution/dataframe_impl.rs         |   14 +-
 datafusion/src/execution/disk_manager.rs           |  165 +++
 datafusion/src/execution/memory_manager.rs         |  601 +++++++++
 datafusion/src/execution/mod.rs                    |    6 +
 datafusion/src/execution/options.rs                |    6 +
 datafusion/src/execution/runtime_env.rs            |  138 ++
 datafusion/src/field_util.rs                       |  381 +++++-
 datafusion/src/lib.rs                              |   16 +-
 datafusion/src/logical_plan/builder.rs             |   10 +-
 datafusion/src/logical_plan/dfschema.rs            |   42 +-
 datafusion/src/logical_plan/display.rs             |    1 +
 datafusion/src/logical_plan/expr.rs                |   96 +-
 datafusion/src/logical_plan/extension.rs           |    2 +-
 datafusion/src/logical_plan/plan.rs                |    6 +-
 .../src/optimizer/common_subexpr_eliminate.rs      |    6 +
 datafusion/src/optimizer/eliminate_limit.rs        |    1 +
 datafusion/src/optimizer/filter_push_down.rs       |  465 ++++---
 datafusion/src/optimizer/limit_push_down.rs        |    1 +
 datafusion/src/optimizer/mod.rs                    |    1 +
 datafusion/src/optimizer/projection_push_down.rs   |    5 +-
 datafusion/src/optimizer/simplify_expressions.rs   |   45 +-
 .../src/optimizer/single_distinct_to_groupby.rs    |    1 +
 datafusion/src/optimizer/utils.rs                  |   78 +-
 .../src/physical_optimizer/aggregate_statistics.rs |   10 +-
 .../src/physical_optimizer/coalesce_batches.rs     |    3 +-
 .../physical_optimizer/hash_build_probe_order.rs   |    2 +
 datafusion/src/physical_optimizer/merge_exec.rs    |    1 +
 datafusion/src/physical_optimizer/pruning.rs       |   77 +-
 datafusion/src/physical_optimizer/repartition.rs   |   10 +-
 datafusion/src/physical_plan/aggregates.rs         |  273 +++-
 datafusion/src/physical_plan/analyze.rs            |   16 +-
 datafusion/src/physical_plan/coalesce_batches.rs   |   46 +-
 .../src/physical_plan/coalesce_partitions.rs       |   24 +-
 .../physical_plan/coercion_rule/aggregate_rule.rs  |   30 +-
 .../src/physical_plan/coercion_rule/binary_rule.rs |  625 +++++++++
 datafusion/src/physical_plan/coercion_rule/mod.rs  |    2 +
 datafusion/src/physical_plan/common.rs             |   96 +-
 datafusion/src/physical_plan/cross_join.rs         |   17 +-
 datafusion/src/physical_plan/crypto_expressions.rs |   10 +-
 .../src/physical_plan/datetime_expressions.rs      |   61 +-
 datafusion/src/physical_plan/empty.rs              |   21 +-
 datafusion/src/physical_plan/explain.rs            |   17 +-
 .../physical_plan/expressions/approx_distinct.rs   |   17 -
 .../src/physical_plan/expressions/array_agg.rs     |   64 +-
 .../src/physical_plan/expressions/average.rs       |   26 +-
 datafusion/src/physical_plan/expressions/binary.rs | 1326 +++++++++++++++++---
 datafusion/src/physical_plan/expressions/case.rs   |   52 +-
 datafusion/src/physical_plan/expressions/cast.rs   |  399 +++++-
 .../src/physical_plan/expressions/coercion.rs      |  247 ----
 datafusion/src/physical_plan/expressions/column.rs |    7 +-
 .../src/physical_plan/expressions/correlation.rs   |  544 ++++++++
 datafusion/src/physical_plan/expressions/count.rs  |   21 +-
 .../src/physical_plan/expressions/covariance.rs    |  725 +++++++++++
 .../src/physical_plan/expressions/cume_dist.rs     |    3 +-
 .../{ => expressions}/distinct_expressions.rs      |  320 ++++-
 .../physical_plan/expressions/get_indexed_field.rs |    8 +-
 .../src/physical_plan/expressions/in_list.rs       |   10 +-
 .../src/physical_plan/expressions/is_not_null.rs   |    9 +-
 .../src/physical_plan/expressions/is_null.rs       |    8 +-
 .../src/physical_plan/expressions/lead_lag.rs      |    9 +-
 .../src/physical_plan/expressions/literal.rs       |    7 +-
 .../src/physical_plan/expressions/min_max.rs       |   23 +-
 datafusion/src/physical_plan/expressions/mod.rs    |   65 +-
 .../src/physical_plan/expressions/negative.rs      |    7 +-
 datafusion/src/physical_plan/expressions/not.rs    |    3 +-
 .../src/physical_plan/expressions/nth_value.rs     |    5 +-
 datafusion/src/physical_plan/expressions/nullif.rs |    4 +-
 datafusion/src/physical_plan/expressions/rank.rs   |   25 +-
 .../src/physical_plan/expressions/row_number.rs    |    5 +-
 datafusion/src/physical_plan/expressions/stddev.rs |  113 +-
 datafusion/src/physical_plan/expressions/sum.rs    |   14 +-
 .../src/physical_plan/expressions/try_cast.rs      |  320 ++++-
 .../src/physical_plan/expressions/variance.rs      |  240 ++--
 datafusion/src/physical_plan/file_format/avro.rs   |   46 +-
 datafusion/src/physical_plan/file_format/csv.rs    |   53 +-
 .../src/physical_plan/file_format/file_stream.rs   |    2 +-
 datafusion/src/physical_plan/file_format/json.rs   |   36 +-
 datafusion/src/physical_plan/file_format/mod.rs    |   17 +-
 .../src/physical_plan/file_format/parquet.rs       |  490 +++++++-
 datafusion/src/physical_plan/filter.rs             |   26 +-
 datafusion/src/physical_plan/functions.rs          |   17 +-
 datafusion/src/physical_plan/hash_aggregate.rs     |   86 +-
 datafusion/src/physical_plan/hash_join.rs          |   77 +-
 datafusion/src/physical_plan/hash_utils.rs         |   61 +-
 datafusion/src/physical_plan/join_utils.rs         |    1 +
 datafusion/src/physical_plan/limit.rs              |   27 +-
 datafusion/src/physical_plan/memory.rs             |   61 +-
 datafusion/src/physical_plan/metrics/aggregated.rs |  155 +++
 datafusion/src/physical_plan/metrics/baseline.rs   |   39 +-
 datafusion/src/physical_plan/metrics/builder.rs    |   49 +-
 datafusion/src/physical_plan/metrics/mod.rs        |   24 +-
 datafusion/src/physical_plan/metrics/value.rs      |  139 +-
 datafusion/src/physical_plan/mod.rs                |  130 +-
 datafusion/src/physical_plan/planner.rs            |   14 +-
 datafusion/src/physical_plan/projection.rs         |   34 +-
 datafusion/src/physical_plan/regex_expressions.rs  |    2 +-
 datafusion/src/physical_plan/repartition.rs        |   55 +-
 datafusion/src/physical_plan/sort.rs               |  565 ---------
 datafusion/src/physical_plan/sorts/mod.rs          |  312 +++++
 datafusion/src/physical_plan/sorts/sort.rs         |  907 +++++++++++++
 .../{ => sorts}/sort_preserving_merge.rs           |  721 ++++++-----
 datafusion/src/physical_plan/stream.rs             |    5 +-
 datafusion/src/physical_plan/type_coercion.rs      |    1 +
 datafusion/src/physical_plan/udaf.rs               |    2 +-
 datafusion/src/physical_plan/udf.rs                |    2 +-
 .../src/physical_plan/unicode_expressions.rs       |    8 +-
 datafusion/src/physical_plan/union.rs              |   31 +-
 datafusion/src/physical_plan/values.rs             |   12 +-
 datafusion/src/physical_plan/window_functions.rs   |    2 +-
 datafusion/src/physical_plan/windows/aggregate.rs  |    2 +-
 datafusion/src/physical_plan/windows/built_in.rs   |    2 +-
 datafusion/src/physical_plan/windows/mod.rs        |   15 +-
 .../src/physical_plan/windows/window_agg_exec.rs   |   19 +-
 datafusion/src/record_batch.rs                     |  432 +++++++
 datafusion/src/scalar.rs                           |  639 +---------
 datafusion/src/sql/planner.rs                      |  205 ++-
 datafusion/src/test/exec.rs                        |   41 +-
 datafusion/src/test/mod.rs                         |   11 +-
 datafusion/src/test/variable.rs                    |    2 +
 datafusion/src/test_util.rs                        |    1 +
 datafusion/tests/custom_sources.rs                 |   29 +-
 datafusion/tests/dataframe.rs                      |    7 +-
 datafusion/tests/dataframe_functions.rs            |    5 +-
 datafusion/tests/merge_fuzz.rs                     |  222 ++++
 datafusion/tests/parquet_pruning.rs                |    6 +-
 datafusion/tests/path_partition.rs                 |    1 +
 datafusion/tests/provider_filter_pushdown.rs       |   15 +-
 datafusion/tests/sql/aggregates.rs                 |   90 +-
 datafusion/tests/sql/avro.rs                       |    5 +-
 datafusion/tests/sql/errors.rs                     |    3 +-
 datafusion/tests/sql/explain_analyze.rs            |   20 +-
 datafusion/tests/sql/expr.rs                       |   30 +
 datafusion/tests/sql/joins.rs                      |  229 +++-
 datafusion/tests/sql/mod.rs                        |   53 +-
 datafusion/tests/sql/parquet.rs                    |    6 +-
 datafusion/tests/sql/select.rs                     |   89 +-
 datafusion/tests/sql/timestamp.rs                  |    4 +-
 datafusion/tests/sql_integration.rs                |    1 +
 datafusion/tests/statistics.rs                     |   27 +-
 datafusion/tests/user_defined_plan.rs              |   17 +-
 dev/docker/ballista-base.dockerfile                |    2 +-
 239 files changed, 14188 insertions(+), 3895 deletions(-)
 create mode 100644 ballista/rust/executor/src/executor_server.rs
 create mode 100644 datafusion/src/avro_to_arrow/schema.rs
 create mode 100644 datafusion/src/cast.rs
 create mode 100644 datafusion/src/execution/disk_manager.rs
 create mode 100644 datafusion/src/execution/memory_manager.rs
 create mode 100644 datafusion/src/execution/runtime_env.rs
 create mode 100644 datafusion/src/physical_plan/coercion_rule/binary_rule.rs
 delete mode 100644 datafusion/src/physical_plan/expressions/coercion.rs
 create mode 100644 datafusion/src/physical_plan/expressions/correlation.rs
 create mode 100644 datafusion/src/physical_plan/expressions/covariance.rs
 rename datafusion/src/physical_plan/{ => expressions}/distinct_expressions.rs 
(70%)
 create mode 100644 datafusion/src/physical_plan/metrics/aggregated.rs
 create mode 100644 datafusion/src/physical_plan/sorts/mod.rs
 create mode 100644 datafusion/src/physical_plan/sorts/sort.rs
 rename datafusion/src/physical_plan/{ => sorts}/sort_preserving_merge.rs (67%)
 create mode 100644 datafusion/src/record_batch.rs
 create mode 100644 datafusion/tests/merge_fuzz.rs
 create mode 100644 datafusion/tests/sql_integration.rs

Reply via email to