0 (non-binding)

I got 3 test failures while running the verification script; the output is attached below.

sql::explain_analyze::csv_explain
sql::explain_analyze::test_physical_plan_display_indent
sql::explain_analyze::test_physical_plan_display_indent_multi_children

Tested on macOS Monterey / Apple M1 Pro
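
Looking at the diffs below, the only difference between expected and actual is an extra "private" prefix on the CSV path (ARROW_TEST_DATA vs. privateARROW_TEST_DATA), so my guess is that this is the macOS /private symlink (for /var and /tmp) surviving path canonicalization rather than a planner regression. Here is a minimal sketch of what I suspect is happening; the paths and the substitution are hypothetical, not the actual test code:

// Minimal sketch of the suspected mismatch (illustrative only, not DataFusion code).
// On macOS, /var and /tmp are symlinks into /private, so a canonicalized test file
// path gains a "/private" prefix that a plain substitution of the ARROW_TEST_DATA
// value does not strip.
fn main() {
    // Hypothetical values for illustration.
    let arrow_test_data = "/var/tmp/arrow-testing/data";
    let reported_path = "/private/var/tmp/arrow-testing/data/csv/aggregate_test_100.csv";

    // Replacing the ARROW_TEST_DATA value leaves "private" behind, which matches
    // the "privateARROW_TEST_DATA" strings in the failures below.
    let normalized = reported_path.replace(arrow_test_data, "ARROW_TEST_DATA");
    assert!(normalized.contains("privateARROW_TEST_DATA"));
    println!("{normalized}");
}

If that is the cause, canonicalizing the ARROW_TEST_DATA value before doing the substitution in the test normalization should make these pass on macOS, but I have not verified that.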

Best,
Ian

On Tue, Jul 12, 2022 at 11:45 AM Andy Grove <andygrov...@gmail.com> wrote:

> Hi,
>
> I would like to propose a release of Apache Arrow DataFusion
> Implementation,
> version 10.0.0.
>
> This release candidate is based on commit:
> d25e822c1ef85ee7c0297b4b38d05a51b0d2e46f [1]
> The proposed release tarball and signatures are hosted at [2].
> The changelog is located at [3].
>
> Please download, verify checksums and signatures, run the unit tests, and
> vote
> on the release. The vote will be open for at least 72 hours.
>
> Only votes from PMC members are binding, but all members of the community
> are
> encouraged to test the release and vote with "(non-binding)".
>
> The standard verification procedure is documented at
>
> https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md#verifying-release-candidates
> .
>
> [ ] +1 Release this as Apache Arrow DataFusion 10.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow DataFusion 10.0.0 because...
>
> Here is my vote:
>
> +1
>
> [1]:
>
> https://github.com/apache/arrow-datafusion/tree/d25e822c1ef85ee7c0297b4b38d05a51b0d2e46f
> [2]:
>
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-10.0.0-rc1
> [3]:
>
> https://github.com/apache/arrow-datafusion/blob/d25e822c1ef85ee7c0297b4b38d05a51b0d2e46f/CHANGELOG.md
>
failures:

---- sql::explain_analyze::csv_explain stdout ----
thread 'sql::explain_analyze::csv_explain' panicked at 'assertion failed: 
`(left == right)`
  left: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter: 
#aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100 
projection=[c1, c2], partial_filters=[#aggregate_test_100.c2 > Int64(10)]"], 
["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n  CoalesceBatchesExec: 
target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n      
RepartitionExec: partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c2]\n"]]`,
 right: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter: 
#aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100 
projection=[c1, c2], partial_filters=[#aggregate_test_100.c2 > Int64(10)]"], 
["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n  CoalesceBatchesExec: 
target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n      
RepartitionExec: partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c2]\n"]]`', 
datafusion/core/tests/sql/explain_analyze.rs:766:5

---- sql::explain_analyze::test_physical_plan_display_indent stdout ----
thread 'sql::explain_analyze::test_physical_plan_display_indent' panicked at 
'assertion failed: `(left == right)`
  left: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2 
DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as c1, 
MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), 
MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec: 
mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), 
MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec: 
target_batch_size=4096", "            RepartitionExec: 
partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "              
AggregateExec: mode=Partial, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "             
   CoalesceBatchesExec: target_batch_size=4096", "                  FilterExec: 
c12@1 < CAST(10 AS Float64)", "                    RepartitionExec: 
partitioning=RoundRobinBatch(9000)", "                      CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c12]"]`,
 right: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2 
DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as c1, 
MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), 
MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec: 
mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), 
MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec: 
target_batch_size=4096", "            RepartitionExec: 
partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "              
AggregateExec: mode=Partial, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "             
   CoalesceBatchesExec: target_batch_size=4096", "                  FilterExec: 
c12@1 < CAST(10 AS Float64)", "                    RepartitionExec: 
partitioning=RoundRobinBatch(9000)", "                      CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c12]"]`: expected:
[
    "GlobalLimitExec: skip=None, fetch=10",
    "  SortExec: [the_min@2 DESC]",
    "    CoalescePartitionsExec",
    "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as 
MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
    "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
    "          CoalesceBatchesExec: target_batch_size=4096",
    "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", 
index: 0 }], 9000)",
    "              AggregateExec: mode=Partial, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
    "                CoalesceBatchesExec: target_batch_size=4096",
    "                  FilterExec: c12@1 < CAST(10 AS Float64)",
    "                    RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                      CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c12]",
]
actual:

[
    "GlobalLimitExec: skip=None, fetch=10",
    "  SortExec: [the_min@2 DESC]",
    "    CoalescePartitionsExec",
    "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as 
MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
    "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
    "          CoalesceBatchesExec: target_batch_size=4096",
    "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", 
index: 0 }], 9000)",
    "              AggregateExec: mode=Partial, gby=[c1@0 as c1], 
aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
    "                CoalesceBatchesExec: target_batch_size=4096",
    "                  FilterExec: c12@1 < CAST(10 AS Float64)",
    "                    RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                      CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1, c12]",
]
', datafusion/core/tests/sql/explain_analyze.rs:680:5

---- sql::explain_analyze::test_physical_plan_display_indent_multi_children 
stdout ----
thread 'sql::explain_analyze::test_physical_plan_display_indent_multi_children' 
panicked at 'assertion failed: `(left == right)`
  left: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec: 
target_batch_size=4096", "    HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]", " 
     CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: 
partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "          
ProjectionExec: expr=[c1@0 as c1]", "            ProjectionExec: expr=[c1@0 as 
c1]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "    
            CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], 
has_header=true, limit=None, projection=[c1]", "      CoalesceBatchesExec: 
target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { 
name: \"c2\", index: 0 }], 9000)", "          ProjectionExec: expr=[c2@0 as 
c2]", "            ProjectionExec: expr=[c1@0 as c2]", "              
RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]"]`,
 right: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec: 
target_batch_size=4096", "    HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]", " 
     CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: 
partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "          
ProjectionExec: expr=[c1@0 as c1]", "            ProjectionExec: expr=[c1@0 as 
c1]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "    
            CsvExec: files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], 
has_header=true, limit=None, projection=[c1]", "      CoalesceBatchesExec: 
target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { 
name: \"c2\", index: 0 }], 9000)", "          ProjectionExec: expr=[c2@0 as 
c2]", "            ProjectionExec: expr=[c1@0 as c2]", "              
RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]"]`: expected:
[
    "ProjectionExec: expr=[c1@0 as c1]",
    "  CoalesceBatchesExec: target_batch_size=4096",
    "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: 
\"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
    "      CoalesceBatchesExec: target_batch_size=4096",
    "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 
0 }], 9000)",
    "          ProjectionExec: expr=[c1@0 as c1]",
    "            ProjectionExec: expr=[c1@0 as c1]",
    "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]",
    "      CoalesceBatchesExec: target_batch_size=4096",
    "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 
0 }], 9000)",
    "          ProjectionExec: expr=[c2@0 as c2]",
    "            ProjectionExec: expr=[c1@0 as c2]",
    "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                CsvExec: 
files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]",
]
actual:

[
    "ProjectionExec: expr=[c1@0 as c1]",
    "  CoalesceBatchesExec: target_batch_size=4096",
    "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: 
\"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
    "      CoalesceBatchesExec: target_batch_size=4096",
    "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 
0 }], 9000)",
    "          ProjectionExec: expr=[c1@0 as c1]",
    "            ProjectionExec: expr=[c1@0 as c1]",
    "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]",
    "      CoalesceBatchesExec: target_batch_size=4096",
    "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 
0 }], 9000)",
    "          ProjectionExec: expr=[c2@0 as c2]",
    "            ProjectionExec: expr=[c1@0 as c2]",
    "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
    "                CsvExec: 
files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, 
limit=None, projection=[c1]",
]
', datafusion/core/tests/sql/explain_analyze.rs:731:5


failures:
    sql::explain_analyze::csv_explain
    sql::explain_analyze::test_physical_plan_display_indent
    sql::explain_analyze::test_physical_plan_display_indent_multi_children

test result: FAILED. 386 passed; 3 failed; 2 ignored; 0 measured; 0 filtered 
out; finished in 2.43s

error: test failed, to rerun pass '-p datafusion --test sql_integration'
+ cleanup
+ '[' no = yes ']'
+ echo 'Failed to verify release candidate. See 
/var/folders/cl/ycxd_6916zlf50f8mpthd9qw0000gn/T/arrow-10.0.0.XXXXX.QQbqgyHb 
for details.'
Failed to verify release candidate. See 
/var/folders/cl/ycxd_6916zlf50f8mpthd9qw0000gn/T/arrow-10.0.0.XXXXX.QQbqgyHb 
for details.
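
For reference, the three failing tests can be rerun in isolation with something like:

    cargo test -p datafusion --test sql_integration explain_analyze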
