[jira] [Created] (ARROW-12437) [Rust] [Ballista] Ballista plans must not include RepartitionExec
Andy Grove created ARROW-12437:

    Summary: [Rust] [Ballista] Ballista plans must not include RepartitionExec
    Key: ARROW-12437
    URL: https://issues.apache.org/jira/browse/ARROW-12437
    Project: Apache Arrow
    Issue Type: Bug
    Components: Rust - Ballista
    Reporter: Andy Grove

Ballista plans must not include RepartitionExec because it produces incorrect results. Ballista needs to manage its own repartitioning in a distributed-aware way later on. For now we just need to configure the DataFusion context to disable repartitioning.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12436) [Rust][Ballista] Add watch capabilities to config backend trait
Ximo Guanter created ARROW-12436:

    Summary: [Rust][Ballista] Add watch capabilities to config backend trait
    Key: ARROW-12436
    URL: https://issues.apache.org/jira/browse/ARROW-12436
    Project: Apache Arrow
    Issue Type: Task
    Components: Rust - Ballista
    Reporter: Ximo Guanter

[arrow/lib.rs at 66aa3e7c365a8d4c4eca6e23668f2988e714b493 · apache/arrow (github.com)|https://github.com/apache/arrow/blob/66aa3e7c365a8d4c4eca6e23668f2988e714b493/rust/ballista/rust/scheduler/src/lib.rs#L183]
[jira] [Created] (ARROW-12435) [Rust][DataFusion] Remove unnecessary references to namespace in executor
Ximo Guanter created ARROW-12435:

    Summary: [Rust][DataFusion] Remove unnecessary references to namespace in executor
    Key: ARROW-12435
    URL: https://issues.apache.org/jira/browse/ARROW-12435
    Project: Apache Arrow
    Issue Type: Task
    Components: Rust - Ballista
    Reporter: Ximo Guanter

There is no need to support multiple executor clusters from a scheduler, so the namespace of an executor is implicitly defined by the scheduler it connects to. See [https://the-asf.slack.com/archives/C01QUFS30TD/p1618679585211100] for more context.
[jira] [Created] (ARROW-12434) [Rust] [Ballista] Show executed plans with metrics
Andy Grove created ARROW-12434:

    Summary: [Rust] [Ballista] Show executed plans with metrics
    Key: ARROW-12434
    URL: https://issues.apache.org/jira/browse/ARROW-12434
    Project: Apache Arrow
    Issue Type: New Feature
    Components: Rust - Ballista
    Reporter: Andy Grove
    Assignee: Andy Grove
    Fix For: 5.0.0

Show executed plans with metrics to help with debugging and performance tuning.
[jira] [Created] (ARROW-12433) [Rust] Builds failing due to new flatbuffer release introducing const generics
Andy Grove created ARROW-12433:

    Summary: [Rust] Builds failing due to new flatbuffer release introducing const generics
    Key: ARROW-12433
    URL: https://issues.apache.org/jira/browse/ARROW-12433
    Project: Apache Arrow
    Issue Type: Bug
    Affects Versions: 4.0.0
    Reporter: Andy Grove

I filed [https://github.com/google/flatbuffers/issues/6572], but for now we should pin the dependency to 0.8.3.
[jira] [Created] (ARROW-12432) [Rust] [DataFusion] Add metrics for SortExec
Andy Grove created ARROW-12432:

    Summary: [Rust] [DataFusion] Add metrics for SortExec
    Key: ARROW-12432
    URL: https://issues.apache.org/jira/browse/ARROW-12432
    Project: Apache Arrow
    Issue Type: New Feature
    Components: Rust - DataFusion
    Reporter: Andy Grove
    Fix For: 5.0.0

Add metrics for SortExec.
[jira] [Created] (ARROW-12431) [Python] pa.array mask inverted when type is binary and value to be converted in numpy array
Daniel Nugent created ARROW-12431:

    Summary: [Python] pa.array mask inverted when type is binary and value to be converted in numpy array
    Key: ARROW-12431
    URL: https://issues.apache.org/jira/browse/ARROW-12431
    Project: Apache Arrow
    Issue Type: Bug
    Reporter: Daniel Nugent

{code:python}
Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pyarrow as pa
>>>
>>> pa.array(np.array([b'\x00']),type=pa.binary(1), mask = np.array([False]))
[
  null
]
>>> pa.array(np.array([b'\x00']),type=pa.binary(1), mask = np.array([True]))
[
  00
]
>>> pa.array([b'\x00'],type=pa.binary(1), mask = np.array([False]))
[
  00
]
>>> pa.__version__
'3.0.0'
>>> np.__version__
'1.20.1'
{code}

This happens with both FixedSizeBinary and variable-sized binary (I was working with FixedSizeBinary). It does not seem to happen for integers (and presumably not for other types, though I didn't check exhaustively).
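
For reference, the documented meaning of the mask argument to pa.array is that True marks a value as null and False marks it as valid, so the first two results in the session above are swapped relative to that convention. The following is a minimal pure-Python sketch of the expected behaviour; it does not call pyarrow, and the helper names apply_mask and matches_documented_mask are hypothetical, introduced here only for illustration.

```python
from typing import List, Optional, Sequence


def apply_mask(values: Sequence[bytes], mask: Sequence[bool]) -> List[Optional[bytes]]:
    """Reference semantics (hypothetical helper): mask[i] == True means null."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    # Replace every masked position with None and keep the rest unchanged.
    return [None if is_null else value for value, is_null in zip(values, mask)]


def matches_documented_mask(
    observed: Sequence[Optional[bytes]],
    values: Sequence[bytes],
    mask: Sequence[bool],
) -> bool:
    """Return True if an observed result agrees with the reference semantics."""
    return list(observed) == apply_mask(values, mask)


# Expected results for the two inputs from the report.
print(apply_mask([b"\x00"], [False]))  # [b'\x00']
print(apply_mask([b"\x00"], [True]))   # [None]

# The result reported for the NumPy input with mask=[False] was null,
# which does not agree with the reference semantics.
print(matches_documented_mask([None], [b"\x00"], [False]))  # False
```

With pyarrow installed, a regression check could compare the to_pylist() output of each pa.array call against apply_mask for the same values and mask.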