Andy Grove created ARROW-9832:
---------------------------------

             Summary: [Rust] [DataFusion] Refactor PhysicalPlan to remove Partition
                 Key: ARROW-9832
                 URL: https://issues.apache.org/jira/browse/ARROW-9832
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust, Rust - DataFusion
            Reporter: Andy Grove
            Assignee: Andy Grove
             Fix For: 2.0.0


As a step towards supporting an improved threading model, I would like to 
refactor the physical plan to remove the redundant `Partition` trait. The 
implementations of this trait duplicate the state of their operator and add 
only the partition number. It would be simpler to pass the partition number 
directly to the execute() method of the ExecutionPlan trait.

This also requires each ExecutionPlan to declare its output partitioning, 
which will be needed for other reasons once we get into the physical 
optimizer.

Proposed trait:

{code:java}
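use std::fmt::Debug;
use std::sync::{Arc, Mutex};
// `SchemaRef`, `RecordBatchReader` and `Result` are the existing
// Arrow / DataFusion types.

/// Partition-aware execution plan for a relation
pub trait ExecutionPlan: Debug {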
    /// Get the schema for this execution plan
    fn schema(&self) -> SchemaRef;
    /// Specifies the output partitioning of this execution plan
    fn output_partitioning(&self) -> Partitioning;
    /// Execute this plan for a single partition and return a stream of results
    fn execute(
        &self,
        partition: usize,
    ) -> Result<Arc<Mutex<dyn RecordBatchReader + Send + Sync>>>;
}

/// Partitioning schemes supported by operators.
#[derive(Debug, Clone)]
pub enum Partitioning {
    /// Unknown partitioning scheme with a known number of partitions
    UnknownPartitioning(usize),
}
{code}
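
To illustrate how an operator might look under the proposed trait, here is a 
minimal, self-contained sketch. It is not the actual implementation: 
`SchemaRef`, `RecordBatch`, `RecordBatchReader` and `Result` are simplified 
stand-ins for the real Arrow / DataFusion types, and `MemoryScan`, 
`MemoryReader`, `partition_count()` and `collect()` are hypothetical names 
used only for this example. The point is that the operator holds its state 
once and receives the partition number as an argument, so no per-partition 
struct is needed.

```rust
use std::fmt::Debug;
use std::sync::{Arc, Mutex};

// Simplified stand-ins (assumptions) for the Arrow / DataFusion types.
type Result<T> = std::result::Result<T, String>;
type SchemaRef = Arc<Vec<String>>; // column names only
type RecordBatch = Vec<i64>; // a single column of values

/// Stand-in for Arrow's RecordBatchReader.
pub trait RecordBatchReader {
    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
}

/// Partitioning schemes supported by operators (as proposed).
#[derive(Debug, Clone)]
pub enum Partitioning {
    UnknownPartitioning(usize),
}

impl Partitioning {
    /// Hypothetical helper: number of output partitions.
    pub fn partition_count(&self) -> usize {
        match self {
            Partitioning::UnknownPartitioning(n) => *n,
        }
    }
}

/// The proposed trait, using the stand-in types above.
pub trait ExecutionPlan: Debug {
    fn schema(&self) -> SchemaRef;
    fn output_partitioning(&self) -> Partitioning;
    fn execute(
        &self,
        partition: usize,
    ) -> Result<Arc<Mutex<dyn RecordBatchReader + Send + Sync>>>;
}

/// Hypothetical in-memory scan: one list of batches per partition.
#[derive(Debug)]
pub struct MemoryScan {
    pub schema: SchemaRef,
    pub partitions: Vec<Vec<RecordBatch>>,
}

/// Reader over the batches of a single partition.
struct MemoryReader {
    batches: std::vec::IntoIter<RecordBatch>,
}

impl RecordBatchReader for MemoryReader {
    fn next_batch(&mut self) -> Result<Option<RecordBatch>> {
        Ok(self.batches.next())
    }
}

impl ExecutionPlan for MemoryScan {
    fn schema(&self) -> SchemaRef {
        self.schema.clone()
    }

    fn output_partitioning(&self) -> Partitioning {
        Partitioning::UnknownPartitioning(self.partitions.len())
    }

    fn execute(
        &self,
        partition: usize,
    ) -> Result<Arc<Mutex<dyn RecordBatchReader + Send + Sync>>> {
        // The partition number arrives as an argument; the operator's own
        // state is not copied into a separate per-partition object.
        let batches = self
            .partitions
            .get(partition)
            .ok_or_else(|| format!("invalid partition {}", partition))?
            .clone();
        let reader: Arc<Mutex<dyn RecordBatchReader + Send + Sync>> =
            Arc::new(Mutex::new(MemoryReader { batches: batches.into_iter() }));
        Ok(reader)
    }
}

/// Hypothetical driver: executes each partition in turn and gathers all batches.
pub fn collect(plan: &dyn ExecutionPlan) -> Result<Vec<RecordBatch>> {
    let mut out = Vec::new();
    for partition in 0..plan.output_partitioning().partition_count() {
        let stream = plan.execute(partition)?;
        let mut guard = stream.lock().map_err(|e| e.to_string())?;
        while let Some(batch) = guard.next_batch()? {
            out.push(batch);
        }
    }
    Ok(out)
}
```

A caller only needs `output_partitioning()` to learn how many partitions 
exist and can then call `execute()` once per partition, for example one call 
per thread. The sequential loop in `collect()` is just the simplest way to 
show this.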



--
This message was sent by Atlassian Jira
(v8.3.4#803005)