> So here are my recommendations for moving forward, with DataSourceV2 as a
> starting point:
> 1. Use well-defined logical plan nodes for all high-level operations:
> insert, create, CTAS, overwrite table, etc.
> 2. Use rules that match on these high-level plan nodes, so that it
> isn’t necessary to create rules to match each eventual code path
> 3. Define Spark’s behavior for these logical plan nodes. Physical
> nodes should implement that behavior, and all CREATE TABLE OVERWRITE
> implementations should (eventually) make the same guarantees.
> 4. Specialize implementations when creating the physical plan, not the
> logical plan.
> I realize this is really long, but I’d like to hear thoughts about this.
> I’m sure I’ve left out some additional context, but I think the main idea
> here is solid: let's standardize logical plans for more consistent behavior
> and easier maintenance.
Context aside, I really like these rules! I think having query planning be
the boundary for specialization makes a lot of sense.
(RunnableCommand might also be my fault though.... sorry! :P)
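To make that boundary concrete, here's a toy sketch of the idea, not actual Spark/Catalyst code: the node and planner names below (`AppendData`, `CreateTableAsSelect`, `Planner`, etc.) are illustrative stand-ins. One well-defined logical node exists per high-level operation, and source-specific specialization happens only when the planner converts it to a physical node:

```scala
// Toy model only -- NOT real Catalyst APIs. All names here are hypothetical.
sealed trait LogicalPlan
case class Relation(table: String) extends LogicalPlan

// One well-defined logical node per high-level operation (rule 1):
case class AppendData(table: String, query: LogicalPlan) extends LogicalPlan
case class CreateTableAsSelect(table: String, query: LogicalPlan) extends LogicalPlan
case class OverwriteByExpression(table: String, filter: String, query: LogicalPlan)
    extends LogicalPlan

sealed trait PhysicalPlan
case class AppendExec(table: String) extends PhysicalPlan
case class AtomicCTASExec(table: String) extends PhysicalPlan
case class OverwriteExec(table: String, filter: String) extends PhysicalPlan

object Planner {
  // A single rule matches the high-level logical nodes (rule 2), so the
  // logical plan -- and the behavior it guarantees (rule 3) -- stays uniform;
  // per-source specialization lives only in this logical-to-physical
  // conversion (rule 4).
  def plan(p: LogicalPlan): PhysicalPlan = p match {
    case AppendData(t, _)               => AppendExec(t)
    case CreateTableAsSelect(t, _)      => AtomicCTASExec(t)
    case OverwriteByExpression(t, f, _) => OverwriteExec(t, f)
    case other                          => sys.error(s"no physical plan for $other")
  }
}
```

The point of the sketch is where the `match` lives: every code path that produces a `CreateTableAsSelect` flows through the same rule, instead of each path needing its own matching rule.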