Ahhh, thanks all
So if I'm understanding right, a "Physical Plan" would be something like
this?
JdbcToEnumerableConverter
  JdbcProject(fields=["name"])
    JdbcFilter(condition=[<("id", 5)])
      JdbcTableScan(table=[["user"]])
Where this is a plan that says to "physically" use a JDBC source
for the underlying operations
And the translation of the Logical Plan -> SQL and execution against
the DB would be an IMPLEMENTATION DETAIL of these Physical Plan operators?
Something like this?
class JdbcToEnumerableConverter {
  public Result implement() {
    // Visit child expressions, produce SQL,
    // execute against JDBC source, return Enumerable of rows
  }
}
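
To make the question concrete, here's a rough, self-contained sketch of what I'm picturing. This is plain Java and NOT Calcite's actual classes: JdbcRel, toSql(), and the executor function are names I made up for illustration, the filter condition is passed in as pre-rendered SQL, and a real implementation would go through java.sql and a proper SQL generator rather than string concatenation.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

public class PhysicalPlanSketch {

    /** Hypothetical physical operator in a "JDBC convention": it can render itself as SQL. */
    interface JdbcRel {
        String toSql();
    }

    static final class JdbcTableScan implements JdbcRel {
        private final String table;
        JdbcTableScan(String table) { this.table = table; }
        @Override public String toSql() {
            return "SELECT * FROM \"" + table + "\"";
        }
    }

    static final class JdbcFilter implements JdbcRel {
        private final JdbcRel input;
        private final String condition; // pre-rendered SQL predicate, to keep the sketch short
        JdbcFilter(JdbcRel input, String condition) {
            this.input = input;
            this.condition = condition;
        }
        @Override public String toSql() {
            // Wrap the child as a subquery instead of merging clauses; verbose but simple.
            return "SELECT * FROM (" + input.toSql() + ") AS t WHERE " + condition;
        }
    }

    static final class JdbcProject implements JdbcRel {
        private final JdbcRel input;
        private final List<String> fields;
        JdbcProject(JdbcRel input, List<String> fields) {
            this.input = input;
            this.fields = fields;
        }
        @Override public String toSql() {
            StringBuilder cols = new StringBuilder();
            for (String field : fields) {
                if (cols.length() > 0) cols.append(", ");
                cols.append('"').append(field).append('"');
            }
            return "SELECT " + cols + " FROM (" + input.toSql() + ") AS t";
        }
    }

    /** Boundary operator: everything below it is pushed to the database as one SQL string. */
    static final class JdbcToEnumerableConverter {
        private final JdbcRel input;
        private final Function<String, List<Object[]>> executor; // stand-in for a JDBC connection
        JdbcToEnumerableConverter(JdbcRel input, Function<String, List<Object[]>> executor) {
            this.input = input;
            this.executor = executor;
        }
        List<Object[]> implement() {
            String sql = input.toSql();  // visit children, produce SQL
            return executor.apply(sql);  // "execute" it and return the rows
        }
    }

    /** Builds the example plan from this email and returns the SQL it would send. */
    static String demoSql() {
        JdbcRel plan = new JdbcProject(
            new JdbcFilter(new JdbcTableScan("user"), "\"id\" < 5"),
            Collections.singletonList("name"));
        return plan.toSql();
    }

    public static void main(String[] args) {
        JdbcRel plan = new JdbcProject(
            new JdbcFilter(new JdbcTableScan("user"), "\"id\" < 5"),
            Collections.singletonList("name"));
        // Fake executor: prints the SQL instead of talking to a real database.
        JdbcToEnumerableConverter root = new JdbcToEnumerableConverter(plan, sql -> {
            System.out.println("would execute: " + sql);
            return Collections.emptyList();
        });
        root.implement();
    }
}
```

The part I care about is the shape: the Jdbc* nodes are "physical" only in the sense that they commit to running in a JDBC source, and the SQL string is just what falls out when the converter at the top is asked to implement itself.
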
On Wed, Mar 23, 2022 at 7:31 AM Justin Swanhart <[email protected]> wrote:
> Hi,
>
> Generally a cost-based optimizer chooses physical plans (sometimes with
> help from a rule-based optimizer). Hints such as /*+USE_NL*/ or
> /*+HASH_JOIN*/ (the syntax depends on the RDBMS) generally allow the user
> to override the optimizer and choose a physical plan that differs from
> what the database would pick.
>
> On Wed, Mar 23, 2022 at 8:19 AM Charles Givre <[email protected]> wrote:
>
> > Hi Gavin,
> > Thanks for the book recommendation. That looks really solid and I'm
> > definitely going to pick up a copy. To continue the conversation about
> > physical plans, Drill does allow you to view the logical and physical
> > plans from a query AND modify them, or submit your own. Here's a doc
> > link that explains how: https://drill.apache.org/docs/query-plans/
> > Best,
> > -- C
> >
> > > On Mar 23, 2022, at 5:26 AM, Alessandro Solimando <[email protected]> wrote:
> > >
> > > Hi Gavin,
> > > in a nutshell, a logical plan consists of logical operators (say, a
> > > join), which can be implemented in several ways at the physical level
> > > (say, merge join, hash join, etc.), and are therefore associated with
> > > some corresponding physical operators.
> > >
> > > How the logical vs physical planning is performed depends on the
> > > optimization framework you use.
> > >
> > > SQL is not a plan, it's a declarative language: it only dictates "what"
> > > you will be getting, the logical plan sketches how you will get it, and
> > > the physical plan fills in the missing details.
> > > Then there is also query compilation, which translates to "real
> > > instructions" (be it machine code, a DAG for an execution engine like
> > > Spark or Hive, etc.).
> > >
> > > In federated queries, you will plan and split the work across different
> > > databases, and you will pass them either SQL or some native query
> > > language (CQL for Cassandra, for example), but then the DB will do all
> > > the steps again (parse, validate, etc.), build a logical plan, optimize
> > > it, and run it as it thinks best, which can be radically different from
> > > what you envisioned.
> > >
> > > I have never heard of systems where you can directly inject a
> > > physical/execution plan, but I haven't really looked for that.
> > >
> > > HTH,
> > > Alessandro
> > >
> > > On Tue, 22 Mar 2022 at 22:10, Gavin Ray <[email protected]> wrote:
> > >
> > >> I'm on my second pass of the book "How Query Engines Work" by Arrow's
> > >> own Andy Grove
> > >> (Really great read, huge recommendation:
> > >> https://leanpub.com/how-query-engines-work)
> > >>
> > >> Something I'm not sure I'm fully understanding is what qualifies
> > >> something as a Physical Plan
> > >> A logical plan is straightforward, say I have an expression, "Select
> > >> name from users where ID is less than 5"
> > >>
> > >> Then I can represent this as an abstract, logical operation like:
> > >>
> > >> LogicalPlan(
> > >>   project = ["name"],
> > >>   filter = Filter(LessThan(Column("id"), Literal(5)))
> > >> )
> > >>
> > >> Now say I want to give this plan to a database and have it run it.
> > >> I need to write an implementation for translating this to an
> > >> executable expression
> > >> (Probably SQL)
> > >>
> > >> Is the class that implements the translation to SQL that gets executed
> > >> the Physical Plan
> > >> Or is there no Physical Plan, and that's the database's job to figure
> > >> out?
> > >>