Hi, queryPlan.baseLogicalPlan is not the plan used for execution. The baseLogicalPlan of a SchemaRDD (queryPlan in your case) is just the parsed plan: the parsed plan is first analyzed, then optimized, and finally a physical plan is created from it. The plan that shows up after you execute "val queryPlan = sql("select value from (select key,value from src)a where a.key=86 ")" is the physical plan. Alternatively, you can use queryPlan.queryExecution to see the Logical Plan, the Optimized Logical Plan, and the Physical Plan.
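For example, in the hive/console shell (a minimal sketch; the field names on queryExecution below are taken from the Spark 1.0.x QueryExecution class, so double-check them against your build):

scala> val queryPlan = sql("select value from (select key,value from src)a where a.key=86 ")
scala> queryPlan.queryExecution.logical        // parsed plan (what baseLogicalPlan returns)
scala> queryPlan.queryExecution.analyzed       // after the analyzer resolves attributes
scala> queryPlan.queryExecution.optimizedPlan  // after the optimizer (the filter is pushed down here)
scala> queryPlan.queryExecution.executedPlan   // the physical plan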
You will find that the physical plan is:

== Physical Plan ==
Project [value#3:0]
 Filter (key#2:1 = 86)
  HiveTableScan [value#3,key#2], (MetastoreRelation default, src, None), None

On why your transform itself does not fire, see the note after your quoted message below.

Thanks,

Yin

On Mon, Jul 14, 2014 at 3:42 AM, victor sheng <victorsheng...@gmail.com> wrote:
> Hi, I encountered a weird problem in Spark SQL.
> I used sbt/sbt hive/console to go into the shell.
>
> I am testing filter pushdown using Catalyst.
>
> scala> val queryPlan = sql("select value from (select key,value from src)a
> where a.key=86 ")
> scala> queryPlan.baseLogicalPlan
> res0: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
> Project ['value]
>  Filter ('a.key = 86)
>   Subquery a
>    Project ['key,'value]
>     UnresolvedRelation None, src, None
>
> I want to achieve the "Filter Push Down", so I ran:
>
> scala> var newQuery = queryPlan.baseLogicalPlan transform {
>      | case f @ Filter(_, p @ Project(_, grandChild))
>      |   if (f.references subsetOf grandChild.output) =>
>      |   p.copy(child = f.copy(child = grandChild))
>      | }
> <console>:42: error: type mismatch;
>  found   : Seq[org.apache.spark.sql.catalyst.expressions.Attribute]
>  required: scala.collection.GenSet[org.apache.spark.sql.catalyst.expressions.Attribute]
>        if (f.references subsetOf grandChild.output) =>
>                                             ^
>
> It throws the error above, and I don't know what's wrong.
>
> If I replace the guard with "if true":
>
> scala> var newQuery = queryPlan.baseLogicalPlan transform {
>      | case f @ Filter(_, p @ Project(_, grandChild))
>      |   if true =>
>      |   p.copy(child = f.copy(child = grandChild))
>      | }
> newQuery: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
> Project ['value]
>  Filter ('a.key = 86)
>   Subquery a
>    Project ['key,'value]
>     UnresolvedRelation None, src, None
>
> The Filter stays in the same position; the order is not switched.
> Can anyone guide me on this?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/spark1-0-1-catalyst-transform-filter-not-push-down-tp9599.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
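On the two problems in the quoted snippet: the type mismatch comes from comparing a Set with a Seq. In the 1.0.x Catalyst API, f.references is a Set[Attribute] while grandChild.output is a Seq[Attribute], so subsetOf needs a .toSet on the right-hand side. The deeper issue is that the pattern never matches: in your parsed plan there is a Subquery node sitting between the Filter and the inner Project, so Filter(_, Project(_, _)) never fires, which is also why the plan is unchanged even with "if true". Here is a sketch of a working version (untested; it assumes the Spark 1.0.x EliminateAnalysisOperators rule in org.apache.spark.sql.catalyst.optimizer, which strips Subquery wrappers, and it runs against the analyzed plan rather than the parsed one so attributes are resolved):

scala> import org.apache.spark.sql.catalyst.plans.logical.{Filter, Project}
scala> import org.apache.spark.sql.catalyst.optimizer.EliminateAnalysisOperators

scala> // resolve attributes first, then strip the Subquery wrapper that
scala> // sits between the Filter and the Project and blocks the match
scala> val resolved = EliminateAnalysisOperators(queryPlan.queryExecution.analyzed)

scala> val newQuery = resolved transform {
     |   case f @ Filter(_, p @ Project(_, grandChild))
     |       // output is a Seq[Attribute]; convert it to a Set so subsetOf type-checks
     |       if f.references subsetOf grandChild.output.toSet =>
     |     p.copy(child = f.copy(child = grandChild))
     | }

Note also that the built-in optimizer already contains a rule for this (PushPredicateThroughProject, if I remember the 1.0.x name correctly), so queryPlan.queryExecution.optimizedPlan shows the filter below the project without any manual transform.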