Saurabh Chawla created SPARK-36027:
--------------------------------------

             Summary: Push down Filter in case of filter having child as 
TypedFilter
                 Key: SPARK-36027
                 URL: https://issues.apache.org/jira/browse/SPARK-36027
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.2
            Reporter: Saurabh Chawla


When a Filter has a TypedFilter as its child, the Filter is not pushed down 
to the data source.

 
{code:java}
scala> def testUdfFunction(r: String): Boolean = {
 | r.equals("hello")
 | }
testUdfFunction: (r: String)Boolean

scala> val df = spark.read.parquet("/testDir/testParquetSize/Parquetgzip/")
df: org.apache.spark.sql.DataFrame = [_1: string, _2: string ... 1 more field]
{code}
 

df.filter(x => testUdfFunction(x.getAs("_1"))).filter("_2<='id103855'").queryExecution.executedPlan

 
{code:java}
Filter (isnotnull(_2#1) AND (_2#1 <= id103855))
+- *(1) Filter $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$Lambda$3184/[email protected]
   +- *(1) ColumnarToRow
      +- FileScan parquet [_1#0,_2#1,_3#2] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/saurabh.chawla/testDir/testParquetSize/Parquetgzip], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_1:string,_2:string,_3:string>
 
{code}
df.filter(x => testUdfFunction(x.getAs("_1"))).filter("_2<='id103855'").queryExecution.optimizedPlan
{code:java}
Filter (isnotnull(_2#1) AND (_2#1 <= id103855))
+- TypedFilter $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$Lambda$3191/320569017@37a2806c, interface org.apache.spark.sql.Row, [StructField(_1,StringType,true), StructField(_2,StringType,true), StructField(_3,StringType,true)], createexternalrow(_1#0.toString, _2#1.toString, _3#2.toString, StructField(_1,StringType,true), StructField(_2,StringType,true), StructField(_3,StringType,true))
   +- Relation[_1#0,_2#1,_3#2] parquet
{code}
 

Filter pushdown needs to be supported for this scenario, so that the Filter 
can be pushed below the TypedFilter and reach the data source.
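
The intended rewrite can be sketched on a toy plan model. This is only an illustration, not Spark's Catalyst code: Plan, Relation, Filter, TypedFilter and PushFilterThroughTypedFilter below are hypothetical stand-ins defined for this example, and predicates are kept as plain strings. The sketch assumes both predicates are deterministic. Since both nodes only drop rows and leave the schema unchanged, swapping them should not change the result.

```scala
// Toy model only: hypothetical stand-ins, NOT Spark's Catalyst classes.
sealed trait Plan
final case class Relation(name: String) extends Plan
// A SQL-style predicate that a data source could evaluate (kept as a string here).
final case class Filter(condition: String, child: Plan) extends Plan
// An opaque lambda over whole rows; a data source cannot evaluate it.
final case class TypedFilter(func: String, child: Plan) extends Plan

object PushFilterThroughTypedFilter {
  // Moves every Filter below any TypedFilter beneath it.
  // Assumption: both predicates are deterministic, so reordering is safe.
  def apply(plan: Plan): Plan = plan match {
    case Filter(cond, child) =>
      // Normalize the child first, then swap if a TypedFilter ended up on top.
      apply(child) match {
        case TypedFilter(func, grandChild) =>
          TypedFilter(func, apply(Filter(cond, grandChild)))
        case other =>
          Filter(cond, other)
      }
    case TypedFilter(func, child) => TypedFilter(func, apply(child))
    case r: Relation              => r
  }
}

object Demo extends App {
  // Same shape as the optimized plan in this report: Filter over TypedFilter over Relation.
  val before =
    Filter("_2 <= 'id103855'", TypedFilter("testUdfFunction", Relation("parquet")))
  val after = PushFilterThroughTypedFilter(before)
  // The Filter now sits directly above the Relation, where pushdown could apply.
  println(after)
}
```

In the rewritten toy plan the Filter is adjacent to the Relation, which is the position from which a filter on _2 could be pushed to the Parquet scan.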

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

