[ 
https://issues.apache.org/jira/browse/SPARK-19638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872413#comment-15872413
 ] 

Nick Dimiduk commented on SPARK-19638:
--------------------------------------

Debugging. I'm looking at the match expression in 
[{{DataSourceStrategy#translateFilter(Expression)}}|https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala#L509].
 The predicate comes in as a {{EqualTo(GetStructField, Literal)}}. This doesn't 
match any of the cases. I was expecting it to step into the [{{case 
expressions.EqualTo(a: Attribute, Literal(v, t)) 
=>}}|https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala#L511]
 on line 511 but execution steps past this point. Upon investigation, 
{{GetStructField}} does not extend {{Attribute}}.

>From this point, the {{EqualTo}} condition involving the struct field is 
>dropped from the filter set pushed down to the ES connector. Thus I believe 
>this is an issue in Spark, not in the connector.

> Filter pushdown not working for struct fields
> ---------------------------------------------
>
>                 Key: SPARK-19638
>                 URL: https://issues.apache.org/jira/browse/SPARK-19638
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Nick Dimiduk
>
> Working with a dataset containing struct fields, and enabling debug logging 
> in the ES connector, I'm seeing the following behavior. The dataframe is 
> created over the ES connector and then the schema is extended with a couple 
> column aliases, such as.
> {noformat}
> df.withColumn("f2", df("foo"))
> {noformat}
> Queries vs those alias columns work as expected for fields that are 
> non-struct members.
> {noformat}
> scala> df.withColumn("f2", df("foo")).where("f2 == '1'").limit(0).show
> 17/02/16 15:06:49 DEBUG DataSource: Pushing down filters 
> [IsNotNull(foo),EqualTo(foo,1)]
> 17/02/16 15:06:49 TRACE DataSource: Transformed filters into DSL 
> [{"exists":{"field":"foo"}},{"match":{"foo":"1"}}]
> {noformat}
> However, try the same with an alias over a struct field, and no filters are 
> pushed down.
> {noformat}
> scala> df.withColumn("bar_baz", df("bar.baz")).where("bar_baz == 
> '1'").limit(1).show
> {noformat}
> In fact, this is the case even when no alias is used at all.
> {noformat}
> scala> df.where("bar.baz == '1'").limit(1).show
> {noformat}
> Basically, pushdown for structs doesn't work at all.
> Maybe this is specific to the ES connector?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to