szehon-ho opened a new issue, #5132:
URL: https://github.com/apache/iceberg/issues/5132

   From the discussion in https://github.com/apache/iceberg/pull/5113 with 
@huaxingao, I found the following behavior on an Iceberg table:
   
   select * from table where table.struct_field = struct(10)
   > org.apache.spark.sql.AnalysisException: cannot resolve '(table.struct_field = struct(10))' due to data type mismatch: differing types in '(table.struct_field = struct(1))' (struct<nested:int> and struct<col1:int>).; line 1 pos 39;
   
   select * from table where table.struct_field in (struct(10))
   ```
   java.lang.IllegalArgumentException: Cannot create expression literal from org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema: [1]
     at org.apache.iceberg.expressions.Literals.from(Literals.java:87)
     at org.apache.iceberg.expressions.UnboundPredicate.<init>(UnboundPredicate.java:40)
     at org.apache.iceberg.expressions.Expressions.equal(Expressions.java:175)
     at org.apache.iceberg.spark.SparkFilters.handleEqual(SparkFilters.java:239)
     at org.apache.iceberg.spark.SparkFilters.convert(SparkFilters.java:152)
     at org.apache.iceberg.spark.source.SparkScanBuilder.pushFilters(SparkScanBuilder.java:106)
     at org.apache.spark.sql.execution.datasources.v2.PushDownUtils$.pushFilters(PushDownUtils.scala:69)
     at org.apache.spark.sql.execution.datasources.v2.V2ScanRelationPushDown$$anonfun$pushDownFilters$1.applyOrElse(V2ScanRelationPushDown.scala:60)
     at org.apache.spark.sql.execution.datasources.v2.V2ScanRelationPushDown$$anonfun$pushDownFilters$1.applyOrElse(V2ScanRelationPushDown.scala:47)
   ```
   
   
   But on a non-Iceberg Spark table, both queries succeed:
   
   ```
   spark.sql("select * from test_struct_non_iceberg where struct_field in(struct(10))").show
   +------------+
   |struct_field|
   +------------+
   |        {10}|
   +------------+
   
   
   scala> spark.sql("select * from test_struct_non_iceberg where struct_field = struct(10)").show
   +------------+
   |struct_field|
   +------------+
   |        {10}|
   +------------+
   ```
   
   It's possible that Iceberg cannot handle these filters at all, since it does 
not collect metrics for anything other than primitive columns. If so, we should 
probably not push these filters down in the first place. There may also be 
other problems (for example, the returned schema not matching).
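
A minimal sketch of the "do not push down" idea, for illustration only. It uses hypothetical stand-in classes (`EqualTo`, `Split`, `isPrimitiveLiteral`, `split`), not the real Spark or Iceberg APIs, and it assumes that a filter is safe to push only when its literal is a primitive-like value. Filters with a struct value are kept aside so the engine can evaluate them after the scan instead of failing during conversion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Spark or Iceberg classes.
class StructFilterPushdownSketch {

  /** A simplified equality filter: column = value. */
  static final class EqualTo {
    final String column;
    final Object value;

    EqualTo(String column, Object value) {
      this.column = column;
      this.value = value;
    }

    @Override
    public String toString() {
      return column + " = " + value;
    }
  }

  /** Filters split into those handed to the source and those left to the engine. */
  static final class Split {
    final List<EqualTo> pushed = new ArrayList<>();
    final List<EqualTo> postScan = new ArrayList<>();
  }

  /** Assumption: only these value types can be turned into a source-side literal. */
  static boolean isPrimitiveLiteral(Object value) {
    return value instanceof Boolean
        || value instanceof Number
        || value instanceof CharSequence;
  }

  /** Push a filter only if its literal is primitive; otherwise keep it for post-scan. */
  static Split split(List<EqualTo> filters) {
    Split result = new Split();
    for (EqualTo filter : filters) {
      if (isPrimitiveLiteral(filter.value)) {
        result.pushed.add(filter);
      } else {
        result.postScan.add(filter);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<EqualTo> filters = Arrays.asList(
        new EqualTo("id", 10),
        // A struct value is modeled here as a one-element list.
        new EqualTo("struct_field", Arrays.asList(10)));

    Split split = split(filters);
    System.out.println("pushed:    " + split.pushed);
    System.out.println("post-scan: " + split.postScan);
  }
}
```

With this split, the struct comparison never reaches literal conversion, so it cannot raise the `IllegalArgumentException` shown above. Whether the real fix belongs in filter conversion or in the scan builder is left open here.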


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

