wangyum edited a comment on pull request #30517:
URL: https://github.com/apache/spark/pull/30517#issuecomment-735367655
`sql/hive`, `sql/thriftserver` and `external/avro` should be fine.
`sql/core` has some issues, e.g.:
```
mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.execution.datasources.parquet.ParquetV2SchemaPruningSuite test
```
```
- Spark vectorized reader - with partition data column - select nullable complex field and having is not null predicate *** FAILED ***
Results do not match for query:
Timezone:
sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['employer.company]
+- 'Filter (isnotnull('employer) AND ('p = 1))
   +- 'UnresolvedRelation [contacts], [], false
== Analyzed Logical Plan ==
company: struct<name:string,address:string>
Project [employer#7739.company AS company#7772]
+- Filter (isnotnull(employer#7739) AND (p#7741 = 1))
   +- SubqueryAlias contacts
      +- RelationV2[id#7733, name#7734, address#7735, pets#7736, friends#7737, relatives#7738, employer#7739, relations#7740, p#7741] parquet file:/root/opensource/spark/sql/core/target/tmp/spark-bdb1b34b-cf6a-462d-8caa-fcd923df3fe3/contacts
== Optimized Logical Plan ==
Project [employer#7739.company AS company#7772]
+- Filter isnotnull(employer#7739)
   +- RelationV2[employer#7739, p#7741] parquet file:/root/opensource/spark/sql/core/target/tmp/spark-bdb1b34b-cf6a-462d-8caa-fcd923df3fe3/contacts
== Physical Plan ==
*(1) Project [employer#7739.company AS company#7772]
+- *(1) Filter isnotnull(employer#7739)
   +- BatchScan[employer#7739, p#7741] ParquetScan DataFilters: [isnotnull(employer#7739)], Format: parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/sql/core/target/tmp/spark-bdb1b34b-cf6a-462d-8caa-f..., PartitionFilters: [isnotnull(p#7741), (p#7741 = 1)], PushedFilers: [IsNotNull(p), EqualTo(p,1)], ReadSchema: struct<employer:struct<company:struct<name:string,address:string>>>, PushedFilters: [IsNotNull(p), EqualTo(p,1)]
== Results ==
== Results ==
!== Correct Answer - 2 == == Spark Answer - 0 ==
struct<> struct<>
![[abc,123 Business Street]]
![null] (QueryTest.scala:243)
```