Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22880#discussion_r229212944
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala ---
@@ -93,13 +141,14 @@ private[parquet] class ParquetReadSupport(val convertTz: Option[TimeZone])
    log.debug(s"Preparing for read Parquet file with message type: $fileSchema")
val parquetRequestedSchema = readContext.getRequestedSchema
- logInfo {
- s"""Going to read the following fields from the Parquet file:
- |
- |Parquet form:
+ log.info {
+ s"""Going to read the following fields from the Parquet file with the following schema:
+ |Parquet file schema:
+ |$fileSchema
+ |Parquet read schema:
--- End diff ---
This might significantly increase the amount of log data, since `$fileSchema` can be very large for wide tables. Do we really need to output `fileSchema` here?
---
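One way to address this concern (a hedged sketch, not the PR's actual change — the object and schema strings below are illustrative stand-ins, not Spark code) would be to keep the pruned read schema at INFO, where it is usually small, and demote the full file schema to DEBUG so it only appears when verbose logging is enabled:

```scala
// Hypothetical sketch: split the two schemas across log levels so that
// INFO-level logs stay small even when the file schema is very wide.
object SchemaLoggingSketch {
  // Stand-ins for the schemas seen in ParquetReadSupport; names and
  // contents are illustrative, not taken from a real Parquet file.
  val fileSchema: String =
    "message spark_schema { ... many columns ... }"
  val parquetRequestedSchema: String =
    "message spark_schema { optional int32 id; }"

  // Would be logged at INFO in real code: only the pruned read schema.
  def infoMessage: String =
    s"Going to read the following fields from the Parquet file:\n$parquetRequestedSchema"

  // Would be logged at DEBUG in real code: the full, possibly huge schema.
  def debugMessage: String =
    s"Full Parquet file schema:\n$fileSchema"

  def main(args: Array[String]): Unit = {
    println(infoMessage)
    println(debugMessage)
  }
}
```

With this split, operators running at the default INFO level see only the fields actually read, while the full file schema remains available under DEBUG for troubleshooting schema-mismatch issues.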