GitHub user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19702#discussion_r149847655
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala
---
@@ -30,49 +30,31 @@ import org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._
+
/**
- * This converter class is used to convert Parquet [[MessageType]] to Spark SQL [[StructType]] and
- * vice versa.
+ * This converter class is used to convert Parquet [[MessageType]] to Spark SQL [[StructType]].
*
* Parquet format backwards-compatibility rules are respected when converting Parquet
* [[MessageType]] schemas.
*
* @see https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
- * @constructor
+ *
* @param assumeBinaryIsString Whether unannotated BINARY fields should be assumed to be Spark SQL
- * [[StringType]] fields when converting Parquet a [[MessageType]] to Spark SQL
- * [[StructType]]. This argument only affects Parquet read path.
+ * [[StringType]] fields.
* @param assumeInt96IsTimestamp Whether unannotated INT96 fields should be assumed to be Spark SQL
- * [[TimestampType]] fields when converting Parquet a [[MessageType]] to Spark SQL
- * [[StructType]]. Note that Spark SQL [[TimestampType]] is similar to Hive timestamp, which
- * has optional nanosecond precision, but different from `TIME_MILLS` and `TIMESTAMP_MILLIS`
- * described in Parquet format spec. This argument only affects Parquet read path.
- * @param writeLegacyParquetFormat Whether to use legacy Parquet format compatible with Spark 1.4
- * and prior versions when converting a Catalyst [[StructType]] to a Parquet [[MessageType]].
- * When set to false, use standard format defined in parquet-format spec. This argument only
- * affects Parquet write path.
- * @param writeTimestampInMillis Whether to write timestamp values as INT64 annotated by logical
- * type TIMESTAMP_MILLIS.
- *
+ * [[TimestampType]] fields.
*/
-private[parquet] class ParquetSchemaConverter(
+class ParquetToSparkSchemaConverter(
--- End diff --
Split it into 2 classes to make it clear which configs are for
reading and which are for writing.
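
To illustrate the idea, here is a rough, self-contained sketch of how the
split could separate the two groups of flags. It is not the code in this PR:
the class names ReadSideConverterSketch and WriteSideConverterSketch are
hypothetical, and the type names are plain strings standing in for the real
Parquet and Spark SQL types. Only the four flag names come from the scaladoc
in the diff above.

```scala
// Illustrative stand-ins only; these are NOT the real Spark classes.

// Read path: carries only the flags that affect Parquet -> Spark SQL conversion.
final case class ReadSideConverterSketch(
    assumeBinaryIsString: Boolean,
    assumeInt96IsTimestamp: Boolean) {

  // Maps the name of an unannotated Parquet primitive type to a Spark SQL type name.
  def convertPrimitive(parquetType: String): String = parquetType match {
    case "BINARY" if assumeBinaryIsString  => "StringType"
    case "BINARY"                          => "BinaryType"
    case "INT96" if assumeInt96IsTimestamp => "TimestampType"
    case other =>
      throw new IllegalArgumentException(s"Unsupported in this sketch: $other")
  }
}

// Write path: carries only the flags that affect Spark SQL -> Parquet conversion.
// writeLegacyParquetFormat is kept only to show where it would live; this
// sketch does not model the legacy layout.
final case class WriteSideConverterSketch(
    writeLegacyParquetFormat: Boolean,
    writeTimestampInMillis: Boolean) {

  // Describes the physical type a timestamp column would be written as.
  def convertTimestamp(): String =
    if (writeTimestampInMillis) "INT64 (TIMESTAMP_MILLIS)" else "INT96"
}
```

With this shape, a reader-side caller cannot accidentally pass or depend on a
write-only flag, and vice versa, since each constructor only accepts the
configs relevant to its direction.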
---