Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21415#discussion_r190694499
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala
---
@@ -29,17 +29,20 @@ import org.apache.spark.sql.catalyst.util._
  class CSVOptions(
      @transient val parameters: CaseInsensitiveMap[String],
      defaultTimeZoneId: String,
-     defaultColumnNameOfCorruptRecord: String)
+     defaultColumnNameOfCorruptRecord: String,
+     val columnPruning: Boolean)
    extends Logging with Serializable {
    def this(
      parameters: Map[String, String],
      defaultTimeZoneId: String,
-     defaultColumnNameOfCorruptRecord: String = "") = {
+     defaultColumnNameOfCorruptRecord: String = "",
+     columnPruning: Boolean = false) = {
--- End diff --
The constructor with `columnPruning` disabled is called from the CSV writer
and 30 times from test suites such as `UnivocityParserSuite` and
`CSVInferSchemaSuite`.
> We might lose the pruning opportunity if we call this constructor.

OK, I will enable it by default.
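
For illustration only, here is a minimal sketch of what flipping the default could look like. It is not Spark's code: `SketchOptions` and `LowerKeyMap` are hypothetical stand-ins for `CSVOptions` and `CaseInsensitiveMap`, and the `true` default is an assumption about the follow-up change, not something shown in this diff.

```scala
import java.util.Locale

// Hypothetical stand-in for CaseInsensitiveMap: lookups ignore key case.
final class LowerKeyMap(original: Map[String, String]) {
  private val lowered: Map[String, String] =
    original.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  def get(key: String): Option[String] =
    lowered.get(key.toLowerCase(Locale.ROOT))
}

// Hypothetical, simplified stand-in for CSVOptions.
class SketchOptions(
    val parameters: LowerKeyMap,
    val defaultTimeZoneId: String,
    val defaultColumnNameOfCorruptRecord: String,
    val columnPruning: Boolean) {

  // Auxiliary constructor shaped like the one in the diff, but with
  // columnPruning defaulting to true (assumed), so callers that omit
  // the argument keep the pruning opportunity.
  def this(
      parameters: Map[String, String],
      defaultTimeZoneId: String,
      defaultColumnNameOfCorruptRecord: String = "",
      columnPruning: Boolean = true) = {
    this(
      new LowerKeyMap(parameters),
      defaultTimeZoneId,
      defaultColumnNameOfCorruptRecord,
      columnPruning)
  }
}
```

With this shape, a call site such as `new SketchOptions(Map("header" -> "true"), "UTC")` gets pruning enabled, while the writer or a test can still pass `columnPruning = false` explicitly.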