Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/20727#discussion_r173632775
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextOptions.scala
---
@@ -39,9 +39,12 @@ private[text] class TextOptions(@transient private val parameters: CaseInsensiti
*/
val wholeText = parameters.getOrElse(WHOLETEXT, "false").toBoolean
+ val lineSeparator: String = parameters.getOrElse(LINE_SEPARATOR, "\n")
+ require(lineSeparator.nonEmpty, s"'$LINE_SEPARATOR' cannot be an empty string.")
}
private[text] object TextOptions {
val COMPRESSION = "compression"
val WHOLETEXT = "wholetext"
+ val LINE_SEPARATOR = "lineSep"
--- End diff --
In the example above, calling this a *line* is counterintuitive to me. I think
of a line in a text file as a sequence of one or more characters displayed in
a single horizontal row. I would prefer *recSep* as the short name, or
*recordSeparator* as the long name. I expect that when the option is used, it
will separate text by something other than newline characters such as `'\n'`
or `'\r\n'`.
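To illustrate the naming concern, here is a minimal sketch in plain Scala of splitting text on a custom separator. It is not Spark's implementation: `RecordSplitSketch` and `splitRecords` are hypothetical names introduced only for this example. It shows that with a separator other than a newline, a single piece can itself contain `'\n'`, which is why "record" may describe it better than "line".

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not part of Spark.
object RecordSplitSketch {
  // Splits `text` into records on the literal separator `sep`.
  def splitRecords(text: String, sep: String): Seq[String] = {
    // Mirrors the non-empty check from the diff above.
    require(sep.nonEmpty, "separator cannot be an empty string.")
    // Pattern.quote makes the separator a literal rather than a regex;
    // the -1 limit keeps trailing empty records instead of dropping them.
    text.split(Pattern.quote(sep), -1).toSeq
  }
}
```

For instance, `RecordSplitSketch.splitRecords("x\ny||z", "||")` yields two records, and the first one spans two visual lines.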
---