GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14151#discussion_r155527201
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -313,11 +313,16 @@ def text(self, paths):
Each line in the text file is a new row in the resulting DataFrame.
:param paths: string, or list of strings, for input path(s).
+ :param wholetext: if true, read each file from input path(s) as a single row.
>>> df = spark.read.text('python/test_support/sql/text-test.txt')
>>> df.collect()
[Row(value=u'hello'), Row(value=u'this')]
+ >>> df = spark.read.text('python/test_support/sql/text-test.txt', wholetext=True)
+ >>> df.collect()
+ [Row(value=u'hello\nthis')]
--- End diff --
Hm, can't we just write `\\n` here instead of `\n`?
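
For context, a minimal sketch of why the escaping matters, assuming the docstring is a regular (non-raw) string literal: a bare `\n` in the expected output becomes a real line break before doctest ever sees it, while `\\n` stays as the two characters that `repr()` prints. The `whole_text` function and the `count_failures` helper are hypothetical stand-ins written for Python 3; they are not part of PySpark or of this pull request.

```python
import doctest


def whole_text(lines):
    """Join lines into one string (hypothetical stand-in for wholetext=True).

    >>> whole_text(['hello', 'this'])
    'hello\\nthis'
    """
    return "\n".join(lines)


def count_failures(docstring):
    """Parse and run the doctests found in a docstring; return the failure count."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(docstring, {"whole_text": whole_text}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test).failed


# Escaped form: the docstring holds a backslash followed by 'n', matching repr().
escaped_failures = count_failures(whole_text.__doc__)

# Unescaped form: "\n" turns into a real line break inside the expected output,
# so the parser sees a continuation line whose indentation does not match.
unescaped_doc = """
    >>> whole_text(['hello', 'this'])
    'hello\nthis'
"""
try:
    count_failures(unescaped_doc)
    unescaped_error = None
except ValueError as exc:
    unescaped_error = exc

print(escaped_failures)
print(type(unescaped_error).__name__)
```

In this sketch the unescaped docstring is rejected at parse time rather than failing as an ordinary mismatch, which is one reason the escaped form is the safer choice in a docstring example.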