GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/23080
[SPARK-26108][SQL] Support custom lineSep in CSV datasource

## What changes were proposed in this pull request?

In this PR, I propose a new option for the CSV datasource, `lineSep`, similar to the one in the Text and JSON datasources. The option allows specifying a custom line separator with a maximum length of 2 characters (because of a restriction in the `uniVocity` parser). The new option can be used for both reading and writing CSV files.

## How was this patch tested?

Added a few tests with a custom `lineSep` for enabled/disabled `multiLine` in read, as well as tests in write. I also added roundtrip tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 csv-line-sep

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23080.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23080

----
commit a790bb30e575cf6d4ffaeda307f0405f1bfecf03
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-17T21:44:47Z

    Added a test for default line separator

commit 7a47990af7a9e8782fbde2955c0cf6e4848a3806
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-17T21:56:34Z

    Test for custom lineSep

commit be2870f1006c3f2e783cec0c40bd6e1c7e4c5652
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T09:59:07Z

    Test on read

commit a058a6f2d6771173837ba4b6e829b2067993adb7
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T10:33:12Z

    Support lineSep in write

commit 7e3c0264ae93e270ed8b63c53897a2b775fa65ff
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T10:36:17Z

    Check roundtrip

commit 486b090139ce6d7a93a24edae000fb546b4931db
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T10:42:08Z

    Test another char

commit a0fedbbb06f33716fc632d3b4dd2a687b2587966
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T11:03:20Z

    Don't keep quotes

commit 5f013f505e7a57e4f72f6f1185f1dcdedc0960b5
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T11:13:38Z

    Support 2 chars as lineSep

commit 65786dfabbb5c901e3f8d32f737a6b24a2f58b6b
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T11:14:22Z

    Revert unrelated changes

commit 49b91ea06b757a2feed283de1634c36a59ace8f0
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T11:26:19Z

    Test restrictions for lineSep

commit 12022ad1a0194a4bab9007d66145071562e066a4
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-18T11:39:12Z

    Updating comments and docs

----
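
For readers who want a feel for what the `lineSep` option described in this PR means, here is a minimal sketch of the same idea using only Python's standard `csv` module. It is not Spark code and does not reflect how the patch is implemented. The names `MAX_LINE_SEP_LEN`, `check_line_sep`, `write_csv`, and `read_csv` are hypothetical, the 2-character limit is taken from the PR description, and rejecting an empty separator is an assumption of this sketch. The reader simply splits on the separator, so it roughly corresponds to reading with `multiLine` disabled and does not handle a separator that occurs inside a quoted field.

```python
import csv
import io

# Limit taken from the PR description (restriction in the uniVocity parser).
MAX_LINE_SEP_LEN = 2


def check_line_sep(line_sep):
    """Validate a separator: non-empty (assumed) and at most 2 characters."""
    if not line_sep:
        raise ValueError("lineSep cannot be an empty string")
    if len(line_sep) > MAX_LINE_SEP_LEN:
        raise ValueError("lineSep can contain at most 2 characters")
    return line_sep


def write_csv(rows, line_sep):
    """Serialize rows to CSV text, ending every record with line_sep."""
    check_line_sep(line_sep)
    buf = io.StringIO(newline="")  # no newline translation
    csv.writer(buf, lineterminator=line_sep).writerows(rows)
    return buf.getvalue()


def read_csv(text, line_sep):
    """Split text on line_sep, then parse each piece as one CSV record."""
    check_line_sep(line_sep)
    records = text.split(line_sep)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the trailing separator
    return list(csv.reader(records))
```

With these helpers, `write_csv([["a", "1"], ["b", "2"]], "^")` yields `a,1^b,2^`, and passing that text back through `read_csv(..., "^")` returns the original rows, which is the kind of roundtrip the PR's tests exercise against Spark itself.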