[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-22 Thread WeichenXu123
Github user WeichenXu123 closed the pull request at: https://github.com/apache/spark/pull/13007 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218097561 @WeichenXu123 Currently it is possible to support multiple lines because the lines read from `LineRecordReader` become a `Reader` (a byte stream) by `StringIteratorR

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-10 Thread WeichenXu123
Github user WeichenXu123 commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218094762 @HyukjinKwon Hmm, the current CSV loading code uses Hadoop `LineRecordReader`, which does not allow a row to be split across multiple lines, so I think the code should disable csv multi-l

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218040423 @WeichenXu123 Also, I realised the test in `CSVSuite` is only for data. If we have a test for header, this will fail. I mean some might want a header ```

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218039838 @WeichenXu123 [External CSV data source](https://github.com/databricks/spark-csv) supports this but has an issue with parsing unescaped quotes; see https://issues

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread WeichenXu123
Github user WeichenXu123 commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218037385 @HyukjinKwon I ran the existing tests against this patch and they all pass. If needed, I will add a new test in `CSVSuite`. And I think the only thing that causes the bug is reading

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218018326 In addition, I think a test is needed for this as well in `CSVSuite`. Also, it would be nicer if we don't comment out the original code. I believe the current change breaks t

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-218014048 I think if we really need to solve this problem, we need an option for unescaped quote handling.

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13007#discussion_r62587043 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -63,7 +63,9 @@ class DefaultSource extends F

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13007#discussion_r62586730 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -63,7 +63,9 @@ class DefaultSource extends F

[GitHub] spark pull request: [SPARK-15226][SQL]fix CSV file data-line with ...

2016-05-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13007#issuecomment-217872808 Can one of the admins verify this patch?