[jira] [Comment Edited] (SPARK-33566) Incorrectly Parsing CSV file
[ https://issues.apache.org/jira/browse/SPARK-33566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239256#comment-17239256 ]

Yang Jie edited comment on SPARK-33566 at 11/26/20, 1:12 PM:

I think the bad case happens because Spark uses "STOP_AT_DELIMITER" as the default "UnescapedQuoteHandling" when building the "CsvParser". Configuring "UnescapedQuoteHandling" as "STOP_AT_CLOSING_QUOTE" seems to resolve this issue, but Spark does not currently support configuring this option.

[~hyukjin.kwon] [~moresmores]

was (Author: luciferyang):

I think the bad case happens because Spark uses "STOP_AT_DELIMITER" as the default "UnescapedQuoteHandling" when building the "CsvParser". Configuring "UnescapedQuoteHandling" as "STOP_AT_CLOSING_QUOTE" seems to resolve this issue.

[~hyukjin.kwon] [~moresmores]

> Incorrectly Parsing CSV file
>
>
> Key: SPARK-33566
> URL: https://issues.apache.org/jira/browse/SPARK-33566
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.7
> Reporter: Stephen More
> Priority: Minor
>
> Here is a test case:
> [https://github.com/mores/maven-examples/blob/master/comma/src/test/java/org/test/CommaTest.java]
> It shows how I believe Apache Commons CSV and opencsv correctly parse the sample CSV file.
> Spark is not correctly parsing the sample CSV file.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
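To make the difference between the two settings named in Yang Jie's comment more concrete, here is a toy Python sketch. It is not Spark's or univocity's code: parse_line, its parameters, and the sample line are invented for illustration, and it assumes a doubled quote is the only escape and that a quote followed by a delimiter or end of line closes the field. On an unescaped quote inside a quoted field, the STOP_AT_DELIMITER branch gives up on quoting and reads up to the next delimiter, while the STOP_AT_CLOSING_QUOTE branch keeps the quote character and reads on to the closing quote. The real parser may differ in details.

```python
from enum import Enum


class UnescapedQuoteHandling(Enum):
    # Names mirror the settings mentioned in the comment; the enum itself is a toy.
    STOP_AT_DELIMITER = "STOP_AT_DELIMITER"
    STOP_AT_CLOSING_QUOTE = "STOP_AT_CLOSING_QUOTE"


def parse_line(line, handling, delimiter=",", quote='"'):
    """Split one CSV line into fields (hypothetical helper, single line only)."""
    fields = []
    i, n = 0, len(line)
    while True:
        value = []
        quoted = i < n and line[i] == quote
        if quoted:
            i += 1  # skip the opening quote
        while i < n:
            ch = line[i]
            if quoted and ch == quote:
                nxt = line[i + 1] if i + 1 < n else ""
                if nxt == quote:
                    # Doubled quote: an escaped quote inside the field.
                    value.append(quote)
                    i += 2
                    continue
                if nxt == "" or nxt == delimiter:
                    # Quote followed by delimiter or end of line: closing quote.
                    i += 1
                    break
                # Anything else is an unescaped quote.
                if handling is UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE:
                    # Keep the quote and stay in quoted mode.
                    value.append(quote)
                    i += 1
                    continue
                # STOP_AT_DELIMITER: treat the whole value as unquoted, so the
                # opening quote becomes data and the next delimiter ends the field.
                value.insert(0, quote)
                value.append(quote)
                quoted = False
                i += 1
                continue
            if not quoted and ch == delimiter:
                break
            value.append(ch)
            i += 1
        fields.append("".join(value))
        if i >= n:
            return fields
        i += 1  # skip the delimiter


if __name__ == "__main__":
    sample = 'abc,"size 5" wide, blue",xyz'  # made-up line with an unescaped quote
    print(parse_line(sample, UnescapedQuoteHandling.STOP_AT_DELIMITER))
    # ['abc', '"size 5" wide', ' blue"', 'xyz']
    print(parse_line(sample, UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE))
    # ['abc', 'size 5" wide, blue', 'xyz']
```

With the made-up sample line, the first mode splits the quoted text at the embedded comma and yields four fields, which resembles the shifted columns in the reported Spark output, while the second mode keeps the text together as one field.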
[jira] [Comment Edited] (SPARK-33566) Incorrectly Parsing CSV file
[ https://issues.apache.org/jira/browse/SPARK-33566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239049#comment-17239049 ]

Hyukjin Kwon edited comment on SPARK-33566 at 11/26/20, 3:57 AM:

Here is the output from running mvn clean test:

Running org.test.CommaTest
{code}
2020-11-25 17:55:45,728 INFO [CommaTest:12] OpenCsv
2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2
2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two
2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is hard working. Super smart, though you wouldn't know it at first. 6 months, and we sold this project. Phooey he said to me! What's up with you people. You'll say anything for a sale! Until he met me of coursehaar haar!Internet is spottyWorking while at home so. Will be applied this weekend. On Bill Recovery and 20 yr warranty added.Kindness made this deal happen!
2020-11-25 17:55:45,763 INFO [CommaTest:26] spark
2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2
2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two
2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from Joe Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is hard working. Super smart, though you wouldn't know it at first. 6 months, and we sold this project. Phooey he said to me! What's up with you people. You'll say anything for a sale! Until he met me of coursehaar haar!Internet is spottyWorking while at home so. Will be applied this weekend. On Bill Recovery and 20 yr warranty added.Kindness made this deal happen!
{code}

was (Author: moresmores):

Here is the output from running mvn clean test:

Running org.test.CommaTest
{code}
2020-11-25 17:55:45,728 INFO [CommaTest:12] OpenCsv
2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2
2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two
2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is hard working. Super smart, though you wouldn't know it at first. 6 months, and we sold this project. Phooey he said to me! What's up with you people. You'll say anything for a sale! Until he met me of coursehaar haar!Internet is spottyWorking while at home so. Will be applied this weekend. On Bill Recovery and 20 yr warranty added.Kindness made this deal happen!
2020-11-25 17:55:45,763 INFO [CommaTest:26] spark
2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2
2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two
2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from Joe Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is hard working. Super smart, though you wouldn't know it at first. 6 months, and we sold this project. Phooey he said to me! What's up with you people. You'll say anything for a sale! Until he met me of coursehaar haar!Internet is spottyWorking while at home so. Will be applied this weekend. On Bill Recovery and 20 yr warranty added.Kindness made this deal happen!