[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20975 Thanks for the details and confirmation, @barrenlake. It helped me understand the intention. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user barrenlake commented on the issue: https://github.com/apache/spark/pull/20975 @HyukjinKwon Oh yes, you are right. I ran into this problem when I added a small-file merge function to the Hive module. When I used `select count(1)` to count the rows in the merged result file, I found that the number of lines in wholetext mode was inconsistent with the original, so I submitted this issue. I didn't realize that I was using a private object until I saw your reply. I apologize for taking up your valuable time, and I offer you my heartfelt thanks.
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20975 I am sorry, the unit test uses many private classes, and I think I am not getting the point of this change. Do you mind if I ask for some code snippets? I think `wholetext` is not related to the write path, and I wonder why it matters here.
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user barrenlake commented on the issue: https://github.com/apache/spark/pull/20975 @HyukjinKwon Hello, the unit test describes the scenario in which this problem occurred. You can run the unit test on the original system. Spark 2.3 introduced the `wholetext` option (default `false`): if true, a file is read as a single row and is not split by "\n". So when the text content is read in wholetext mode and written directly to a new file, there is no need to add "\n".
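The scenario described in this comment can be illustrated with a small plain-Python model. This is only a sketch and not Spark code: `read_rows` and `write_rows` are hypothetical helpers, and the model assumes that the text writer appends "\n" after every row and that the reader in `wholetext` mode returns the whole file content, including its trailing "\n", as a single row.

```python
import os
import tempfile


def read_rows(path, wholetext=False):
    """Hypothetical model of a text reader: one row per line, or one row per file."""
    with open(path, newline="") as f:
        content = f.read()
    if wholetext:
        # The whole file becomes a single row, trailing "\n" included.
        return [content]
    lines = content.split("\n")
    # A trailing "\n" terminates the last line; it does not start a new one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def write_rows(path, rows):
    """Hypothetical model of a text writer: every row is followed by "\\n" (assumption)."""
    with open(path, "w", newline="") as f:
        for row in rows:
            f.write(row + "\n")


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    dst = os.path.join(tmp, "dst.txt")
    with open(src, "w", newline="") as f:
        f.write("a\nb\n")

    # Copy the file by reading it as one row and writing that row back out.
    write_rows(dst, read_rows(src, wholetext=True))

    original_count = len(read_rows(src))
    copied_count = len(read_rows(dst))

print(original_count, copied_count)
```

Under these assumptions the copy holds one more line than the source, because the writer adds a "\n" after a row that already ends with one. This is the kind of line-count mismatch the comment describes.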
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20975 @barrenlake, mind if I ask for a reproducer, please? I am unable to reproduce it.
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user barrenlake commented on the issue: https://github.com/apache/spark/pull/20975 Please help review, thanks!
[GitHub] spark issue #20975: [SPARK-23863] [SQL] Wholetext mode should not add line b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20975 Can one of the admins verify this patch?