sandeep-katta opened a new pull request #25843: [SPARK-29101][SQL] [Backport]Fix count API for csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25843
 
 
   ### What changes were proposed in this pull request?
   Dataset (`fruit.csv`):
   fruit,color,price,quantity
   apple,red,1,3
   banana,yellow,2,4
   orange,orange,3,5
   xxx
   
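   This PR fixes the incorrect count shown below. The file has three well-formed data rows and one malformed row (`xxx`), so with `DROPMALFORMED` the count should be 3, but it is 4: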
   ```
   scala> spark.conf.set("spark.sql.csv.parser.columnPruning.enabled", false)
   scala> spark.read.option("header", "true").option("mode", "DROPMALFORMED").csv("fruit.csv").count
   res1: Long = 4
   ```
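
As a rough illustration of the expected result, the sketch below models `DROPMALFORMED` counting in plain Scala, without Spark. It is not Spark's implementation: the object name `DropMalformedSketch`, the helper `countWellFormed`, and the rule "a row is well-formed when it has as many fields as the header" are assumptions made only for this example.

```scala
// Hypothetical stand-alone model of DROPMALFORMED counting; not Spark's code.
object DropMalformedSketch {
  // Contents of fruit.csv from the PR description, one element per line.
  val lines: Seq[String] = Seq(
    "fruit,color,price,quantity",
    "apple,red,1,3",
    "banana,yellow,2,4",
    "orange,orange,3,5",
    "xxx"
  )

  // Assumption: a row is well-formed when its field count matches the header's.
  // The first line is treated as the header and is not counted.
  def countWellFormed(lines: Seq[String]): Long = {
    val expectedFields = lines.head.split(",", -1).length
    lines.tail.count(_.split(",", -1).length == expectedFields).toLong
  }

  def main(args: Array[String]): Unit =
    println(countWellFormed(lines)) // prints 3
}
```

Under that assumption the malformed row `xxx` is excluded and the count is 3, which is the value the `count` call above should return once the fix is applied.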
   
   This regression was introduced by the change for [SPARK-24645](https://issues.apache.org/jira/browse/SPARK-24645).
   The problem addressed by SPARK-24645 is also solved by [SPARK-25387](https://issues.apache.org/jira/browse/SPARK-25387).
   
   ### Why are the changes needed?
   
   The SPARK-24645 change caused this regression, so this PR reverts that code. The original SPARK-24645 problem is also solved by SPARK-25387.
   
   ### Does this PR introduce any user-facing change?
   No.
   
   
   ### How was this patch tested?
   Added a unit test and also re-tested the SPARK-24645 scenario.
   
   **SPARK-24645 regression**
   
![image](https://user-images.githubusercontent.com/35216143/65067957-4c08ff00-d9a5-11e9-8d43-a4a23a61e8b8.png)
   
   

