zhoulii commented on pull request #19152:
URL: https://github.com/apache/flink/pull/19152#issuecomment-1074936494


   > I found that after changing to new source, tpcds runs slower than before. 
This is probably mainly because the new csv source is slower than the legacy 
`CsvTableSource`. It only took 20 min before, and now it takes 30 min. I think 
we need to wait for 
[FLINK-26760](https://issues.apache.org/jira/browse/FLINK-26760) to have a 
conclusion before merging this pr.
   
   I agree. The CSV parsing in [CsvInputFormat.java#L87, used by the legacy csv 
source](https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/io/CsvInputFormat.java#L87)
 and in [CsvReaderFormat.java#L193, used by the new csv 
source](https://github.com/apache/flink/blob/master/flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvReaderFormat.java#L193)
 is quite different. Maybe we can reuse the parse method of `CsvInputFormat` in 
`CsvReaderFormat`.

