[
https://issues.apache.org/jira/browse/FLINK-14569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964064#comment-16964064
]
kk commented on FLINK-14569:
----------------------------
Thank you very much. Using the API
"env.readCsvFile(path).fieldDelimiter(fieldDeli).lineDelimiter(lineDeli).types(...)"
or "new
CsvAppendTableSourceFactory().createTableSource(false,targetTableDescriptor.toProperties())"
and then printing the dataset, the result always loses 1
record (total: 18 records, parallelism: 24), but using the API
"env.readTextFile(path)" gives the correct result!
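For context, losing records only when parallelism > 1 is the classic symptom of mishandled input-split boundaries: each parallel reader parses its own byte range of the file, and a record that straddles two ranges must be claimed by exactly one reader. A minimal, Flink-independent Python sketch of that mechanism (the splitter and the fix below are illustrative assumptions, not Flink's actual FileInputFormat code):

```python
def read_split(data: bytes, start: int, end: int, read_past_end: bool) -> list:
    """Parse newline-delimited records from the byte range [start, end).

    By convention a parallel reader skips a partial record at the head of
    its range (the previous split owns it) and, when read_past_end is True,
    finishes a record that starts before `end` even if its delimiter lies
    beyond `end`.
    """
    pos = start
    if start != 0:
        # Skip the partial record at the head of this split.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1
    records = []
    while pos < end:
        nl = data.find(b"\n", pos)
        if nl == -1:
            break  # no trailing delimiter; nothing complete to emit
        if nl >= end and not read_past_end:
            # Buggy behavior: the record straddling the split boundary is
            # dropped by this split AND skipped by the next one.
            break
        records.append(data[pos:nl])
        pos = nl + 1
    return records


data = b"aaa\nbbb\nccc\n"  # 3 records, 12 bytes, read as 2 "parallel" splits

buggy = read_split(data, 0, 6, False) + read_split(data, 6, 12, False)
fixed = read_split(data, 0, 6, True) + read_split(data, 6, 12, True)
print(buggy)  # [b'aaa', b'ccc'] -- b'bbb' straddles byte 6 and is lost
print(fixed)  # [b'aaa', b'bbb', b'ccc']
```

If something like this is the cause here, a custom lineDelimiter that the boundary logic does not fully account for would explain why readCsvFile loses a record while readTextFile (splitting on standard newlines) returns all of them.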
> use flink to deal with a batch task; when parallelism is greater than 1,
> 1 or 2 records are always lost
> ---------------------------------------------------------------------------------------------
>
> Key: FLINK-14569
> URL: https://issues.apache.org/jira/browse/FLINK-14569
> Project: Flink
> Issue Type: Bug
> Components: API / DataSet
> Affects Versions: 1.7.2
> Environment: hadoop:2.8.5
> flink: 1.7.1
> node machines: 3
> deploy: yarn
> Reporter: kk
> Priority: Critical
>
> When flink reads from an HDFS file with parallelism > 1, 1 or 2 records are
> occasionally lost, regardless of whether the HDFS file is bigger or smaller.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)