[
https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046455#comment-17046455
]
Jiangjie Qin commented on FLINK-14807:
--------------------------------------
Yeah, that really depends on the implementation.
{quote}Imagine we are retrieving the result of a batch job, and when the client
starts to get results back from the sink, the sink operator fails and
restarts. We have to retrieve the whole result again and also need a way to tell
the client that the previous results are invalid.
{quote}
In this case, one possible solution is for the client to buffer the
uncheckpointed records emitted by the sink and hold them back until they are
checkpointed. These records can spill to disk if necessary. Because we don't
need to handle client failure, this keeps the implementation simple.
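A minimal sketch of that client-side buffering, to make the idea concrete. The
class and method names are hypothetical and not part of any Flink API. It
assumes the client is somehow notified when a checkpoint completes and when the
sink restarts. Spilling to disk is not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical client-side buffer (illustration only, not a Flink class).
 * Records stay hidden from the user until a checkpoint covers them.
 */
final class CheckpointedResultBuffer<T> {
    // Records received from the sink but not yet covered by a completed checkpoint.
    private final Deque<T> pending = new ArrayDeque<>();
    // Records covered by a completed checkpoint; safe to expose to the user.
    private final List<T> visible = new ArrayList<>();

    /** Called for every record the sink sends to the client. */
    void onRecord(T record) {
        pending.addLast(record);
    }

    /** Assumed notification: a checkpoint covering all pending records completed. */
    void onCheckpointComplete() {
        visible.addAll(pending);
        pending.clear();
    }

    /** Assumed notification: the sink restarted, so uncheckpointed records will be re-sent. */
    void onSinkRestart() {
        pending.clear();
    }

    /** Returns and removes the records that are safe to hand to the user. */
    List<T> drainVisible() {
        List<T> out = new ArrayList<>(visible);
        visible.clear();
        return out;
    }
}
```

With this shape, a sink restart only discards records the user has never seen,
so the client does not need to be told that earlier results are invalid.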
> Add Table#collect api for fetching data to client
> -------------------------------------------------
>
> Key: FLINK-14807
> URL: https://issues.apache.org/jira/browse/FLINK-14807
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API
> Affects Versions: 1.9.1
> Reporter: Jeff Zhang
> Priority: Major
> Labels: usability
> Fix For: 1.11.0
>
> Attachments: table-collect.png
>
>
> Currently, it is very inconvenient for users to fetch the data of a Flink job
> unless they specify a sink explicitly and then fetch the data from this sink
> via its API (e.g. write to an HDFS sink, then read the data from HDFS).
> However, most of the time users just want to get the data and do whatever
> processing they want. So Flink should provide a Table#collect API for this
> purpose.
>
> Other APIs such as Table#head and Table#print would also be helpful.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)