[ 
https://issues.apache.org/jira/browse/FLINK-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976354#comment-16976354
 ] 

Caizhi Weng commented on FLINK-13943:
-------------------------------------

Hi dear Flink community. My ideas for this improvement are as follows:

Blink planner already contains {{CollectTableSink}} and {{CollectOutputFormat}} 
classes which can serialize data streams into a list using 
{{SerializedListAccumulator}}. These classes greatly simplifies this 
improvement.

As a starting point, we can add a method which is very similar to 
{{DataSet#collect}}: we execute the current table, and fetch the results 
collected by the accumulators, then deserialize it into our desired list.

The problem is that: this solution can only be applied to batch jobs whose 
results are of moderate size, for batch jobs having huge results size or for 
never-ending streaming jobs, as we cannot store the results in memory, this 
solution is not applicable.

> Provide api to convert flink table to java List (e.g. Table#collect)
> --------------------------------------------------------------------
>
>                 Key: FLINK-13943
>                 URL: https://issues.apache.org/jira/browse/FLINK-13943
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Jeff Zhang
>            Assignee: Caizhi Weng
>            Priority: Major
>
> It would be nice to convert flink table to java List so that I can do other 
> data manipulation in client side after execution flink job. For flink 
> planner, I can convert flink table to DataSet and use DataSet#collect, but 
> for blink planner, there's no such api.
> EDIT from FLINK-14807:
> Currently, it is very unconvinient for user to fetch data of flink job unless 
> specify sink expclitly and then fetch data from this sink via its api (e.g. 
> write to hdfs sink, then read data from hdfs). However, most of time user 
> just want to get the data and do whatever processing he want. So it is very 
> necessary for flink to provide api Table#collect for this purpose. 
> Other apis such as Table#head, Table#print is also helpful.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to