[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022193#comment-17022193 ]

Stephan Ewen commented on FLINK-14807:
--------------------------------------

I would like to understand this a bit more. Having two ways of doing things is 
always tricky: more complexity for maintainers, harder for users to 
understand, and so on.

Regarding the savepoint resuming - fair enough, I see that this has subtle 
semantics. Since the savepoint is a parameter on the executor (the 
environment), would it be fair to say that the first execution always uses 
that savepoint? If you want the driver to run independent jobs that all resume 
from the same savepoint, you need different environments. That sounds like 
quite well-defined behavior.
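
To make that concrete, here is a minimal sketch of what that could look like. 
It assumes the savepoint path is passed through the {{execution.savepoint.path}} 
option and that the environment factory accepts a Configuration; both are 
assumptions about how this could be wired up, not a description of the current API.

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SavepointResumeSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical setup: both environments are configured with the same
        // savepoint path, so each environment's first execute() resumes from it.
        Configuration conf = new Configuration();
        conf.setString("execution.savepoint.path", "hdfs:///savepoints/sp-1"); // hypothetical path

        StreamExecutionEnvironment envA = StreamExecutionEnvironment.getExecutionEnvironment(conf);
        StreamExecutionEnvironment envB = StreamExecutionEnvironment.getExecutionEnvironment(conf);

        // Define job A on envA and job B on envB, then:
        // envA.execute("job-a");  // resumes from the savepoint
        // envB.execute("job-b");  // independent job, same savepoint
    }
}
{code}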

For the "driver mode" or the "run main() in cluster", I don't fully understand 
the issues. This should work the same way that the often discussed "library 
mode" works. One thing that has caused frequent confusion about these issues 
here is the assumption that somehow the "execute()" method needs to produce a 
job graph that can be passed to another component that was previously started. 
I think that is not necessary, the "execute()" method would just inside create 
a JobMaster (against a ResourceManager that is created in the environment) and 
block while this JobMaster is executing.
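
Schematically, the idea is the following (the class and method names here are 
placeholders for the internal wiring, not actual Flink signatures):

{code:java}
// Schematic sketch only: illustrates "execute() creates a JobMaster and
// blocks"; the names below are placeholders, not real Flink APIs.
public JobExecutionResult execute(String jobName) throws Exception {
    JobGraph jobGraph = translateToJobGraph(jobName);       // compile the user program
    ResourceManager rm = environment.getResourceManager();  // created with the environment
    JobMaster jobMaster = createJobMaster(jobGraph, rm);    // runs in the same process
    jobMaster.start();
    return jobMaster.awaitCompletion();                     // block until the job finishes
}
{code}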


> Add Table#collect api for fetching data to client
> -------------------------------------------------
>
>                 Key: FLINK-14807
>                 URL: https://issues.apache.org/jira/browse/FLINK-14807
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API
>    Affects Versions: 1.9.1
>            Reporter: Jeff Zhang
>            Priority: Major
>              Labels: usability
>             Fix For: 1.11.0
>
>
> Currently, it is very inconvenient for users to fetch the data of a Flink job 
> unless they explicitly specify a sink and then fetch the data from that sink 
> via its API (e.g. write to an HDFS sink, then read the data back from HDFS). 
> However, most of the time users just want to get the data and do whatever 
> processing they want with it. So it is necessary for Flink to provide a 
> Table#collect API for this purpose; a usage sketch follows below.
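>
> A sketch of what the usage might look like (Table#collect as proposed here 
> does not exist yet; the signature below, returning a closeable iterator of 
> rows, is just one possible shape):
>
> {code:java}
> import org.apache.flink.table.api.Table;
> import org.apache.flink.table.api.TableEnvironment;
> import org.apache.flink.types.Row;
> import org.apache.flink.util.CloseableIterator;
>
> public class CollectSketch {
>     public static void printAll(TableEnvironment tEnv) throws Exception {
>         Table table = tEnv.sqlQuery("SELECT word, cnt FROM word_count");
>         // Hypothetical Table#collect: pulls the result rows back to the client.
>         try (CloseableIterator<Row> it = table.collect()) {
>             while (it.hasNext()) {
>                 System.out.println(it.next());
>             }
>         }
>     }
> }
> {code}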
>  
> Other APIs such as Table#head and Table#print would also be helpful.  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
