[ 
https://issues.apache.org/jira/browse/FLINK-11818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792451#comment-16792451
 ] 

Fabian Hueske commented on FLINK-11818:
---------------------------------------

I can see that such a function is valuable. However, starting external 
processes is performance-sensitive and can also depend on the scheduling and 
availability of software. Hence, I would not make it a first-class API on 
DataSet, but rather add it to DataSetUtils.

When the feature is stable, we can check if the function is popular enough to 
move it to DataSet.
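
To illustrate the map-based workaround mentioned in the description below, here 
is a minimal sketch of the core logic that such a utility might wrap, for 
example inside a mapPartition function so that one process is started per 
partition rather than per record. It uses only the JDK. The class name 
PipeSketch and the method pipe are hypothetical and not part of any Flink API, 
and the sketch assumes that records are strings exchanged one per line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Flink API: pipes records through an external command.
public class PipeSketch {

    /** Writes each record to the command's stdin as one line and returns its stdout lines. */
    public static List<String> pipe(Iterable<String> records, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // forward stderr to the caller
                .start();

        // Write on a separate thread so that a full stdout buffer cannot block the writer forever.
        Thread writer = new Thread(() -> {
            try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                    process.getOutputStream(), StandardCharsets.UTF_8))) {
                for (String record : records) {
                    out.write(record);
                    out.newLine();
                }
            } catch (IOException e) {
                // The process may exit before reading all input; ignored in this sketch.
            }
        });
        writer.start();

        // Collect everything the external program prints, line by line.
        List<String> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                result.add(line);
            }
        }

        writer.join();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Assumes a POSIX `tr` binary is available on the PATH.
        List<String> upper = pipe(Arrays.asList("flink", "pipe"), "tr", "a-z", "A-Z");
        System.out.println(upper);
    }
}
```

The sketch also shows where the concerns above come from: the command must be 
installed on every machine that runs the function, and each call pays the cost 
of starting a process.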

> Provide pipe transformation function for DataSet API
> ----------------------------------------------------
>
>                 Key: FLINK-11818
>                 URL: https://issues.apache.org/jira/browse/FLINK-11818
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataSet
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Major
>
> We have some business requirements that require the data handled by Flink to 
> interact with external programs (such as Python, Perl, or shell scripts). 
> The existing DataSet API has no such function. It can be implemented with 
> the map function, but that is not concise. It would be helpful if we could 
> provide a pipe[1] function like the one in Spark.
> [1]: 
> https://spark.apache.org/docs/latest/rdd-programming-guide.html#transformations



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)