Re: [DISCUSS] ExecIO

2016-12-08 Thread Jean-Baptiste Onofré
Hi guys, I understand your point. The Exec "IO" can already take input commands from a PCollection, but the user has to prepare the commands. I will improve the ExecFn as you said, so that it can construct the shell commands using elements in the PCollection (using one element as the command, the

Re: [DISCUSS] ExecIO

2016-12-08 Thread Ben Chambers
I think I agree with Robert (unless I'm misunderstanding his point). I think that the shell commands are going to be the most useful if it is possible to take the elements in an input PCollection, construct a shell command depending on those elements, and then execute it. I think doing so in a
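A minimal sketch of the idea described above (build a command from each input element, then execute it), written against the plain JDK rather than Beam. The class name `ShellPerElement` and the methods `run` and `processAll` are hypothetical names chosen for illustration; they are not part of ExecIO or any Beam API. The sketch assumes a POSIX-like system where `echo` is available on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not a Beam class: runs one command per input element.
final class ShellPerElement {

  /** Runs the argument vector and returns its output lines; fails on a non-zero exit code. */
  static List<String> run(List<String> argv) {
    try {
      Process process = new ProcessBuilder(argv).redirectErrorStream(true).start();
      List<String> lines = new ArrayList<>();
      try (BufferedReader reader = new BufferedReader(
          new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          lines.add(line);
        }
      }
      int exitCode = process.waitFor();
      if (exitCode != 0) {
        throw new IllegalStateException("command " + argv + " exited with " + exitCode);
      }
      return lines;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for " + argv, e);
    }
  }

  /** Builds a command from each element with toCommand, executes it, and collects the output. */
  static List<String> processAll(List<String> elements,
                                 Function<String, List<String>> toCommand) {
    List<String> output = new ArrayList<>();
    for (String element : elements) {
      output.addAll(run(toCommand.apply(element)));
    }
    return output;
  }

  public static void main(String[] args) {
    // Each element becomes the argument of its own "echo" invocation.
    List<String> output = processAll(List.of("alpha", "beta"), e -> List.of("echo", e));
    output.forEach(System.out::println);
  }
}
```

Passing the element as a separate entry of the argument vector, instead of splicing it into a single shell string, keeps element contents from being interpreted by a shell. In a pipeline, the body of `processAll` would presumably live inside a DoFn; that wiring is left out here.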

Re: [DISCUSS] ExecIO

2016-12-08 Thread Robert Bradshaw
On Wed, Dec 7, 2016 at 1:32 AM, Jean-Baptiste Onofré wrote:
> By the way, just to elaborate a bit on why I provided it as an IO:
>
> 1. From a user experience perspective, I think we have to provide a
> convenient way to write pipelines. Any syntax simplifying this is valuable.
> I

Re: [DISCUSS] ExecIO

2016-12-07 Thread Eugene Kirpichov
(discussion continues on a thread called "Naming and API for executing shell commands")

On Wed, Dec 7, 2016 at 1:32 AM Jean-Baptiste Onofré wrote:
> By the way, just to elaborate a bit on why I provided it as an IO:
>
> 1. From a user experience perspective, I think we have to

Re: [DISCUSS] ExecIO

2016-12-07 Thread Jean-Baptiste Onofré
By the way, just to elaborate a bit on why I provided it as an IO: 1. From a user experience perspective, I think we have to provide a convenient way to write pipelines. Any syntax simplifying this is valuable. I think it's easier to write: pipeline.apply(ExecIO.read().withCommand("foo")) than:

Re: [DISCUSS] ExecIO

2016-12-06 Thread Eugene Kirpichov
Ben - isn't the issue of "things aren't hung, there is a shell command running" general to all DoFns? I.e., I don't see why the runner would need to know that a shell command is running, but not that, say, a heavy monolithic computation is running. What's the benefit to the runner in

Re: [DISCUSS] ExecIO

2016-12-06 Thread Jean-Baptiste Onofré
Hi Eugene, thanks for the extended questions. I think we have two levels of expectations here:
- end-user responsibility
- worker/runner responsibility

1/ From an end-user perspective, the end-user has to know that using a system command (via ExecIO) and more generally speaking anything which

Re: [DISCUSS] ExecIO

2016-12-05 Thread Ben Chambers
The problem with not integrating with Beam at all is that the runner doesn't know about any of these callouts. So it can't report "things aren't hung, there is a shell command running", etc. But the integration doesn't need to be particularly deep. Imagine that you can just pass the

Re: [DISCUSS] ExecIO

2016-12-05 Thread Ben Chambers
One option would be to use the reflective DoFn approach to this. Imagine something like:

    public class MyExternalFn extends DoFn {
      @ProcessElement
      // Existence of ShellExecutor indicates the code shells out.
      public void processElement(ProcessContext c, ShellExecutor shell) {
        ...
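To make the `ShellExecutor` suggestion above more concrete, here is a rough, Beam-free sketch of what such a handle might look like and how a runner-side implementation could keep track of running commands. Everything in it is an assumption made for illustration: the `ShellExecutor` interface shape, the `TrackingShellExecutor` class, and the counters are hypothetical and do not come from the thread or from Beam. It assumes a POSIX-like system with `sh` on the PATH and Java 9 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ShellExecutorSketch {

  /** Hypothetical handle a runner could inject into user code; not a Beam interface. */
  interface ShellExecutor {
    /** Runs the argument vector and returns the process exit code. */
    int execute(List<String> argv);
  }

  /** Hypothetical runner-side implementation that counts commands in flight and finished. */
  static final class TrackingShellExecutor implements ShellExecutor {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger finished = new AtomicInteger();

    @Override
    public int execute(List<String> argv) {
      inFlight.incrementAndGet();
      try {
        Process process = new ProcessBuilder(argv)
            .redirectErrorStream(true)
            .redirectOutput(ProcessBuilder.Redirect.DISCARD) // output is ignored in this sketch
            .start();
        return process.waitFor();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for command", e);
      } finally {
        // Updated whether the command succeeded, failed, or could not be started.
        inFlight.decrementAndGet();
        finished.incrementAndGet();
      }
    }

    int inFlight() {
      return inFlight.get();
    }

    int finished() {
      return finished.get();
    }
  }

  /** Stand-in for a user's processElement: shells out once per element. */
  static int processElement(String element, ShellExecutor shell) {
    // "test -n" exits with 0 for a non-empty argument and 1 for an empty one.
    return shell.execute(List.of("sh", "-c", "test -n \"$1\"", "sh", element));
  }

  public static void main(String[] args) {
    TrackingShellExecutor shell = new TrackingShellExecutor();
    System.out.println("exit code: " + processElement("some-element", shell));
    System.out.println("in flight: " + shell.inFlight() + ", finished: " + shell.finished());
  }
}
```

Because user code only sees the interface, the side that owns the implementation can observe every command that is started, which is one way a runner could distinguish "a shell command is running" from "the worker is hung".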

Re: [DISCUSS] ExecIO

2016-12-05 Thread Eugene Kirpichov
Hi JB, Thanks for bringing this to the mailing list. I also think that this is useful in general (and that use cases for Beam are more than just classic big data), and that there are interesting questions here at different levels about how to do it right. I suggest starting with the highest-level