Hi guys,
I understand your point.
The Exec "IO" can already take input commands from a PCollection, but
the user has to prepare the commands.
I will improve the ExecFn as you said: be able to construct the shell
commands using elements in the PCollection (using one element as
command, the
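To make the idea above concrete, here is a minimal sketch in plain Java of building one command per input element and executing it. It is not the ExecIO or ExecFn API: the class `CommandPerElement` and its methods are hypothetical names, a `List<String>` stands in for the PCollection, and `echo` is only an assumed example command on a POSIX-like system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam or ExecIO: plain Java standing in for
// the body of a DoFn that turns each input element into a shell command.
class CommandPerElement {

  // Builds the argument vector for one element. Keeping the arguments as a
  // list (instead of one concatenated string) avoids shell-quoting problems.
  static List<String> buildCommand(String element) {
    List<String> command = new ArrayList<>();
    command.add("echo"); // assumed example command
    command.add(element);
    return command;
  }

  // Runs one command and returns its combined stdout/stderr, trimmed.
  static String run(List<String> command) {
    try {
      Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
      // Drain the output before waiting so the child cannot block on a full pipe.
      String output =
          new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
      int exitCode = process.waitFor();
      if (exitCode != 0) {
        throw new IllegalStateException(command + " exited with code " + exitCode);
      }
      return output.trim();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while running " + command, e);
    }
  }

  // Stand-in for applying the transform to every element of a collection.
  static List<String> runAll(List<String> elements) {
    List<String> results = new ArrayList<>();
    for (String element : elements) {
      results.add(run(buildCommand(element)));
    }
    return results;
  }
}
```

In a real pipeline the loop in `runAll` would be replaced by the runner invoking the per-element logic; only `buildCommand` would need to change to support other commands.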
I think I agree with Robert (unless I'm misunderstanding his point).
I think that the shell commands are going to be the most useful if it is
possible to take the elements in an input PCollection, construct a shell
command depending on those elements, and then execute it. I think doing so
in a
On Wed, Dec 7, 2016 at 1:32 AM, Jean-Baptiste Onofré wrote:
> By the way, just to elaborate a bit on why I provided it as an IO:
>
> 1. From a user-experience perspective, I think we have to provide a
> convenient way to write pipelines. Any syntax simplifying this is valuable.
> I
(discussion continues on a thread called "Naming and API for executing
shell commands")
On Wed, Dec 7, 2016 at 1:32 AM Jean-Baptiste Onofré wrote:
> By the way, just to elaborate a bit on why I provided it as an IO:
>
> 1. From a user-experience perspective, I think we have to
By the way, just to elaborate a bit on why I provided it as an IO:
1. From a user-experience perspective, I think we have to provide a
convenient way to write pipelines. Any syntax simplifying this is valuable.
I think it's easier to write:
pipeline.apply(ExecIO.read().withCommand("foo"))
than:
Ben - aren't the issues of "things aren't hung, there is a shell command running"
general to all DoFns? I.e., I don't see why the runner would
need to know that a shell command is running, but not that, say, a heavy
monolithic computation is running. What's the benefit to the runner in
Hi Eugene,
thanks for the extended questions.
I think we have two levels of expectations here:
- end-user responsibility
- worker/runner responsibility
1/ From an end-user perspective, the end-user has to know that using a
system command (via ExecIO) and more generally speaking anything which
The problem with not integrating with Beam at all is that the runner doesn't
know about any of these callouts. So it can't report "things aren't hung,
there is a shell command running", etc.
But the integration doesn't need to be particularly deep. Imagine that
you can just pass the
One option would be to use the reflective DoFn approach to this. Imagine
something like:
public class MyExternalFn extends DoFn {
  @ProcessElement
  // Existence of ShellExecutor indicates the code shells out.
  public void processElement(ProcessContext c, ShellExecutor shell) {
    ...
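As a rough illustration of what such an executor could give a runner, the sketch below runs commands through one object that counts in-flight and finished invocations, so a caller can tell "a shell command is running" apart from "hung". It is an assumption-laden sketch in plain Java: `TrackingShellExecutor` and its methods are hypothetical and do not exist in Beam, and the `ShellExecutor` parameter in the snippet above is itself only a proposal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: neither this class nor its methods exist in Beam.
class TrackingShellExecutor {
  private final AtomicInteger running = new AtomicInteger();
  private final AtomicInteger finished = new AtomicInteger();

  // Number of commands currently executing; a runner could poll this.
  int runningCommands() {
    return running.get();
  }

  // Number of commands that have ended, successfully or not.
  int finishedCommands() {
    return finished.get();
  }

  // Runs the command, discarding its output, and returns the exit code.
  // Throws if the command does not finish within timeoutSeconds.
  int execute(List<String> command, long timeoutSeconds) {
    running.incrementAndGet();
    try {
      Process process =
          new ProcessBuilder(command)
              .redirectErrorStream(true)
              // Discard output so the child can never block on a full pipe.
              .redirectOutput(ProcessBuilder.Redirect.DISCARD)
              .start();
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();
        throw new IllegalStateException("Timed out: " + command);
      }
      return process.exitValue();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while running " + command, e);
    } finally {
      // Always update the counters, even when the command fails or times out.
      running.decrementAndGet();
      finished.incrementAndGet();
    }
  }
}
```

The design choice illustrated here is only that all shelling-out goes through a single object the framework hands to user code, which is what would let the framework observe it.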
Hi JB,
Thanks for bringing this to the mailing list. I also think that this is
useful in general (and that use cases for Beam are more than just classic
bigdata), and that there are interesting questions here at different levels
about how to do it right.
I suggest starting with the highest-level