GitHub user zentol commented on the pull request:
https://github.com/apache/incubator-flink/pull/210#issuecomment-63467505
hmm...alright, i can see the point.
Doesn't executing right away carry the risk of being inefficient when
collect() is used multiple times, though? Since it effectively means executing
multiple jobs within the same program (i *think* ...), any part common to those
jobs is computed again for each job. (if I'm wrong here, skip the rest)
example:
```
List l1 = A.map(X).map(Y).collect();
List l2 = A.map(X).map(Z).collect();
<some user code using l1 & l2>
```
this would result in 2 jobs being executed, with map(X) being executed
twice. whereas
```
B = A.map(X);
B.map(Y).collect("c1");
B.map(Z).collect("c2");
JobExecutionResult jre = env.execute();
List l1 = jre.getAccumulatorResult("c1");
List l2 = jre.getAccumulatorResult("c2");
<some user code using l1 & l2>
```
would only be 1 job, with map(X) done only once. it is not as pretty (by a
fair margin i admit), but in line with the current API.
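to make the difference concrete, here is a toy sketch in plain Java. It does not use Flink at all: `SharedPlanSketch`, the functions `x`, `y`, `z` and the call counter are made up for illustration, and `java.util.stream` merely stands in for the DataSet API. It only counts how often the shared step X runs in each variant, and says nothing about how Flink actually plans or schedules jobs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class SharedPlanSketch {
    // Counts how often the shared step X is applied (hypothetical instrumentation).
    static final AtomicInteger xCalls = new AtomicInteger();

    static int x(int v) { xCalls.incrementAndGet(); return v + 1; }
    static int y(int v) { return v * 2; }
    static int z(int v) { return v * 3; }

    // Eager variant: every collect() behaves like its own job,
    // so the common prefix map(X) is recomputed for each result.
    static int eagerXCalls(List<Integer> a) {
        xCalls.set(0);
        List<Integer> l1 = a.stream().map(SharedPlanSketch::x).map(SharedPlanSketch::y)
                .collect(Collectors.toList());
        List<Integer> l2 = a.stream().map(SharedPlanSketch::x).map(SharedPlanSketch::z)
                .collect(Collectors.toList());
        return xCalls.get();
    }

    // Deferred variant: one "job" in which B = A.map(X) is computed once
    // and then feeds both branches.
    static int deferredXCalls(List<Integer> a) {
        xCalls.set(0);
        List<Integer> b = a.stream().map(SharedPlanSketch::x).collect(Collectors.toList());
        List<Integer> l1 = b.stream().map(SharedPlanSketch::y).collect(Collectors.toList());
        List<Integer> l2 = b.stream().map(SharedPlanSketch::z).collect(Collectors.toList());
        return xCalls.get();
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        System.out.println("eager X calls:    " + eagerXCalls(a));
        System.out.println("deferred X calls: " + deferredXCalls(a));
    }
}
```

with a 3-element input the eager variant applies X 6 times and the deferred one 3 times, which is the extra work i am worried about.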