I think we should break the API and remove the unnecessary execute() calls.
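
To make the breakage concrete, here is a rough toy model of the behaviour Stephan describes below. It is only a sketch and does not use Flink: ToyEnv, ToyDataSet and printLazily() are made-up names for illustration, and the exception type and message are assumptions. It shows a lazy sink that needs execute(), an eager print() built on collect(), and a leftover execute() that then fails because nothing is pending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyEnv and ToyDataSet are hypothetical stand-ins, not Flink classes.
class ToyEnv {
    // Sinks that have been declared but not yet run.
    private final List<Runnable> pendingSinks = new ArrayList<>();

    ToyDataSet fromElements(String... items) {
        return new ToyDataSet(this, Arrays.asList(items));
    }

    void addSink(Runnable sink) {
        pendingSinks.add(sink);
    }

    // Runs every pending sink; fails if there is nothing left to run.
    void execute() {
        if (pendingSinks.isEmpty()) {
            throw new IllegalStateException("nothing to execute");
        }
        for (Runnable sink : pendingSinks) {
            sink.run();
        }
        pendingSinks.clear();
    }
}

class ToyDataSet {
    private final ToyEnv env;
    private final List<String> items;

    ToyDataSet(ToyEnv env, List<String> items) {
        this.env = env;
        this.items = items;
    }

    // Current behaviour: only registers a sink, output appears on execute().
    void printLazily() {
        env.addSink(() -> items.forEach(System.out::println));
    }

    // Eager: hands the elements back to the caller right away.
    List<String> collect() {
        return new ArrayList<>(items);
    }

    // Proposed behaviour: built on collect(), so it runs immediately.
    void print() {
        collect().forEach(System.out::println);
    }
}

class PrintDemo {
    public static void main(String[] args) {
        ToyEnv env = new ToyEnv();

        env.fromElements("a", "b").printLazily();
        env.execute();                       // old style: prints only now

        env.fromElements("c", "d").print();  // proposed: prints right away
        try {
            env.execute();                   // leftover call from an old program
        } catch (IllegalStateException e) {
            System.out.println("execute() failed: " + e.getMessage());
        }
    }
}
```

In this model, the only fix for an existing program is to delete the trailing execute(), which is why I would rather break the API cleanly.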

On Tue, Apr 28, 2015 at 10:59 AM, Stephan Ewen <se...@apache.org> wrote:

> I think this is the 3rd discussion about this ;-)
>
> AFAIK, the consensus in previous discussions was to do it exactly like
> collect() and print to the client.
>
> The only open question was how do we deal with the break in the API. Right
> now, the programs contain a "execute()" call after the print(), which would
> then throw an exception because there is nothing to be executed that was
> not already part of the print().
>
>
> On Tue, Apr 28, 2015 at 10:18 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Hi Folks,
>> right now .print() on DataSet creates a DataSink that prints to the
>> local stdout of a TaskManager. This is not very helpful when running
>> in a distributed environment, especially when using something like an
>> interactive Scala Shell in a cluster.
>>
>> I propose to change print() to use collect() internally and therefore
>> eagerly execute without requiring env.execute().
>>
>> What do you think?
>>
>> Aljoscha
>>