I think we should break the API and remove the unnecessary execute() calls.

On Tue, Apr 28, 2015 at 10:59 AM, Stephan Ewen <se...@apache.org> wrote:
> I think this is the 3rd discussion about this ;-)
>
> AFAIK, the consensus in previous discussions was to do it exactly like
> collect() and print to the client.
>
> The only open question was how to deal with the break in the API. Right
> now, programs contain an "execute()" call after the print(), which would
> then throw an exception because there is nothing left to execute that was
> not already part of the print().
>
>
> On Tue, Apr 28, 2015 at 10:18 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Hi Folks,
>> right now .print() on DataSet creates a DataSink that prints to the
>> local stdout of a TaskManager. This is not very helpful when running
>> in a distributed environment, especially when using something like an
>> interactive Scala Shell in a cluster.
>>
>> I propose to change print() to use collect() internally and therefore
>> eagerly execute without requiring env.execute().
>>
>> What do you think?
>>
>> Aljoscha
>>

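To make the API break discussed above concrete, here is a minimal sketch in plain Java. It is a toy model, not Flink code: ToyEnv, ToyDataSet, lazyPrint() and the exception message are hypothetical names invented for illustration. It only assumes what the thread states: today's print() registers a sink that runs on execute(), the proposed print() would go through collect() and run eagerly, and a trailing execute() with nothing left to run would throw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for an execution environment (hypothetical, not a Flink class).
class ToyEnv {
    // Sinks are only registered here; they run when execute() is called.
    private final List<Runnable> pendingSinks = new ArrayList<>();

    ToyDataSet fromElements(String... values) {
        return new ToyDataSet(this, new ArrayList<>(Arrays.asList(values)));
    }

    void addSink(Runnable sink) {
        pendingSinks.add(sink);
    }

    // Assumption taken from the thread: executing with nothing pending is an error.
    void execute() {
        if (pendingSinks.isEmpty()) {
            throw new IllegalStateException("nothing to execute");
        }
        for (Runnable sink : pendingSinks) {
            sink.run();
        }
        pendingSinks.clear();
    }
}

// Toy stand-in for a data set (hypothetical, not a Flink class).
class ToyDataSet {
    private final ToyEnv env;
    private final List<String> values;

    ToyDataSet(ToyEnv env, List<String> values) {
        this.env = env;
        this.values = values;
    }

    // Models the current behavior: output goes to a worker-side buffer,
    // and only once the environment is executed.
    void lazyPrint(List<String> workerStdout) {
        env.addSink(() -> workerStdout.addAll(values));
    }

    // Models collect(): results come back to the caller immediately.
    List<String> collect() {
        return new ArrayList<>(values);
    }

    // Models the proposal: print() built on collect(), printing on the client.
    void print() {
        for (String value : collect()) {
            System.out.println(value);
        }
    }
}

class PrintDemo {
    public static void main(String[] args) {
        ToyEnv env = new ToyEnv();
        ToyDataSet data = env.fromElements("a", "b");

        // Proposed behavior: prints right away, no execute() needed.
        data.print();

        // An existing program would still end with execute(), which now fails.
        try {
            env.execute();
        } catch (IllegalStateException e) {
            System.out.println("execute() failed: " + e.getMessage());
        }
    }
}
```

In this model, a program written against the old behavior keeps its trailing execute() call and hits the exception, which is why removing those calls means breaking the API.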