Hi folks,

right now, print() on a DataSet creates a DataSink that prints to the local stdout of a TaskManager. This is not very helpful when running in a distributed environment, because the output ends up on the worker machines rather than in front of the user. It is especially awkward when using something like an interactive Scala Shell against a cluster.
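
To make the difference concrete, here is a toy sketch in plain Java. It is not Flink code: ToyEnv, ToyDataSet, printViaSink() and printViaCollect() are hypothetical names made up for illustration, and the two string lists merely stand in for a TaskManager's stdout and the client's stdout. It only models the two behaviours: a sink-based print that does nothing until execute() is called and then writes on the worker side, and a collect()-based print that runs immediately and writes on the client side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for an execution environment (hypothetical, not Flink's API).
final class ToyEnv {
    // Stands in for the stdout of a remote TaskManager.
    final List<String> workerStdout = new ArrayList<>();
    // Stands in for the stdout of the client program or shell.
    final List<String> clientStdout = new ArrayList<>();
    // Sinks registered on the plan; nothing runs until execute() is called.
    private final List<Runnable> pendingSinks = new ArrayList<>();

    void addSink(Runnable sink) {
        pendingSinks.add(sink);
    }

    // Runs all registered sinks, mimicking a deferred job submission.
    void execute() {
        for (Runnable sink : pendingSinks) {
            sink.run();
        }
        pendingSinks.clear();
    }
}

// Toy stand-in for a data set (hypothetical, not Flink's API).
final class ToyDataSet<T> {
    private final ToyEnv env;
    private final List<T> elements;

    ToyDataSet(ToyEnv env, List<T> elements) {
        this.env = env;
        this.elements = new ArrayList<>(elements);
    }

    // Sink-based print: lazy, and the output lands on the worker side.
    void printViaSink() {
        env.addSink(() -> {
            for (T element : elements) {
                env.workerStdout.add(String.valueOf(element));
            }
        });
    }

    // Eagerly "runs the job" and hands the results back to the client.
    List<T> collect() {
        return new ArrayList<>(elements);
    }

    // collect()-based print: eager, and the output lands on the client side.
    void printViaCollect() {
        for (T element : collect()) {
            env.clientStdout.add(String.valueOf(element));
        }
    }
}

final class PrintSketchDemo {
    public static void main(String[] args) {
        ToyEnv env = new ToyEnv();
        ToyDataSet<String> words = new ToyDataSet<>(env, Arrays.asList("a", "b"));

        words.printViaSink();
        System.out.println("client stdout before execute(): " + env.clientStdout);
        env.execute();
        System.out.println("worker stdout after execute(): " + env.workerStdout);

        words.printViaCollect();
        System.out.println("client stdout after printViaCollect(): " + env.clientStdout);
    }
}
```

In this model the sink variant never writes anything to the client, even after execute(), which mirrors the problem in a shell session. The collect() variant trades that for pulling the whole result to the client, so it only suits results that are small enough to fit there.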
I propose changing print() to use collect() internally, so that it executes eagerly, without requiring a call to env.execute(), and prints the results on the client.

What do you think?

Aljoscha