Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread lucas.g...@gmail.com
As Keith said, it depends on what you want to do with your data. From a pipelining perspective the general flow (YMMV) is: Load dataset(s) -> Transform and/or Join -> Aggregate -> Write dataset. Each step in the pipeline does something distinct with the data. The end step is usually

Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread Keith Chapman
As Paul said, it really depends on what you want to do with your data. Perhaps writing it to a file would be a better option, but again it depends on what you want to do with the data you collect. Regards, Keith. http://keith-chapman.com On Tue, Apr 4, 2017 at 7:38 AM, Eike von Seggern

Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread Eike von Seggern
Hi, depending on what you're trying to achieve, `RDD.toLocalIterator()` might help you. Best, Eike 2017-03-29 21:00 GMT+02:00 szep.laszlo.it : > Hi, > > after I created a dataset > > Dataset df = sqlContext.sql("query"); > > I need to have a result values and I call a
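To illustrate why `toLocalIterator()` can be gentler on driver memory than `collectAsList()`, here is a minimal sketch in plain Java. It is only a model and uses no Spark classes: the class `LocalIteratorSketch`, the `List<List<T>>` stand-in for partitions, and the `sum` helper are all hypothetical names made up for this example. The idea it models is that the iterator holds one partition at a time, while a collect-style call materializes every row at once. In Spark 2.x the corresponding call on a Dataset would be `df.toLocalIterator()`, which returns a `java.util.Iterator`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stdlib-only model of partition-at-a-time iteration; no Spark classes are used.
class LocalIteratorSketch {

    // Hypothetical stand-in for a partitioned dataset: each inner list is one partition.
    // Only the partition currently being read is referenced by the iterator.
    static <T> Iterator<T> toLocalIterator(final List<List<T>> partitions) {
        return new Iterator<T>() {
            private int nextPartition = 0;      // index of the next partition to open
            private Iterator<T> current = null; // rows of the partition currently held

            @Override
            public boolean hasNext() {
                // Open the next partition only once the current one is exhausted,
                // skipping over empty partitions.
                while ((current == null || !current.hasNext())
                        && nextPartition < partitions.size()) {
                    current = partitions.get(nextPartition++).iterator();
                }
                return current != null && current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Consumes rows one at a time and never builds a full list of results.
    static long sum(Iterator<Integer> rows) {
        long total = 0;
        while (rows.hasNext()) {
            total += rows.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),
                Collections.<Integer>emptyList(),
                Arrays.asList(3, 4, 5));
        System.out.println(sum(toLocalIterator(partitions))); // prints 15
    }
}
```

Whether this helps in practice depends on what is done with each row: if every row ends up in a local list anyway, the result is the same memory footprint as `collectAsList()`.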

Re: Alternatives for dataframe collectAsList()

2017-04-03 Thread Paul Tremblay
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Alternatives-for-dataframe-collectAsList-tp28547.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > -

Alternatives for dataframe collectAsList()

2017-03-29 Thread szep.laszlo.it
Regards, Laszlo Szep -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Alternatives-for-dataframe-collectAsList-tp28547.html Sent from the Apache Spark User List mailing list arc