I don't think there is a magic number, so I would say that it will depend on how big your dataset is and how much memory your driver has. Since .collect() pulls the entire RDD back to the driver, the practical ceiling is the driver's heap plus spark.driver.maxResultSize (which defaults to 1g), rather than anything on the workers.
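As a rough sketch (the flag names are standard Spark configuration; the sizes are just placeholders, and my_app.py is a stand-in for your application), you can give the driver more headroom before calling .collect():

```shell
# Raise the driver heap and the cap on serialized results returned
# to the driver by actions like collect().
# spark.driver.maxResultSize defaults to 1g; setting it to 0 removes
# the limit entirely (risky -- the driver can then OOM instead).
spark-submit \
  --driver-memory 8g \
  --conf spark.driver.maxResultSize=4g \
  my_app.py
```

If you only need to iterate over the rows on the driver rather than hold them all at once, rdd.toLocalIterator() fetches one partition at a time, which keeps the driver's peak memory down to roughly the largest partition.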
Thank You,
Irving Duran

On Sat, Apr 28, 2018 at 10:41 AM klrmowse <klrmo...@gmail.com> wrote:

> i am currently trying to find a workaround for the Spark application i am
> working on so that it does not have to use .collect()
>
> but, for now, it is going to have to use .collect()
>
> what is the size limit (memory for the driver) of RDD file that .collect()
> can work with?
>
> i've been scouring google-search - S.O., blogs, etc, and everyone is
> cautioning about .collect(), but does not specify how huge is huge... are we
> talking about a few gigabytes? terabytes?? petabytes???
>
> thank you
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/