Dear Davies,
Thanks so much for your instructions! It worked like a charm.
Best,
Diederik
On Wed, Jul 30, 2014 at 1:27 AM, Davies Liu-2 [via Apache Spark User List] <ml-node+s1001560n10917...@n3.nabble.com> wrote:
> Hey Diederik,
>
> The data in rdd._jrdd.rdd() is serializ
What is also weird is that when
I set p to 8, I should get a more accurate number, but it's actually
smaller. Any tips or pointers are much appreciated!
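
For anyone reading along, here is a rough sketch of how the precision parameter p relates to the expected error, assuming the estimator behaves like a standard HyperLogLog sketch with 2**p registers and a standard error of roughly 1.04 / sqrt(2**p). The helper name hll_relative_error is made up for illustration and is not part of Spark; the exact constant Spark uses may differ slightly.

```python
import math


def hll_relative_error(p: int) -> float:
    """Approximate standard error of a HyperLogLog sketch with 2**p registers.

    Assumption: the textbook HyperLogLog estimate 1.04 / sqrt(m), m = 2**p.
    This is a hypothetical helper, not a Spark or pyspark function.
    """
    return 1.04 / math.sqrt(2 ** p)


# Print the expected error for a few precision values.
for p in (4, 8, 12, 16):
    print(f"p={p:2d}  registers={2 ** p:6d}  "
          f"approx. relative error={hll_relative_error(p):.2%}")
```

Under that assumption p = 8 still leaves an error of several percent, so a count that comes out somewhat too small at p = 8 would not be surprising by itself.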
Best,
Diederik
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Using-countApproxDistinct-in-pyspark-tp1087