Re: countApprox

2016-09-18 Thread Stefano Lodi
10:04 To: Stefano Lodi Cc: user@spark.apache.org Subject: Re: countApprox
countApprox gives the best answer within some timeout. Is it possible that 1 ms is more than enough to count this exactly? Then the confidence wouldn't matter. Although that seems way too fast, you're counting ranges whose values

countApprox

2016-09-15 Thread Stefano Lodi
I am experimenting with countApprox. I created an RDD of 10^8 numbers and ran countApprox with different parameters, but I could not get any approximate output. In every run it returns the exact number of elements. What is the effect of approximation in countApprox supposed to be, and for