Hi, could someone point me to the recommended way of using countApproxDistinctByKey with DataFrames?
I know I can map to a pair RDD, but I'm wondering whether there is a simpler method. If anyone knows whether this operation is expressible in SQL, that information would be much appreciated as well.