Hi,

I couldn't find any documentation on how to test whether an RDD is empty.
I'm doing a reduce operation, but it throws an
UnsupportedOperationException if the RDD is empty.  I'd like to check
whether the RDD is empty before calling reduce.

On the Google Groups list Reynold Xin suggested using rdd.first, but it
throws the same exception when the RDD is empty.  rdd.count(), on the
other hand, might do a lot of unnecessary processing just to check whether
the RDD is empty.

Is there some way to do rdd.isEmpty?
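
One possible workaround, as a rough sketch only: take(1) returns an empty
collection instead of throwing when the RDD has no elements, and it stops
once it has found one element, so it should usually be cheaper than
count().  The sketch is in Python and assumes only that the object passed
in has take and reduce methods like an RDD does.  The helper names is_empty
and reduce_or_default are made up for illustration and are not part of
Spark.

```python
def is_empty(rdd):
    """Return True if the RDD-like object has no elements.

    Assumption: rdd.take(1) returns a list holding at most one element
    and returns an empty list (rather than raising) for an empty RDD.
    """
    return len(rdd.take(1)) == 0


def reduce_or_default(rdd, func, default=None):
    """Reduce with func, or return default when the RDD is empty.

    Hypothetical helper: it guards the reduce call with the emptiness
    check above instead of catching the exception afterwards.
    """
    if is_empty(rdd):
        return default
    return rdd.reduce(func)
```

With this, reduce_or_default(rdd, lambda a, b: a + b, 0) would give 0 for
an empty RDD instead of raising.  Note that take(1) still has to evaluate
enough of the RDD to produce one element, so it is not free if the
upstream transformations are expensive.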


Catching the UnsupportedOperationException works, but it seems like bad
practice, as it could be an indication of some other error as well.  (I'd
suggest changing it to EmptyRddException or similar, which could be a
subclass of UnsupportedOperationException.  That would make the cause
explicit.)


Thanks.

*    Sampo Niskanen*
    *Lead developer / Wellmo*
    [email protected]
