Hi

We have an RDD<UserId> that needs to be mapped to information stored in
HBase, where the row key is exactly the user id.

What are the different alternatives for doing this?

- Is it possible to do HBase.get() requests from a map function in Spark?
- Or should we join the RDD with the result of a full HBase table scan?

I ask because a full table scan feels inefficient, especially if the
input RDD<UserId> is really small compared to the full table. But I
realize that a full table scan may not be what actually happens?
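
To make the first option concrete, here is a rough sketch of what I have in
mind: look up the ids of each partition in batches instead of issuing one
request per record. It is plain Python with no Spark or HBase dependency.
`lookup_partition`, `fetch_batch`, `fake_fetch` and `FAKE_TABLE` are names I
made up for illustration, and the in-memory dict only stands in for a real
batched get against HBase. I am assuming a client connection can be opened
once per partition on the executors.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-in for a batched HBase lookup by row key.
# A real version would issue a multi-get through an HBase client.
FetchBatch = Callable[[List[str]], Dict[str, dict]]


def lookup_partition(
    user_ids: Iterable[str],
    fetch_batch: FetchBatch,
    batch_size: int = 100,
) -> Iterator[Tuple[str, Optional[dict]]]:
    """Yield (user_id, row) pairs for one partition, fetching rows in batches.

    Ids without a matching row are yielded with None.
    """
    it = iter(user_ids)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        rows = fetch_batch(batch)  # one round trip per batch
        for user_id in batch:
            yield user_id, rows.get(user_id)


# In-memory table standing in for HBase, only to show the call pattern.
FAKE_TABLE = {"u1": {"name": "Ann"}, "u2": {"name": "Bo"}}


def fake_fetch(keys: List[str]) -> Dict[str, dict]:
    # Return only the rows that exist, like a lookup that skips missing keys.
    return {k: FAKE_TABLE[k] for k in keys if k in FAKE_TABLE}


result = list(lookup_partition(["u1", "u3", "u2"], fake_fetch, batch_size=2))
```

In Spark I would expect to pass this to `mapPartitions`, something like
`rdd.mapPartitions(lambda ids: lookup_partition(ids, fetch))`, where `fetch`
would be the real HBase-backed function. Whether that is a sensible pattern
is exactly what I am unsure about.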

Cheers,
-Kristoffer
