Spark and HBase RDD join/get

Kristoffer Sjögren Thu, 14 Jan 2016 05:04:24 -0800

Hi

We have an RDD<UserId> that needs to be enriched with information from
HBase, where the row key is exactly the user id.


What are the different alternatives for doing this?

- Is it possible to issue HBase.get() requests from a map function in Spark?
- Or should we join the RDD against a full HBase table scan?

I ask because a full table scan feels inefficient, especially if the
input RDD<UserId> is really small compared to the full table. Then
again, I realize that a full table scan may not be what actually
happens in a join?
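
A minimal sketch of the first alternative (point lookups from inside a
map function), written in plain Python so it runs without a cluster.
Everything here is an assumption made for illustration: FakeTableClient
and its multi_get() are hypothetical stand-ins for a real HBase client
and a batched get, FAKE_TABLE stands in for the HBase table, and the
nested lists stand in for the partitions of the RDD<UserId>. The point
is only the shape: one client per partition, keys fetched in batches,
and no scan over rows that were not asked for.

```python
from itertools import islice

# Hypothetical stand-in for an HBase table: row key -> column values.
FAKE_TABLE = {
    "u1": {"name": "Ada"},
    "u2": {"name": "Linus"},
    "u3": {"name": "Grace"},
}


class FakeTableClient:
    """Hypothetical client; a real job would open an HBase connection here."""

    def __init__(self, table):
        self._table = table

    def multi_get(self, keys):
        # One round trip for many keys; a missing key yields None.
        return [self._table.get(key) for key in keys]

    def close(self):
        # Nothing to release in the stand-in.
        pass


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def lookup_partition(user_ids, client_factory, batch_size=2):
    """Per-partition function: create one client, then fetch keys in batches."""
    client = client_factory()
    try:
        for chunk in batched(user_ids, batch_size):
            for user_id, row in zip(chunk, client.multi_get(chunk)):
                yield user_id, row
    finally:
        client.close()


# Simulate an RDD<UserId> with two partitions; "u9" has no row in the table.
partitions = [["u1", "u3"], ["u2", "u9"]]
result = [
    pair
    for partition in partitions
    for pair in lookup_partition(partition, lambda: FakeTableClient(FAKE_TABLE))
]
```

In a real job the same per-partition function would be handed to the
RDD's mapPartitions, with the client created inside the function so
that nothing non-serializable is shipped from the driver. Whether this
beats a scan-based join presumably depends on how small the input RDD
is relative to the table.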

Cheers,
-Kristoffer

