Hi, I am doing lookups on cached RDDs of type RDD[(Int, String)], and I noticed that each lookup is relatively slow, around 30-100 ms. I even tried this on one machine with a single partition, but it made no difference!
The RDDs are not large at all, 3-30 MB. Is this expected behaviour? Should I use another data structure instead, such as a HashMap, to hold the data and look it up there, and use a Broadcast to send a copy to all machines?

best,
/Shahab
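
P.S. To make the question concrete, here is a minimal sketch of the two approaches I am comparing. The object name, helper names, local master, partition count, and synthetic data are all placeholders for illustration, not my real job. As far as I understand, lookup() on a pair RDD that has a partitioner only scans the partition owning the key, but every call still launches a Spark job, whereas a broadcast map is a plain in-memory lookup.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
// On Spark versions before 1.3 the pair-RDD methods also need:
// import org.apache.spark.SparkContext._

object LookupSketch {

  // Approach 1: hash-partition and cache the pair RDD, so that
  // lookup(key) only has to scan the partition that owns the key.
  def partitionedCache(pairs: RDD[(Int, String)], numPartitions: Int): RDD[(Int, String)] =
    pairs.partitionBy(new HashPartitioner(numPartitions)).cache()

  // Approach 2: collect the (small) data set to the driver as a map
  // and broadcast one read-only copy to every executor.
  def broadcastTable(sc: SparkContext,
                     pairs: RDD[(Int, String)]): Broadcast[scala.collection.Map[Int, String]] =
    sc.broadcast(pairs.collectAsMap())

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val conf = new SparkConf().setMaster("local[*]").setAppName("lookup-sketch")
    val sc = new SparkContext(conf)
    try {
      // Synthetic stand-in for the real 3-30 MB data set.
      val pairs = sc.parallelize((1 to 100000).map(i => (i, s"value-$i")))

      val cached = partitionedCache(pairs, numPartitions = 4)
      cached.count() // force the cache to be materialized before timing

      val t0 = System.nanoTime()
      val viaRdd: Seq[String] = cached.lookup(42) // runs a Spark job
      val rddMillis = (System.nanoTime() - t0) / 1e6

      val table = broadcastTable(sc, pairs)
      val t1 = System.nanoTime()
      val viaMap: Option[String] = table.value.get(42) // plain map access
      val mapMillis = (System.nanoTime() - t1) / 1e6

      println(s"RDD lookup: $viaRdd in $rddMillis ms")
      println(s"Map lookup: $viaMap in $mapMillis ms")

      // The broadcast table can also be used inside tasks.
      val resolved = sc.parallelize(Seq(1, 42, 999999))
        .map(k => k -> table.value.get(k))
        .collect()
      println(resolved.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

The broadcast variant only makes sense while the whole data set fits comfortably in memory on the driver and on each executor, which should be the case at 3-30 MB.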
