I'm not sure I understand what you are describing. An HBase table is already distributed and cached across the region servers, unless it consists of only one or a few regions that are all served by a single region server. For such a small table, though, I don't think there is a need to distribute a cached copy of the data. The cost of copying the table data to all nodes might outweigh the processing time it saves, don't you think? Why do you need this to process small tables? Can you describe your scenario?

Regards,
Chongxin Li

>From: john smith <[email protected]>
>Reply-To: [email protected]
>To: [email protected]
>Subject: Re:Distributed Cache for HBase table
>Date: Tue, 22 Jun 2010 07:36:05 +0530
>
>Hey all,
>
>I am not sure whether this is feasible or not.
>How about having a "Distributed Cache" for an HBase table, where we can just
>specify the table name (or a subset of its columns) and it gets copied to
>all the nodes in the cluster during the MapReduce job? Does anything similar
>to this already exist in HBase? This feature can be useful if we want to
>do any processing on very small tables. I tried searching for this but
>didn't find anything relevant.
>
>Any ideas/comments?
>
>Thanks
>jS
