Hi Ted,

I did not at first. I don't know why I didn't realize I could do that, but then I understood that I can. Thanks for the help though.
Cheers,
Arun

On Jul 3, 2014 10:28 AM, "Ted Yu" <[email protected]> wrote:

> Did you read the summary object through the HTable API in Job #2?
>
> Cheers
>
> On Thu, Jul 3, 2014 at 9:14 AM, Arun Allamsetty <[email protected]> wrote:
>
> > Hi,
> >
> > I am trying to write a chained MapReduce job on data present in HBase
> > tables and need some help with the concept. I am not expecting people to
> > provide code, but pseudo code based on HBase's Java API would be nice.
> >
> > In a nutshell, what I am trying to do is:
> >
> > MapReduce Job 1: Read data from two tables with no common row keys and
> > create a summary out of them in the reducer. The output of the reducer is
> > a Java object containing the summary, serialized to bytes. I store this
> > object in a temporary table in HBase.
> >
> > MapReduce Job 2: This is where I am having problems. I now need to read
> > this summary object such that it is available in each mapper, so that
> > when I read data from a third (different) table, I can use the summary
> > object to perform more calculations on the data I am reading from the
> > third table.
> >
> > I read about the distributed cache and tried to implement it, but that
> > doesn't seem to work out. I can provide more details in the form of
> > edits if the need arises, because I don't want to spam this question,
> > right now, with details which might be irrelevant.
> >
> > Thanks,
> > Arun
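For anyone finding this thread later, the serialize-in-Job-1 / deserialize-in-Job-2 step Ted's suggestion relies on can be sketched roughly as below. The `Summary` class and its fields are hypothetical placeholders, and plain Java serialization is just one option (a `Writable` or Avro/Protobuf encoding would work the same way):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical summary object produced by Job 1's reducer;
// the field names here are placeholders, not from the original post.
class Summary implements Serializable {
    private static final long serialVersionUID = 1L;
    long recordCount;
    double total;
}

public class SummaryCodec {
    // Serialize the summary to a byte[] so Job 1's reducer can store it
    // as a single cell value in the temporary HBase table.
    static byte[] toBytes(Summary s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        }
        return bos.toByteArray();
    }

    // Deserialize the bytes read back from HBase in Job 2's mapper.
    static Summary fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Summary) ois.readObject();
        }
    }
}
```

In Job 2, the natural place to do the read is the mapper's `setup()` method, which runs once per task: open the temporary table, issue a single `Get` for the known row key, pass the cell's value to `fromBytes`, and keep the resulting `Summary` in an instance field for use in `map()`. That avoids the distributed-cache route entirely, at the cost of one HBase read per map task.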
