What distinguishes this record, and will every mapper know how to recognize it? It sounds like all you need to do is have each mapper ignore non-matching records and run your other code only on the record that matches. I am assuming that, because the record is unique, that code runs only once across all mappers.
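To make the "ignore non-matching records" idea concrete, here is a minimal sketch of the per-record check in plain Java, written outside Hadoop so it runs on its own. The names `RecordSearch`, `matchLine` and `findRecord` are hypothetical, and I am assuming the reduce output is text with one `key<TAB>value` pair per line (the default layout of Hadoop's TextOutputFormat); adjust the parsing if your file looks different.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of any Hadoop API.
class RecordSearch {

    // The per-record check that would sit inside a mapper's map() method:
    // return the parsed pair when the key matches, otherwise nothing.
    // Assumes each line is "key<TAB>value" with a numeric value.
    static Optional<Map.Entry<String, Double>> matchLine(String line, String toSearch) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return Optional.empty();            // malformed line: ignore it
        }
        String key = line.substring(0, tab);
        if (!key.equals(toSearch)) {
            return Optional.empty();            // non-matching record: ignore it
        }
        try {
            double value = Double.parseDouble(line.substring(tab + 1).trim());
            Map.Entry<String, Double> entry = new SimpleEntry<>(key, value);
            return Optional.of(entry);
        } catch (NumberFormatException e) {
            return Optional.empty();            // value is not a number: ignore it
        }
    }

    // Stand-in for the whole map-only pass: scan every line, keep the first match.
    static Optional<Map.Entry<String, Double>> findRecord(Stream<String> lines, String toSearch) {
        return lines.map(line -> matchLine(line, toSearch))
                    .filter(Optional::isPresent)
                    .map(Optional::get)
                    .findFirst();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alpha\t1.5", "beta\t2.25", "gamma\t3.0");
        // prints: beta -> 2.25
        findRecord(lines.stream(), "beta")
            .ifPresent(e -> System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```

In an actual job, the body of `matchLine` would go into the mapper, the search string would be passed in through the job configuration, and the number of reduce tasks would be set to zero (`setNumReduceTasks(0)`) so that only the matching record is written out. The driver would then have to read that one-line output back to use the value, since a mapper cannot return it to the caller directly.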
On Wed, Sep 22, 2010 at 2:06 PM, Shi Yu <[email protected]> wrote:

> Dear Hadoopers,
>
> I am stuck at a probably very simple problem but can't figure it out. In
> the Hadoop Map/Reduce framework, I want to search a huge file (which is
> generated by another Reduce task) for a unique line of record (a <String,
> double> value actually). That record is expected to be passed to another
> function. I have read the previous post about using Mapper only output to
> HBase (
> http://www.mail-archive.com/[email protected]/msg06579.html)
> and another post (
> http://www.mail-archive.com/[email protected]/msg07337.html).
> They are both very interesting, however, I am still confused about how to
> circle away from writing to HBase, but to use the returned record directly
> from memory? I guess my problem doesn't need a reducer, so basically
> load-balance the search task via multiple Mappers. I want to have something
> like this
>
> class myClass
>     method seekResultbyMapper (string toSearch, path reduceFile)
>         call Map(a,b)
>         do some simple calculation
>         return <String, double> result
>
> class anotherClass
>     <String, double> para = myClass.seekResultbyMapper (c,d)
>
>
> I don't know whether this is doable (maybe it is not a valid style in
> Map/Reduce framework)? How to implement it using JAVA API? Thanks for any
> suggestion in advance.
>
>
> Best Regards,
>
> Shi
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
>
>

--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA
