Hadoop knows about files and blocks, so you can achieve data locality if you are accessing files directly.
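To make that concrete, here is a rough, dependency-free sketch of what locality-aware scheduling amounts to. It is only a model under my own assumptions, not Hadoop code: the names `LocalitySketch`, `Split`, and `pickSplit`, and the shard and node names, are all made up for illustration. In Hadoop itself, a split advertises its preferred hosts through `InputSplit.getLocations()`, and the framework tries to run the task on one of those hosts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: these are hypothetical names, not Hadoop classes.
class LocalitySketch {

    /** A unit of work plus the hosts that hold its data locally. */
    static final class Split {
        final String name;
        final List<String> hosts;

        Split(String name, String... hosts) {
            this.name = name;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /**
     * Picks work for a node that has a free slot. A split whose data lives on
     * that node is preferred; otherwise the first pending split is handed out,
     * which means a remote read. Returns null when nothing is pending.
     */
    static Split pickSplit(String node, List<Split> pending) {
        for (int i = 0; i < pending.size(); i++) {
            if (pending.get(i).hosts.contains(node)) {
                return pending.remove(i); // data-local assignment
            }
        }
        return pending.isEmpty() ? null : pending.remove(0); // remote fallback
    }

    public static void main(String[] args) {
        List<Split> pending = new ArrayList<>(Arrays.asList(
                new Split("shard-a", "node1", "node2"),
                new Split("shard-b", "node3")));

        System.out.println(pickSplit("node3", pending).name); // local match
        System.out.println(pickSplit("node9", pending).name); // no local data
    }
}
```

The point of the sketch is that the scheduler can only be as locality-aware as the host list each split reports. For plain HDFS files that list comes from the block locations; for an external store, something has to supply it.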
I think that in your case you'll have to develop your own logic that can take advantage of it.

On Jan 16, 2013, at 6:56 AM, John Lilley <[email protected]> wrote:

> Um, I think you and I are talking about the same thing, but maybe not?
>
> Certainly HBase/MongoDB are HDFS-aware, so I would expect that if I am a
> client program running outside of the Hadoop cluster and I do a query, the
> database tools will construct query processing such that data is read and
> processed in an optimal fashion (using MapReduce?), before the aggregated
> information is shipped to me on the client side.
>
> The question I was asking is a little different although hopefully the answer
> is just as simple. Can I write mapper/reducer that queries HBase/MongoDB and
> have MR schedule my mappers such that each mapper is receiving tuples that
> have been read in a locality-aware fashion?
>
> john
>
> From: Mohammad Tariq [mailto:[email protected]]
> Sent: Wednesday, January 16, 2013 7:47 AM
> To: [email protected]
> Subject: Re: Query mongodb
>
> MapReduce framework tries its best to run the jobs on the nodes
> where data is located. It is its fundamental nature. You don't have
> to do anything extra.
>
> *I am sorry if I misunderstood the question.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
> On Wed, Jan 16, 2013 at 8:10 PM, John Lilley <[email protected]> wrote:
> How does one schedule mappers to read MongoDB or HBase in a
> data-locality-aware fashion?
> -john
>
> From: Mohammad Tariq [mailto:[email protected]]
> Sent: Wednesday, January 16, 2013 3:29 AM
> To: [email protected]
> Subject: Re: Query mongodb
>
> Yes. You can use MongoDB-Hadoop adapter to achieve that. Through this adapter
> you can pull the data, process it and push it back to your MongoDB backed
> datastore by writing MR jobs.
>
> It is also 100% possible to query Hbase or JSON files, or anything else for
> that matter, stored in HDFS.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
> On Wed, Jan 16, 2013 at 3:50 PM, Panshul Whisper <[email protected]> wrote:
> Hello,
> Is it possible or how is it possible to query mongodb directly from hadoop.
>
> Or is it possible to query hbase or json files stored in hdfs in a similar
> way as we can query the json documents in mongodb.
>
> Suggestions please.
>
> Thank you.
> Regards,
> Panshul.
