Hadoop knows about files and blocks, so you can achieve data locality if you are 
accessing files directly.

I think in your case you'll have to develop your own logic that can take 
advantage of it.
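
To make "your own logic" a bit more concrete, here is a rough sketch of the idea, not Hadoop's actual scheduler. In Hadoop, as far as I know, the hook is InputSplit.getLocations(), which lets an input format report the hosts that hold a split's data. The toy model below uses made-up names (Split, assign_splits, node1..node3) and assumes each node has a fixed number of free task slots: every split carries host hints, and a split is placed on a hinted host when one has a free slot, otherwise on any host with capacity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Split:
    """A unit of input plus the hosts that hold its data (hypothetical model)."""
    name: str
    hosts: Tuple[str, ...]


def assign_splits(splits: List[Split], slots: Dict[str, int]) -> Dict[str, str]:
    """Greedily place each split on a hinted host that has a free slot,
    falling back to any host with capacity (which means a non-local read)."""
    free = dict(slots)  # copy so the caller's dict is not mutated
    placement: Dict[str, str] = {}
    for split in splits:
        local = [h for h in split.hosts if free.get(h, 0) > 0]
        candidates = local or [h for h, n in free.items() if n > 0]
        if not candidates:
            raise RuntimeError("no free slots left for " + split.name)
        host = candidates[0]
        free[host] -= 1
        placement[split.name] = host
    return placement


# Illustrative data only: three splits, three nodes with one slot each.
splits = [
    Split("region-a", ("node1",)),
    Split("region-b", ("node2",)),
    Split("region-c", ("node1", "node3")),
]
placement = assign_splits(splits, {"node1": 1, "node2": 1, "node3": 1})
```

For a database source, the hard part is the step this sketch takes for granted: knowing which host actually serves a given key range or shard, so that the host hints mean something.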

On Jan 16, 2013, at 6:56 AM, John Lilley <[email protected]> wrote:

> Um, I think you and I are talking about the same thing, but maybe not?
>  
> Certainly HBase/MongoDB are HDFS-aware, so I would expect that if I am a 
> client program running outside of the Hadoop cluster and I do a query, the 
> database tools will construct query processing such that data is read and 
> processed in an optimal fashion (using MapReduce?), before the aggregated 
> information is shipped to me on the client side. 
>  
> The question I was asking is a little different although hopefully the answer 
> is just as simple.  Can I write a mapper/reducer that queries HBase/MongoDB and 
> have MR schedule my mappers such that each mapper is receiving tuples that 
> have been read in a locality-aware fashion?
>  
> john
>  
> From: Mohammad Tariq [mailto:[email protected]] 
> Sent: Wednesday, January 16, 2013 7:47 AM
> To: [email protected]
> Subject: Re: Query mongodb
>  
> The MapReduce framework tries its best to run tasks on the nodes 
> where the data is located. That is fundamental to how it works. You don't have 
> to do anything extra.
>  
> *I am sorry if I misunderstood the question.
>  
> 
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>  
> 
> On Wed, Jan 16, 2013 at 8:10 PM, John Lilley <[email protected]> wrote:
> How does one schedule mappers to read MongoDB or HBase in a 
> data-locality-aware fashion?
> -john
>  
> From: Mohammad Tariq [mailto:[email protected]] 
> Sent: Wednesday, January 16, 2013 3:29 AM
> To: [email protected]
> Subject: Re: Query mongodb
>  
> Yes. You can use the MongoDB-Hadoop adapter to achieve that. Through this adapter 
> you can pull the data, process it and push it back to your MongoDB-backed 
> datastore by writing MR jobs.
>  
> It is also 100% possible to query HBase or JSON files, or anything else for 
> that matter, stored in HDFS.
> 
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>  
> 
> On Wed, Jan 16, 2013 at 3:50 PM, Panshul Whisper <[email protected]> 
> wrote:
> Hello,
> Is it possible, and if so how, to query MongoDB directly from Hadoop?
> 
> Or is it possible to query HBase or JSON files stored in HDFS in a similar 
> way as we can query the JSON documents in MongoDB?
> 
> Suggestions please.
> 
> Thank you.
> Regards,
> Panshul.
> 
>  
>  
