Are there examples detailing how to write input formats, record readers and 
related classes? I was hoping to write one against a Redis database, and it 
seems to share similar issues with accessing data from a REST API.

Alex Thieme
[email protected]
508-361-2788


On Feb 19, 2013, at 1:34 PM, Robert Evans <[email protected]> wrote:

> I don't know of any input format that will do this out of the box, but it 
> should not be that hard to write one.  There are two big issues here:
> 
> 1. The data you are reading from the API really needs to be static, or you 
>    could get some very odd inconsistencies.  For example, a node dies after 
>    a map task has finished and not all of the reducers got the data, so the 
>    map task is rerun and some of the reducers have old data while others 
>    have new data.  This is the main reason to download the data before 
>    processing it.  You can work around this by using the input format to 
>    run a map-only job that writes the data out to a file before processing 
>    it the rest of the way.
> 2. You need a good way to partition the data from the API.  This can be 
>    difficult unless the REST API provides a logical way to split it up.
> --Bobby
> 
> From: Yaron Gonen <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, February 19, 2013 4:49 AM
> To: "[email protected]" <[email protected]>
> Subject: InputFormat for some REST api
> 
> Hi,
> Do you know of any InputFormat implemented for some REST API provider?
> Usually when one needs to process data that is accessible only by REST, one 
> should try to download the data first somehow, but what if you cannot 
> download it?
> 
> thanks
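
The partitioning issue Bobby raises can be sketched roughly as follows. This 
assumes the REST API reports a total record count and supports offset/limit 
paging, which many APIs do not. The names PageRangeSplitter, PageRange and 
computeSplits are hypothetical and not part of Hadoop. In a real job each 
range would be wrapped in an InputSplit returned from InputFormat.getSplits(), 
and the RecordReader created for that split would fetch only its own range.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: divides a paged resource into contiguous ranges,
// one per map task. It has no Hadoop dependency so it can be tried alone.
final class PageRangeSplitter {

    // Half-open range [start, end) of record offsets.
    static final class PageRange {
        public final long start;
        public final long end;

        PageRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        public long length() {
            return end - start;
        }
    }

    // Splits totalRecords into at most numSplits ranges whose sizes differ
    // by at most one record. Empty ranges are never returned.
    static List<PageRange> computeSplits(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("need totalRecords >= 0 and numSplits > 0");
        }
        List<PageRange> splits = new ArrayList<>();
        long base = totalRecords / numSplits;
        long remainder = totalRecords % numSplits;
        long start = 0;
        for (int i = 0; i < numSplits && start < totalRecords; i++) {
            // The first 'remainder' splits each take one extra record.
            long length = base + (i < remainder ? 1 : 0);
            splits.add(new PageRange(start, start + length));
            start += length;
        }
        return splits;
    }

    public static void main(String[] args) {
        // Each line is the offset/limit pair one map task would request.
        for (PageRange r : computeSplits(10, 3)) {
            System.out.println("offset=" + r.start + " limit=" + r.length());
        }
    }
}
```

This only covers the second issue. It does nothing about the first one: if 
the data behind the API changes between task attempts, a rerun map task can 
still see different records for the same range.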
