Hi Balson,

Have you tried NLineInputFormat<http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html>? You can find an example of NLineInputFormat here: http://goo.gl/aVzDr.
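In case it helps, here is a rough sketch of how the job might be wired up. Note this uses the newer org.apache.hadoop.mapreduce API rather than the mapred one linked above, and the binary path and input path are just placeholders you would replace with your own:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class BinaryPerFileDriver {

  // Each map() call receives one line of input.txt (one file name);
  // the mapper shells out to the external binary for that file.
  public static class BinaryRunnerMapper
      extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
    @Override
    protected void map(LongWritable offset, Text fileName, Context context)
        throws IOException, InterruptedException {
      // "/path/to/binary" is a placeholder for your executable.
      Process p = new ProcessBuilder("/path/to/binary", fileName.toString().trim())
          .inheritIO()
          .start();
      if (p.waitFor() != 0) {
        throw new IOException("Binary failed for input " + fileName);
      }
      // Nothing is emitted; the binary writes its own output.
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "binary-per-file");
    job.setJarByClass(BinaryPerFileDriver.class);

    job.setInputFormatClass(NLineInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0])); // e.g. input.txt
    NLineInputFormat.setNumLinesPerSplit(job, 1);         // one line per map task

    job.setMapperClass(BinaryRunnerMapper.class);
    job.setNumReduceTasks(0);                             // map-only job
    job.setOutputFormatClass(NullOutputFormat.class);     // mappers emit nothing

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

With setNumLinesPerSplit(job, 1), each line of input.txt becomes its own split, so each file name is handed to its own map task, which I believe is exactly the <1, myFirstInput.vlc>, <2, mySecondInput.vlc> behavior you describe.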
On Thu, May 9, 2013 at 2:53 PM, Balachandar R.A. <[email protected]> wrote:

> Hello
>
> I would like to explore the possibility of using the MapReduce framework for the
> following problem.
>
> I have a set of huge files. I would like to execute a binary over every
> input file. The binary needs to operate over the whole file, so it
> is not possible to split the file into chunks. Let's assume that I have six
> such files and have their names in a single text file. I need to write
> Hadoop code to take this single file as input, and every line in it should
> go to one map task. The map task shall execute the binary on that file, and
> the file can be located in HDFS. No reduce tasks are needed, and no output
> shall be emitted from the map tasks either. The binary takes care of
> creating the output file in the specified location.
>
> Is there a way to tell Hadoop to feed a single line to a map task? I came
> across a few examples wherein a set of files is given, and it looks like
> the framework tries to split the file, reads every line in the split,
> generates key/value pairs and sends these pairs to a single map task. In my
> situation, I want only one key/value pair to be generated for each line,
> and it should be given to a single map task. That's it.
>
> For example, assume that this is my file <input.txt>:
>
> myFirstInput.vlc
> mySecondInput.vlc
> myThirdInput.vlc
>
> Now, the first map task should get the pair <1, myFirstInput.vlc>, the second
> gets the pair <2, mySecondInput.vlc>, and so on.
>
> Can someone throw some light on this problem? For me, it looks
> straightforward, but I could not find any pointers on the web.
>
> With thanks and regards
> Balson

--
Regards,
Ted Xu
