regarding output dir usage
--------------------------
Key: HADOOP-5575
URL: https://issues.apache.org/jira/browse/HADOOP-5575
Project: Hadoop Core
Issue Type: Task
Environment: ubuntu hardy
Reporter: girija l
Fix For: 0.20.0
I want to do the following:
1. Run a sequence of map-reduce operations. I found no relevant link or
template showing how this is done.
2. As an alternative, I thought of passing the output/part-00000 file to the
next map-reduce job as its input file, but I got a wrong FS exception while
accessing the output dir. This bug is already fixed in 0.20, but I am not able
to find that version on the Apache Hadoop Core download/release page.
Can anyone help me out with the 0.20 distribution?
Also, if possible, can anyone give me an idea of how to run a sequence of
map-reduce iterations?
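One common pattern for such a sequence is a driver loop in which the output
directory of one pass becomes the input directory of the next (the whole
directory, not only part-00000). The sketch below shows just that path
bookkeeping in plain Java, with no Hadoop dependency. IterationPlan, plan, and
the iter-N directory naming are hypothetical names chosen for illustration. A
real driver would configure a job with each pair of paths and submit it (for
example with JobClient.runJob) before moving on to the next pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plans the input/output directories for a chain of jobs.
class IterationPlan {

    /** Returns one {input, output} directory pair per pass, in run order. */
    static List<String[]> plan(String firstInput, String workDir, int passes) {
        List<String[]> steps = new ArrayList<>();
        String input = firstInput;
        for (int i = 0; i < passes; i++) {
            String output = workDir + "/iter-" + i;
            steps.add(new String[] {input, output});
            input = output; // the next pass reads what this pass wrote
        }
        return steps;
    }

    public static void main(String[] args) {
        for (String[] step : plan("input", "work", 3)) {
            // A real driver would set these paths on the job and run it here.
            System.out.println(step[0] + " -> " + step[1]);
        }
    }
}
```

Each pass writes to a fresh directory because a job refuses to start when its
output directory already exists.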
One more point: I want each map task to have access to a common file in its
ENTIRETY (i.e. not a split of it; the map task should be able to READ the whole
file), and I want my reducer tasks to write into that same file.
Two questions:
1. How do I set up such a file so that it is not split but is given to each map
task in full?
2. How can I have all my reducers, operating in parallel, modify this file,
which will be used for the next iteration?
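On these two points: Hadoop's DistributedCache is the usual mechanism for
handing every map task a complete, unsplit copy of a file. Reducers cannot all
write into one HDFS file at the same time, since a file has a single writer.
Each reducer writes its own part-NNNNN file, and the driver can combine those
into the shared file for the next iteration. The sketch below shows only that
combining step, on the local filesystem with java.nio. SideFileMerger and
mergeParts are hypothetical names, and a real job would work against HDFS
through Hadoop's FileSystem API instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: merges per-reducer part files into one side file.
class SideFileMerger {

    /** Concatenates outputDir/part-* (in name order) into sideFile. */
    static Path mergeParts(Path outputDir, Path sideFile) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : ds) {
                parts.add(p);
            }
        }
        Collections.sort(parts); // part-00000, part-00001, ...
        List<String> merged = new ArrayList<>();
        for (Path part : parts) {
            merged.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }
        // Overwrites sideFile with the merged lines and returns its path.
        return Files.write(sideFile, merged, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("job-output");
        Files.write(dir.resolve("part-00000"), Arrays.asList("from reducer 0"));
        Files.write(dir.resolve("part-00001"), Arrays.asList("from reducer 1"));
        Path side = mergeParts(dir, Files.createTempFile("side", ".txt"));
        System.out.println(Files.readAllLines(side, StandardCharsets.UTF_8));
    }
}
```

The merged file would then be registered with the cache before the next pass
is submitted, so that every map task of that pass sees the whole file.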
I am not sure whether the things I mentioned above are actually possible. I am
new to Hadoop.
Please help,
Thanks,
--
Girija