Re: hadoop File loading

Jeff Zhang Tue, 22 Nov 2011 00:21:00 -0800

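It will work as long as your RecordReader handles the XML tag boundaries.
HDFS cuts the file into blocks without looking at the content, so an XML
record can span two blocks. A RecordReader is not limited to local data,
though: it can keep reading past the end of its split (from another
DataNode if necessary) until it reaches the closing tag, and it should skip
any partial record at the start of its split.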
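
To make the boundary rule concrete, here is a minimal sketch in plain Java.
It does not use the Hadoop API; it only simulates a file as a byte array
and a split as an offset range. The class name XmlSplitDemo, the method
readSplit, and the <record> tag are all hypothetical, and it assumes the
records are not nested and that the start tag carries no attributes. A
real RecordReader would apply the same logic to the stream it opens for
its split.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone simulation of split-aware XML record reading.
final class XmlSplitDemo {

    // Returns the first index >= from where pattern occurs in data, or -1.
    static int indexOf(byte[] data, byte[] pattern, int from) {
        for (int i = Math.max(from, 0); i + pattern.length <= data.length; i++) {
            int j = 0;
            while (j < pattern.length && data[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    // Reads every record whose start tag BEGINS inside [splitStart, splitEnd).
    // A record that begins in the split is read to its end tag even if that
    // lies beyond splitEnd; a partial record at the front is skipped because
    // the search only looks for start tags.
    static List<String> readSplit(byte[] file, int splitStart, int splitEnd,
                                  String startTag, String endTag) {
        byte[] open = startTag.getBytes(StandardCharsets.UTF_8);
        byte[] close = endTag.getBytes(StandardCharsets.UTF_8);
        List<String> records = new ArrayList<>();
        int pos = splitStart;
        while (true) {
            int begin = indexOf(file, open, pos);
            if (begin < 0 || begin >= splitEnd) {
                break; // the next record belongs to a later split (or none left)
            }
            int finish = indexOf(file, close, begin + open.length);
            if (finish < 0) {
                break; // truncated record at end of file: nothing to emit
            }
            int recordEnd = finish + close.length;
            records.add(new String(file, begin, recordEnd - begin,
                                   StandardCharsets.UTF_8));
            pos = recordEnd;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] file = ("<docs>\n<record>\n  <id>1</id>\n</record>\n"
                + "<record>\n  <id>2</id>\n</record>\n</docs>\n")
                .getBytes(StandardCharsets.UTF_8);
        int cut = file.length / 2; // pretend the block boundary falls here
        System.out.println("split 1: " + readSplit(file, 0, cut, "<record>", "</record>"));
        System.out.println("split 2: " + readSplit(file, cut, file.length, "<record>", "</record>"));
    }
}
```

Wherever the cut falls, each record is returned by exactly one split: the
one in which its start tag begins. Hadoop's own LineRecordReader follows
the same convention for lines, which is why text files with lines
crossing block boundaries are read correctly.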


On Tue, Nov 22, 2011 at 9:20 AM, hari708 <[email protected]> wrote:

>
> Hi,
> I have a big file consisting of XML data. The XML is not represented as a
> single line in the file. If we load this file into a Hadoop directory using
> the ./hadoop dfs -put command, how does the distribution happen?
> Basically, in my MapReduce program I am expecting a complete XML record as
> my input. I have a CustomReader (for XML) in my MapReduce job configuration.
> My main confusion is that if the NameNode distributes data to DataNodes,
> there is a chance that one part of an XML record goes to one DataNode and
> the other half goes to another DataNode. If that is the case, will my custom
> XMLReader in the MapReduce job be able to combine them (as MapReduce reads
> data locally only)?
> Please help me with this.
> --
> View this message in context:
> http://old.nabble.com/hadoop-File-loading-tp32871902p32871902.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>


-- 
Best Regards

Jeff Zhang
