Hello Madhu,

Thanks for the response. Actually, I was trying to use the new API (Job). Have you tried that? I was not able to set the InputFormat using the Job API.
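For reference, this is roughly the driver I am attempting (a minimal sketch, not working code: the <record>/</record> tags and MyMapper are placeholders, and the new-API port of StreamInputFormat under org.apache.hadoop.streaming.mapreduce is my assumption; I have not confirmed that class exists in my build):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class XmlDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // StreamXmlRecordReader slices the input between these two tags and
    // hands each <record>...</record> block to the mapper as one record.
    conf.set("stream.recordreader.class",
        "org.apache.hadoop.streaming.StreamXmlRecordReader");
    conf.set("stream.recordreader.begin", "<record>");
    conf.set("stream.recordreader.end", "</record>");

    Job job = new Job(conf, "xml processing");
    job.setJarByClass(XmlDriver.class);
    // This is where I get stuck: org.apache.hadoop.streaming.StreamInputFormat
    // implements the old mapred InputFormat interface, so setInputFormatClass()
    // rejects it. A new-API port would have to exist for this line to compile.
    job.setInputFormatClass(
        org.apache.hadoop.streaming.mapreduce.StreamInputFormat.class);
    job.setMapperClass(MyMapper.class); // placeholder Mapper<Text, Text, Text, Text>
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}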
Regards,
Mohammad Tariq

On Tue, Jun 19, 2012 at 4:28 PM, madhu phatak <phatak....@gmail.com> wrote:
> Hi,
> Set the following properties in the driver class:
>
> jobConf.set("stream.recordreader.class",
>     "org.apache.hadoop.streaming.StreamXmlRecordReader");
> jobConf.set("stream.recordreader.begin", "start-tag");
> jobConf.set("stream.recordreader.end", "end-tag");
> jobConf.setInputFormat(StreamInputFormat.class);
>
> In the Mapper, each xml record will come in as a key of type Text, so your
> mapper will look like
>
> public class MyMapper<K,V> implements Mapper<Text,Text,K,V>
>
> On Tue, Jun 19, 2012 at 2:49 AM, Mohammad Tariq <donta...@gmail.com> wrote:
>>
>> Hello list,
>>
>> Could anyone who has written MapReduce jobs to process xml
>> documents stored in their cluster using "StreamXmlRecordReader" share
>> his/her experience, or provide me some pointers
>> addressing that? Many thanks.
>>
>> Regards,
>> Mohammad Tariq
>
> --
> https://github.com/zinnia-phatak-dev/Nectar