There's a StreamXmlRecordReader class in contrib/streaming that looks
like it will chunk up an XML file based on XML tags. I haven't used it
myself, though.
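
For what it's worth, the general idea of chunking by tag can be sketched
in plain Java as below. This is only an illustration of the technique,
not the StreamXmlRecordReader code: the class XmlRecordSplitter, the
method splitRecords, and the <rec> tag are made-up names. It assumes
records are never nested inside records of the same tag and that the
whole input fits in one string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: cuts a string into records
// delimited by a begin marker and an end marker.
final class XmlRecordSplitter {

    // Returns every substring that starts at `begin` and runs through the
    // next `end` after it. Text between records is skipped.
    static List<String> splitRecords(String text, String begin, String end) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = text.indexOf(begin, pos);
            if (start < 0) {
                break; // no more records
            }
            int stop = text.indexOf(end, start + begin.length());
            if (stop < 0) {
                break; // unterminated record: drop it
            }
            stop += end.length();
            records.add(text.substring(start, stop));
            pos = stop; // continue scanning after this record
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<root><rec id=\"1\">a</rec>\n<rec id=\"2\">b</rec></root>";
        // "<rec" is used as the begin marker so that tags with attributes match;
        // note it would also match a tag such as <record>.
        for (String record : splitRecords(xml, "<rec", "</rec>")) {
            System.out.println(record);
        }
    }
}
```

In a real job the same scan would have to run over the input stream and
deal with records that straddle a split boundary, which is the part a
record reader has to take care of.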

-----Original Message-----
From: Prasan Ary [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 03, 2008 3:30 PM
To: [email protected]
Subject: map/reduce function on xml string

Hi All,
  I am writing a Java implementation of my map/reduce function on
Hadoop.
  The input is an XML file, and the map function has to process
well-formed XML records. So far I have been unable to split the XML file
at XML record boundaries to feed into my map function.
  Can anybody point me to resources that explain how to force a file
split at a desired boundary?
   
  thx,
  Pra.

       
