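Take a look at Mahout's XmlInputFormat class. It splits the input on a start and end tag instead of on newlines, so each XML record reaches your parser whole. That should get you started.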
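
A rough sketch of the idea, in plain Python rather than actual Spark or Mahout code: scan for a start tag and its matching end tag, hand each complete span to the XML parser, and write one CSV row per record. The <record> tag, the id and name fields, and the helper names split_records and records_to_csv are all made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def split_records(text, start_tag, end_tag):
    """Yield each start_tag..end_tag span found in text, ignoring newlines."""
    pos = 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            return
        end = text.find(end_tag, start)
        if end == -1:
            return  # incomplete trailing record: drop it
        end += len(end_tag)
        yield text[start:end]
        pos = end


def records_to_csv(text, start_tag, end_tag, fields):
    """Parse every record and emit one CSV row holding the named child fields."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for chunk in split_records(text, start_tag, end_tag):
        elem = ET.fromstring(chunk)
        writer.writerow([(elem.findtext(f) or "").strip() for f in fields])
    return out.getvalue()


# Hypothetical input: records span several lines, so splitting on "\n" would break them.
sample = """<records>
<record>
  <id>1</id>
  <name>alpha</name>
</record>
<record><id>2</id>
  <name>beta, gamma</name></record>
</records>"""

csv_text = records_to_csv(sample, "<record>", "</record>", ["id", "name"])
print(csv_text)
```

In the Mahout class itself the two tags are, as far as I recall, set through the xmlinput.start and xmlinput.end keys of the Hadoop job configuration, so check the source of the version you pull in.
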
On Thu, Jan 30, 2014 at 5:08 AM, Mayur Rustagi <[email protected]> wrote:

> I am trying to load xml in streaming and convert to csv and store it. When
> I use textfile it separates the file on "\n" and hence breaks the parser.
> Is it possible to receive the data one file at a time from the hdfs folder ?
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi

--
Woody Christy
Solutions Architect | Partner Engineering | Cloudera Inc
@woodychristy
