Can you post a Jira and a patch?
On 12/10/07 1:12 AM, "Alan Ho" <[EMAIL PROTECTED]> wrote:

> I've written an XML input splitter based on a StAX parser. It's much better than
> StreamXMLRecordReader.
>
> ----- Original Message ----
> From: Peter Thygesen <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, November 26, 2007 8:49:52 AM
> Subject: MapReduce Job on XML input
>
> I would like to run some MapReduce jobs on some XML files I have (approx.
> 100000 compressed files).
> The XML files are not that big, about 1 MB compressed, each containing
> about 1000 records.
>
> Do I have to write my own InputSplitter? Should I use
> MultiFileInputFormat or StreamInputFormat? Can I use the
> StreamXmlRecordReader, and how? By subclassing some input class?
>
> The tutorials and examples I've read are all very straightforward,
> reading simple text files, but I miss a more complex example, especially
> one that reads XML files ;)
>
> thx.
> Peter
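
For anyone following the thread: the StAX idea mentioned above can be sketched roughly as below. This is not Alan's splitter and not a Hadoop InputFormat or RecordReader; it only illustrates the record-extraction step using the standard javax.xml.stream API. The class name StaxRecordSplitter, the method readRecords, and the "record" element name are hypothetical, and the sketch assumes each record's children are simple text-only elements.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls records out of an XML stream with StAX.
class StaxRecordSplitter {

    // Returns one map (child element name -> text) per <recordTag> element.
    // Assumption: children of a record are text-only elements.
    static List<Map<String, String>> readRecords(Reader in, String recordTag) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                Map<String, String> current = null; // non-null while inside a record
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        if (current == null) {
                            if (name.equals(recordTag)) {
                                current = new LinkedHashMap<>();
                            }
                        } else {
                            // getElementText() reads up to the child's end tag.
                            current.put(name, reader.getElementText());
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && current != null
                            && reader.getLocalName().equals(recordTag)) {
                        records.add(current);
                        current = null;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<records>"
                + "<record><id>1</id><name>a</name></record>"
                + "<record><id>2</id><name>b</name></record>"
                + "</records>";
        for (Map<String, String> record : readRecords(new StringReader(xml), "record")) {
            System.out.println(record);
        }
    }
}
```

Because StAX is a pull parser, the file is streamed rather than loaded whole, which is the usual reason to prefer it for many record-oriented files. Wiring this into a MapReduce job would still need an input format and record reader around it, which this sketch does not attempt.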
