Can you post a Jira and a patch?

On 12/10/07 1:12 AM, "Alan Ho" <[EMAIL PROTECTED]> wrote:

> I've written an XML input splitter based on a StAX parser. It's much better than
> StreamXmlRecordReader.
> 
> ----- Original Message ----
> From: Peter Thygesen <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, November 26, 2007 8:49:52 AM
> Subject: MapReduce Job on XML input
> 
> I would like to run some MapReduce jobs on some XML files I have (approx.
> 100,000 compressed files).
> The XML files are not that big, about 1 MB compressed, each containing
> about 1,000 records.
> 
> Do I have to write my own InputSplitter? Should I use
> MultiFileInputFormat or StreamInputFormat? Can I use the
> StreamXmlRecordReader, and how? By sub-classing some input class?
> 
> The tutorials and examples I've read are all very straightforward,
> reading simple text files, but I miss a more complex example, especially
> one that reads XML files ;)
> 
> thx. 
> Peter
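
Since the thread mentions a StAX-based splitter but does not include its code, here is a minimal sketch of just the StAX part: pulling one record at a time out of an XML stream. It is not Alan's implementation and it is not wired into Hadoop. The class name `StaxRecordSketch`, the method `readRecords`, and the `record` tag used in the usage note are all hypothetical, and the sketch assumes each file is small enough to be read by a single task, which seems plausible for files of about 1 MB.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Hadoop: extracts records from XML with StAX.
public class StaxRecordSketch {

    /** Returns the concatenated text of every <recordTag> element in the stream. */
    public static List<String> readRecords(Reader xml, String recordTag)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        List<String> records = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // Pull events until the start tag of a record is reached.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && recordTag.equals(reader.getLocalName())) {
                    records.add(readText(reader));
                }
            }
        } finally {
            reader.close();
        }
        return records;
    }

    /** Collects all character data up to the end tag that closes the current element. */
    private static String readText(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        int depth = 1; // the record's own start tag has already been consumed
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                text.append(reader.getText());
            }
        }
        return text.toString();
    }
}
```

Calling `StaxRecordSketch.readRecords(new StringReader(xml), "record")` returns one string per record. In a MapReduce job, a loop like this would presumably sit inside a custom record reader that hands each record to the mapper as a value; that wiring depends on the Hadoop version and is left out here.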
