Hello,

I have a large XML file (over 10 GB) with a simple format like:

<book>
    <title></title>
    <author></author>
    ...
</book>

I parse the XML and convert it into another format (CSV).
Currently, the parsing is performed on a single server only, and it is
slow (a few hours).
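
To illustrate the kind of conversion involved, here is a minimal
single-machine sketch in Python. It is only an assumption about how such
a job could look, not the code actually in use: the function name
books_to_csv, the field list, and the idea that all <book> records sit
directly under one root element are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def books_to_csv(xml_source, csv_out, fields=("title", "author")):
    """Stream <book> records from xml_source and write one CSV row each.

    Assumes (hypothetically) that every <book> is a direct child of a
    single root element and that the listed fields are its children.
    Returns the number of records written.
    """
    writer = csv.writer(csv_out)
    writer.writerow(fields)

    # iterparse reads the file incrementally instead of loading it whole.
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element

    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == "book":
            # Missing fields become empty CSV cells.
            writer.writerow([elem.findtext(f, default="") for f in fields])
            count += 1
            # Drop finished records so memory use stays roughly constant.
            root.clear()
    return count
```

Called with an input file opened in binary mode and an output file opened
with newline="", this processes one record at a time, so the file size
mainly affects run time rather than memory.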

Is Hadoop a good solution for splitting the XML file and spreading the
XML parsing across several machines in a cluster?

Thanks for any comments.
