Hi

I am going to feed the nutch-0.8-dev crawler with seeds in XML format. I
have read Nutch's TextInputFormat/InputFormatBase, and it seems that Nutch
currently breaks plain text files into characters and parses those. My
question is how to support an XmlInputFormat: as I see it, XML is not
character-based but block-based.
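
To make "block-based" concrete, here is a rough sketch of the kind of
reading I have in mind. It is plain Java only and does not use any Nutch
or Hadoop classes; the class name XmlBlockReader and the <seed> tag are
made up for illustration, and it assumes the tag has no attributes and is
not nested inside itself. It scans a character stream and returns one
complete <seed>...</seed> element at a time instead of one line at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads whole XML elements ("blocks") from a stream. */
public class XmlBlockReader {
    private final BufferedReader in;
    private final String startTag;
    private final String endTag;

    public XmlBlockReader(Reader reader, String tagName) {
        this.in = new BufferedReader(reader);
        // Assumption: the tag carries no attributes, e.g. <seed>...</seed>.
        this.startTag = "<" + tagName + ">";
        this.endTag = "</" + tagName + ">";
    }

    /** Returns the next complete block, or null when no more blocks exist. */
    public String next() throws IOException {
        if (!readUntil(startTag, null)) {
            return null; // no further start tag
        }
        StringBuilder block = new StringBuilder(startTag);
        if (!readUntil(endTag, block)) {
            return null; // start tag without a matching end tag
        }
        return block.toString();
    }

    /**
     * Consumes characters until `match` has been read in full.
     * Consumed characters are appended to `sink` when it is non-null.
     * The simple restart logic is sufficient because '<' occurs only
     * as the first character of the tag strings being matched.
     */
    private boolean readUntil(String match, StringBuilder sink) throws IOException {
        int matched = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (sink != null) {
                sink.append((char) c);
            }
            if (c == match.charAt(matched)) {
                matched++;
                if (matched == match.length()) {
                    return true;
                }
            } else {
                matched = (c == match.charAt(0)) ? 1 : 0;
            }
        }
        return false;
    }

    /** Convenience method: collects every block found in the input. */
    public static List<String> readAll(Reader reader, String tagName) throws IOException {
        XmlBlockReader blocks = new XmlBlockReader(reader, tagName);
        List<String> result = new ArrayList<>();
        String block;
        while ((block = blocks.next()) != null) {
            result.add(block);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<seeds><seed>http://a.example/</seed>\n"
                   + "<seed>http://b.example/</seed></seeds>";
        for (String block : readAll(new StringReader(xml), "seed")) {
            System.out.println(block);
        }
    }
}
```

What I do not know is how something like this would fit into the input
format classes, in particular when a file split boundary falls in the
middle of an element.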

Thanks

/Jack

--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
