Hello Ashumeet,

Yes, that's a symptom of malformed XML. If you haven't changed the nutch-default.xml file, then it's probably the version of the SAX parser you are using. Which version of Java are you using?
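
Since the error points at line 1, column 1, it may be worth looking at the very first bytes of the file. Below is a minimal sketch, assuming the file is small enough to read into memory. The class name PrologCheck, the method describeStart, and the category labels are made up for illustration and are not part of Nutch or Hadoop. It only reports what the file starts with: a UTF-8 byte order mark or other stray content ahead of the first '<' is one possible trigger for "Content is not allowed in prolog" with some parsers.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PrologCheck {

    /**
     * Classifies the first bytes of a file (hypothetical helper):
     * "empty", "utf8-bom", "starts-with-markup", or "other".
     */
    public static String describeStart(byte[] data) {
        if (data.length == 0) {
            return "empty";
        }
        // UTF-8 byte order mark: EF BB BF
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return "utf8-bom";
        }
        // A plain XML file normally begins with '<' (e.g. "<?xml" or a root tag).
        if (data[0] == '<') {
            return "starts-with-markup";
        }
        // Anything else: stray text, whitespace, or a different encoding.
        return "other";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PrologCheck <path-to-xml-file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + ": " + describeStart(data));
    }
}
```

You could run it against your conf/nutch-default.xml (path assumed). A result other than "starts-with-markup" suggests the file itself is the problem rather than the parser.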

Best regards,

---
Andreas P. Koenzen

On 14/02/2010, at 01:31 a.m., Ashumeet Singh wrote:

I am not sure which tag is missing in nutch-default.xml. Also, I don't understand what "Content is not allowed in prolog." is trying to say. I can view the file in the browser, but the search is empty because no crawl has happened yet.

Thanks for the prompt reply.
Ashumeet Singh

On Feb 13, 2010, at 11:20 PM, Neera Sharma wrote:

You need to check your nutch-default.xml file for a missing tag, etc. Are you
able to view it in your browser?
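
One way to check the file outside of Nutch is to run it through the standard JAXP parser and print where the first error occurs. This is only a rough sketch: the class name WellFormedCheck and the method check are hypothetical, and the JDK's bundled parser may behave differently from the Xerces version on the Nutch classpath.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

public class WellFormedCheck {

    /**
     * Parses the stream with the default JAXP DocumentBuilder.
     * Returns "well-formed", or the position and message of the first parse error.
     */
    public static String check(InputStream in) throws Exception {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return "well-formed";
        } catch (SAXParseException e) {
            // Line and column are reported by the parser itself.
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WellFormedCheck <path-to-xml-file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(check(in));
        }
    }
}
```

If this reports "well-formed" for your nutch-default.xml, the file is probably fine and the parser or Java version is the more likely suspect.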


On Sat, Feb 13, 2010 at 4:33 PM, Ashumeet Singh <ashumeet.landm...@gmail.com> wrote:

Hey everyone…!!! I am pleased to be a part of this community. I am a
student trying to learn Nutch. I have installed Nutch and Tomcat properly.
They are running, but the crawl fails when I run the following command
in a terminal:

./nutch crawl urls -dir crawl -depth 3 -topN 50

It is giving me the following error:

[Fatal Error] nutch-default.xml:1:1: Content is not allowed in prolog.
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1049)
     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:940)
     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:891)
     at org.apache.hadoop.conf.Configuration.set(Configuration.java:345)
     at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:195)
     at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:205)
     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:150)
     at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
     at org.apache.nutch.crawl.Crawl.main(Crawl.java:59)
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
     at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:968)
     ... 8 more

PLEASE HELP ME …….


Thanks
Ashumeet Singh



