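You need to check your nutch-default.xml file for malformed XML. The error
points at line 1, column 1, so look for a missing tag or for stray characters
before the opening XML declaration. Are you able to view the file in your
browser?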
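
If the browser view does not make the problem obvious, a byte-level look at the start of the file may help. Below is a minimal sketch in Python, not part of Nutch or Hadoop; the function name prolog_problem is made up for illustration. It assumes the usual causes of this kind of failure: a byte order mark or other stray bytes in front of the first tag.

```python
import codecs


def prolog_problem(data: bytes):
    """Describe what sits in front of the first '<', or return None if nothing does.

    Hypothetical helper for illustration only; it inspects raw bytes and
    does not perform a full XML well-formedness check.
    """
    # A byte order mark is invisible in most editors but still counts as content.
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 byte order mark before the XML declaration"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 byte order mark; file may be saved in an unexpected encoding"

    idx = data.find(b"<")
    if idx == -1:
        return "no '<' found; file does not look like XML"

    leading = data[:idx]
    if leading.strip():
        # Non-whitespace bytes before the first tag.
        return f"stray content before first tag: {leading!r}"
    if leading and data[idx:idx + 5] == b"<?xml":
        # The XML declaration, when present, must be the very first thing in the file.
        return "whitespace before the XML declaration"
    return None
```

To use it, read the file in binary mode and pass the bytes in, for example prolog_problem(open("conf/nutch-default.xml", "rb").read()). The path here is an assumption; substitute wherever your nutch-default.xml actually lives. A return value of None only means the start of the file looks clean, so a missing tag further down would still need a separate check.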


On Sat, Feb 13, 2010 at 4:33 PM, Ashumeet Singh <ashumeet.landm...@gmail.com> wrote:

> Hey everyone! I am pleased to be a part of this community. I am a
> student trying to learn Nutch. I have installed Nutch and Tomcat, and
> both are running, but the crawl fails when I run the following command
> in a terminal:
>
> ./nutch crawl urls -dir crawl -depth 3 -topN 50
>
> It is giving me the following error:
>
> [Fatal Error] nutch-default.xml:1:1: Content is not allowed in prolog.
> Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1049)
>        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:940)
>        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:891)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:345)
>        at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:195)
>        at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:205)
>        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:150)
>        at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:59)
> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:968)
>        ... 8 more
>
> Please help me.
>
>
> Thanks
> Ashumeet Singh
>
>
