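You need to check your nutch-default.xml file for a missing tag or similar damage. The parser reports the error at line 1, column 1, so look first at the very start of the file for anything that comes before the first "<". Are you able to view the file in your browser?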
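Since the error is reported at nutch-default.xml:1:1, a quick way to narrow it down is to look at the first few bytes of the file. The following is only a rough sketch in Python, not anything that ships with Nutch: the function names `prolog_problem` and `check_file` are made up for illustration, and the list of causes it checks (a byte order mark, stray text, or whitespace ahead of the XML declaration) is my assumption about the usual suspects rather than a confirmed diagnosis of your file.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def prolog_problem(data: bytes):
    """Describe what precedes the first '<' in data, or return None if nothing suspicious."""
    if not data:
        return "file is empty"
    if data.startswith(UTF8_BOM):
        # Some parser setups reject a byte order mark ahead of the declaration.
        return "UTF-8 byte order mark before the first '<'"
    if data.startswith(b"<"):
        return None
    start = data.find(b"<")
    if start == -1:
        return "no '<' found; this does not look like XML"
    junk = data[:start]
    if junk.strip():
        # Non-whitespace bytes before the markup: stray text, a pasted page, etc.
        return "unexpected content before the first '<': %r" % junk[:40]
    if data[start:].startswith(b"<?xml"):
        # An XML declaration, when present, has to be the very first thing in the file.
        return "whitespace before the XML declaration"
    return None


def check_file(path: str, nbytes: int = 256):
    """Hypothetical helper: read the first nbytes of path and run prolog_problem on them."""
    with open(path, "rb") as f:
        return prolog_problem(f.read(nbytes))
```

Calling `check_file` on your copy of nutch-default.xml (normally under the conf directory, but adjust the path to your install) returns either None or a short description of what it found. If it reports something, re-saving the file as plain UTF-8 without a byte order mark, or restoring the original file from the Nutch download, would be the next thing to try.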
On Sat, Feb 13, 2010 at 4:33 PM, Ashumeet Singh <ashumeet.landm...@gmail.com> wrote:

> Hey everyone…!!! I am pleased to be a part of this community. I am a
> student trying to learn nutch. I have installed nutch and tomcat properly.
> And they are running but at the time of crawl, when I am running the
> following command in terminal:
>
> ./nutch crawl urls -dir crawl -depth 3 -topN 50
>
> It is giving me the following error:
>
> [Fatal Error] nutch-default.xml:1:1: Content is not allowed in prolog.
> Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1049)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:940)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:891)
>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:345)
>     at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:195)
>     at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:205)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:150)
>     at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:59)
> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>     at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:968)
>     ... 8 more
>
> PLEASE HELP ME …….
>
> Thanks
> Ashumeet Singh