I haven't changed anything in nutch-default.xml. I am using Java version 1.6.0
on a MacBook. Is there a way I can fix the SAX parser?

Thanks everyone for your support. I am honored… :) 
Ashumeet Singh

On Feb 14, 2010, at 8:32 AM, Andreas P. Koenzen wrote:

> Hello Ashumeet,
> 
> Yes, that's a symptom of malformed XML. If you haven't changed the
> nutch-default.xml file, then it's probably the version of the SAX parser you
> are using. Which version of Java are you using?
> 
> Best regards,
> 
> ---
> Andreas P. Koenzen
> 
> On 14/02/2010, at 01:31 a.m., Ashumeet Singh wrote:
> 
>> I am not sure which tag is missing in nutch-default.xml. Also, I don't
>> understand what "Content is not allowed in prolog." is trying to say. I can
>> view it in the browser, but the search is empty because no crawl has
>> happened so far.
>> 
>> Thanks for the prompt reply.
>> Ashumeet Singh
>> 
>> On Feb 13, 2010, at 11:20 PM, Neera Sharma wrote:
>> 
>>> You need to check your nutch-default.xml file for missing tags, etc. Are you
>>> able to view it in your browser?
>>> 
>>> 
>>> On Sat, Feb 13, 2010 at 4:33 PM, Ashumeet Singh <ashumeet.landm...@gmail.com> wrote:
>>> 
>>>> Hey everyone…!!! I am pleased to be a part of this community. I am a
>>>> student trying to learn Nutch. I have installed Nutch and Tomcat properly.
>>>> Both are running, but the crawl fails when I run the following command in
>>>> a terminal:
>>>> 
>>>> ./nutch crawl urls -dir crawl -depth 3 -topN 50
>>>> 
>>>> It is giving me the following error:
>>>> 
>>>> [Fatal Error] nutch-default.xml:1:1: Content is not allowed in prolog.
>>>> Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>>>>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1049)
>>>>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:940)
>>>>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:891)
>>>>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:345)
>>>>     at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:195)
>>>>     at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:205)
>>>>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:150)
>>>>     at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
>>>>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:59)
>>>> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>>>>     at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>>>>     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>>>>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>>>>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:968)
>>>>     ... 8 more
>>>> 
>>>> PLEASE HELP ME …….
>>>> 
>>>> 
>>>> Thanks
>>>> Ashumeet Singh
>>>> 
>>>> 
>> 
> 
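
The error in this thread is reported at line 1, column 1, which often means the parser hit something other than `<` at the very start of the file, for example a byte-order mark or other invisible bytes left behind by an editor. The Python sketch below is one way to check for that; it is not part of Nutch or Hadoop, and the function name `describe_prolog` is made up for illustration. It only inspects the leading bytes and does not validate the rest of the XML.

```python
import codecs

# Known byte-order marks. The UTF-32 marks come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE BOM"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE BOM"),
    (codecs.BOM_UTF8, "UTF-8 BOM"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE BOM"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE BOM"),
]


def describe_prolog(data: bytes) -> str:
    """Describe what sits at the very start of an XML file's raw bytes.

    Hypothetical helper: returns "ok" when the first byte is '<',
    otherwise a short description of what was found instead.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    if data.startswith(b"<"):
        return "ok"
    if not data.strip():
        return "empty file"
    # Show only the first few bytes so the report stays readable.
    return "unexpected leading bytes: %r" % data[:16]
```

To try it on a real file, read the file in binary mode and pass the bytes in, e.g. `describe_prolog(open("conf/nutch-default.xml", "rb").read())`. The path here is an assumption; use wherever nutch-default.xml lives in your installation. Anything other than "ok" suggests the file should be re-saved without the leading bytes or restored from the original distribution.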
