Hello Jacquelyn,
This is very odd:
> Unexpected EOF in prolog
> at [row,col {unknown-source}]: [1,0]
We fixed this problem a long time ago. It was caused by non-Unicode
codepoints in the data sent to Solr. The Solr indexing plugin strips them all,
and to my knowledge, there are no other non-Unicode codepoints to strip.
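To illustrate the kind of stripping I mean, here is a minimal Python sketch. It is not the code the Nutch plugin uses; it simply removes every character outside the XML 1.0 "Char" production, and the function name is made up for this example.

```python
import re

# Everything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return `text` with all characters that XML 1.0 forbids removed.

    Hypothetical helper for illustration only; the real plugin may apply
    different rules.
    """
    return _INVALID_XML_CHARS.sub("", text)
```

Tabs, line feeds and carriage returns are kept, while other control characters and noncharacters such as U+FFFE are dropped.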
To analyze the problem, enable debug or even trace logging
so you can see the exact XML Nutch is sending on the wire, then use a hex editor
to check position 1,0, that is, the first few bytes.
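If no hex editor is at hand, a few lines of Python can do the same check. This is only a sketch: it assumes you have already saved the request body from the log or a network capture into a file, and the function name and the wording of the notes are my own.

```python
def describe_leading_bytes(payload: bytes, count: int = 16) -> str:
    """Return a hex dump of the first `count` bytes plus a short note.

    Hypothetical helper: `payload` is the raw request body captured from
    the wire, read in binary mode.
    """
    head = payload[:count]
    hex_part = " ".join(f"{b:02x}" for b in head)
    if not payload:
        # Nothing to parse at all, so a parser would see EOF immediately.
        note = "empty payload"
    elif head.startswith(b"\xef\xbb\xbf"):
        note = "starts with a UTF-8 byte order mark"
    elif head.lstrip(b" \t\r\n").startswith(b"<"):
        note = "starts like XML"
    else:
        note = "does not start with '<'"
    return f"{hex_part or '(no bytes)'}  [{note}]"
```

Call it on the captured file, for example `describe_leading_bytes(open(path, "rb").read())`, where `path` is wherever you saved the body. An empty payload would be consistent with an EOF reported at [1,0], but that is a guess until you have looked at the bytes.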
Markus
-----Original message-----
> From:Richardson, Jacquelyn F. <[email protected]>
> Sent: Friday 12th August 2016 19:37
> To: [email protected]
> Subject: Error while attempting to add documents to Solr
>
> Hi All,
>
> Some background information that may be of some help. I have Cygwin64, Solr
> 4.7, Apache Nutch 1.9 source and Tomcat configured in a Windows 7
> environment. This setup works well on my local machine. I can crawl the
> specified web page(s) and Nutch can successfully index the content to Solr.
>
> I moved this setup to one of our servers (except Tomcat; it was already on
> the server and the OS is Windows Server 2008). I executed a crawl of a seed
> file using the individual Nutch commands. Everything worked fine until I ran
> the command to index the content to Solr. I issued the following command:
> bin/nutch solrindex http://fegddd.enther.rlco.gov/solr/collection1_tst
> crawls/crawlsitemap/crawldb -linkdb crawls/crawlsitemap/linkdb
> crawls/crawlsitemap/segments/*
>
> I received the following error in hadoop.log:
> WARN mapred.LocalJobRunner - job_local_0001
> org.apache.solr.common.SolrException: Bad Request
>
> Bad Request
>
> request:
> http://fegddd.enther.rlco.gov/solr/collection1_tst/update?wt=javabin&version=2
>
> Solr.log reports this error:
> INFO - 2016-08-12 07:18:27.656;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection1_tst]
> webapp=/solr path=/update params={wt=javabin&version=2} {} 0 62
> ERROR - 2016-08-12 07:18:27.656; org.apache.solr.common.SolrException;
> org.apache.solr.common.SolrException: Unexpected EOF in prolog
> at [row,col {unknown-source}]: [1,0]
> at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
> at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:521)
> at org.apache.coyote.ajp.AbstractAjpProcessor.process(AbstractAjpProcessor.java:850)
> at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:674)
> at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2500)
> at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2489)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
> at [row,col {unknown-source}]: [1,0]
> at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686)
> at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2134)
> at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2040)
> at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
> at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:213)
> at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
>
> I have compared the setup on my local machine with the setup on the server
> machine and I cannot see a difference. I thought perhaps it had something to
> do with the solrindex-mapping.xml file but what is on the server agrees with
> what I have on my local machine.
>
> Any help you can provide will be most appreciated.
>
> Thanks,
> Jackie
>
>