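The page was successfully fetched and parsed, but it seems the title contains only: "ERROR: The requested URL could not be retrieved". In other words, the success status refers to the fetch and parse steps, while the content that came back appears to be an error page and not your SharePoint page.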
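To make the distinction concrete, here is a minimal sketch (not part of Nutch) of how one might check whether a successfully fetched HTML document is really an error page, by looking at its title. The names `extract_title` and `looks_like_error_page` are hypothetical helpers, and the "title starts with ERROR" rule is only an assumed heuristic based on the output quoted below.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element in a document."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._seen_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self._seen_title:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._seen_title = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> Optional[str]:
        if not self._seen_title:
            return None
        # Collapse runs of whitespace, as a browser would when showing a title.
        return " ".join("".join(self._parts).split())


def extract_title(html: str) -> Optional[str]:
    """Return the document title, or None if there is no complete <title>."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def looks_like_error_page(title: Optional[str]) -> bool:
    """Assumed heuristic: treat titles that start with 'ERROR' as error pages."""
    return title is not None and title.strip().upper().startswith("ERROR")


if __name__ == "__main__":
    sample = (
        "<html><head><title>ERROR: The requested URL could not be "
        "retrieved</title></head><body></body></html>"
    )
    found = extract_title(sample)
    print(found)
    print(looks_like_error_page(found))
```

Feeding this the HTML that curl returns for the same URL, ideally from the same machine and over the same network path that Nutch uses, would show whether both clients actually receive the same document.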
On Thursday 15 December 2011 15:36:40 Christopher Gross wrote:
> I'm getting a success status AND an error message when trying to do a
> parse check. It is a SharePoint site, but this part allows for
> anonymous access -- I can curl the page just fine without having to do
> anything funky. I have a robots.txt in place that allows everyone
> through (it is an internal test site, url has been redacted). Here's
> what I run:
>
> [user@eval bin]$ ./nutch parsechecker "http://sharepointurl/Home.aspx"
> fetching: http://sharepointurl/Home.aspx
> parsing: http://sharepointurl/Home.aspx
> contentType: text/html
> ---------
> Url
> ---------------
> http://http://sharepointurl/Home.aspx---------
> ParseData
> ---------
> Version: 5
> Status: success(1,0)
> Title: ERROR: The requested URL could not be retrieved
> Outlinks: 0
> Content Metadata: Connection=close Content-Type=text/html
> Parse Metadata: CharEncodingForConversion=windows-1252 OriginalCharEncoding=windows-1252
>
> Google searches have been fruitless. Can anyone help me make sense of
> what is going on here? I can provide some snippets of config files if
> need be.
>
> Nutch 1.4, SharePoint 2010, Java 1.6.0_06-b02.
>
> Thanks!
>
> -- Chris

--
Markus Jelsma - CTO - Openindex

