[ http://issues.apache.org/jira/browse/NUTCH-190?page=comments#action_12364145 ]
[EMAIL PROTECTED] commented on NUTCH-190:
-----------------------------------------

Here's an example of failure output after patch is applied:

060126 141413 task_m_bx2ifn Error parsing: http://techreports.jpl.nasa.gov/2000/00-1147.pdf: failed(2,202): Content truncated at 102013 bytes. Parser can't handle incomplete application/pdf file

> ParseUtil drops reason for failed parse
> ---------------------------------------
>
>          Key: NUTCH-190
>          URL: http://issues.apache.org/jira/browse/NUTCH-190
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: linux
>     Reporter: [EMAIL PROTECTED]
>     Priority: Minor
>  Attachments: ParseUtil_drops_failure_reason.patch
>
> Doing the below:
>     Parse parse;
>     ParseStatus parseStatus;
>     try {
>       parse = ParseUtil.parse(content);
>       parseStatus = parse.getData().getStatus();
>     } catch (Exception e) {
>       parseStatus = new ParseStatus(e);
>     }
>     if (!parseStatus.isSuccess()) {
>       LOG.warning("Error parsing: " + url + ": " + parseStatus);
>       parse = null;
>     }
> ...on failure, the LOG.warning never prints out the reason for failure.
> Here's an example: "Error parsing:
> http://www.dfrc.nasa.gov/DTRS/1967/PDF/H-478.pdf: failed(0,0)".
> ParseUtil is dropping messages lovingly crafted by parsers.
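
For anyone trying to picture the symptom, here is a minimal, self-contained Java sketch. It does not use Nutch's classes and is not the attached patch: SketchStatus and ParseUtilSketch are hypothetical stand-ins, and the toString() format is only modelled on the two log lines quoted in this issue. It assumes the reason is lost because a wrapper rebuilds a generic failure instead of handing back the status the parser produced.

```java
// Hypothetical stand-in for a parse status; NOT Nutch's ParseStatus.
final class SketchStatus {
    private final boolean success;
    private final int majorCode;
    private final int minorCode;
    private final String[] args; // human-readable reason(s) supplied by a parser

    private SketchStatus(boolean success, int majorCode, int minorCode, String... args) {
        this.success = success;
        this.majorCode = majorCode;
        this.minorCode = minorCode;
        this.args = args.clone();
    }

    static SketchStatus success() {
        return new SketchStatus(true, 0, 0);
    }

    static SketchStatus failed(int majorCode, int minorCode, String... reason) {
        return new SketchStatus(false, majorCode, minorCode, reason);
    }

    boolean isSuccess() {
        return success;
    }

    // Assumed format, modelled on "failed(2,202): <reason>" in the log output.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(success ? "success" : "failed");
        sb.append('(').append(majorCode).append(',').append(minorCode).append(')');
        if (args.length > 0) {
            sb.append(": ").append(String.join(" ", args));
        }
        return sb.toString();
    }
}

// Hypothetical wrapper showing a lossy and a reason-preserving code path.
final class ParseUtilSketch {
    // Lossy: replaces any failure with a generic one, discarding the reason.
    static SketchStatus lossy(SketchStatus fromParser) {
        return fromParser.isSuccess() ? fromParser : SketchStatus.failed(0, 0);
    }

    // Preserving: returns the parser's own status unchanged.
    static SketchStatus preserving(SketchStatus fromParser) {
        return fromParser;
    }

    public static void main(String[] argv) {
        SketchStatus fromParser =
            SketchStatus.failed(2, 202, "Content truncated at 102013 bytes.");
        // prints "Error parsing: failed(0,0)"
        System.out.println("Error parsing: " + lossy(fromParser));
        // prints "Error parsing: failed(2,202): Content truncated at 102013 bytes."
        System.out.println("Error parsing: " + preserving(fromParser));
    }
}
```

The point of the sketch is only that whichever layer sits between the parser and the caller has to pass the parser's status object (or at least its message arguments) through untouched, otherwise the caller can log nothing more useful than the bare codes.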
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
