[ http://issues.apache.org/jira/browse/NUTCH-379?page=all ]
Work on NUTCH-379 started by Chris A. Mattmann.

> ParseUtil does not pass through the content's URL to the ParserFactory
> ----------------------------------------------------------------------
>
>                 Key: NUTCH-379
>                 URL: http://issues.apache.org/jira/browse/NUTCH-379
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8, 0.9.0, 0.8.1
>         Environment: Power Mac Dual G5, 2.0 GHz, although fix is independent
> of environment
>            Reporter: Chris A. Mattmann
>         Assigned To: Chris A. Mattmann
>             Fix For: 0.8, 0.9.0, 0.8.1, 0.8.2
>
>
> Currently the ParseUtil class, which the Fetcher calls to actually parse
> content, does not forward the content's URL through to the ParserFactory.
> A bigger issue, however, is that the URL (and, for that matter, the
> pathSuffix) is no longer used to determine which parsing plugin should be
> called. My colleague at JPL discovered that larger bug and will soon file a
> JIRA issue for it. In the meantime, this small patch at least sets up the
> forwarding of the content's URL to the ParserFactory.
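
The forwarding problem described in the report can be illustrated with a minimal, self-contained Java sketch. It is not the Nutch source or the attached patch. The classes `SketchContent`, `SketchParserFactory` and `SketchParseUtil` are hypothetical stand-ins, and the two-argument `getParsers(contentType, url)` signature and the empty-string placeholder in the "before" method are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for fetched content: a URL plus a content type.
final class SketchContent {
    private final String url;
    private final String contentType;

    SketchContent(String url, String contentType) {
        this.url = url;
        this.contentType = contentType;
    }

    String getUrl() { return url; }
    String getContentType() { return contentType; }
}

// Hypothetical stand-in for a parser factory. It records every URL it is
// handed so the difference between the two call styles can be observed.
final class SketchParserFactory {
    final List<String> seenUrls = new ArrayList<>();

    // Assumed signature: the factory is given both content type and URL.
    String[] getParsers(String contentType, String url) {
        seenUrls.add(url);
        return new String[] { "parser-for-" + contentType };
    }
}

// Hypothetical stand-in for the utility the fetcher calls to parse content.
final class SketchParseUtil {
    private final SketchParserFactory factory;

    SketchParseUtil(SketchParserFactory factory) {
        this.factory = factory;
    }

    // Behaviour the report describes: the content's URL never reaches the
    // factory (assumed here to be replaced by an empty string).
    String[] selectParsersWithoutUrl(SketchContent content) {
        return factory.getParsers(content.getContentType(), "");
    }

    // Behaviour the patch sets up: the URL is forwarded, with a null guard.
    String[] selectParsersWithUrl(SketchContent content) {
        String url = content.getUrl() != null ? content.getUrl() : "";
        return factory.getParsers(content.getContentType(), url);
    }
}
```

In this sketch, forwarding the URL does not by itself change which parser is selected, which mirrors the report's point that using the URL (or pathSuffix) for plugin selection is a separate, larger issue.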