[ http://issues.apache.org/jira/browse/NUTCH-34?page=comments#action_63033 ]

Stephan Strittmatter commented on NUTCH-34:
-------------------------------------------
Do you think the boolean is really required, or should it work the same way as in the file-/http-plugins, where the value "0" implies "whole content is required"? What do you think?

> Parsing different content formats
> ---------------------------------
>
>          Key: NUTCH-34
>          URL: http://issues.apache.org/jira/browse/NUTCH-34
>      Project: Nutch
>         Type: Improvement
>   Components: fetcher
>     Reporter: Stephan Strittmatter
>     Priority: Trivial
>
> At the moment Nutch is set up to filter content through the XML config file.
> The number of bytes to load is also set globally there.
> I think it would be better to let the parser plugins "register" themselves in
> some registry where every plugin could tell the fetcher that:
> 1. this document type is wanted (because the parser plugin is
>    installed and activated)
> 2. how much of the content is required (some plugins need the whole
>    content and some do not)
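
To make the proposal concrete, here is a rough sketch of what such a registry could look like without the extra boolean. It is only an illustration: the class ContentRequirementRegistry, its methods, and the WHOLE_CONTENT constant are hypothetical names, not existing Nutch API, and it assumes the "0 means whole content" convention described above.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry (not part of Nutch): parser plugins announce which
 * content types they handle and how many bytes of each document they need.
 */
final class ContentRequirementRegistry {

    /** Assumed convention from the comment above: 0 means "whole content is required". */
    static final long WHOLE_CONTENT = 0L;

    private final Map<String, Long> requiredBytesByType = new ConcurrentHashMap<>();

    /** Called by a parser plugin when it is installed and activated. */
    void register(String contentType, long requiredBytes) {
        if (requiredBytes < 0) {
            throw new IllegalArgumentException("requiredBytes must be >= 0");
        }
        // If several plugins register the same type, keep the larger demand.
        requiredBytesByType.merge(contentType, requiredBytes,
                (a, b) -> larger(a, b));
    }

    /** Point 1 of the proposal: is this document type wanted at all? */
    boolean isWanted(String contentType) {
        return requiredBytesByType.containsKey(contentType);
    }

    /** Point 2 of the proposal: how much content is required; empty if the type is unwanted. */
    Optional<Long> requiredBytes(String contentType) {
        return Optional.ofNullable(requiredBytesByType.get(contentType));
    }

    /** WHOLE_CONTENT always wins; otherwise the larger byte count is kept. */
    private static long larger(long a, long b) {
        if (a == WHOLE_CONTENT || b == WHOLE_CONTENT) {
            return WHOLE_CONTENT;
        }
        return Math.max(a, b);
    }
}
```

With this shape the fetcher would ask isWanted(type) before downloading and then read requiredBytes(type) as its limit, treating 0 as "do not truncate". A single number covers both cases, so no separate boolean flag is needed.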
