Jérôme, I think this is a great idea: it ensures that so-called "management information" isn't replicated across the system. It could easily be implemented as a utility method, because we already have utility Java classes that represent the ParsePluginList, from which you could get the mimeTypes. Additionally, we could create a utility method that searches the extension point list for parsing plugins and returns a boolean indicating whether each one is activated. Using this information, I believe the URL filtering would be a snap.
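As a rough illustration of the two utility methods described above, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: the class name `ExtensionFilterSketch`, its method names, and the use of plain maps in place of ParsePluginList, the plugin repository, and the mime-type registry are all hypothetical and are not the actual Nutch API. Letting URLs with no file extension through is also an assumption, since the proposal does not say how they should be handled.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of the Nutch code base. */
final class ExtensionFilterSketch {

    private ExtensionFilterSketch() {}

    /**
     * Returns true if at least one plugin listed for the mime type is activated.
     * parsePlugins stands in for the mimeType -> plugin ids mapping of parse-plugins.xml.
     */
    static boolean isParsed(String mimeType,
                            Map<String, List<String>> parsePlugins,
                            Set<String> activatedPlugins) {
        List<String> plugins = parsePlugins.get(mimeType);
        if (plugins == null) {
            return false;
        }
        for (String pluginId : plugins) {
            if (activatedPlugins.contains(pluginId)) {
                return true;
            }
        }
        return false;
    }

    /** Collects the file extensions of every mime type that an activated plugin can parse. */
    static Set<String> allowedExtensions(Map<String, List<String>> parsePlugins,
                                         Set<String> activatedPlugins,
                                         Map<String, List<String>> extensionsByMimeType) {
        Set<String> allowed = new TreeSet<>();
        for (String mimeType : parsePlugins.keySet()) {
            if (!isParsed(mimeType, parsePlugins, activatedPlugins)) {
                continue;
            }
            List<String> extensions = extensionsByMimeType.get(mimeType);
            if (extensions != null) {
                for (String ext : extensions) {
                    allowed.add(ext.toLowerCase(Locale.ROOT));
                }
            }
        }
        return allowed;
    }

    /**
     * Accepts a URL if its path has no extension (assumption) or an allowed one.
     * Malformed URLs and URLs without a path are rejected.
     */
    static boolean accept(String url, Set<String> allowed) {
        String path;
        try {
            path = URI.create(url).getPath();
        } catch (IllegalArgumentException e) {
            return false;
        }
        if (path == null) {
            return false;
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return true; // assumption: extension-less URLs are fetched
        }
        return allowed.contains(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

With only an HTML parser activated, for example, `allowedExtensions` would yield the HTML extensions, and `accept` would then reject a `.pdf` URL while still passing `.html` ones.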
+1

Cheers,
Chris

On 12/1/05 12:11 PM, "Jérôme Charron" <[EMAIL PROTECTED]> wrote:

> Suggestion:
> For consistency, and ease of Nutch management, why not filter the
> extensions based on the activated plugins?
> By looking at the mime-types defined in the parse-plugins.xml file and the
> activated plugins, we know which content-types will be parsed.
> So, by getting the file extensions associated with each content-type, we can
> build a list of file extensions to include (other ones will be excluded) in
> the fetch process.
> No?
>
> Jérôme
>
> --
> http://motrech.free.fr/
> http://www.frutch.org/

______________________________________________
Chris A. Mattmann
[EMAIL PROTECTED]
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group
_________________________________________________
Jet Propulsion Laboratory
Pasadena, CA
Office: 171-266B
Mailstop: 171-246
_______________________________________________________

Disclaimer: The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
