Stefan Groschupf wrote:
Before we start adding metadata and more metadata, why not add metadata to the CrawlDatum once, in general? Then we can have any kind of plugin that adds and processes metadata belonging to a URL.

+1

This feature strikes me as something that might prove very useful, but might also prove unworkable, or at least not useful to everyone. Thus it would be best if it doesn't require changes to a core class like CrawlDatum. If it does eventually prove generally useful, as something that everyone will use and that should be enabled by default, then we could promote its data from metadata to a field for efficiency.

In this vein, should modifiedTime be moved to metadata, once metadata is added?
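
To make the idea concrete, here is a rough sketch of what generic per-URL metadata might look like, with modifiedTime kept as a metadata entry behind a typed accessor. This is only an illustration, not Nutch code: SketchDatum, putMeta, getMeta and the "modifiedTime" key are all made-up names, and a real CrawlDatum would also need to serialize the map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CrawlDatum; NOT the actual Nutch class.
class SketchDatum {
    // Assumed key name, chosen only for this illustration.
    static final String MODIFIED_TIME_KEY = "modifiedTime";

    private final String url;
    // Generic key/value metadata that any plugin could read or write.
    private final Map<String, String> metadata = new HashMap<>();

    SketchDatum(String url) {
        this.url = url;
    }

    String getUrl() {
        return url;
    }

    void putMeta(String key, String value) {
        metadata.put(key, value);
    }

    // Returns null when no value is stored under the key.
    String getMeta(String key) {
        return metadata.get(key);
    }

    // modifiedTime stored as metadata rather than as a dedicated field.
    void setModifiedTime(long millis) {
        metadata.put(MODIFIED_TIME_KEY, Long.toString(millis));
    }

    // Returns 0 when no modified time has been recorded.
    long getModifiedTime() {
        String raw = metadata.get(MODIFIED_TIME_KEY);
        return raw == null ? 0L : Long.parseLong(raw);
    }
}
```

Callers that use the typed accessors would not notice if such a value were later promoted from the map to a real field for efficiency; only the body of the accessors would change.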

Cheers,

Doug


_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
