I'm confused, since we have two getMetaData methods. I guess this method gives information like the encoding?
I would find it good if we could have a getFetchMetaData and a getContentMetaData.
I guess this was confusing. I'll rename these to getFetchMetaData() and getParseMetaData(). Note that if fetch meta data is needed downstream, then either the parser can copy it into the parse meta data, or we could change the DocumentFactory so that the FetcherContent is also passed in. Unfortunately the latter has performance implications, as the raw content is not otherwise needed when indexing, so I'd rather just have things copied into the parse meta data. By default we can copy, e.g., the contentType. Plugins can then decide whether they want to copy anything else. How does that sound?
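
To make the copy-by-default idea concrete, here is a rough sketch. It assumes both kinds of meta data are plain java.util.Properties. The class names (FetchedContentSketch, ParseResultSketch, ParserSketch), the "contentType" key and the extraKeys parameter are hypothetical stand-ins for illustration, not the actual Nutch classes or signatures.

```java
import java.util.Properties;

// Hypothetical stand-in for the raw fetched content (not the real Nutch class).
class FetchedContentSketch {
    private final Properties fetchMetaData = new Properties();

    // Meta data gathered at fetch time, e.g. protocol headers.
    Properties getFetchMetaData() { return fetchMetaData; }
}

// Hypothetical stand-in for the parser's output.
class ParseResultSketch {
    private final Properties parseMetaData = new Properties();

    // Meta data available downstream, e.g. when indexing.
    Properties getParseMetaData() { return parseMetaData; }
}

// Hypothetical parser that copies selected fetch meta data into the parse meta data.
class ParserSketch {
    // Assumed key name; the real key may differ.
    static final String CONTENT_TYPE = "contentType";

    // Copies contentType by default; a plugin can name extra keys to copy.
    static ParseResultSketch parse(FetchedContentSketch content, String... extraKeys) {
        ParseResultSketch result = new ParseResultSketch();
        copy(content, result, CONTENT_TYPE);
        for (String key : extraKeys) {
            copy(content, result, key);
        }
        return result;
    }

    // Copies one key if it is present in the fetch meta data.
    private static void copy(FetchedContentSketch from, ParseResultSketch to, String key) {
        String value = from.getFetchMetaData().getProperty(key);
        if (value != null) {
            to.getParseMetaData().setProperty(key, value);
        }
    }
}
```

With something like this, the indexing side only ever sees the parse meta data, so the raw content does not need to be passed through the DocumentFactory.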
Doug
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
