Kevin MacDonald wrote:
Which is better for overall performance? To parse during fetching or afterward?
It's slightly faster to parse during fetching. BUT if a parser crashes or hits an out-of-memory (OOM) exception, you are left with neither the content nor the parsed text. If you fetch first and parse afterward, you at least already have the content and can re-run the parse job. In any case, fetching content from remote sites is usually the bottleneck.
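To make the trade-off concrete, here is a toy sketch in Python. It is not the crawler's actual code or API: the function names (parse_while_fetching, fetch_then_parse, reparse) and the fetch/parse callables are hypothetical, and the in-memory dicts stand in for whatever storage the real jobs write to. It only illustrates why keeping the raw content lets you re-run parsing without fetching again.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Fetch = Callable[[str], str]   # hypothetical: url -> raw content
Parse = Callable[[str], str]   # hypothetical: raw content -> parsed text


def parse_while_fetching(urls: Iterable[str], fetch: Fetch,
                         parse: Parse) -> Dict[str, Tuple[str, str]]:
    """One pass: a record is kept only if fetch AND parse both succeed."""
    results = {}
    for url in urls:
        raw = fetch(url)
        try:
            results[url] = (raw, parse(raw))
        except Exception:
            # The parse failed, so the raw content is dropped with it.
            pass
    return results


def fetch_then_parse(urls: Iterable[str], fetch: Fetch, parse: Parse):
    """Two passes: content is stored first, then parsed separately."""
    content = {url: fetch(url) for url in urls}   # stage 1: fetch and keep
    parsed, failed = {}, []
    for url, raw in content.items():              # stage 2: parse
        try:
            parsed[url] = parse(raw)
        except Exception:
            failed.append(url)                    # content is still stored
    return content, parsed, failed


def reparse(content: Dict[str, str], failed: List[str],
            parse: Parse) -> Dict[str, str]:
    """Re-run parsing for failed URLs from stored content, no refetch."""
    return {url: parse(content[url]) for url in failed}
```

With the two-pass version, a page whose parse failed still has its raw content in `content`, so `reparse` can be called later with a fixed parser and no network access. With the one-pass version, that page has to be fetched again.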
--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
