Is there a way of parallelizing URLFiltering over multiple threads? After
all, the URLFilters themselves must already be thread-safe, or else they
would have problems during fetching.
I'm asking because I have a custom URLFilter that needs to make calls to
the DNS resolver, and multi-threading the URLFiltering would greatly speed
up some filtering steps that, unlike fetching, appear to be
single-threaded: "mergedb -filter", inject, generate, "updatedb -filter"
etc. (The most important is of course "generate" or, even better,
"updatedb -filter", to prevent undesired URLs from reaching the crawldb in
the first place.)
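
To make the idea concrete, here is a rough sketch of what I have in mind,
in plain Java. It is not existing Nutch code: the class ParallelFilterSketch
and the method filterAll are hypothetical names, and the filter is passed in
as a generic function that returns the URL to keep or null to reject it
(the same convention as URLFilter.filter). It assumes the filter is
thread-safe and spends most of its time blocked, as it would on DNS lookups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Nutch: runs one filter over many URLs
// on a fixed-size thread pool.
public class ParallelFilterSketch {

    // The filter returns the URL to keep, or null to reject it.
    // It must be safe to call from several threads at once.
    public static List<String> filterAll(List<String> urls,
                                         UnaryOperator<String> filter,
                                         int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per URL; slow lookups overlap across the pool.
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> filter.apply(url));
            }

            List<String> accepted = new ArrayList<>();
            // invokeAll waits for every task and returns the futures
            // in the same order as the input list.
            for (Future<String> future : pool.invokeAll(tasks)) {
                String result = future.get();
                if (result != null) {
                    accepted.add(result);
                }
            }
            return accepted;
        } finally {
            pool.shutdown();
        }
    }
}
```

Something like this would presumably have to be wired into each tool that
applies the filters, which is the part I don't know how to do.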
Enzo
- Parallelizing URLFiltering Enzo Michelangeli