> Right, but the URL filters run long before we know the mime type, in
> order to try to keep us from fetching lots of stuff we can't process.
> The mime type is not known until we've fetched it.

Yes, the fetcher can't rely on the document's mime type, since the only thing we can use for filtering is the document's URL. So another alternative could be to exclude only those file extensions that are registered in the mime-type repository (the well-known file extensions) but for which no parser is activated, and to accept all other ones. That way the .foo files will be fetched...

Jérôme
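
To make the proposed rule concrete, here is a rough sketch in Python. It is only an illustration, not the actual filter code: the names `KNOWN_EXTENSIONS`, `ACTIVE_PARSER_TYPES` and `should_fetch` are hypothetical, and the two tables are hard-coded stand-ins for the mime-type repository and the list of activated parsers.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical stand-in for the mime-type repository:
# well-known file extensions mapped to their mime types.
KNOWN_EXTENSIONS = {
    "html": "text/html",
    "pdf": "application/pdf",
    "doc": "application/msword",
    "zip": "application/zip",
}

# Hypothetical stand-in for the mime types that have an activated parser.
ACTIVE_PARSER_TYPES = {"text/html", "application/pdf"}


def should_fetch(url):
    """Decide from the URL alone whether a document should be fetched."""
    # Only the path is inspected; the query string is ignored.
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()

    mime = KNOWN_EXTENSIONS.get(ext)
    if mime is None:
        # Unknown (or missing) extension: accept it, so .foo is fetched.
        return True
    # Known extension: fetch only if a parser is activated for its type.
    return mime in ACTIVE_PARSER_TYPES
```

With these example tables, `should_fetch("http://example.com/a.foo")` accepts the URL because `.foo` is not registered, while `should_fetch("http://example.com/a.zip")` rejects it because `.zip` is registered but has no activated parser.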
