> So it simply has empty refreshHref so some check before invoking
> toString() is needed. I am not sure what should be done here right now
> - but probably this case should not be handled as redirect.

Piotr,
looking at the code and at the meta refresh tag, it seems there's a bug in 
HTMLMetaProcessor.
At line 115, the code checks whether only the refresh time is specified (no 
URL), and if so it sets the time to the whole content.
But in that case (no URL specified), the refresh URL should be the current 
page, which is never set in our case. The code should become:

if (idx == -1) { // just the refresh time
    time = content;
    refreshUrl = currURL;
} else {
    time = content.substring(0, idx);
}

But does this code make sense for a search engine? Refreshing the same URL 
is useful for a browser, but not really for a fetcher.
So perhaps the best solution is to setRefresh(false) in such a case. The 
code would become something like:

String time = null;
if (idx != -1) { // not just the refresh time: a URL part follows
    time = content.substring(0, idx);
    try {
        metaTags.setRefreshTime(Integer.parseInt(time));
        // Now try to retrieve the refresh URL
        // (cut/paste the original code at line 125)
        metaTags.setRefresh(true);
    } catch (Exception e) {
        // not a valid refresh time: leave refresh unset
    }
}
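
To make the idea concrete, here is a rough self-contained sketch of it. It is not the actual HTMLMetaProcessor code: the class MetaRefreshSketch and its fields are hypothetical stand-ins for the refresh fields of metaTags, and the URL extraction is a simplified assumption, not the original code at line 125.

```java
import java.util.Locale;

/** Hypothetical stand-in for the refresh-related fields of the meta tags object. */
final class MetaRefreshSketch {
    boolean refresh = false;    // true only when a redirect target was found
    int refreshTime = 0;        // delay in seconds
    String refreshHref = null;  // redirect target, null when absent

    /** Parses the content attribute of a meta refresh tag, e.g. "5; URL=http://example.com/". */
    static MetaRefreshSketch parse(String content) {
        MetaRefreshSketch tags = new MetaRefreshSketch();
        if (content == null) {
            return tags;
        }
        int idx = content.indexOf(';');
        if (idx == -1) {
            // Only a refresh time: the page reloads itself, so do not treat it as a redirect.
            return tags;
        }
        try {
            tags.refreshTime = Integer.parseInt(content.substring(0, idx).trim());
        } catch (NumberFormatException e) {
            // Not a valid refresh time: leave refresh unset.
            return tags;
        }
        // Simplified URL extraction (assumption): strip an optional "url=" prefix.
        String rest = content.substring(idx + 1).trim();
        if (rest.toLowerCase(Locale.ROOT).startsWith("url=")) {
            rest = rest.substring(4).trim();
        }
        if (!rest.isEmpty()) {
            tags.refreshHref = rest;
            tags.refresh = true;
        }
        return tags;
    }
}
```

With this, a time-only tag such as content="30" leaves refresh false and refreshHref null, so a caller never ends up calling toString() on an empty href.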

Sorry, no time today to provide a full patch .... just some ideas.

Regards

Jerome

-- 
http://motrech.free.fr/
http://frutch.free.fr/
