On 17/03/13 23:50, Juan Miguel Taboada Godoy wrote:
> Hello:
>
> This patch prevent wget from stopping when -nc argument is in use and
> file in the disk (from a previous download) has a name which doesn't
> finish with htm or html.
>
> I detected this bug when downloading a website with this URL:
> http://www.abc.com/dirdir/cgi-bin/Search.php?lng=EN&search=Query_Search_List

Wrong URL?

> This was saved in the disk at:
> dirdir/cgi-bin/
> as a file named:
> Search.php?lng=EN&search=Query_Search_List
>
> When I was using -nc argument, wget couldn't detect that this file could
> have links inside.
>
> Because I believe wget should check most of the files as text/html just
> in case there are some links inside to visit, I repaired this problem
> not checking for htm or html suffix inside the name of the file.
>
> Sincerely,

Usually the problem is the opposite: wget checks too many files.
One problem this change could cause is that, when you have big binary
files (e.g. 16GB), wget dies trying to parse them.
Perhaps we could add a --expect-html-everywhere option :/

> Spanish law protects the secrecy of communications. This
> message is intended exclusively for its addressee and may contain
> privileged or CONFIDENTIAL information. (...)

You are sending your patch for a GPL program to a publicly archived
mailing list...

Regards
