Hi, I'm using wget to recursively download content from a bunch of sites.
The command line is "wget -x -r -l1 [url]"
I have a problem with one URL:
http://olhardigital.uol.com.br/ultimas_noticias/1
If I run wget with these parameters on this URL, it reports
"No such file or directory"
for every file inside the
'olhardigital.uol.com.br/produtos/digital_news/' directory.
This happens because 'olhardigital.uol.com.br/produtos/digital_news' is
also an HTML page, which wget has already saved as a regular file, so it
can't create a 'digital_news' directory in the file system.
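To illustrate the clash outside of wget, here is a minimal Python sketch,
not wget's actual code. It assumes a POSIX-like file system; the helper
names 'save' and 'demo' and the file name 'item.html' are made up for the
example. The exact error differs from the one wget prints, but the cause
is the same: a path component that should be a directory already exists
as a regular file.

```python
import os
import tempfile


def save(root, rel_path, data):
    """Write data to root/rel_path, creating parent directories first."""
    full = os.path.join(root, rel_path)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as fh:
        fh.write(data)


def demo():
    """Return the name of the error raised by the second save, or None."""
    root = tempfile.mkdtemp()
    # The page itself is saved first, as a regular file named 'digital_news'.
    save(root, "olhardigital.uol.com.br/produtos/digital_news", "<html></html>")
    try:
        # A file below it now needs 'digital_news' to be a directory.
        # 'item.html' is a hypothetical name used only for this example.
        save(
            root,
            "olhardigital.uol.com.br/produtos/digital_news/item.html",
            "<html></html>",
        )
    except OSError as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(demo())
```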
I can't drop the directory structure (-x option) because I need to know
the URL of each downloaded file for further processing.
Is there a way to circumvent this file/directory name clash? Or a way to
retrieve the original URL of a file without relying on the directory
structure?
To reproduce the problem, run:
wget -x -r -l1 http://olhardigital.uol.com.br/ultimas_noticias/1
Regards.
Islon Scherer