On Thu, Apr 30, 2009 at 3:14 AM, Petr Pisar <[email protected]> wrote:
> On Wed, Apr 29, 2009 at 06:50:11PM -0500, Jake b wrote:
> > Instead of creating something like: "912.html" or "index.html" it
> > instead becomes:
> > "viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330"
> >
> That's normal because the server doesn't provide any useful alternative
> name via HTTP headers, which can be obtained using wget's option
> "--content-disposition".

I already know how to get the page number (my python script converts 27330
to 912 and back), but I'm not sure how to tell wget what the output html
file should be named.
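(Looking at your pipeline below, I guess the answer is -O with a real
filename instead of "-", something like:

    wget -O 912.html \
      'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'

If that's right, my script can just generate the name.)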
> > How do I make wget download all images on the page? I don't want to
> > recurse other hosts, or even sijun, just download this page, and all
> > images needed to display it.
> >
> That's not an easy task, especially because all the big desktop images
> are stored on other servers. I think wget is not powerful enough to do
> it all on its own.

Are you saying that because some services show a thumbnail you have to
click through to the full image? I'm not worried about that, since the
majority are full size in the thread.

Would it be simpler to say something like: download page 912 with
recursion level 1 (or 2?), except for non-image links, so it only recurses
into images, i.e. downloading "randomguyshost.com/3.png". But the problem
is that it does not span any hosts? Is there a way I can achieve this if I
do the same, except allow spanning all hosts, recursion level 1, and
recursing only into images?

> I propose using other tools to extract the image URLs and then to
> download them using wget. E.g.:
>
> wget -O -
> 'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'
> | grep -o -E 'http:\/\/[^"]*\.(jpg|jpeg|png)' | wget -i -

Ok, will have to try it out. ( In windows ATM so I can't pipe. )
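If I'm reading the man page right, -p (page requisites, which should
include inline images) combined with -H (span hosts) might even do it in
one shot; totally untested, and -k is just to rewrite the links for local
viewing:

    wget -p -H -k \
      'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'

And if pipes are the sticking point on windows, I suppose I can split your
pipeline into separate steps with a temp file instead (assuming I install
grep there, e.g. from Cygwin):

    wget -O 912.html \
      'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'
    grep -o -E 'http://[^"]*\.(jpg|jpeg|png)' 912.html > urls.txt
    wget -i urls.txt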
> Actually, I suppose you use some unix environment, where you have
> available a powerful collection of external tools (grep, seq) and
> amazing shell scripting abilities (like colons and loops).
>
> --
> Petr

Using python, and I have dual boot if needed.

--
Jake