On Mon, Dec 29, 2008 at 5:34 PM, kish <[email protected]> wrote:
> hi
>
> I wish to download a bunch of files from a site which allows directory
> listing, with links like
>
> http://38.106.111.111/folder/sub%20folder/sub%20folderl%201/zYWIxNDQ2ODliZDhhMTc4NzRjMGQwMzY/file%201.zip
> .......
> ......
> http://38.106.111.111/folder/sub%20folder/sub%20folderl%205/afkafkajfkljlkfjiejjflkjlFJLKJFLAY/file%205.zip
>
> As you can see, the second segment of the link changes with every link.
> The file name is a combination of the sub-folder name and .zip.

I see such links only on sites that don't want you to do what you just
mentioned you need to do. For example, Rapidshare uses them to prevent the
use of download managers other than their own; they also use them to
prevent resuming downloads. SDN uses them to stop someone from repeatedly
downloading a file and using up their bandwidth, and to promote their own
Sun Download Manager, which can stop and resume downloads. I can do the
same with aria2c as long as the download is resumed within a day or two.
(I can't think of a different reason why Sun does that.)

> Could anybody point me to a library or modules to explore a given
> website to gather all the links in the site?
>
> After which I can apply a filter to extract only those I need and pass
> the list to wget or the like

In either case, you were not meant to download it without human
intervention.

---
Ashok `ScriptDevil` Gautham

_______________________________________________
To unsubscribe, email [email protected] with
"unsubscribe <password> <address>" in the subject or body of the message.
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
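P.S. For the original question — gathering all the links on a page so they can be filtered and handed to wget or aria2c — a minimal sketch using only the Python standard library (html.parser and urllib) might look like the following. The sample HTML, start URL, and .zip filter are illustrative assumptions, not taken from kish's actual site:

```python
# Minimal link collector using only the Python standard library.
# Hypothetical example: the base URL and the .zip filter below are
# placeholders, not the real site structure from the original mail.
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # urljoin turns relative hrefs into absolute URLs
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link URLs found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # In real use you would fetch each page with urllib.request.urlopen()
    # and recurse into directory listings; here we parse a static sample.
    sample = '<a href="sub%20folder/">dir</a> <a href="file%201.zip">zip</a>'
    links = extract_links(sample, "http://38.106.111.111/folder/")
    zips = [u for u in links if u.endswith(".zip")]
    print(zips)  # the filtered list could then be fed to wget/aria2c
```

Of course, if the token in the middle of each URL expires, the list would have to be re-crawled just before downloading, which is exactly the human-intervention point made above.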
