On 01/08/15 11:48, Gaurav Lathwal wrote:

I want to write a script that automatically downloads all the videos hosted
on this site :-

http://www.toonova.com/batman-beyond

The first thing to ask is whether they allow robotic downloads
from the site. If they are funded by advertising then they may
not permit it, and it would be self-defeating to try, since you
would be helping to close down your source!

Now, the problem I am having is, I am unable to fetch the video urls of all
the videos.

I assume you want to fetch the videos not just the URLs?
Fetching the URLs is easy enough and I doubt the site would object
too strongly. But fetching the videos is much harder since:

a) The page you give only has links to separate pages for each
   video.
b) The separate pages have a download link which is to a
   tiny url which may well change.
c) The separate page is not static HTML (or even server-generated
   HTML); it is built in part by JavaScript code when the page
   loads. That means it is very likely to change on each load
   (possibly deliberately, to foil robots!)

I mean I can manually fetch the video urls using the Chrome
developer console, but it's too time-consuming.
Is there any way to just fetch all the video urls using BeautifulSoup ?

It's probably possible as a one-off, but it may not work reliably
for future use. And that's assuming the site allows it in the
first place.
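For the easy part — collecting the per-episode page links from the
listing page — a BeautifulSoup sketch might look like the one below.
Note that the markup here is a stand-in I made up; the real page's
tags, classes and URL scheme are assumptions, so inspect the actual
page in the browser's developer tools and adjust the selector:

```python
from bs4 import BeautifulSoup

# A made-up stand-in for the kind of markup the episode list might
# use -- the real structure will differ, so check it yourself.
SAMPLE_HTML = """
<html><body>
  <table class="episode-list">
    <tr><td><a href="/batman-beyond-episode-1">Episode 1</a></td></tr>
    <tr><td><a href="/batman-beyond-episode-2">Episode 2</a></td></tr>
  </table>
</body></html>
"""

def episode_links(html, base="http://www.toonova.com"):
    """Return absolute URLs for every episode link in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector is an assumption; replace it with whatever
    # actually wraps the episode links on the real page.
    return [base + a["href"]
            for a in soup.select("table.episode-list a[href]")]

print(episode_links(SAMPLE_HTML))
```

In practice you would feed it the page body fetched with urllib or
requests instead of SAMPLE_HTML. But remember point (c) above: this
only gets you the episode *pages*; the actual video URLs are built
by JavaScript, which BeautifulSoup never runs.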

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor