On 01/11/2014 00:27, dinkypumpkin wrote:
> I tried this same approach, but it foundered on radio programmes. There
> is just too much stuff there. It's soul-crushingly slow to scrape the
> iPlayer Radio site, at least for a desktop cache. It would be great to
> have everything available on iPlayer searchable off-site, but there is
> too much of it for get_iplayer's current local caching model. I'm going
> to have another go at some point.

There is no real need to download *all* of the schedule information;
after all, only a fraction of it will ever be of any use to an
individual user.
I would use the BBC server to do the search for me, after which there is
little work to be done. For instance, if I look for all Book at Bedtime
episodes with this URL
http://www.bbc.co.uk/radio/programmes/a-z/by/book%20at%20bedtime/player
then I am taken to a page with a link to the series at
http://www.bbc.co.uk/programmes/b006qtlx/episodes/player?page=1
through to `page=6`. That amounts to 52 programmes, and even on my
meagre 13 megabit connection fetching all six pages takes less than ten
seconds. The results could be cached to give a practically instantaneous
response to a similar request in the future. There is also the
possibility of writing a batch solution that makes a query only every
minute or so and could be run continuously or overnight.
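
To make the idea concrete, here is a rough Python sketch of the paging-plus-cache approach, not the half-written proof of concept itself. The function names, the cache layout (one file per URL, named by a SHA-1 of the URL) and the `delay` parameter are all assumptions of mine; only the URL pattern and the programme id `b006qtlx` come from the example above.

```python
import hashlib
import time
import urllib.request
from pathlib import Path

# URL pattern taken from the Book at Bedtime example; everything else is hypothetical.
BASE = "http://www.bbc.co.uk/programmes/{pid}/episodes/player?page={page}"


def page_urls(pid, last_page):
    """Build the episode-list URLs for pages 1..last_page."""
    return [BASE.format(pid=pid, page=n) for n in range(1, last_page + 1)]


def http_get(url):
    """Default fetcher: a plain HTTP GET returning the body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_cached(url, cache_dir, fetch=http_get, delay=0.0):
    """Return the page body, using the on-disk cache when a copy exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # One cache file per URL, named by the hash of the URL (an assumed layout).
    path = cache_dir / hashlib.sha1(url.encode("utf-8")).hexdigest()
    if path.exists():
        return path.read_bytes()
    time.sleep(delay)  # throttle real requests only; cache hits return at once
    body = fetch(url)
    path.write_bytes(body)
    return body
```

Calling `fetch_cached(url, "cache", delay=60)` for each URL from `page_urls("b006qtlx", 6)` would give the slow overnight batch behaviour, while a repeated search is served straight from disk. Cache expiry and parsing the episode links out of the HTML are left out.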
I'm more than happy to write a proof of concept if you're interested. I
have it half-written already just to get that timing information.
The one thing that bothers me is the terms and conditions of the web
site. I scanned through them quickly and couldn't find anything about
robotic access, but it would be a first if there isn't anything there.
If it's just a matter of obeying the /robots.txt then I'm more than
happy to go ahead.
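
If it does come down to robots.txt, the check itself is small. The sketch below uses Python's standard `urllib.robotparser`; the `SAMPLE_RULES` text and the `get_iplayer-poc` user-agent string are made up for illustration and are not the BBC's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Invented rules for illustration only; NOT the real www.bbc.co.uk/robots.txt.
SAMPLE_RULES = """User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; against a live site one would instead
    # call parser.set_url(".../robots.txt") followed by parser.read().
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A scraper would call `allowed(...)` once per URL before fetching and skip anything that comes back `False`. This covers robots.txt only; it says nothing about what the site's terms and conditions permit.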
Let me know how I can help.
Rob
_______________________________________________
get_iplayer mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/get_iplayer