Depending on the language and libraries available, you should try making a
HEAD request for each of the URLs you extract. This returns only
the headers of the endpoint, and among them you should find the
Content-Type header, which carries the MIME type of the content. There are
a lot of video MIME types, but the standard ones share the "video/" prefix,
so they should be easy to spot once you have the headers.
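As a rough sketch of the HEAD-request idea, assuming Python and only its standard library (the function names here are made up for illustration, and some servers reject or mishandle HEAD, so treat a failure as "unknown" rather than "not a video"):

```python
import urllib.request


def is_video_content_type(content_type):
    """Return True if a Content-Type header value names a video MIME type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime.startswith("video/")


def head_content_type(url, timeout=10):
    """Send a HEAD request and return the Content-Type header, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")


def url_is_video(url):
    """Combine the two helpers; network errors propagate to the caller."""
    return is_video_content_type(head_content_type(url))
```

Only url_is_video() and head_content_type() touch the network. The prefix check is a heuristic: some video is served under other types (for example application/octet-stream), so this will miss those.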
On May 31, 1:07 pm, Nick Arnett nick.arn...@gmail.com wrote:
On Sun, May 31, 2009 at 4:53 AM, grand_unifier jijodasgu...@gmail.com wrote:
I have written code to get all tweets that have URLs in them, in Atom
or JSON format.
Now I want a way to:
1. separate the URLs from the tweets, the way TweetMeme does
2. find out if a URL represents a video
How would I do that?
I don't think anyone can answer this in detail without knowing what language
you are writing this code in. You should be able to use a regular
expression to extract the URLs and then use the file extension to detect
whether or not each one is a direct link to a video file. But if it is a
link to a page that contains a video, you'll have to fetch the page and
examine its links.
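The regular-expression and file-extension steps could look roughly like this in Python. The pattern is deliberately loose, and the extension list is an assumption for illustration, not a complete set:

```python
import posixpath
import re
from urllib.parse import urlparse

# Loose pattern: "http://" or "https://" followed by non-whitespace characters.
URL_RE = re.compile(r"https?://\S+")

# Hypothetical, incomplete list of extensions treated as video files.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".flv", ".wmv", ".webm", ".mpg", ".mpeg"}


def extract_urls(text):
    """Return the URLs found in text, minus trailing sentence punctuation."""
    return [match.rstrip(".,;:!?)\"'") for match in URL_RE.findall(text)]


def has_video_extension(url):
    """Return True if the URL path ends in a known video file extension."""
    path = urlparse(url).path  # ignores the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension in VIDEO_EXTENSIONS
```

For example, extract_urls("see http://example.com/clip.mp4, it's great") gives a one-element list whose URL passes has_video_extension(). Shortened URLs will not carry an extension, so they need to be resolved first.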
There are some URL patterns that you probably can assume point to pages that
contain video, such as YouTube URLs.
Nick
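The URL-pattern shortcut Nick mentions might be sketched like this in Python for YouTube. The specific patterns checked (a "/watch" path with a "v" parameter, and the youtu.be short host) are assumptions about what YouTube links look like, not something stated in the thread, and other hosts or subdomains would need their own rules:

```python
from urllib.parse import parse_qs, urlparse


def looks_like_youtube_video(url):
    """Guess from the URL alone whether it points at a YouTube video page."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtube.com":
        # Assumed pattern: watch pages carry the video id in the "v" parameter.
        return parts.path == "/watch" and "v" in parse_qs(parts.query)
    if host == "youtu.be":
        # Assumed pattern: the short host puts the video id in the path.
        return len(parts.path) > 1
    return False
```

A check like this avoids a network round trip for the common case, with the HEAD request or page fetch kept as the fallback for everything else.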