On Wednesday, November 25, 2015 at 3:42:21 PM UTC-5, ryguy7272 wrote:
> Hello experts. I'm looking at this url:
> https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names
>
> I'm trying to figure out how to list all 'a title' elements. For instance, I
> see the following:
> <a title="Accident, Maryland" href="/wiki/Accident,_Maryland">Accident</a>
> <a class="new" title="Ala-Lemu (page does not exist)"
> href="/w/index.php?title=Ala-Lemu&action=edit&redlink=1">Ala-Lemu</a>
> <a title="Alert, Nunavut" href="/wiki/Alert,_Nunavut">Alert</a>
> <a title="Apocalypse Peaks" href="/wiki/Apocalypse_Peaks">Apocalypse Peaks</a>
>
> So, I tried putting a script together to get 'title'. Here's my attempt.
>
> import requests
> import sys
> from bs4 import BeautifulSoup
>
> url = "https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names"
> source_code = requests.get(url)
> plain_text = source_code.text
> soup = BeautifulSoup(plain_text)
> for link in soup.findAll('title'):
>     print(link)
>
> All that does is get the title of the page. I tried to get the links from
> that url, with this script.
>
> import urllib2
> import re
>
> #connect to a URL
> website = urllib2.urlopen('https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names')
>
> #read html code
> html = website.read()
>
> #use re.findall to get all the links
> links = re.findall('"((http|ftp)s?://.*?)"', html)
>
> print links
>
> That doesn't work either. Basically, I'd like to see this.
>
> Accident
> Ala-Lemu
> Alert
> Apocalypse Peaks
> Athol
> Å
> Barbecue
> Båstad
> Bastardstown
> Batman
> Bathmen (Battem), Netherlands
> ...
> Worms
> Yell
> Zigzag
> Zzyzx
>
> How can I do that?
> Thanks all!!

Ok, I guess that makes sense. So, I just tried the script below, and got nothing...

    import requests
    from bs4 import BeautifulSoup

    r = requests.get("https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names")
    soup = BeautifulSoup(r.content)
    print soup.find_all("a",{"title"})
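
For what it's worth, here is a minimal sketch of one way the find_all call could be written so that it matches on the title attribute. My understanding is that {"title"} is a set literal rather than a dict, and that BeautifulSoup treats a non-dict second argument as a class filter, which would explain the empty result. The sketch below uses the keyword form title=True, which matches any <a> tag that carries a title attribute. The sample markup is copied from the anchors quoted above, except for the last anchor, which I added as a hypothetical no-title case. The function name titled_links is also just an illustrative name, and the sketch is written for Python 3.

```python
from bs4 import BeautifulSoup

# Sample markup: the first three anchors come from the quoted message;
# the fourth is a hypothetical anchor without a title, added for contrast.
SAMPLE_HTML = """
<a title="Accident, Maryland" href="/wiki/Accident,_Maryland">Accident</a>
<a class="new" title="Ala-Lemu (page does not exist)" href="/w/index.php?title=Ala-Lemu&amp;action=edit&amp;redlink=1">Ala-Lemu</a>
<a title="Alert, Nunavut" href="/wiki/Alert,_Nunavut">Alert</a>
<a href="/wiki/Main_Page">no title attribute here</a>
"""


def titled_links(html):
    """Return (link text, title attribute) pairs for every <a> with a title."""
    # Naming the parser explicitly avoids the "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    # title=True keeps only the tags that have a title attribute at all.
    return [(a.get_text(strip=True), a["title"])
            for a in soup.find_all("a", title=True)]


pairs = titled_links(SAMPLE_HTML)
for text, title in pairs:
    print(text)
```

On the sample this prints Accident, Ala-Lemu and Alert, one per line. To run it against the real page, the text of the requests.get response from the script above could be passed in place of SAMPLE_HTML. The full page will presumably also return navigation and sidebar links that carry a title, so some further filtering would likely be needed to get only the place names.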