Re: [Tutor] python and Beautiful soup question

Mark Lawrence Mon, 22 Jun 2015 06:41:22 -0700

On 22/06/2015 02:41, Alex Kleider wrote:

On 2015-06-21 15:55, Mark Lawrence wrote:

On 21/06/2015 21:04, Joshua Valdez wrote:

I'm having trouble making this script work to scrape information from a
series of Wikipedia articles.


What I'm trying to do is iterate over a series of wiki URLs and pull out
the page links on a wiki portal category (e.g.
https://en.wikipedia.org/wiki/Category:Electronic_design).

I know that all the wiki pages I'm going through have a page links
section. However, when I try to iterate through them I get this error
message:

Traceback (most recent call last):
   File "./wiki_parent.py", line 37, in <module>
     cleaned = pages.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'


Presumably because this line

     pages = soup.find("div" , { "id" : "mw-pages" })


doesn't find anything, pages is set to None and hence the attribute
error on the next line.  I'm suspicious of { "id" : "mw-pages" } as
it's a Python dict comprehension with one entry of key "id" and value
"mw-pages".
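
To make the None case concrete, here is a minimal sketch of guarding the
lookup before calling get_text(). The helper name section_text and its
default div_id are hypothetical, invented for illustration; the only
thing assumed about soup is that it has the find() method used in the
original script.

```python
def section_text(soup, div_id="mw-pages"):
    """Return the text of <div id=div_id>, or None if the page has no such div.

    `section_text` is a hypothetical helper, not part of Beautiful Soup.
    `soup` is assumed to be a BeautifulSoup object (anything with .find()).
    """
    pages = soup.find("div", {"id": div_id})
    if pages is None:
        # find() returns None when nothing matches, so calling
        # pages.get_text() here would raise AttributeError.
        return None
    return pages.get_text()
```

In the scraping loop, a None result could then be reported (for example
by printing the URL) and skipped, which would also show which pages lack
the div.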


Why do you refer to { "id" : "mw-pages" } as a dict comprehension?
Is that what a simple dict declaration is?

No, I'm simply wrong; it's just a plain dict. Please don't ask, as I've no idea how it got into my head :)
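
For anyone else wondering about the difference, a small sketch follows.
The variable names are made up for illustration; both forms build the
same one-entry dict, but only the second is a comprehension.

```python
# A dict literal (display): the key/value pairs are written out directly.
literal = {"id": "mw-pages"}

# A dict comprehension: the pairs are generated by a for clause.
comprehension = {key: value for key, value in [("id", "mw-pages")]}

# Both are ordinary dicts with identical contents.
print(literal == comprehension)
print(type(literal).__name__, type(comprehension).__name__)
```

Either one could be passed as the attrs argument to soup.find(), so the
dict itself is unlikely to be the cause of the None result.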


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
