Re: [Tutor] python and Beautiful soup question

Timo Mon, 22 Jun 2015 03:13:14 -0700

On 21-06-15 at 22:04, Joshua Valdez wrote:

I'm having trouble making this script work to scrape information from a
series of Wikipedia articles.


What I'm trying to do is iterate over a series of wiki URLs and pull out
the page links on a wiki portal category (e.g.
https://en.wikipedia.org/wiki/Category:Electronic_design).

Instead of scraping the webpage, I'd have a look at the API. This might give much better and more reliable results than relying on parsing HTML.


https://www.mediawiki.org/wiki/API:Main_page

You can try out the huge number of different options (with short descriptions) on the sandbox page:


https://en.wikipedia.org/wiki/Special:ApiSandbox
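
As a rough sketch of what the API route could look like for the category mentioned above: the code below asks the API for list=categorymembers and follows the "continue" values until the listing is complete. The helper names (build_params, extract_titles, category_pages) and the User-Agent string are made up for this example, and it uses only the standard library, so treat it as a starting point rather than a finished script.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(category, cont=None):
    """Build the query parameters for one list=categorymembers request."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,  # e.g. "Category:Electronic_design"
        "cmtype": "page",     # only pages, no subcategories or files
        "cmlimit": "500",
        "format": "json",
    }
    if cont:
        # Pass back the "continue" values from the previous response
        # to get the next batch of results.
        params.update(cont)
    return params


def extract_titles(data):
    """Pull the page titles out of one decoded API response."""
    members = data.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


def category_pages(category):
    """Return the titles of all pages in a category (needs network access)."""
    titles = []
    cont = None
    while True:
        query = urllib.parse.urlencode(build_params(category, cont))
        request = urllib.request.Request(
            API_URL + "?" + query,
            # Placeholder User-Agent; replace with something identifying you.
            headers={"User-Agent": "tutor-list-example/0.1"},
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        titles.extend(extract_titles(data))
        cont = data.get("continue")
        if not cont:
            return titles


# Usage (performs real HTTP requests):
#     for title in category_pages("Category:Electronic_design"):
#         print(title)
```

Keeping the parameter building and the title extraction in separate functions means you can check them without touching the network, and you can try the same parameters by hand in the sandbox first.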

Timo





*Joshua Valdez*
*Computational Linguist : Cognitive Scientist*

(440)-231-0479
[email protected] <[email protected]> | [email protected] | [email protected]
<http://www.linkedin.com/in/valdezjoshua/>
_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


