Any help on this is appreciated! How do we scr*pe the content from an entire site? *(I put a * in place of the "a" in scr*pe because spam filters often don't like that word.)*
Through Appropedia's Public Domain Search <http://www.appropedia.org/Public_Domain_Search> (Beta) I've started to identify some sites with a lot of good content on subjects like sustainable agriculture, aid projects and energy efficiency. I'd really like to be able to take everything off a given site, automatically, and put it onto Appropedia, so it can then be wikified (by bot and manually). The strategy is like Wikipedia being populated with the old version of Encyclopaedia Britannica, etc. Once the content is on the site, it's easier for people to improve and expand those pages than to start from scratch.

I'm hoping there's a way of connecting a bot to a tool (such as the Send2Wiki extension <http://www.mediawiki.org/wiki/Extension:Send2Wiki>, or the tools mentioned at Appropedia:Porting formatted content to MediaWiki <http://www.appropedia.org/Appropedia:Porting_formatted_content_to_MediaWiki>), so we can take a whole list or directory of pages from their source all the way to the wiki.

Any ideas? I've asked elsewhere with no luck yet, and Jonathan Gray suggested asking here. Thanks!

--
Chris Watkins (a.k.a. Chriswaterguy)
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
Blog: chriswaterguy.livejournal.com/
Aiming for emails of 5 sentences or less - http://five.sentenc.es/

"They demanded bread and their method of making their protest was to burn down the bakery." - Ortega y Gasset

Buying at Amazon, eBay etc.? Start at http://appropedia.maatiam.com and support Appropedia - at no extra cost.
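[Editor's sketch] The list-of-pages-to-wiki pipeline asked about above can be roughed out in two pieces: a crawler that collects same-domain links from each fetched page, and a function that builds the parameters a bot would POST to a MediaWiki `api.php` endpoint (`action=edit`) for each ported page. This is a minimal stdlib-Python sketch, not a working bot: the URLs, page title and token below are placeholders, fetching/login/rate-limiting are omitted, and the HTML-to-wikitext conversion step (what Send2Wiki would handle) is left out entirely.

```python
# Sketch: crawl one site's pages, then port each one to a MediaWiki wiki.
# Placeholder names throughout; a real bot would add fetching, login,
# edit tokens, throttling, and HTML-to-wikitext conversion.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect same-domain links from one page's HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)
                # Stay on the source site so the crawl doesn't wander off,
                # and drop fragments so /page and /page#sec dedupe.
                if urlparse(url).netloc == urlparse(self.base_url).netloc:
                    self.links.add(url.split("#")[0])


def collect_links(base_url, html):
    """Return the sorted same-domain links found in one page's HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return sorted(parser.links)


def edit_payload(title, wikitext, token):
    """Parameters for a MediaWiki action=edit API call, which a bot
    account would POST (with its CSRF token) for each ported page."""
    return {
        "action": "edit",
        "title": title,
        "text": wikitext,
        "summary": "Ported from external site (to be wikified)",
        "token": token,
        "format": "json",
    }
```

The idea is a standard breadth-first crawl: fetch a page, run `collect_links` on it, queue any unseen URLs, and for each fetched page convert its body to wikitext and POST `edit_payload(...)` to the target wiki's `api.php`.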
_______________________________________________
okfn-discuss mailing list
[email protected]
http://lists.okfn.org/cgi-bin/mailman/listinfo/okfn-discuss
