Following on from my previous posts about trying to classify the scope and coverage of humanities subjects in Wikipedia, I have a practical question: is it possible to query the Wikipedia database so as to get a list of all articles (current version only)? Even better would be a second, larger list that indexes each article against the categories it belongs to. For example:
List 1
Name, ID
Thomas Aquinas, 1
William of Ockham, 2

List 2
ID, category
1, 1225 births
1, 1274 deaths
[...]
2, 1285 births
2, 1347 deaths
2, 13th century philosophers

and so on.

I appreciate the second list may be up to 20 times the size of the first, thus 60 million rows. Perhaps there is a way to limit the number of categories; I don't know.

This would allow me to see exactly what is there under the humanities. My hunch, from using the random article function, is that most articles in Wikipedia are obscure stubs, and that the coverage of humanities subjects, and possibly other areas, is actually no different from that of a conventional encyclopedia.
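
To make the shape of the two lists concrete, here is a minimal sketch in Python using a toy in-memory SQLite database. The table and column names (`page`, `categorylinks`, `page_namespace`, `cl_from`, `cl_to`) are assumptions modelled loosely on the MediaWiki schema and may not match the live database. The helper functions and the talk-page row are hypothetical and only there for illustration. The optional `wanted` argument shows one possible way of limiting the number of categories.

```python
import sqlite3


def build_demo_db():
    """Create a toy database holding the example rows from this post."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id        INTEGER PRIMARY KEY,
            page_namespace INTEGER,  -- assumption: 0 marks an article
            page_title     TEXT
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,  -- ID of the page that is in the category
            cl_to   TEXT      -- name of the category
        );
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [
            (1, 0, "Thomas Aquinas"),
            (2, 0, "William of Ockham"),
            (3, 1, "Talk:Thomas Aquinas"),  # hypothetical non-article row
        ],
    )
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [
            (1, "1225 births"),
            (1, "1274 deaths"),
            (2, "1285 births"),
            (2, "1347 deaths"),
            (2, "13th century philosophers"),
        ],
    )
    return conn


def list_articles(conn):
    """List 1: (name, ID) for every page in the article namespace."""
    return conn.execute(
        "SELECT page_title, page_id FROM page "
        "WHERE page_namespace = 0 ORDER BY page_id"
    ).fetchall()


def list_article_categories(conn, wanted=None):
    """List 2: (ID, category) pairs, optionally limited to `wanted` categories."""
    sql = (
        "SELECT cl.cl_from, cl.cl_to FROM categorylinks AS cl "
        "JOIN page AS p ON p.page_id = cl.cl_from "
        "WHERE p.page_namespace = 0"
    )
    params = []
    if wanted:
        placeholders = ", ".join("?" for _ in wanted)
        sql += f" AND cl.cl_to IN ({placeholders})"
        params = list(wanted)
    sql += " ORDER BY cl.cl_from, cl.cl_to"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for name, page_id in list_articles(conn):
        print(f"{name}, {page_id}")
    for page_id, category in list_article_categories(conn):
        print(f"{page_id}, {category}")
```

On the real data the same join would presumably be run against a database dump rather than a toy table, and the second query is the one that would grow to tens of millions of rows.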