On Tue, Apr 21, 2009 at 9:25 AM, Daniel Kinzler <[email protected]> wrote:
> Magnus Manske wrote:
>> All in all, it would be much better directly integrated into MediaWiki
>> (no need for text retrieval/parsing, no bulk updates). But I've been
>> saying that for years, at least this is a first attempt.
>
> Actually, this is part of my grand plan for world domination. I'm pushing for
> it behind the scenes... I have a few ideas on how it may be done nicely.

Excellent! I'll hold further development on the tool for now.

> I think the main problem is that semantic mediawiki looks like the obvious
> answer. But i doubt it is. I only want a small subset of that functionality on
> wikipedia. Maybe SMW can be chopped up to fit that, but i'm personally more
> inclined to extend the RDF extension to store triples in the DB.

I agree about Semantic MediaWiki, which is a different beast (and might
one day be used on Wikipedia). The question seems to be scalability.

Extrapolating from my sample data set, just the key/value pairs of
templates included directly in articles would come to over 200 million
rows for en.wikipedia at the moment. A MediaWiki-internal solution would
want to store templates included in templates as well, which can be a
lot for complicated meta-templates. I think a billion rows for the
current English Wikipedia is not too far-fetched in that model.

The table would be both constantly updated (potentially hundreds of
writes for a single article update) and heavily searched (with LIKE
"%stuff%", no less). Would the RDF extension be up to that?

Cheers,
Magnus

_______________________________________________
Toolserver-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
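
For illustration only, here is a rough sketch of the kind of key/value table discussed above and why the LIKE "%stuff%" searches are the worrying part. Everything in it is an assumption made for the example: the table and column names (template_param, page_title, param_key, param_value), the sample rows, and the use of SQLite in place of the MySQL setup on the Toolserver. It is not the schema of the actual tool or of the RDF extension.

```python
import sqlite3

# Hypothetical schema: one row per template parameter used in an article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE template_param (
        page_title  TEXT NOT NULL,
        template    TEXT NOT NULL,
        param_key   TEXT NOT NULL,
        param_value TEXT NOT NULL
    );
    CREATE INDEX idx_page  ON template_param (page_title);
    CREATE INDEX idx_value ON template_param (param_value);
""")


def update_article(conn, page_title, params):
    """Replace all rows for one article; returns the number of rows written."""
    with conn:  # one transaction per article save
        conn.execute(
            "DELETE FROM template_param WHERE page_title = ?", (page_title,)
        )
        conn.executemany(
            "INSERT INTO template_param VALUES (?, ?, ?, ?)",
            [(page_title, t, k, v) for (t, k, v) in params],
        )
    return len(params)


def query_plan(conn, sql, args):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


# Made-up sample data; a real article can carry far more parameters.
written = update_article(conn, "Douglas Adams", [
    ("Infobox writer", "name", "Douglas Adams"),
    ("Infobox writer", "birth_place", "Cambridge"),
    ("Persondata", "DATE OF BIRTH", "11 March 1952"),
])

# An exact match can use the index on param_value ...
exact_plan = query_plan(
    conn,
    "SELECT page_title FROM template_param WHERE param_value = ?",
    ("Cambridge",),
)

# ... but a pattern with a leading wildcard has to look at every row.
like_plan = query_plan(
    conn,
    "SELECT page_title FROM template_param WHERE param_value LIKE ?",
    ("%bridge%",),
)

matches = [r[0] for r in conn.execute(
    "SELECT DISTINCT page_title FROM template_param WHERE param_value LIKE ?",
    ("%bridge%",),
)]

print("rows written:", written)
print("exact match :", exact_plan)
print("LIKE search :", like_plan)
print("matches     :", matches)
```

In this sketch every article save rewrites all of that article's rows, and the substring search cannot use the value index at all. That is tolerable for three rows, but it is the combination I would expect to hurt at several hundred million rows.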
