On Tue, Feb 24, 2009 at 11:26 PM, Brian <[email protected]> wrote:
> Why not make the uncompressed dump available as an Amazon Public
> Dataset? http://aws.amazon.com/publicdatasets/

Which uncompressed dump? The full-history English Wikipedia dump doesn't exist, and there doesn't seem to be any demand for this anyway. You can already find DBpedia and Freebase there.

> It's true that the uncompressed dump won't fit on a commercial drive
> (the largest is a 4-platter 500GB = 2TB drive). Cloud computing seems
> to be the most economically feasible alternative for all parties
> involved.

"Cloud computing" might be a good alternative for some reusers, but if so it would be most economical to just host the cloud at the source, i.e. API/live feed access open to everyone (for free or for a fee).

For certain uses there's the toolserver, but access to it is handed out by special permission. For small amounts of traffic there's an API, and there's the live feed, which seems to be limited to major corporations with special permission. The WMF hasn't put any real resources into this for the small-time commercial user (big players have the live feed, and non-commercial users can probably get toolserver access). Of course, there isn't all that much demand either. If there were, a third party would have set it up by now (I'd personally be willing to set up a pay-for-access toolserver and custom dump service if I could get a commitment from one or more people for a couple hundred a month in funding).

> "Typically the data sets in the repository are between 1 GB to 1 TB in
> size (based on the Amazon EBS volume limit), but we can work with you
> to host larger data sets as well. You must have the right to make the
> data freely available."

Yeah, if there were any demand for this, nothing's stopping someone from setting it up on their own.

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
