Brian wrote:
> Why not make the uncompressed dump available as an Amazon Public
> Dataset? http://aws.amazon.com/publicdatasets/
>
> You can already find DBPedia and FreeBase there. Its true that the
> uncompressed dump won't fit on a commercial drive (the largest is a
> 4-platter 500GB = 2TB drive). Cloud computing seems to be the most
> economically feasible alternative for all parties involved.
It depends on the parties. For me as a user, it's more economically feasible to download the dataset and run scripts on my own machine than to pay for EC2 compute time to run those scripts. But I have free, unlimited university bandwidth.

There might still be mutual benefits to having a copy at Amazon for those who prefer it. Because a full database dump would already be available on the filesystem of an EC2 compute instance, analyzing it there would become easy, and a number of people might use EC2 to run their analysis scripts. From that perspective, maybe Amazon could be persuaded to help out? Perhaps they could donate some money, equipment, or developer time to reengineer the dump process, in return for one part of the reengineering being the addition of a routine sync to their service?

-Mark

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
