Hello People,
I have completed my first set of uploads of the osm/fosm dataset (350 GB
unpacked) to archive.org:
http://osmopenlayers.blogspot.de/2012/05/upload-finished.html
We could do something similar with Wikipedia. The bucket size on
archive.org is 10 GB, so we would need to split the data up somehow.
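For illustration, here is a minimal sketch in Python of that kind of fixed-size splitting; the 10 GB part size and the input file name are assumptions made up for the example, not anything archive.org is known to require:

import os

PART_SIZE = 10 * 1024**3          # assumed 10 GB ceiling per part
CHUNK = 64 * 1024 * 1024          # copy in 64 MB pieces

def split(path, part_size=PART_SIZE):
    """Write path.part000, path.part001, ... and return the part names."""
    parts = []
    index = 0
    with open(path, "rb") as src:
        while True:
            written = 0
            part_name = f"{path}.part{index:03d}"
            with open(part_name, "wb") as dst:
                while written < part_size:
                    data = src.read(min(CHUNK, part_size - written))
                    if not data:
                        break
                    dst.write(data)
                    written += len(data)
            if written == 0:              # source exhausted; drop the empty part
                os.remove(part_name)
                break
            parts.append(part_name)
            index += 1
    return parts

if __name__ == "__main__":
    for name in split("planet-dump.osm.pbf"):   # hypothetical input file
        print(name, os.path.getsize(name))

The parts can be recombined later with a plain concatenation (cat planet-dump.osm.pbf.part* > planet-dump.osm.pbf).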
There is no such 10 GB limit:
http://archive.org/details/ARCHIVETEAM-YV-6360017-6399947 (238 GB example)
ArchiveTeam/WikiTeam is uploading some dumps to the Internet Archive; if you
want to join the effort, use the mailing list
https://groups.google.com/group/wikiteam-discuss to avoid wasting effort.
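As a rough sketch of how such an upload can be scripted, here is an example using the third-party internetarchive Python package; this is not necessarily how WikiTeam does it, and the item identifier, file name and metadata are made up:

from internetarchive import upload   # pip install internetarchive

item_id = "example-wiki-dump-20120517"            # hypothetical item identifier
files = ["examplewiki-20120517-history.xml.7z"]   # hypothetical dump file

responses = upload(
    item_id,
    files=files,
    metadata={
        "title": "Example wiki full-history dump (2012-05-17)",
        "mediatype": "web",          # assumed; pick whatever fits the item
        "collection": "opensource",  # assumed target collection
    },
)
for r in responses:
    print(r.status_code, r.request.url)

The package expects archive.org S3-style keys to be configured beforehand (e.g. via its ia configure command).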
On Thu, May 17, 2012 at 6:06 AM, John phoenixoverr...@gmail.com wrote:
If you're willing to foot the bill for the new hardware,
I'll gladly prove my point.
Given the millions of dollars that Wikipedia has, it should not be a
problem to provide such resources for a good cause like that.
--
James
I'd like to point out that the increasingly technical nature of this
conversation probably belongs either on wikitech-l or off-list, and that
the strident tone of the comments is fast approaching inappropriate.
Alex
Wikimedia-l list administrator
2012/5/17 Anthony wikim...@inbox.org
On 17/05/12 12:49, Anthony wrote:
Please have someone at WMF coordinate this so that multiple requests
aren't made. In my opinion, the request should preferably come from
a WMF employee.
Fill out the form at
https://aws-portal.amazon.com/gp/aws/html-forms-controller/aws-dataset-inquiry
On Thu, May 17, 2012 at 07:43:09AM -0400, Anthony wrote:
In fact, I think someone at WMF should contact Amazon and see if
they'll let us conduct the experiment for free, in exchange for us
creating the dump for them to host as a public data set
(http://aws.amazon.com/publicdatasets/).
I'll run a quick benchmark, import the full history of simple.wikipedia
into my laptop wiki-on-a-stick, and give an exact duration.
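A rough sketch of what that benchmark could look like, assuming a local MediaWiki installation and its stock importDump.php maintenance script; the paths and dump file name below are placeholders, and a compressed dump would have to be decompressed first for this exact invocation:

import subprocess
import time

MEDIAWIKI_DIR = "/path/to/mediawiki"                      # assumed install dir
DUMP = "/path/to/simplewiki-pages-meta-history.xml"       # assumed dump file

start = time.time()
subprocess.run(
    ["php", "maintenance/importDump.php", DUMP],
    cwd=MEDIAWIKI_DIR,
    check=True,                     # raise if the import script fails
)
elapsed = time.time() - start
print(f"import took {elapsed / 3600:.1f} hours")

Note that importDump.php only fills the core page/revision/text tables; link tables and recent changes normally need separate rebuild scripts afterwards, which would add to the measured time.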
On Thu, May 17, 2012 at 12:26 AM, John phoenixoverr...@gmail.com wrote:
Toolserver is a clone of the WMF servers minus files; they run a database
replication of
On Thu, May 17, 2012 at 12:30 AM, John phoenixoverr...@gmail.com wrote:
I'll run a quick benchmark, import the full history of simple.wikipedia
into my laptop wiki-on-a-stick, and give an exact duration.
Simple.wikipedia is nothing like en.wikipedia. For one thing, there's
no need to turn on
Well, to be honest, I am still upset about how much data is deleted
from Wikipedia because it is not considered notable;
there are so many articles that I might be interested in that are lost
in the same garbage as spam and other things.
We should make non-notable but non-harmful articles available in
On Thu, May 17, 2012 at 1:22 AM, John phoenixoverr...@gmail.com wrote:
Anthony, the process is linear: you have a PHP script inserting X rows per
Y time frame.
Amazing. I need to switch all my databases to MySQL. It can insert X
rows per Y time frame, regardless of whether the database is
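For what it's worth, the measurement being argued about can be sketched as: time a fixed batch of inserts, then extrapolate linearly to the full dataset. The sketch below assumes a local MySQL server, the third-party pymysql driver, and a throwaway revision_test table (all names made up); whether the measured rate really stays constant as the table and its indexes grow is exactly the point in dispute.

import time
import pymysql   # pip install pymysql

TOTAL_ROWS = 500_000_000   # pretend size of the full-history revision table
BATCH = 100_000            # rows actually inserted for the measurement

conn = pymysql.connect(host="localhost", user="bench",
                       password="bench", database="benchdb")
with conn.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS revision_test"
                " (id INT PRIMARY KEY, payload TEXT)")
    start = time.time()
    cur.executemany(
        "INSERT INTO revision_test (id, payload) VALUES (%s, %s)",
        [(i, "x" * 200) for i in range(BATCH)],
    )
    conn.commit()
    elapsed = time.time() - start
conn.close()

rate = BATCH / elapsed
print(f"{rate:,.0f} rows/s measured")
print(f"linear estimate for {TOTAL_ROWS:,} rows: {TOTAL_ROWS / rate / 3600:.1f} hours")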