Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-06 Thread Oren Bochman
(Quoting his earlier message of 05 June 2012:) Any chance that these archives can be served via BitTorrent - so that even partial downloaders can become servers - leveraging p2p to reduce …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-06 Thread Strainu
2012/6/6 Oren Bochman orenboch...@gmail.com: Dear Ariel, Consider that the people who would need to use torrents most of all cannot host a mirror - this is a situation of the little guy being asked to do the heavy lifting. It would save the WMF significant resources; it would be more …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-05 Thread Oren Bochman
(Quoting Mike Dupont, Saturday, June 02, 2012:) I have run cron archiving now every 30 minutes, http://ia700802

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-05 Thread Derric Atzrott
(Quoting Oren Bochman, 05 June 2012:) Any chance that these archives can be served via BitTorrent - so that even partial downloaders can become servers - leveraging p2p …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-05 Thread Ariel T. Glenn
(Quoting Oren Bochman, 05 June 2012:) Any chance that these archives can be served via BitTorrent - so that even …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-06-01 Thread Mike Dupont
I have run cron archiving now every 30 minutes: http://ia700802.us.archive.org/34/items/wikipedia-delete-2012-06/ It is amazing how fast the stuff gets deleted on Wikipedia. What about the proposed deletes? Are there categories for that? Thanks, mike On Wed, May 30, 2012 at 6:26 AM, Mike Dupont …
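The 30-minute schedule Mike describes could be installed as a crontab entry along these lines (the script path and log location are illustrative assumptions, not details from the thread):

```shell
# Build a hypothetical crontab line that runs the deleted-page archiver
# every 30 minutes. Script path and log file are illustrative assumptions.
CRON_LINE='*/30 * * * * /usr/local/bin/archive-deleted-pages.sh >> /var/log/archive-deleted.log 2>&1'
echo "$CRON_LINE"
# To install it alongside any existing entries:
#   (crontab -l; echo "$CRON_LINE") | crontab -
```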

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Mike Dupont
OK, I merged the code from wikiteam and have a full-history dump script that uploads to archive.org; next step is to fix the bucket metadata in the script. mike On Tue, May 29, 2012 at 3:08 AM, Mike Dupont jamesmikedup...@googlemail.com wrote: Well, I have now updated the script to include the …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Mike Dupont
Code here: https://github.com/h4ck3rm1k3/wikiteam On Wed, May 30, 2012 at 6:26 AM, Mike Dupont jamesmikedup...@googlemail.com wrote: OK, I merged the code from wikiteam and have a full-history dump script that uploads to archive.org; next step is to fix the bucket metadata in the script. mike

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Huib Laurens
I'm still interested in running a mirror too, as noted on Meta and sent out earlier by mail. I'm just wondering: why is there no rsync possibility from the main server? It's strange that we need to rsync from a mirror. -- Kind regards, Huib Laurens, Certified cPanel Specialist

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Hydriz Wikipedia
Eh, mirrors rsync directly from dataset1001.wikimedia.org; see rsync dataset1001.wikimedia.org:: However, the system limits rsyncing to mirrors only, to prevent others from rsyncing directly from Wikimedia. On Wed, May 30, 2012 at 4:52 PM, Huib Laurens sterke...@gmail.com wrote: I'm still …
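For a whitelisted mirror, the pull Hydriz describes is a plain rsync against the dataset host. A minimal sketch; the module name and local path below are assumptions (the real module names come from listing `dataset1001.wikimedia.org::` as quoted in the thread):

```shell
# List the rsync modules the dump host exposes (the command quoted in
# the thread; it only answers for whitelisted mirror IPs):
#   rsync dataset1001.wikimedia.org::
# Then sync one tree. Module name and destination are assumptions.
SRC='dataset1001.wikimedia.org::dumps/enwiki/'
DST='/srv/mirror/dumps/enwiki/'
CMD="rsync -av --delete $SRC $DST"
echo "$CMD"   # shown dry; run it from a whitelisted host
```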

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Huib Laurens
OK, cool. And how will I get Wikimedia to allow our IP to rsync? Best, Huib On Wed, May 30, 2012 at 10:54 AM, Hydriz Wikipedia ad...@alphacorp.tk wrote: Eh, mirrors rsync directly from dataset1001.wikimedia.org, see rsync dataset1001.wikimedia.org:: However, the system limits the rsyncers …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Hydriz Wikipedia
Ariel will do that :) BTW, just dig around in their Puppet configuration repository on Gerrit and you can learn more :) On Wed, May 30, 2012 at 4:58 PM, Huib Laurens sterke...@gmail.com wrote: Ok, cool. And how will I get Wikimedia to allow our IP to rsync? Best, Huib On Wed, May 30, …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Huib Laurens
OK. I mailed Ariel about this; if all goes well I can have the mirror up and running by Friday. Best, Huib On Wed, May 30, 2012 at 10:59 AM, Hydriz Wikipedia ad...@alphacorp.tk wrote: Ariel will do that :) BTW just dig around inside their puppet configuration repository on Gerrit and you …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Hydriz Wikipedia
Do you have a URL that you can reveal so that some of us can have a sneak peek? :P On Wed, May 30, 2012 at 5:16 PM, Huib Laurens sterke...@gmail.com wrote: Ok. I mailed Ariel about this, if all goes well I can have the mirror up and running by Friday. Best, Huib On Wed, May 30, 2012 at …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-30 Thread Huib Laurens
Sure :) http://mirror.fr.wickedway.nl Later on we will duplicate this mirror to a Dutch mirror too :) Best, Huib On Wed, May 30, 2012 at 11:18 AM, Hydriz Wikipedia ad...@alphacorp.tk wrote: Do you have a URL that you can reveal so that some of us can have a sneak peek? :P On Wed, May 30, …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-28 Thread Mike Dupont
First version of the script is ready: it gets the versions, puts them in a zip, and puts that on archive.org. https://github.com/h4ck3rm1k3/pywikipediabot/blob/master/export_deleted.py Here is an example output: http://archive.org/details/wikipedia-delete-2012-05
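The flow Mike describes (collect revisions, bundle them, push the bundle to archive.org) can be sketched roughly as below. The item identifier, file layout, and use of archive.org's S3-compatible endpoint are assumptions for illustration, not details taken from his script, and tar stands in for the zip step:

```shell
# Sketch of the export -> bundle -> archive.org flow described above.
# Item identifier and paths are illustrative assumptions; the thread's
# script produces a zip, tar is used here as a stand-in.
ITEM='wikipedia-delete-2012-05'
mkdir -p export
printf '<mediawiki/>\n' > export/pages.xml   # stand-in for real Special:Export output
tar -czf "$ITEM.tar.gz" export
# Upload via archive.org's S3-compatible API (credentials elided):
#   curl -T "$ITEM.tar.gz" --header 'authorization: LOW <key>:<secret>' \
#        "https://s3.us.archive.org/$ITEM/$ITEM.tar.gz"
ls "$ITEM.tar.gz"
```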

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-28 Thread Hydriz Wikipedia
This is quite nice, though the item's metadata is rather sparse :) On Tue, May 29, 2012 at 3:40 AM, Mike Dupont jamesmikedup...@googlemail.com wrote: First version of the script is ready: it gets the versions, puts them in a zip, and puts that on archive.org

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-28 Thread Mike Dupont
Well, I have now updated the script to include the XML dump in raw format. I will have to add more information to the archive.org item, at least a basic readme. Another thing is that the wikipybot does not seem to support the full history, so I will have to move over to the wikiteam version and …

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-21 Thread Mike Dupont
Thanks! And run that once per day; they don't get deleted that quickly. mike On Mon, May 21, 2012 at 9:11 PM, emijrp emi...@gmail.com wrote: Create a script that makes a request to Special:Export using this category as a feed: https://en.wikipedia.org/wiki/Category:Candidates_for_speedy_deletion
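emijrp's suggestion, sketched: Special:Export can take a category name and expand it into a page list. The parameter names below are MediaWiki's standard export-form fields (`catname`, `addcat`, `curonly`); treat the exact query string as an assumption and verify it against the wiki's export form:

```shell
# Build a Special:Export request fed from the speedy-deletion category,
# as suggested above. Parameter names are MediaWiki export-form fields;
# verify them against the actual form before relying on this.
BASE='https://en.wikipedia.org/wiki/Special:Export'
CAT='Candidates_for_speedy_deletion'
URL="${BASE}?catname=${CAT}&addcat=1&curonly=1"
echo "$URL"
# Fetch (not run here):
#   curl -s "$URL" -o speedy-candidates.xml
```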

Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

2012-05-17 Thread Platonides
On 17/05/12 14:23, Ariel T. Glenn wrote: There are a few other reasons articles get deleted: copyright issues, personally identifying data, etc. This makes maintaining the sort of mirror you propose problematic, although a similar mirror is here: