Also, if you are a dumps user or have thoughts about how you would redo
them from scratch, get your ideas in now. We're not waiting for the
Dev Summit to get the work started. See
https://phabricator.wikimedia.org/T114019 for details, especially the
document linked at the end of the task.
Thank you, Google, for hiding the start of this thread in my spam folder.
I'm going to have to change my import tools for the new format, but
that's the way it goes; it's a reasonable change. Have you checked with
folks on the xml data dumps list to see who might be affected?
Ariel
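For anyone retooling similarly, a minimal sketch of a streaming reader for a
pages-articles dump, in Python; the schema namespace below is an assumption,
check the <mediawiki> header of the actual dump file:

    import bz2
    import xml.etree.ElementTree as ET

    # Assumed schema namespace; the real one is in the dump's header.
    NS = "{http://www.mediawiki.org/xml/export-0.10/}"

    def iter_pages(path):
        # Stream the dump so the whole file never needs to fit in memory.
        with bz2.open(path, "rb") as f:
            for _event, elem in ET.iterparse(f):
                if elem.tag == NS + "page":
                    title = elem.findtext(NS + "title")
                    text = elem.findtext(NS + "revision/" + NS + "text") or ""
                    yield title, text
                    elem.clear()  # discard processed pages to bound memory

    for title, text in iter_pages("enwiki-latest-pages-articles.xml.bz2"):
        print(title, len(text))
        break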
On 23-06-2014 (Mon) at 20:56 +0300, Ariel T. Glenn wrote:
dumps.wikimedia.org, downloads.wikimedia.org will be down on Thursday
June 26 from 13.30 UTC until 14.30 UTC. While we expect the actual
downtime to be much less, we're blocking one hour just in case.
And Murphy has
On 26-06-2014 (Thu) at 17:37 +0300, Ariel T. Glenn wrote:
On 23-06-2014 (Mon) at 20:56 +0300, Ariel T. Glenn wrote:
dumps.wikimedia.org, downloads.wikimedia.org will be down on Thursday
June 26 from 13.30 UTC until 14.30 UTC. While we expect the actual
downtime to be much less, we're blocking one hour just in case.
We will be moving it to a new rack in preparation for improved
bandwidth, and yes this means
These names will be moved so that requests to them go to our server in
the eqiad data center. This should not cause any service interruptions
but you may notice more current files available for download as the
switch goes into effect.
Time of switch: 10:00 to 12:00 UTC, Thursday March 27.
Hi puppet wranglers,
We're trying to refactor the WMF puppet manifests to get rid of reliance
on dynamic scope, since puppet 3 doesn't permit it. Until now we've
done what is surely pretty standard puppet 2.x practice: assign
values to a variable in the node definition and pick it up in the class
On 28-01-2014 (Tue) at 10:21 -0800, Ryan Lane wrote:
In Puppet 3, variables assigned in the node are still global. It's the only
place other than facts (or hiera) that you can assign them and have their
scope propagate. So, this'll continue working. I think the future path is
On 11-12-2013 (Wed) at 23:01 -0500, MZMcBride wrote:
...
The idea being proposed in bug 58236, as it was framed, was a non-starter.
It simply riled people up and caused them to become defensive. (Its
sibling bugs didn't help.) However, if we re-frame the issue, I think many
We have a somewhat out-of-date off-site mirror of images (I'm working on
the out-of-date part). This includes Commons. It's accessible by
rsync, http, ftp:
http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media
Thanks again to your.org for hosting that.
Are these images
On 09-07-2013 (Tue) at 07:07 -0400, Tyler Romeo wrote:
Follow-up question. Will our new dumps project be dumping the change_tag
table? ;)
Our old dumps project could (the new one isn't intended to handle the
table dumps but only the page metadata and content data). I don't
On 08-07-2013 (Mon) at 20:17 -0700, Robert Rohde wrote:
Various parts of Mediawiki will apply tags to specific edits in recent
changes and histories.
For example, the recently introduced Visual Editor is adding Tag:
VisualEditor to all of its edits.
Are such tags
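Such tags are visible through the API even where no table dump carries them.
A hedged sketch using the standard recentchanges query; the User-Agent string
and limit are illustrative:

    import requests

    # List recent changes together with their tags (e.g. "visualeditor").
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "recentchanges",
            "rcprop": "title|tags",
            "rclimit": 50,
            "format": "json",
        },
        headers={"User-Agent": "tag-survey/0.1 (illustrative)"},
    )
    for rc in resp.json()["query"]["recentchanges"]:
        if rc["tags"]:
            print(rc["title"], rc["tags"])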
On 07-07-2013 (Sun) at 21:09 -0700, Randall Farmer wrote:
Sorry, reading back over this thread late.
What I hope for is a format that allows dumps to be produced much more
rapidly, where the time to produce the incrementals grows only as the
number of edits per time
On 02-07-2013 (Tue) at 11:47 +0100, Neil Harris wrote:
The simplest possible dump format is the best, and there's already a
thriving ecosystem around the current XML dumps, which would be broken
by moving to a binary format. Binary file formats and APIs defined by
code
For folks that edit on Wikitech, note that obsolete docs can now be
moved to their own namespace, Obsolete, where we'll be able to dig
them up if we ever want them but they won't clutter up the search
results etc. Please feel free to start populating the new namespace
with all that cruft you were
On 02-05-2013 (Thu) at 15:40 +0200, Petr Onderka wrote:
I realized I didn't post my proposal to the list yet (I have added it to
the official GSoC site a few days ago), so here it is:
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps
In short, the project aims to
On 14-03-2013 (Thu) at 23:24, Neil Harris wrote:
Dear Wikimedia ops team,
The most recent enwiki dump now seems to have finished _almost_
successfully, apart from the dumping of the database metadata tables
such as the pages table and the various links tables,
On 07-03-2013 (Thu) at 21:12 -0400, bawolff wrote:
On 2013-03-07 4:06 PM, Matthew Flaschen mflasc...@wikimedia.org wrote:
On 03/07/2013 12:00 PM, Antoine Musso wrote:
Le 06/03/13 23:58, Federico Leva (Nemo) a écrit :
There's slow-parse.log, but it's private unless a
On 11-03-2013 (Mon) at 05:35 -0500, wiki wrote:
Thank you for the response.
I think those sizes refer to the exported xml, e.g. 41.5GB is the English
xml.bz2 expanded.
I was curious as to how much extra disk space is needed (and consumed) after
importing this
On 05-02-2013 (Tue) at 07:21 -0500, Chad wrote:
On Tue, Feb 5, 2013 at 7:06 AM, Marco Fleckinger
marco.fleckin...@wikipedia.at wrote:
The farmer doesn't want to eat anything he doesn't know. I don't know this
sentence's popularity in Hungary (AFAIK?), but in German it's
On 23-01-2013 (Wed) at 15:10 -0800, Gabriel Wicke wrote:
Fellow MediaWiki hackers!
After the pretty successful December release and some more clean-up work
following up on that we are now considering the next steps for Parsoid.
To this end, we have put together a rough
On 28-12-2012 (Fri) at 10:38 -0500, Brad Jorsch wrote:
On Thu, Dec 27, 2012 at 7:26 PM, Sumana Harihareswara
suma...@wikimedia.org wrote:
3) Look at Nymble - http://freehaven.net/anonbib/#oakland11-formalizing
and http://cgi.soic.indiana.edu/~kapadia/nymble/overview.php .
On 11-12-2012 (Tue) at 19:04 -0500, MZMcBride wrote:
Brion Vibber wrote:
Over on the mobile team we've been chatting for a while about the various
trade-offs in native vs HTML-based (PhoneGap/Cordova) development.
[...]
iOS and Android remain our top-tier mobile
On 11-12-2012 (Tue) at 01:10 -0800, Erik Moeller wrote:
Wikimedia wikis hosted on the s3 cluster (pretty much all but the
very large wikis, click on the s3 box in
https://noc.wikimedia.org/dbtree/ to get a full list) are currently in
read-only mode due to severe replication
+0300, Ariel T. Glenn wrote:
We're going to swap out ms7, the current media server fallback, for a
netapp. We'll start this on Friday Oct 5 at 11am UTC, to conclude at
2pm UTC or earlier. This may entail turning off uploads to all
projects during the switchover. It is possible
We rolled back this change after discovering an ownership issue with the
rsynced media files that caused deletions of media to fail. We'll try
again early next week.
Ariel
On 04-10-2012 (Thu) at 15:19 +0300, Ariel T. Glenn wrote:
We're going to swap out ms7, the current media server fallback, for a
netapp. We'll start this on Friday Oct 5 at 11am UTC, to conclude at
2pm UTC or earlier. This will entail turning off uploads to all
projects during the switchover. It is possible that
ExtensionDistributor and captchas will
So it's time to have this discussion again. At least, I think we're
having it again, though I could not find previous threads on this list
about the subject.
In short, scaled media is currently generated on the fly for any size
and for any user. The resulting files are kept around forever or
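A toy version of that behaviour, assuming a PIL-based scaler; paths and cache
layout here are illustrative, not how the production scalers actually work:

    from pathlib import Path
    from PIL import Image

    CACHE = Path("thumbs")

    def thumb(original: str, width: int) -> Path:
        # Render a requested size on first demand, then keep it forever --
        # the "kept around" behaviour under discussion.
        out = CACHE / ("%dpx-%s" % (width, Path(original).name))
        if not out.exists():
            CACHE.mkdir(exist_ok=True)
            img = Image.open(original)
            height = round(img.height * width / img.width)
            img.resize((width, height)).save(out)
        return out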
How did this not propagate in the usual way through the job queue? (And
why wouldn't either a null or an insignificant edit to the template add
the requisite job queue entries now?)
A.
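For the record, a sketch of nudging the propagation by hand, assuming
pywikibot's touch() and purge() helpers; the template name is hypothetical:

    import pywikibot

    site = pywikibot.Site("en", "wikipedia")
    template = pywikibot.Page(site, "Template:Example")  # hypothetical

    # A null edit to the template should re-queue the refreshLinks jobs.
    template.touch()

    # Or walk the transcluding pages and purge them with link updates.
    for page in template.getReferences(only_template_inclusion=True, total=100):
        page.purge(forcelinkupdate=True)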
On 19-06-2012 (Tue) at 10:00 +0200, Maarten Dammers wrote:
Hi guys,
There must be an
Dupont jamesmikedup...@googlemail.com
Well, I would be happy for items like this:
http://en.wikipedia.org/wiki/Template:Db-a7
would it be possible to extract them easily?
mike
On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn
ar...@wikimedia.org
wrote:
There's a few
We now have three mirror sites, yay! The full list is linked to from
http://dumps.wikimedia.org/ and is also available at
http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors
Summarizing, we have:
C3SL (Brazil) with the last 5 known good dumps,
Masaryk
of deleted data, at least that which is not spam/vandalism based on tags.
mike
On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn ar...@wikimedia.org wrote:
We now have three mirror sites, yay! The full list is linked to from
http://dumps.wikimedia.org/ and is also available at
http
On 13-04-2012 (Fri) at 12:49 +1000, Andrew Garrett wrote:
On Wed, Apr 4, 2012 at 6:25 PM, Petr Bena benap...@gmail.com wrote:
An account with sysop rights cannot do that much damage anyway.
Deleting a page does no more damage than deleting a paragraph in an
existent
*cough* LQT 3 is private because it doesn't exist... see the date of
that email (hint, 1st day of April).
If there were to be a project like that I expect it would be very very
public indeed. ;-)
Ariel
On 05-04-2012 (Thu) at 09:10 +0200, Petr Bena wrote:
When we talk about
know if this is a part of some joke
https://www.mediawiki.org/wiki/LiquidThreads_3.0/status
but it seems that someone wrote some code
2012/4/5 Petr Bena benap...@gmail.com:
This isn't true?
https://www.mediawiki.org/wiki/LiquidThreads_3.0
On Thu, Apr 5, 2012 at 9:40 AM, Ariel T
On 26-03-2012 (Mon) at 10:39 -0400, Mark A. Hershberger wrote:
Benjamin Lees emufarm...@gmail.com writes:
I see two different use cases here: one, you have URLs that need to be
short so they can fit in Twitter messages and the like. Here, it
doesn't matter whether the
On 17-03-2012 (Sat) at 16:45 +0100, Christian Aistleitner wrote:
Hi Saper,
On Sat, Mar 17, 2012 at 01:59:33PM +, Marcin Cieslak wrote:
[ Announcing xmldumps-test ]
The code is up for review at
https://gerrit.wikimedia.org/r/p/operations/dumps/test.git
[
On 01-03-2012 (Thu) at 15:00 +0200, Amir E. Aharoni wrote:
2012/3/1 Srikanth Lakshmanan srik@gmail.com:
Hi all,
Would having opengrok[1] set up for MediaWiki code be a useful tool?
I haven't used ViewVC much, so not sure if opengrok doesn't do something
that ViewVC
Actually we still want mirrors of the revision texts and the mysql
tables [1], and we do not yet have a mirror of the image data. If
anyone has contacts at a university with good bandwidth and 6T + of
space lying around for a good cause...
Ariel
[1]
I can't answer to the automated or not part, partially because I'm
missing the rest of the thread. I can tell you that because our one
server with the thumbs on it was getting dangerously low on space (and
still is), I've been purging a number of thumbs each day that are not
linked to on our
Hello,
I just checked the pagelinks and categorylinks files here:
http://dumps.wikimedia.org/enwiki/2015/
and they are accessible. Can you give a couple of specific links that
did not work?
Ariel
On 26-11-2011 (Sat) at 20:54 +0100, Khalida BEN SIDI AHMED wrote:
For my
On 25-11-2011 (Fri) at 19:08 +1000, K. Peachey wrote:
On Fri, Nov 25, 2011 at 7:06 PM, Bryan Tong Minh
bryan.tongm...@gmail.com wrote:
There is also another, impersonating brion, which does not appear to
have been cleaned up.
Most of that has been (It's what Ariel was
Does anyone here use the mass bug change feature? If not, we might
consider just turning that off outright for all users.
Ariel
On 25-11-2011 (Fri) at 18:05 +0100, Siebrand Mazeland wrote:
Does anyone here use the mass bug change feature? If not, we might
consider just turning that off outright for all users.
I use it from time to time, as do others (mostly bugzilla admins). Can it
be disabled on
On 26-11-2011 (Sat) at 01:24 +0800, Liangent wrote:
On Sat, Nov 26, 2011 at 1:11 AM, Ariel T. Glenn ar...@wikimedia.org wrote:
I'm sure it can. If we had to we could give it to all existing users,
it's just a bit more tedious.
Only to users that existed before you sent
On 24-11-2011 (Thu) at 21:30 +0100, Daniel Zahn wrote:
On Thu, Nov 24, 2011 at 8:15 PM, Rob Lanphier ro...@wikimedia.org wrote:
So, here's the solution for now, and probably for a while:
1. New account creation has been re-enabled
2. All existing accounts have been
The three processes we had going for largish wikis had been restarted
from a particular step, since I had to interrupt them earlier for kernel
upgrade and reboot. These stop at the end of the run. Three regular
jobs are now running; these cycle through the list of the ten largish
wikis in the
Thanks!
But it seems that the update of pagecounts files is stopped for the
past few hours. Is this a temporary problem?
Thanks,
Ikuya
Yes, very temporary. A mistaken side-effect of taking Domas' server out
of the loop; fixed.
Ariel
very cool! is there a readme or project page somewhere that explains what
all these files are?
On Wed, Nov 16, 2011 at 1:27 PM, Ariel T. Glenn ar...@wikimedia.org wrote:
Thanks!
But it seems that the update of pagecounts files is stopped for the
past few hours. Is this a temporary
On 09-11-2011 (Wed) at 10:07 -0500, Sean Timm wrote:
On 11/9/2011 8:21 AM, Ikuya Yamada wrote:
I had thought to do a daily update. If it turns out that hourly updates
are indeed useful, I'll set that up. I don't know of anyone else that
has a current mirror.
I had
On 07-11-2011 (Mon) at 18:41, Sean Timm wrote:
Ariel T. Glenn ariel at wikimedia.org writes:
I think we finally have a complete copy from December 2007 through
August 2011 of the pageview stats scrounged from various sources, now
available on our dumps server
A while back (over 2 years ago, urk!) we had a request for dumps of
titles of things other than articles [1]. I haven't seen that request
repeated, but I'm wondering how useful that would be to folks and which
namespaces we should dump, if we were going to add a few. Article talk
pages? Other?
Some of the most recent dumps links are broken[1].
[1] http://wikipedia.c3sl.ufpr.br/jawikisource/20111018
2011/10/13 Ariel T. Glenn ar...@wikimedia.org
As the subject says, the first mirror of our XML dumps is up, hosted at
C3SL in Brazil. We're really excited about
As the subject says, the first mirror of our XML dumps is up, hosted at
C3SL in Brazil. We're really excited about it. Details are listed on
the main index page on our download server
( http://dumps.wikimedia.org/ ) and are reproduced below for everyone's
convenience:
Site: Centro de Computação
On 03-10-2011 (Mon) at 22:21 -0400, Russell Nelson wrote:
On Mon, Oct 3, 2011 at 10:15 PM, Brion Vibber br...@wikimedia.org wrote:
I would *very* strongly recommend doing the internal refactoring before we
get anywhere near reviewing and deploying that bad boy; otherwise
On 26-09-2011 (Mon) at 02:20 +0200, melvin_mm wrote:
John phoenixoverride at gmail.com writes:
mysql> select count(*) from archive;
+----------+
| count(*) |
+----------+
| 33263574 |
+----------+
1 row in set (8 min 47.50 sec)
On Sun, Sep 25, 2011
On 26-09-2011 (Mon) at 08:47 +0200, melvin_mm wrote:
Ariel T. Glenn ariel at wikimedia.org writes:
On 26-09-2011 (Mon) at 02:20 +0200, melvin_mm wrote:
Ok, thanks! So in pages-meta-history, those ~33.000.000 archived /
deleted revisions are not
On 26-09-2011 (Mon) at 17:59 +0300, Ariel T. Glenn wrote:
On 26-09-2011 (Mon) at 08:47 +0200, melvin_mm wrote:
Ariel T. Glenn ariel at wikimedia.org writes:
On 26-09-2011 (Mon) at 02:20 +0200, melvin_mm wrote:
Ok, thanks! So
On 17-09-2011 (Sat) at 22:55 -0700, Robert Rohde wrote:
On Sat, Sep 17, 2011 at 4:56 PM, Anthony wikim...@inbox.org wrote:
snip
For offline analyses, there's no need to change the online database tables.
Need? That's debatable, but one of the major motivators is the
Yes, and I've already been getting the information on that together so
it can be documented. :-)
Ariel
On 18-09-2011 (Sun) at 11:55 +0100, Harry Burt wrote:
Ariel T. Glenn wrote:
I think we finally have a complete copy from December 2007 through
August 2011 of the pageview
I think we finally have a complete copy from December 2007 through
August 2011 of the pageview stats scrounged from various sources, now
available on our dumps server.
See http://dumps.wikimedia.org/other/pagecounts-raw/
Ariel
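For newcomers to these files: each line is "project page_title count bytes".
A minimal tally sketch; the filename below is illustrative:

    import gzip
    from collections import Counter

    totals = Counter()
    with gzip.open("pagecounts-20110801-000000.gz", "rt",
                   encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split(" ")
            # project, page_title, view_count, bytes
            if len(parts) == 4 and parts[0] == "en":
                totals[parts[1]] += int(parts[2])

    for title, count in totals.most_common(10):
        print(count, title)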
On 06-09-2011 (Tue) at 17:07 -0700, Brion Vibber wrote:
snip
Indeed -- as long as the data's accessible I'm content enough -
http://lists.wikimedia.org/pipermail/foundation-l/2006-September/023835.html :)
Since then though we've removed it from the data dumps, so it's no
If it's actually etherpad-based, that keeps track of who makes which
change within a given session, so one could attribute specific pieces of
text to a given editor.
Ariel
On 04-09-2011 (Sun) at 21:40, Russell N. Nelson - rnnelson wrote:
Treat the concurrent session as
On 01-06-2011 (Wed) at 15:58 -0600, bawolff wrote:
On Wed, Jun 1, 2011 at 3:02 PM, Brion Vibber br...@pobox.com wrote:
On Wed, Jun 1, 2011 at 1:53 PM, bawolff bawolff...@gmail.com wrote:
As a volunteer person, I'm fine if code I commit is reverted based on
it sucking,
On 02-06-2011 (Thu) at 08:31 -0400, Alex Mr.Z-man wrote:
On Thu, Jun 2, 2011 at 12:10 AM, Brandon Harris bhar...@wikimedia.org wrote:
Your solution, as you've described it in the past, comprises "people do
code review or orf wit' dere heads."
I know of no
And they're done. In the future I expect to spam these lists much less
often, as we'll be able to add a status notice on the download page.
Thanks for your patience.
Ariel
On 18-04-2011 (Mon) at 13:17 +0300, Ariel T. Glenn wrote:
The en wikipedia history bz2 files are ready; the last of the 7z files
is being rerun.
Ariel
On 15-04-2011 (Fri) at 18:41 -0700, Brion Vibber wrote:
One issue we see at present is that since we version and deploy core and
extensions together, it's tough to get a semi-experimental extension into
limited deployment with regular updates. Let's make sure that's clean
The April run of the English history dumps is incomplete. There is at
least one file that will need to be regenerated. When it's ready I'll
send an email update. I expect a delay of 4-5 days for that.
Ariel
file size growth seems to be pretty linear (chart x-axis starts from the
20060816 dump and ends at the 20110115 dump):
http://nekrom.com/wikipedia/enwiki%20history%20dump%20file%20size%20over%20time.png
cheers,
Jamie
- Original Message -
From: Ariel T. Glenn ar...@wikimedia.org
Date
Well, that used up all my good luck for the year, but the bz2s are ready
for download. The md5sums are still calculating, give them a couple
hours to show up. If all continues to go well we'll have the 7z files
in 4-5 days.
As before I do not plan to provide a single 350GB file of the bz2, nor
On 24-03-2011 (Thu) at 20:29 -0400, James Linden wrote:
So, thoughts on this? Is 'Move Dumping Process to another language' a
good idea at all?
I'd worry a lot less about what languages are used than whether the process
itself is scalable.
I'm not a mediawiki /
On 25-03-2011 (Fri) at 21:49 +0100, Platonides wrote:
Andrew Dunbar wrote:
Just a thought, wouldn't it be easier to generate dumps in parallel if
we did away with the assumption that the dump would be in database
order. The metadata in the dump provides the ordering info
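A toy sketch of that idea: dump disjoint page-id ranges in parallel and let
readers reorder by the <id> metadata. Ranges and output are illustrative only:

    from multiprocessing import Pool

    PAGE_RANGES = [(1, 100000), (100001, 200000), (200001, 300000)]

    def dump_range(bounds):
        start, end = bounds
        out = "pages-%d-%d.xml" % (start, end)
        with open(out, "w") as f:
            # Stand-in for real export work; each file is self-describing,
            # so files need not be produced or concatenated in id order.
            f.write("<!-- pages %d..%d -->\n" % (start, end))
        return out

    if __name__ == "__main__":
        with Pool(3) as pool:
            print(pool.map(dump_range, PAGE_RANGES))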
On 23-03-2011 (Wed) at 02:03 +0100, Platonides wrote:
Marcin Cieslak wrote:
So having a possibility to have a pre-flight test of the translation
(or even watch the demo of the original in action) is something
Selenium could definitely help with. In many cases, translators
do
Well that, like many things about dumps, took longer than I would have
liked, but the January enwikipedia run is finally complete. Unless
someone really really wants them (and then we might talk off list about
it) I am not going to provide a single file for download of the history
dumps; instead
And one more time...
I noticed that we were seeing a 3 to 4-fold slowdown on sv wiki history
dumps in comparison with the previous run. After investigation it
appears that this is due to use of XMLReader(). I've rolled that back
and we are once again up. I've also restarted dawiki from the
Irritatingly enough we haven't quite switched all the paths of
everything to use php-1.17. For example, the dumps.
So the previous tests aren't very useful. I'm shooting the svwiki dump
in process and doing another round of tests with the correct path; after
that I'll restart svwiki from its
We are back in business and running off the new codebase.
Please check the output carefully.
Note that we are on schema version 0.5 now, which includes byte length
of revisions. Also please note that leading spaces before / in xml
markup are now removed. Other than that things should look
I have done a small amount of testing; the tests look good. Accordingly
I have started up one process to do dumps; please get your eyeballs on
them and let me know thumbs up or down. I'd like to start up the rest
of the processes by tomorrow at this time so if you can squeeze in some
time to
back up again.
Ariel
On 08-02-2011 (Tue) at 11:51 +0530, Janesh Kodikara wrote:
- Original Message -
From: Ariel T. Glenn ar...@wikimedia.org
Newsgroups: gmane.science.linguistics.wikipedia.technical
To: xmldatadump...@lists.wikimedia.org; wikitech-l
A little bit before the scheduled deployment of the 1.17 branch on our
production servers, I will be halting production of XML dumps.
Deployment is set for Tuesday Feb 8 at 07:00 UTC, so a few hours before
that I'll start shutting down processes.
This is a precautionary measure; after the
On 11-01-2011 (Tue) at 10:16, Neil Harris wrote:
On 10/01/11 22:13, Ariel T. Glenn wrote:
So "soon" took longer than I would have liked. However, we are up and
running with the new code. I have started a few processes going and
over the next few days I will ramp
You may be noticing a recombine step for several files on the recent
dumps which simply seems to list the same file again. That's a bug, not
a feature; fortunately it doesn't impact the files themselves. I have
fixed the configuration file so that it should no longer claim to run
these, as they
get trapped behind them.
Guess I'd better go update the various pages on wikitech now.
Ariel
On 24-12-2010 (Fri) at 20:42 +0200, Ariel T. Glenn wrote:
The new host Dataset2 is now up and running and serving XML dumps. Those
of you paying attention to DNS entries should see
On 01-01-2011 (Sat) at 16:42, David Gerard wrote:
On 31 December 2010 17:09, Ariel T. Glenn ar...@wikimedia.org wrote:
I'd like all the dumps from all the projects to be on line. Being
realistic I think we would wind up keeping offline copies of all
I'd like to remind everyone once again of the mirror page:
http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps
If you have any ideas, please add them there, and pursue them or ask for
help in doing so. If you are able to host, don't be shy, step right
up ;-)
Ariel
On
At the moment the easiest way for you to mirror our content would be via
wget. You would want to generate a list of the most recent completed
dumps, or we might make such a list available on a biweekly basis. I
need to think about the best mechanism for that. There is also an RSS
feed which
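A sketch of what a mirror script might poll; the per-file RSS URL pattern
shown here is an assumption, check the server for the real feeds:

    import urllib.request
    import xml.etree.ElementTree as ET

    FEED = ("https://dumps.wikimedia.org/enwiki/latest/"
            "enwiki-latest-pages-articles.xml.bz2-rss.xml")  # assumed pattern

    with urllib.request.urlopen(FEED) as resp:
        tree = ET.parse(resp)

    # The item links point at the most recent completed run; a mirror
    # would diff these against its local copy before fetching.
    for item in tree.iter("item"):
        print(item.findtext("title"), item.findtext("link"))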
Anthony:
We would like to get copies of any of these dumps as well. This
includes any of the other files: stubs, tables, the lot.
If you have them for other languages or other time periods, that would
be great to know too. I think we could ship you a disk, or two if
needed. Contact me off list
next week when I am back, currently traveling in
Germany.
Best,
Huib
2010/12/31, Ariel T. Glenn ar...@wikimedia.org:
Anthony:
We would like to get copies of any of these dumps as well. This
includes any of the other files: stubs, tables, the lot.
If you have them for other
I'd like all the dumps from all the projects to be on line. Being
realistic I think we would wind up keeping offline copies of all of it,
and copies from every 6 months online, with the last several months of
consecutive runs (around 20 or 30 of them) also online.
Since these are en wiki we
Ryan Lane wrote a script to purge some of the FlaggedRevs memcached
entries; that ran last night as well.
The DOM-related errors all seem to have come from srv227; apache on that
host was restarted about half an hour ago and the results look good.
Ariel
On 26-12-2010 (Sun) at 01:49
The new host Dataset2 is now up and running and serving XML dumps. Those
of you paying attention to DNS entries should see the change within the
hour. We are not generating new dumps yet but expect to do so soon.
Ariel
Google donated storage space for backups for XML dumps. Accordingly, a
copy of the latest complete dump for each project is being copied over
(public files only). We expect to run similar copies once every two
weeks, keeping the four latest copies as well as one permanent copy at
every six month
if it gave everyone one more copy.
Ariel
On 20-12-2010 (Mon) at 17:41 +0100, Platonides wrote:
Ariel T. Glenn wrote:
Google donated storage space for backups for XML dumps. Accordingly, a
copy of the latest complete dump for each project is being copied over
(public files only
On 20-12-2010 (Mon) at 00:21 +0100, Platonides wrote:
Diederik van Liere wrote:
Which dump file is offered in smaller sub files?
http://download.wikimedia.org/enwiki/20100904/
Also see http://wikitech.wikimedia.org/view/Dumps/Parallelization
Expect to see more of this
The dumps in the archive are there because they are incomplete, by the
way.
Ariel
On 16-12-2010 (Thu) at 16:50 +0100, emijrp wrote:
Hi Monica;
Your dump is this one, with date 2010-03-12: [1][2]
a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2
has arrived and we are waiting for the
arrays to be put together and shipped!
Ariel
On 16-12-2010 (Thu) at 17:06 +0100, emijrp wrote:
All? The 2006 one too?
2010/12/16 Ariel T. Glenn ar...@wikimedia.org
The dumps in the archive are there because they are incomplete
emijrp emi...@gmail.com wrote:
Have you checked the md5sum?
2010/12/16 Gabriel Weinberg y...@alum.mit.edu
Ariel T. Glenn ariel at wikimedia.org writes:
We now have a copy of the dumps on a backup host. Although we are still
resolving hardware issues on the XML dumps
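A minimal verification sketch; the expected sum is the one published above
for the 2010-03-12 English dump, and the filename likewise:

    import hashlib

    def md5_of(path, chunk=1 << 20):
        h = hashlib.md5()
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    expected = "a3a5ee062abc16a79d111273d4a1a99a"  # from the md5sums file
    actual = md5_of("enwiki-20100312-pages-articles.xml.bz2")
    print("OK" if actual == expected else "MISMATCH: " + actual)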
On 17-12-2010 (Fri) at 00:52 +0100, Platonides wrote:
Roan Kattouw wrote:
I'm not sure how hard this would be to achieve (you'd have to
correlate blob parts with revisions manually using the text table;
there might be gaps for deleted revs because ES is append-only) or how
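For anyone attempting that correlation, a sketch of parsing external-storage
pointers out of the text table, assuming MediaWiki's DB://cluster/blob[/part]
pointer form for rows whose old_flags include "external":

    import re

    POINTER = re.compile(r"^DB://(?P<cluster>[^/]+)/(?P<blob>\d+)(?:/(?P<part>.+))?$")

    def parse_pointer(old_text):
        # Group revisions by (cluster, blob) to see which blob parts belong
        # together; gaps can remain where deleted revisions once pointed.
        m = POINTER.match(old_text)
        return m.groupdict() if m else None

    print(parse_pointer("DB://cluster24/12345/0a1b"))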