On Thu, Feb 26, 2009 at 4:48 PM, Platonides wrote:
> Not only do you need to keep them in the same block. You also need to
> keep them inside the compression window. Unless you are going to reorder
> those 1M revisions to keep revisions to the same article together.
He already said that should be
Robert Ullmann wrote:
> look at the first three digits of the revid, when they are the same,
> they would be in the same "block" (this is assuming 1M revs/block as I
> suggested). You can check any title you like (remember _ for space,
> and % escapes for a lot of characters, but a good browser wil
Hi,
On Thu, Feb 26, 2009 at 2:29 AM, Andrew Garrett wrote:
> On Thu, Feb 26, 2009 at 5:08 AM, John Doe wrote:
>> But server space saved by compression would be compensated by the
>> stability and flexibility provided by this method. This would allow
>> whatever server is controlling t
On Thu, Feb 26, 2009 at 5:08 AM, John Doe wrote:
> But server space saved by compression would be compensated by the
> stability and flexibility provided by this method. This would allow
> whatever server is controlling the dump process to designate and delegate
> parallel processes for
--- On Wed, 25/2/09, Robert Ullmann wrote:
> From: Robert Ullmann
> Subject: Re: [Wikitech-l] Dump processes seem to be dead
> To: "Wikimedia developers"
> Date: Wednesday, 25 February 2009, 2:09
> you
> yourself suggested page id.
>
> I suggest the history
Marco Schuster wrote:
> Another idea: If $revision is
> deleted/oversighted/whateverhowmadeinvisible, then find out the block
> ID for the dump so that only this specific block needs to be
> re-created in next dump run. Or, better: do not recreate the dump
> block, but only remove the offending rev
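A minimal sketch of the first variant of this idea (re-creating only the affected blocks), assuming the 1M-revisions-per-block layout proposed elsewhere in this thread. The function name and the list of hidden revision IDs are hypothetical; nothing here reflects the actual dump scripts.

```python
# Hypothetical helper: map hidden (deleted/oversighted) revision IDs to the
# dump blocks that would need to be re-created on the next run.
# Assumes fixed-size blocks of 1,000,000 revision IDs, as suggested in the thread.

def blocks_to_rebuild(hidden_rev_ids, block_size=1_000_000):
    """Return the sorted, de-duplicated block IDs containing the given revisions."""
    return sorted({rev_id // block_size for rev_id in hidden_rev_ids})


if __name__ == "__main__":
    # Illustrative revision IDs only, not real data.
    print(blocks_to_rebuild([5, 1_000_001, 1_500_000, 42_000_000]))
```

Every other block is left untouched, which is the point of the suggestion: the cost of a deletion is bounded by the blocks it touches rather than by the size of the whole history.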
2009/2/25 John Doe :
> I'd recommend either 10m or 10% of
> the database, whichever is larger, for new dumps to screen out a majority of
> the deletions. What are your thoughts on this process, Brion (and the rest of
> the tech team)?
Another idea: If $revision is
deleted/oversighted/whateverhowmadein
2009/2/25 John Doe :
> But server space saved by compression would be compensated by the
> stability and flexibility provided by this method.
True, I didn't mean to say it was a bad idea, I was just pointing out
one disadvantage you may not have considered.
On Tue, Feb 24, 2009 at 5:09 PM, Robert Ullmann wrote:
> I suggest the history be partitioned into "blocks" by *revision ID*
>
> Like this: revision IDs (0)-999,999 go in "block 0", 1M to 2M-1 in
> "block 1", and so on. The English Wiktionary at the moment would have
> 7 blocks; the English Wikip
But server space saved by compression would be compensated by the
stability and flexibility provided by this method. This would allow
whatever server is controlling the dump process to designate and delegate
parallel processes for the same dump. So block 1 could be on server 1 and
block
2009/2/25 Robert Ullmann :
> I suggest the history be partitioned into "blocks" by *revision ID*
>
> Like this: revision IDs (0)-999,999 go in "block 0", 1M to 2M-1 in
> "block 1", and so on. The English Wiktionary at the moment would have
> 7 blocks; the English Wikipedia would have 273.
One prob
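The partitioning rule quoted above can be sketched as follows. Only the block size (1M revisions per block) comes from the proposal; the function names are hypothetical and this is an illustration, not code from the dump scripts.

```python
# Sketch of partitioning history by revision ID, assuming 1M revisions per block.
# Names are illustrative; only the block size is taken from the proposal.

BLOCK_SIZE = 1_000_000


def block_of(rev_id: int) -> int:
    """Block number holding a revision: IDs 0-999,999 -> 0, 1M to 2M-1 -> 1, ..."""
    return rev_id // BLOCK_SIZE


def block_range(block: int) -> tuple[int, int]:
    """Inclusive (first, last) revision IDs covered by a block."""
    start = block * BLOCK_SIZE
    return start, start + BLOCK_SIZE - 1


if __name__ == "__main__":
    print(block_of(999_999), block_of(1_000_000))
    print(block_range(1))
```

Because a revision ID never changes, a revision always stays in the same block, so a completed block only has to be rewritten when something inside it is removed.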
AFAIK there are "hands" in Amsterdam that can be called upon to do stuff as
necessary in the centre like any other hosting customer, but the need is not
quite of the same level as Tampa due to size, servers there etc. Seoul no
longer operates so this is not an issue.
regards
mark
On Tue, Feb 24
> The worry bit is that it seems srv136 will now work as apache.
> So, where will dumps be done?
I'm not sure where (or if it has changed), but they are running now (:-)
To Ariel Glenn:
On getting them to work better in the future, this is what I would suggest:
First, note that everything
Hoi,
Is there also a "Rob" in Amsterdam and Seoul?
Thanks,
GerardM
2009/2/24 Aryeh Gregor
>
> On Tue, Feb 24, 2009 at 9:42 AM, Thomas Dalton
> wrote:
> > Is there anyone within minutes of the servers at all times? Aren't
> > they at a remote data centre?
>
> Isn't Rob on-site?
2009/2/24 Aryeh Gregor :
> On Tue, Feb 24, 2009 at 9:42 AM, Thomas Dalton
> wrote:
>> Is there anyone within minutes of the servers at all times? Aren't
>> they at a remote data centre?
>
> Isn't Rob on-site?
He's based somewhere near the data centre, but I'm not sure he's
actually there unless
On Tue, Feb 24, 2009 at 9:42 AM, Thomas Dalton wrote:
> Is there anyone within minutes of the servers at all times? Aren't
> they at a remote data centre?
Isn't Rob on-site?
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikim
2009/2/24 Robert Ullmann :
> When a server is reported down (in this case hard; won't reply to
> ping) it should be physically looked at within minutes.
Is there anyone within minutes of the servers at all times? Aren't
they at a remote data centre?
Robert Ullmann wrote:
> All servers should be monitored, on several levels (ping, various
> queries, checking processes)
Nagios should have been monitoring them.
> Someone should be "watching" the monitor 24x7. (being right there, or
> by SMS, whatever ;)
Don't know if there can be a nagios "si
Let me ask a separate question (Ariel may be interested in this):
What if we took the regular permanent media backups, and WMF filtered
them in house just to remove the classified stuff (;-), and then put
them somewhere where others could convert them to the desired
format(s)? (Build all-history f
On Tue, Feb 24, 2009 at 6:49 AM, Andrew Garrett wrote:
> On Tue, Feb 24, 2009 at 1:07 PM, Robert Ullmann wrote:
>> Really? I mean is this for real?
>>
>> The sequence ought to be something like: breaker trips, monitor shows
>> within a minute or two that 4 servers are offline, and not scheduled
>
On Tue, Feb 24, 2009 at 1:07 PM, Robert Ullmann wrote:
> Really? I mean is this for real?
>
> The sequence ought to be something like: breaker trips, monitor shows
> within a minute or two that 4 servers are offline, and not scheduled
> to be. In the next 5 minutes someone looks at the server(s),
Hmm:
On Mon, Feb 23, 2009 at 9:04 PM, Russell Blau wrote:
> 2) Within the last hour, the server log at
> http://wikitech.wikimedia.org/wiki/Server_admin_log indicates that Rob found
> and fixed the cause of srv31 (and srv32-34) being down -- a circuit breaker
> was tripped in the data center.
Robert Rohde wrote:
> The largest gains are almost certainly going to be in parallelization
> though. A single monolithic dumper is impractical for enwiki.
>
> -Robert Rohde
Using dumps compressed per blocks, as the ones I used for
http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040
On 2/23/09 12:13 PM, Ariel T. Glenn wrote:
> I asked for it, and that's why it was assigned to me. I should have
> recognized much sooner that I could not actually get it done and should
> have brought this to Brion's attention instead of continuing to hang on
> to it after he brought it to my att
On 2/23/09 3:08 AM, Marco Schuster wrote:
> Even if you had the dumps, you have another problem: They're
> incredibly big and so a bit difficult to parse. So, a small suggestion
> if the dumps will ever be workin' again: Split the history and current
> db stuff by alphabet, please.
Define alphabet
On 23-02-2009, Mon, at 19:02 +, Thomas Dalton
wrote:
> 2009/2/23 Ariel T. Glenn :
> > The reason these dumps are not rewritten more efficiently is that this
> > job was handed to me (at my request) and I have not been able to get to
> > it, even though it is the first thing on
On Mon, Feb 23, 2009 at 11:08 AM, Alex wrote:
> Most of that hasn't been touched in years, and it seems to be mainly a
> Python wrapper around the dump scripts in /phase3/maintenance/ which
> also don't seem to have had significant changes recently. Has anything
> been done recently (in a very bro
Most of that hasn't been touched in years, and it seems to be mainly a
Python wrapper around the dump scripts in /phase3/maintenance/ which
also don't seem to have had significant changes recently. Has anything
been done recently (in a very broad sense of the word)? Or at least, has
anything been w
2009/2/23 Ariel T. Glenn :
> The reason these dumps are not rewritten more efficiently is that this
> job was handed to me (at my request) and I have not been able to get to
> it, even though it is the first thing on my list for development work.
> So, if there are going to be rants, they can be di
Ariel,
Thank you for giving some insight into what has been going on behind
the scenes.
I have a few questions that will hopefully get some answers to those
of us eager to help out
in any way we can.
What are the planned code changes to speed the process up? Can we help
this volunteer
with
Thanks for the update Russell!
On Feb 23, 2009, at 10:04 AM, Russell Blau wrote:
> "Russell Blau" wrote in message
> news:gnuacf$hf...@ger.gmane.org...
>>
>> I have to second this. I tried to report this outage several times
>> last
>> week - on IRC, on this mailing list, and on Bugzilla. Al
"Russell Blau" wrote in message
news:gnuacf$hf...@ger.gmane.org...
>
> I have to second this. I tried to report this outage several times last
> week - on IRC, on this mailing list, and on Bugzilla. All reports -- NOT
> COMPLAINTS, JUST REPORTS -- were met with absolute silence.
Two updates
"Lars Aronsson" wrote in message
news:pine.lnx.4.64.0902231202140.1...@localhost.localdomain...
>
> However, quite independent of your development work, the current
> system for dumps seems to have stopped on February 12. That's the
> impression I get from looking at
> http://download.wikimedia.o
Ariel T. Glenn wrote:
>> The reason these dumps are not rewritten more efficiently is
>> that this job was handed to me (at my request) and I have not
>> been able to get to it, even though it is the first thing on my
>> list for development work.
>> [...]
>> The in-office needs that I am also
2009/2/22 Robert Ullmann :
> Want everyone to just dynamically crawl the live DB, with whatever
> screwy lousy inefficiency? Fine, just continue as you are, where that
> is all that can be relied upon!
Even if you had the dumps, you have another problem: They're
incredibly big and so a bit difficu
yep, http://svn.wikimedia.org/viewvc/mediawiki/trunk/backup/ +)
2009/2/23 Alex :
> Ariel T. Glenn wrote:
>> The reason these dumps are not rewritten more efficiently is that this
>> job was handed to me (at my request) and I have not been able to get to
>> it, even though it is the first thing on
Ariel T. Glenn wrote:
> The reason these dumps are not rewritten more efficiently is that this
> job was handed to me (at my request) and I have not been able to get to
> it, even though it is the first thing on my list for development work.
> So, if there are going to be rants, they can be directe
The reason these dumps are not rewritten more efficiently is that this
job was handed to me (at my request) and I have not been able to get to
it, even though it is the first thing on my list for development work.
So, if there are going to be rants, they can be directed at me, not at
the whole team
Robert Ullmann:
> What is with this?
wrong list. the Foundation needs to allocate the resources to fix dumps. it
hasn't done so, therefore dumps are still broken. perhaps you might ask the
Foundation why dumps have such a low priority.
- r
Hoi,
There have been previous offers for developer time and for hardware...
Thanks,
GerardM
2009/2/23 Platonides
> Robert Ullmann wrote:
> > Hi,
> >
> > Maybe I should offer a constructive suggestion?
>
> They are better than rants :)
>
> > Clearly, trying to do these dumps (particularly
Robert Ullmann wrote:
> Hi,
>
> Maybe I should offer a constructive suggestion?
They are better than rants :)
> Clearly, trying to do these dumps (particularly "history" dumps) as it
> is being done from the servers is proving hard to manage
>
> I also realize that you can't just put the set of
Hi,
Maybe I should offer a constructive suggestion?
Clearly, trying to do these dumps (particularly "history" dumps) as it
is being done from the servers is proving hard to manage
I also realize that you can't just put the set of daily
permanent-media backups on line, as they contain lots of use
What is with this? Why are the XML dumps (the primary product of the
projects: re-usable content) the absolute effing lowest possible
effing priority? Why?
I just finished (I thought) putting together some new software to
update iwikis on the wiktionaries. It is set up to read the
"langlinks" and
"Andreas Meier" wrote in message
news:4997d645.8050...@gmx.de...
> Hello,
>
> the current dump building seems to be dead and perhaps should be killed
> by hand.
>
Reported: https://bugzilla.wikimedia.org/show_bug.cgi?id=17535
Hello,
the current dump building seems to be dead and perhaps should be killed
by hand.
Best regards
Andim
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l