On Wed, Jan 28, 2009 at 8:28 AM, Tei wrote:
> On Wed, Jan 28, 2009 at 1:41 AM, Aryeh Gregor
> wrote:
>> On Tue, Jan 27, 2009 at 7:37 PM, George Herbert
>> wrote:
>>> Right, but a live mirror is a very different thing than a search box link.
>>
>> Well, as far as I can tell, we have no idea whether the original
>> poster meant either of those, or perhaps something else altogether.
On Wed, Jan 28, 2009 at 1:41 AM, Aryeh Gregor
wrote:
> On Tue, Jan 27, 2009 at 7:37 PM, George Herbert
> wrote:
>> Right, but a live mirror is a very different thing than a search box link.
>
> Well, as far as I can tell, we have no idea whether the original
> poster meant either of those, or perhaps something else altogether.
On Wed, Jan 28, 2009 at 1:13 AM, Daniel Kinzler wrote:
> Marco Schuster wrote:
>>> Fetch them from the toolserver (there's a tool by duesentrieb for that).
>>> It will catch almost all of them from the toolserver cluster, and make a
>>> request to wikipedia only if needed.
http://svn.wikimedia.org/viewvc/mediawiki/trunk/tools/jobs-loop/run-jobs.c?revision=22101&view=markup&sortby=date
As mentioned, it is just a sample script. For sites with just one
master/slave cluster, any simple script that keeps looping to run
maintenance/runJobs.php will do.
-Aaron
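For illustration, a minimal sketch of such a loop in PHP, run from the
wiki root (--maxjobs is runJobs.php's batch-size option; the batch size
and sleep interval here are arbitrary):

  <?php
  // Minimal job-runner loop: process the queue in batches, forever.
  while ( true ) {
      passthru( 'php maintenance/runJobs.php --maxjobs 100' );
      sleep( 5 ); // pause briefly so an empty queue doesn't busy-loop
  }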
Dawson wrote:
> Modified config file as follows:
>
> $wgUseDatabaseMessages = false;
> $wgUseFileCache = true;
> $wgMainCacheType = "CACHE_ACCEL";
This should be $wgMainCacheType = CACHE_ACCEL; (a constant), not
$wgMainCacheType = "CACHE_ACCEL"; (a string).
On Tue, Jan 27, 2009 at 7:37 PM, George Herbert
wrote:
> Right, but a live mirror is a very different thing than a search box link.
Well, as far as I can tell, we have no idea whether the original
poster meant either of those, or perhaps something else altogether.
Obviously nobody minds a search
On Tue, Jan 27, 2009 at 3:54 PM, Aryeh Gregor
> wrote:
> Anyway, the reason live mirrors are prohibited is not for load
> reasons. I believe it's because if a site does nothing but stick up
> some ads and add no value, Wikimedia is going to demand a cut of the
> profit for using its trademarks a
Marco Schuster wrote:
>> Fetch them from the toolserver (there's a tool by duesentrieb for that).
>> It will catch almost all of them from the toolserver cluster, and make a
>> request to wikipedia only if needed.
> I highly doubt this is "legal" use for the toolserver, and I pretty
> much guess
On Wed, Jan 28, 2009 at 12:53 AM, Platonides wrote:
> Marco Schuster wrote:
>> Hi all,
>>
>> I want to crawl around 800.000 flagged revisions from the German
>> Wikipedia, in order to make a dump containing only flagged revisions.
>> For this, I obviously need to spider Wikipedia.
On Wed, Jan 28, 2009 at 12:49 AM, Rolf Lampa wrote:
> Marco Schuster wrote:
>> I want to crawl around 800.000 flagged revisions from the German
>> Wikipedia, in order to make a dump containing only flagged revisions.
> [...]
>> flaggedpages where fp_reviewed=1;". Is it correct this one gives me a
>> list of all articles with flagged revs,
On Tue, Jan 27, 2009 at 6:56 PM, Jason Schulz wrote:
> Also, see
> http://www.mediawiki.org/wiki/User:Aaron_Schulz/How_to_make_MediaWiki_fast
The shell script you mention in step 2 has some stuff in it that makes
it unusable outside Wikimedia:
1) lots
On Tue, Jan 27, 2009 at 6:43 PM, George Herbert
wrote:
> Google switching to use our search would crush us, obviously.
Doubtful. It wouldn't be terribly pleasant, but I doubt it would take
down the site so easily. Alexa says google.com gets about ten times
the traffic of wikipedia.org. If goog
Marco Schuster wrote:
> Hi all,
>
> I want to crawl around 800.000 flagged revisions from the German
> Wikipedia, in order to make a dump containing only flagged revisions.
> For this, I obviously need to spider Wikipedia.
> What are the limits (rate!) here, what UA should I use and what
> caveats
Rolf Lampa wrote:
> Marco Schuster wrote:
>> I want to crawl around 800.000 flagged revisions from the German
>> Wikipedia, in order to make a dump containing only flagged revisions.
> [...]
>> flaggedpages where fp_reviewed=1;". Is it correct this one gives me a
>> list of all articles with flagged revs,
Marco Schuster wrote:
> I want to crawl around 800.000 flagged revisions from the German
> Wikipedia, in order to make a dump containing only flagged revisions.
[...]
> flaggedpages where fp_reviewed=1;". Is it correct this one gives me a
> list of all articles with flagged revs,
Doesn't the xml
On 1/27/09 2:55 PM, Robert Rohde wrote:
> On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber wrote:
>> On 1/27/09 2:35 PM, Thomas Dalton wrote:
>>> The way I see it, what we need is to get a really powerful server
>> Nope, it's a software architecture issue. We'll restart it with the new
>> arch when it's ready to go.
On Tue, Jan 27, 2009 at 11:29 AM, Steve Summit wrote:
> Jeff Ferland wrote:
> > You'll need a quite impressive machine to host even just the current
> > revisions of the wiki. Expect to expend 10s to even hundreds of
> > gigabytes on the database alone for Wikipedia using only the current
> > versions.
Hi all,
I want to crawl around 800.000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
For this, I obviously need to spider Wikipedia.
What are the limits (rate!) here, what UA should I use and what caveats
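For illustration, a minimal sketch of a polite fetch loop; the revision
IDs, User-Agent string and one-request-per-second rate below are
assumptions for illustration, not stated site policy:

  <?php
  // Fetch flagged revisions by revision ID from de.wikipedia.org,
  // with a descriptive User-Agent and a delay between requests.
  $revIds = array( 12345678, 12345679 ); // hypothetical example IDs
  foreach ( $revIds as $revId ) {
      $ch = curl_init( 'http://de.wikipedia.org/w/index.php?oldid=' . (int)$revId );
      curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
      curl_setopt( $ch, CURLOPT_USERAGENT,
          'FlaggedRevsDump/0.1 (contact: you@example.org)' ); // hypothetical UA
      $html = curl_exec( $ch );
      curl_close( $ch );
      // ... write $html to the dump here ...
      sleep( 1 ); // assumed polite rate: one request per second
  }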
On Tue, Jan 27, 2009 at 2:42 PM, Brion Vibber wrote:
> On 1/27/09 2:35 PM, Thomas Dalton wrote:
>> The way I see it, what we need is to get a really powerful server
>
> Nope, it's a software architecture issue. We'll restart it with the new
> arch when it's ready to go.
I don't know what your tim
On 1/27/09 2:35 PM, Thomas Dalton wrote:
> The way I see it, what we need is to get a really powerful server
Nope, it's a software architecture issue. We'll restart it with the new
arch when it's ready to go.
-- brion
> Whether we want to let the current process continue to try and finish
> or not, I would seriously suggest someone look into redumping the rest
> of the enwiki files (i.e. logs, current pages, etc.). I am also among
> the people that care about having reasonably fresh dumps and it really
> is a p
The problem, as I understand it (and Brion may come by to correct me)
is essentially that the current dump process is designed in a way that
can't be sustained given the size of enwiki. It really needs to be
re-engineered, which means that developer time is needed to create a
new approach to dumping.
Chad wrote:
> Should be done with a wiki's content language as of r46372.
>
> -Chad
Thanks! That's already a big improvement, but why content language? As I
pointed out in response to your question, it needs to be user language
on Meta, Incubator, Wikispecies, Beta Wikiversity, old Wiki
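To illustrate the distinction, a sketch against MediaWiki's API of the
time ('example-notice' is a hypothetical message key):

  $inUserLang    = wfMsg( 'example-notice' );           // viewer's interface language
  $inContentLang = wfMsgForContent( 'example-notice' ); // wiki's content language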
On Mon, Jan 26, 2009 at 12:44 PM, Ilmari Karonen wrote:
> Chad wrote:
> > I was going to provide a specific parameter for it. That entire key sucks
> > though anyway, I should probably ditch the md5()'d URL in favor of using
> > the actual name. Fwiw: I've got a patch working, but I'm not quite r
I have a decent server that is dedicated for a Wikipedia project that
depends on the fresh dumps. Can it be used in any way to speed up the process
of generating the dumps?
bilal
On Tue, Jan 27, 2009 at 2:24 PM, Christian Storm wrote:
> >> On 1/4/09 6:20 AM, yegg at alum.mit.edu wrote:
> >> The c
Jeff Ferland wrote:
> You'll need a quite impressive machine to host even just the current
> revisions of the wiki. Expect to expend 10s to even hundreds of
> gigabytes on the database alone for Wikipedia using only the current
> versions.
No, no, no. You're looking at it all wrong. That's
>> On 1/4/09 6:20 AM, yegg at alum.mit.edu wrote:
>> The current enwiki database dump
>> (http://download.wikimedia.org/enwiki/20081008/
>> ) has been crawling along since 10/15/2008.
> The current dump system is not sustainable on very large wikis and
> is being replaced. You'll hear about it
I'll try to weigh in with a bit of useful information, but it probably
won't help that much.
You'll need a quite impressive machine to host even just the current
revisions of the wiki. Expect to expend 10s to even hundreds of
gigabytes on the database alone for Wikipedia using only the current
versions.
Maybe this is what this guy needs:
http://en.wiktionary.org/wiki/Special:Search
test:
http://zerror.com/unorganized/wika/test.htm
It doesn't seem Wiktionary blocks external searches now (via referrer),
but they may change the policy or the required parameters in the
future.
On Tue, Jan 27,
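For what it's worth, a minimal sketch of such an external search box
backend (a hypothetical search.php; 'search' and 'fulltext' are the
parameters MediaWiki's Special:Search accepts):

  <?php
  // Forward a local search box's query to Wiktionary's Special:Search.
  $q = isset( $_GET['q'] ) ? $_GET['q'] : '';
  header( 'Location: http://en.wiktionary.org/wiki/Special:Search'
      . '?search=' . urlencode( $q ) . '&fulltext=Search' );
  exit;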
Refer to the reference.com website and do a search.
- Original Message
From: Thomas Dalton
To: Wikimedia developers
Sent: Tuesday, January 27, 2009 1:07:36 PM
Subject: Re: [Wikitech-l] hosting wikipedia
2009/1/27 Stephen Dunn :
> yes, website. so a web page has a search box that passes the input to
> wiktionary and results are provided on a results page.
2009/1/27 Stephen Dunn :
> yes, website. so a web page has a search box that passes the input to
> wiktionary and results are provided on a results page. an example may be
> reference.com
How would this differ from the search box on en.wiktionary.org? What
are you actually trying to achieve?
yes, website. so a web page has a search box that passes the input to
wiktionary and results are provided on a results page. an example may be
reference.com
- Original Message
From: Thomas Dalton
To: Wikimedia developers
Sent: Tuesday, January 27, 2009 12:50:18 PM
Subject: Re: [Wikitech-l] hosting wikipedia
To use filecache, you need to set $wgShowIPinHeader = false;
Also, see
http://www.mediawiki.org/wiki/User:Aaron_Schulz/How_to_make_MediaWiki_fast
-Aaron
--
From: "Dawson"
Sent: Tuesday, January 27, 2009 9:52 AM
To: "Wikimedia developers"
Subject:
2009/1/27 Stephen Dunn :
> I am working on a project to host wiktionary on one web page and wikipedia on
> another. So both, sorry..
You mean web *site*, surely? They are both far too big to fit on a
single page. I think you need to work out precisely what it is you're
trying to do before we can
I am working on a project to host wiktionary on one web page and wikipedia on
another. So both, sorry..
- Original Message
From: Thomas Dalton
To: Wikimedia developers
Sent: Tuesday, January 27, 2009 12:43:49 PM
Subject: Re: [Wikitech-l] hosting wikipedia
2009/1/27 Stephen Dunn :
>
2009/1/27 Stephen Dunn :
> Hi Folks:
>
> I am a newbie so I apologize if I am asking basic questions. How would I go
> about hosting wiktionary allowing search queries via the web using
> opensearch. I am having trouble finding info on how to set this up. Any
> assistance is greatly appreciated.
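One way to do this against MediaWiki's built-in OpenSearch endpoint, as
a sketch (the search term is an arbitrary example):

  <?php
  // Query Wiktionary's OpenSearch endpoint and print the title
  // suggestions it returns.
  $term = 'house';
  $url = 'http://en.wiktionary.org/w/api.php?action=opensearch&format=json'
      . '&search=' . urlencode( $term );
  $data = json_decode( file_get_contents( $url ), true );
  // The response is a two-element array: [ query, [ suggestions... ] ]
  foreach ( $data[1] as $title ) {
      echo $title, "\n";
  }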
Hi Folks:
I am a newbie so I apologize if I am asking basic questions. How would I go
about hosting wiktionary allowing search queries via the web using opensearch.
I am having trouble finding info on how to set this up. Any assistance is
greatly appreciated.
Modified config file as follows:
$wgUseDatabaseMessages = false;
$wgUseFileCache = true;
$wgMainCacheType = "CACHE_ACCEL";
I also installed xcache and eaccelerator. The improvement in speed is huge.
2009/1/27 Aryeh Gregor
>
> On Tue, Jan 27, 2009 at 5:31 AM, Dawson wrote:
> > Hello, I have a couple of mediawiki installations on two different slices at
> > Slicehost, both of which run websites on the same slice with no speed
> > problems, however, the mediawiki themselves run like dogs!
On Tue, Jan 27, 2009 at 5:31 AM, Dawson wrote:
> Hello, I have a couple of mediawiki installations on two different slices at
> Slicehost, both of which run websites on the same slice with no speed
> problems, however, the mediawiki themselves run like dogs!
> http://wiki.medicalstudentblog.co.uk/
Hello, I have a couple of mediawiki installations on two different slices at
Slicehost, both of which run websites on the same slice with no speed
problems, however, the mediawiki themselves run like dogs!
http://wiki.medicalstudentblog.co.uk/ Any ideas what to look for or ways to
optimise them? I