Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Ariel T. Glenn
On 24-03-2011, Thu, at 20:29 -0400, James Linden wrote: So, thoughts on this? Is 'Move Dumping Process to another language' a good idea at all? I'd worry a lot less about what languages are used than whether the process itself is scalable. I'm not a mediawiki /

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Andrew Dunbar
On 25 March 2011 18:21, Ariel T. Glenn ar...@wikimedia.org wrote: On 24-03-2011, Thu, at 20:29 -0400, James Linden wrote: So, thoughts on this? Is 'Move Dumping Process to another language' a good idea at all? I'd worry a lot less about what languages are used than

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Charles Polisher
Platonides wrote: Yuvi Panda wrote: Hi, I'm Yuvi, a student looking forward to working with MediaWiki via this year's GSoC. <snip/> An idea I have been pondering is to pass the offset to the previous revision to the compressor, so it would need much less work in the compressing window to

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Platonides
Andrew Dunbar wrote: Just a thought, wouldn't it be easier to generate dumps in parallel if we did away with the assumption that the dump would be in database order. The metadata in the dump provides the ordering info for the people that require it. Andrew Dunbar (hippietrail) I don't see

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Ariel T. Glenn
On 25-03-2011, Fri, at 21:49 +0100, Platonides wrote: Andrew Dunbar wrote: Just a thought, wouldn't it be easier to generate dumps in parallel if we did away with the assumption that the dump would be in database order. The metadata in the dump provides the ordering info

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-25 Thread Platonides
Ariel T. Glenn wrote: Amusingly, splitting based on some number of articles doesn't really balance out the pieces, at least for history dumps, after the project has been around long enough with enough activity. Splitting by number of revisions is what we really want, and the older pages have
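
[Editor's illustration, not part of the original message.] The point about balancing history dumps by revisions rather than by articles can be sketched roughly as below. This is only a minimal sketch under assumptions: the function names `split_by_revisions` and `chunk_totals` and the per-page revision counts are hypothetical, and nothing here reflects how the actual dump scripts choose their ranges.

```python
from typing import List, Sequence, Tuple


def split_by_revisions(rev_counts: Sequence[int], n_chunks: int) -> List[Tuple[int, int]]:
    """Split pages (kept in page order) into contiguous ranges whose
    revision totals are roughly equal.

    rev_counts[i] is the (hypothetical) number of revisions of page i.
    Returns half-open (start, end) index ranges; a trailing range may be
    empty if the last pages are consumed by an earlier chunk.
    """
    total = sum(rev_counts)
    ranges: List[Tuple[int, int]] = []
    start = 0
    running = 0
    for i, count in enumerate(rev_counts):
        running += count
        # Cumulative revision total at which the current chunk should end.
        target = total * (len(ranges) + 1) / n_chunks
        if running >= target and len(ranges) < n_chunks - 1:
            ranges.append((start, i + 1))
            start = i + 1
    # Whatever is left goes into the final chunk.
    ranges.append((start, len(rev_counts)))
    return ranges


def chunk_totals(rev_counts: Sequence[int], ranges: List[Tuple[int, int]]) -> List[int]:
    """Sum the revisions that fall into each (start, end) range."""
    return [sum(rev_counts[s:e]) for s, e in ranges]


# Made-up counts: the older (lower-numbered) pages carry most of the history.
counts = [1000, 800, 50, 50, 40, 30, 20, 10]

by_articles = [(0, 4), (4, 8)]               # equal number of pages per chunk
by_revisions = split_by_revisions(counts, 2)

print(chunk_totals(counts, by_articles))     # [1900, 100]
print(chunk_totals(counts, by_revisions))    # [1000, 1000]
```

With these made-up numbers, an even split by page count leaves one worker with almost all of the revisions, while the revision-based split gives each worker the same amount of history to process.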

[Wikitech-l] Moving the Dump Process to another language

2011-03-24 Thread Yuvi Panda
Hi, I'm Yuvi, a student looking forward to working with MediaWiki via this year's GSoC. I want to work on something dump related, and have been bugging apergos (Ariel) for a while now. One of the things that popped up into my head is moving the dump process to another language (say, C#, or Java,

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-24 Thread Brion Vibber
On Thu, Mar 24, 2011 at 1:05 PM, Yuvi Panda yuvipa...@gmail.com wrote: Hi, I'm Yuvi, a student looking forward to working with MediaWiki via this year's GSoC. I want to work on something dump related, and have been bugging apergos (Ariel) for a while now. One of the things that popped up

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-24 Thread Platonides
Yuvi Panda wrote: Hi, I'm Yuvi, a student looking forward to working with MediaWiki via this year's GSoC. I want to work on something dump related, and have been bugging apergos (Ariel) for a while now. One of the things that popped up into my head is moving the dump process to another

Re: [Wikitech-l] Moving the Dump Process to another language

2011-03-24 Thread James Linden
So, thoughts on this? Is 'Move Dumping Process to another language' a good idea at all? I'd worry a lot less about what languages are used than whether the process itself is scalable. I'm not a mediawiki / wikipedia developer, but as a developer / sys admin, I'd think that adding another