Marco Schuster wrote:
Rolf Lampa wrote:
Don't the XML dumps contain the flag for flagged revs?
The XML dumps are no use to me: way too much overhead (in particular,
they are old, and I want to use single files, which are easier to process
than one huge XML file). And they don't
Daniel Kinzler wrote:
Rolf Lampa wrote:
I'd love, however, to see the flagged rev status as an attribute in one
of the tags, for example revision flagged_rev=true
Regards,
Naw, it's more complex than that. You can have any number of different flags.
It would probably have to be
2009/1/28 Platonides platoni...@gmail.com:
Daniel Kinzler wrote:
Rolf Lampa wrote:
I'd love, however, to see the flagged rev status as an attribute in one
of the tags, for example revision flagged_rev=true
Regards,
Naw, it's more complex than that. You can have any number of different
Hi all,
I want to crawl around 800,000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
For this, I obviously need to spider Wikipedia.
What are the limits (rate!) here, what UA should I use and
Marco Schuster wrote:
I want to crawl around 800,000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
[...]
flaggedpages where fp_reviewed=1;. Is it correct this one gives me a
list of all articles with flagged revs,
Don't the XML
Rolf Lampa wrote:
Marco Schuster wrote:
I want to crawl around 800,000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
[...]
flaggedpages where fp_reviewed=1;. Is it correct this one gives me a
list of all articles with flagged revs,
Marco Schuster wrote:
Hi all,
I want to crawl around 800,000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
For this, I obviously need to spider Wikipedia.
What are the limits (rate!) here, what UA should I use and what
caveats do I
On Wed, Jan 28, 2009 at 12:49 AM, Rolf Lampa wrote:
Marco Schuster wrote:
I want to crawl around 800,000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
[...]
flaggedpages where