[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

2022-03-18 Thread Federico Leva (Nemo)
On 18/03/22 14:04, Erik del Toro wrote:
> Just wanted to tell you that http://aarddict.org users and dictionary creators also stumbled upon these missing namespaces and are now suggesting to continue scraping these. So is scraping the expected approach?
Thanks for mentioning this. Not sure

[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

2022-03-18 Thread Erik del Toro
Just wanted to tell you that http://aarddict.org users and dictionary creators also stumbled upon these missing namespaces and are now suggesting to continue scraping these. So is scraping the expected approach? See here: https://groups.google.com/g/aarddict/c/WssxfWQYsto Regards, Erik On 17.03

[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

2022-03-17 Thread Jan Berkel
>> Can they be found somewhere else? In N6 or N14? For me it seems that articles/pages that have a colon like Anexo: or Conjugaison: are not included.
> These are not namespace 0. Perhaps the export process forgot to respect $wgContentNamespaces?
I don't think these namespaces are incl

[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

2022-02-13 Thread Federico Leva (Nemo)
On 13/02/22 21:16, Erik del Toro wrote:
> Can they be found somewhere else? In N6 or N14? For me it seems that articles/pages that have a colon like Anexo: or Conjugaison: are not included.
These are not namespace 0. Perhaps the export process forgot to respect $wgContentNamespaces?
Federico

[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

2022-02-13 Thread John
That eswiki page is in namespace "wgNamespaceNumber":104; the FR page is "wgNamespaceNumber":116.
On Sun, Feb 13, 2022 at 2:17 PM Erik del Toro wrote:
> Hello.
>
> I am doing some conversions for the aarddict https://aarddict.org/ offline Wikipedia and Wiktionary app. I use mw2slob and the N0 files foun
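The namespace numbers quoted in this thread suggest a simple way to see why titles such as Anexo: or Conjugaison: would be absent from an N0 file. The following is a minimal sketch, not how mw2slob or the export process actually works: the names CUSTOM_NAMESPACES, namespace_of and in_n0_dump are hypothetical, and pairing Anexo with 104 and Conjugaison with 116 is inferred from the messages above.

```python
# Hypothetical prefix table; only the two namespace numbers quoted in the
# thread are listed. A real wiki defines many more namespaces.
CUSTOM_NAMESPACES = {
    "Anexo": 104,        # the eswiki page mentioned in the thread
    "Conjugaison": 116,  # the FR page mentioned in the thread
}


def namespace_of(title, prefixes=CUSTOM_NAMESPACES):
    """Return the namespace number implied by a title's prefix.

    Titles whose text before the first colon is not a known prefix
    (or that contain no colon) are treated as namespace 0.
    """
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix]
    return 0


def in_n0_dump(title, prefixes=CUSTOM_NAMESPACES):
    """True if the title would belong to a namespace-0-only dump."""
    return namespace_of(title, prefixes) == 0
```

A title with a colon is only excluded when the text before the colon is a registered namespace prefix, so an ordinary article title that happens to contain a colon still counts as namespace 0 in this sketch.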