On Sun, Oct 29, 2017 at 10:51 PM, Antoine Beaupré <anar...@debian.org> wrote:
> When we specify a list of namespaces to fetch from, by default the MW
> API will not fetch from the default namespace, referred to as "(Main)"
> in the documentation:
>
> https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces
>
> I haven't found a way to address that "(Main)" namespace when getting
> the namespace ids: indeed, when listing namespaces, there is no
> "canonical" field for the main namespace, although there is a "*"
> field that is set to "" (empty). So in theory, we could specify the
> empty namespace to get the main namespace, but that would make
> specifying namespaces harder for the user: we would need to teach
> users about the "empty" default namespace. It would also make the code
> more complicated: we'd need to parse quotes in the configuration.
>
> So we simply override the query here and allow the user to specify
> "(Main)" since that is the publicly documented name.

Thanks, this explanation makes the patch a lot clearer. More below...

> Signed-off-by: Antoine Beaupré <anar...@debian.org>
> ---
> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -264,9 +264,14 @@ sub get_mw_tracked_categories {
>  sub get_mw_tracked_namespaces {
>      my $pages = shift;
>      foreach my $local_namespace (@tracked_namespaces) {
> -        my $namespace_id = get_mw_namespace_id($local_namespace);
> +        my ($namespace_id, $mw_pages);
> +        if ($local_namespace eq "(Main)") {
> +            $namespace_id = 0;
> +        } else {
> +            $namespace_id = get_mw_namespace_id($local_namespace);
> +        }

I meant to ask this in the previous round, but with the earlier patch
mixing several distinct changes into one, I plumb forgot: Would it
make sense to move this "(Main)" special case into
get_mw_namespace_id() itself? After all, that function is all about
determining an ID associated with a name, and "(Main)" is a name.
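
To illustrate, here is a rough sketch of what I have in mind. Note that
lookup_namespace_id() and the %known_ids table are hypothetical
stand-ins for the function's existing lookup logic, not code taken from
the script:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the existing lookup logic; the real function
# consults a cache and queries the wiki. Illustrative table only.
my %known_ids = (Talk => 1, User => 2, Special => -1);

sub lookup_namespace_id {
    my $name = shift;
    return $known_ids{$name};
}

sub get_mw_namespace_id {
    my $name = shift;
    # "(Main)" is the documented name of the default namespace (ID 0),
    # which has no canonical name to look up.
    return 0 if $name eq '(Main)';
    return lookup_namespace_id($name);
}

# The caller then needs no special case of its own:
foreach my $local_namespace ('(Main)', 'Talk', 'Special') {
    my $namespace_id = get_mw_namespace_id($local_namespace);
    next if !defined($namespace_id) || $namespace_id < 0; # skip unknown/virtual
    print "$local_namespace => $namespace_id\n";
}
```

With the special case living in the lookup function, any other caller
of get_mw_namespace_id() would understand "(Main)" as well.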

>          next if $namespace_id < 0; # virtual namespaces don't support allpages
> -        my $mw_pages = $mediawiki->list( {
> +        $mw_pages = $mediawiki->list( {

Why did the "my" of $mw_pages get moved up to the top of the foreach
loop? I can't seem to see any reason for it. Is this an unrelated
change accidentally included in this patch?

>              action => 'query',
>              list => 'allpages',
>              apnamespace => $namespace_id,
> --
