Matthieu Moy <matthieu....@imag.fr> writes:

> The algorithm to find a path from the local revision to the remote one
> was calling "git rev-list" and parsing its output N times. Run rev-list
> only once, and fill a hashtable with the result to optimize the body of
> the loop.

Good thinking.  I wonder if it would further reduce the overhead if
you stop using --children and do this using --parents instead, as
you will be reading the parsed_sha1..local range either way yourself
anyway.

> Signed-off-by: Matthieu Moy <matthieu....@imag.fr>
> ---
>  contrib/mw-to-git/git-remote-mediawiki | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/contrib/mw-to-git/git-remote-mediawiki 
> b/contrib/mw-to-git/git-remote-mediawiki
> index 8e46e4e..f9c0cc6 100755
> --- a/contrib/mw-to-git/git-remote-mediawiki
> +++ b/contrib/mw-to-git/git-remote-mediawiki
> @@ -1196,16 +1196,27 @@ sub mw_push_revision {
>       if ($last_local_revid > 0) {
>               my $parsed_sha1 = $remoteorigin_sha1;
>               # Find a path from last MediaWiki commit to pushed commit
> +             print STDERR "Computing path from local to remote ...\n";
> +             my @local_ancestry = split(/\n/, run_git("rev-list --boundary 
> --children $local ^$parsed_sha1"));
> +             my %local_ancestry;
> +             foreach my $line (@local_ancestry) {
> +                     if (my ($parent, $child) = $line =~ m/^-?([a-f0-9]+) 
> ([a-f0-9]+)/) {
> +                             $local_ancestry{$parent} = $child;
> +                             if ($parent eq $parsed_sha1 || $child eq 
> $parsed_sha1) {
> +                                     print STDERR "$parent -> $child\n";
> +                             }
> +                     } elsif (!$line =~ m/^([a-f0-9]+)/) {
> +                             die "Unexpected output from git rev-list: 
> $line";
> +                     }
> +             }
>               while ($parsed_sha1 ne $HEAD_sha1) {
> -                     my @commit_info =  grep(/^$parsed_sha1/, split(/\n/, 
> run_git("rev-list --children $local")));
> -                     if (!@commit_info) {
> +                     my $child = $local_ancestry{$parsed_sha1};
> +                     if (!$child) {
> +                             printf STDERR "Cannot find a path in history 
> from remote commit to last commit\n";
>                               return error_non_fast_forward($remote);
>                       }
> -                     my @commit_info_split = split(/ |\n/, $commit_info[0]);
> -                     # $commit_info_split[1] is the sha1 of the commit to 
> export
> -                     # $commit_info_split[0] is the sha1 of its direct child
> -                     push(@commit_pairs, \@commit_info_split);
> -                     $parsed_sha1 = $commit_info_split[1];
> +                     push(@commit_pairs, [$parsed_sha1, $child]);
> +                     $parsed_sha1 = $child;
>               }
>       } else {
>               # No remote mediawiki revision. Export the whole
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to