James Carlson <james.d.carlson at sun.com> writes:

>> Did you find any issues?
>
> Compared with Teamware, Mercurial is relatively fast on some things
> that I do relatively infrequently (such as full bringovers) and
> relatively slow on some things that I do often (such as listing
> modified files).  And not having access to the raw per-file changes so
> that I can pull SCCS tricks to get myself out of tight Teamware
> problems is a bit nerve-wracking.
>
> For example, 'webrev' in an hg workspace will spin its wheels for a
> long time consulting the parent workspace via ssh and coming up with a
> file list on a simple one-file change, but 'wx webrev' would have
> blasted out the data quickly.
>
> It's going to take a change in development habits and expectations.

So, I think you hinted at this during your review when looking at
one of the XXXs in WorkSpace (I think the "tied to the parent"
comment).

When I've profiled this, it has not been communication with the parent
that had the performance impact; it has been the search for new,
uncommitted changes (which, as implemented in Mercurial right now,
walks far too much of the workspace[1]) that has eaten up the vast
majority of the runtime.
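
For the curious, the call at issue boils down to something like the
following (a sketch only, against the Mercurial Python API of roughly
this vintage; the exact status() signature has moved around between
releases, and the workspace path is made up):

    from mercurial import hg, ui

    repo = hg.repository(ui.ui(), '/path/to/workspace')

    # Returns (modified, added, removed, deleted, unknown, ignored,
    # clean).  In the versions in question, computing this visits far
    # more of the working directory than the active list needs.
    st = repo.status(unknown=True)
    print('%d modified, %d unknown' % (len(st[0]), len(st[4])))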

That walk is also why I never got as far as implementing any kind of
persistent caching there (the data in question is uncacheable), and
why I didn't give up on my "no user-maintained active list" kick:
without files being read-only until 'wx/sccs edit', I think a
user-maintained list would lead to considerable confusion.

The most noticeable indicator of this (beyond dtrace and the python
profiler) is that runtime increases with how much of the workspace you
have built, and thus with the number of uncontrolled files (hg st -u),
which Hg will walk even though we don't care about them.  SFW, by way
of its tarballs, is an obvious victim of this.
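
If you want to reproduce the effect without a built workspace to hand,
something like this (illustrative only; the counts and file names are
made up) shows 'hg status' runtime growing with the number of
uncontrolled files:

    import os, subprocess, tempfile, time

    ws = tempfile.mkdtemp()
    subprocess.check_call(['hg', 'init', ws])

    # One controlled, committed file, so status has real work to do.
    open(os.path.join(ws, 'controlled'), 'w').close()
    subprocess.check_call(['hg', '-R', ws, 'add'])
    subprocess.check_call(['hg', '-R', ws, 'commit', '-m', 'seed',
                           '-u', 'test'])

    # Pile up uncontrolled files and time status at each step.
    total = 0
    for target in (0, 1000, 10000):
        while total < target:
            open(os.path.join(ws, 'junk.%d' % total), 'w').close()
            total += 1
        null = open(os.devnull, 'w')
        start = time.time()
        subprocess.check_call(['hg', '-R', ws, 'status'], stdout=null)
        print('%5d uncontrolled files: %.2fs'
              % (total, time.time() - start))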

Runtime also increases (to a lesser degree) with the number of
changesets local to the child, though I suspect not enough that fixing
it would make a noticeable difference (I would be happy to be wrong);
AL._build() is in many respects visibly inefficient in the way it
chews through data.

There used to be a commented-out toggle in WorkSpace (for the sake of
testing) that made the active list ignore uncommitted changes
(something that's harder now that we deal with branches, as they're
more important).  With it set to ignore them, things were considerably
faster, even on less-than-great hardware (again, empirical evidence to
go with my memory of the profile runs).

I'd encourage anyone and everyone to take a shot at it with dtrace or
one of the python profilers, either to confirm this or to find another
way to rectify it.
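
One low-effort way in, for anyone interested (assuming 'hg' is the
usual Python driver script; adjust the paths to taste): run it under
the stock profiler,

    python -m cProfile -o /tmp/hg.prof /usr/bin/hg status

and then read the profile back sorted by cumulative time; on a built
workspace the walk-related frames should float to the top:

    import pstats

    stats = pstats.Stats('/tmp/hg.prof')
    stats.sort_stats('cumulative').print_stats(20)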

This is the primary reason bug #357 exists.

-- Rich

[1] I talked to the Mercurial folks about fixing this, and showed Matt
    a fairly nasty patch I'd put together to demonstrate what I meant.
    If I recall the end result of that conversation correctly, the
    idea was sound, the patch was awful, and he'd prefer to redesign
    the way Hg walks files in general (or just the API surrounding
    it?) in the course of whatever fix.
