On 11/10/2012 01:43 PM, Felipe Contreras wrote:
On Sat, Nov 10, 2012 at 6:28 PM, Michael J Gruber
<g...@drmicha.warpmail.net>  wrote:
Felipe Contreras venit, vidit, dixit 09.11.2012 15:34:
On Fri, Nov 9, 2012 at 10:28 AM, Michael J Gruber
<g...@drmicha.warpmail.net>  wrote:

Hg seems to store just anything in the author field ("committer"). The
various interfaces that are floating around do some behind-the-back
conversion to git format. The more conversions they do, the better they
seem to work (no erroring out) but I'm wondering whether it's really a
good thing, or whether we should encourage a more diligent approach
which requires a user to map non-conforming author names wilfully.

So you propose that when somebody does 'git clone hg::hg hg-git' the
thing should fail. I hope you don't think it's too unbecoming for me
to say that I disagree.

There is no need to disagree with a proposal I haven't made. I would
disagree with the proposal that I haven't made, too.

All right, we shouldn't encourage a more diligent approach which
requires a user to map author names then.

IMO git fast-import should be the one that converts these bad
authors, not every single tool out there. Maybe throw a warning, but
that's all. Or maybe generate a list of bad authors ready to be filled
out. That way when a project is doing a real conversion, say, when
moving to git, they can run the conversion once and see which authors
are bad and not multiple times, each try taking longer than the last.

As Jeff pointed out, git-fast-import expects output conforming to a
certain standard, and that's not going to change. import is agnostic to
where its import stream is coming from. Only the producer of that stream
can have additional information about the provenience of the stream's
data which may aid (possibly together with user input or choices) in
transforming that into something conforming.

We already know where those import streams come from: Mercurial,
Bazaar, etc. There's absolutely nothing the tools exporting data from
those repositories can do, except try to convert all kinds of weird
names--and many tools do it poorly.

So, the options are:

a) Leave the name conversion to the export tools, and when they miss
some weird corner case, like 'Author<email', let the user face the
consequences, perhaps after an hour of the process.

We know there are sources of data that don't have git-formatted author
names, so we know every tool out there must do this checking.

In addition to that, let the export tool decide what to do when one of
these bad names appears, which in many cases probably means do nothing,
so the user would not even see that such a bad name was there, which
might not be what they want.

b) Do the name conversion in fast-import itself, perhaps optionally,
so if a tool missed some weird corner case, the user does not have to
face the consequences.

The tool writers don't have to worry about this, so we would not have
tools out there doing a half-assed job of this.

And what happens when such bad names turn up would be consistent: a
warning, a scaffold mapping of bad names, etc.


One is bad for the users and the tool writers, with only
disadvantages; the other is good for the users and the tool writers,
with only advantages.


c) Do the name conversion, and whatever other cleanup and manipulations you're interested in, in a filter between the exporter and git-fast-import.
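
For illustration only, a filter of the kind described in (c) might look
roughly like the sketch below. It is not taken from any existing tool.
The names sanitize_ident, filter_stream and main are hypothetical, the
policy for repairing an ident (including emitting an empty <> when no
email is found) is an assumption, and only the exact-byte-count form of
the "data" command is handled; the delimited form is not.

```python
import re
import sys

# Ident lines in the stream: "author|committer|tagger <ident> <when>".
IDENT_RE = re.compile(rb"^(author|committer|tagger) (.*) (\d+ [+-]\d{4})$")


def sanitize_ident(raw):
    """Coerce a free-form ident into '[Name ]<email>' (hypothetical policy)."""
    name, sep, rest = raw.partition(b"<")
    email = rest.split(b">", 1)[0] if sep else b""
    # Drop stray angle brackets and surrounding whitespace.
    name = name.replace(b">", b"").strip()
    email = email.replace(b"<", b"").strip()
    if not name and not email:
        name = b"Unknown"
    prefix = name + b" " if name else b""
    return prefix + b"<" + email + b">"


def filter_stream(src, dst, bad=None):
    """Copy a fast-import stream from src to dst, fixing ident lines.

    Idents that had to be changed are collected in the optional set 'bad'.
    """
    while True:
        line = src.readline()
        if not line:
            break
        count = line[5:].strip()
        if line.startswith(b"data ") and count.isdigit():
            # Exact byte count form: pass the payload through untouched,
            # so file contents that look like ident lines are not rewritten.
            dst.write(line)
            dst.write(src.read(int(count)))
            continue
        m = IDENT_RE.match(line.rstrip(b"\n"))
        if m:
            fixed = sanitize_ident(m.group(2))
            if fixed != m.group(2) and bad is not None:
                bad.add(m.group(2))
            line = m.group(1) + b" " + fixed + b" " + m.group(3) + b"\n"
        dst.write(line)


def main():
    bad = set()
    filter_stream(sys.stdin.buffer, sys.stdout.buffer, bad)
    sys.stdout.buffer.flush()
    for ident in sorted(bad):
        # Scaffold mapping on stderr: original ident, to be filled in by hand.
        sys.stderr.write(ident.decode("utf-8", "replace") + " = \n")
```

With main() called from the usual __main__ guard, the script would sit
in the pipe between the exporter and git fast-import. With this sketch,
an ident such as 'Author<email' comes out as 'Author <email>', and the
list printed on stderr is the kind of scaffold mapping mentioned
earlier in the thread.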