On Tue, Jan 15, 2013 at 12:51:13PM -0800, Junio C Hamano wrote:
> John Keeping <j...@keeping.me.uk> writes:
>> Although 2to3 will fix most issues in Python 2 code to make it run under
>> Python 3, it does not handle the new strict separation between byte
>> strings and unicode strings. There is one instance in
>> git_remote_helpers where we are caught by this, which is when reading
>> refs from "git for-each-ref".
>> While we could fix this by explicitly handling refs as byte strings,
>> this is merely punting the problem to users of the library since the
>> same problem will be encountered as soon as you want to display the ref
>> name to a user.
>> Instead of doing this, explicitly decode the incoming byte string into a
>> unicode string.
> That really feels wrong. Displaying is a separate issue and it is
> the _right_ thing to punt the problem at the lower-level machinery
But the display will require decoding the ref name to a Unicode string,
which depends on the encoding of the underlying ref name, so it feels
like it should be decoded where it's read (see ).
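
As a rough sketch of decoding at the point of reading (the helper name
`decode_refs` and the UTF-8 default are assumptions for illustration, not
the actual patch; git itself does not mandate an encoding for ref names):

```python
def decode_refs(raw, encoding="utf-8"):
    """Turn for-each-ref style output (bytes) into a list of str ref names.

    `encoding` is an assumption: ref names are just bytes as far as git is
    concerned, so the caller has to pick whatever matches the repository.
    """
    refs = []
    for line in raw.splitlines():
        if line:  # skip empty lines
            refs.append(line.decode(encoding))
    return refs


# Hypothetical sample output, one ref name per line.
sample = b"refs/heads/master\nrefs/heads/caf\xc3\xa9\n"
print(decode_refs(sample))
```

Everything above this boundary then deals only with ordinary strings, and a
ref name that is not valid in the chosen encoding fails here with a
UnicodeDecodeError rather than somewhere in the display code.
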
>> Following the lead of pygit2 (the Python bindings for
>> libgit2 - see  and ),...
> I do not think other people getting it wrong is an excuse to
> repeat the same mistake.
> Is it really so cumbersome to handle byte strings as byte strings in
As  says, there is a potential for bugs whenever people attempt to
combine Unicode and byte strings. I think it also violates the
principle of least surprise if a ref name (a string) doesn't behave like
a normal string.
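
To illustrate the kind of surprise meant here, a small Python 3 sketch (the
ref name and the `describe` helper are made up for the example):

```python
ref = b"refs/heads/master"  # a ref name kept as a byte string


def describe(name):
    """Format a message the way calling code might for display."""
    return "Updating %s" % name


# Formatting silently embeds the bytes repr instead of the name itself.
print(describe(ref))

# Comparing against an ordinary string is simply False, with no error.
print(ref == "refs/heads/master")

# Concatenation with an ordinary string raises TypeError.
try:
    "Updating " + ref
    concat_failed = False
except TypeError:
    concat_failed = True
print(concat_failed)
```

The first two cases are the worrying ones, since nothing fails and the
caller just gets the wrong result.
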