On Sun, Jan 13, 2013 at 11:40:45AM -0500, Pete Wyckoff wrote:
> j...@keeping.me.uk wrote on Sun, 13 Jan 2013 00:41 +0000:
>> On Sat, Jan 12, 2013 at 06:43:04PM -0500, Pete Wyckoff wrote:
>> > Can you give me some hints about the byte/unicode string issues
>> > in git-p4.py? There's really only one place that does:
>> > p4 = subprocess.Popen("p4 -G ...")
>> > marshal.load(p4.stdout)
>> > If that's the only issue, this might not be too paniful.
>> The problem is that what gets loaded there is a dictionary (encoded by
>> p4) that maps byte strings to byte strings, so all of the accesses to
>> that dictionary need to either:
>> 1) explicitly call encode() on a string constant
>> or 2) use a byte string constant with a "b" prefix
>> Or we could re-write the dictionary once, which handles the keys... but
>> some of the values are also used as strings and we can't handle that as
>> a one-off conversion since in other places we really do want the byte
>> string (think content of binary files).
>> Basically a thorough audit of all access to variables that come from p4
>> would be needed, with explicit decode()s for authors, dates, etc.
> Your auto-conversion snippet in the follow-up mail would work
> fine for most keys and values. A few perforce docs and some
> playing around convince me that it is mostly utf-8, except for
> file data for particular types.
> I'd still rather handle each command separately, and think about
> the conversions, to do it right in the long run.
I sent that on the assumption that the same key would have similar
semantics wherever its used, but I don't use git-p4 or know much about
It would be interesting to know whether there is any likelihood of p4
gaining a Python 3 output mode (since the documentation currently say
not to use "p4 -G" with Python 3). If it does then I would assume that
it will make a sensible choice about unicode/bytes such that the
existing git-p4 would Just Work with only a small change to the
invocation of p4 to add the new argument.
>> > I hesitated to take Sebastian's changes due to the huge number of
>> > print() lines, but maybe a 2to3 approach would make that aspect
>> > of python3 support not too onerous.
>> I think we'd want to change to print() eventually and having a single
>> codebase for 2 and 3 would be nicer for development, but I think we need
>> to be able to say "no one is using Python 2.5 or earlier" before we can
>> do that and I'm not sure we're there yet. From where we are at the
>> moment I think 2to3 is a good answer, particularly where we're already
>> using distutils to generate a release image.
> Agreed. The 2to3 diff is large but straightforward. But these
> p4 -G interface errors require a lot of thought and work. I'm
> not too eager to work on this yet.
Fair enough. As I don't use git-p4, it's not something I intend to
tackle either (given the scale of the changes involved).
Given the minimal scope of the changes needed for everything else, I
sent this series wondering whether it's sensible to move forward on the
basis of "Python scripts except git-p4 work with Python 3. You must use
Python 2 if you want to use git-p4".
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html