Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread Junio C Hamano
John Keeping  writes:

> On Thu, Jan 17, 2013 at 02:24:37PM -0800, Junio C Hamano wrote:
>> John Keeping  writes:
>> 
>>> You're right - I think we need to add ", errors='replace'" to the call
>>> to encode.
>> 
>> Or if it is used just as an opaque token, you can .encode('hex') or
>> something to punt on the whole issue, no?
>
> Even better.  Are you happy to squash that in (assuming nothing else
> comes up) or shall I resend?

If you go the .encode('hex') route, the log message needs to explain
why the hashed values are now different from the old implementation
and justify why it is safe to do so.  I do not think I want to do
that myself ;-).

Thanks.
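
To make the point about changed hash values concrete, here is a minimal Python 3 sketch. It assumes _digest() is SHA-1 (the 40-character hex string mentioned elsewhere in this thread suggests so, but nothing here confirms it). binascii.hexlify() stands in for .encode('hex'), which only works on Python 2 strings, and the function names and sample path are hypothetical.

```python
import binascii
import hashlib


def digest_raw(path_bytes):
    """Hash the path bytes directly, as the old code effectively did."""
    return hashlib.sha1(path_bytes).hexdigest()


def digest_hexed(path_bytes):
    """Hash the hex-encoded form of the path bytes instead."""
    return hashlib.sha1(binascii.hexlify(path_bytes)).hexdigest()


sample = b'/tmp/repo'  # hypothetical repository path
print(binascii.hexlify(sample))                    # b'2f746d702f7265706f'
print(digest_raw(sample) == digest_hexed(sample))  # False
```

Both digests are equally usable as opaque tokens, but they are different tokens, so anything keyed on the old value would no longer be found.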


>
>  git-remote-testpy.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> index d94a66a..f8dc196 100644
> --- a/git-remote-testpy.py
> +++ b/git-remote-testpy.py
> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>  from git_remote_helpers.git.importer import GitImporter
>  from git_remote_helpers.git.non_local import NonLocalGit
>  
> -if sys.hexversion < 0x01050200:
> -    # os.makedirs() is the limiter
> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> +if sys.hexversion < 0x02000000:
> +    # string.encode() is the limiter
> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>      sys.exit(1)
>  
>  def get_repo(alias, url):
> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>      repo.get_head()
>  
>      hasher = _digest()
> -    hasher.update(repo.path)
> +    hasher.update(repo.path.encode('utf-8'))
>      repo.hash = hasher.hexdigest()
>  
>      repo.get_base_path = lambda base: os.path.join(
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread John Keeping
On Thu, Jan 17, 2013 at 02:24:37PM -0800, Junio C Hamano wrote:
> John Keeping  writes:
> 
>> You're right - I think we need to add ", errors='replace'" to the call
>> to encode.
> 
> Or if it is used just as an opaque token, you can .encode('hex') or
> something to punt on the whole issue, no?

Even better.  Are you happy to squash that in (assuming nothing else
comes up) or shall I resend?

>  git-remote-testpy.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> index d94a66a..f8dc196 100644
> --- a/git-remote-testpy.py
> +++ b/git-remote-testpy.py
> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>  from git_remote_helpers.git.importer import GitImporter
>  from git_remote_helpers.git.non_local import NonLocalGit
>  
> -if sys.hexversion < 0x01050200:
> -    # os.makedirs() is the limiter
> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> +if sys.hexversion < 0x02000000:
> +    # string.encode() is the limiter
> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>      sys.exit(1)
>  
>  def get_repo(alias, url):
> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>      repo.get_head()
>  
>      hasher = _digest()
> -    hasher.update(repo.path)
> +    hasher.update(repo.path.encode('utf-8'))
>      repo.hash = hasher.hexdigest()
>  
>      repo.get_base_path = lambda base: os.path.join(


Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread Junio C Hamano
John Keeping  writes:

> You're right - I think we need to add ", errors='replace'" to the call
> to encode.

Or if it is used just as an opaque token, you can .encode('hex') or
something to punt on the whole issue, no?

>
>> >  git-remote-testpy.py | 8 ++++----
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/git-remote-testpy.py b/git-remote-testpy.py
>> > index d94a66a..f8dc196 100644
>> > --- a/git-remote-testpy.py
>> > +++ b/git-remote-testpy.py
>> > @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>> >  from git_remote_helpers.git.importer import GitImporter
>> >  from git_remote_helpers.git.non_local import NonLocalGit
>> >  
>> > -if sys.hexversion < 0x01050200:
>> > -    # os.makedirs() is the limiter
>> > -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
>> > +if sys.hexversion < 0x02000000:
>> > +    # string.encode() is the limiter
>> > +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>> >      sys.exit(1)
>> >  
>> >  def get_repo(alias, url):
>> > @@ -45,7 +45,7 @@ def get_repo(alias, url):
>> >      repo.get_head()
>> >  
>> >      hasher = _digest()
>> > -    hasher.update(repo.path)
>> > +    hasher.update(repo.path.encode('utf-8'))
>> >      repo.hash = hasher.hexdigest()
>> >  
>> >      repo.get_base_path = lambda base: os.path.join(


Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread John Keeping
On Thu, Jan 17, 2013 at 09:00:48PM +0000, John Keeping wrote:
> On Thu, Jan 17, 2013 at 12:36:33PM -0800, Junio C Hamano wrote:
>> John Keeping  writes:
>> 
>>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>>> unicode string.  Explicitly encode the argument to this method as UTF-8
>>> so that this code works under Python 3.
>>>
>>> This moves the required Python version forward to 2.0.
>>>
>>> Signed-off-by: John Keeping 
>>> ---
>> 
>> Hmph.  So what happens when the path is _not_ encoded in UTF-8?
> 
> Do you mean encodable?  As you say below it will currently throw an
> exception.

Now my brain's not working - we shouldn't get an error converting from a
Unicode string to UTF-8, so I think this patch is OK as it is.

> > Is the repo.hash (and local.hash that gets a copy of it) something
> > that needs to stay the same across multiple invocations of this
> > remote helper, and between the currently shipped Git and the version
> > of Git after applying this patch?
> 
> It's used to specify the path of the repository for importing or
> exporting, so it should stay consistent across invocations.  However,
> this is only an example remote helper so I don't think we should worry
> if it changes from one Git release to the next.


Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread John Keeping
On Thu, Jan 17, 2013 at 12:36:33PM -0800, Junio C Hamano wrote:
> John Keeping  writes:
> 
>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>> unicode string.  Explicitly encode the argument to this method as UTF-8
>> so that this code works under Python 3.
>>
>> This moves the required Python version forward to 2.0.
>>
>> Signed-off-by: John Keeping 
>> ---
> 
> Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Do you mean encodable?  As you say below it will currently throw an
exception.

> Is the repo.hash (and local.hash that gets a copy of it) something
> that needs to stay the same across multiple invocations of this
> remote helper, and between the currently shipped Git and the version
> of Git after applying this patch?

It's used to specify the path of the repository for importing or
exporting, so it should stay consistent across invocations.  However,
this is only an example remote helper so I don't think we should worry
if it changes from one Git release to the next.
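
As a rough illustration of why the value has to be stable across invocations, the sketch below derives a directory name from the digest of the repository path. It assumes SHA-1 for _digest(), and the helper names, the sample paths, and the one-directory-per-digest layout are all hypothetical, not taken from the helper itself.

```python
import hashlib
import os.path


def repo_hash(path):
    """Digest of the repository path; the same path always gives the same value."""
    return hashlib.sha1(path.encode('utf-8')).hexdigest()


def base_path_for(base, path):
    """Hypothetical layout: one directory per repository, named by its digest."""
    return os.path.join(base, repo_hash(path))


# Two runs over the same path land in the same directory.
first_run = base_path_for('.git', '/srv/repos/a')
second_run = base_path_for('.git', '/srv/repos/a')
print(first_run == second_run)  # True
```

If the digest were computed differently in a later release, the same repository would map to a new directory and any state kept under the old one would simply not be found.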

> If that is not the case, and if
> this is used only to get a randomly-looking 40-byte hexadecimal
> string, then a lossy attempt to .encode('utf-8') and falling back to
> replace or ignore bytes in the original that couldn't be interpreted
> as part of a UTF-8 string would be OK, but doesn't .encode('utf-8')
> throw an exception if not told to 'ignore' or something?

You're right - I think we need to add ", errors='replace'" to the call
to encode.
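
For reference, a small Python 3 sketch of what errors='replace' does on the encode side. The one case where encoding a str to UTF-8 is known to fail is a lone surrogate, which Python 3 can produce when it decodes file names containing undecodable bytes. The function names and sample paths are hypothetical.

```python
def encode_strict(path):
    """Encode with the default error handling ('strict')."""
    return path.encode('utf-8')


def encode_lossy(path):
    """Encode, substituting '?' for anything that cannot be encoded."""
    return path.encode('utf-8', errors='replace')


odd_path = 'dir/\udcff'  # contains a lone surrogate

try:
    encode_strict(odd_path)
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(strict_ok)               # False
print(encode_lossy(odd_path))  # b'dir/?'

# The replacement is lossy: distinct paths can encode to the same bytes.
print(encode_lossy('dir/\udcfe') == encode_lossy(odd_path))  # True
```

The last line is the trade-off: with 'replace', two different paths may end up with the same hash.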

> >  git-remote-testpy.py | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> > index d94a66a..f8dc196 100644
> > --- a/git-remote-testpy.py
> > +++ b/git-remote-testpy.py
> > @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
> >  from git_remote_helpers.git.importer import GitImporter
> >  from git_remote_helpers.git.non_local import NonLocalGit
> >  
> > -if sys.hexversion < 0x01050200:
> > -    # os.makedirs() is the limiter
> > -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> > +if sys.hexversion < 0x02000000:
> > +    # string.encode() is the limiter
> > +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
> >      sys.exit(1)
> >  
> >  def get_repo(alias, url):
> > @@ -45,7 +45,7 @@ def get_repo(alias, url):
> >      repo.get_head()
> >  
> >      hasher = _digest()
> > -    hasher.update(repo.path)
> > +    hasher.update(repo.path.encode('utf-8'))
> >      repo.hash = hasher.hexdigest()
> >  
> >      repo.get_base_path = lambda base: os.path.join(


Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread Junio C Hamano
Junio C Hamano  writes:

> John Keeping  writes:
>
>> Under Python 3 'hasher.update(...)' must take a byte string and not a
>> unicode string.  Explicitly encode the argument to this method as UTF-8
>> so that this code works under Python 3.
>>
>> This moves the required Python version forward to 2.0.
>>
>> Signed-off-by: John Keeping 
>> ---
>
> Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Oh, my brain was not working. Forget this part, and sorry for the
noise.  We are not decoding a bytestring to an array of unicode
characters, but going the other way around here.
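
The distinction between the two directions can be sketched in Python 3 as follows. Decoding bytes that are not valid UTF-8 fails, while encoding an ordinary Unicode string to UTF-8 succeeds. The helper name and sample text are hypothetical.

```python
def decodes_as_utf8(raw):
    """Return True if the byte string is valid UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


text = 'caf\u00e9'                  # a Unicode string
as_utf8 = text.encode('utf-8')      # b'caf\xc3\xa9'
as_latin1 = text.encode('latin-1')  # b'caf\xe9'

print(decodes_as_utf8(as_utf8))    # True
print(decodes_as_utf8(as_latin1))  # False
```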




Re: [PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread Junio C Hamano
John Keeping  writes:

> Under Python 3 'hasher.update(...)' must take a byte string and not a
> unicode string.  Explicitly encode the argument to this method as UTF-8
> so that this code works under Python 3.
>
> This moves the required Python version forward to 2.0.
>
> Signed-off-by: John Keeping 
> ---

Hmph.  So what happens when the path is _not_ encoded in UTF-8?

Is the repo.hash (and local.hash that gets a copy of it) something
that needs to stay the same across multiple invocations of this
remote helper, and between the currently shipped Git and the version
of Git after applying this patch?  If that is not the case, and if
this is used only to get a randomly-looking 40-byte hexadecimal
string, then a lossy attempt to .encode('utf-8') and falling back to
replace or ignore bytes in the original that couldn't be interpreted
as part of a UTF-8 string would be OK, but doesn't .encode('utf-8')
throw an exception if not told to 'ignore' or something?

>  git-remote-testpy.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/git-remote-testpy.py b/git-remote-testpy.py
> index d94a66a..f8dc196 100644
> --- a/git-remote-testpy.py
> +++ b/git-remote-testpy.py
> @@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
>  from git_remote_helpers.git.importer import GitImporter
>  from git_remote_helpers.git.non_local import NonLocalGit
>  
> -if sys.hexversion < 0x01050200:
> -    # os.makedirs() is the limiter
> -    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
> +if sys.hexversion < 0x02000000:
> +    # string.encode() is the limiter
> +    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
>      sys.exit(1)
>  
>  def get_repo(alias, url):
> @@ -45,7 +45,7 @@ def get_repo(alias, url):
>      repo.get_head()
>  
>      hasher = _digest()
> -    hasher.update(repo.path)
> +    hasher.update(repo.path.encode('utf-8'))
>      repo.hash = hasher.hexdigest()
>  
>      repo.get_base_path = lambda base: os.path.join(


[PATCH v2 6/8] git-remote-testpy: hash bytes explicitly

2013-01-17 Thread John Keeping
Under Python 3 'hasher.update(...)' must take a byte string and not a
unicode string.  Explicitly encode the argument to this method as UTF-8
so that this code works under Python 3.

This moves the required Python version forward to 2.0.

Signed-off-by: John Keeping 
---
 git-remote-testpy.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-remote-testpy.py b/git-remote-testpy.py
index d94a66a..f8dc196 100644
--- a/git-remote-testpy.py
+++ b/git-remote-testpy.py
@@ -31,9 +31,9 @@ from git_remote_helpers.git.exporter import GitExporter
 from git_remote_helpers.git.importer import GitImporter
 from git_remote_helpers.git.non_local import NonLocalGit
 
-if sys.hexversion < 0x01050200:
-    # os.makedirs() is the limiter
-    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
+if sys.hexversion < 0x02000000:
+    # string.encode() is the limiter
+    sys.stderr.write("git-remote-testgit: requires Python 2.0 or later.\n")
     sys.exit(1)
 
 def get_repo(alias, url):
@@ -45,7 +45,7 @@ def get_repo(alias, url):
     repo.get_head()
 
     hasher = _digest()
-    hasher.update(repo.path)
+    hasher.update(repo.path.encode('utf-8'))
     repo.hash = hasher.hexdigest()
 
     repo.get_base_path = lambda base: os.path.join(
-- 
1.8.1.1.260.g99b33f4.dirty
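
For readers who want to see the behaviour the commit message describes, here is a minimal Python 3 sketch. It assumes _digest() is SHA-1 via hashlib, which the patch itself does not show. path_digest is a hypothetical name, not a function from git-remote-testpy.py.

```python
import hashlib


def path_digest(path):
    """Return the SHA-1 hex digest of a path given as str or bytes."""
    if isinstance(path, str):
        # hashlib only accepts bytes under Python 3, so encode first.
        path = path.encode('utf-8')
    hasher = hashlib.sha1()
    hasher.update(path)
    return hasher.hexdigest()


# Passing a str straight to update() is rejected under Python 3.
try:
    hashlib.sha1().update('abc')
    accepted_str = True
except TypeError:
    accepted_str = False

print(accepted_str)        # False
print(path_digest('abc'))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

For ASCII-only paths the encoded bytes match the characters one-for-one, so the digest is the same as hashing the equivalent byte string.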
