Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage
On 10/21/19 2:16 AM, Michał Górny wrote: > On Mon, 2019-10-21 at 02:10 -0700, Zac Medico wrote: >> On 10/21/19 1:43 AM, Michał Górny wrote: >>> Use real os.walk() when getting filenames for FlatLayout. Unlike >>> the wrapped Portage module, it return str output for str path parameter, >>> so we don't have to recode it back and forth. >>> >>> Signed-off-by: Michał Górny >>> --- >>> lib/portage/package/ebuild/fetch.py | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/lib/portage/package/ebuild/fetch.py >>> b/lib/portage/package/ebuild/fetch.py >>> index cedf12b19..be277f1a3 100644 >>> --- a/lib/portage/package/ebuild/fetch.py >>> +++ b/lib/portage/package/ebuild/fetch.py >>> @@ -11,6 +11,7 @@ import io >>> import itertools >>> import json >>> import logging >>> +import os as real_os >>> import random >>> import re >>> import stat >>> @@ -270,7 +271,7 @@ class FlatLayout(object): >>> return filename >>> >>> def get_filenames(self, distdir): >>> - for dirpath, dirnames, filenames in os.walk(distdir, >>> + for dirpath, dirnames, filenames in real_os.walk(distdir, >>> onerror=_raise_exc): >>> return iter(filenames) >>> >>> >> >> The real_os.walk will trigger UnicodeEncodeError if distdir can't be >> encoded with sys.getfilesystemencoding(). It's an edge case, but >> generally I prefer to handle it. >> >> We can continue to use portage.os for the os.walk call, and turn >> get_filenames into a generator method like this: >> >> for filename in filenames: >> try: >> yield portage._unicode_decode(filename, errors='strict') >> except UnicodeDecodeError: >> # Ignore it. Distfiles names must have valid UTF8 encoding. >> pass > > Since you've already written it, could you commit it? I don't wish to > have my name on the implicit module overrides hackery I don't approve > of. Done: https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9855418352398013ae787bb73f70e935ec109ca I don't really like the portage.os unicode wrapper either, but I'm not aware of a good alternative to solve the pervasive UnicodeEncodeError issue that I've mentioned. -- Thanks, Zac signature.asc Description: OpenPGP digital signature
Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage
On Mon, 2019-10-21 at 02:10 -0700, Zac Medico wrote: > On 10/21/19 1:43 AM, Michał Górny wrote: > > Use real os.walk() when getting filenames for FlatLayout. Unlike > > the wrapped Portage module, it return str output for str path parameter, > > so we don't have to recode it back and forth. > > > > Signed-off-by: Michał Górny > > --- > > lib/portage/package/ebuild/fetch.py | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/lib/portage/package/ebuild/fetch.py > > b/lib/portage/package/ebuild/fetch.py > > index cedf12b19..be277f1a3 100644 > > --- a/lib/portage/package/ebuild/fetch.py > > +++ b/lib/portage/package/ebuild/fetch.py > > @@ -11,6 +11,7 @@ import io > > import itertools > > import json > > import logging > > +import os as real_os > > import random > > import re > > import stat > > @@ -270,7 +271,7 @@ class FlatLayout(object): > > return filename > > > > def get_filenames(self, distdir): > > - for dirpath, dirnames, filenames in os.walk(distdir, > > + for dirpath, dirnames, filenames in real_os.walk(distdir, > > onerror=_raise_exc): > > return iter(filenames) > > > > > > The real_os.walk will trigger UnicodeEncodeError if distdir can't be > encoded with sys.getfilesystemencoding(). It's an edge case, but > generally I prefer to handle it. > > We can continue to use portage.os for the os.walk call, and turn > get_filenames into a generator method like this: > > for filename in filenames: > try: > yield portage._unicode_decode(filename, errors='strict') > except UnicodeDecodeError: > # Ignore it. Distfiles names must have valid UTF8 encoding. > pass Since you've already written it, could you commit it? I don't wish to have my name on the implicit module overrides hackery I don't approve of. -- Best regards, Michał Górny signature.asc Description: This is a digitally signed message part
Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage
On 10/21/19 1:43 AM, Michał Górny wrote: > Use real os.walk() when getting filenames for FlatLayout. Unlike > the wrapped Portage module, it return str output for str path parameter, > so we don't have to recode it back and forth. > > Signed-off-by: Michał Górny > --- > lib/portage/package/ebuild/fetch.py | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/lib/portage/package/ebuild/fetch.py > b/lib/portage/package/ebuild/fetch.py > index cedf12b19..be277f1a3 100644 > --- a/lib/portage/package/ebuild/fetch.py > +++ b/lib/portage/package/ebuild/fetch.py > @@ -11,6 +11,7 @@ import io > import itertools > import json > import logging > +import os as real_os > import random > import re > import stat > @@ -270,7 +271,7 @@ class FlatLayout(object): > return filename > > def get_filenames(self, distdir): > - for dirpath, dirnames, filenames in os.walk(distdir, > + for dirpath, dirnames, filenames in real_os.walk(distdir, > onerror=_raise_exc): > return iter(filenames) > > The real_os.walk will trigger UnicodeEncodeError if distdir can't be encoded with sys.getfilesystemencoding(). It's an edge case, but generally I prefer to handle it. We can continue to use portage.os for the os.walk call, and turn get_filenames into a generator method like this: for filename in filenames: try: yield portage._unicode_decode(filename, errors='strict') except UnicodeDecodeError: # Ignore it. Distfiles names must have valid UTF8 encoding. pass -- Thanks, Zac signature.asc Description: OpenPGP digital signature
[gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage
Use real os.walk() when getting filenames for FlatLayout. Unlike the wrapped Portage module, it return str output for str path parameter, so we don't have to recode it back and forth. Signed-off-by: Michał Górny --- lib/portage/package/ebuild/fetch.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/portage/package/ebuild/fetch.py b/lib/portage/package/ebuild/fetch.py index cedf12b19..be277f1a3 100644 --- a/lib/portage/package/ebuild/fetch.py +++ b/lib/portage/package/ebuild/fetch.py @@ -11,6 +11,7 @@ import io import itertools import json import logging +import os as real_os import random import re import stat @@ -270,7 +271,7 @@ class FlatLayout(object): return filename def get_filenames(self, distdir): - for dirpath, dirnames, filenames in os.walk(distdir, + for dirpath, dirnames, filenames in real_os.walk(distdir, onerror=_raise_exc): return iter(filenames) -- 2.23.0