Re: [gentoo-portage-dev] [PATCH] make.globals: Change FETCHCOMMAND_RSYNC to --copy-links

2019-10-21 Thread Zac Medico
On 10/21/19 8:06 AM, Michał Górny wrote:
> Change FETCHCOMMAND_RSYNC to use '-Lt' over '-a'.  Notably, this
> replaces --links with --copy-links option, i.e. makes rsync copy
> underlying files when symlinks are met.  This is important since
> we do not transfer symlink targets, therefore '-l' ends up creating
> dangling symlinks.
> 
> This also removes most of the other options that are irrelevant or even
> undesirable to distfile fetching, that is:
> 
> - '-r' since we always fetch a single file, so recursive operation is
>   unnecessary
> - '-p', '-o', '-g' since we want to apply our permissions and ownership
>   for distfiles rather than copying the one from mirrors,
> - '-D' since we do not expect any devices or specials in distfiles.
> 
> Copying timestamps is preserved in case it's helpful in determining
> whether files need to be refetched.
> 
> Bug: https://bugs.gentoo.org/698046
> Signed-off-by: Michał Górny 
> ---
>  cnf/make.globals | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/cnf/make.globals b/cnf/make.globals
> index 9eeb7a01e..50511e812 100644
> --- a/cnf/make.globals
> +++ b/cnf/make.globals
> @@ -38,8 +38,8 @@ PORTAGE_TMPDIR="/var/tmp"
>  FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" 
> \"\${URI}\""
>  RESUMECOMMAND="wget -c -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" 
> \"\${URI}\""
>  
> -FETCHCOMMAND_RSYNC="rsync -avP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
> -RESUMECOMMAND_RSYNC="rsync -avP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
> +FETCHCOMMAND_RSYNC="rsync -LtvP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
> +RESUMECOMMAND_RSYNC="rsync -LtvP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
>  
>  # NOTE: rsync will evaluate quotes embedded inside PORTAGE_SSH_OPTS
>  FETCHCOMMAND_SSH="bash -c \"x=\\\${2#ssh://} ; host=\\\${x%%/*} ; 
> port=\\\${host##*:} ; host=\\\${host%:*} ; [[ \\\${host} = \\\${port} ]] && 
> port= ; exec rsync --rsh=\\\"ssh \\\${port:+-p\\\${port}} \\\${3}\\\" -avP 
> \\\"\\\${host}:/\\\${x#*/}\\\" \\\"\\\$1\\\"\" rsync \"\${DISTDIR}/\${FILE}\" 
> \"\${URI}\" \"\${PORTAGE_SSH_OPTS}\""
> 

Looks good. Please merge.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-portage-dev] [PATCH] make.globals: Change FETCHCOMMAND_RSYNC to --copy-links

2019-10-21 Thread Michał Górny
On Mon, 2019-10-21 at 17:25 +0200, Ulrich Mueller wrote:
> > > > > > On Mon, 21 Oct 2019, Michał Górny wrote:
> > This also removes most of the other options that are irrelevant or even
> > undesirable to distfile fetching, that is:
> > - '-r' since we always fetch a single file, so recursive operation is
> >   unnecessary
> > - '-p', '-o', '-g' since we want to apply our permissions and ownership
> >   for distfiles rather than copying the one from mirrors,
> > - '-D' since we do not expect any devices or specials in distfiles.
> 
> That certainly makes sense, but I don't see it in the actual patch?

Those were the options implied by '-a'.

-- 
Best regards,
Michał Górny



signature.asc
Description: This is a digitally signed message part


Re: [gentoo-portage-dev] [PATCH] make.globals: Change FETCHCOMMAND_RSYNC to --copy-links

2019-10-21 Thread Ulrich Mueller
> On Mon, 21 Oct 2019, Michał Górny wrote:

> This also removes most of the other options that are irrelevant or even
> undesirable to distfile fetching, that is:

> - '-r' since we always fetch a single file, so recursive operation is
>   unnecessary
> - '-p', '-o', '-g' since we want to apply our permissions and ownership
>   for distfiles rather than copying the one from mirrors,
> - '-D' since we do not expect any devices or specials in distfiles.

That certainly makes sense, but I don't see it in the actual patch?


signature.asc
Description: PGP signature


[gentoo-portage-dev] [PATCH] make.globals: Change FETCHCOMMAND_RSYNC to --copy-links

2019-10-21 Thread Michał Górny
Change FETCHCOMMAND_RSYNC to use '-Lt' over '-a'.  Notably, this
replaces --links with --copy-links option, i.e. makes rsync copy
underlying files when symlinks are met.  This is important since
we do not transfer symlink targets, therefore '-l' ends up creating
dangling symlinks.

This also removes most of the other options that are irrelevant or even
undesirable to distfile fetching, that is:

- '-r' since we always fetch a single file, so recursive operation is
  unnecessary
- '-p', '-o', '-g' since we want to apply our permissions and ownership
  for distfiles rather than copying the one from mirrors,
- '-D' since we do not expect any devices or specials in distfiles.

Copying timestamps is preserved in case it's helpful in determining
whether files need to be refetched.

Bug: https://bugs.gentoo.org/698046
Signed-off-by: Michał Górny 
---
 cnf/make.globals | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cnf/make.globals b/cnf/make.globals
index 9eeb7a01e..50511e812 100644
--- a/cnf/make.globals
+++ b/cnf/make.globals
@@ -38,8 +38,8 @@ PORTAGE_TMPDIR="/var/tmp"
 FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" 
\"\${URI}\""
 RESUMECOMMAND="wget -c -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" 
\"\${URI}\""
 
-FETCHCOMMAND_RSYNC="rsync -avP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
-RESUMECOMMAND_RSYNC="rsync -avP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
+FETCHCOMMAND_RSYNC="rsync -LtvP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
+RESUMECOMMAND_RSYNC="rsync -LtvP \"\${URI}\" \"\${DISTDIR}/\${FILE}\""
 
 # NOTE: rsync will evaluate quotes embedded inside PORTAGE_SSH_OPTS
 FETCHCOMMAND_SSH="bash -c \"x=\\\${2#ssh://} ; host=\\\${x%%/*} ; 
port=\\\${host##*:} ; host=\\\${host%:*} ; [[ \\\${host} = \\\${port} ]] && 
port= ; exec rsync --rsh=\\\"ssh \\\${port:+-p\\\${port}} \\\${3}\\\" -avP 
\\\"\\\${host}:/\\\${x#*/}\\\" \\\"\\\$1\\\"\" rsync \"\${DISTDIR}/\${FILE}\" 
\"\${URI}\" \"\${PORTAGE_SSH_OPTS}\""
-- 
2.23.0




Re: [gentoo-portage-dev] [PATCH 2/2] emirrordist: Pass path from DeletionIterator to DeletionTask

2019-10-21 Thread Zac Medico
On 10/21/19 1:43 AM, Michał Górny wrote:
> Since DeletionIterator needs to stat the distfile and therefore find
> one working path for it, pass it to DeletionTask instead of recomputing
> it there.  This also fixes wrongly assuming that first layout will
> always be correct.
> 
> Bug: https://bugs.gentoo.org/697890
> Signed-off-by: Michał Górny 
> ---
>  lib/portage/_emirrordist/DeletionIterator.py |  2 ++
>  lib/portage/_emirrordist/DeletionTask.py | 14 +-
>  2 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
> b/lib/portage/_emirrordist/DeletionIterator.py
> index 5c193911a..3cbff2c3a 100644
> --- a/lib/portage/_emirrordist/DeletionIterator.py
> +++ b/lib/portage/_emirrordist/DeletionIterator.py
> @@ -72,6 +72,7 @@ class DeletionIterator(object):
>  
>   yield DeletionTask(background=True,
>   distfile=filename,
> + distfile_path=path,
>   config=self._config)
>  
>   else:
> @@ -85,6 +86,7 @@ class DeletionIterator(object):
>  
>   yield 
> DeletionTask(background=True,
>   distfile=filename,
> + distfile_path=path,
>   config=self._config)
>  
>   if deletion_db is not None:
> diff --git a/lib/portage/_emirrordist/DeletionTask.py 
> b/lib/portage/_emirrordist/DeletionTask.py
> index a4bb29419..4e9c26ca2 100644
> --- a/lib/portage/_emirrordist/DeletionTask.py
> +++ b/lib/portage/_emirrordist/DeletionTask.py
> @@ -10,14 +10,9 @@ from _emerge.CompositeTask import CompositeTask
>  
>  class DeletionTask(CompositeTask):
>  
> - __slots__ = ('distfile', 'config')
> + __slots__ = ('distfile', 'distfile_path', 'config')
>  
>   def _start(self):
> -
> - distfile_path = os.path.join(
> - self.config.options.distfiles,
> - self.config.layouts[0].get_path(self.distfile))
> -
>   if self.config.options.recycle_dir is not None:
>   recycle_path = os.path.join(
>   self.config.options.recycle_dir, self.distfile)
> @@ -29,7 +24,8 @@ class DeletionTask(CompositeTask):
>   "distfiles to recycle") % self.distfile)
>   try:
>   # note: distfile_path can be a symlink 
> here
> - 
> os.rename(os.path.realpath(distfile_path), recycle_path)
> + 
> os.rename(os.path.realpath(self.distfile_path),
> + recycle_path)
>   except OSError as e:
>   if e.errno != errno.EXDEV:
>   logging.error(("rename %s from 
> distfiles to "
> @@ -40,7 +36,7 @@ class DeletionTask(CompositeTask):
>   return
>  
>   self._start_task(
> - FileCopier(src_path=distfile_path,
> + FileCopier(src_path=self.distfile_path,
>   dest_path=recycle_path,
>   background=False),
>   self._recycle_copier_exit)
> @@ -55,7 +51,7 @@ class DeletionTask(CompositeTask):
>   logging.debug(("delete '%s' from "
>   "distfiles") % self.distfile)
>   try:
> - os.unlink(distfile_path)
> + os.unlink(self.distfile_path)
>   except OSError as e:
>   if e.errno not in (errno.ENOENT, errno.ESTALE):
>   logging.error("%s unlink failed in 
> distfiles: %s" %
> 

Looks good. Please merge.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage

2019-10-21 Thread Zac Medico
On 10/21/19 2:16 AM, Michał Górny wrote:
> On Mon, 2019-10-21 at 02:10 -0700, Zac Medico wrote:
>> On 10/21/19 1:43 AM, Michał Górny wrote:
>>> Use real os.walk() when getting filenames for FlatLayout.  Unlike
>>> the wrapped Portage module, it return str output for str path parameter,
>>> so we don't have to recode it back and forth.
>>>
>>> Signed-off-by: Michał Górny 
>>> ---
>>>  lib/portage/package/ebuild/fetch.py | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/portage/package/ebuild/fetch.py 
>>> b/lib/portage/package/ebuild/fetch.py
>>> index cedf12b19..be277f1a3 100644
>>> --- a/lib/portage/package/ebuild/fetch.py
>>> +++ b/lib/portage/package/ebuild/fetch.py
>>> @@ -11,6 +11,7 @@ import io
>>>  import itertools
>>>  import json
>>>  import logging
>>> +import os as real_os
>>>  import random
>>>  import re
>>>  import stat
>>> @@ -270,7 +271,7 @@ class FlatLayout(object):
>>> return filename
>>>  
>>> def get_filenames(self, distdir):
>>> -   for dirpath, dirnames, filenames in os.walk(distdir,
>>> +   for dirpath, dirnames, filenames in real_os.walk(distdir,
>>> onerror=_raise_exc):
>>> return iter(filenames)
>>>  
>>>
>>
>> The real_os.walk will trigger UnicodeEncodeError if distdir can't be
>> encoded with sys.getfilesystemencoding(). It's an edge case, but
>> generally I prefer to handle it.
>>
>> We can continue to use portage.os for the os.walk call, and turn
>> get_filenames into a generator method like this:
>>
>> for filename in filenames:
>> try:
>> yield portage._unicode_decode(filename, errors='strict')
>> except UnicodeDecodeError:
>> # Ignore it. Distfiles names must have valid UTF8 encoding.
>> pass
> 
> Since you've already written it, could you commit it?  I don't wish to
> have my name on the implicit module overrides hackery I don't approve
> of.

Done:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9855418352398013ae787bb73f70e935ec109ca

I don't really like the portage.os unicode wrapper either, but I'm not
aware of a good alternative to solve the pervasive UnicodeEncodeError
issue that I've mentioned.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage

2019-10-21 Thread Michał Górny
On Mon, 2019-10-21 at 02:10 -0700, Zac Medico wrote:
> On 10/21/19 1:43 AM, Michał Górny wrote:
> > Use real os.walk() when getting filenames for FlatLayout.  Unlike
> > the wrapped Portage module, it return str output for str path parameter,
> > so we don't have to recode it back and forth.
> > 
> > Signed-off-by: Michał Górny 
> > ---
> >  lib/portage/package/ebuild/fetch.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/portage/package/ebuild/fetch.py 
> > b/lib/portage/package/ebuild/fetch.py
> > index cedf12b19..be277f1a3 100644
> > --- a/lib/portage/package/ebuild/fetch.py
> > +++ b/lib/portage/package/ebuild/fetch.py
> > @@ -11,6 +11,7 @@ import io
> >  import itertools
> >  import json
> >  import logging
> > +import os as real_os
> >  import random
> >  import re
> >  import stat
> > @@ -270,7 +271,7 @@ class FlatLayout(object):
> > return filename
> >  
> > def get_filenames(self, distdir):
> > -   for dirpath, dirnames, filenames in os.walk(distdir,
> > +   for dirpath, dirnames, filenames in real_os.walk(distdir,
> > onerror=_raise_exc):
> > return iter(filenames)
> >  
> > 
> 
> The real_os.walk will trigger UnicodeEncodeError if distdir can't be
> encoded with sys.getfilesystemencoding(). It's an edge case, but
> generally I prefer to handle it.
> 
> We can continue to use portage.os for the os.walk call, and turn
> get_filenames into a generator method like this:
> 
> for filename in filenames:
> try:
> yield portage._unicode_decode(filename, errors='strict')
> except UnicodeDecodeError:
> # Ignore it. Distfiles names must have valid UTF8 encoding.
> pass

Since you've already written it, could you commit it?  I don't wish to
have my name on the implicit module overrides hackery I don't approve
of.

-- 
Best regards,
Michał Górny



signature.asc
Description: This is a digitally signed message part


Re: [gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage

2019-10-21 Thread Zac Medico
On 10/21/19 1:43 AM, Michał Górny wrote:
> Use real os.walk() when getting filenames for FlatLayout.  Unlike
> the wrapped Portage module, it return str output for str path parameter,
> so we don't have to recode it back and forth.
> 
> Signed-off-by: Michał Górny 
> ---
>  lib/portage/package/ebuild/fetch.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/portage/package/ebuild/fetch.py 
> b/lib/portage/package/ebuild/fetch.py
> index cedf12b19..be277f1a3 100644
> --- a/lib/portage/package/ebuild/fetch.py
> +++ b/lib/portage/package/ebuild/fetch.py
> @@ -11,6 +11,7 @@ import io
>  import itertools
>  import json
>  import logging
> +import os as real_os
>  import random
>  import re
>  import stat
> @@ -270,7 +271,7 @@ class FlatLayout(object):
>   return filename
>  
>   def get_filenames(self, distdir):
> - for dirpath, dirnames, filenames in os.walk(distdir,
> + for dirpath, dirnames, filenames in real_os.walk(distdir,
>   onerror=_raise_exc):
>   return iter(filenames)
>  
> 

The real_os.walk will trigger UnicodeEncodeError if distdir can't be
encoded with sys.getfilesystemencoding(). It's an edge case, but
generally I prefer to handle it.

We can continue to use portage.os for the os.walk call, and turn
get_filenames into a generator method like this:

for filename in filenames:
try:
yield portage._unicode_decode(filename, errors='strict')
except UnicodeDecodeError:
# Ignore it. Distfiles names must have valid UTF8 encoding.
pass
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


[gentoo-portage-dev] [PATCH 2/2] emirrordist: Pass path from DeletionIterator to DeletionTask

2019-10-21 Thread Michał Górny
Since DeletionIterator needs to stat the distfile and therefore find
one working path for it, pass it to DeletionTask instead of recomputing
it there.  This also fixes wrongly assuming that first layout will
always be correct.

Bug: https://bugs.gentoo.org/697890
Signed-off-by: Michał Górny 
---
 lib/portage/_emirrordist/DeletionIterator.py |  2 ++
 lib/portage/_emirrordist/DeletionTask.py | 14 +-
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
b/lib/portage/_emirrordist/DeletionIterator.py
index 5c193911a..3cbff2c3a 100644
--- a/lib/portage/_emirrordist/DeletionIterator.py
+++ b/lib/portage/_emirrordist/DeletionIterator.py
@@ -72,6 +72,7 @@ class DeletionIterator(object):
 
yield DeletionTask(background=True,
distfile=filename,
+   distfile_path=path,
config=self._config)
 
else:
@@ -85,6 +86,7 @@ class DeletionIterator(object):
 
yield 
DeletionTask(background=True,
distfile=filename,
+   distfile_path=path,
config=self._config)
 
if deletion_db is not None:
diff --git a/lib/portage/_emirrordist/DeletionTask.py 
b/lib/portage/_emirrordist/DeletionTask.py
index a4bb29419..4e9c26ca2 100644
--- a/lib/portage/_emirrordist/DeletionTask.py
+++ b/lib/portage/_emirrordist/DeletionTask.py
@@ -10,14 +10,9 @@ from _emerge.CompositeTask import CompositeTask
 
 class DeletionTask(CompositeTask):
 
-   __slots__ = ('distfile', 'config')
+   __slots__ = ('distfile', 'distfile_path', 'config')
 
def _start(self):
-
-   distfile_path = os.path.join(
-   self.config.options.distfiles,
-   self.config.layouts[0].get_path(self.distfile))
-
if self.config.options.recycle_dir is not None:
recycle_path = os.path.join(
self.config.options.recycle_dir, self.distfile)
@@ -29,7 +24,8 @@ class DeletionTask(CompositeTask):
"distfiles to recycle") % self.distfile)
try:
# note: distfile_path can be a symlink 
here
-   
os.rename(os.path.realpath(distfile_path), recycle_path)
+   
os.rename(os.path.realpath(self.distfile_path),
+   recycle_path)
except OSError as e:
if e.errno != errno.EXDEV:
logging.error(("rename %s from 
distfiles to "
@@ -40,7 +36,7 @@ class DeletionTask(CompositeTask):
return
 
self._start_task(
-   FileCopier(src_path=distfile_path,
+   FileCopier(src_path=self.distfile_path,
dest_path=recycle_path,
background=False),
self._recycle_copier_exit)
@@ -55,7 +51,7 @@ class DeletionTask(CompositeTask):
logging.debug(("delete '%s' from "
"distfiles") % self.distfile)
try:
-   os.unlink(distfile_path)
+   os.unlink(self.distfile_path)
except OSError as e:
if e.errno not in (errno.ENOENT, errno.ESTALE):
logging.error("%s unlink failed in 
distfiles: %s" %
-- 
2.23.0




[gentoo-portage-dev] [PATCH 1/2] fetch: Use real os.walk() to avoid unicode issues with Portage

2019-10-21 Thread Michał Górny
Use real os.walk() when getting filenames for FlatLayout.  Unlike
the wrapped Portage module, it return str output for str path parameter,
so we don't have to recode it back and forth.

Signed-off-by: Michał Górny 
---
 lib/portage/package/ebuild/fetch.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/portage/package/ebuild/fetch.py 
b/lib/portage/package/ebuild/fetch.py
index cedf12b19..be277f1a3 100644
--- a/lib/portage/package/ebuild/fetch.py
+++ b/lib/portage/package/ebuild/fetch.py
@@ -11,6 +11,7 @@ import io
 import itertools
 import json
 import logging
+import os as real_os
 import random
 import re
 import stat
@@ -270,7 +271,7 @@ class FlatLayout(object):
return filename
 
def get_filenames(self, distdir):
-   for dirpath, dirnames, filenames in os.walk(distdir,
+   for dirpath, dirnames, filenames in real_os.walk(distdir,
onerror=_raise_exc):
return iter(filenames)
 
-- 
2.23.0




Re: [gentoo-portage-dev] [PATCH v2] emirrordist: Clean dangling symlinks up

2019-10-21 Thread Zac Medico
On 10/21/19 1:02 AM, Michał Górny wrote:
> Bug: https://bugs.gentoo.org/697906
> Signed-off-by: Michał Górny 
> ---
>  lib/portage/_emirrordist/DeletionIterator.py | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
> b/lib/portage/_emirrordist/DeletionIterator.py
> index dab6eaea2..5c193911a 100644
> --- a/lib/portage/_emirrordist/DeletionIterator.py
> +++ b/lib/portage/_emirrordist/DeletionIterator.py
> @@ -27,11 +27,16 @@ class DeletionIterator(object):
>   # require at least one successful stat()
>   exceptions = []
>   for layout in reversed(self._config.layouts):
> + path = os.path.join(distdir, 
> layout.get_path(filename))
>   try:
> - st = os.stat(
> - os.path.join(distdir, 
> layout.get_path(filename)))
> + st = os.stat(path)
>   except OSError as e:
> - exceptions.append(e)
> + # is it a dangling symlink?
> + try:
> + if os.path.islink(path):
> + os.unlink(path)
> + except OSError as e:
> + exceptions.append(e)
>   else:
>   if stat.S_ISREG(st.st_mode):
>   break
> 

Looks good. Please merge.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


[gentoo-portage-dev] [PATCH v2] emirrordist: Clean dangling symlinks up

2019-10-21 Thread Michał Górny
Bug: https://bugs.gentoo.org/697906
Signed-off-by: Michał Górny 
---
 lib/portage/_emirrordist/DeletionIterator.py | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
b/lib/portage/_emirrordist/DeletionIterator.py
index dab6eaea2..5c193911a 100644
--- a/lib/portage/_emirrordist/DeletionIterator.py
+++ b/lib/portage/_emirrordist/DeletionIterator.py
@@ -27,11 +27,16 @@ class DeletionIterator(object):
# require at least one successful stat()
exceptions = []
for layout in reversed(self._config.layouts):
+   path = os.path.join(distdir, 
layout.get_path(filename))
try:
-   st = os.stat(
-   os.path.join(distdir, 
layout.get_path(filename)))
+   st = os.stat(path)
except OSError as e:
-   exceptions.append(e)
+   # is it a dangling symlink?
+   try:
+   if os.path.islink(path):
+   os.unlink(path)
+   except OSError as e:
+   exceptions.append(e)
else:
if stat.S_ISREG(st.st_mode):
break
-- 
2.23.0




Re: [gentoo-portage-dev] [PATCH] emirrordist: Clean dangling symlinks up

2019-10-21 Thread Michał Górny
On Mon, 2019-10-21 at 00:44 -0700, Zac Medico wrote:
> On 10/20/19 3:23 AM, Michał Górny wrote:
> > Bug: https://bugs.gentoo.org/697906
> > Signed-off-by: Michał Górny 
> > ---
> >  lib/portage/_emirrordist/DeletionIterator.py | 21 +---
> >  1 file changed, 18 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
> > b/lib/portage/_emirrordist/DeletionIterator.py
> > index dab6eaea2..6bc0fd09f 100644
> > --- a/lib/portage/_emirrordist/DeletionIterator.py
> > +++ b/lib/portage/_emirrordist/DeletionIterator.py
> > @@ -27,16 +27,31 @@ class DeletionIterator(object):
> > # require at least one successful stat()
> > exceptions = []
> > for layout in reversed(self._config.layouts):
> > +   path = os.path.join(distdir, 
> > layout.get_path(filename))
> > try:
> > -   st = os.stat(
> > -   os.path.join(distdir, 
> > layout.get_path(filename)))
> > +   st = os.stat(path)
> > except OSError as e:
> > -   exceptions.append(e)
> > +   # is it a dangling symlink?
> > +   try:
> > +   if os.path.islink(path):
> > +   os.unlink(path)
> > +   except OSError as e:
> > +   exceptions.append(e)
> > else:
> > if stat.S_ISREG(st.st_mode):
> > break
> 
> How about if we remove the above break so that we can eliminate the
> lstat loop below?

Oops, I forgot to remove it from my patch.  The 'break' is fine since we
need this dangling symlink elimination only if none of the files exist. 
If they do, then symlink removal post distfile removal will take care of
them.

> 
> 
> > else:
> > if exceptions:
> > +   # check for dangling symlinks
> > +   for layout in self._config.layouts:
> > +   path = os.path.join(distdir, 
> > layout.get_path(filename))
> > +   try:
> > +   st = os.lstat(path)
> > +   if 
> > stat.S_ISLNK(st.st_mode):
> > +   os.unlink(path)
> > +   except OSError as e:
> > +   pass
> > +
> > logging.error("stat failed on '%s' in 
> > distfiles: %s\n" %
> > (filename, '; '.join(str(x) for 
> > x in exceptions)))
> > continue
> > 
> 
> Looks good except that it would be nice to eliminate the second loop.

-- 
Best regards,
Michał Górny



signature.asc
Description: This is a digitally signed message part


Re: [gentoo-portage-dev] [PATCH] emirrordist: Clean dangling symlinks up

2019-10-21 Thread Zac Medico
On 10/20/19 3:23 AM, Michał Górny wrote:
> Bug: https://bugs.gentoo.org/697906
> Signed-off-by: Michał Górny 
> ---
>  lib/portage/_emirrordist/DeletionIterator.py | 21 +---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
> b/lib/portage/_emirrordist/DeletionIterator.py
> index dab6eaea2..6bc0fd09f 100644
> --- a/lib/portage/_emirrordist/DeletionIterator.py
> +++ b/lib/portage/_emirrordist/DeletionIterator.py
> @@ -27,16 +27,31 @@ class DeletionIterator(object):
>   # require at least one successful stat()
>   exceptions = []
>   for layout in reversed(self._config.layouts):
> + path = os.path.join(distdir, 
> layout.get_path(filename))
>   try:
> - st = os.stat(
> - os.path.join(distdir, 
> layout.get_path(filename)))
> + st = os.stat(path)
>   except OSError as e:
> - exceptions.append(e)
> + # is it a dangling symlink?
> + try:
> + if os.path.islink(path):
> + os.unlink(path)
> + except OSError as e:
> + exceptions.append(e)
>   else:
>   if stat.S_ISREG(st.st_mode):
>   break

How about if we remove the above break so that we can eliminate the
lstat loop below?


>   else:
>   if exceptions:
> + # check for dangling symlinks
> + for layout in self._config.layouts:
> + path = os.path.join(distdir, 
> layout.get_path(filename))
> + try:
> + st = os.lstat(path)
> + if 
> stat.S_ISLNK(st.st_mode):
> + os.unlink(path)
> + except OSError as e:
> + pass
> +
>   logging.error("stat failed on '%s' in 
> distfiles: %s\n" %
>   (filename, '; '.join(str(x) for 
> x in exceptions)))
>   continue
> 

Looks good except that it would be nice to eliminate the second loop.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-portage-dev] [PATCH] emirrordist: Report all stat() errors instead of first one

2019-10-21 Thread Zac Medico
On 10/20/19 2:54 AM, Michał Górny wrote:
> When DeletionIterator fails, report all stat() errors.  Reporting
> just the first one results in confusing logs, suggesting that one
> of the location did not exist but the other existed and was removed.
> ---
>  lib/portage/_emirrordist/DeletionIterator.py | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/portage/_emirrordist/DeletionIterator.py 
> b/lib/portage/_emirrordist/DeletionIterator.py
> index cd773b3c8..dab6eaea2 100644
> --- a/lib/portage/_emirrordist/DeletionIterator.py
> +++ b/lib/portage/_emirrordist/DeletionIterator.py
> @@ -25,20 +25,20 @@ class DeletionIterator(object):
>   distfiles_set.update(layout.get_filenames(distdir))
>   for filename in distfiles_set:
>   # require at least one successful stat()
> - first_exception = None
> + exceptions = []
>   for layout in reversed(self._config.layouts):
>   try:
>   st = os.stat(
>   os.path.join(distdir, 
> layout.get_path(filename)))
>   except OSError as e:
> - first_exception = e
> + exceptions.append(e)
>   else:
>   if stat.S_ISREG(st.st_mode):
>   break
>   else:
> - if first_exception is not None:
> + if exceptions:
>   logging.error("stat failed on '%s' in 
> distfiles: %s\n" %
> - (filename, first_exception))
> + (filename, '; '.join(str(x) for 
> x in exceptions)))
>   continue
>  
>   if filename in file_owners:
> 

Looks good. Please merge.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature