Re: [gentoo-portage-dev] [PATCH] portage.manifest: Fix mis-parsing Manifests with numerical checksums

2017-11-19 Thread Zac Medico
On 11/19/2017 09:12 AM, Michał Górny wrote:
> Fix the regular expression used to parse Manifests not to fail horribly
> when one of the checksums accidentally happens to be all-digits.
> 
> The previous regular expression matched the filename with a greedy '.*',
> which takes as much of the line as it can while the rest still looks
> like a size followed by hash pairs. If one of the checksums happened to
> be purely numeric, everything up to that checksum was taken as the
> filename, and the checksum itself was taken as the file size. The
> expression could also accept an empty filename.
> 
> The updated regular expression uses '\S+' to match filenames, so the
> match ends at the first whitespace character and filenames can no longer
> contain spaces. Not that filenames with spaces ever worked reliably.
> 
> Spotted by Ulrich Müller.
> 
> Bug: https://bugs.gentoo.org/638148
> ---
>  pym/portage/manifest.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/pym/portage/manifest.py b/pym/portage/manifest.py
> index 4ec20515e..4bca61e86 100644
> --- a/pym/portage/manifest.py
> +++ b/pym/portage/manifest.py
> @@ -30,7 +30,7 @@ from portage.const import (MANIFEST2_HASH_DEFAULTS, MANIFEST2_IDENTIFIERS)
>  from portage.localization import _
>  
>  _manifest_re = re.compile(
> - r'^(' + '|'.join(MANIFEST2_IDENTIFIERS) + r') (.*)( \d+( \S+ \S+)+)$',
> + r'^(' + '|'.join(MANIFEST2_IDENTIFIERS) + r') (\S+)( \d+( \S+ \S+)+)$',
>   re.UNICODE)
>  
>  if sys.hexversion >= 0x300:
> 

Looks good, please merge.
-- 
Thanks,
Zac



[gentoo-portage-dev] [PATCH] portage.manifest: Fix mis-parsing Manifests with numerical checksums

2017-11-19 Thread Michał Górny
Fix the regular expression used to parse Manifests not to fail horribly
when one of the checksums accidentally happens to be all-digits.

The previous regular expression matched the filename with a greedy '.*',
which takes as much of the line as it can while the rest still looks
like a size followed by hash pairs. If one of the checksums happened to
be purely numeric, everything up to that checksum was taken as the
filename, and the checksum itself was taken as the file size. The
expression could also accept an empty filename.

The updated regular expression uses '\S+' to match filenames, so the
match ends at the first whitespace character and filenames can no longer
contain spaces. Not that filenames with spaces ever worked reliably.

Spotted by Ulrich Müller.

Bug: https://bugs.gentoo.org/638148
---
 pym/portage/manifest.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pym/portage/manifest.py b/pym/portage/manifest.py
index 4ec20515e..4bca61e86 100644
--- a/pym/portage/manifest.py
+++ b/pym/portage/manifest.py
@@ -30,7 +30,7 @@ from portage.const import (MANIFEST2_HASH_DEFAULTS, MANIFEST2_IDENTIFIERS)
 from portage.localization import _
 
 _manifest_re = re.compile(
-   r'^(' + '|'.join(MANIFEST2_IDENTIFIERS) + r') (.*)( \d+( \S+ \S+)+)$',
+   r'^(' + '|'.join(MANIFEST2_IDENTIFIERS) + r') (\S+)( \d+( \S+ \S+)+)$',
    re.UNICODE)
 
 if sys.hexversion >= 0x300:
-- 
2.15.0
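
For readers who want to reproduce the mis-parse outside of Portage, here is
a minimal standalone sketch. It is not part of the patch. The IDENTIFIERS
tuple is an assumed stand-in for portage.const.MANIFEST2_IDENTIFIERS, the
sample Manifest lines are made up, and name_and_size() is a hypothetical
helper used only for this illustration.

```python
import re

# Assumed stand-in for portage.const.MANIFEST2_IDENTIFIERS; the real
# tuple lives in Portage and may differ.
IDENTIFIERS = ("AUX", "MISC", "DIST", "EBUILD")

PREFIX = r'^(' + '|'.join(IDENTIFIERS) + r') '
SUFFIX = r'( \d+( \S+ \S+)+)$'

# Old expression: greedy '.*' for the filename.
old_re = re.compile(PREFIX + r'(.*)' + SUFFIX, re.UNICODE)
# New expression: '\S+' stops the filename at the first whitespace.
new_re = re.compile(PREFIX + r'(\S+)' + SUFFIX, re.UNICODE)


def name_and_size(regex, text):
    """Hypothetical helper: return (filename, size) or None if no match."""
    m = regex.match(text)
    if m is None:
        return None
    # Group 3 is ' <size> <hash> <value> ...'; its first token is the size.
    size = m.group(3).split()[0]
    return m.group(2), int(size)


# Made-up entry whose first checksum happens to be all digits.
numeric = "DIST foo-1.0.tar.gz 4096 BLAKE2B 1234567890 SHA512 abcdef"
# Made-up entry with an empty filename (two consecutive spaces).
empty = "DIST  4096 SHA512 abcdef"

for label, regex in (("old", old_re), ("new", new_re)):
    print(label, name_and_size(regex, numeric), name_and_size(regex, empty))
```

With the old expression, the numeric checksum is reported as the size and
the real size plus the hash name end up inside the filename. With the new
one, the first line parses as expected and the empty-filename line is
rejected.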