gemini-code-assist[bot] commented on code in PR #38688:
URL: https://github.com/apache/beam/pull/38688#discussion_r3301059314
##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -781,23 +787,54 @@ def _populate_requirements_cache(
]
if populate_cache_with_sdists:
- cmd_args.extend(['--no-binary', ':all:'])
+ download_dir = tempfile.mkdtemp(dir=temp_directory)
+ cmd_args.extend(['--dest', download_dir, '--no-binary', ':all:'])
+ _LOGGER.info('Executing command: %s', cmd_args)
+ processes.check_output(cmd_args, stderr=processes.STDOUT)
else:
language_implementation_tag = 'cp'
abi_suffix = 'm' if sys.version_info < (3, 8) else ''
abi_tag = 'cp%d%d%s' % (
sys.version_info[0], sys.version_info[1], abi_suffix)
- platform_tag = Stager._get_platform_for_default_sdk_container()
- cmd_args.extend([
- '--implementation',
- language_implementation_tag,
- '--abi',
- abi_tag,
- '--platform',
- platform_tag
- ])
- _LOGGER.info('Executing command: %s', cmd_args)
- processes.check_output(cmd_args, stderr=processes.STDOUT)
+ preferred_platform = Stager._get_platform_for_default_sdk_container()
+
+ # Fallback platform tags in case the preferred modern tag is too strict
+ # for some dependencies on PyPI.
+ try:
+ start_idx = _MANYLINUX_PLATFORMS.index(preferred_platform)
+ platforms = _MANYLINUX_PLATFORMS[start_idx:]
+ except ValueError:
+ platforms = [preferred_platform]
Review Comment:

Since we've refactored `_MANYLINUX_PLATFORMS` to `_MANYLINUX_BASE_TAGS` to
support multiple architectures, we should dynamically detect the architecture
suffix (e.g., `_x86_64` or `_aarch64`) from the `preferred_platform` and
reconstruct the platform list. Note that `manylinux2010` is not supported on
`aarch64`, so we should exclude it for ARM64.
```suggestion
arch = None
for suffix in ['_x86_64', '_aarch64']:
if preferred_platform.endswith(suffix):
arch = suffix
break
if arch:
platforms_with_arch = [
f'{base}{arch}'
for base in _MANYLINUX_BASE_TAGS
if not (base == 'manylinux2010' and arch == '_aarch64')
]
try:
start_idx = platforms_with_arch.index(preferred_platform)
platforms = platforms_with_arch[start_idx:]
except ValueError:
platforms = [preferred_platform]
else:
platforms = [preferred_platform]
```
##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -88,6 +88,14 @@
# One of the choices for user to use for requirements cache during staging
SKIP_REQUIREMENTS_CACHE = 'skip'
+# Ordered list of manylinux tags from newest (strictest) to oldest (most
compatible)
+# used for cross-platform binary dependency downloads.
+_MANYLINUX_PLATFORMS = [
+ 'manylinux_2_28_x86_64',
+ 'manylinux2014_x86_64', # equivalent to manylinux_2_17
+ 'manylinux2010_x86_64', # equivalent to manylinux_2_12
+]
Review Comment:

The hardcoded `_MANYLINUX_PLATFORMS` list only contains `_x86_64` platform
tags. This means that on ARM64 (`aarch64`) environments (such as Apple Silicon
or AWS Graviton), the fallback mechanism will be completely bypassed because
the preferred platform (e.g., `manylinux_2_28_aarch64`) won't be found in this
list.
To support both `x86_64` and `aarch64` architectures, we should define a
list of base manylinux tags and dynamically append the appropriate architecture
suffix.
```suggestion
_MANYLINUX_BASE_TAGS = [
'manylinux_2_28',
'manylinux2014', # equivalent to manylinux_2_17
'manylinux2010', # equivalent to manylinux_2_12
]
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]