gemini-code-assist[bot] commented on code in PR #38688:
URL: https://github.com/apache/beam/pull/38688#discussion_r3301059314


##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -781,23 +787,54 @@ def _populate_requirements_cache(
       ]
 
       if populate_cache_with_sdists:
-        cmd_args.extend(['--no-binary', ':all:'])
+        download_dir = tempfile.mkdtemp(dir=temp_directory)
+        cmd_args.extend(['--dest', download_dir, '--no-binary', ':all:'])
+        _LOGGER.info('Executing command: %s', cmd_args)
+        processes.check_output(cmd_args, stderr=processes.STDOUT)
       else:
         language_implementation_tag = 'cp'
         abi_suffix = 'm' if sys.version_info < (3, 8) else ''
         abi_tag = 'cp%d%d%s' % (
             sys.version_info[0], sys.version_info[1], abi_suffix)
-        platform_tag = Stager._get_platform_for_default_sdk_container()
-        cmd_args.extend([
-            '--implementation',
-            language_implementation_tag,
-            '--abi',
-            abi_tag,
-            '--platform',
-            platform_tag
-        ])
-      _LOGGER.info('Executing command: %s', cmd_args)
-      processes.check_output(cmd_args, stderr=processes.STDOUT)
+        preferred_platform = Stager._get_platform_for_default_sdk_container()
+
+        # Fallback platform tags in case the preferred modern tag is too strict
+        # for some dependencies on PyPI.
+        try:
+          start_idx = _MANYLINUX_PLATFORMS.index(preferred_platform)
+          platforms = _MANYLINUX_PLATFORMS[start_idx:]
+        except ValueError:
+          platforms = [preferred_platform]

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   Since we've refactored `_MANYLINUX_PLATFORMS` to `_MANYLINUX_BASE_TAGS` to 
support multiple architectures, we should dynamically detect the architecture 
suffix (e.g., `_x86_64` or `_aarch64`) from the `preferred_platform` and 
reconstruct the platform list. Note that `manylinux2010` is not supported on 
`aarch64`, so we should exclude it for ARM64.
   
   ```suggestion
           arch = None
           for suffix in ['_x86_64', '_aarch64']:
             if preferred_platform.endswith(suffix):
               arch = suffix
               break
   
           if arch:
             platforms_with_arch = [
                 f'{base}{arch}'
                 for base in _MANYLINUX_BASE_TAGS
                 if not (base == 'manylinux2010' and arch == '_aarch64')
             ]
             try:
               start_idx = platforms_with_arch.index(preferred_platform)
               platforms = platforms_with_arch[start_idx:]
             except ValueError:
               platforms = [preferred_platform]
           else:
             platforms = [preferred_platform]
   ```



##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -88,6 +88,14 @@
 # One of the choices for user to use for requirements cache during staging
 SKIP_REQUIREMENTS_CACHE = 'skip'
 
+# Ordered list of manylinux tags from newest (strictest) to oldest (most 
compatible)
+# used for cross-platform binary dependency downloads.
+_MANYLINUX_PLATFORMS = [
+    'manylinux_2_28_x86_64',
+    'manylinux2014_x86_64',  # equivalent to manylinux_2_17
+    'manylinux2010_x86_64',  # equivalent to manylinux_2_12
+]

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   The hardcoded `_MANYLINUX_PLATFORMS` list only contains `_x86_64` platform 
tags. This means that on ARM64 (`aarch64`) environments (such as Apple Silicon 
or AWS Graviton), the fallback mechanism will be completely bypassed because 
the preferred platform (e.g., `manylinux_2_28_aarch64`) won't be found in this 
list.
   
   To support both `x86_64` and `aarch64` architectures, we should define a 
list of base manylinux tags and dynamically append the appropriate architecture 
suffix.
   
   ```suggestion
   _MANYLINUX_BASE_TAGS = [
       'manylinux_2_28',
       'manylinux2014',  # equivalent to manylinux_2_17
       'manylinux2010',  # equivalent to manylinux_2_12
   ]
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to