gemini-code-assist[bot] commented on PR #37506: URL: https://github.com/apache/beam/pull/37506#issuecomment-3849780688
## Summary of Changes

Hello @AMOOOMA, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the `RunInference` transform by introducing an optional model manager. This manager is designed to automatically handle GPU memory, making the inference process more efficient, especially when dealing with multiple or large models. The changes include renaming an existing internal class for better semantic clarity, updating the core `RunInference` DoFn to accept and use the new manager, and adding comprehensive tests to verify its correct operation, particularly in GPU environments.

### Highlights

* **Model Manager Integration**: The `RunInference` transform now supports an optional model manager that automatically controls GPU memory, improving resource utilization for inference tasks (see the usage sketch after this list).
* **Refactoring and Renaming**: The internal `_ModelManager` class has been renamed to `_ModelHandlerManager` to more accurately reflect its role in managing model handlers rather than raw models, and its docstring has been updated for clarity.
* **Enhanced Model Loading**: A new `_ProxyLoader` class wraps model loaders for `MultiProcessShared`, ensuring proper model management when the model manager is active.
* **Updated `_SharedModelWrapper`**: The `_SharedModelWrapper` now integrates with the new model manager, acquiring and releasing models through it, which is crucial for GPU memory control.
* **New Integration Tests**: New integration tests, specifically for GPU-based HuggingFace models, validate `RunInference` when the model manager is used, covering large inputs, large models, and parallel inference branches.
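To make the new option concrete, here is a minimal sketch of how a pipeline might opt in. The `use_model_manager` and `model_manager_args` parameters are the ones this PR adds; the choice of model handler, task, and inputs below is purely illustrative and not taken from the PR's tests.

```python
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.huggingface_inference import (
    HuggingFacePipelineModelHandler)

# Any existing model handler should work; a HuggingFace pipeline handler
# is used here only as an example.
model_handler = HuggingFacePipelineModelHandler(task='sentiment-analysis')

with beam.Pipeline() as p:
  _ = (
      p
      | beam.Create(['Apache Beam makes ML inference easy.'])
      # use_model_manager=True opts in to automatic GPU memory control;
      # per the changelog, model_manager_args can optionally forward
      # configuration to the underlying ModelManager.
      | RunInference(model_handler, use_model_manager=True))
```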
<details>
<summary><b>Changelog</b></summary>

* **sdks/python/apache_beam/ml/inference/base.py**
  * Imported `ModelManager` and added `ImportError` handling.
  * Renamed `_ModelManager` to `_ModelHandlerManager` and updated its docstring.
  * Updated type hints and class references from `_ModelManager` to `_ModelHandlerManager` in `KeyedModelHandler` and related methods.
  * Added `use_model_manager` and `model_manager_args` parameters to `RunInference.__init__` and `_RunInferenceDoFn.__init__`.
  * Introduced a `_ProxyLoader` class to wrap model loaders for `MultiProcessShared`.
  * Modified `_SharedModelWrapper` to conditionally use the `ModelManager` for model acquisition and release.
  * Updated `_RunInferenceDoFn._load_model` to initialize and use the `ModelManager` based on the `use_model_manager` flag.
  * Adjusted `_RunInferenceDoFn._run_inference` to acquire and release models via the `ModelManager` when enabled.
* **sdks/python/apache_beam/ml/inference/base_test.py**
  * Imported the `multiprocessing` module.
  * Renamed test methods related to `_ModelManager` to `_ModelHandlerManager` (e.g., `test_model_manager_loads_shared_model` to `test_model_handler_manager_loads_shared_model`).
  * Updated instantiations of `base._ModelManager` to `base._ModelHandlerManager` in existing tests.
  * Added new test cases for `RunInference` with `use_model_manager=True`, including tests that pass `model_manager_args`.
  * Introduced the `SimpleFakeModelHanlder` and `try_import_model_manager` helpers.
  * Added a `ModelManagerTest` class with a `tearDown` method to clean up multiprocessing children.
* **sdks/python/apache_beam/ml/inference/model_manager_it_test.py**
  * Added a new integration test file for the model manager.
  * Included a `HuggingFaceGpuTest` class with tests for `RunInference` using `use_model_manager=True` on GPU.
  * Tests cover sentiment analysis with large inputs, large RoBERTa models, and parallel inference branches.
  * Implemented `unittest.skipIf` to skip GPU-specific tests if CUDA is not available or if `ModelManager` cannot be imported (a sketch of this skip pattern follows the changelog).

</details>
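The skip pattern described in the last changelog entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual test code: the import path for `ModelManager` and the test body are hypothetical, while `unittest.skipIf` and `torch.cuda.is_available()` are standard APIs.

```python
import unittest

try:
  import torch
  CUDA_AVAILABLE = torch.cuda.is_available()
except ImportError:
  CUDA_AVAILABLE = False

try:
  # Hypothetical import path; the real module location is defined in the PR.
  from apache_beam.ml.inference.model_manager import ModelManager
except ImportError:
  ModelManager = None


@unittest.skipIf(not CUDA_AVAILABLE, 'CUDA is not available')
@unittest.skipIf(ModelManager is None, 'ModelManager could not be imported')
class HuggingFaceGpuTest(unittest.TestCase):
  def test_sentiment_analysis_with_model_manager(self):
    # Illustrative placeholder; the real tests run RunInference pipelines
    # with use_model_manager=True on GPU.
    self.assertIsNotNone(ModelManager)


if __name__ == '__main__':
  unittest.main()
```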
<details>
<summary><b>Activity</b></summary>

* The pull request was authored by AMOOOMA with the goal of updating `RunInference` to integrate with a model manager for GPU memory control.
* The description highlights the optional nature of the model manager and the renaming of `_ModelManager` to `ModelHandlerManager` to better align with its function.
* The author included a standard contribution checklist.

</details>

<details>
<summary><b>Using Gemini Code Assist</b></summary>
<br>

The full guide for Gemini Code Assist can be found on our [documentation page](https://developers.google.com/gemini-code-assist/docs/review-github-code); here are some quick tips.

<b>Invoking Gemini</b>

You can request assistance from Gemini at any point by creating a comment using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.

Feature | Command | Description
--- | --- | ---
Code Review | `/gemini review` | Performs a code review for the current pull request in its current state.
Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state.
Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | `/gemini help` | Displays a list of available commands.

<b>Customization</b>

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github).

<b>Limitations & Feedback</b>

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up [here](https://google.qualtrics.com/jfe/form/SV_2cyuGuTWsEw84yG).

<b>You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the [Gemini Code Assist IDE Extension](https://cloud.google.com/products/gemini/code-assist).</b>

</details>

[^1]: Review the [Privacy Notices](https://policies.google.com/privacy), [Generative AI Prohibited Use Policy](https://policies.google.com/terms/generative-ai/use-policy), [Terms of Service](https://policies.google.com/terms), and learn how to configure Gemini Code Assist in GitHub [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github). Gemini can make mistakes, so double-check it and [use code with caution](https://support.google.com/legal/answer/13505487).
