Re: [PR] Add image generation code to Gemini Model Handler [beam]

via GitHub Fri, 17 Oct 2025 13:51:11 -0700


jrmccluskey commented on code in PR #36177:
URL: https://github.com/apache/beam/pull/36177#discussion_r2437126876



##########
sdks/python/apache_beam/ml/inference/gemini_inference.py:
##########
@@ -51,11 +54,46 @@ def _retry_on_appropriate_service_error(exception: Exception) -> bool:
   return exception.code == 429 or exception.code >= 500
 
 
-def generate_from_string(
+def generate_text_from_string(
     model_name: str,
     batch: Sequence[str],
     model: genai.Client,
     inference_args: dict[str, Any]):
+  """ Request function that expects inputs to be composed of strings, then
+  sends requests to Gemini to generate text responses based on the text
+  prompts.
+
+  Args:
+    model_name: the Gemini model to use for the request. This model should be
+      a text generation model.
+    batch: the string inputs to be sent to Gemini for text generation.
+    model: the genai Client
+    inference_args: any additional arguments passed to the generate_content
+      call.
+  """
+  return model.models.generate_content(
+      model=model_name, contents=batch, **inference_args)
+
+
+def generate_image_from_strings_and_images(
+    model_name: str,
+    batch: Sequence[list[Union[str, Image, Part]]],
+    model: genai.Client,
+    inference_args: dict[str, Any]):
+  """ Request function that expects inputs to be composed of lists of strings
+  and PIL Image instances, then sends requests to Gemini to generate images
+  based on the text prompts and contextual images. This is currently intended
+  to be used with the gemini-2.5-flash-image model (AKA Nano Banana).
+
+  Args:
+    model_name: the Gemini model to use for the request. This model should be
+      an image generation model such as gemini-2.5-flash-image.
+    batch: the inputs to be sent to Gemini for image generation as prompts.
+      Composed of text prompts and contextual pillow Images.
+    model: the genai Client
+    inference_args: any additional arguments passed to the generate_content
+      call.
+  """
   return model.models.generate_content(

Review Comment:
   The type hinting is the major distinction here; the actual generation call
to Gemini is largely distinguished by the model you're calling (the docstring
covers this somewhat). The API is pretty simple from a user standpoint, so the
separate request functions are really just there to make sure we get
reasonable typing in here.
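
   To illustrate the point, here is a minimal sketch, not the code in this PR.
`FakeClient` and `FakeModels` are hypothetical stand-ins for `genai.Client`,
`bytes` stands in for the `Image`/`Part` inputs, and `placeholder-text-model`
is a made-up model name. Both request functions have the same body and differ
only in the type hint on `batch`; the model name passed in decides what kind of
generation happens.

```python
from typing import Any, Sequence, Union


class FakeModels:
  """Hypothetical stand-in for genai.Client().models; records each call."""
  def __init__(self):
    self.calls = []

  def generate_content(self, model, contents, **kwargs):
    self.calls.append((model, contents, kwargs))
    return f"response from {model}"


class FakeClient:
  """Hypothetical stand-in for genai.Client."""
  def __init__(self):
    self.models = FakeModels()


def generate_text_from_string(
    model_name: str,
    batch: Sequence[str],
    model: FakeClient,
    inference_args: dict[str, Any]):
  # Text prompts only; the hint documents that each element is a string.
  return model.models.generate_content(
      model=model_name, contents=batch, **inference_args)


def generate_image_from_strings_and_images(
    model_name: str,
    batch: Sequence[list[Union[str, bytes]]],  # bytes stands in for Image/Part
    model: FakeClient,
    inference_args: dict[str, Any]):
  # Same call as above; only the expected input shape differs.
  return model.models.generate_content(
      model=model_name, contents=batch, **inference_args)


client = FakeClient()
text_result = generate_text_from_string(
    "placeholder-text-model", ["Describe a banana."], client, {})
image_result = generate_image_from_strings_and_images(
    "gemini-2.5-flash-image",
    [["Draw this banana at night.", b"fake-image-bytes"]],
    client,
    {})
```

   A type checker would flag a list of mixed strings and images passed to the
text function, even though the underlying call would look identical at
runtime.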



