Copilot commented on code in PR #5570:
URL: https://github.com/apache/texera/pull/5570#discussion_r3425805016
##########
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/PythonCodegenBase.scala:
##########
@@ -207,18 +239,191 @@ object PythonCodegenBase {
| summary = "; ".join(errors) if errors else "no providers available"
| return last_resp, summary
|
- | def _call_provider(self, provider_name, provider_id, json_headers, pipeline_payload, prompt_value):
+ | def _call_provider(self, provider_name, provider_id, json_headers, raw_binary_headers, pipeline_payload, use_raw_binary_body, prompt_value):
| '''Route to a third-party provider using its native API format.
- | For the text-gen-only build this covers the OpenAI-compatible chat
- | providers and an unknown-provider fallback that tries the pipeline
- | format then chat completions. Image / audio / media routing will
- | be added in subsequent PRs alongside the corresponding task
- | codegens.
+ | Handles OpenAI-compatible chat providers for text-gen, zai-org's
+ | custom API, Replicate / Fal-ai / Wavespeed for media-generation
+ | and image-to-image, and an unknown-provider fallback that tries
+ | the pipeline format then chat completions.
| '''
| base = f"https://router.huggingface.co/{provider_name}"
+ | task = self.TASK
+ | img_b64 = ""
+ | if use_raw_binary_body and isinstance(pipeline_payload, bytes):
+ | img_b64 = base64.b64encode(pipeline_payload).decode("utf-8")
+ | elif isinstance(pipeline_payload, dict):
+ | # Image+prompt tasks (visual-question-answering, document-question-
+ | # answering, zero-shot-image-classification) build dict payloads
+ | # with use_raw_binary_body=False, so the raw-bytes extraction above
+ | # doesn't fire. Without this branch, when one of those tasks routes
+ | # to a third-party provider (replicate / fal-ai / wavespeed /
+ | # OpenAI-compatible / unknown-fallback) the image is silently
+ | # dropped and only prompt_value is sent — they happen to work only
+ | # on hf-inference, where the dict goes through as JSON. Surfacing
+ | # img_b64 here keeps the provider-specific branches below image-
+ | # aware without each branch needing to know the dict shape.
+ | inputs = pipeline_payload.get("inputs")
+ | if isinstance(inputs, dict) and isinstance(inputs.get("image"), str):
+ | img_b64 = inputs["image"]
+ | elif task == "zero-shot-image-classification" and isinstance(inputs, str):
+ | img_b64 = inputs
+ |
+ | # zai-org: custom /api/paas/v4/ surface.
+ | if provider_name == "zai-org":
+ | zai_headers = {**json_headers, "x-source-channel": "hugging_face", "accept-language": "en-US,en"}
+ | if task in ("image-to-text", "image-text-to-text"):
+ | url = f"{base}/api/paas/v4/layout_parsing"
+ | file_data = f"data:image/png;base64,{img_b64}" if img_b64 else ""
+ | return requests.post(url, headers=zai_headers, json={"model": provider_id, "file": file_data}, timeout=120)
+ | url = f"{base}/api/paas/v4/chat/completions"
+ | messages = [{"role": "user", "content": prompt_value}]
+ | if img_b64:
+ | messages = [{"role": "user", "content": [
+ | {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
+ | {"type": "text", "text": prompt_value if prompt_value else "What is in this image?"},
+ | ]}]
+ | return requests.post(url, headers=zai_headers, json={"model": provider_id, "messages": messages}, timeout=120)
+ |
+ | # Replicate: synchronous predictions endpoint with polling fallback.
+ | if provider_name == "replicate":
+ | url = f"{base}/v1/models/{provider_id}/predictions"
+ | hdrs = {**json_headers, "Prefer": "wait"}
+ | if task == "text-to-speech":
+ | inp = {"text": prompt_value}
+ | elif task in ("text-to-image", "text-to-video"):
+ | inp = {"prompt": prompt_value}
+ | elif task == "automatic-speech-recognition" and img_b64:
Review Comment:
In the Replicate provider routing, `audio-classification` is handled as a
generic `img_b64` payload, so its audio bytes end up being sent under the
`image` key. Because `audio-classification` is an `audio_only_task` (raw
bytes), it should be encoded as an `audio` data URL, the same way ASR is.
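
A minimal sketch of the routing this comment asks for, written as a standalone function so it can be run in isolation. It is not the PR's code: the helper name `build_replicate_input`, the `AUDIO_DATA_URL_TASKS` tuple, the `audio/wav` MIME type, and the shape of the generic `image` branch are all assumptions made for illustration. Only the `text` / `prompt` keys and the idea of an `audio` data URL come from the diff and the comment above.

```python
# Hypothetical helper, not taken from the PR: builds the "input" dict for a
# Replicate prediction request from the task name and a base64 payload.

# Assumption: both audio tasks should be encoded the same way.
AUDIO_DATA_URL_TASKS = ("automatic-speech-recognition", "audio-classification")


def build_replicate_input(task, prompt_value, payload_b64, audio_mime="audio/wav"):
    """Return the input dict for one Replicate call.

    payload_b64 is the base64 string the diff calls img_b64; audio_mime is an
    assumed default, since the real MIME type is not shown in the diff.
    """
    if task == "text-to-speech":
        return {"text": prompt_value}
    if task in ("text-to-image", "text-to-video"):
        return {"prompt": prompt_value}
    if task in AUDIO_DATA_URL_TASKS and payload_b64:
        # Audio bytes go under "audio" as a data URL, never under "image".
        return {"audio": f"data:{audio_mime};base64,{payload_b64}"}
    if payload_b64:
        # Assumed shape of the generic image branch.
        return {"image": f"data:image/png;base64,{payload_b64}"}
    return {"prompt": prompt_value}
```

Checking the audio tasks before the generic `payload_b64` branch is the point of the ordering: once `audio-classification` is matched first, it can no longer reach the `image` key.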
##########
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/PythonCodegenBase.scala:
##########
@@ -266,11 +483,12 @@ object PythonCodegenBase {
| # --- resolve all available inference providers for this model (tried in order) ---
| providers = self._resolve_providers(token)
|
- | # --- validate prompt column exists ---
- | assert prompt_col in table.columns, (
- | f"Prompt column '{prompt_col}' not found in input table. "
- | f"Available columns: {list(table.columns)}"
- | )
+ | # --- validate prompt column exists (skipped for binary-only tasks) ---
+ | if task not in image_only_tasks and task not in audio_only_tasks:
+ | assert prompt_col in table.columns, (
+ | f"Prompt column '{prompt_col}' not found in input table. "
+ | f"Available columns: {list(table.columns)}"
+ | )
Review Comment:
The prompt-column assertion still runs for `image_prompt_tasks`, so those
tasks cannot use the intended fallback prompt when the configured prompt
column is missing (later code explicitly handles `task in image_prompt_tasks
and prompt_col not in table.columns`). As written, that fallback path is
unreachable and an `AssertionError` is raised instead.
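
One way to make the fallback reachable is to exempt `image_prompt_tasks` from the check as well. The sketch below illustrates that under stated assumptions: the function names `requires_prompt_column` and `check_prompt_column` are hypothetical, the contents of the three task sets are placeholders (the real memberships are defined elsewhere in the PR), and a plain list of column names stands in for `table.columns`.

```python
# Hypothetical task sets for illustration; the real memberships are defined
# elsewhere in the PR's generated code.
IMAGE_ONLY_TASKS = {"image-classification"}
AUDIO_ONLY_TASKS = {"audio-classification"}
IMAGE_PROMPT_TASKS = {
    "visual-question-answering",
    "document-question-answering",
    "zero-shot-image-classification",
}


def requires_prompt_column(task):
    """True only for tasks that cannot run without the prompt column."""
    return (
        task not in IMAGE_ONLY_TASKS
        and task not in AUDIO_ONLY_TASKS
        and task not in IMAGE_PROMPT_TASKS  # these fall back to a default prompt
    )


def check_prompt_column(task, prompt_col, columns):
    """Raise if a required prompt column is missing; otherwise do nothing."""
    if requires_prompt_column(task) and prompt_col not in columns:
        raise AssertionError(
            f"Prompt column '{prompt_col}' not found in input table. "
            f"Available columns: {list(columns)}"
        )
```

With this guard, a call such as `check_prompt_column("visual-question-answering", "prompt", ["image"])` returns without raising, leaving the later fallback-prompt branch to handle the missing column.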
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.