PG1204 opened a new pull request, #5320:
URL: https://github.com/apache/texera/pull/5320
> ⚠️ This PR is stacked on #5278. Until that lands, the diff below also
includes #5278's operator + codegen + spec changes. The new code in this PR is
`codegen/ImageTaskCodegen.scala`, the image-related additions to
`codegen/PythonCodegenBase.scala`, the new image fields on
`HuggingFaceInferenceOpDesc.scala`, the frontend image-upload component, and
the image-task tests in `HuggingFaceInferenceOpDescSpec.scala`. Once #5278
merges, this diff will auto-clean to ~856 lines.
### What changes were proposed in this PR?
Adds the image task family — 9 HF pipeline tasks — as the second
`TaskCodegen` plugged into the dispatcher established by #5278:
Image-only: `image-classification`, `object-detection`, `image-segmentation`,
`image-to-text`.
Image + prompt: `visual-question-answering`, `document-question-answering`,
`zero-shot-image-classification`, `image-text-to-text`, `image-to-image`.
- `codegen/ImageTaskCodegen.scala` supplies the per-task payload + parse
Python branches for all 9 tasks.
- `TaskCodegen` trait gains a `tasks: Set[String]` default method (defaults
to `Set(task)`) so a single codegen can register under multiple task strings;
`ImageTaskCodegen` is the first multi-task codegen to use it.
- `CodegenContext` extended with `imageInput` + `inputImageColumn`
(`EncodableString`).
- `HuggingFaceInferenceOpDesc.scala` gains 2 new `@JsonProperty` fields and
registers `ImageTaskCodegen` via the new `tasks` flat-map.
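The multi-task registration described in the bullets above can be sketched as follows. This is a Python analogue of the Scala design, not the PR's code: `TextGenerationCodegen`, `build_registry`, and the duplicate-registration check are hypothetical names and behavior introduced for illustration, and only `TaskCodegen`, `ImageTaskCodegen`, `task`, and `tasks` come from the description.

```python
from typing import Dict, FrozenSet, Iterable


class TaskCodegen:
    """Hypothetical stand-in for the Scala `TaskCodegen` trait."""

    task: str = ""

    @property
    def tasks(self) -> FrozenSet[str]:
        # Default: a codegen handles only its single `task` string.
        return frozenset({self.task})


class TextGenerationCodegen(TaskCodegen):
    # Assumed name for the single-task codegen introduced by #5278.
    task = "text-generation"


class ImageTaskCodegen(TaskCodegen):
    task = "image-classification"

    @property
    def tasks(self) -> FrozenSet[str]:
        # Override: one codegen registers under all nine image task strings.
        return frozenset({
            "image-classification", "object-detection", "image-segmentation",
            "image-to-text", "visual-question-answering",
            "document-question-answering", "zero-shot-image-classification",
            "image-text-to-text", "image-to-image",
        })


def build_registry(codegens: Iterable[TaskCodegen]) -> Dict[str, TaskCodegen]:
    """Flat-map each codegen over its task strings into one lookup table."""
    registry: Dict[str, TaskCodegen] = {}
    for codegen in codegens:
        for name in codegen.tasks:
            if name in registry:
                # Assumption: two codegens claiming one task is an error.
                raise ValueError(f"task {name!r} registered twice")
            registry[name] = codegen
    return registry


REGISTRY = build_registry([TextGenerationCodegen(), ImageTaskCodegen()])
```

With this shape, a lookup such as `REGISTRY["image-to-image"]` and `REGISTRY["object-detection"]` resolves to the same `ImageTaskCodegen` instance, while single-task codegens need no change because they inherit the default.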
`PythonCodegenBase.scala` grows to host the shared image infrastructure:
- Task-family tuples (`image_only_tasks`, `image_prompt_tasks`,
`image_tasks`) + `image_headers` in `process_table`.
- Per-row image-bytes resolution from upload or column with
`_read_image_input` / `_read_binary_value` / `_compress_image_bytes`.
- `_post_with_fallback` extended with `raw_binary_headers` +
`use_raw_binary_body`; adds image-text-to-text chat-completions and
model-author vision branches.
- `_call_provider` gains zai-org, Replicate predictions + polling, Fal-ai,
Wavespeed submit+poll branches, and image embedding for OpenAI-compatible /
unknown-provider fallbacks.
- Image content-type response handling returns `data:image/...;base64,...`
URLs.
- Image helpers added: `_read_image_input`, `_compress_image_bytes`,
`_image_input_as_base64`, `_read_binary_value`, `_looks_like_html`,
`_html_to_image_bytes`, `_extract_json_arg`, `_url_to_data_url`.
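For reviewers unfamiliar with the `data:image/...;base64,...` form mentioned above, here is a minimal sketch of how an image response body might be wrapped into such a URL. The function names and bodies are illustrative assumptions; they are not the PR's `_url_to_data_url` or `_looks_like_html` implementations, which are not shown in this description.

```python
import base64


def to_data_url(body: bytes, content_type: str) -> str:
    """Wrap raw image bytes in a data URL, using the response content type."""
    # Drop parameters such as "; charset=..." and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime.startswith("image/"):
        raise ValueError(f"not an image content type: {content_type!r}")
    encoded = base64.b64encode(body).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def looks_like_html(body: bytes) -> bool:
    """Cheap heuristic: does the body start like an HTML document?"""
    head = body[:256].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

A caller could use the heuristic to avoid encoding an HTML error page that a provider returned in place of image bytes, for example by checking `looks_like_html(body)` before calling `to_data_url(body, content_type)`.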
Frontend integration (HF lines only — no agent / dataset noise):
`HuggingFaceImageUploadComponent` declared in `app.module.ts`,
`huggingface-image-upload` formly type registered, image upload component
.ts/.html/.scss + `HuggingFace.png` + `sample-image.png` assets.
User-input strings continue to flow through `pyb"..."` + `EncodableString`
so they reach Python as `self.decode_python_template('<base64>')` rather than
raw literals. `PythonCodeRawInvalidTextSpec` still passes
(117/117 descriptors `py_compile` cleanly).
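The reason for routing user input through a base64 payload can be illustrated with a small sketch. It assumes that `decode_python_template` simply base64-decodes UTF-8 text; the real method's behavior is not shown in this description, and `encode_user_string` and `GeneratedOperator` are hypothetical names standing in for the Scala `pyb"..."` interpolator and the generated operator class.

```python
import base64


def encode_user_string(value: str) -> str:
    """Render a user string as a Python expression that decodes at runtime."""
    payload = base64.b64encode(value.encode("utf-8")).decode("ascii")
    # The base64 alphabet contains no quotes, backslashes, or newlines, so
    # the payload cannot terminate the surrounding string literal.
    return f"self.decode_python_template('{payload}')"


class GeneratedOperator:
    """Hypothetical stand-in for the generated Python operator."""

    def decode_python_template(self, payload: str) -> str:
        # Assumed implementation: plain base64 to UTF-8 text.
        return base64.b64decode(payload).decode("utf-8")


# A prompt that would break out of a naive '...' literal.
tricky = "'); import os  # \"\"\"\n"
expression = encode_user_string(tricky)
# Evaluate the generated expression as the operator would at runtime.
round_tripped = eval(expression, {"self": GeneratedOperator()})
```

Here `round_tripped` equals `tricky` exactly, while `expression` itself contains only the fixed wrapper and base64 characters, which is why the generated source stays compilable regardless of what the user typed.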
### Any related issues, documentation, or discussions?
- Tracking issue: #5319
- Closes: #5319
- Stacked on: #5278 (operator + text-generation — issue #5277)
- Parent issue: #5041
- Closed sibling issue: #5134 (REST resource — landed via #5124)
### How was this PR tested?
- `sbt "WorkflowOperator/compile; WorkflowOperator/Test/compile"` clean.
- `sbt scalafmtCheck` clean.
- `sbt "WorkflowOperator/testOnly
org.apache.texera.amber.operator.huggingFace.HuggingFaceInferenceOpDescSpec"` —
18/18 pass (PR 2's 13 spec tests + 5 new image-task tests: image-only routing,
VQA / document-QA payload, image-text-to-text chat-completions, image-to-image
data-URL parse, all-9-tasks dispatcher coverage).
- `sbt "WorkflowOperator/testOnly
org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"` — 117/117
descriptors `py_compile` cleanly with the new operator code paths, no marker
leaks.
- Generated Python verified via `python3 -m py_compile` on sample image-task
outputs.
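The `py_compile` check listed above can also be run programmatically. The sketch below uses the standard-library `py_compile` module; the helper name `compiles_cleanly` and the temporary file name are illustrative assumptions, and the sample sources are placeholders rather than actual generated operator code.

```python
import py_compile
import tempfile
from pathlib import Path


def compiles_cleanly(source: str) -> bool:
    """Return True if `source` byte-compiles without a syntax error."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "generated_op.py"
        path.write_text(source, encoding="utf-8")
        try:
            # doraise=True turns compile errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(
                str(path),
                cfile=str(Path(tmp) / "generated_op.pyc"),
                doraise=True,
            )
        except py_compile.PyCompileError:
            return False
        return True
```

Note that this only checks syntax, in the same way as `python3 -m py_compile`; it does not import or execute the generated operator.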
### Was this PR authored or co-authored using generative AI tooling?
Yes, co-authored with Claude Opus 4.7.