[I] Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation) [texera]

via GitHub Thu, 28 May 2026 11:48:33 -0700


PG1204 opened a new issue, #5277:
URL: https://github.com/apache/texera/issues/5277


   ### Task Summary
   
   ### Feature Summary
   
   The HuggingFace inference operator (#5041) needs to cover ~20 HF pipeline 
tasks (text-generation, image-classification, ASR, text-to-image, …). To land 
it cleanly and let the per-task work proceed in parallel, the operator is 
introduced via a dispatcher + per-task codegen architecture: a thin 
`HuggingFaceInferenceOpDesc` selects a `TaskCodegen` based on the configured 
task, and the selected codegen contributes the per-task Python payload + parse 
snippets. Shared infrastructure (provider fallback, HTTP loop, response-parsing 
framework) lives in `PythonCodegenBase`.
   
   This issue covers shipping the dispatcher pattern + the first task family 
(text-generation) end-to-end. Subsequent child issues add the image, audio / 
media-generation, and QA / ranking task families by introducing new `*Codegen` 
objects and registering them in the dispatcher map. The architecture lets each 
task-family PR stay focused: a new task family means one new file plus one 
entry in the dispatcher map — no surgery on the shared infrastructure or other 
codegens.
   
   Concretely, landing this would enable:
   
   - A working HuggingFace operator on the workspace for text-generation tasks 
against HF Hub and any OpenAI-compatible third-party provider (Cerebras, Groq, 
Sambanova, Together, …).
   - A clean extension point for the image / audio / QA task families to plug 
into via subsequent PRs without modifying the operator class or the shared 
Python infrastructure.
   
   ### Proposed Solution or Design
   
   1. New files under 
`common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/`:
      - `HuggingFaceInferenceOpDesc.scala` — thin (~180-line) dispatcher 
holding the `@JsonProperty` fields and the `registeredCodegens` map.
      - `codegen/TaskCodegen.scala` — trait + `CodegenContext` case class; 
default `tasks: Set[String] = Set(task)` for single-task codegens, overridable 
by multi-task codegens.
      - `codegen/PythonCodegenBase.scala` — shared provider-fallback (HF router 
+ OpenAI-compatible third-party providers), `process_table` loop, 
`_parse_response` framework, with two holes for the per-task payload + parse 
snippets.
      - `codegen/TextGenCodegen.scala` — text-generation's chat-completions 
payload and `body["choices"][0]["message"]["content"]` parse.
   2. Register `HuggingFaceInferenceOpDesc` in `LogicalOp.scala`'s 
`@JsonSubTypes`.
   3. Design constraints baked into the codegen:
      - **Safe codegen via `EncodableString` + `pyb"..."`:** user-input string 
fields are typed as `EncodableString` (`String @EncodableStringAnnotation`); 
the `pyb` macro emits them as `self.decode_python_template('<base64>')` runtime 
expressions instead of raw Python literals, so they never appear in the 
generated source as-is. This is what satisfies `PythonCodeRawInvalidTextSpec`'s 
leakage check.
      - **Constants in `open(self)`:** per-instance attributes 
(`self.MODEL_ID`, `self.PROMPT_COLUMN`, …) are assigned in the lifecycle method 
so `self` is in scope for the decode call.
      - **Codegen totality:** `generatePythonCode` never throws on arbitrary 
`@JsonProperty` values — unknown task strings fall back to `TextGenCodegen`, 
and the generated Python's `else` branch produces a generic `{"inputs": 
prompt_value}` payload, matching the original monolithic operator's behavior. 
Required by the regression test contract.
      - **Defensive `MODEL_ID` validation at runtime:** generated Python 
rejects malformed model IDs (path-traversal segments, query strings, fragments, 
control characters) with a clear `ValueError` before any HF URL is composed.
   
   References:
   - Parent issue: #5041
   - Stacked on: #5124 (REST resource — issue #5134)
   - HF Inference Providers API: https://huggingface.co/docs/inference-providers
   
   ### Impact / Priority
   
   (P2) Medium — required for the HuggingFace inference operator (#5041) to 
function. Does not affect existing functionality.
   
   ### Affected Area
   
   Workflow Engine (Amber) — operator descriptor + Python codegen.
   
   ### Task Type
   
   - [ ] Refactor / Cleanup
   - [ ] DevOps / Deployment / CI
   - [ ] Testing / QA
   - [ ] Documentation
   - [ ] Performance
   - [x] Other


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[I] Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation) [texera]

Reply via email to