PG1204 opened a new issue, #5134:
URL: https://github.com/apache/texera/issues/5134

   ### Feature Summary
   
   The HuggingFace inference operator (#5041) needs a small backend REST 
surface to support its frontend UI. Without these endpoints, the operator's 
property panel can't populate the model picker, audio-input tasks can't accept 
user uploads, and inference responses that link to remote media (images, audio, 
video on HF / Fal / Replicate CDNs) can't be previewed in the workspace because 
of browser CORS restrictions.
   
   This issue covers introducing `HuggingFaceModelResource` and registering it 
on the web application. It is the backend foundation that subsequent child 
issues — the operator class, the property panel, result-panel media rendering, 
and developer docs — depend on.
   
   Concretely, publishing these endpoints would enable:
   
   - The operator UI's model picker (browse HF models per pipeline task; search 
by name).
   - Audio uploads for tasks like automatic speech recognition and audio 
classification, with the uploaded clip streamable back to the browser for 
preview.
   - Inline display of HF inference response media in the result panel, by 
proxying allowlisted remote URLs through Texera, which sidesteps browser CORS 
restrictions.
   
   ### Proposed Solution or Design
   
   1. Add `HuggingFaceModelResource` (Jersey REST resource, 
`@Path("/huggingface")`) exposing five endpoints:
      - `GET /api/huggingface/models?task=…[&search=…]` — browse or search HF 
models for a pipeline task.
      - `GET /api/huggingface/tasks` — list HF pipeline tags with hosted 
inference.
      - `POST /api/huggingface/upload-audio?filename=…` — stream-upload an 
audio file.
      - `GET /api/huggingface/audio-preview?path=…` — stream an uploaded audio 
file back to the browser.
      - `GET /api/huggingface/media-proxy?url=…` — proxy an allowlisted remote 
media URL.
   2. Register the resource in `TexeraWebApplication`.
   3. Design constraints baked into the resource:
      - **Token sourcing:** the user's HF token is forwarded from the operator 
panel via the `X-HF-Token` request header; requests without a token fall back 
to anonymous browsing. No server-side env-var token is used.
      - **Caching:** a bounded Guava `Cache` (maximum size plus a 1 h TTL) for 
the browse and tasks endpoints; requests that carry a user token bypass the 
cache so that private-model visibility stays per-user.
      - **Streaming upload:** `InputStream`-based with a 25 MiB cap and 
extension allowlist (`.wav`, `.mp3`, `.flac`, …); non-audio extensions rejected 
before disk write.
      - **SSRF protection:** a host allowlist on `/media-proxy` 
(`huggingface.co`, `fal.media`, `replicate.delivery`, `replicate.com`); 
subdomains are matched with a leading-dot suffix check (for example 
`.huggingface.co`) so that lookalike domains are rejected.
      - **Bounded fan-out:** the per-task probe in `/tasks` runs on a dedicated 
`ForkJoinPool(4)` instead of the JVM common pool, and upstream 429/503 
responses are logged explicitly at WARN level.
      - **Truncation visibility:** browse and search responses carry an 
`X-Texera-Truncated: true` header when a server-side cap is hit (`MAX_PAGES=50` 
for browse, `SEARCH_LIMIT=100` for search).
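
The SSRF allowlist with its leading-dot suffix guard could look roughly like the 
sketch below. This is not the code from the pull request: the class name 
`MediaProxyAllowlist`, the method `isAllowed`, and the https-only rule are 
assumptions made for illustration; only the four host names come from this issue.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; the name and shape are illustrative, not from the PR.
final class MediaProxyAllowlist {
    // The four hosts listed in this issue.
    private static final List<String> ALLOWED_HOSTS = List.of(
        "huggingface.co", "fal.media", "replicate.delivery", "replicate.com");

    static boolean isAllowed(String url) {
        final URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false; // unparseable input is never proxied
        }
        // Assumption: only https URLs are proxied.
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            return false;
        }
        String host = uri.getHost();
        if (host == null) {
            return false;
        }
        String h = host.toLowerCase(Locale.ROOT);
        for (String allowed : ALLOWED_HOSTS) {
            // Exact match, or a subdomain match that includes the leading dot.
            // A bare endsWith(allowed) would also accept "evilhuggingface.co".
            if (h.equals(allowed) || h.endsWith("." + allowed)) {
                return true;
            }
        }
        return false;
    }
}
```

With this check, a host such as `cdn-lfs.huggingface.co` passes, while 
`evilhuggingface.co` and `huggingface.co.evil.com` do not.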
   
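The streaming-upload constraint (extension allowlist checked before any disk 
write, then a 25 MiB cap enforced while copying) might be sketched as follows. 
The class `AudioUploadGuard` and its methods are hypothetical names, and the 
extension set contains only the three extensions this issue names explicitly; 
the full list is elided in the issue and is not guessed here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the name and shape are illustrative, not from the PR.
final class AudioUploadGuard {
    // 25 MiB cap, as stated in this issue.
    static final long MAX_BYTES = 25L * 1024 * 1024;

    // Only the extensions named in this issue; the real list is longer.
    static final Set<String> ALLOWED_EXTENSIONS = Set.of(".wav", ".mp3", ".flac");

    // Meant to be called before opening any file for writing.
    static boolean hasAllowedExtension(String filename) {
        if (filename == null) {
            return false;
        }
        int dot = filename.lastIndexOf('.');
        if (dot < 0) {
            return false;
        }
        String ext = filename.substring(dot).toLowerCase(Locale.ROOT);
        return ALLOWED_EXTENSIONS.contains(ext);
    }

    // Copies the stream chunk by chunk and fails as soon as the running
    // total exceeds maxBytes, so an oversized body is never fully buffered.
    static long copyWithCap(InputStream in, OutputStream out, long maxBytes)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("upload exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return total;
    }
}
```

A resource method would presumably call `hasAllowedExtension` first, then 
`copyWithCap(body, fileStream, MAX_BYTES)`, and delete the partial file if the 
copy throws.
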
   References:
   - Parent issue: #5041
   - Pull request: #5124
   - HF Hub API: https://huggingface.co/docs/hub/api
   
   ### Impact / Priority
   
   (P2) Medium – required for the HuggingFace inference operator (#5041) to 
function. Does not affect existing functionality.
   
   ### Affected Area
   
   Workflow Engine (Amber) — backend REST layer.
   
   
   ### Task Type
   
   - [ ] Refactor / Cleanup
   - [ ] DevOps / Deployment / CI
   - [ ] Testing / QA
   - [ ] Documentation
   - [ ] Performance
   - [x] Other


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
