carloea2 opened a new issue, #4183:
URL: https://github.com/apache/texera/issues/4183
### Feature Summary
Add a complete **resumable multipart upload experience** across backend +
frontend:
- Backend exposes active resumable sessions, reports missing parts
deterministically, denies unsafe concurrency, and supports restart.
- Frontend detects conflicts and prompts users to **Resume** (continue
uploading missing parts) or **Restart** (start from byte 0).
### Proposed Solution or Design
### Backend
1. **Add multipart operation `type=list`**
- Returns active multipart upload file paths for the dataset (only those
within the physical-address expiration window).
- Enables clients to discover resumable uploads before starting a new one.
2. **Tighten concurrency for `type=init`**
- Use DB row locking (`FOR UPDATE NOWAIT`) to fail fast if another client
is initializing/uploading the same file.
- Return **409 CONFLICT** on lock contention rather than allowing
concurrent uploads.
3. **Enhance `type=init` response**
- Return:
- `missingParts`: sorted list of part numbers whose ETag is empty
- `completedPartsCount`: `numParts - missingParts.length`
- Allows clients to resume deterministically without probing.
4. **Add optional init param: `restart` (default: false)**
- When `restart=true`, forcibly discard the existing session and start
fresh.
5. **Auto-restart on upload configuration changes**
- If the existing session's config does not match the incoming init request
(fileSizeBytes / partSizeBytes / computed numParts) or the session has expired:
- delete DB upload session (cascade deletes part rows)
- abort previous LakeFS multipart upload
- create a fresh session with the new parameters
6. **Add/Update tests**
- `type=list` returns only non-expired sessions, sorted
- init concurrency denial (409 when row locks are held)
- init missing parts reporting (all missing parts, sorted)
- delete-then-restart when fileSizeBytes / partSizeBytes changes
- `restart=true` resets the session and its progress, even if the config is unchanged
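
The init behavior described in items 3 to 5 can be sketched as follows. This is a minimal illustration, not the proposed implementation: `UploadSession`, `compute_num_parts`, `needs_fresh_session`, and `init_response` are hypothetical names, the session is an in-memory stand-in for the DB rows, and 1-based part numbers are an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UploadSession:
    # Hypothetical stand-in for a DB upload-session row plus its part rows.
    file_size_bytes: int
    part_size_bytes: int
    num_parts: int
    expired: bool
    etags: Dict[int, str]  # part number -> ETag; "" (or absent) means not uploaded


def compute_num_parts(file_size_bytes: int, part_size_bytes: int) -> int:
    # Assumption: fixed-size parts, with a possibly shorter last part.
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


def needs_fresh_session(
    session: Optional[UploadSession],
    file_size_bytes: int,
    part_size_bytes: int,
    restart: bool = False,
) -> bool:
    # A fresh session is needed when none exists, the client asked for a
    # restart, the session expired, or the upload configuration changed.
    if session is None or restart or session.expired:
        return True
    return (
        session.file_size_bytes != file_size_bytes
        or session.part_size_bytes != part_size_bytes
        or session.num_parts != compute_num_parts(file_size_bytes, part_size_bytes)
    )


def init_response(session: UploadSession) -> dict:
    # Missing parts are those whose ETag is empty, reported in sorted order.
    missing = sorted(
        n for n in range(1, session.num_parts + 1) if not session.etags.get(n)
    )
    return {
        "missingParts": missing,
        "completedPartsCount": session.num_parts - len(missing),
    }
```

When `needs_fresh_session` returns true for an existing session, the server would delete the DB session and abort the previous LakeFS multipart upload before creating the new one; those side effects are omitted here.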
### Frontend
1. **Detect resumable uploads**
- Call `listMultipartUploads(ownerEmail, datasetName)` before enqueueing
selected files.
2. **Show a Resume/Restart decision dialog**
- For conflicting paths, user chooses:
- **Resume**: keep the session and upload only missing parts (fall back to
restart if init params mismatch)
- **Restart**: force new session from 0 by setting `restart=true`
- Support “for all” options when multiple conflicts are present.
3. **Ensure Resume is the primary action**
- Visually highlight the Resume button to guide users toward safe
recovery.
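
The conflict detection and the Resume/Restart choice can be sketched as follows. It is written in Python only to keep the logic compact and is not the frontend code; `split_conflicts` and `restart_flags` are hypothetical helpers, and `active_upload_paths` stands for whatever `listMultipartUploads` returns.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def split_conflicts(
    selected_paths: Iterable[str], active_upload_paths: Iterable[str]
) -> Tuple[List[str], List[str]]:
    # Paths that already have an active multipart session need a user decision;
    # the rest can be enqueued directly.
    active = set(active_upload_paths)
    conflicts: List[str] = []
    fresh: List[str] = []
    for path in selected_paths:
        (conflicts if path in active else fresh).append(path)
    return conflicts, fresh


def restart_flags(
    conflicts: Iterable[str],
    decisions: Dict[str, str],
    for_all: Optional[str] = None,
) -> Dict[str, bool]:
    # Map each conflicting path to the `restart` value sent with init.
    # `for_all` models the "for all" option and overrides per-file choices;
    # files with no decision default to "resume", the safer action.
    flags: Dict[str, bool] = {}
    for path in conflicts:
        choice = for_all if for_all is not None else decisions.get(path, "resume")
        flags[path] = choice == "restart"
    return flags
```

For example, with selected files `a.csv`, `b.csv`, `c.csv` and an active session for `b.csv`, only `b.csv` would trigger the dialog, and choosing Restart for it would send `restart=true` on its init request.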
### Impact / Priority
(P3) Low – nice to have
### Affected Area
Storage / Metadata