bharos opened a new issue, #8887:
URL: https://github.com/apache/gravitino/issues/8887

   ### What would you like to be improved?
   
   The Iceberg metrics endpoint (`POST /v1/.../metrics`) always returns
   `204 No Content`, even when the metric is dropped because the queue is full
   or the metrics manager has been closed. Clients therefore cannot detect
   failures or implement proper error handling.
   
   **Current behavior:**
   - Queue full (>1000 metrics) → Returns 204 ✅ (but metric dropped)
   - Manager closed (shutdown) → Returns 204 ✅ (but metric dropped)
   - Success → Returns 204 ✅
   
   All scenarios return the same code, providing no feedback to clients 
(Spark/Flink).
   
   **Code location:**
   - `IcebergMetricsManager.recordMetric()` - drops metrics silently
   - `IcebergTableOperations.reportTableMetrics()` - always returns 204
   
   ### How should we improve?
   
   Return HTTP status codes that accurately reflect acceptance status:
   
   - **202 Accepted** - Metric successfully queued
   - **503 Service Unavailable** - Manager closed or queue full (client should 
retry)
   
   **Implementation:**
   1. Change `recordMetric()` to return boolean
   2. Update REST endpoint to check result and return appropriate status
   3. Add tests for failure scenarios
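
A minimal sketch of steps 1 and 2, assuming a bounded in-memory queue. `SketchMetricsManager` and `statusFor` are hypothetical stand-ins written for illustration; they are not the actual `IcebergMetricsManager` or `IcebergTableOperations` code, and the real endpoint would build a JAX-RS response rather than return a bare status integer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class MetricsAcceptanceSketch {

  /** Hypothetical stand-in for the metrics manager (not the Gravitino class). */
  static final class SketchMetricsManager {
    private final BlockingQueue<String> queue;
    private volatile boolean closed = false;

    SketchMetricsManager(int capacity) {
      this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns true if the metric was queued, false if it was rejected. */
    boolean recordMetric(String metric) {
      if (closed) {
        return false; // manager closed: reject instead of dropping silently
      }
      // offer() never blocks; it returns false when the queue is full.
      return queue.offer(metric);
    }

    void close() {
      closed = true;
    }
  }

  /** Maps the acceptance result to the proposed HTTP status code. */
  static int statusFor(boolean accepted) {
    return accepted ? 202 : 503;
  }

  public static void main(String[] args) {
    SketchMetricsManager manager = new SketchMetricsManager(2);
    System.out.println(statusFor(manager.recordMetric("m1"))); // 202
    System.out.println(statusFor(manager.recordMetric("m2"))); // 202
    System.out.println(statusFor(manager.recordMetric("m3"))); // 503, queue full
    manager.close();
    System.out.println(statusFor(manager.recordMetric("m4"))); // 503, closed
  }
}
```

Returning a plain `boolean` keeps the change small. If the endpoint later needs to distinguish "queue full" from "closed", an enum result would be a natural extension.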
   
   **Benefits:**
   - Clients can detect and respond to failures
   - Operators can monitor acceptance rate
   - Enables proper retry logic
   - Better observability
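
To illustrate the retry logic this would enable, here is a rough client-side sketch. It assumes the client can obtain the HTTP status of each attempt; `reportWithRetry` and its parameters are hypothetical names, not part of any Iceberg, Spark, or Flink API.

```java
import java.util.function.IntSupplier;

final class MetricsReportRetrySketch {

  /**
   * Calls sendOnce (which returns an HTTP status code) until the report is
   * accepted, a non-retryable status is seen, or maxAttempts is reached.
   * Returns true only if the server accepted the report.
   */
  static boolean reportWithRetry(IntSupplier sendOnce, int maxAttempts, long initialBackoffMillis) {
    long backoff = initialBackoffMillis;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      int status = sendOnce.getAsInt();
      if (status == 202 || status == 204) {
        return true; // accepted (204 covers servers without this change)
      }
      if (status != 503) {
        return false; // only 503 is treated as retryable in this sketch
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(backoff);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt flag
          return false;
        }
        backoff *= 2; // simple exponential backoff
      }
    }
    return false;
  }
}
```

With today's behavior such a loop can never trigger, because every attempt reports 204 regardless of whether the metric was kept.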
   
   **Compatibility:** Backward compatible, since clients currently ignore the response status.
   
   I'm willing to contribute a PR with code changes and tests.

