obelix74 commented on PR #3385:
URL: https://github.com/apache/polaris/pull/3385#issuecomment-3856668355

   @rmannibucau Thanks for the detailed feedback! You raise excellent points 
about observability stack integration.
   I think there are actually two complementary use cases here:
   
   ### 1. Observability/Monitoring 
   For real-time monitoring, alerting, and trace correlation, I completely 
agree that the OpenTelemetry approach is superior:
     - Span attributes for Iceberg metrics (already supported via 
otel_trace_id/otel_span_id correlation)
     - Prometheus /metrics endpoint for aggregated metrics
     - Let the observability stack (Tempo, Grafana, etc.) handle storage and 
retention
     
   This is the "hot path" for operational visibility.
   
   ### 2. Historical Analytics/Auditing (this PR's focus)
   The JDBC persistence targets a different use case:
     - Query optimization analysis - "Which tables have the most expensive 
scans over the last 30 days?"
     - Capacity planning - "What's the trend of data scanned per catalog?"
     - Audit/compliance - "Show me all operations on table X by principal Y"
     - Cost attribution - Correlate scan metrics with cloud costs
   
   These queries need structured, queryable storage that's harder to achieve 
with trace backends (which are optimized for trace retrieval, not analytical 
queries).
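
   To make the first question above concrete, here is a minimal in-memory 
sketch of the aggregation it implies. `ScanRecord`, its fields, and 
`topTablesByBytes` are hypothetical names for illustration only, not the 
schema or API in this PR. Against a JDBC store the same logic would 
presumably be a `GROUP BY` over a time-filtered metrics table.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: record shape and method names are assumptions, not the PR's schema.
public class ScanAnalyticsSketch {

  /** One persisted scan report, reduced to the fields this query needs. */
  public record ScanRecord(String table, Instant timestamp, long bytesScanned) {}

  /** Returns the tables with the most bytes scanned inside the window, largest first. */
  public static List<Map.Entry<String, Long>> topTablesByBytes(
      List<ScanRecord> records, Instant now, Duration window, int limit) {
    Instant cutoff = now.minus(window);

    // Equivalent of: WHERE timestamp >= cutoff GROUP BY table, SUM(bytes_scanned)
    Map<String, Long> totals = records.stream()
        .filter(r -> !r.timestamp().isBefore(cutoff))
        .collect(Collectors.groupingBy(
            ScanRecord::table, Collectors.summingLong(ScanRecord::bytesScanned)));

    // Equivalent of: ORDER BY total DESC, table ASC LIMIT n
    return totals.entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
            .thenComparing(Map.Entry.<String, Long>comparingByKey()))
        .limit(limit)
        .collect(Collectors.toList());
  }
}
```

   Calling `topTablesByBytes(records, Instant.now(), Duration.ofDays(30), 10)` 
would answer the "last 30 days" question. Trace backends can hold the same 
raw values as span attributes, but typically make this kind of rollup awkward.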
   
   ### Proposed Path Forward
   
   The current implementation is designed to be pluggable via the 
MetricsPersistence SPI:
     - NoOpMetricsPersistence - Default, no storage (current behavior)
     - JdbcMetricsPersistence - For users who want queryable historical data
     - Future: OtlpMetricsPersistence - Export as OTLP logs/metrics to collector
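
   As a rough sketch of what "pluggable" means here: the interface method, the 
`ScanReport` record, the in-memory backend, and the `forType` selector below 
are all hypothetical stand-ins, and the actual SPI in this PR may look 
different. Only the `MetricsPersistence` and `NoOpMetricsPersistence` names 
come from the list above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: method names and record fields are assumptions, not the PR's API.
public class MetricsPersistenceSketch {

  /** Hypothetical, simplified scan report. */
  public record ScanReport(String catalog, String table, String principal, long bytesScanned) {}

  /** The pluggable contract: callers only ever see this interface. */
  public interface MetricsPersistence {
    void writeScanReport(ScanReport report);
  }

  /** Default backend: accepts reports and discards them. */
  public static final class NoOpMetricsPersistence implements MetricsPersistence {
    @Override
    public void writeScanReport(ScanReport report) {
      // Intentionally empty: nothing is stored.
    }
  }

  /** Stand-in for a storing backend such as JDBC; keeps reports in a list. */
  public static final class InMemoryMetricsPersistence implements MetricsPersistence {
    private final List<ScanReport> reports = new ArrayList<>();

    @Override
    public void writeScanReport(ScanReport report) {
      reports.add(report);
    }

    public List<ScanReport> reports() {
      return Collections.unmodifiableList(reports);
    }
  }

  /** Hypothetical selector keyed by a configuration string. */
  public static MetricsPersistence forType(String type) {
    return switch (type) {
      case "noop" -> new NoOpMetricsPersistence();
      case "in-memory" -> new InMemoryMetricsPersistence();
      default -> throw new IllegalArgumentException("Unknown persistence type: " + type);
    };
  }
}
```

   The point of the shape is that the reporting path calls `writeScanReport` 
without knowing which backend is configured, so adding an OTLP exporter later 
would not touch the callers.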
   
   Users can choose based on their needs. For pure observability, they'd use 
the existing OTEL integration + /metrics. For analytics, they'd enable JDBC 
persistence.
   
   Does this separation of concerns address your feedback? Or do you see the 
JDBC approach as fundamentally problematic even for the analytics use case?
   
   I need this data persisted for end-to-end auditing by both internal and 
external auditors (PII data).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
