obelix74 commented on PR #3385:
URL: https://github.com/apache/polaris/pull/3385#issuecomment-3736701475
@adutra, @dimas-b asked me to tag you to review the event related changes in
this PR. Please see the google doc in the PR description for the background.
**Event-Related Changes Summary for Review**
This pull request introduces comprehensive metrics reporting events and
infrastructure to enable audit logging of compute engine metrics (scan and
commit reports from Spark, Trino, Flink, etc.). Below is a detailed breakdown
of all event-related modifications:
**1. Event Types (PolarisEventType.java)**
File:
`runtime/service/src/main/java/org/apache/polaris/service/events/PolarisEventType.java`
Class: PolarisEventType (enum)
Changes: Added two new event types for metrics reporting:
• BEFORE_REPORT_METRICS - Emitted before a metrics report is processed
• AFTER_REPORT_METRICS - Emitted after a metrics report has been
processed
**2. Event Classes (IcebergRestCatalogEvents.java)**
File:
runtime/service/src/main/java/org/apache/polaris/service/events/IcebergRestCatalogEvents.java
Class: IcebergRestCatalogEvents
Changes: Added two new event record classes:
• `BeforeReportMetricsEvent` - Event emitted before metrics report
processing, containing event metadata, catalog name, namespace, table name, and
the
ReportMetricsRequest (which wraps ScanReport or CommitReport)
• `AfterReportMetricsEvent` - Event emitted after metrics report
processing with the same structure, enabling audit logging of compute engine
metrics including
trace context for correlation
**3. Event Listener Interface (PolarisEventListener.java)**
File:
runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisEventListener.java
Class: PolarisEventListener (interface)
Changes: Added two new default handler methods in the "Metrics Reporting
Events" section:
• onBeforeReportMetrics(BeforeReportMetricsEvent event) - Handler for
before-metrics events
• onAfterReportMetrics(AfterReportMetricsEvent event) - Handler for
after-metrics events
**4. Persistence Event Listener (PolarisPersistenceEventListener.java)**
File:
runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java
Class: PolarisPersistenceEventListener (abstract class)
Changes: Implemented the onAfterReportMetrics handler with comprehensive
metrics extraction logic including:
• New imports for CommitMetricsResult, ScanMetricsResult,
CounterResult, ReportMetricsRequest
• Full implementation of onAfterReportMetrics() that creates a
PolarisEvent with report type, snapshot ID, schema ID, and OpenTelemetry context
• New helper method extractScanReportData() - Extracts key scan metrics
(result data files, delete files, file sizes, manifests) with trace-id from
metadata
• New helper method extractCommitReportData() - Extracts key commit
metrics (added/removed files, records, file sizes) with operation type and
sequence number
• New helper method addReportMetadata() - Safely adds report metadata
entries with null-safety to prevent NPE in ImmutableMap.Builder
• New helper method addCounterIfPresent() - Safely adds counter values
to properties map
**5. Event Service Delegator (IcebergRestCatalogEventServiceDelegator.java)**
File:
runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergRestCatalogEventServiceDelegator.java
Class: IcebergRestCatalogEventServiceDelegator
Changes: Modified reportMetrics() method to emit before/after events:
• Added import for RESTUtil for table name decoding
• Enhanced method documentation explaining the endpoint's purpose for
compute engine metrics
• Added event emission: onBeforeReportMetrics() called before
delegating, onAfterReportMetrics() called after
• Added proper namespace/table decoding using decodeNamespace() and
RESTUtil.decodeString()
**6. Test Event Listener (TestPolarisEventListener.java)**
File:
runtime/service/src/testFixtures/java/org/apache/polaris/service/events/listeners/TestPolarisEventListener.java
Class: TestPolarisEventListener
Changes: Added implementations for the two new event handlers:
• onBeforeReportMetrics() - Records the event for test verification
• onAfterReportMetrics() - Records the event for test verification
**7. New Unit Tests (PolarisPersistenceEventListenerTest.java)**
File:
runtime/service/src/test/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListenerTest.java
Class: PolarisPersistenceEventListenerTest (NEW FILE)
Changes: New test class with null-safety tests for metrics extraction:
• testScanReportWithNullMetadataValues() - Verifies null values in
metadata are safely skipped
• testCommitReportWithNullOperation() - Verifies null operation handling
• testCommitReportWithNullMetadataValues() - Verifies null metadata
value handling
• testScanReportWithEmptyMetadata() - Verifies empty metadata handling
8. Integration Tests (InMemoryBufferEventListenerIntegrationTest.java)
File:
runtime/service/src/test/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListenerIntegrationTest.java
Class: InMemoryBufferEventListenerIntegrationTest
Changes: Extended with comprehensive metrics event testing:
• Added new imports for ScanReport, CommitReport, ReportMetricsRequest,
and related classes
• Added @BeforeEach method resetDatabaseState() for test isolation
• Added feature flag ALLOW_OVERLAPPING_CATALOG_URLS to test
configuration
• Added unique base locations per catalog to avoid overlap issues
• `testReportMetricsEventWithTraceContext()` - Verifies
AfterReportMetricsEvent is emitted with OpenTelemetry trace context
• `testReportMetricsWithTraceIdInMetadata()` - Verifies trace-id from
report metadata is extracted with "report." prefix
• `testReportCommitMetrics()` - Verifies CommitReport data extraction
including operation, sequence number, and trace context
**9. New Metrics Reporting Infrastructure**
The following new classes were added to support alternative metrics
persistence strategies:
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/PolarisMetricsReporter.java
Class: PolarisMetricsReporter (interface)
Changes: Enhanced interface documentation describing available
implementations (default, events, persistence, composite)
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/EventsMetricsReporter.java
Class: EventsMetricsReporter (NEW FILE - CDI bean with
@Identifier("events"))
Purpose: Persists scan/commit reports to the events table as JSON for
unified audit trail
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/PersistingMetricsReporter.java
Class: PersistingMetricsReporter (NEW FILE - CDI bean with
@Identifier("persistence"))
Purpose: Persists metrics to dedicated tables (scan_metrics_report,
commit_metrics_report) for better queryability
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/CompositeMetricsReporter.java
Class: CompositeMetricsReporter (NEW FILE)
Purpose: Delegates to multiple reporters, allowing metrics to be sent to
multiple destinations
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/MetricsReportCleanupService.java
Class: MetricsReportCleanupService (NEW FILE)
Purpose: Scheduled cleanup service for old metrics reports based on
configurable retention policy
File:
runtime/service/src/main/java/org/apache/polaris/service/reporting/MetricsReportingConfiguration.java
Class: MetricsReportingConfiguration (interface)
Changes: Extended with new configuration options:
• Enhanced type() documentation for all reporter types
• Added targets() method for composite reporter configuration
• Added RetentionConfig nested interface with enabled(),
retentionPeriod(), and cleanupInterval() settings
**Key Summary**
• New Event Types: 2 (BEFORE_REPORT_METRICS, AFTER_REPORT_METRICS)
• New Event Records: 2 (BeforeReportMetricsEvent,
AfterReportMetricsEvent)
• New Listener Methods: 2 (onBeforeReportMetrics, onAfterReportMetrics)
• New Reporter Classes: 4 (EventsMetricsReporter,
PersistingMetricsReporter, CompositeMetricsReporter,
MetricsReportCleanupService)
• New Test Files: 4 (PolarisPersistenceEventListenerTest,
EventsMetricsReporterTest, PersistingMetricsReporterTest,
CompositeMetricsReporterTest)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]