peterxcli commented on code in PR #8449: URL: https://github.com/apache/ozone/pull/8449#discussion_r2100216246
########## hadoop-hdds/docs/content/design/s3-event-notification.md: ########## @@ -0,0 +1,284 @@ +--- +title: S3 Event Notifications +summary: S3 Event Notifications support, similar like AWS +date: 2025-05-14 +jira: HDDS-5984 +status: design +author: Chung En Lee, Wei-Chiu Chuang + +--- +<!-- + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + http://www.apache.org/licenses/LICENSE-2.0 + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. See accompanying LICENSE file. +--> + +# S3 Event Notifications +## Overview + +This document proposes the design of an event notification system for Apache Ozone. The goal is to enable external consumers to subscribe to and consume events that occur within the Ozone cluster. This requirement closely resembled the functionality provided by ([AWS S3 Event Notifications](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html)). + +## Motivation + +Apache Ozone doesn’t support event notifications, which limits its integration with downstream systems like real-time data pipelines, auditing tools, or metadata indexers. Adding this feature will enable external services to react to key operations (e.g., PUT, DELETE) in near real-time, supporting use cases such as analytics, compliance, and monitoring. It also helps bring Ozone closer to feature parity with object stores like Amazon S3. + +## Use Cases + +- **Triggering downstream data processing:** e.g., trigger a Spark or Flink job on object creation. +- **Data Backup and Replication** +- **Monitoring and Alerts** + +## Design Principles + +- **Cloud-Agnostic Integration:** Inspired by MinIO and Ceph, this design avoids dependence on managed services like AWS SNS/SQS, and instead supports self-managed targets (e.g., Kafka, RabbitMQ). +- **Separation of Concerns:** Event delivery is decoupled from core Ozone operations. Ozone handles event generation and persistence; external systems handle delivery and processing. + +## Goals & Non-Goals + +These design choices are guided by two main considerations: + +1. Refer to MinIO and Ceph: The design favors deployment in on-premises or cloud-agnostic environments by avoiding reliance on managed cloud services. +2. Decoupling event delivery from Ozone internals. + +### Goals + +- Provide Java APIs to configure bucket-level notifications and target configurations. +- Support CLI tools to manage notification settings. +- Support initial delivery targets: Kafka and RabbitMQ. +- Enable filtering of notifications via object prefixes. +- Support event types: object creation, deletion, and tagging. +- Support for AWS-compatible REST APIs (e.g., GET/PUT Bucket Notification). + +### Non-Goals + +- Only a subset of Ozone-supported events can be configured for notifications. +- The design does not guarantee any delivery semantics (except for synchronous failure responses). + +## Event Types + +### Support + +Currently, Apache Ozone can publish notifications for the following events: +- New object created events +- Object removal events +- Object tagging events + +### Not Support + +Those can be published by Amazon S3 but are not supported in Ozone. See [S3 Event Notification](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html#notification-how-to-overview). + +- Reduced Redundancy Storage (RRS) object lost events +- Replication events +- Restore object events +- S3 Lifecycle expiration/transition events +- S3 Intelligent-Tiering automatic archival events +- Object ACL PUT events + +### Mapping Table +Here is the mapping between OM audit actions and S3 event types. + +| Ozone Manager Action | S3 Event Type | Notes | +|-------------------------------|------------------------------------------|--------------------------------| +| COMMIT_KEY (via PUT/Copy) | `s3:ObjectCreated:Put`, `Copy` | No distinction between them | +| COMPLETE_MULTIPART_UPLOAD | `s3:ObjectCreated:CompleteMultipartUpload`| | +| DELETE_KEY | `s3:ObjectRemoved:Delete` | | +| PUT_OBJECT_TAGGING | `s3:ObjectTagging:Put` | | +| DELETE_OBJECT_TAGGING | `s3:ObjectTagging:Delete` | | + +--- + +## Design + +### Overview + +A callback is introduced post-Ratis commit and pre-client response to handle event notification logic. + +#### Component + +The implementation of this S3 trigger feature needs both S3 gateway and OzoneManager support. +- **S3 gateway**: Provide get and put notification s3 apis +- **Ozone Manager**: Provide get and put notification Java apis +Provide get and put target configuration Java apis +Persist notification & target configuration into DB +Notify OM each operation that may trigger an event notification. + +#### Java Apis + +This is the group of api to access target config: + +```java +public List<TargetInfo> listTargetInfo (); +public void putTargetInfo(String volumeName, String bucketName, TargetInfo targetInfo); +public TargetInfo getTargetInfo(String volumeName, String bucketName, String targetId); +``` + +This is the group of api to add/delete/list event notifications from a bucket + +```java +public String addS3Notification (String bucketName, Event event, String targetId, String prefix); +public List<TargetInfo> listS3NotificationInfo (String bucketName); +public void deleteS3Notification (String bucketName, String targetId); +``` + +### Details + +The following diagram illustrates a write flow originating from the S3 Gateway. The red lines highlight the newly introduced steps added by this event notification design. The design introduces a callback in the execution flow, which is triggered after the request is sent to Ratis but before responding to the client. This callback is responsible for handling event notification logic. +Add `NotificationInfo` to determine which event should be notified and where to send. + + +#### BucketTable + +Add NotificationInfo to the Bucket table. It is used to determine which event should be notified and where to send (refer to [Rpc Message](#rpc-message)). + +#### TargetTable + +A new table in rocksDB to store the details of the target. +- **Key**: A target ID corresponds to TargetConfig. +- **Value**: TargetConfig (refer to [Rpc Message](#rpc-message)) + +#### NotificationSenderManager + +The NotificationSenderManager is responsible for caching client instances for different targets to avoid unnecessary resource wastage caused by repeatedly creating new client instances. + +#### Rpc message + +This is the structure to determine which event should be sent and where to send. + +```protobuf +message NotificationInfo { +required string targetId = 1; +required Event event = 2; +} + + +enum Event { +S3TEST = 1; +S3ObjectCreate = 2; +S3ObjectCreatePut = 3; +S3ObjectCreatePost = 4; +S3ObjectCreateCopy = 5; +S3ObjectCreateCompleteMultipartUpload = 6; +S3ObjectRemovedDelete = 7; +S3ObjectTagging = 8; +S3ObjectTaggingPut = 9; +S3ObjectTaggingDelete = 10; +} +``` + + +This is the structure to store the details of the target. +```protobuf +message TargetConfig { +required Target target = 1; +optional KafkaNotificationConfig kafkaNotificationConfig = 3; +optional RabbitMQNotificationConfig rabbitMQNotificationConfig = 4; +} + + +enum Target { +Kafka = 1; +RabbitMQ = 2; +} + + +message KafkaNotificationConfig { +repeated string endpoints = 1; +required string topic = 2; +optional string saslUsername = 3; +optional string saslPassword = 4; +optional string saslMechanism = 5; +optional string tlsSkipVerify = 6; +optional string clientTlsCert = 7; +optional string clientTlsKey = 8; + + +} + + +message RabbitMQNotificationConfig { +required string url = 1; +required string exchange = 2; +required string exchangeType = 3; +required string routingKey = 4; +} +``` + +## Performance + +The performance impact of this design comes mainly from two things: + +### OM Metadata Read Overhead + +This design introduces additional read operations to OM's RocksDB metadata store. However, since most of the metadata is cached, the performance impact on OM is expected to be minimal. + +### Notification Delivery + +The speed of notification delivery directly affects performance, especially when operating in synchronous mode. This design plans to support both synchronous and asynchronous delivery options: +- **Asynchronous mode (default)**: Notifications are enqueued and handled by a background thread. This minimizes impact on client-facing operations. However, as delivery only occurs once the client has received a response, any notification failures (e.g. target unavailable or delivery timeout) will not affect the original operation and may therefore go unnoticed by the client. +- **Synchronous mode**: Notifications are delivered before completing the client request. If the delivery target (e.g., Kafka or RabbitMQ) is slow or unavailable, it may increase the end-to-end latency of the original operation. Providing both modes allows users to balance between reliability and performance according to their use case. + +Furthermore, the creation and reuse of notification client instances (such as Kafka or RabbitMQ producers) within the Ozone Manager must be carefully controlled. Poor resource management in this area could increase memory or thread usage and degrade OM performance over time. + +### Metrics Review Comment: Can add some metrics about: `event_triggered`, `event_lost`, `push_ok`, `push_fail` and `push_pending`. ref: https://docs.ceph.com/en/quincy/radosgw/notifications/#notification-performance-stats -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
