peterxcli commented on code in PR #8871: URL: https://github.com/apache/ozone/pull/8871#discussion_r3276165022
##########
hadoop-hdds/docs/content/design/event-notifications.md:
##########
@@ -0,0 +1,255 @@
+---
+title: Event notification support in Ozone
+summary: Event notifications for all bucket/event types in ozone
+date: 2025-06-28
+jira: HDDS-13513
+status: design
+author: Donal Magennis, Colm Dougan
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Abstract
+
+Implement an event notification system for Apache Ozone, providing the ability for users to consume events occurring on the Ozone filesystem.
+This is similar to https://issues.apache.org/jira/browse/HDDS-5984 but aims to encapsulate all events and not solely S3 buckets.
+This document proposes a potential solution and discusses some of the challenges/open questions.
+
+## Introduction
+
+Apache Ozone does not currently provide the ability to consume filesystem events, similar to how HDFS does with Inotify or S3 with bucket notifications.
+These events are an integral part of integration with external systems to support real-time, scalable, and programmatic monitoring of changes in the data or metadata stored in Ozone.
+These external systems can use notifications of objects created/deleted to trigger data processing workflows, replication and monitoring alerts.
+
+### Goals
+
+Durable event log within each OM containing relevant OMRequest information for notification purposes.
+Plugin framework for publishers (e.g. Kafka/RabbitMQ, custom sinks) running in separate threads in the OM.
+Provide support for all events across the Ozone filesystem for FSO and non FSO buckets, including renames and changes to acls.
+Guarantee at-least-once delivery within a bounded retention period, with notification of "missed events" where applicable.
+Read-only access for plugins to notification table.
+
+### Non-Goals
+
+Exactly-once end-to-end semantics to external systems.
+Filtering of events or paths/buckets.
+Cross-OM consensus about what has been notified; co-ordination to be defined in the plugin e.g. write last notified position to a file in Ozone.
+Retrofitting historical events prior to feature enablement.
+
+### Supported OMRequests
+
+OMDirectoryCreateRequest
+OMKeyCommitRequest
+OMKeyDeleteRequest
+OMKeyRenameRequest
+OMKeyAddAclRequest
+OMKeyRemoveAclRequest
+OMKeySetAclRequest
+OMKeySetTimesRequest
+
+# Design
+
+## Overview
+
+Introduce an Event Notification Pipeline for Apache Ozone with two
+logical pieces:
+
+1. event data capture
+
+* OM captures the required details of selected OMRequest write
+  operations post metadata update and persists them to a dedicated RocksDB
+  completed operations "ledger" table keyed by the Ratis Txn Id
+* each OM independently produces items to its local ledger table. The
+  ledger table should be integrated into OM Snapshots so that all OM's
+  converge on the full set of required notifications.
+* a retention policy is to be implemented in order to clean up no longer required entries. This policy is bounded to a table size(number of events) which can be configurable.
+* event capture will only be enabled if enabled
+
+2. event data publishing
+
+* a plugin framework is exposed where plugins can consume the ledger
+  items in read-only fashion and process them as desired
+* Plugins will run inside the OM and should be cognisant of resource consumption i.e. memory/disk
+* all OMs will run the plugins but only the current leader OM will be
+  active
+* a base plugin implementation will provide common behaviour, including
+  read only iteration of new ledger items and flagging that events
+  have been "missed" since the consumer last requested them
+  leader OM will be active
+* a concrete plugin implementation will deal with publishing

Review Comment:
I have two thoughts:

1. We should support dynamic configuration (through the S3 endpoint) of destinations, and only store the `CompletedRequestInfo` if a destination exists (event notifications are not retroactive). It also looks like we could support a more complex configuration for each event notification:
   - Event name
   - Prefix (optional)
   - Suffix (optional)
   - Destination (only one per event notification)

   ref: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enable-event-notifications.html#:~:text=In%20the%20General,and%20Lambda.

   So you might want to create a separate table to store these.

2. For the "retention policy" and "Offset tracking and persistence", I think we should introduce a background service to manage the watermark of all downstream sinks (e.g. a Kafka destination). The background service would basically loop on the leader periodically:
   1. Check if we have been suspended and `isLeaderReady()`.
   2. `restart_txnid` = the smallest txn id in the completed request info table, queried with `iter.Seek();`.
   3. `confirmed_txnid` = min(txn id that all downstream sinks have committed/persisted).
   4. Check the retention policy, assuming a configured limit `max_record_keep_size`:
      1. `reclaim_txnid = iter.key()`, after `iter.SeekToLast()` and running `iter.Prev()` `max_record_keep_size` times.
      2. Then `reclaim_range = [restart_txnid, max(reclaim_txnid, confirmed_txnid)]`.
   5. Submit a Raft request/response to delete `reclaim_range` with RocksDB `deleteRange`.
   6. Sleep a while and go back to the first step.
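To make steps 2 to 4 of the proposed loop concrete, here is a minimal sketch of the `reclaim_range` computation. It is only an illustration under stated assumptions: a `TreeMap` stands in for the RocksDB ledger table and its iterator, and `ReclaimRangeSketch`, `TxnRange` and `reclaimRange` are hypothetical names, not existing Ozone classes. The range is treated as inclusive on both ends, as written in the comment; RocksDB's `deleteRange` excludes its end key, so a real implementation would need to adjust the upper bound.

```java
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch: a TreeMap stands in for the RocksDB ledger table.
final class ReclaimRangeSketch {

  /** Inclusive range of txn ids to delete (hypothetical helper type). */
  static final class TxnRange {
    final long from;
    final long to;

    TxnRange(long from, long to) {
      this.from = from;
      this.to = to;
    }
  }

  /**
   * Computes [restart_txnid, max(reclaim_txnid, confirmed_txnid)].
   * Returns empty when there is nothing to delete.
   */
  static Optional<TxnRange> reclaimRange(NavigableMap<Long, String> ledger,
      long confirmedTxnId, int maxRecordKeepSize) {
    if (ledger.isEmpty()) {
      return Optional.empty();
    }
    // Step 2: smallest txn id still present in the table.
    long restartTxnId = ledger.firstKey();

    // Step 4.1: start at the last key and step back maxRecordKeepSize times.
    Long cursor = ledger.lastKey();
    for (int i = 0; i < maxRecordKeepSize && cursor != null; i++) {
      cursor = ledger.lowerKey(cursor);
    }
    // Running off the front means the table is within the size limit.
    long reclaimTxnId = (cursor == null) ? Long.MIN_VALUE : cursor;

    // Step 4.2: delete up to whichever bound is further along,
    // clamped to the keys that actually exist.
    long end = Math.min(Math.max(reclaimTxnId, confirmedTxnId),
        ledger.lastKey());
    if (end < restartTxnId) {
      return Optional.empty();
    }
    return Optional.of(new TxnRange(restartTxnId, end));
  }

  public static void main(String[] args) {
    NavigableMap<Long, String> ledger = new TreeMap<>();
    for (long txn = 1; txn <= 10; txn++) {
      ledger.put(txn, "event-" + txn);
    }
    reclaimRange(ledger, 4L, 3).ifPresent(
        r -> System.out.println("[" + r.from + ", " + r.to + "]"));
  }
}
```

Note that taking the `max` of the two bounds means records can be reclaimed before every sink has confirmed them once the size limit is exceeded, which is where the "missed events" flag from the design would come into play.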
