eolivelli commented on a change in pull request #2214: BP-38 Publish Bookie Service Info on Metadata Service
URL: https://github.com/apache/bookkeeper/pull/2214#discussion_r363030220
##########
File path: site/bps/BP-38-bookie-endpoint-discovery.md
##########
@@ -0,0 +1,117 @@
+---
+title: "BP-38: Publish Bookie Service Info on Metadata Service"
+issue: https://github.com/apache/bookkeeper/2215
+state: 'Under Discussion'
+release: "4.11.0"
+---
+
+### Motivation
+
+The Bookie server exposes several services, some of which are optional: the binary BookKeeper protocol endpoint, the HTTP service, the StreamStorage service, and a Metrics endpoint.
+
+Currently (in BookKeeper 4.10) the client can only discover the main Bookie endpoint:
+the main BookKeeper binary RPC service.
+Discovery of the TCP address is implicit, because the *id* of the bookie is made of the same host:port that points to the TCP address of the Bookie service.
+
+With this proposal we introduce a way for the Bookie to advertise the services it exposes. With this change the Bookie will be able to store on the Metadata Service a set of name-value pairs that describe the *available services*.
+
+We will also define a set of well-known properties that will be useful for further implementations.
+
+This information will also be useful for Monitoring and Management services, as it will enable full discovery of the capabilities of all of the Bookies in a cluster just by having read access to the Metadata Service.
+
+### Public Interfaces
+
+On the Registration API, we introduce a new data structure that describes the services
+exposed by a Bookie:
+
+```
+interface BookieServiceInfo {
+    Iterable<String> keys();
+    String get(String key, String defaultValue);
+}
+
+```
+
+In the RegistrationClient interface we expose a new method:
+
+```
+CompletableFuture<Versioned<BookieServiceInfo>> getBookieServiceInfo(String bookieId)
+```
+
+The client can derive bookieId from a BookieSocketAddress. It can access the list of available bookies using **RegistrationClient#getAllBookies()** and then use this new method to get the details of the services exposed by the Bookie.
+
+On the Bookie side we change the RegistrationManager interface in order to let the Bookie
+advertise the services:
+
+In the RegistrationManager class, the **registerBookie** method
+```
+void registerBookie(String bookieId, boolean readOnly)
+```
+
+becomes
+
+```
+void registerBookie(String bookieId, boolean readOnly, BookieServiceInfo bookieServiceInfo)
+```
+
+It will be up to the implementation of RegistrationManager and RegistrationClient to serialize
+the BookieServiceInfo structure.
+
+For the ZooKeeper based implementation we are going to store such information in JSON format.
+
+```
+{
+    "property": "value",
+    "property2": "value2"
+}
+```
+Such information will be stored inside the '/REGISTRATIONPATH/available' znode (or '/REGISTRATIONPATH/available/readonly' in case of a readonly bookie).
+
+The rationale behind this choice is that the client is already using these znodes in order to discover available bookie services.
+
+It is possible that such endpoint information changes during the lifetime of a Bookie, for example when a machine gets a new network address after a reboot.
+
+It is out of the scope of this proposal to change the semantics of ledger metadata, in which we currently store the network address of the bookies directly, but with this proposal we are preparing the way to have an indirection and separate the concept of Bookie ID from the network address.
+
+**Well known** keys will be:

Review comment:
So you are thinking about a more strongly typed structure?

```
class BookieServiceInfo {
    class Endpoint {
        String name;
        String hostname;
        int port;
        String type;
    }
    Map<String, Endpoint> endpoints; // name => endpoint, no duplicate names allowed
    String networkAddress;
    Map<String, String> properties;

    public void addEndpoint(Endpoint endpoint) ....
    public void setProperty(String key, String value) ....
    public void setNetworkAddress(String networkAddress) ....
}
```

I like this idea.
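To make the shape above concrete, here is a minimal compilable sketch of how such a typed structure might be rendered as JSON. It is only an illustration under stated assumptions, not the BP-38 API: the class name `BookieServiceInfoSketch`, the `toJson` and `quote` helpers, the JSON field layout, and the sample values are all hypothetical, and `networkAddress` is left out to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch of the typed structure discussed in this comment; not the final API.
final class BookieServiceInfoSketch {

    // One advertised service, for example the binary protocol or the HTTP service.
    static final class Endpoint {
        final String name;
        final String hostname;
        final int port;
        final String type;

        Endpoint(String name, String hostname, int port, String type) {
            this.name = name;
            this.hostname = hostname;
            this.port = port;
            this.type = type;
        }
    }

    // name => endpoint; insertion order is kept so the JSON output is stable.
    private final Map<String, Endpoint> endpoints = new LinkedHashMap<>();
    private final Map<String, String> properties = new LinkedHashMap<>();

    // Rejects a second endpoint with the same name instead of silently replacing it.
    public void addEndpoint(Endpoint endpoint) {
        if (endpoints.putIfAbsent(endpoint.name, endpoint) != null) {
            throw new IllegalArgumentException("duplicate endpoint name: " + endpoint.name);
        }
    }

    public void setProperty(String key, String value) {
        properties.put(key, value);
    }

    // Hand-rolled JSON to stay dependency-free (assumed layout); real code would use a JSON library.
    public String toJson() {
        StringJoiner eps = new StringJoiner(",", "[", "]");
        for (Endpoint e : endpoints.values()) {
            eps.add("{\"name\":" + quote(e.name)
                    + ",\"hostname\":" + quote(e.hostname)
                    + ",\"port\":" + e.port
                    + ",\"type\":" + quote(e.type) + "}");
        }
        StringJoiner props = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> p : properties.entrySet()) {
            props.add(quote(p.getKey()) + ":" + quote(p.getValue()));
        }
        return "{\"endpoints\":" + eps + ",\"properties\":" + props + "}";
    }

    // Escapes only backslash and double quote, which is enough for the sample values.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        BookieServiceInfoSketch info = new BookieServiceInfoSketch();
        info.addEndpoint(new Endpoint("bookie", "localhost", 3181, "bookie-rpc"));
        info.setProperty("region", "eu");
        System.out.println(info.toJson());
    }
}
```

Keeping endpoints in a map keyed by name is one way to enforce the "no duplicate names" rule at insertion time; whether duplicates should be rejected or overwritten is a design choice still open for discussion.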
I am not sure about the "network_address" example. I see value in "region" and "rack", maybe as separate values.
Anyway, I would defer introducing those concepts in a first implementation, because it deserves a discussion about how to migrate from the current script-based approach.

Is it okay for us to use JSON format?
