Thanks for starting the PIP. 
I added some thoughts in this comment:
https://github.com/apache/pulsar/pull/22039/files#r1486372582

Expanding my comment in the PR here:

It would be interesting to learn why advertised listeners weren't implemented 
as they had been designed.  We wouldn't have this problem if the implementation 
would match the design.

In PIP-61 [1], the selected approach is "only return the corresponding service 
URL":

> In the approach, when the client sends a lookup request to the broker, the 
> broker only 
> returns one service URL that with the same listener name with the client 
> uses. The purpose 
> of this approach is keeping client-side simple and not expose extra listeners 
> to the client, 
> this is better for security.

This is also referred in PIP-95 [2] "use a unique bind address for each 
listener":

> The broker shall be configurable to open a server socket on numerous bind 
> addresses, and 
> to associate each socket with a listener name. When a lookup request arrives 
> on a particular 
> socket, the broker shall use the associated listener by default. Note that an 
> explicit listener 
> name in the request parameters would take precedence.

Why doesn't the implementation in Pulsar match PIP-61 and PIP-95 designs?
Instead of implementing PIP-338, should we consider implementing the original 
design for addressing the problem?

-Lari

1 - 
https://github.com/apache/pulsar/wiki/PIP-61%3A-Advertised-multiple-addresses
2 - https://github.com/apache/pulsar/issues/12040


On 2024/02/07 18:42:07 Manmeet Kaur wrote:
> Hello everyone,
> Proposing a PIP-338 : https://github.com/apache/pulsar/pull/22039
> ----------------------------------------------------------------------------
> 
> # PIP-338: Add default lookup listener and fix inconsistency with
> listener's usage between different protocols
> 
> # Motivation
> ## Issue in existing Code flow
> 
> Currently, the listener mentioned as `internalListenerName` in broker
> config helps in deciding the listener from the list of
> `advertisedListeners` to specify the service URL, is being used for
> broker-to-broker communication but that is exposed to the client in case of
> lookup results or redirects as well.
> 
> Even if the client sends a `listenerName` (PIP-91/PIP-95) corresponding to
> the http protocol broker’s addresses, it is only used in the `pulsar` and
> `pulsar+ssl` protocols and is not consistent for the other protocols.
> 
> See [code](
> https://github.com/apache/pulsar/blob/a77fb6ed165ac8ad6968558077d80ea3edaf9a7e/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java#L482
> )
> 
> In the existing flow, the listenerName is not acknowledged in case of http
> protocol.
> Here, it is trying to obtain the broker service url. If the listener has
> service url/tls (pulsar protocols) then they are used in the response, but
> the HTTP urls are from the internal listener only (when configured). Later
> in the call chain, the http urls are used for redirect location - which may
> be non resolvable for the client.
> 
> See [code](
> https://github.com/apache/pulsar/blob/a77fb6ed165ac8ad6968558077d80ea3edaf9a7e/pulsar-broker/src/main/java/org/apache/pulsar/broker/lookup/TopicLookupBase.java#L103
> )
> 
> ## Need to have lookupListener
> The Pulsar broker has the concept of advertised listeners representing
> broker endpoints that are discoverable by, and accessible to, Pulsar
> clients. In addition, one listener may be designated as the internal
> listener, to be used for broker-to-broker communication. Listeners may be
> understood as named routes to the broker. If a listener name is not
> specified, the response contains the endpoint for the internal listener, or
> the first listener for the protocol.
> 
> But the challenge arises when it considers the same internalListener for
> lookup requests redirects. This may result in returning an unresolvable
> broker service address to the clients.
> It may not be possible to pass the listenerName from the clients, which
> consequently may lead to cluster downtime.
> 
> it might not be feasible to have the listerner config at the client side in
> every tech stack or connector.
> Also,  Currently, there is no option to pass the listener while using the
> admin APIs. As admin APIs can be called from an external network as well,
> the use of the internal listener’s broker service URL can lead all admin
> operations to get affected.
> 
> Moreover, as per current code provided above, even if the client provides a
> listenerName/listener header, the redirect urls are taken from the internal
> listener or the first listener with http protocol which may not be a
> client-resolvable http address.
> 
> # Proposed Solution
> To fix inconsistency with all protocol’s service url, we need to include
> all broker addresses associated with the listener name while returning the
> lookupResult, rather than solely the service URL
> 
> Also, Introduce a broker-level property to determine the appropriate
> listener for lookup calls if listenerName is missing from the client
> request. This can help deal with the side effects of not having listerName
> on the client side when multiple advertisedListeners are being used.
> 
> Also, this can help to determine the broker service URL that needs to be
> returned in case of lookup redirects and also for using Pulsar admin API
> calls from outside the network.
> 
> This will help in having a more transparent listener identifier for
> internal and external communications.
> 
> 
> # Design & Implementation Details
> ## Approach
> **1. Fix Code Flow - Inconsistency with listener usage with protocols other
> than pulsar and pulsar+ssl**
> 
> Currently, lookup result return the broker service url corresponding to the
> internal listerner in case of http or https protocol urls. To address the
> issue in the workflow, it is necessary to include all broker addresses
> associated with the listener name, rather than solely the service URL.
> There can be a check if all broker addresses corresponding to the given
> listener is null, then use the default listener(discussed below)
> 
> **2. Only return the service URL corresponding to the lookupListener**
> 
> This approach introduces one new configuration `lookupListenerName` in the
> broker.conf. The `lookupListenerName` is used to specify which service URL
> should be returned to the clients from the lookup requests or redirects.
> `lookupListenerName` should be present in the `advertisedListeners` list.
> Users can set up the `lookupListenerName`. for example:
> 
> ```
> existing configs:
> advertisedListeners=internal:pulsar://xyz-broker:6660,internal:pulsar+ssl://
> 192.168.1.11:6651,external:http://192.168.1.11:8080
> internalListenerName=internal
> 
> new config required:
> lookupListenerName=external
> ```
> 
> In the approach, when the client sends a lookup request to the broker to
> get the owner broker without the listnerName, the broker only returns one
> service URL that is with the same listener name for which the default value
> of the lookupListenerName is given in the broker configuration. Therefore,
> we must allow the client to get the corresponding service URL with the same
> advertised listener name as present in the broker config.
> 
> This approach's purpose is to keep client-side simple and handle situations
> when not having listenerName information can lead to breaking scenarios. By
> having the default value, clients should be able to connect with the broker
> even if the request lands to the broker who is not the owner and redirects
> to some other broker.
> 
> 
> ### Backward Compatibility
> The above mentioned approach(2) of having a new config as lookupListener is
> backward compatible as there should not be any impact on the existing users
> if this configuration would not be set. Users can set this configuration
> for listener selection for the external communication as per their usecase.
> The same listener would be picked up for the admin APIs as well if admin
> calls are coming outside the network.
> 
> Approach(1), Fixing the code flow for http protocol might be a breaking
> change for the clients where we are returning the all broker addresses
> corresponding to the listener rather than just having service url. This can
> be breaking for the clients who are used to this inconsistency of having
> internal listener’s service url as lookup request. To deal with this, we
> can have a flag - `preferHttpClientListenerOverInternalListener` which
> would indicate whether this change should be enabled for clients or not.
> By Default, the value of this flag can be set as false for backward
> compatibility.
> 
> 
> Regards,
> Manmeet
> 
> On Thu, Feb 8, 2024 at 12:00 AM Manmeet Kaur <meetahuja1...@gmail.com>
> wrote:
> 
> > https://github.com/apache/pulsar/pull/22039
> >
> 

Reply via email to