rdhabalia opened a new pull request, #23974:
URL: https://github.com/apache/pulsar/pull/23974

   
   
   ### Motivation
   
   In Apache Pulsar, the broker enables producers and consumers to connect to a 
topic and provides an API to retrieve topic statistics. These stats include a 
list of connected producers and consumers, along with their IP addresses and 
connection times. This information is particularly valuable when dealing with a 
large number of producers and consumers from various client hosts, as it helps 
troubleshoot issues such as:
   
   - Identifying which client host has an active consumer
   - Detecting if a client host has stopped consuming messages
   - Diagnosing message backlogs
   
   Thus, mapping the client host IP to the corresponding producer or consumer 
is crucial.
   
   **The Issue with Reverse Proxies**
   However, this mapping breaks when a reverse proxy is used between the client 
and broker. In such cases, the broker records only the proxy's IP address for 
all connected producers and consumers, making it difficult to identify the 
actual client host. Apache Pulsar supports multiple proxy solutions, such as 
Pulsar-Proxy and SNI Proxy, which further complicates troubleshooting by 
obscuring client IPs.
   
   To resolve this, this PR ensures that when a client library connects to a 
broker via a proxy, it sends the actual client IP address. The broker then 
correctly identifies and records this IP in the stats API, mapping it to the 
appropriate producer or consumer. This approach abstracts the proxy layer from 
users, allowing them to see accurate client IPs without any additional effort.
   
   This PR does not change the client-broker protocol, the API definitions, or 
the configuration.
   
   ### Modifications
   
   The client library sends an `ip-address` property when it detects a proxy, 
and the broker reports that address in the topic stats for the corresponding 
producer or consumer.
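
The fallback described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class `ClientAddressResolver`, the method `resolve`, and the exact property key string are hypothetical names chosen for the example. It assumes the broker has the client-supplied properties available as a map and otherwise knows only the remote address of the socket, which is the proxy's address when a proxy is in use.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Pulsar code base. */
final class ClientAddressResolver {

    // Assumed key name; the PR text only says "an ip-address property".
    static final String IP_ADDRESS_PROPERTY = "ip-address";

    private ClientAddressResolver() {
    }

    /**
     * Picks the address to report in stats.
     *
     * @param properties          client-supplied properties, may be null or empty
     * @param socketRemoteAddress remote address seen on the connection
     *                            (the proxy's address when a proxy is used)
     * @return the client-reported address if present and non-blank,
     *         otherwise the socket's remote address
     */
    static String resolve(Map<String, String> properties, String socketRemoteAddress) {
        if (properties != null) {
            String reported = properties.get(IP_ADDRESS_PROPERTY);
            if (reported != null && !reported.isBlank()) {
                return reported.trim();
            }
        }
        // No usable client-reported address: fall back to what the socket shows.
        return socketRemoteAddress;
    }
}
```

With this sketch, `ClientAddressResolver.resolve(Map.of("ip-address", "10.1.2.3"), "192.168.0.1:5000")` returns `10.1.2.3`, while a call with an empty map returns the socket address unchanged. Falling back to the socket address keeps the stats populated for clients that connect directly or do not send the property.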
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads 
(10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Does this pull request potentially affect one of the following parts:
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   *If the box was checked, please highlight the changes*
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: <!-- ENTER URL HERE -->
   
   <!--
   After opening this PR, the build in apache/pulsar will fail and instructions 
will
   be provided for opening a PR in the PR author's forked repository.
   
   apache/pulsar pull requests should be first tested in your own fork since 
the 
   apache/pulsar CI based on GitHub Actions has constrained resources and quota.
   GitHub Actions provides separate quota for pull requests that are executed 
in 
   a forked repository.
   
   The tests will be run in the forked repository until all PR review comments 
have
   been handled, the tests pass and the PR is approved by a reviewer.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
