On Tue, Aug 8, 2023 at 9:56 PM Michael Marshall <mmarsh...@apache.org>
wrote:

> Thanks for your proposal, Heesung. Sorry for my late response. Just
> want to make a few points to hopefully improve the implementation.
> Overall, I think this feature is absolutely the right direction for
> Pulsar and will be a great improvement. I remember discussing this at
> some community meetings last year. See the last paragraph of [0].
>
> In general, my primary question is about the protocol changes you're
> proposing, both the commands and the way the commands are sent.
>
> > I am worried about the backward compatibility if we add a new proto
> message
> > -- how can the old versions of clients handle this new proto message?
>
> The protocolVersion [1] object in the proto definition is how we
> guarantee backwards and forwards compatibility for all combinations of
> brokers/clients. The client and the broker each share their version
> during the handshake and then they only use commands known to both
> peers.
>
> We can introduce a new command here if we want/need. The nuance is
> that the server logic will branch depending on the client's
> protocolVersion.
>

Good Point. We can introduce new commands here instead of the optional
fields by branching the protocolVersion.
If the community thinks the new commands are cleaner, I can create new
commands for this, as the protocolVersion guarantees the
backward-compatbility.
However, with this optional field approach, we don't need to introduce a
new protocol version, which could be arguably beneficial.
I will leave the decision to the community. Let me know if I have to use
new commands and increase the protocolVersion for this change.



> Next, I am wondering if there are any opportunities to decrease the
> latency even more than are proposed in the PIP by introducing some
> concurrency to decouple operations like creating new connections and
> initializing resources used by each topic (like a managed ledger).
> Because I used a mermaid diagram, I put the majority of this point in
> a PR comment [2].
>

 I will be replying to the PR.


> Thanks,
> Michael
>
> [0] https://lists.apache.org/thread/ghlsprmqgkt8lfv634groy93c5dm1k1g
> [1]
> https://github.com/apache/pulsar/blob/15453964f6e6086b0c5e34736958cf749f16f5e1/pulsar-common/src/main/proto/PulsarApi.proto#L242-L268
> [2] https://github.com/apache/pulsar/pull/20748#discussion_r1287927655
>
> On Fri, Aug 4, 2023 at 3:41 PM Heesung Sohn
> <heesung.s...@streamnative.io.invalid> wrote:
> >
> > Hi,
> >
> > I updated the PIP to add more details about the recent discussion.
> >
> > Thanks,
> > Heesung
> >
> > On Thu, Aug 3, 2023 at 6:46 PM Heesung Sohn <
> heesung.s...@streamnative.io>
> > wrote:
> >
> > >
> > >
> > > On Wed, Aug 2, 2023 at 8:02 PM PengHui Li <peng...@apache.org> wrote:
> > >
> > >> > One of the advantages of this proposal (add these optional
> dstBrokerUrl
> > >> fields in the existing CommandCloseProducer and CommandCloseConsumer)
> is
> > >> the backward compatibility
> > >>
> > >> Ah, yes. I missed this point. It's a good convince.
> > >>
> > >> > After the client connects to the destination broker, the next
> > >> command(e.g.
> > >> > ProducerCommand) requires
> > >> > the destination broker to check the ownership again against its
> local
> > >> table
> > >> > view of the bundle state channel.
> > >>
> > >> > Upon this local ownership check, there could be the following
> scenarios:
> > >>
> > >> > Happy case:
> > >> > - If the ownership state is `owned ` and the current broker is
> indeed
> > >> the
> > >> > owner, the command completes.
> > >> > Unhappy cases:
> > >> > - If the ownership state is `owned ` and the current broker is not
> the
> > >> > owner, the command fails
> > >> > (the broker returns an error to the client), and the client tries to
> > >> find
> > >> > the true new owner by lookups.
> > >> > The global bundle state channel is eventually consistent, and the
> > >> lookups
> > >> > should eventually converge.
> > >> > - if the ownership change is still in progress(`releasing`,
> > >> `assigning`),
> > >> > this check will be deferred
> > >> > until the state becomes `owned` with a timeout.
> > >>
> > >> Could you please also update the explanation in the proposal?
> > >> It looks like what is guaranteed and what is not in the new proposal.
> > >> And why the mechanism can work with the unhappy case.
> > >>
> > >> Both the existing mechanism and the new mechanism follow the
> > >> eventual consistency. The nuance is that the new mechanism will change
> > >> the bundle owner first and then unload the topics. For the broker
> logs,
> > >> will more error messages related to the fenced ledger be introduced?
> > >> Or the client reconnects will happen after the owner cache is updated?
> > >>
> > >>
> > > As described in the protocol change in the PIP, the goal is to have
> fewer
> > > fenced ledger errors because the protocol change helps the client
> > > reconnection(open ledger)
> > >  happen at the very end.
> > >
> > >
> > >> If the load manager crashed before unloading the topics (close
> producers
> > >> and consumers).
> > >> Will the old broker provide service for this topic until the client
> > >> triggers a
> > >> new lookup?
> > >>
> > >
> > > Sorry, I am not sure if I understood your question here, but
> > > When the destination or source broker crashes in the middle,
> > > the leader will find the orphan state and clean the ownership by
> selecting
> > > a new owner, and the client will reconnect to it.
> > > During this transfer process, if alive, the source broker will serve
> the
> > > topic according to the protocol described in the PIP.
> > >
> > >
> > >> Sorry, Another question about PIP-192. Is it required by the new
> > >> mechanism?
> > >> Will the situation also be improved without enabling PIP-192?
> > >> We have added the new owner broker to the close producer and consumer
> > >> command.
> > >> We can also update the owner and then close the producer and consumers
> > >> with
> > >> the new owner.
> > >>
> > >
> > > This PIP is based on PIP-192 because the proposed protocol change is
> based
> > > on the bundle states from PIP-192.
> > > I assume the community could raise other PIPs to utilize the close
> > > producer and consumer command changes for other logic too.
> > >
> > >
> > >>
> > >> Regards,
> > >> Penghui
> > >>
> > >> On Thu, Aug 3, 2023 at 4:38 AM Heesung Sohn
> > >> <heesung.s...@streamnative.io.invalid> wrote:
> > >>
> > >> > Hi PengHui,
> > >> > Thank you for your questions. Please find my answers inline.
> > >> >
> > >> > On Wed, Aug 2, 2023 at 3:47 AM PengHui Li <peng...@apache.org>
> wrote:
> > >> >
> > >> > > Hi Heesung,
> > >> > >
> > >> > > I'm sorry for not getting back to you sooner.
> > >> > > The motivation and the solution look good to me.
> > >> > >
> > >> > > I just want to leave some comments to make the proposal clear
> > >> > > for users and developers.
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >>
> ----------------------------------------------------------------------------------------------------
> > >> > >
> > >> > > We should have a section to describe the values that will be added
> > >> > > by this proposal.
> > >> > >
> > >> > > - Reduced the publish latency spike and e2e latency spike during
> the
> > >> > > reconnecting
> > >> > > - Mitigated bottleneck(lookup) of a large number of topics in a
> > >> cluster
> > >> > >
> > >> > > Maybe more, I can only think about the above two values.
> > >> > >
> > >> > >
> > >> > Good point. I will clarify this part in the PIP.
> > >> >
> > >> >
> > >> >
> > >>
> ----------------------------------------------------------------------------------------------------
> > >> > >
> > >> > > For the proto changes, the client side can also use the HTTP
> lookup
> > >> > > service.
> > >> > > We should also mention it in this proposal. Instead of adding
> four new
> > >> > > fields to
> > >> > > the CommandCloseProducer and CommandCloseConsumer, maybe adding a
> new
> > >> > > message BrokerIdentifier, or others is better.
> > >> > >
> > >> >
> > >> > I am worried about the backward compatibility if we add a new proto
> > >> message
> > >> > -- how can the old versions of clients handle this new proto
> message?
> > >> >
> > >> > One of the advantages of this proposal (add these optional
> dstBrokerUrl
> > >> > fields in the existing CommandCloseProducer and
> CommandCloseConsumer) is
> > >> > the backward compatibility -- older clients can still work in the
> older
> > >> way
> > >> > by going through the lookup requests.
> > >> >
> > >> > Ref: https://protobuf.dev/overview/#updating-defs
> > >> > Updating Proto Definitions Without Updating Code
> > >> > <https://protobuf.dev/overview/#updating-defs>
> > >> >
> > >> > It’s standard for software products to be backward compatible, but
> it is
> > >> > less common for them to be forward compatible. As long as you follow
> > >> > some simple
> > >> > practices <https://protobuf.dev/programming-guides/proto3/#updating
> >
> > >> when
> > >> > updating .proto definitions, old code will read new messages without
> > >> > issues, ignoring any newly added fields. ...
> > >> >
> > >> >
> > >> > >
> > >> > >
> > >> >
> > >>
> ----------------------------------------------------------------------------------------------------
> > >> > >
> > >> > > There are two points we need to clarify in the proposal.
> > >> > >
> > >> > > - If the client reconnects before the new owner takes ownership.
> If
> > >> the
> > >> > > producer closed
> > >> > >   after the new broker took ownership, would it be a potential
> > >> > consistency
> > >> > > issue?
> > >> >
> > >> > - If the owner changed again before the client reconnection. How did
> > >> Pulsar
> > >> > > handle this case?
> > >> > >    As I understand, it's fine because the broker will check the
> > >> ownership
> > >> > > again when handling
> > >> > >    producer and consumer creation. If it can be explained in the
> > >> > proposal,
> > >> > > it should be great
> > >> > >    for developers to understand it deeply.
> > >> > >
> > >> >
> > >> > After the client connects to the destination broker, the next
> > >> command(e.g.
> > >> > ProducerCommand) requires
> > >> > the destination broker to check the ownership again against its
> local
> > >> table
> > >> > view of the bundle state channel.
> > >> >
> > >> > Upon this local ownership check, there could be the following
> scenarios:
> > >> >
> > >> > Happy case:
> > >> > - If the ownership state is `owned ` and the current broker is
> indeed
> > >> the
> > >> > owner, the command completes.
> > >> > Unhappy cases:
> > >> > - If the ownership state is `owned ` and the current broker is not
> the
> > >> > owner, the command fails
> > >> > (the broker returns an error to the client), and the client tries to
> > >> find
> > >> > the true new owner by lookups.
> > >> > The global bundle state channel is eventually consistent, and the
> > >> lookups
> > >> > should eventually converge.
> > >> > - if the ownership change is still in progress(`releasing`,
> > >> `assigning`),
> > >> > this check will be deferred
> > >> > until the state becomes `owned` with a timeout.
> > >> >
> > >> >
> > >> > > Regards,
> > >> > > Penghui
> > >> > >
> > >> > > On Wed, Aug 2, 2023 at 1:18 AM Heesung Sohn
> > >> > > <heesung.s...@streamnative.io.invalid> wrote:
> > >> > >
> > >> > > > Hi,
> > >> > > >
> > >> > > > It has been over three weeks since this discussion started, and
> I
> > >> > replied
> > >> > > > to the questions in the PIP.
> > >> > > > I will send out a vote email tomorrow if looking good.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Heesung
> > >> > > >
> > >> > > > On Tue, Jul 25, 2023 at 1:12 PM Heesung Sohn <
> > >> > > heesung.s...@streamnative.io
> > >> > > > >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi,
> > >> > > > >
> > >> > > > > I replied to the comments from Kai and Ran in the PIP.
> > >> > > > >
> > >> > > > > Please review the PIP by any chance.
> > >> > > > >
> > >> > > > > Thank you,
> > >> > > > > Heesung
> > >> > > > >
> > >> > > > > On Thu, Jul 6, 2023 at 4:32 PM Heesung Sohn <
> > >> > > > heesung.s...@streamnative.io>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > >> Hi dev,
> > >> > > > >>
> > >> > > > >> I proposed this PIP,
> https://github.com/apache/pulsar/pull/20748
> > >> ,
> > >> > to
> > >> > > > >> make unloaded clients directly and gracefully connect to new
> > >> owner
> > >> > > > brokers
> > >> > > > >> without lookups.
> > >> > > > >>
> > >> > > > >> Please take a look and let me know what you think.
> > >> > > > >>
> > >> > > > >> Regards,
> > >> > > > >> Heesung
> > >> > > > >>
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
>

Reply via email to