Re: [DISCUSS] PIP-224: Introduce TopicMessageId for consumer's MessageId related APIs

丛搏 Wed, 21 Dec 2022 04:50:54 -0800

Hi, Yunze:

< ```java
< var msg = multiTopicsConsumer.receive();
< var msgId = (TopicMessageId) multiTopicsConsumer.getMessageId();
< consumer.seek(msgId.getOwnerTopic(), msgId);
< ```


the code can be like this:
 ```java
var msg = anyConsumer.receive();
var msgId = anyConsume.getMessageId();
consumer.seek(msg, msgId);
 ```
If messageID does not contain `TopicName`, the `TopicName` is best get from msg.

< What's different is that the offset in Kafka can represent a position
< of ANY partition, while the MessageId in Pulsar can only represent the
< position of A SPECIFIC partition.

Although MessageId in Pulsar can only represent the position of A
SPECIFIC partition, but it still needs a TopicName. `LedgerID` and
`EntryID` do not mean that this `MessageID` belongs to a topic
(although it does belong), but it still cannot avoid `TopicName` for
marking this `MessageID` belongs to this topic.

> And in Pulsar, we also do not expose
> the partition concept, if we introduce the seek API with the topic
> name as the argument, we have to explain in detail about what's the
> topic name for a partition. It could be a very confusing thing from my
> experience when I explained the "partition" concept in community.

if using `TopicMessageId` also has the same problem, why we need to
use `TopicMessageId` not `MessageId`

Thanks,
Bo

Yunze Xu <y...@streamnative.io.invalid> 于2022年12月21日周三 16:59写道：
>
> Hi Bo,
>
> If we have the `seek` API that accepts a topic name, how to use seek
> for a single topic consumer and multi-topics consumer will be
> different.
>
> ```java
> var msg = singleTopicConsumer.receive();
> var msgId = singleTopicConsumer.getMessageId();
> consumer.seek(msgId);
> ```
>
> ```java
> var msg = multiTopicsConsumer.receive();
> var msgId = (TopicMessageId) multiTopicsConsumer.getMessageId();
> consumer.seek(msgId.getOwnerTopic(), msgId);
> ```
>
> It's not as clear as you have thought. A question could come from the
> code above: since we can get the key (topic name) from `msgId` itself,
> why do we need another argument?
>
> What's worse is that users have to specify the correct topic name. For
> a partitioned topic, if users specified another partition, the `seek`
> operation would fail. If they specified something like
> `multiTopicsConsumer.getTopic()`, it would also fail because other
> APIs like `Consumer#getTopic()` doesn't return the correct topic name.
>
> If there is only one correct topic name for a given TopicMessageId,
> what's the meaning of making it as a required argument?
>
> BTW, let's see Kafka client's commit API:
>
> ```java
> public void commitSync(Map<TopicPartition,OffsetAndMetadata> offsets)
> ```
>
> What's different is that the offset in Kafka can represent a position
> of ANY partition, while the MessageId in Pulsar can only represent the
> position of A SPECIFIC partition. And in Pulsar, we also do not expose
> the partition concept, if we introduce the seek API with the topic
> name as the argument, we have to explain in detail about what's the
> topic name for a partition. It could be a very confusing thing from my
> experience when I explained the "partition" concept in community.
>
> Thanks,
> Yunze
>
>
> On Wed, Dec 21, 2022 at 3:20 PM 丛搏 <bog...@apache.org> wrote:
> >
> > Hi Yunze,
> >
> > add `TopicMessageId ` will couple messageID and `topic name` together,
> > which is very unclear for non-partition-topic.
> >
> > ```
> > void seek(String topicName, MessageId messageId) throws 
> > PulsarClientException;
> > List<Map<String, MessageId>> getLastTopicMessageId() throws
> > PulsarClientException;
> > ```
> > If the interface is designed in this way, it may be simpler, easier to
> > understand, and more intuitive for users, and MessageID will not be
> > coupled with TopicName.
> >
> > because this PIP has already initiated a VOTE, so I will sync this
> > reply to PIP-224-VOTE[0]
> >
> > Thanks,
> > Bo
> > [0] https://lists.apache.org/thread/mbrpjsgrgwrlkdpvkk738jxnlk7rf4qk
> >
> > Yunze Xu <y...@streamnative.io.invalid> 于2022年12月9日周五 14:33写道：
> > >
> > > Hi Jiaqi,
> > >
> > > Let's move to 
> > > https://lists.apache.org/thread/mbrpjsgrgwrlkdpvkk738jxnlk7rf4qk
> > > for the vote.
> > >
> > > Thanks,
> > > Yunze
> > >
> > > On Fri, Dec 9, 2022 at 1:54 PM Jiaqi Shen <gleiphir2...@gmail.com> wrote:
> > > >
> > > > This is make sense to me, +1
> > > >
> > > > Thanks,
> > > > Jiaqi Shen
> > > >
> > > >
> > > > Yunze Xu <y...@streamnative.io.invalid> 于2022年12月7日周三 13:51写道：
> > > >
> > > > > Hi Baodi,
> > > > >
> > > > > I decided not to change the behavior of the `negativeAcknowledge`
> > > > > method. I just checked again that there is no exception signature for
> > > > > this method and there is no asynchronous version like
> > > > > `negativeAcknowledgeAsync`. To keep the API compatible, we should not
> > > > > add an exception signature, which would be required if a
> > > > > `PulsarClientException` was thrown.
> > > > >
> > > > > Thanks,
> > > > > Yunze
> > > > >
> > > > > On Tue, Nov 29, 2022 at 10:12 PM Baodi Shi 
> > > > > <baodi....@icloud.com.invalid>
> > > > > wrote:
> > > > > >
> > > > > > Hi, Yunze:
> > > > > >
> > > > > > Thanks for your proposal. That Looks good to me.
> > > > > >
> > > > > > `negativeAcknowledge` also needs to add the same checks as the new
> > > > > acknowledge interface.
> > > > > >
> > > > > > > This interface doesn't add any acknowledge overload because the
> > > > > overloads are already too many. But it will make the behavior clear.
> > > > > > I think since we exposed the TopicMessageId, it would be better to 
> > > > > > add
> > > > > overloaded interfaces (even if the overloads are a lot). This can 
> > > > > users to
> > > > > clearly associate the use cases of MultiTopicConsumer and 
> > > > > TopicMessageId.
> > > > > >
> > > > > > Also, while it's okay to use TopicMessageId param on a single 
> > > > > > consumer,
> > > > > I guess we shouldn't allow users to use it.
> > > > > >
> > > > > > In this way, users are clearly aware that TopicMessageId is used 
> > > > > > when
> > > > > using MultiTopicConsumer and MessageId is used when using
> > > > > SingleTopicConsumer.(Maybe it's not a good idea)
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Baodi Shi
> > > > > >
> > > > > > > 2022年11月29日 15:57，Yunze Xu <y...@streamnative.io.INVALID> 写道：
> > > > > > >
> > > > > > >> Is there a case where the user uses the messageId returned by the
> > > > > > > producer to seek in the consumer? Is this a good behavior?
> > > > > > >
> > > > > > > Yes. I think it should be acceptable. To correct my previous 
> > > > > > > point,
> > > > > > > now I think the MessageId returned by send should also be able to 
> > > > > > > be
> > > > > > > applied for seek or acknowledge.
> > > > > > >
> > > > > > >> even with the
> > > > > > > current proposal, it may return null when getting the topic from
> > > > > > > TopicMessageId for backward compatibility.
> > > > > > >
> > > > > > > No. It may return null just because Java doesn't allow a non-null
> > > > > > > returned value. The internal implementations of
> > > > > > > TopicMessageId#getOwerTopic should return a non-null topic name to
> > > > > > > avoid null check.
> > > > > > >
> > > > > > > When I mentioned **the implementation of getTopicName() must 
> > > > > > > return
> > > > > > > null**, the assumption is that MessageId#toByteArray serializes 
> > > > > > > the
> > > > > > > topic name if adding the `getTopicName()` method. However, in this
> > > > > > > proposal, `TopicMessageId#toByteArray` won't. See the 
> > > > > > > implementation
> > > > > > > of `TopicMessageId#create`. It's only a wrapper for an arbitrary
> > > > > > > MessageId implementation.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yunze
> > > > > > >
> > > > > > > On Tue, Nov 29, 2022 at 2:47 PM Zike Yang <z...@apache.org> wrote:
> > > > > > >>
> > > > > > >> Hi Yunze,
> > > > > > >>
> > > > > > >> Thanks for your proposal. Quoted from your GitHub comments[0]:
> > > > > > >>
> > > > > > >>> There is also a case when MessageId is returned from 
> > > > > > >>> Producer#send.
> > > > > In this case, the returned MessageId should only used for 
> > > > > serialization
> > > > > > >>
> > > > > > >> Is there a case where the user uses the messageId returned by the
> > > > > > >> producer to seek in the consumer? Is this a good behavior?
> > > > > > >>
> > > > > > >>> If we added the method directly to MessageId, to keep the 
> > > > > > >>> backward
> > > > > compatibility, the implementation of getTopicName() must return null, 
> > > > > which
> > > > > is not a good design.
> > > > > > >>
> > > > > > >> I think it's a trade-off. If I understand correctly, even with 
> > > > > > >> the
> > > > > > >> current proposal, it may return null when getting the topic from
> > > > > > >> TopicMessageId for backward compatibility. The current
> > > > > > >> TopicMessageIdImpl doesn't serialize the topic information.
> > > > > > >>
> > > > > > >>
> > > > > > >> [0]
> > > > > https://github.com/apache/pulsar/issues/18616#issuecomment-1328609346
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Zike Yang
> > > > > > >>
> > > > > > >> On Mon, Nov 28, 2022 at 12:22 PM Yunze Xu
> > > > > <y...@streamnative.io.invalid> wrote:
> > > > > > >>>
> > > > > > >>> Hi all,
> > > > > > >>>
> > > > > > >>> I've opened a PIP to discuss:
> > > > > https://github.com/apache/pulsar/issues/18616.
> > > > > > >>>
> > > > > > >>> The consumer's MessageId related APIs have some hidden 
> > > > > > >>> requirements
> > > > > > >>> and flakiness and some behaviors are not documented well. This
> > > > > > >>> proposal will introduce a TopicMessageId interface that exposes 
> > > > > > >>> a
> > > > > > >>> method to get a message's owner topic.
> > > > > > >>>
> > > > > > >>> P.S. There was an email [1] that didn't add the "[DISCUSS]" 
> > > > > > >>> label,
> > > > > > >>> which might be a little confusing. So I sent the email again for
> > > > > > >>> discussion. Please do not reply to the previous email.
> > > > > > >>>
> > > > > > >>> [1] 
> > > > > > >>> https://lists.apache.org/thread/6gj16pmrjk6ncsd30xrl20pr5ng6t61o
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Yunze
> > > > > >
> > > > >

Re: [DISCUSS] PIP-224: Introduce TopicMessageId for consumer's MessageId related APIs

Reply via email to