Hey David,

Thanks for updating the wiki.
1. I was actually thinking of letting every broker just consume the
__consumer_offsets topic. But that seems less efficient if only a few topics
are configured for committed-offset-based retention, so querying the committed
offsets seems reasonable. From the wiki it is not clear whether the
committed-offset query happens synchronously or asynchronously. It is probably
better to do it asynchronously, i.e. in a thread other than the log-deletion
thread. Otherwise querying the committed offsets may slow down, or even block,
the log deletion if a remote call fails.

2. Using the new consumer does not necessarily introduce a new group unless we
use Kafka-based group management. But using KafkaConsumer directly to query
the committed offsets may not work in this case, because by default it uses
the consumer group in the ConsumerConfig. We could use NetworkClient and see
whether we can reuse some of the code in the new consumer. Since a lot of
effort has been spent on deprecating the SimpleConsumer, we probably want to
avoid introducing any new usage of it. Anyway, this is an implementation
detail and we can figure it out when writing the patch.

3. What I am thinking is that we should consider whether to allow multiple
retention policies to be set at the same time, and if so, which policy takes
precedence. Otherwise it might be confusing for users who have multiple
retention policies set.

In addition to the above, it seems we need some way to configure the set of
consumer groups a topic's retention should depend on. If it is done through a
topic config, it would be good to document the configuration name and the
format of its value in the wiki as well.

Thanks,

Jiangjie (Becket) Qin

On Sun, Oct 9, 2016 at 7:14 AM, 东方甲乙 <254479...@qq.com> wrote:

> Hi Becket,
>
> This is David, thanks for the comments. I have updated some info in the
> wiki. Almost all of the changes are described in the workflow.
>
> Answers to the comments:
>
> 1. Every broker only has some of the groups' committed offsets, which are
> stored in the __consumer_offsets topic; it still has to query the other
> coordinators (other brokers) for the remaining groups' committed offsets.
> So we use the OffsetFetchRequest to query one group's committed offset.
>
> 2. Using the new consumer to query the committed offsets would introduce
> a new group, but if we use the OffsetFetchRequest to query (like the
> consumer-offset-checker tool: first find the coordinator and build a
> channel to query), we will not introduce a new group.
>
> 3. I think KIP-47's functionality is a little different from this KIP,
> although both modify the log retention.
>
> Thanks,
> David.
>
> ------------------ Original Message ------------------
> From: "Becket Qin" <becket....@gmail.com>
> Sent: Sunday, October 9, 2016, 1:00 PM
> To: "dev" <dev@kafka.apache.org>
> Subject: Re: [DISCUSS] KIP-68 Add a consumed log retention before log
> retention
>
> Hi David,
>
> Thanks for the explanation. Could you update the KIP-68 wiki to include
> the changes that need to be made?
>
> I have a few more comments below:
>
> 1. We already have an internal topic __consumer_offsets to store all the
> committed offsets, so the brokers can probably just consume from that to
> get the committed offsets for all the partitions of each group.
>
> 2. It is probably better to use o.a.k.clients.consumer.KafkaConsumer
> instead of SimpleConsumer. It handles all the leader movements and
> potential failures.
>
> 3. KIP-47 also has a proposal for a new time-based log retention policy
> and proposes a new configuration for log retention. It may be worth
> thinking about the two behaviors together.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sat, Oct 8, 2016 at 2:15 AM, Pengwei (L) <pengwei...@huawei.com> wrote:
>
> > Hi Becket,
> >
> > Thanks for the feedback:
> > 1. We use the simple consumer API to query the committed offset, so we
> > don't need to specify the consumer group.
> > 2. Every broker uses the simple consumer API (OffsetFetchKey) to query
> > the committed offset in the log retention process. The client can
> > commit offsets or not.
> > 3. There is no need to distinguish follower brokers from leader
> > brokers; every broker can query.
> > 4. We don't need to change the protocols; we mainly change the log
> > retention process in the log manager.
> >
> > One concern is that querying the min offset needs O(partitions * groups)
> > time; an alternative is to build an internal topic to store every
> > partition's min offset, which can reduce it to O(1).
> > I will update the wiki with more details.
> >
> > Thanks,
> > David
> >
> > > Hi Pengwei,
> > >
> > > Thanks for the KIP proposal. It is a very useful KIP. At a high
> > > level, the proposed behavior looks reasonable to me.
> > >
> > > However, it seems that some of the details are not mentioned in the
> > > KIP. For example:
> > >
> > > 1. How will the expected consumer group be specified? Is it through
> > > a per-topic dynamic configuration?
> > > 2. How do the brokers detect the consumer offsets? Is it required
> > > for a consumer to commit offsets?
> > > 3. How do all the replicas know about the committed offsets? e.g.
> > > 1) non-coordinator brokers which do not have the committed offsets,
> > > 2) follower brokers which do not have consumers directly consuming
> > > from them.
> > > 4. Are there any other changes that need to be made (e.g. new
> > > protocols) in addition to the configuration change?
> > >
> > > It would be great if you can update the wiki to have more details.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Sep 7, 2016 at 2:26 AM, Pengwei (L) <pengwei...@huawei.com>
> > > wrote:
> > > >
> > > > Hi All,
> > > > I have made a KIP to enhance the log retention, details as follows:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-68+Add+a+consumed+log+retention+before+log+retention
> > > > Now starting a discussion thread for this KIP, looking forward to
> > > > the feedback.
> > > >
> > > > Thanks,
> > > > David
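For illustration only, here is a minimal sketch of the "consumed retention"
check discussed above: take the minimum committed offset of a partition
across the configured consumer groups, and only allow deletion of data below
that offset. This is not from the KIP or the patch; the class and method
names are made up, and it uses one short-lived KafkaConsumer per group purely
for simplicity, which is exactly the approach point 2 above questions (the
real broker-side code would more likely use OffsetFetchRequest/NetworkClient
and run asynchronously, per points 1 and 2).

import java.util.Collection;
import java.util.OptionalLong;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

/**
 * Illustration only (hypothetical class, not broker code): compute the
 * minimum committed offset of a partition across a configured set of
 * consumer groups. A log segment whose last offset is below this value
 * would be eligible for consumed-offset-based retention; everything at or
 * above it must be kept regardless of the time/size retention settings.
 */
public class ConsumedRetentionCheck {

    // Returns empty if the group set is empty or any group has no committed
    // offset for the partition, in which case consumed retention should not
    // delete anything for that partition.
    public static OptionalLong minCommittedOffset(String bootstrapServers,
                                                  Collection<String> groupIds,
                                                  TopicPartition tp) {
        long min = Long.MAX_VALUE;
        for (String groupId : groupIds) {
            Properties props = new Properties();
            props.put("bootstrap.servers", bootstrapServers);
            props.put("group.id", groupId);
            props.put("enable.auto.commit", "false");
            // One short-lived consumer per group, used only to read that
            // group's committed offset; it never subscribes or polls, so it
            // does not join the group or affect group membership.
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(
                    props, new ByteArrayDeserializer(), new ByteArrayDeserializer())) {
                OffsetAndMetadata committed = consumer.committed(tp);
                if (committed == null)
                    return OptionalLong.empty(); // group has not consumed this partition yet
                min = Math.min(min, committed.offset());
            }
        }
        return groupIds.isEmpty() ? OptionalLong.empty() : OptionalLong.of(min);
    }
}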