Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-10-14 Thread Xu Jianhai
The sentence ` (KAFKA-4545 ) ` could be changed to `KAFKA-4545 < https://issues.apache.org/jira/browse/KAFKA-4545 >`? On Sun, Oct 13, 2019 at 2:08 AM Richard Yu wrote: > Hi Jun, Jason, > > I've updated the

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-10-12 Thread Richard Yu
Hi Jun, Jason, I've updated the KIP accordingly. Sorry for taking a while to get back to you guys. We should've ironed out the general approach fairly well now. So if there aren't any other comments, I will get the vote started. :) Cheers, Richard On Thu, Oct 10, 2019 at 3:41 PM Jun Rao

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-10-10 Thread Jun Rao
Hi, Jason, I agree that your approach is better since it's more general, more accurate and simpler. The only thing is that it may not work for the old message format. I am not sure how important it is since most users are probably already on the new message format. Perhaps we can just document

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-27 Thread Richard Yu
Hi Jason, That actually sounds like a pretty good idea to me. No doubt if we use this approach, then some comments need to be added that indicate this. But all things considered, I think it's not bad at all. I definitely agree with you that it's a little hacky, but it works. Cheers, Richard

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-24 Thread Jason Gustafson
Hi Richard, It would be unsatisfying to make a big change to the checkpointing logic in order to handle only one case of this problem, right? I did have one idea about how to do this. It's a bit of a hack, but keep an open mind ;). The basic problem is having somewhere to embed the delete

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-19 Thread Richard Yu
Hi Jason, That hadn't occurred to me. I think I missed your comment in the discussion, so I created this KIP only to resolve the problem regarding tombstones. What are your thoughts? If the problem regarding transaction markers is a little too complex, then we can just leave it out of the

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-19 Thread Jason Gustafson
Hi Richard, Just reposting my comment from the JIRA: The underlying problem here also impacts the cleaning of transaction markers. We use the same delete horizon in order to tell when it is safe to remove the marker. If all the data from a transaction has been cleaned and the delete horizon has

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-09 Thread Richard Yu
Hi Jun, Thanks for chipping in. :) The description you provided is pretty apt in describing the motivation of the KIP, so I will add it. I've made some changes to the KIP and outlined the basic approaches of what we have so far (basically changing the checkpoint file organization or

Re: [DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-09 Thread Jun Rao
Hi, Richard, Thanks for drafting the KIP. A few comments below. 1. We need to provide a better motivation for the KIP. The goal of the KIP is not to reorganize the checkpoint for log cleaning. It's just an implementation detail. I was thinking that we could add something like the following in the

[DISCUSS] KIP-515: Reorganize checkpoint system in log cleaner to be per partition

2019-09-01 Thread Richard Yu
Hi all, A KIP has been written that proposes to upgrade the checkpoint file system in the log cleaner. If anybody wishes to comment, feel free to do so. :) https://cwiki.apache.org/confluence/display/KAFKA/KIP-515%3A+Reorganize+checkpoint+file+system+in+log+cleaner+to+be+per+partition Above is the