Re: [VOTE] KIP-954: expand default DSL store configuration to custom types

2023-07-29 Thread John Roesler
Thanks for the KIP, Almog!

I'm +1 (binding) 

I've reviewed the KIP and skimmed the discussion thread. I think this is going 
to be a very nice improvement.

Thanks,
-John

On Sat, Jul 29, 2023, at 13:26, Guozhang Wang wrote:
> Thanks Almog! I made a pass over the updated wiki and have no more questions. 
> +1
>
> Guozhang
>
> On Wed, Jul 26, 2023 at 8:14 AM Almog Gavra  wrote:
>>
>> Hello Everyone,
>>
>> Opening the voting for KIP-954. The discussion is converging, but please
>> feel free to chime in on the last few conversation points if you aren't
>> happy with where it settled.
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types
>>
>> Cheers,
>> Almog


Re: [VOTE] KIP-954: expand default DSL store configuration to custom types

2023-07-29 Thread Guozhang Wang
Thanks Almog! I made a pass over the updated wiki and have no more questions. +1

Guozhang

On Wed, Jul 26, 2023 at 8:14 AM Almog Gavra  wrote:
>
> Hello Everyone,
>
> Opening the voting for KIP-954. The discussion is converging, but please
> feel free to chime in on the last few conversation points if you aren't
> happy with where it settled.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types
>
> Cheers,
> Almog


Re: [DISCUSS] KIP-953: partition method to be overloaded to accept headers as well.

2023-07-29 Thread Sagar
Hi Andrew,

Thanks for your comments.

1) Yes, that makes sense, and that's what I would expect to see as well. I
just wanted to highlight that we might still need a way to keep client-side
partitioning logic available as well. Anyway, I am good on this point.
2) The example provided does seem achievable by simply attaching the
partition number in the ProducerRecord. I guess if we can't find any
further examples which strengthen the case of this partitioner, it might be
harder to justify adding it.
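
To make point 2 concrete, here is a minimal sketch of choosing the partition on
the producer side from a header value, without a custom Partitioner. The class
name, the "high" priority value, and the rule that partition 0 is reserved for
high-priority records are all assumptions made up for illustration; they are not
taken from the KIP.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// Hypothetical helper: picks a partition from a "priority" header value.
final class PriorityPartitionChooser {

    // Assumed contract between producers and consumers (illustrative only):
    // partition 0 carries "high" priority records, and all other records are
    // spread over partitions 1..numPartitions-1 by key hash.
    static int choosePartition(byte[] priorityHeader, String key, int numPartitions) {
        if (numPartitions < 2) {
            throw new IllegalArgumentException("need at least 2 partitions");
        }
        String priority = priorityHeader == null
                ? ""
                : new String(priorityHeader, StandardCharsets.UTF_8);
        if ("high".equals(priority)) {
            return 0;
        }
        // floorMod keeps the result non-negative even for negative hash codes;
        // Objects.hashCode returns 0 for a null key.
        return 1 + Math.floorMod(Objects.hashCode(key), numPartitions - 1);
    }

    public static void main(String[] args) {
        byte[] high = "high".getBytes(StandardCharsets.UTF_8);
        System.out.println(choosePartition(high, "order-1", 4));
        System.out.println(choosePartition(null, "order-1", 4));
    }
}
```

With kafka-clients on the classpath, the returned value would be passed as the
partition argument of the ProducerRecord constructor that takes
(topic, partition, key, value, headers), so the built-in partitioner is bypassed
for that record.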


Thanks!
Sagar.

On Fri, Jul 28, 2023 at 2:05 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Sagar,
> Thanks for your comments.
>
> 1) Server-side partitioning doesn’t necessarily mean that there’s only one
> way to do it. It just means that the partitioning logic runs on the broker
> and any configuration of partitioning applies to the broker’s partitioner.
> If we ever see a KIP for this, that’s the kind of thing I would expect to
> see.
>
> 2) In the priority example in the KIP, there is a kind of contract between
> the producers and consumers so that some records can be processed before
> others regardless of the order in which they were sent. The producer
> wants to apply special significance to a particular header to control which
> partition is used. I would simply achieve this by setting the partition
> number in the ProducerRecord at the time of sending.
>
> I don’t think the KIP proposes adjusting the built-in partitioner or
> adding to AK a new one that uses headers in the partitioning decision. So,
> any configuration for a partitioner that does support headers would be up
> to the implementation of that specific partitioner. Partitioner implements
> Configurable.
>
> I’m just providing an alternative view and I’m not particularly opposed to
> the KIP. I just don’t think it quite merits the work involved to get it
> voted and merged. As an aside, a long time ago, I created a small KIP that
> was never adopted and I didn’t push it because I eventually didn’t need it.
>
> Thanks,
> Andrew
>
> > On 28 Jul 2023, at 05:15, Sagar  wrote:
> >
> > Hey Andrew,
> >
> > Thanks for the review. Since I had reviewed the KIP I thought I would
> > also respond. Of course Jack has the final say on this since he wrote
> > the KIP.
> >
> > 1) This is an interesting point and I hadn't considered it. The
> > comparison with KIP-848 is a valid one but even within that KIP, it
> > allows client side partitioning for power users like Streams. So while we
> > would want to move away from client side partitioner as much as possible,
> > we still shouldn't do away completely with Client side partitioning and
> > end up being in a state of inflexibility for different kinds of usecases.
> > This is my opinion though and you have more context on Clients, so would
> > like to know your thoughts on this.
> >
> > 2) Regarding this, I assumed that since the headers are already part of
> > the consumer records they should have access to the headers and if there
> > is a contract b/w the applications producing and the application
> > consuming, that decisioning should be transparent. Was my assumption
> > incorrect? But as you rightly pointed out, header-based partitioning with
> > keys is going to lead to surprising results. Assuming there is merit in
> > this proposal, do you think we should ignore the keys in this case
> > (similar to the effect of setting the partitioner.ignore.keys config to
> > true) and document it appropriately?
> >
> > Let me know what you think.
> >
> > Thanks!
> > Sagar.
> >
> >
> > On Thu, Jul 27, 2023 at 9:41 PM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Hi Jack,
> >> Thanks for the KIP. I have a few concerns about the idea.
> >>
> >> 1) I think that while a client-side partitioner seems like a neat idea
> >> and it’s an established part of Kafka, it’s one of the things which
> >> makes Kafka clients quite complicated. Just as KIP-848 is moving from
> >> client-side assignors to server-side assignors, I wonder whether really
> >> we should be looking to make partitioning a server-side capability too
> >> over time. So, I’m not convinced that making the Partitioner interface
> >> richer is moving in the right direction.
> >>
> >> 2) For records with a key, the partitioner usually calculates the
> >> partition from the key. This means that records with the same key end up
> >> on the same partition. Many applications expect this to give ordering.
> >> Log compaction expects this. There are situations in which records have
> >> to be repartitioned, such as sometimes happens with Kafka Streams. I
> >> think that a header-based partitioner for records which have keys is
> >> going to be surprising and only going to have limited applicability as a
> >> result.
> >>
> >> The tricky part about clever partitioning is that downstream systems
> >> have no idea how the partition
> >> number was arrived at, 

[jira] [Created] (KAFKA-15272) Fix the logic which finds candidate log segments to upload it to tiered storage

2023-07-29 Thread Kamal Chandraprakash (Jira)
Kamal Chandraprakash created KAFKA-15272:


    Summary: Fix the logic which finds candidate log segments to upload it to tiered storage
        Key: KAFKA-15272
        URL: https://issues.apache.org/jira/browse/KAFKA-15272
    Project: Kafka
 Issue Type: Task
   Reporter: Kamal Chandraprakash
   Assignee: Kamal Chandraprakash


In tiered storage, a segment becomes eligible for deletion from local disk once
it has been uploaded to remote storage.

If a topic's active segment contains some messages and no new messages arrive,
the active segment is rotated to a passive segment after the configured
{{log.roll.ms}} timeout.

The
[logic|https://github.com/apache/kafka/blob/trunk/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L553]
that finds candidate segments in RemoteLogManager does not treat the recently
rotated passive segment as eligible for upload to remote storage, so that
segment is never removed, even after it breaches retention time/size. That is,
the topic won't become empty after it goes stale.
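
The intended behaviour can be illustrated with a minimal sketch, assuming a
simplified model in which segments are sorted by base offset and the last one is
the active segment. This is not the RemoteLogManager code; the class, the
Segment type, and the eligibility rule (a segment lying entirely below the last
stable offset) are hypothetical simplifications for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of candidate selection for remote upload.
final class CandidateSegments {

    // Illustrative segment descriptor; endOffset is inclusive.
    static final class Segment {
        final long baseOffset;
        final long endOffset;

        Segment(long baseOffset, long endOffset) {
            this.baseOffset = baseOffset;
            this.endOffset = endOffset;
        }
    }

    // Returns every non-active segment at or after fromOffset that lies fully
    // below lastStableOffset. The most recently rotated passive segment is
    // included, because only the final (active) segment is skipped.
    static List<Segment> candidates(List<Segment> sortedSegments,
                                    long fromOffset,
                                    long lastStableOffset) {
        List<Segment> result = new ArrayList<>();
        for (int i = 0; i < sortedSegments.size() - 1; i++) { // skip active segment
            Segment s = sortedSegments.get(i);
            if (s.baseOffset >= fromOffset && s.endOffset < lastStableOffset) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
                new Segment(0, 99),     // old passive segment
                new Segment(100, 199),  // recently rotated passive segment
                new Segment(200, 200)); // active segment
        System.out.println(candidates(segments, 0, 200).size());
    }
}
```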


