Matthias, That's a good idea. I'm not sure why I didn't...
Thanks, -John On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <matth...@confluent.io> wrote: > Ok. > > @John: can you create a JIRA to track this? I think KAFKA-4730 is > related, but actually an own ticket (that is blocked by not having > Materialized for stream-stream joins). > > > -Matthias > > On 6/25/18 2:10 PM, Bill Bejeck wrote: > > I agree that it makes sense to have segmentInterval as a parameter to a > > store, but I also agree with Guozhang's point about not moving as part of > > this KIP. > > > > Thanks, > > Bill > > > > On Mon, Jun 25, 2018 at 4:17 PM John Roesler <j...@confluent.io> wrote: > > > >> Thanks Matthias and Guozhang, > >> > >> About deprecating the "segments" field instead of making it private. > Yes, I > >> just took another look at the code, and that is correct. I'll update the > >> KIP. > >> > >> I do agree that in the long run, it makes more sense as a parameter to > the > >> store somehow than as a parameter to the window. I think this isn't a > super > >> high priority, though, because it's not exposed in the DSL (or it wasn't > >> intended to be). > >> > >> I felt Guozhang's point is valid, and that we should probably revisit it > >> later, possibly in the scope of > >> https://issues.apache.org/jira/browse/KAFKA-4730 > >> > >> I'll wait an hour or so for more feedback before moving on to a vote. > >> > >> Thanks again, > >> -John > >> > >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> > wrote: > >> > >>> Re `segmentInterval` parameter in Windows: currently it is used in two > >>> places, the windowed stream aggregation, and the stream-stream joins. > For > >>> the former, we can potentially move the parameter from windowedBy() to > >>> Materialized, but for the latter we currently do not expose a > >> Materialized > >>> object yet, only the Windows spec. So I think in this KIP we probably > >>> cannot move it immediately. > >>> > >>> But in future KIPs if we decide to expose the stream-stream join's > store > >> / > >>> changelog / repartition topic names, we may well adding the > Materialized > >>> object into the operator, and we can then move the parameter to > >>> Materialized by then. > >>> > >>> > >>> Guozhang > >>> > >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax < > matth...@confluent.io> > >>> wrote: > >>> > >>>> Thanks for the KIP. Overall, I think it makes sense to clean up the > >> API. > >>>> > >>>> Couple of comments: > >>>> > >>>>> Sadly there's no way to "deprecate" this > >>>>> exposure > >>>> > >>>> I disagree. We can just mark the variable as deprecated and I advocate > >>>> to do this. When the deprecation period passed, we can make it private > >>>> (or actually remove it; cf. my next comment). > >>>> > >>>> > >>>> Parameter, `segmentInterval` is semantically not a "window" > >>>> specification parameter but an implementation detail and thus a store > >>>> parameter. Would it be better to add it to `Materialized`? > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> On 6/22/18 5:13 PM, Guozhang Wang wrote: > >>>>> Thanks John. > >>>>> > >>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io> > >>> wrote: > >>>>> > >>>>>> Thanks for the feedback, Bill and Guozhang, > >>>>>> > >>>>>> I've updated the KIP accordingly. > >>>>>> > >>>>>> Thanks, > >>>>>> -John > >>>>>> > >>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com> > >>>> wrote: > >>>>>> > >>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on > >> the > >>>>>> wiki: > >>>>>>> the `In Windows, we will:` section code snippet is empty. > >>>>>>> > >>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com> > >>>> wrote: > >>>>>>> > >>>>>>>> Hi John, > >>>>>>>> > >>>>>>>> Thanks for the KIP, and overall it's a +1 for me. > >>>>>>>> > >>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention > >> of > >>>>>> the > >>>>>>>> number of segments there can be at any one time. While it's > >> implied > >>>>>> that > >>>>>>>> the number of segments is potentially unbounded, would be better > >> to > >>>>>>>> explicitly state that the previous limit on the number of segments > >>> is > >>>>>>> going > >>>>>>>> to be removed as well? > >>>>>>>> > >>>>>>>> I have a couple of nit comments. The method name is still > >>>> segmentSize > >>>>>>> in > >>>>>>>> the code block vs segmentInterval and the order of the parameters > >>> for > >>>>>> the > >>>>>>>> third persistentWindowStore don't match the order in the JavaDoc. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Bill > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io> > >>>>>> wrote: > >>>>>>>> > >>>>>>>>> I've updated the KIP and draft PR accordingly. > >>>>>>>>> > >>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io> > >>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Interesting... I did not initially consider it because I didn't > >>>>>> want > >>>>>>> to > >>>>>>>>>> have an impact on anyone's Streams apps, but now I see that > >> unless > >>>>>>>>>> developers have subclassed `Windows`, the number of segments > >> would > >>>>>>>> always > >>>>>>>>>> be 3! > >>>>>>>>>> > >>>>>>>>>> There's one caveat to this, which I think was a mistake. The > >> field > >>>>>>>>>> `segments` in Windows is public, which means that anyone can > >>>>>> actually > >>>>>>>> set > >>>>>>>>>> it directly on any Window instance like: > >>>>>>>>>> > >>>>>>>>>> TimeWindows tw = TimeWindows.of(100); > >>>>>>>>>> tw.segments = 12345; > >>>>>>>>>> > >>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in > >>> Windows > >>>>>>>> that > >>>>>>>>>> says users can't directly set it. Sadly there's no way to > >>>>>> "deprecate" > >>>>>>>>> this > >>>>>>>>>> exposure, so I propose just to make it private. > >>>>>>>>>> > >>>>>>>>>> With this new knowledge, I agree, I think we can switch to > >>>>>>>>>> "segmentInterval" throughout the interface. > >>>>>>>>>> > >>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang < > >> wangg...@gmail.com > >>>> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hello John, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for the KIP. > >>>>>>>>>>> > >>>>>>>>>>> Should we consider making the change on > >>>>>>> `Stores#persistentWindowStore` > >>>>>>>>>>> parameters as well? > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Guozhang > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler < > >> j...@confluent.io > >>>> > >>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi Ted, > >>>>>>>>>>>> > >>>>>>>>>>>> Ah, when you made that comment to me before, I thought you > >> meant > >>>>>>> as > >>>>>>>>>>> opposed > >>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed to > >>>>>>>>>>>> "segmentSize". > >>>>>>>>>>>> > >>>>>>>>>>>> I named it that way to match the peer method "windowSize", > >> which > >>>>>>> is > >>>>>>>>>>> also a > >>>>>>>>>>>> quantity of milliseconds. > >>>>>>>>>>>> > >>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I favor > >>>>>>>>>>> consistency > >>>>>>>>>>>> in this case. Does that seem reasonable? > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> -John > >>>>>>>>>>>> > >>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com> > >>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Normally size is not measured in time unit, such as > >>>>>>> milliseconds. > >>>>>>>>>>>>> How about naming the new method segmentInterval ? > >>>>>>>>>>>>> Thanks > >>>>>>>>>>>>> -------- Original message --------From: John Roesler < > >>>>>>>>>>> j...@confluent.io> > >>>>>>>>>>>>> Date: 6/20/18 10:45 AM (GMT-08:00) To: > >> dev@kafka.apache.org > >>>>>>>>>>> Subject: > >>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in > >>>>>>>>>>>>> WindowBytesStoreSupplier > >>>>>>>>>>>>> Hello All, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in > >>>>>>>>>>> KAFKA-7080. > >>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the > >>>>>> *number > >>>>>>>> of > >>>>>>>>>>>>> segments* instead of the *segment size*. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Here's the jira: > >>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080 > >>>>>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ > >>>>>>>>>>>>> > >>>>>>>>>>>>> additionally, here's a draft PR for clarity: > >>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257 > >>>>>>>>>>>>> > >>>>>>>>>>>>> Please let me know what you think! > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> -John > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> -- Guozhang > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> -- Guozhang > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> > >>>> > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> > > > >