Many thanks for the feedback. Per Gwen's suggestion, I've updated the KIP to specify that the task count will be per-worker (no additional MBean tag, since each process is a worker) and per-connector (MBean tag).
On Mon, Jun 17, 2019 at 8:24 PM Cyrus Vafadari <[email protected]> wrote: > I meant to write: > I've also updated the KIP to clarify that every task must have exactly one > non-null *status* at all times. > > On Mon, Jun 17, 2019 at 6:55 PM Cyrus Vafadari <[email protected]> wrote: > >> Guozhang, >> >> Both of Kafka's implementations of "StatusBackingStore" immediately >> delete the task from the backign store when you try to set it to DESTROYED, >> so we'd actually expect it to always be zero. A nonzero number of destroyed >> tasks would either indicate a new implementation of StatusBackingStore, or >> a malfunctioning StatusBackingStore (e.g. caches out of sync with compacted >> topic). This metric will usually be uninteresting, and was only included >> for completeness. It could possibly catch a bug. >> >> Gwen, >> I had not considered this option. I agree there is an advantage to having >> more granular data about both connector and worker. The main disadvantage >> would be that it increases the number of metrics by a factor of >> num_workers, but I'd say this is an acceptable tradeoff. Another advantage >> of your suggestion is that the public interfaces for WorkerConnector would >> be unchanged, and the new metrics can be added within the Worker class. >> >> I've also updated the KIP to clarify that every task must have exactly >> one non-null task at all times. >> >> On Mon, Jun 17, 2019 at 1:41 PM Guozhang Wang <[email protected]> wrote: >> >>> Hello Cyrus, >>> >>> Thanks for the KIP. I just have one nit question about Connect destroyed >>> tasks: is it an ever-increasing number? If yes, the corresponding metric >>> value would be increasing indefinitely as well. Is that intentional? >>> >>> Otherwise, lgtm. >>> >>> >>> Guozhang >>> >>> On Mon, Jun 17, 2019 at 1:14 PM Gwen Shapira <[email protected]> wrote: >>> >>> > Sorry to join so late, but did we consider a single set of task-count >>> > metrics and using tags to scope each data point to a specific >>> > connector and worker (and in the future perhaps also user)? >>> > >>> > It will make analysis of the data easier - someone may want to >>> > breakdown tasks by both worker and connector to detect imbalanced >>> > assignments. >>> > >>> > Are there downsides to this approach? >>> > >>> > And a small nit: it will be good to capture in the KIP what are the >>> > expectations regarding overlap and disjointness of the proposed >>> > metrics. For example, is running+paused+failed = total? Can a task be >>> > failed and destroyed and therefore count in 2 of those metrics? >>> > >>> > On Thu, Jun 6, 2019 at 12:29 PM Cyrus Vafadari <[email protected]> >>> wrote: >>> > > >>> > > Konstantine, >>> > > >>> > > This is a good suggestion. Since the suggestion to add 2 additional >>> > > statuses analogous to the 3 proposed, it is a very minor change of no >>> > > structural consequence to the KIP. >>> > > >>> > > I've updated the KIP to incorporate your suggestion, and any voters >>> who >>> > > disagree should definitely respond in the thread. >>> > > >>> > > Cyrus >>> > > >>> > > On Thu, Jun 6, 2019 at 11:16 AM Konstantine Karantasis < >>> > > [email protected]> wrote: >>> > > >>> > > > Thanks Cyrus, >>> > > > >>> > > > this is a nice and straightforward addition. >>> > > > >>> > > > I'm +1 too, but I'd like to return with a question here as well >>> > regarding >>> > > > whether the unassigned tasks will be taken into account. >>> > > > Especially after KIP-415 we might start seeing this status for >>> specific >>> > > > periods of time. Therefore, I think it's a meaningful addition. >>> > > > Then there's the `destroyed` status which might be a lot more >>> > transient but >>> > > > we could also include for the sake of completion. >>> > > > Check org.apache.kafka.connect.runtime.AbstractStatus for the list >>> of >>> > all >>> > > > possible statuses. >>> > > > >>> > > > Konstantine >>> > > > >>> > > > On Wed, Jun 5, 2019 at 4:32 PM Randall Hauch <[email protected]> >>> wrote: >>> > > > >>> > > > > Thanks, Cyrus. >>> > > > > >>> > > > > +1 (binding) >>> > > > > >>> > > > > Randall Hauch >>> > > > > >>> > > > > On Wed, Jun 5, 2019 at 10:36 AM Andrew Schofield < >>> > > > > [email protected]> >>> > > > > wrote: >>> > > > > >>> > > > > > +1 (non-binding) >>> > > > > > >>> > > > > > Andrew Schofield >>> > > > > > >>> > > > > > On 05/06/2019, 14:04, "Ryanne Dolan" <[email protected]> >>> > wrote: >>> > > > > > >>> > > > > > +1 (non-binding) >>> > > > > > >>> > > > > > Thanks >>> > > > > > Ryanne >>> > > > > > >>> > > > > > On Tue, Jun 4, 2019, 11:29 PM Cyrus Vafadari < >>> > [email protected]> >>> > > > > > wrote: >>> > > > > > >>> > > > > > > Hi all, >>> > > > > > > >>> > > > > > > Like like to start voting in the following KIP: >>> > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > >>> https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-475%253A%2BNew%2BMetric%2Bto%2BMeasure%2BNumber%2Bof%2BTasks%2Bon%2Ba%2BConnector&data=02%7C01%7C%7C95f8a8ebb4a44882773808d6e9b65983%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636953366722392496&sdata=vbE%2BjrAapcQ68Vnwh5OkY1FFoOzFHs9rZRaPHlwqxSU%3D&reserved=0 >>> > > > > > > >>> > > > > > > Discussion thread: >>> > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > >>> https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fthread.html%2Fbf7c92224aa798336c14d7e96ec8f2e3406c61879ec381a50652acfe%40%253Cdev.kafka.apache.org%253E&data=02%7C01%7C%7C95f8a8ebb4a44882773808d6e9b65983%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636953366722402501&sdata=0JpQuCpTKwJyOjWH8cM%2B6eU%2FjNT28eE7xvMOBQgghjA%3D&reserved=0 >>> > > > > > > >>> > > > > > > Thanks! >>> > > > > > > >>> > > > > > > Cyrus >>> > > > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > >>> > >>> > >>> > -- >>> > Gwen Shapira >>> > Product Manager | Confluent >>> > 650.450.2760 | @gwenshap >>> > Follow us: Twitter | blog >>> > >>> >>> >>> -- >>> -- Guozhang >>> >>
