[ https://issues.apache.org/jira/browse/SAMZA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709715#comment-16709715 ]
ASF GitHub Bot commented on SAMZA-1835:
---------------------------------------

GitHub user shanthoosh opened a pull request:

    https://github.com/apache/samza/pull/844

    SAMZA-1835: Consolidate all processorId generation code.

Currently, the processorId creation function `createProcessorId()` is duplicated in three implementations of `JobCoordinator`: `ZkJobCoordinator`, `PassthroughJobCoordinator`, and `AzureJobCoordinator`. This duplication causes the following problems:

1. The `processorId` is passed to the `MetricsReporterFactory` through its factory method: `MetricsReporter getMetricsReporter(String name, String processorId, Config config);`. Custom `MetricsReporter` implementations currently use the processorId as a component of the generated metric names. Metrics reporters are instantiated from `LocalApplicationRunner`, and `processorId` is currently passed as null to `MetricsReporterFactory.getMetricsReporter`. This corrupts the generated metric names.
2. `ZkJobCoordinator`, `ZkUtils`, `ZkLeaderElector`, and other downstream components of `LocalApplicationRunner` currently instantiate and manage their own private reporters rather than sharing the common `MetricsRegistry` managed by `LocalApplicationRunner`. Because the reported metrics share no common namespace or reporter, building metrics dashboards for standalone is cumbersome.

This PR comprises the following changes:

1. Moved the processorId generation to `LocalApplicationRunner`, which injects the generated identifier into all downstream layers.
2. Deprecated the `getProcessorId` API in the `JobCoordinator` interface.
3. Added the `processorId` and `metricsRegistry` arguments to the `getJobCoordinator` method of `JobCoordinatorFactory`.
4. Fixed the existing unit tests and added unit tests for `LocalApplicationRunner.createProcessorId`.
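The idea behind the changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the pull request: `ProcessorIdSketch`, `Reporter`, `Coordinator`, `newReporter`, and the `processor-<uuid>` identifier format are all hypothetical stand-ins, and none of them are Samza's actual classes or signatures. The sketch shows one place generating the identifier and handing it to both the reporter and the coordinator, and what a metric name looks like when null is passed instead.

```java
import java.util.UUID;

public class ProcessorIdSketch {

    /** Hypothetical stand-in for a metrics reporter; not Samza's MetricsReporter. */
    public interface Reporter {
        String metricName(String metric);
    }

    /** Hypothetical stand-in for a coordinator that receives its id instead of creating it. */
    public static final class Coordinator {
        private final String processorId;

        public Coordinator(String processorId) {
            this.processorId = processorId;
        }

        public String processorId() {
            return processorId;
        }
    }

    /** The single place where the identifier is created (the format is an assumption). */
    public static String createProcessorId() {
        return "processor-" + UUID.randomUUID();
    }

    /** Builds a reporter that uses the processorId as a component of each metric name. */
    public static Reporter newReporter(String name, String processorId) {
        return metric -> name + "." + processorId + "." + metric;
    }

    public static void main(String[] args) {
        // Generate the id once, then inject it into every downstream component.
        String processorId = createProcessorId();
        Reporter reporter = newReporter("jmx", processorId);
        Coordinator coordinator = new Coordinator(processorId);

        System.out.println(reporter.metricName("heartbeat"));
        System.out.println(coordinator.processorId());

        // Passing null, as described in problem 1, yields a name containing "null".
        System.out.println(newReporter("jmx", null).metricName("heartbeat"));
    }
}
```

Because the reporter and the coordinator receive the same value, metric names and coordination state refer to the same processor, and no component has to generate an identifier of its own.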
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shanthoosh/samza SAMZA-1835

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #844

----
commit 6afe2b27c595b2870cc979f2f48c0af47b0fde84
Author: Shanthoosh Venkataraman <spvenkat@...>
Date:   2018-11-30T20:11:26Z

    SAMZA-1835: Consolidate all processorId generation code.

----

> Consolidate all processorId generation code to StreamProcessor
> --------------------------------------------------------------
>
>                 Key: SAMZA-1835
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1835
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Yi Pan (Data Infrastructure)
>            Assignee: Sanil Jain
>            Priority: Major
>             Fix For: 1.0
>
>
> Currently, the processorId creation function createProcessorId() is repeated
> in three different implementations of JobCoordinator: ZkJobCoordinator,
> PassthroughJobCoordinator, and AzureJobCoordinator.
> Making the processorId generation dependent on JobCoordinator is also no
> longer required: each processor should know its durable processorId at
> startup time, independent of the creation of the JobCoordinator. Hence,
> consolidating the createProcessorId() code into StreamProcessor and passing
> the processorId to the JobCoordinator constructor should be the right thing
> to do.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)