[
https://issues.apache.org/jira/browse/SAMZA-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shanthoosh Venkataraman updated SAMZA-2161:
-------------------------------------------
Description:
Currently, the metadata of a Samza job is stored in a Kafka topic called the
coordinator stream.
In the samza-yarn ApplicationMaster, the same coordinator stream is read twice
as part of the startup sequence. This duplicate read unnecessarily prolongs the
startup time of the ApplicationMaster and delays container allocation. As a
result, input stream processing is delayed as well, by an amount that depends
on the size of the coordinator stream.
To mitigate this problem, the two utility classes in Samza, namely
`ChangelogPartitionManager` and `Config`, should be moved to use the
MetadataStore abstraction. This ticket tracks the work involved in the
migration.
> Move ChangelogPartitionManager and CoordinatorStream ConfigReader to
> MetadataStore
> ----------------------------------------------------------------------------------
>
> Key: SAMZA-2161
> URL: https://issues.apache.org/jira/browse/SAMZA-2161
> Project: Samza
> Issue Type: Improvement
> Reporter: Shanthoosh Venkataraman
> Assignee: Shanthoosh Venkataraman
> Priority: Major
>
> Currently, the metadata of a Samza job is stored in a Kafka topic called the
> coordinator stream.
> In the samza-yarn ApplicationMaster, the same coordinator stream is read
> twice as part of the startup sequence. This duplicate read unnecessarily
> prolongs the startup time of the ApplicationMaster and delays container
> allocation. As a result, input stream processing is delayed as well, by an
> amount that depends on the size of the coordinator stream.
> To mitigate this problem, the two utility classes in Samza, namely
> `ChangelogPartitionManager` and `Config`, should be moved to use the
> MetadataStore abstraction. This ticket tracks the work involved in the
> migration.
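
The idea behind the migration can be illustrated with a minimal sketch. It is
not Samza code: `FakeCoordinatorStream`, `SimpleMetadataStore`, `CachedStore`,
the two reader methods, and the `config:` / `changelog:` key prefixes are all
hypothetical names chosen for illustration. The sketch only shows how two
readers that share one store abstraction can be served from a single read of
the underlying stream instead of two.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class CoordinatorStreamSketch {

    /** Hypothetical stand-in for the coordinator stream; counts full reads. */
    static final class FakeCoordinatorStream {
        private final Map<String, String> messages = new LinkedHashMap<>();
        int fullReads = 0;

        void append(String key, String value) {
            messages.put(key, value);
        }

        /** Simulates the expensive scan of the whole topic. */
        Map<String, String> readAll() {
            fullReads++;
            return new LinkedHashMap<>(messages);
        }
    }

    /** Hypothetical store abstraction (an assumption, not Samza's interface). */
    interface SimpleMetadataStore {
        String get(String key);
        Map<String, String> all();
    }

    /** Reads the stream once at construction and serves later lookups from memory. */
    static final class CachedStore implements SimpleMetadataStore {
        private final Map<String, String> snapshot;

        CachedStore(FakeCoordinatorStream stream) {
            this.snapshot = stream.readAll();
        }

        @Override
        public String get(String key) {
            return snapshot.get(key);
        }

        @Override
        public Map<String, String> all() {
            return Collections.unmodifiableMap(snapshot);
        }
    }

    /** Config reader: takes only entries under the assumed "config:" prefix. */
    static Map<String, String> readConfig(SimpleMetadataStore store) {
        Map<String, String> config = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : store.all().entrySet()) {
            if (e.getKey().startsWith("config:")) {
                config.put(e.getKey().substring("config:".length()), e.getValue());
            }
        }
        return config;
    }

    /** Changelog reader: takes only entries under the assumed "changelog:" prefix. */
    static Map<String, Integer> readChangelogMapping(SimpleMetadataStore store) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : store.all().entrySet()) {
            if (e.getKey().startsWith("changelog:")) {
                mapping.put(e.getKey().substring("changelog:".length()),
                        Integer.valueOf(e.getValue()));
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        FakeCoordinatorStream stream = new FakeCoordinatorStream();
        stream.append("config:job.name", "example-job");
        stream.append("changelog:task-0", "0");

        // Both readers share one store, so the stream is scanned only once.
        SimpleMetadataStore store = new CachedStore(stream);
        System.out.println(readConfig(store));
        System.out.println(readChangelogMapping(store));
        System.out.println("fullReads=" + stream.fullReads);
    }
}
```

In this sketch the cost of the scan is paid once, in the `CachedStore`
constructor, no matter how many readers consume the store afterwards. Whether
the actual change achieves the single read in the same way is not stated in
this ticket.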