[
https://issues.apache.org/jira/browse/SAMZA-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shanthoosh Venkataraman closed SAMZA-2284.
------------------------------------------
> Remove redundant stream metadata API invocations in SamzaContainer startup
> sequence.
> ------------------------------------------------------------------------------------
>
> Key: SAMZA-2284
> URL: https://issues.apache.org/jira/browse/SAMZA-2284
> Project: Samza
> Issue Type: Improvement
> Reporter: Shanthoosh Venkataraman
> Assignee: Shanthoosh Venkataraman
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
>
> The SamzaContainer startup sequence fetches the metadata of the same input
> streams multiple times. Fetching the metadata of a stream entails making a
> remote call to the underlying messaging broker and is very expensive. These
> redundant fetch-input-stream-metadata API invocations significantly delay
> the start of actual message processing by the Samza job.
> Impact:
> 1. With some Samza jobs at LinkedIn, we observed that this
> fetch-input-stream-metadata loop took around 1.5 hours to complete.
> 2. The redundant fetch-input-stream-metadata remote API calls significantly
> increase the load on the underlying messaging broker.
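
One way to avoid repeated remote lookups of this kind is to memoize the result per stream, so each stream's metadata is fetched from the broker at most once during startup. The following is only a minimal sketch of that idea and not the change made for this issue: the class name MemoizingMetadataFetcher, the remoteFetch function, and the use of plain stream names as keys are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Samza): remembers the metadata returned
 * for each stream so the expensive remote call runs at most once per stream.
 */
final class MemoizingMetadataFetcher<M> {
    // Assumed to wrap the remote call to the messaging broker.
    private final Function<String, M> remoteFetch;
    private final Map<String, M> cache = new ConcurrentHashMap<>();

    MemoizingMetadataFetcher(Function<String, M> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Returns cached metadata, invoking remoteFetch only on the first request. */
    M get(String streamName) {
        // computeIfAbsent stores nothing when remoteFetch returns null,
        // so a null result is retried on the next call.
        return cache.computeIfAbsent(streamName, remoteFetch);
    }
}
```

The sketch never expires entries, which is only reasonable if stream metadata is treated as fixed for the duration of startup; a longer-lived cache would need some form of invalidation.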