shanthoosh opened a new pull request #987: SAMZA-2158: Remove the redundant 
coordinator stream reads in the ApplicationMaster startup sequence.
URL: https://github.com/apache/samza/pull/987
 
 
   **Changes:**
   Currently, the input topic partitions assigned to each container of a Samza 
job are stored in the coordinator stream (a Kafka topic).
   
   In the samza-yarn ApplicationMaster startup sequence, the JobModel from the 
previous run of the Samza job is read from the coordinator stream.
   
   The JobModel is read three times from the same Kafka topic, each time over a 
different connection. These redundant reads prolong the launch of containers 
by the samza-yarn ApplicationMaster. This fix removes the inefficiency by 
reading the coordinator stream only once, over a single connection.
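
   The read-once idea can be illustrated with a minimal sketch. This is not the code in this patch and does not use Samza's actual classes: `CountingCoordinatorStream` and `ReadOnce` are hypothetical names introduced only for illustration, and the partition assignments are made-up sample data. It assumes the JobModel can be treated as a value that is loaded once and then shared by every caller in the startup sequence.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical stand-in for a coordinator stream reader (NOT Samza's API).
 * Each call to readAll() simulates opening a new connection and scanning the topic.
 */
final class CountingCoordinatorStream {
    private int connectionsOpened = 0;

    Map<String, String> readAll() {
        connectionsOpened++;
        // Made-up container -> partition assignments, for illustration only.
        Map<String, String> assignments = new HashMap<>();
        assignments.put("container-0", "input-topic:0,input-topic:1");
        assignments.put("container-1", "input-topic:2,input-topic:3");
        return assignments;
    }

    int connectionsOpened() {
        return connectionsOpened;
    }
}

/**
 * Hypothetical helper: invokes the loader on the first get() and returns the
 * cached result on every later call, so the underlying read happens only once.
 */
final class ReadOnce<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded = false;

    ReadOnce(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

   With a wrapper like this, three callers that each ask for the model through `new ReadOnce<>(stream::readAll)` would trigger a single underlying read instead of three.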
   
   Please note that the problems described above only slowed down 
ApplicationMaster startup; they did not break functional correctness.
   
   **Note:**
   * In addition to the above problem, KafkaSystemAdmin is created multiple 
times for the same topic/system in the ApplicationMaster/Container. That will 
be fixed in SAMZA-2157.
   * Tested this patch with a test samza-yarn job inside LinkedIn and also with 
a Beam job.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services