[ 
https://issues.apache.org/jira/browse/SAMZA-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038838#comment-15038838
 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-829:
----------------------------------------------------

{noformat}
diff --git a/docs/learn/documentation/versioned/yarn/yarn-host-affinity.md b/docs/learn/documentation/versioned/yarn/yarn-host-affinity.md
index 108dfbc..1d9c29e 100644
--- a/docs/learn/documentation/versioned/yarn/yarn-host-affinity.md
+++ b/docs/learn/documentation/versioned/yarn/yarn-host-affinity.md
@@ -96,7 +96,13 @@ export LOGGED_STORE_BASE_DIR=<path-for-state-stores>
     <value>1000*</value> <!-- Should be tuned per requirement -->
 </property>
 {% endhighlight %}
-
+3. Configure Yarn Node Manager SIGTERM to SIGKILL timeout to be reasonable time s.t. Node Manager will give Samza Container enough time to perform a clean shutdown in yarn-site.xml {% highlight xml %}
+<property>
+    <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
+    <description>No. of ms to wait between sending a SIGTERM and SIGKILL to a container</description>
+    <value>600000</value> <!-- Set it to 10min to allow enough time for clean shutdown of containers -->
+</property>
+{% endhighlight %}
 
 ## Configuring a Samza job to use Host Affinity
 Any stateful Samza job can leverage this feature to reduce the Mean Time To Restore (MTTR) of it's state stores by setting <code>yarn.samza.host-affinity</code> to true.
{noformat}
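
To sanity-check a node's configuration after applying the change, a small script along these lines could read the effective delay out of yarn-site.xml. This is only a sketch: the function name `sigkill_delay_ms` and the sample XML are made up for illustration, and the 250 ms fallback is an assumption based on the stock yarn-default.xml value, so verify it against your Hadoop version.

```python
import xml.etree.ElementTree as ET

PROPERTY = "yarn.nodemanager.sleep-delay-before-sigkill.ms"
# Assumption: value shipped in stock yarn-default.xml; check your Hadoop version.
ASSUMED_DEFAULT_MS = 250


def sigkill_delay_ms(yarn_site_xml: str) -> int:
    """Return the SIGTERM-to-SIGKILL delay configured in the given XML text."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == PROPERTY:
            return int(prop.findtext("value", "").strip())
    # Property not set explicitly: fall back to the assumed default.
    return ASSUMED_DEFAULT_MS


# Hypothetical yarn-site.xml content mirroring the property in the patch above.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
    <value>600000</value>
  </property>
</configuration>"""

print(sigkill_delay_ms(SAMPLE))  # 600000
```

A value near the assumed default would suggest that containers with large state stores are likely to be killed before they finish shutting down.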

> Add documentation about clean shutdown configuration for host-affinity
> ----------------------------------------------------------------------
>
>                 Key: SAMZA-829
>                 URL: https://issues.apache.org/jira/browse/SAMZA-829
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Yi Pan (Data Infrastructure)
>            Assignee: Yi Pan (Data Infrastructure)
>             Fix For: 0.10.0
>
>
> The default YARN SIGTERM-to-SIGKILL timeout is too short for a clean shutdown 
> of a job with large state. We need to document the additional configuration 
> to tune for a clean shutdown of containers, which is important for the 
> host-affinity feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
