Tassapol Athiapinya created YARN-2297:
-----------------------------------------

             Summary: Preemption can hang in corner case by not allowing any task container to proceed.
                 Key: YARN-2297
                 URL: https://issues.apache.org/jira/browse/YARN-2297
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 2.5.0
            Reporter: Tassapol Athiapinya
            Priority: Critical


Preemption can cause a hang on a single-node cluster: only the AMs run, and no 
task container can run.

h3. queue configuration
Queues A and B have capacities of 1% and 99%, respectively.
No max capacity is configured.
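
To make the percentages concrete, here is a minimal sketch of the guaranteed memory each queue would get on the 4 GB node used in the scenario below. The property keys follow the usual capacity-scheduler.xml naming for queues directly under root; they are an assumption, since the issue does not show the actual config file, and 4 GB is taken as 4096 MB.

```python
# Assumed capacity-scheduler.xml keys for the two queues (not shown in the issue).
queue_capacity_pct = {
    "yarn.scheduler.capacity.root.A.capacity": 1.0,
    "yarn.scheduler.capacity.root.B.capacity": 99.0,
}

NODE_MEMORY_MB = 4096  # assumption: the single 4 GB NodeManager, taken as 4096 MB


def guaranteed_mb(capacity_pct, cluster_mb):
    """Guaranteed share of cluster memory for a queue, in MB."""
    return cluster_mb * capacity_pct / 100.0


# Capacities of sibling queues are expected to add up to 100%.
assert sum(queue_capacity_pct.values()) == 100.0

guarantees = {
    key: guaranteed_mb(pct, NODE_MEMORY_MB)
    for key, pct in queue_capacity_pct.items()
}
for key, mb in guarantees.items():
    print(f"{key}: {mb:.2f} MB guaranteed")
```

Under these assumptions queue A is guaranteed only about 41 MB, far less than one 2 GB container, so everything app 1 runs in queue A is above A's guaranteed capacity.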

h3. scenario
Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps, 
submitted by 1 user.
Submit app 1 to queue A. Its AM needs 2 GB, and it has 1 task that needs 2 GB, 
so app 1 occupies the entire cluster.
Submit app 2 to queue B. Its AM needs 2 GB, and it has 3 tasks that need 2 GB each.
Instead of app 1 being preempted entirely, the app 1 AM stays and the app 2 AM 
launches. The two AMs together hold all 4 GB, so no task of either app can 
proceed.
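
The memory accounting behind this scenario can be sketched as a toy model. This is not the CapacityScheduler's actual algorithm: the `Container` class and the helper functions are hypothetical, 2 GB is taken as 2048 MB, and the rule that preemption kills task containers but leaves the AM alone is an assumption that mirrors the behavior reported here.

```python
from dataclasses import dataclass

NODE_MB = 4096       # assumption: 4 GB NodeManager taken as 4096 MB
CONTAINER_MB = 2048  # assumption: every AM and task container is 2 GB


@dataclass
class Container:
    app: str
    is_am: bool
    mb: int = CONTAINER_MB


def free_mb(running):
    """Memory on the node not held by any running container."""
    return NODE_MB - sum(c.mb for c in running)


def try_launch(running, container):
    """Start the container only if it fits in the remaining memory."""
    if container.mb <= free_mb(running):
        running.append(container)
        return True
    return False


def preempt_tasks(running, app):
    """Toy preemption: kill the app's task containers but leave its AM alone."""
    victims = [c for c in running if c.app == app and not c.is_am]
    for c in victims:
        running.remove(c)
    return len(victims)


running = []
# App 1 (queue A) fills the node with its AM and its single task.
assert try_launch(running, Container("app1", is_am=True))
assert try_launch(running, Container("app1", is_am=False))

# App 2 (queue B) arrives: app 1's task is preempted and app 2's AM starts.
preempt_tasks(running, "app1")
assert try_launch(running, Container("app2", is_am=True))

# Neither app can start a task container any more.
app1_task_ok = try_launch(running, Container("app1", is_am=False))
app2_task_ok = try_launch(running, Container("app2", is_am=False))
print([(c.app, "AM" if c.is_am else "task") for c in running])
print("free MB:", free_mb(running),
      "app1 task:", app1_task_ok, "app2 task:", app2_task_ok)
```

In this model the node ends up holding exactly two AM containers with no free memory, which is the stuck state described above.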

h3. commands
/usr/lib/hadoop/bin/hadoop jar \
  /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter \
  "-Dmapreduce.map.memory.mb=2000" \
  "-Dyarn.app.mapreduce.am.command-opts=-Xmx1800M" \
  "-Dmapreduce.randomtextwriter.bytespermap=2147483648" \
  "-Dmapreduce.job.queuename=A" \
  "-Dmapreduce.map.maxattempts=100" \
  "-Dmapreduce.am.max-attempts=1" \
  "-Dyarn.app.mapreduce.am.resource.mb=2000" \
  "-Dmapreduce.map.java.opts=-Xmx1800M" \
  "-Dmapreduce.randomtextwriter.mapsperhost=1" \
  "-Dmapreduce.randomtextwriter.totalbytes=2147483648" \
  dir1

/usr/lib/hadoop/bin/hadoop jar \
  /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep \
  "-Dmapreduce.map.memory.mb=2000" \
  "-Dyarn.app.mapreduce.am.command-opts=-Xmx1800M" \
  "-Dmapreduce.job.queuename=B" \
  "-Dmapreduce.map.maxattempts=100" \
  "-Dmapreduce.am.max-attempts=1" \
  "-Dyarn.app.mapreduce.am.resource.mb=2000" \
  "-Dmapreduce.map.java.opts=-Xmx1800M" \
  -m 1 -r 0 -mt 4000 -rt 0




