[ 
https://issues.apache.org/jira/browse/MESOS-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518301#comment-14518301
 ] 

Chi Zhang commented on MESOS-2660:
----------------------------------

"Resource temporarily unavailable" is EAGAIN, which, after some digging, 
turns out to be ENOMEM inside the kernel:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a477097d9c37c1cf289c7f0257dffcfa42d50197

{code}
        ret = get_user_pages(current, current->mm, addr,
                        len, write, 0, NULL, NULL);
-       if (ret < 0)
+       if (ret < 0) {
+               /*
+                  SUS require strange return value to mlock
+                   - invalid addr generate to ENOMEM.
+                   - out of memory should generate EAGAIN.
+               */
+               if (ret == -EFAULT)
+                       ret = -ENOMEM;
+               else if (ret == -ENOMEM)
+                       ret = -EAGAIN;
                return ret;
-       return ret == len ? 0 : -1;
+       }
+       return ret == len ? 0 : -ENOMEM;
 }
{code}

So my theory is that mlock makes sure enough lockable memory is present 
before locking, but since the test requests more than the limit in order to 
trigger the OOM, that validation fails with EAGAIN (ENOMEM internally). 
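
As a rough illustration of the mapping in the diff above, here is a minimal 
sketch of how a caller might interpret the errno that mlock reports. The 
helper name describe_mlock_errno is hypothetical (it is not part of Mesos or 
the kernel), and the descriptions only restate the comment in the quoted 
commit:

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: describe an errno returned by mlock(), following
 * the translation shown in the quoted kernel diff.
 */
static const char *describe_mlock_errno(int err)
{
    switch (err) {
    case EAGAIN:
        /* Kernel-internal -ENOMEM is reported to user space as EAGAIN. */
        return "out of memory while faulting pages in";
    case ENOMEM:
        /* Kernel-internal -EFAULT, or fewer pages locked than requested. */
        return "invalid address range, or not all pages could be locked";
    default:
        /* Anything else: fall back to the standard message. */
        return strerror(err);
    }
}
```

With this reading, the "Resource temporarily unavailable" in the test log 
would correspond to the EAGAIN branch.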

Therefore I swapped the order in the code so that memset runs before mlock, 
and it worked. The order shouldn't make a difference when there is enough 
memory, since both calls succeed either way; when there isn't, memset triggers 
page faults and pushes the process into OOM, which is the expected outcome.
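
The reordering can be sketched roughly as follows. This is not the actual 
Mesos test helper; touch_then_lock and its parameters are hypothetical names 
used only to show memset running before mlock:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not the Mesos test code: allocate `size` bytes,
 * touch every page with memset, and only then call mlock.
 * Returns the buffer (caller unlocks and frees it), or NULL if the
 * allocation failed. *lock_errno is set to 0 if mlock succeeded,
 * otherwise to the errno it reported.
 */
static void *touch_then_lock(size_t size, int *lock_errno)
{
    void *buffer = malloc(size);
    if (buffer == NULL) {
        return NULL;
    }

    /* Step 1: write to the whole buffer so each page is faulted in.
     * If the memory limit is exceeded, the failure happens here. */
    memset(buffer, 1, size);

    /* Step 2: lock the pages that are now resident. */
    *lock_errno = (mlock(buffer, size) == 0) ? 0 : errno;

    return buffer;
}
```

A caller would check *lock_errno after the call, and call munlock on the 
buffer before freeing it if the lock succeeded.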

(not sure why this wasn't picked up when I wrote the test -_-) 

> ROOT_CGROUPS_Listen test is flaky
> ---------------------------------
>
>                 Key: MESOS-2660
>                 URL: https://issues.apache.org/jira/browse/MESOS-2660
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest
> [ RUN      ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen
> Failed to allocate RSS memory: Failed to lock memory, mlock: Resource 
> temporarily unavailable../../../mesos/src/tests/cgroups_tests.cpp:571: Failure
> Failed to wait 15secs for future
> [  FAILED  ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen (15121 
> ms)
> [----------] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest (15121 ms total)
> [----------] Global test environment tear-down
> [==========] 1 test from 1 test case ran. (15174 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
