[ https://issues.apache.org/jira/browse/MESOS-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu resolved MESOS-2672.
---------------------------
    Resolution: Fixed

commit 7d2d5de9b9f6adbb94cf692576236424aeaf2f67
Author: Chi Zhang <[email protected]>
Date:   Fri May 8 10:54:14 2015 -0700

    Changed BalloonExecutor to do memset before mlock.
    
    mlock returns an error when the requested memory exceeds the limit,
    because it cannot find enough lockable memory, which defeats the
    purpose of triggering an OOM.
    
    Review: https://reviews.apache.org/r/33990
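
A minimal sketch of the reordering described in the commit message above, assuming a buffer that has already been allocated. The helper name fill_then_lock and its return convention are hypothetical and not taken from the Mesos source; unlike the executor, which aborts on failure, this sketch reports an mlock failure to the caller.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

// Hypothetical helper (not from the Mesos source): touch the pages
// first, then try to pin them. Returns 0 if the chunk was filled and
// locked, -1 if it was filled but mlock failed.
static int fill_then_lock(void *buffer, size_t chunk)
{
  // Writing to every byte forces the kernel to back the pages with
  // physical memory, so they are charged to the container. If the
  // memory limit is exceeded, this is where an OOM is expected.
  memset(buffer, 1, chunk);

  // Locking afterwards keeps the touched pages from being swapped
  // out. A failure here is reported instead of aborting.
  if (mlock(buffer, chunk) != 0) {
    perror("Failed to lock memory, mlock");
    return -1;
  }

  return 0;
}
```

With this order the memory is written before any lock is requested, so an over-limit request no longer fails early in mlock without the pages ever being touched.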

> ContainerizerTest.ROOT_CGROUPS_BalloonFramework flaky
> -----------------------------------------------------
>
>                 Key: MESOS-2672
>                 URL: https://issues.apache.org/jira/browse/MESOS-2672
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Chi Zhang
>            Assignee: Chi Zhang
>
> {noformat}
> I0429 00:58:35.267629  2086 slave.cpp:3210] Executor 'default' of framework 20150429-005830-16777343-5432-2023-0000 terminated with signal Aborted
> I0429 00:58:35.270761  2086 slave.cpp:2512] Handling status update TASK_LOST (UUID: f969e350-6f91-4fa9-980e-1852554bd704) for task 1 of framework 20150429-005830-16777343-5432-2023-0000 from @0.0.0.0:0
> I0429 00:58:35.270983  2086 slave.cpp:4604] Terminating task 1
> W0429 00:58:35.271574  2080 containerizer.cpp:903] Ignoring update for unknown container: 1298549a-a3d2-46ff-aad0-9dbc777affcc
> I0429 00:58:35.272541  2074 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: f969e350-6f91-4fa9-980e-1852554bd704) for task 1 of framework 20150429-005830-16777343-5432-2023-0000
> I0429 00:58:35.272624  2074 status_update_manager.cpp:494] Creating StatusUpdate stream for task 1 of framework 20150429-005830-16777343-5432-2023-0000
> I0429 00:58:35.273217  2053 master.cpp:3493] Executor default of framework 20150429-005830-16777343-5432-2023-0000 on slave 20150429-005830-16777343-5432-2023-S0 at slave(1)@10.35.12.124:5051 (smfd-aki-27-sr1.devel.twitter.com): terminated with signal Aborted
> {noformat}
> The abort comes from:
> {code}
>   // We use mlock and memset here to make sure that the memory
>   // actually gets paged in and thus accounted for.
>   if (mlock(buffer, chunk) != 0) {
>     perror("Failed to lock memory, mlock");
>     abort();
>   }
>
>   if (memset(buffer, 1, chunk) != buffer) {
>     perror("Failed to fill memory, memset");
>     abort();
>   }
> {code}
> This is the same as MESOS-2660: I've confirmed that swapping them fixed it.


