[jira] [Commented] (MESOS-828) CgroupsIsolator BalloonFramework Test is broken.

Vinod Kone (JIRA) Wed, 20 Nov 2013 15:24:09 -0800

    [ https://issues.apache.org/jira/browse/MESOS-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828277#comment-13828277 ]


Vinod Kone commented on MESOS-828:
----------------------------------

https://reviews.apache.org/r/15741/

> CgroupsIsolator BalloonFramework Test is broken.
> ------------------------------------------------
>
>                 Key: MESOS-828
>                 URL: https://issues.apache.org/jira/browse/MESOS-828
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>            Reporter: Benjamin Mahler
>            Assignee: Vinod Kone
>            Priority: Blocker
>
> This was broken by the following commit:
> commit 82a0d329e112cc67e2916bda6e44cb00fe1a1236
> Author: Vinod Kone <[email protected]>
> Date:   Wed Nov 20 14:32:35 2013 -0800
>     Fixed local cluster slaves to use temporary work directory.
> [ RUN      ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> Using temporary directory '/tmp/CgroupsIsolatorTest_ROOT_CGROUPS_BalloonFramework_KGmMtN'
> Launched master at 13630
> I1120 22:55:24.342715 13630 main.cpp:126] Build: 2013-10-29 19:20:17 by root
> I1120 22:55:24.342897 13630 main.cpp:127] Starting Mesos master
> I1120 22:55:24.343392 13673 master.cpp:285] Master started on 127.0.0.1:5432
> I1120 22:55:24.343453 13673 master.cpp:299] Master ID: 201311202255-16777343-5432-13630
> I1120 22:55:24.343464 13673 master.cpp:302] Master only allowing authenticated frameworks to register!
> I1120 22:55:24.345286 13673 master.cpp:744] The newly elected leader is [email protected]:5432
> I1120 22:55:24.345309 13673 master.cpp:748] Elected as the leading master!
> Launched slave at 15663
> Duplicate flag 'work_dir' on command line
> Usage: lt-mesos-slave [...]
> Supported options:
>   --attributes=VALUE                         Attributes of machine
>   --[no-]cgroups_enable_cfs                  Cgroups feature flag to enable hard limits on CPU resources
>                                              via the CFS bandwidth limiting subfeature.
>                                              (default: false)
>   --cgroups_hierarchy=VALUE                  The path to the cgroups hierarchy root
>                                              (default: /cgroup)
>   --cgroups_root=VALUE                       Name of the root cgroup
>                                              (default: mesos)
>   --cgroups_subsystems=VALUE                 List of subsystems to enable (e.g., 'cpu,freezer')
>                                              (default: cpu,memory,freezer)
>   --[no-]checkpoint                          Whether to checkpoint slave and frameworks information
>                                              to disk. This enables a restarted slave to recover
>                                              status updates and reconnect with (--recover=reconnect) or
>                                              kill (--recover=kill) old executors (default: true)
>   --default_role=VALUE                       Any resources in the --resources flag that
>                                              omit a role, as well as any resources that
>                                              are not present in --resources but that are
>                                              automatically detected, will be assigned to
>                                              this role. (default: *)
>   --disk_watch_interval=VALUE                Periodic time interval (e.g., 10secs, 2mins, etc)
>                                              to check the disk usage (default: 1mins)
>   --executor_registration_timeout=VALUE      Amount of time to wait for an executor
>                                              to register with the slave before considering it hung and
>                                              shutting it down (e.g., 60secs, 3mins, etc) (default: 1mins)
>   --executor_shutdown_grace_period=VALUE     Amount of time to wait for an executor
>                                              to shut down (e.g., 60secs, 3mins, etc) (default: 5secs)
>   --frameworks_home=VALUE                    Directory prepended to relative executor URIs (default: )
>   --gc_delay=VALUE                           Maximum amount of time to wait before cleaning up
>                                              executor directories (e.g., 3days, 2weeks, etc).
>                                              Note that this delay may be shorter depending on
>                                              the available disk usage. (default: 1weeks)
>   --hadoop_home=VALUE                        Where to find Hadoop installed (for
>                                              fetching framework executors from HDFS)
>                                              (no default, look for HADOOP_HOME in
>                                              environment or find hadoop on PATH) (default: )
>   --[no-]help                                Prints this help message (default: false)
>   --hostname=VALUE                           The hostname the slave should report.
>                                              If left unset, system hostname will be used (recommended).
>   --ip=VALUE                                 IP address to listen on
>   --isolation=VALUE                          Isolation mechanism, may be one of: process, cgroups (default: process)
>   --launcher_dir=VALUE                       Location of Mesos binaries (default: /usr/local/libexec/mesos)
>   --log_dir=VALUE                            Location to put log files (no default, nothing
>                                              is written to disk unless specified;
>                                              does not affect logging to stderr)
>   --logbufsecs=VALUE                         How many seconds to buffer log messages for (default: 0)
>   --master=VALUE                             May be one of:
>                                                zk://host1:port1,host2:port2,.../path
>                                                zk://username:password@host1:port1,host2:port2,.../path
>                                                file://path/to/file (where file contains one of the above)
>   --port=VALUE                               Port to listen on (default: 5051)
>   --[no-]quiet                               Disable logging to stderr (default: false)
>   --recover=VALUE                            Whether to recover status updates and reconnect with old executors.
>                                              Valid values for 'recover' are
>                                              reconnect: Reconnect with any old live executors.
>                                              cleanup  : Kill any old live executors and exit.
>                                                         Use this option when doing an incompatible slave
>                                                         or executor upgrade!).
>                                              NOTE: If checkpointed slave doesn't exist, no recovery is performed
>                                                    and the slave registers with the master as a new slave. (default: reconnect)
>   --recovery_timeout=VALUE                   Amount of time alloted for the slave to recover. If the slave takes
>                                              longer than recovery_timeout to recover, any executors that are
>                                              waiting to reconnect to the slave will self-terminate.
>                                              NOTE: This flag is only applicable when checkpoint is enabled.
>                                              (default: 15mins)
>   --resource_monitoring_interval=VALUE       Periodic time interval for monitoring executor
>                                              resource usage (e.g., 10secs, 1min, etc) (default: 1secs)
>   --resources=VALUE                          Total consumable resources per slave, in
>                                              the form 'name(role):value;name(role):value...'.
>   --[no-]strict                              If strict=true, any and all recovery errors are considered fatal.
>                                              If strict=false, any expected errors (e.g., slave cannot recover
>                                              information about an executor, because the slave died right before
>                                              the executor registered.) during recovery are ignored and as much
>                                              state as possible is recovered.
>                                              (default: true)
>   --[no-]switch_user                         Whether to run tasks as the user who
>                                              submitted them rather than the user running
>                                              the slave (requires setuid permission) (default: true)
>   --work_dir=VALUE                           Where to place framework work directories
>                                              (default: /tmp/mesos)
> Slave crashed; failing test
> W1120 22:55:24.345309 13630 logging.cpp:58] RAW: Received signal SIGTERM; exiting.
> /home/bmahler/git/mesos/src/tests/balloon_framework_test.sh: line 38: kill: (15663) - No such process
> find: /tmp/mesos_test_cgroup: No such file or directory
> /home/bmahler/git/mesos/src/tests/balloon_framework_test.sh: line 30: 13630 Terminated              ${MASTER} --ip=127.0.0.1 --port=5432
> rmdir: /tmp/mesos_test_cgroup/mesos_test: No such file or directory
> ../../src/tests/script.cpp:78: Failure
> Failed
> balloon_framework_test.sh exited with status 2
> [  FAILED  ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework (4072 ms)
> [----------] 1 test from CgroupsIsolatorTest (4073 ms total)
> [----------] Global test environment tear-down
> [==========] 1 test from 1 test case ran. (4073 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
>  1 FAILED TEST
>   YOU HAVE 2 DISABLED TESTS



--
This message was sent by Atlassian JIRA
(v6.1#6144)
