[ https://issues.apache.org/jira/browse/MESOS-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828277#comment-13828277 ]
Vinod Kone commented on MESOS-828:
----------------------------------
https://reviews.apache.org/r/15741/
> CgroupsIsolator BalloonFramework Test is broken.
> ------------------------------------------------
>
> Key: MESOS-828
> URL: https://issues.apache.org/jira/browse/MESOS-828
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Benjamin Mahler
> Assignee: Vinod Kone
> Priority: Blocker
>
> This was broken by the following commit:
> commit 82a0d329e112cc67e2916bda6e44cb00fe1a1236
> Author: Vinod Kone <[email protected]>
> Date: Wed Nov 20 14:32:35 2013 -0800
> Fixed local cluster slaves to use temporary work directory.
> [ RUN ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> Using temporary directory '/tmp/CgroupsIsolatorTest_ROOT_CGROUPS_BalloonFramework_KGmMtN'
> Launched master at 13630
> I1120 22:55:24.342715 13630 main.cpp:126] Build: 2013-10-29 19:20:17 by root
> I1120 22:55:24.342897 13630 main.cpp:127] Starting Mesos master
> I1120 22:55:24.343392 13673 master.cpp:285] Master started on 127.0.0.1:5432
> I1120 22:55:24.343453 13673 master.cpp:299] Master ID: 201311202255-16777343-5432-13630
> I1120 22:55:24.343464 13673 master.cpp:302] Master only allowing authenticated frameworks to register!
> I1120 22:55:24.345286 13673 master.cpp:744] The newly elected leader is [email protected]:5432
> I1120 22:55:24.345309 13673 master.cpp:748] Elected as the leading master!
> Launched slave at 15663
> Duplicate flag 'work_dir' on command line
> Usage: lt-mesos-slave [...]
> Supported options:
> --attributes=VALUE
>     Attributes of machine
> --[no-]cgroups_enable_cfs
>     Cgroups feature flag to enable hard limits on CPU resources
>     via the CFS bandwidth limiting subfeature. (default: false)
> --cgroups_hierarchy=VALUE
>     The path to the cgroups hierarchy root (default: /cgroup)
> --cgroups_root=VALUE
>     Name of the root cgroup (default: mesos)
> --cgroups_subsystems=VALUE
>     List of subsystems to enable (e.g., 'cpu,freezer')
>     (default: cpu,memory,freezer)
> --[no-]checkpoint
>     Whether to checkpoint slave and frameworks information to disk.
>     This enables a restarted slave to recover status updates and
>     reconnect with (--recover=reconnect) or kill (--recover=kill)
>     old executors (default: true)
> --default_role=VALUE
>     Any resources in the --resources flag that omit a role, as well
>     as any resources that are not present in --resources but that are
>     automatically detected, will be assigned to this role. (default: *)
> --disk_watch_interval=VALUE
>     Periodic time interval (e.g., 10secs, 2mins, etc) to check the
>     disk usage (default: 1mins)
> --executor_registration_timeout=VALUE
>     Amount of time to wait for an executor to register with the slave
>     before considering it hung and shutting it down (e.g., 60secs,
>     3mins, etc) (default: 1mins)
> --executor_shutdown_grace_period=VALUE
>     Amount of time to wait for an executor to shut down (e.g., 60secs,
>     3mins, etc) (default: 5secs)
> --frameworks_home=VALUE
>     Directory prepended to relative executor URIs (default: )
> --gc_delay=VALUE
>     Maximum amount of time to wait before cleaning up executor
>     directories (e.g., 3days, 2weeks, etc). Note that this delay may
>     be shorter depending on the available disk usage. (default: 1weeks)
> --hadoop_home=VALUE
>     Where to find Hadoop installed (for fetching framework executors
>     from HDFS) (no default, look for HADOOP_HOME in environment or
>     find hadoop on PATH) (default: )
> --[no-]help
>     Prints this help message (default: false)
> --hostname=VALUE
>     The hostname the slave should report. If left unset, system
>     hostname will be used (recommended).
> --ip=VALUE
>     IP address to listen on
> --isolation=VALUE
>     Isolation mechanism, may be one of: process, cgroups
>     (default: process)
> --launcher_dir=VALUE
>     Location of Mesos binaries (default: /usr/local/libexec/mesos)
> --log_dir=VALUE
>     Location to put log files (no default, nothing is written to disk
>     unless specified; does not affect logging to stderr)
> --logbufsecs=VALUE
>     How many seconds to buffer log messages for (default: 0)
> --master=VALUE
>     May be one of:
>       zk://host1:port1,host2:port2,.../path
>       zk://username:password@host1:port1,host2:port2,.../path
>       file://path/to/file (where file contains one of the above)
> --port=VALUE
>     Port to listen on (default: 5051)
> --[no-]quiet
>     Disable logging to stderr (default: false)
> --recover=VALUE
>     Whether to recover status updates and reconnect with old executors.
>     Valid values for 'recover' are
>       reconnect: Reconnect with any old live executors.
>       cleanup  : Kill any old live executors and exit.
>                  Use this option when doing an incompatible slave
>                  or executor upgrade!).
>     NOTE: If checkpointed slave doesn't exist, no recovery is performed
>     and the slave registers with the master as a new slave.
>     (default: reconnect)
> --recovery_timeout=VALUE
>     Amount of time alloted for the slave to recover. If the slave takes
>     longer than recovery_timeout to recover, any executors that are
>     waiting to reconnect to the slave will self-terminate.
>     NOTE: This flag is only applicable when checkpoint is enabled.
>     (default: 15mins)
> --resource_monitoring_interval=VALUE
>     Periodic time interval for monitoring executor resource usage
>     (e.g., 10secs, 1min, etc) (default: 1secs)
> --resources=VALUE
>     Total consumable resources per slave, in the form
>     'name(role):value;name(role):value...'.
> --[no-]strict
>     If strict=true, any and all recovery errors are considered fatal.
>     If strict=false, any expected errors (e.g., slave cannot recover
>     information about an executor, because the slave died right before
>     the executor registered.) during recovery are ignored and as much
>     state as possible is recovered. (default: true)
> --[no-]switch_user
>     Whether to run tasks as the user who submitted them rather than the
>     user running the slave (requires setuid permission) (default: true)
> --work_dir=VALUE
>     Where to place framework work directories (default: /tmp/mesos)
> Slave crashed; failing test
> W1120 22:55:24.345309 13630 logging.cpp:58] RAW: Received signal SIGTERM; exiting.
> /home/bmahler/git/mesos/src/tests/balloon_framework_test.sh: line 38: kill: (15663) - No such process
> find: /tmp/mesos_test_cgroup: No such file or directory
> /home/bmahler/git/mesos/src/tests/balloon_framework_test.sh: line 30: 13630 Terminated ${MASTER} --ip=127.0.0.1 --port=5432
> rmdir: /tmp/mesos_test_cgroup/mesos_test: No such file or directory
> ../../src/tests/script.cpp:78: Failure
> Failed
> balloon_framework_test.sh exited with status 2
> [ FAILED ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework (4072 ms)
> [----------] 1 test from CgroupsIsolatorTest (4073 ms total)
> [----------] Global test environment tear-down
> [==========] 1 test from 1 test case ran. (4073 ms total)
> [ PASSED ] 0 tests.
> [ FAILED ] 1 test, listed below:
> [ FAILED ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> 1 FAILED TEST
> YOU HAVE 2 DISABLED TESTS
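
For context, the "Duplicate flag 'work_dir' on command line" error in the log above is the kind of failure a flag parser reports when the same option is passed twice. The following is only a rough sketch of such a check, not the Mesos implementation: `find_duplicate_flags` is a hypothetical helper, the two `--work_dir` paths are made up for illustration, and treating `--no-NAME` as the same flag as `--NAME` is an assumption based on the `--[no-]` entries in the usage text.

```python
def find_duplicate_flags(argv):
    """Return the names of flags that appear more than once in argv.

    Hypothetical helper for illustration; not part of Mesos.
    """
    seen = set()
    duplicates = []
    for arg in argv:
        if not arg.startswith("--"):
            continue  # ignore anything that is not a --flag
        # Strip the leading "--" and any "=VALUE" suffix.
        name = arg[2:].split("=", 1)[0]
        # Assumption: "--no-NAME" is the negated form of "--NAME".
        if name.startswith("no-"):
            name = name[3:]
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


# Illustrative command line: the work directory is passed twice.
args = ["--master=127.0.0.1:5432", "--work_dir=/tmp/a", "--work_dir=/tmp/b"]
for name in find_duplicate_flags(args):
    print(f"Duplicate flag '{name}' on command line")
```

A check along these lines, run against the arguments the test script passes to the slave, would show whether `--work_dir` is being supplied both by the script and by the caller.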