[jira] [Issue Comment Deleted] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.

2015-12-11 Thread Jan Schlicht (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-4025:

Comment: was deleted

(was: I saw the same behavior. Both tests have to be run in the same 
{{mesos-tests}} session to see the issue.)

> SlaveRecoveryTest/0.GCExecutor is flaky.
> 
>
>                 Key: MESOS-4025
>                 URL: https://issues.apache.org/jira/browse/MESOS-4025
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.26.0
>            Reporter: Till Toenshoff
>            Assignee: Jan Schlicht
>              Labels: flaky, flaky-test, test
>
> Build was SSL enabled (--enable-ssl, --enable-libevent). The build was based 
> on 0.26.0-rc1.
> Testsuite was run as root.
> {noformat}
> sudo ./bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=-1
> {noformat}
> {noformat}
> [ RUN  ] SlaveRecoveryTest/0.GCExecutor
> I1130 16:49:16.336833  1032 exec.cpp:136] Version: 0.26.0
> I1130 16:49:16.345212  1049 exec.cpp:210] Executor registered on slave dde9fd4e-b016-4a99-9081-b047e9df9afa-S0
> Registered executor on ubuntu14
> Starting task 22c63bba-cbf8-46fd-b23a-5409d69e4114
> sh -c 'sleep 1000'
> Forked command at 1057
> ../../src/tests/mesos.cpp:779: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/memory/mesos_test_e5edb2a8-9af3-441f-b991-613082f264e2/slave': Device or resource busy
> *** Aborted at 1448902156 (unix time) try "date -d @1448902156" if you are using GNU date ***
> PC: @  0x1443e9a testing::UnitTest::AddTestPartResult()
> *** SIGSEGV (@0x0) received by PID 27364 (TID 0x7f1bfdd2b800) from PID 0; stack trace: ***
> @ 0x7f1be92b80b7 os::Linux::chained_handler()
> @ 0x7f1be92bc219 JVM_handle_linux_signal
> @ 0x7f1bf7bbc340 (unknown)
> @  0x1443e9a testing::UnitTest::AddTestPartResult()
> @  0x1438b99 testing::internal::AssertHelper::operator=()
> @   0xf0b3bb mesos::internal::tests::ContainerizerTest<>::TearDown()
> @  0x1461882 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x145c6f8 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x143de4a testing::Test::Run()
> @  0x143e584 testing::TestInfo::Run()
> @  0x143ebca testing::TestCase::Run()
> @  0x1445312 testing::internal::UnitTestImpl::RunAllTests()
> @  0x14624a7 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x145d26e testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x14440ae testing::UnitTest::Run()
> @   0xd15cd4 RUN_ALL_TESTS()
> @   0xd158c1 main
> @ 0x7f1bf7808ec5 (unknown)
> @   0x913009 (unknown)
> {noformat}
> My Vagrantfile generator:
> {noformat}
> #!/usr/bin/env bash
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.box = "bento/ubuntu-14.04"
>   config.vm.hostname = "${PLATFORM_NAME}"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = ${VAGRANT_MEM}
>     vb.cpus = ${VAGRANT_CPUS}
>     vb.customize ["modifyvm", :id, "--nictype1", "virtio"]
>     vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
>     vb.customize ["modifyvm", :id, "--natdnsproxy1", "on"]
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = ${VAGRANT_MEM}
>     vb.cpus = ${VAGRANT_CPUS}
>   end
>   config.vm.provision "file", source: "../test.sh", destination: "~/test.sh"
>   config.vm.provision "shell", inline: <<-SHELL
>     sudo apt-get update
>     sudo apt-get -y install openjdk-7-jdk autoconf libtool
>     sudo apt-get -y install build-essential python-dev python-boto  \
>       libcurl4-nss-dev libsasl2-dev maven \
>       libapr1-dev libsvn-dev libssl-dev libevent-dev
>     sudo apt-get -y install git
>     sudo wget -qO- https://get.docker.com/ | sh
>   SHELL
> end
> EOF
> {noformat}
> The problem occurs frequently in my tests - I'd say in more than 10% but less 
> than 50% of runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.

2015-12-11 Thread Jan Schlicht (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052635#comment-15052635
 ] 

Jan Schlicht commented on MESOS-4025:
-

I saw the same behavior. Both tests have to be run in the same {{mesos-tests}} 
session to see the issue.



[jira] [Commented] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.

2015-12-11 Thread Jan Schlicht (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052636#comment-15052636
 ] 

Jan Schlicht commented on MESOS-4025:
-

I saw the same behavior. Both tests have to be run in the same {{mesos-tests}} 
session to see the issue.



[jira] [Updated] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.

2015-12-11 Thread Jan Schlicht (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-4025:

Shepherd: Till Toenshoff



[jira] [Commented] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.

2015-12-11 Thread Jan Schlicht (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052634#comment-15052634
 ] 

Jan Schlicht commented on MESOS-4025:
-

Hey, I've seen that you posted a review for that. I really appreciate the work 
you've done on this, but unfortunately some work had already been done; it just 
took a while to find a shepherd for it. Your approach of using {{MockDocker}} is 
the same one I chose, so I'd really appreciate it if you'd take a look at my 
review request for this ticket and comment on it. Again, sorry that we both 
basically did the same work and that my review request only went online just now.



[jira] [Commented] (MESOS-3892) Add a helper function to the Agent to retrieve the list of executors that are using optimistically offered, revocable resources.

2015-12-11 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052514#comment-15052514
 ] 

Guangya Liu commented on MESOS-3892:


Is it possible to not generate the executor list in the master, and instead let 
the slave handle this in the MVP? The slave already knows all of the 
executors/tasks that are using allocation slack resources, so when launching a 
new task that requests resource preemption, the slave can just check and kill 
some executors/tasks to recover those resources. Does that make sense? 
[~jvanremoortere] [~kaysoky] [~hartem] [~klaus1982] 

> Add a helper function to the Agent to retrieve the list of executors that are 
> using optimistically offered, revocable resources.
> 
>
>                 Key: MESOS-3892
>                 URL: https://issues.apache.org/jira/browse/MESOS-3892
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Artem Harutyunyan
>            Assignee: Klaus Ma
>              Labels: mesosphere
>
> {noformat}
> class Slave {
>   ...
>   // How the master currently keeps track of executors.
>   hashmap> executors;
>   ...
>   // Returns the list of executors that are using optimistically-
>   // offered, revocable resources.
>   list getEvictableExecutors() { ... }
>   ...
> }
> {noformat}





[jira] [Created] (MESOS-4130) Document how the fetcher can reach across a proxy connection.

2015-12-11 Thread Bernd Mathiske (JIRA)
Bernd Mathiske created MESOS-4130:
-

 Summary: Document how the fetcher can reach across a proxy 
connection.
 Key: MESOS-4130
 URL: https://issues.apache.org/jira/browse/MESOS-4130
 Project: Mesos
  Issue Type: Documentation
  Components: fetcher
Reporter: Bernd Mathiske


The fetcher uses libcurl for downloading content over HTTP, HTTPS, etc. There 
is no source code in the pertinent parts of "net.hpp" that deals with proxy 
settings. However, libcurl automatically picks up certain environment variables 
and adjusts its settings accordingly. See "man libcurl-tutorial", section 
"Proxies", subsection "Environment Variables", for details. If you set these 
variables in your Mesos agent startup script, you can use a proxy. 
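
As a rough sketch of that recipe, an agent startup script might export the 
proxy variables before launching the agent. The proxy URL 
"http://proxy.example.com:3128", the bypass list, and the function name 
configure_fetcher_proxy are hypothetical placeholders; the variable names are 
the ones libcurl documents (libcurl reads http_proxy only in lowercase). 

```shell
#!/usr/bin/env bash
# Sketch only: the proxy URL and bypass list used here are placeholders.

# Export the proxy environment variables that libcurl picks up automatically.
#   $1: proxy URL, e.g. http://proxy.example.com:3128 (hypothetical)
#   $2: comma-separated hosts that should bypass the proxy (optional)
configure_fetcher_proxy() {
  local proxy_url="$1"
  local bypass="${2:-localhost,127.0.0.1}"

  # libcurl honors http_proxy only in its lowercase spelling.
  export http_proxy="${proxy_url}"
  export https_proxy="${proxy_url}"
  export no_proxy="${bypass}"
}

configure_fetcher_proxy "http://proxy.example.com:3128"

# The agent would be started after this point so that it inherits the
# variables, for example:
#   exec mesos-slave --master=<master address> --work_dir=<work dir>
```

Whether these variables reach the fetcher depends on the agent passing its 
environment on to the fetcher process, which is what the recipe in this ticket 
assumes. 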

We should document this in the fetcher (cache) doc 
(http://mesos.apache.org/documentation/latest/fetcher/).






[jira] [Created] (MESOS-4131) Slaves do not report LVM-based storage

2015-12-11 Thread Hannes Eichblatt (JIRA)
Hannes Eichblatt created MESOS-4131:
---

 Summary: Slaves do not report LVM-based storage
 Key: MESOS-4131
 URL: https://issues.apache.org/jira/browse/MESOS-4131
 Project: Mesos
  Issue Type: Bug
Reporter: Hannes Eichblatt


Hello everyone,

I use Mesos with Docker. My Docker daemon is configured to use LVM storage, as 
recommended. Mesos only reports disk space in the usual partitions, not the 
space in the Docker volume group.

Thanks
Hannes





[jira] [Updated] (MESOS-4120) Make DiscoveryInfo dynamically updatable

2015-12-11 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-4120:
--
Affects Version/s: 0.26.0
   0.25.0
 Target Version/s: 0.27.0

> Make DiscoveryInfo dynamically updatable
> 
>
>                 Key: MESOS-4120
>                 URL: https://issues.apache.org/jira/browse/MESOS-4120
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.25.0, 0.26.0
>            Reporter: Sargun Dhillon
>            Priority: Critical
>              Labels: mesosphere
>
> K8s tasks can dynamically update what they expose to be discoverable by the 
> cluster. Unfortunately, all DiscoveryInfo in the cluster is fixed at task 
> start and immutable afterwards. 
> We would like to make DiscoveryInfo dynamically updatable, so that 
> executors can change what they're advertising based on their internal state, 
> rather than requiring DiscoveryInfo to be known prior to starting the tasks. 





[jira] [Commented] (MESOS-3892) Add a helper function to the Agent to retrieve the list of executors that are using optimistically offered, revocable resources.

2015-12-11 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052993#comment-15052993
 ] 

Klaus Ma commented on MESOS-3892:
-

I have updated the code diff of MESOS-1718 at 
https://reviews.apache.org/r/40759/; would you also help to review it? I'll 
start to work on this one once MESOS-1718 is under review :).



[jira] [Updated] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-11 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-4113:
--
Affects Version/s: 0.26.0

> Docker Executor should not set container IP during bridged mode
> ---
>
>                 Key: MESOS-4113
>                 URL: https://issues.apache.org/jira/browse/MESOS-4113
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 0.25.0, 0.26.0
>            Reporter: Sargun Dhillon
>              Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This is a problem because, in 
> bridged mode, that IP address is useless: it is behind the Docker NAT. I 
> would like a flag that disables filling in the IP address and allows it to 
> fall back to the agent IP. 





[jira] [Commented] (MESOS-4027) Improve task-node affinity

2015-12-11 Thread Chris (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052902#comment-15052902
 ] 

Chris commented on MESOS-4027:
--

[~qianzhang] just noticed that - really should have spent more time reviewing 
the documentation in the protobuf file. I'd like to see the Task's labels field 
get propagated into an Offer so customized schedulers can have some view into 
what is running on a Mesos Node (e.g., scheduler logic may filter out Offers 
associated with Nodes involved in Spark processing, or schedulers may want to 
filter out Offers that are not associated with HDFS). Figured out a way to do 
this without adding more fields to the protobuf IDL for TaskInfo. Thanks for 
pointing this out!

> Improve task-node affinity
> --
>
>                 Key: MESOS-4027
>                 URL: https://issues.apache.org/jira/browse/MESOS-4027
>             Project: Mesos
>          Issue Type: Wish
>          Components: allocation, general
>            Reporter: Chris
>            Priority: Trivial
>
> Improve task-to-node affinity and anti-affinity (running hadoop or spark jobs 
> on a node currently running hdfs or to avoid running Ceph on HDFS nodes).
> Provide a user-mutable Attribute in TaskInfo (the Attribute is modified by a 
> Framework Scheduler) that can describe what a Task is running.
> The Attribute would propagate to a Task at execution. The Attribute is  
> passed to Framework Schedulers as part of an Offer's Attributes list. 
> A Framework Scheduler could then filter out or accept Offers from Nodes that 
> are currently labeled with a desired set or individual Attribute.





[jira] [Commented] (MESOS-4125) Use 'git rev-parse --git-dir' in bootstrap instead of simply '.git'

2015-12-11 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052920#comment-15052920
 ] 

Kevin Klues commented on MESOS-4125:


https://reviews.apache.org/r/41243/
https://reviews.apache.org/r/41244/

> Use 'git rev-parse --git-dir' in bootstrap instead of simply '.git'
> ---
>
>                 Key: MESOS-4125
>                 URL: https://issues.apache.org/jira/browse/MESOS-4125
>             Project: Mesos
>          Issue Type: Improvement
>          Components: build
>         Environment: All systems
>            Reporter: Kevin Klues
>            Assignee: Kevin Klues
>            Priority: Minor
>              Labels: bootstrap, build, mesosphere
>
> This issue relates to the 'bootstrap' file in the top level directory of the 
> mesos tree.
> When building from git, bootstrap will (among other things) install 
> pre-commit and post-rewrite hooks into the .git/hooks directory of the mesos 
> tree.  However, the current implementation always assumes that .git exists in 
> the same directory as the bootstrap file.  This may not always be true.
> Most notably, it is not true if the mesos tree is included as a submodule 
> inside another project. When included as a submodule, .git is no longer a 
> directory, but rather a file whose text contains a pointer back to the actual 
> location of the .git folder inside the containing project.  To get at this 
> directory, we need to run 'git rev-parse --git-dir' instead of simply 
> assuming that the local .git is the proper directory.
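
A minimal sketch of the lookup described above, assuming a bash script like 
bootstrap. The function name git_hooks_dir is hypothetical and is not taken 
from the actual patch; only 'git rev-parse --git-dir' comes from the ticket. 

```shell
#!/usr/bin/env bash
# Sketch only: resolve the real git directory instead of assuming ./.git.

# Print the hooks directory of the repository that contains directory $1
# (defaults to the current directory). Fails if $1 is not inside a git tree.
git_hooks_dir() {
  local tree="${1:-.}"
  local git_dir

  # In a submodule, .git is a file pointing elsewhere; rev-parse follows it.
  git_dir="$(cd "${tree}" && git rev-parse --git-dir)" || return 1

  # rev-parse may print a path relative to the tree; make it absolute.
  case "${git_dir}" in
    /*) ;;
    *) git_dir="$(cd "${tree}" && cd "${git_dir}" && pwd)" ;;
  esac

  printf '%s\n' "${git_dir}/hooks"
}
```

A script could then install its hooks into "$(git_hooks_dir)" rather than into 
a hard-coded .git/hooks, which would cover both the plain checkout and the 
submodule case. 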





[jira] [Commented] (MESOS-4119) Add support for enabling --3way to apply-reviews.py.

2015-12-11 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052943#comment-15052943
 ] 

Bernd Mathiske commented on MESOS-4119:
---

Since you marked this "newbie", please explain to newbies what you mean by 
--3way and what apply-reviews is in general.

> Add support for enabling --3way to apply-reviews.py.
> 
>
>                 Key: MESOS-4119
>                 URL: https://issues.apache.org/jira/browse/MESOS-4119
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Artem Harutyunyan
>              Labels: beginner, mesosphere, newbie
>






[jira] [Updated] (MESOS-4125) Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir

2015-12-11 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4125:
---
Description: 
This issue relates files where we assume that to the 'bootstrap' file in the top 
level directory of the mesos tree.

When building from git, bootstrap will (among other things) install pre-commit 
and post-rewrite hooks into the .git/hooks directory of the mesos tree.  
Moreover, support/pHowever the current implementation always assumes that .git 
exists in the same directory as the bootstrap file.  This may not always be 
true.

Most notably, it is not true if the mesos tree is included as a submodule 
inside another project. When included as a submodule, .git is no longer a 
directory, but rather a file whose text contains a pointer back to the actual 
location of the .git folder inside the containing project.  To get at this 
directory, we need to run 'git rev-parse --git-dir' instead of simply assuming 
that the local .git is the proper directory.

  was:
This issue relates to the 'bootstrap' file in the top level directory of the 
mesos tree.

When building from git, bootstrap will (among other things) install pre-commit 
and post-rewrite hooks into the .git/hooks directory of the mesos tree.  
However the current implementation always assumes that .git exists in the same 
directory as the bootstrap file.  This may not always be true.

Most notably, it is not true if the mesos tree is included as a submodule 
inside another project. When included as a submodule, .git is no longer a 
directory, but rather a file whose text contains a pointer back to the actual 
location of the .git folder inside the containing project.  To get at this 
directory, we need to run 'git rev-parse --git-dir' instead of simply assuming 
that the local .git is the proper directory.


> Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir
> ---
>
> Key: MESOS-4125
> URL: https://issues.apache.org/jira/browse/MESOS-4125
> Project: Mesos
>  Issue Type: Improvement
>  Components: build
> Environment: All systems
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: bootstrap, build, mesosphere
>
> This issue relates to the 'bootstrap' file in the top level directory of the 
> mesos tree.
> When building from git, bootstrap will (among other things) install 
> pre-commit and post-rewrite hooks into the .git/hooks directory of the mesos 
> tree.  Moreover, support/pHowever the current implementation always assumes 
> that .git exists in the same directory as the bootstrap file.  This may not 
> always be true.
> Most notably, it is not true if the mesos tree is included as a submodule 
> inside another project. When included as a submodule, .git is no longer a 
> directory, but rather a file whose text contains a pointer back to the actual 
> location of the .git folder inside the containing project.  To get at this 
> directory, we need to run 'git rev-parse --git-dir' instead of simply 
> assuming that the local .git is the proper directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4125) Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir

2015-12-11 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4125:
---
Summary: Use 'git rev-parse --git-dir' instead of assuming '.git' in top 
dir  (was: Use 'git rev-parse --git-dir' in bootstrap instead of simply '.git')

> Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir
> ---
>
> Key: MESOS-4125
> URL: https://issues.apache.org/jira/browse/MESOS-4125
> Project: Mesos
>  Issue Type: Improvement
>  Components: build
> Environment: All systems
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: bootstrap, build, mesosphere
>
> This issue relates to the 'bootstrap' file in the top level directory of the 
> mesos tree.
> When building from git, bootstrap will (among other things) install 
> pre-commit and post-rewrite hooks into the .git/hooks directory of the mesos 
> tree.  However the current implementation always assumes that .git exists in 
> the same directory as the bootstrap file.  This may not always be true.
> Most notably, it is not true if the mesos tree is included as a submodule 
> inside another project. When included as a submodule, .git is no longer a 
> directory, but rather a file whose text contains a pointer back to the actual 
> location of the .git folder inside the containing project.  To get at this 
> directory, we need to run 'git rev-parse --git-dir' instead of simply 
> assuming that the local .git is the proper directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4080) Clean up HTTP authentication in quota endpoints

2015-12-11 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052930#comment-15052930
 ] 

Bernd Mathiske commented on MESOS-4080:
---

Can you please be more specific about the tech debt mentioned?

> Clean up HTTP authentication in quota endpoints
> ---
>
> Key: MESOS-4080
> URL: https://issues.apache.org/jira/browse/MESOS-4080
> Project: Mesos
>  Issue Type: Task
>  Components: HTTP API, master
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Critical
>  Labels: mesosphere, quota, tech-debt
>
> The authentication of quota requests introduces some technical debt that 
> will be resolved by the refactored HTTP-based authentication. This ticket 
> tracks the work related to cleaning up the quota handling to use the new HTTP 
> API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4125) Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir

2015-12-11 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4125:
---
Description: 
This issue relates to files where we assume that mesos's '.git' folder is 
always located in the top level directory of the mesos tree.

For example, bootstrap makes this assumption.  Specifically, it attempts to 
install pre-commit and post-rewrite hooks into the hardcoded .git/hooks 
directory. However, it is not always true that the .git folder is located here.

Most notably, it is not true if the mesos tree is included as a submodule 
inside another project. When included as a submodule, .git is no longer a 
directory, but rather a file whose text contains a pointer back to the actual 
location of the .git folder inside the containing project.  To get at this 
directory, we need to run 'git rev-parse --git-dir' instead of simply assuming 
that the local .git is the proper directory.

  was:
This issue relates to the 'bootstrap' file in the top level directory of the 
mesos tree.

When building from git, bootstrap will (among other things) install pre-commit 
and post-rewrite hooks into the .git/hooks directory of the mesos tree.  
Moreover, support/pHowever the current implementation always assumes that .git 
exists in the same directory as the bootstrap file.  This may not always be 
true.

Most notably, it is not true if the mesos tree is included as a submodule 
inside another project. When included as a submodule, .git is no longer a 
directory, but rather a file whose text contains a pointer back to the actual 
location of the .git folder inside the containing project.  To get at this 
directory, we need to run 'git rev-parse --git-dir' instead of simply assuming 
that the local .git is the proper directory.


> Use 'git rev-parse --git-dir' instead of assuming '.git' in top dir
> ---
>
> Key: MESOS-4125
> URL: https://issues.apache.org/jira/browse/MESOS-4125
> Project: Mesos
>  Issue Type: Improvement
>  Components: build
> Environment: All systems
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: bootstrap, build, mesosphere
>
> This issue relates to files where we assume that mesos's '.git' folder is 
> always located in the top level directory of the mesos tree.
> For example, bootstrap makes this assumption.  Specifically, it attempts to 
> install pre-commit and post-rewrite hooks into the hardcoded .git/hooks 
> directory. However, it is not always true that the .git folder is located 
> here.
> Most notably, it is not true if the mesos tree is included as a submodule 
> inside another project. When included as a submodule, .git is no longer a 
> directory, but rather a file whose text contains a pointer back to the actual 
> location of the .git folder inside the containing project.  To get at this 
> directory, we need to run 'git rev-parse --git-dir' instead of simply 
> assuming that the local .git is the proper directory.
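
To illustrate the indirection described above: in a submodule checkout, git writes a '.git' file whose content is a line of the form 'gitdir: <path>'. The issue proposes asking git directly via 'git rev-parse --git-dir'. The sketch below is only a pure-Python illustration of the same lookup so the two layouts can be compared side by side; the helper names resolve_git_dir and hooks_dir are hypothetical and are not part of bootstrap.

```python
import os


def resolve_git_dir(worktree):
    """Return the real git directory for `worktree` (hypothetical helper).

    Handles both layouts: `.git` as a directory (plain clone) and `.git`
    as a file containing a `gitdir: <path>` line (submodule checkout).
    """
    dot_git = os.path.join(worktree, ".git")
    if os.path.isdir(dot_git):
        return os.path.abspath(dot_git)

    with open(dot_git) as f:
        first_line = f.readline().strip()

    prefix = "gitdir: "
    if not first_line.startswith(prefix):
        raise ValueError("unexpected .git file content: %r" % first_line)

    target = first_line[len(prefix):]
    # A relative pointer is relative to the directory holding the .git file.
    return os.path.abspath(os.path.join(worktree, target))


def hooks_dir(worktree):
    """Directory where hooks would be installed, without hardcoding .git/hooks."""
    return os.path.join(resolve_git_dir(worktree), "hooks")
```

A script that already shells out to git would more likely call 'git rev-parse --git-dir' and use its output instead of parsing the file itself.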



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4118) Update Getting Started for Mac OS X El Capitan

2015-12-11 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053769#comment-15053769
 ] 

Kevin Klues commented on MESOS-4118:


https://reviews.apache.org/r/41286

> Update Getting Started for Mac OS X El Capitan
> --
>
> Key: MESOS-4118
> URL: https://issues.apache.org/jira/browse/MESOS-4118
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
> Environment: Mac OS X
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: documentation, mesosphere
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> This ticket pertains to the Getting Started guide on the apache mesos website
> The current instructions for installing on Mac OS X only include instructions 
> for Yosemite.  The instructions to build for El Capitan are identical except 
> in the case of upgrading from Yosemite to El Capitan.  To build after an 
> upgrade requires a trivial (but important) step which is non-obvious -- you 
> have to rerun 'xcode-select --install' after you complete the upgrade.
> Let's change the heading for installing on Mac OS X to say:
> Mac OS X Yosemite & El Capitan
> and then add a comment at the bottom of the section to point out that a rerun 
> of 'xcode-select --install' is necessary after an upgrade from Yosemite to El 
> Capitan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4135) Labels are not returned in statusUpdate TaskStatus

2015-12-11 Thread Felix Hupfeld (JIRA)
Felix Hupfeld created MESOS-4135:


 Summary: Labels are not returned in statusUpdate TaskStatus
 Key: MESOS-4135
 URL: https://issues.apache.org/jira/browse/MESOS-4135
 Project: Mesos
  Issue Type: Bug
  Components: framework, master
Affects Versions: 0.25.0
Reporter: Felix Hupfeld
Priority: Minor


Labels that were set in the task's TaskInfo upon creation are not returned in 
statusUpdate TaskStatus messages.

This restricts their usefulness. One use case would be maintaining the container 
version of a running task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4088) Modularize existing plain-file logging for executor/task logs launched with the Mesos Containerizer

2015-12-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049749#comment-15049749
 ] 

Joseph Wu edited comment on MESOS-4088 at 12/12/15 12:04 AM:
-

|| Reviews || Summary ||
| https://reviews.apache.org/r/41166/ | Add {{ContainerLogger}} to 
{{Containerizer::Create}} |
| https://reviews.apache.org/r/41167/ | Initialize and call the 
{{ContainerLogger}} in {{MesosContainerizer::_launch}} |
| https://reviews.apache.org/r/41168/ | Update {{MesosTest}} |
| https://reviews.apache.org/r/41169/ | Update {{MesosContainerizer}} tests |


was (Author: kaysoky):
|| Reviews || Summary ||
| https://reviews.apache.org/r/41166/ | Add {{ExecutorLogger}} to 
{{Containerizer::Create}} |
| https://reviews.apache.org/r/41167/ | Initialize and call the 
{{ExecutorLogger}} in {{MesosContainerizer::_launch}} |
| https://reviews.apache.org/r/41168/ | Update {{MesosTest}} |
| https://reviews.apache.org/r/41169/ | Update {{MesosContainerizer}} tests |

> Modularize existing plain-file logging for executor/task logs launched with 
> the Mesos Containerizer
> ---
>
> Key: MESOS-4088
> URL: https://issues.apache.org/jira/browse/MESOS-4088
> Project: Mesos
>  Issue Type: Task
>  Components: modules
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: logging, mesosphere
>
> Once a module for executor/task output logging has been introduced, the 
> default module will mirror the existing behavior.  Executor/task 
> stdout/stderr is piped into files within the executor's sandbox directory.
> The files are exposed in the web UI, via the {{/files}} endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4133) User-oriented docs for containerizers + isolators

2015-12-11 Thread Neil Conway (JIRA)
Neil Conway created MESOS-4133:
--

 Summary: User-oriented docs for containerizers + isolators
 Key: MESOS-4133
 URL: https://issues.apache.org/jira/browse/MESOS-4133
 Project: Mesos
  Issue Type: Documentation
  Components: containerization, documentation, isolation
Reporter: Neil Conway


This should cover practical user-oriented questions, such as:

* what is a containerizer, and what problems do they solve?
* how should I choose among the available containerizer options to solve a few 
typical, practical problems
* what is an isolator, and what problems do they solve?
* how should I choose among the available isolator options to solve a few 
typical, practical problems

We could possibly get into the details of cgroups and other system-level 
facilities for configuring resource isolation as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4132) Make `stout/windows.hpp` standalone.

2015-12-11 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-4132:
---

 Summary: Make `stout/windows.hpp` standalone.
 Key: MESOS-4132
 URL: https://issues.apache.org/jira/browse/MESOS-4132
 Project: Mesos
  Issue Type: Bug
  Components: stout
Reporter: Alex Clemmer
Assignee: Alex Clemmer






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3842) getting started documentation following Mesos 0.25 build fails for CentOS7 (http://mesos.apache.org/gettingstarted/)

2015-12-11 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053651#comment-15053651
 ] 

Kevin Klues commented on MESOS-3842:


I am not seeing this issue on master. Can you confirm that it is still a 
problem?  I ran it on vagrant using the vbox centos 7.1 image from here:
https://github.com/CommanderK5/packer-centos-template/releases/download/0.7.1/vagrant-centos-7.1.box


> getting started documentation following Mesos 0.25 build fails for CentOS7 
> (http://mesos.apache.org/gettingstarted/)
> 
>
> Key: MESOS-3842
> URL: https://issues.apache.org/jira/browse/MESOS-3842
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation, project website
>Affects Versions: 0.25.0
> Environment: CentOS 7 AWS Linux image: AWS EC2 MarketPlace CentOS 7 
> (x86_64) with Updates HVM (a t2.medium instance)
>Reporter: Manne Laukkanen
>Assignee: Kevin Klues
>  Labels: build, documentation, mesosphere
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> WANdisco SVN repo file usage leads to failure of build process with error, so 
> usage of it should be 1) discouraged 2) replaced with a working solution
> Proceeding according to documentation at 
> http://mesos.apache.org/gettingstarted/:
> # 'Mesos > 0.21.0' requires 'subversion > 1.8' devel package, which is
> # not available in the default repositories.
> # Add the WANdisco SVN repo file: '/etc/yum.repos.d/wandisco-svn.repo' with 
> content:
>   [WANdiscoSVN]
>   name=WANdisco SVN Repo 1.9
>   enabled=1
>   baseurl=http://opensource.wandisco.com/centos/7/svn-1.9/RPMS/$basearch/
>   gpgcheck=1
>   gpgkey=http://opensource.wandisco.com/RPM-GPG-KEY-WANdisco
> ...we do as is described, then proceed to next step, which is 
> "# Install essential development tools."
> sudo yum groupinstall -y "Development Tools"
> ...the added WANDISCO -repo causes failed building process with error:
> Error: Package: subversion-1.9.2-1.x86_64 (WANdiscoSVN)
>Requires: libserf-1.so.0()(64bit)
>  - we end up with e.g. no build tools to proceed with, so process fails, 
> Mesos can not be built according to instructions (e.g. no C-compiler in 
> path...)
> Interestingly, building with aforementioned instructions (with some 
> modifications mentioned in ticket MESOS-3844) was successful without errors 
> just a few days ago on 30 Oct 2015. WANDISCO repo breakage? 
> No changes to building machine image (the CentOS7 image) nor machine itself 
> (t2.medium EC2 instance) were made in between attempts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2015-12-11 Thread Tobi Knaup (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053596#comment-15053596
 ] 

Tobi Knaup commented on MESOS-4114:
---

[~sargun] DiscoveryInfo was meant to be the thing that clients or service 
discovery systems read. The local port seems like an implementation detail that 
should not be visible to clients. Back when this proto was introduced there was 
a lot of debate around whether name etc. should just go into labels or if there 
should be explicit members, and the decision was to be explicit, so I'd 
recommend a new vip member to follow that pattern. There was an assumption that 
each container would have its own IP and instances of the service would listen 
on the same port that is listed in DiscoveryInfo, so there would be no need to 
call out the local port. In the absence of IP per container I think it makes 
sense to add a new member to Port called localPort or instancePort.

I'm not sure I understand the second point - different services will have 
different TaskInfo/ExecutorInfo and therefore different DiscoveryInfo, so you 
can set different names and IPs.

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4134) Add note about tunneling in site-docker README

2015-12-11 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053741#comment-15053741
 ] 

Kevin Klues commented on MESOS-4134:


https://reviews.apache.org/r/41278/

> Add note about tunneling in site-docker README
> --
>
> Key: MESOS-4134
> URL: https://issues.apache.org/jira/browse/MESOS-4134
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: documentation
>
> If we are running the site-docker container on a remote machine, we should 
> set up a tunnel to localhost to view the site locally.  The README should 
> explain how to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4134) Add note about tunneling in site-docker README

2015-12-11 Thread Kevin Klues (JIRA)
Kevin Klues created MESOS-4134:
--

 Summary: Add note about tunneling in site-docker README
 Key: MESOS-4134
 URL: https://issues.apache.org/jira/browse/MESOS-4134
 Project: Mesos
  Issue Type: Documentation
  Components: documentation
Reporter: Kevin Klues
Assignee: Kevin Klues
Priority: Minor


If we are running the site-docker container on a remote machine, we should set 
up a tunnel to localhost to view the site locally.  The README should explain 
how to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4137) Modularize plain-file logging for executor/task logs launched with the Docker Containerizer

2015-12-11 Thread Joseph Wu (JIRA)
Joseph Wu created MESOS-4137:


 Summary: Modularize plain-file logging for executor/task logs 
launched with the Docker Containerizer
 Key: MESOS-4137
 URL: https://issues.apache.org/jira/browse/MESOS-4137
 Project: Mesos
  Issue Type: Task
  Components: docker, modules
Reporter: Joseph Wu
Assignee: Joseph Wu


Adding a hook inside the Docker containerizer is slightly more involved than 
the Mesos containerizer.

Docker executors/tasks perform plain-file logging in different places depending 
on whether the agent is in a Docker container itself
|| Agent || Code ||
| Not in container | {{DockerContainerizerProcess::launchExecutorProcess}} |
| In container | {{Docker::run}} in a {{mesos-docker-executor}} process |

This means a {{ContainerLogger}} will need to be loaded or hooked into the 
{{mesos-docker-executor}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4140) Indicate that the task is shutting down on shutdown

2015-12-11 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4140:
-

 Summary: Indicate that the task is shutting down on shutdown
 Key: MESOS-4140
 URL: https://issues.apache.org/jira/browse/MESOS-4140
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon


In the shutdown handler in the default executor, there is a grace period 
between when a SIGTERM is sent and when a SIGKILL is sent. There should be a 
mechanism to expose that the task is being killed. A simple mechanism would be 
to mark the task as unhealthy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4027) Improve task-node affinity

2015-12-11 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053934#comment-15053934
 ] 

Qian Zhang commented on MESOS-4027:
---

[~ct.clmsn] I understand your requirement, but that means a framework would be 
able to know which tasks are running on an agent even if some of those tasks 
belong to other frameworks. I am not sure this is OK from a multi-tenancy and 
security point of view, e.g., a framework may not want other frameworks to know 
anything about its tasks.
Maybe we could leave the decision to the framework itself, e.g., when a 
framework launches a task, it can decide whether it is a "public" or "private" 
task. If it is "public", then the task's label will be propagated into offers, 
so any framework that receives such an offer will know that the "public" task 
is already running on that agent.

> Improve task-node affinity
> --
>
> Key: MESOS-4027
> URL: https://issues.apache.org/jira/browse/MESOS-4027
> Project: Mesos
>  Issue Type: Wish
>  Components: allocation, general
>Reporter: Chris
>Priority: Trivial
>
> Improve task-to-node affinity and anti-affinity (running hadoop or spark jobs 
> on a node currently running hdfs or to avoid running Ceph on HDFS nodes).
> Provide a user-mutable Attribute in TaskInfo (the Attribute is modified by a 
> Framework Scheduler) that can describe what a Task is running.
> The Attribute would propagate to a Task at execution. The Attribute is  
> passed to Framework Schedulers as part of an Offer's Attributes list. 
> A Framework Scheduler could then filter out or accept Offers from Nodes that 
> are currently labeled with a desired set or individual Attribute.
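
The filtering step described in the issue could be sketched roughly as follows. This is only an illustration of affinity and anti-affinity on plain Python dictionaries; the offer layout and the function name filter_offers are assumptions made for the example and do not correspond to the Mesos scheduler API.

```python
def filter_offers(offers, required=(), forbidden=()):
    """Keep offers whose attributes satisfy affinity and anti-affinity rules.

    `offers` is a list of dicts with an "attributes" list (assumed layout).
    An offer is accepted when it carries every attribute in `required`
    (affinity) and none of the attributes in `forbidden` (anti-affinity).
    """
    required, forbidden = set(required), set(forbidden)
    accepted = []
    for offer in offers:
        labels = set(offer["attributes"])
        if required <= labels and not (forbidden & labels):
            accepted.append(offer)
    return accepted
```

For example, a Spark scheduler might pass required=["hdfs"] to land next to HDFS, while a Ceph scheduler might pass forbidden=["hdfs"] to stay away from those nodes.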



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4139) Make escalationTimeout configurable

2015-12-11 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4139:
--
Priority: Major  (was: Critical)

> Make escalationTimeout configurable
> ---
>
> Key: MESOS-4139
> URL: https://issues.apache.org/jira/browse/MESOS-4139
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>  Labels: mesosphere
>
> At the moment, escalationTimeout is fixed at 3 seconds in the code. This 
> means that if a task is shut down, there are only 3 seconds between the 
> SIGTERM and the SIGKILL. If someone is running something like a rails 
> framework, that may be too little time for the tasks to terminate. 
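
The SIGTERM-then-SIGKILL escalation described above, with the grace period passed in rather than fixed, might look like the sketch below. It uses Python's subprocess module on a POSIX system and is not the Mesos executor code; the function name shutdown and the parameter escalation_timeout are chosen for the example (the latter mirrors the escalationTimeout name from the issue).

```python
import subprocess


def shutdown(proc, escalation_timeout):
    """Ask `proc` to exit, then force it if the grace period runs out.

    `proc` is a subprocess.Popen object and `escalation_timeout` is the
    grace period in seconds. Returns "terminated" if the process exited
    within the grace period and "killed" if it had to be forced.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=escalation_timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()
        return "killed"
```

With a configurable value, a slow-stopping service could be given, say, 30 seconds, while short-lived tasks keep a small grace period.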



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3687) Streamline site-building process

2015-12-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054108#comment-15054108
 ] 

ASF GitHub Bot commented on MESOS-3687:
---

Github user jonorossi commented on the pull request:


https://github.com/apache/mesos/commit/3c264c0929d328b1f8bedac3ad2fddadf782ec71#commitcomment-14940262
  
@westurner IIRC the architecture images were 404s beforehand even with an 
absolute URL because the files didn't live there. There were few images that 
were actually working beforehand, most images were broken. The previous comment 
to this one (260acd03b45c9a203a53bc92171aedadbb970dad) in the PR actually fixed 
the script so the deployment would post-process the markdown as it did with 
links. There is still something going wrong with the deployment (i.e. manual 
running of the script and getting committed to subversion).

I've mentioned in a couple of mailing list threads that we need to make 
this process automated via the Jenkins server. @davelester was helping to 
organise this with the Apache ops guys as he is the web site maintainer.

Here is the JIRA issue: https://issues.apache.org/jira/browse/MESOS-3687


> Streamline site-building process
> 
>
> Key: MESOS-3687
> URL: https://issues.apache.org/jira/browse/MESOS-3687
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Freddy Ayuso-Henson
>Priority: Minor
>
> The current site building and publishing process is somewhat cumbersome and 
> complicated. As part of MesosCon Hackathon, aim to streamline/simplify this 
> process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4127) Ensure `Content-Type` field is set for some responses

2015-12-11 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053899#comment-15053899
 ] 

Klaus Ma commented on MESOS-4127:
-

Similar JIRA on {{Content-Type}} of endpoint.

> Ensure `Content-Type` field is set for some responses
> -
>
> Key: MESOS-4127
> URL: https://issues.apache.org/jira/browse/MESOS-4127
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>  Labels: http, mesosphere, newbie++, tech-debt
>
> As pointed out by [~anandmazumdar] in https://reviews.apache.org/r/40905/, we 
> should make sure we set the {{Content-Type}} field for some responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3687) Streamline site-building process

2015-12-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054114#comment-15054114
 ] 

ASF GitHub Bot commented on MESOS-3687:
---

Github user westurner commented on the pull request:


https://github.com/apache/mesos/commit/3c264c0929d328b1f8bedac3ad2fddadf782ec71#commitcomment-14940273
  
Thanks!

So the paths for the .md files are

```
./images/img.PNG
./doc.md
```

Whereas the site-deployed paths are:

```
/documentation/latest/images/img.PNG
/documentation/latest/doc/
```

Currently:

* images display w/ GitHub
* images 404 w/ the site

... I'll copy this to JIRA when I get a minute.
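
One way the post-processing step could map the repository-relative image paths to the site-deployed ones is sketched below. The prefix /documentation/latest/ is taken from the paths listed above; the function name rewrite_image_links and the regular expression are assumptions for illustration and are not the actual site-building script.

```python
import posixpath
import re

# Prefix taken from the site-deployed paths quoted in this comment.
SITE_PREFIX = "/documentation/latest/"

# Matches markdown images of the form ![alt](target) without a title part.
IMAGE_LINK = re.compile(r"(!\[[^\]]*\]\()([^)\s]+)(\))")


def rewrite_image_links(markdown, prefix=SITE_PREFIX):
    """Rewrite relative image targets so they resolve on the deployed site."""

    def fix(match):
        target = match.group(2)
        # Leave absolute URLs and site-absolute paths untouched.
        if "://" in target or target.startswith("/"):
            return match.group(0)
        deployed = posixpath.join(prefix, posixpath.normpath(target))
        return match.group(1) + deployed + match.group(3)

    return IMAGE_LINK.sub(fix, markdown)
```

Images written with a title (a quoted string after the target) are not matched by this pattern and would be left as they are.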
On Dec 12, 2015 1:31 AM, "Jonathon Rossi" wrote:

> @westurner IIRC the architecture images
> were 404s beforehand even with an absolute URL because the files didn't
> live there. There were few images that were actually working beforehand,
> most images were broken. The previous comment to this one (260acd0)
> in the PR actually fixed the script so the deployment would post-process
> the markdown as it did with links. There is still something going wrong
> with the deployment (i.e. manual running of the script and getting
> committed to subversion).
>
> I've mentioned in a couple of mailing list threads that we need to make
> this process automated via the Jenkins server. @davelester
> was helping to organise this with the
> Apache ops guys as he is the web site maintainer.
>
> Here is the JIRA issue: https://issues.apache.org/jira/browse/MESOS-3687
>



> Streamline site-building process
> 
>
> Key: MESOS-3687
> URL: https://issues.apache.org/jira/browse/MESOS-3687
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Freddy Ayuso-Henson
>Priority: Minor
>
> The current site building and publishing process is somewhat cumbersome and 
> complicated. As part of MesosCon Hackathon, aim to streamline/simplify this 
> process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4126) Construct the error string in `MethodNotAllowed`

2015-12-11 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-4126:
--

 Summary: Construct the error string in `MethodNotAllowed`
 Key: MESOS-4126
 URL: https://issues.apache.org/jira/browse/MESOS-4126
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov


Consider constructing the error string in {{MethodNotAllowed}} rather than at 
the invocation site. Currently we want all error messages to follow the same 
pattern, so instead of writing
{code}
return MethodNotAllowed({"POST"}, "Expecting 'POST', received '" + 
request.method + "'");
{code}
we can write something like
{code}
return MethodNotAllowed({"POST"}, request.method);
{code}
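
The idea of building the message in one place can be sketched in a few lines of Python. This is only an illustration of the pattern, not the libprocess C++ API; the function name method_not_allowed_message is hypothetical, and joining several allowed methods with "or" is an assumption, since the example above only shows a single method.

```python
def method_not_allowed_message(allowed, received):
    """Build the error body from the allowed methods and the received one.

    Keeps the wording in a single place so every call site produces the
    same pattern: Expecting '<allowed>', received '<received>'.
    """
    expected = " or ".join("'%s'" % method for method in allowed)
    return "Expecting %s, received '%s'" % (expected, received)
```

Call sites then only pass the allowed methods and the request method, and the wording cannot drift between endpoints.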




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4127) Ensure `Content-Type` field is set for some responses

2015-12-11 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-4127:
--

 Summary: Ensure `Content-Type` field is set for some responses
 Key: MESOS-4127
 URL: https://issues.apache.org/jira/browse/MESOS-4127
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov


As pointed out by [~anandmazumdar] in https://reviews.apache.org/r/40905/, we 
should make sure we set the {{Content-Type}} field for some responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4128) Refactor sorter factories in allocator and improve comments around them

2015-12-11 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-4128:
--

 Summary: Refactor sorter factories in allocator and improve 
comments around them
 Key: MESOS-4128
 URL: https://issues.apache.org/jira/browse/MESOS-4128
 Project: Mesos
  Issue Type: Improvement
  Components: allocation
Reporter: Alexander Rukletsov
Assignee: Alexander Rukletsov


For clarity we want to refactor the factory section in the allocator and 
explain the purpose (and necessity) of all sorters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-4123) Enable agent/master know resource type is USAGE_SLACK for QoS Controller related resources

2015-12-11 Thread Guangya Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu reassigned MESOS-4123:
--

Assignee: Guangya Liu

> Enable agent/master know resource type is USAGE_SLACK for QoS Controller 
> related resources
> --
>
> Key: MESOS-4123
> URL: https://issues.apache.org/jira/browse/MESOS-4123
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> The master and agent already have endpoints that report all revocable 
> resources, but today the only revocable resources are those of the QoS 
> controller.
> Those values are currently used only for display, but we may need these APIs 
> in the future. For example, MESOS-2647 needs to calculate whether there are 
> enough usage_slack revocable resources before launching a task.
> So I think we need to update the helper functions
> {code}_resources_revocable_total{code}
> {code}_resources_revocable_used{code}
> {code}_resources_revocable_percent{code}
> to return only usage_slack revocable resources.
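
The filtering asked for above can be sketched as follows. This is only an illustration under assumptions: the `RevocableType` enum, the `Resource` struct and the `usageSlackTotal` function are hypothetical and much simpler than the real Mesos resource types.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model; the real Mesos Resource protobuf differs.
enum class RevocableType { NONE, USAGE_SLACK, OTHER };

struct Resource
{
  std::string name;        // e.g. "cpus" or "mem"
  double amount;           // scalar quantity
  RevocableType revocable; // why (or whether) the resource is revocable
};

// Sums the resources called `name` that are revocable because of usage
// slack, skipping non-revocable resources and other revocable kinds.
double usageSlackTotal(
    const std::vector<Resource>& resources,
    const std::string& name)
{
  double total = 0.0;
  for (const Resource& resource : resources) {
    if (resource.name == name &&
        resource.revocable == RevocableType::USAGE_SLACK) {
      total += resource.amount;
    }
  }
  return total;
}
```

The used and percent helpers would apply the same predicate before aggregating.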





[jira] [Created] (MESOS-4129) Ignore some files from eclipse

2015-12-11 Thread Guangya Liu (JIRA)
Guangya Liu created MESOS-4129:
--

 Summary: Ignore some files from eclipse
 Key: MESOS-4129
 URL: https://issues.apache.org/jira/browse/MESOS-4129
 Project: Mesos
  Issue Type: Bug
Reporter: Guangya Liu
Assignee: Guangya Liu


When using Eclipse to edit Mesos code, "git status" always shows some
Eclipse system files. It would be better to add those files to gitignore so
that "git status" no longer shows them and developers can simply use
"git add ." to add all modified files.





[jira] [Updated] (MESOS-3943) Support dynamic weight in allocator

2015-12-11 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3943:
--
   Shepherd: Adam B
Description: Currently, the RoleInfo protobuf is never used for serialization, 
so I think we can remove it from allocator.proto and define a struct to 
communicate between the allocator and master. For displaying role information, 
however, the current serialization approach (calling model(role*) in http.cpp) 
is not ideal, and we should define another RoleInfo protobuf for serialization. 
Following other components (such as quota), I propose defining the role 
protobuf in a separate package rather than in mesos.proto.  (was: Mesos allocator 
should aware the role change, this includes adding, updating and delete a role. 
so in this ticket, we will extend the allocator interface based on the design 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#)
Summary: Support dynamic weight in allocator  (was: Dynamic 
roles/weights support in allocator)

> Support dynamic weight in allocator
> ---
>
> Key: MESOS-3943
> URL: https://issues.apache.org/jira/browse/MESOS-3943
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Currently, the RoleInfo protobuf is never used for serialization, so I think 
> we can remove it from allocator.proto and define a struct to communicate 
> between the allocator and master. For displaying role information, however, 
> the current serialization approach (calling model(role*) in http.cpp) is not 
> ideal, and we should define another RoleInfo protobuf for serialization. 
> Following other components (such as quota), I propose defining the role 
> protobuf in a separate package rather than in mesos.proto.





[jira] [Commented] (MESOS-3943) Support dynamic weight in allocator

2015-12-11 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052486#comment-15052486
 ] 

Yong Qiao Wang commented on MESOS-3943:
---

Added review request (RR): https://reviews.apache.org/r/40469/

> Support dynamic weight in allocator
> ---
>
> Key: MESOS-3943
> URL: https://issues.apache.org/jira/browse/MESOS-3943
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Currently, the RoleInfo protobuf is never used for serialization, so I think 
> we can remove it from allocator.proto and define a struct to communicate 
> between the allocator and master. For displaying role information, however, 
> the current serialization approach (calling model(role*) in http.cpp) is not 
> ideal, and we should define another RoleInfo protobuf for serialization. 
> Following other components (such as quota), I propose defining the role 
> protobuf in a separate package rather than in mesos.proto.





[jira] [Assigned] (MESOS-4072) The lt-mesos-master will coredump in some situation.

2015-12-11 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-4072:
--

Assignee: Neil Conway

> The lt-mesos-master will coredump in some situation.
> 
>
> Key: MESOS-4072
> URL: https://issues.apache.org/jira/browse/MESOS-4072
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.25.0
>Reporter: Nan Xiao
>Assignee: Neil Conway
>  Labels: mesosphere, newbie
>
> I find that lt-mesos-master will core dump when the following conditions are met:
> (1) The user doesn't have write permission for the /var/lib/mesos directory:
> nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/
> total 176
> dr-xr-xr-x 2 root root 4096 Dec  7 03:08 mesos
> ..
> (2) /var/lib/mesos is an empty folder:
> nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/mesos/
> total 0
> Executing the following command will core dump:
> nan@ubuntu:~/mesos-0.25.0/build$ ./bin/mesos-master.sh --ip=16.187.250.141 
> --work_dir=/var/lib/mesos
> I1207 03:18:36.431015 22951 main.cpp:229] Build: 2015-12-07 00:11:18 by nan
> I1207 03:18:36.431154 22951 main.cpp:231] Version: 0.25.0
> I1207 03:18:36.431388 22951 main.cpp:252] Using 'HierarchicalDRF' allocator
> F1207 03:18:36.431807 22951 replica.cpp:724] CHECK_SOME(state): IO error: 
> /var/lib/mesos/replicated_log/LOCK: No such file or directory Failed to 
> recover the log
> *** Check failure stack trace: ***
> @ 0x7f076bc208ca  google::LogMessage::Fail()
> @ 0x7f076bc20816  google::LogMessage::SendToLog()
> @ 0x7f076bc20218  google::LogMessage::Flush()
> @ 0x7f076bc2312c  google::LogMessageFatal::~LogMessageFatal()
> @ 0x7f076adf8f30  _CheckFatal::~_CheckFatal()
> @ 0x7f076baa4939  mesos::internal::log::ReplicaProcess::restore()
> @ 0x7f076baa0f8c  
> mesos::internal::log::ReplicaProcess::ReplicaProcess()
> @ 0x7f076baa4c95  mesos::internal::log::Replica::Replica()
> @ 0x7f076b9cf819  mesos::internal::log::LogProcess::LogProcess()
> @ 0x7f076b9d576c  mesos::internal::log::Log::Log()
> @   0x46d21f  main
> @ 0x7f0766f69ec5  (unknown)
> @   0x46b979  (unknown)
> Aborted (core dumped)
> Use gdb to analyze it:
> nan@ubuntu:~/mesos-0.25.0/build$ gdb 
> /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master core
> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
> Copyright (C) 2014 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later 
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> .
> Find the GDB manual and other documentation resources online at:
> .
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from 
> /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master...done.
> [New LWP 22065]
> [New LWP 22087]
> [New LWP 22085]
> [New LWP 22089]
> [New LWP 22084]
> [New LWP 22086]
> [New LWP 22091]
> [New LWP 22088]
> [New LWP 22092]
> [New LWP 22090]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master 
> --ip=127.0.0.1 --work_di'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7fe917810cc9 in __GI_raise (sig=sig@entry=6) at 
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> Traceback (most recent call last):
>   File 
> "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py",
>  line 63, in 
> from libstdcxx.v6.printers import register_libstdcxx_printers
> ImportError: No module named 'libstdcxx'
> (gdb) bt
> #0  0x7fe917810cc9 in __GI_raise (sig=sig@entry=6) at 
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x7fe9178140d8 in __GI_abort () at abort.c:89
> #2  0x7fe91c4b8c1b in DumpStackTraceAndExit () from 
> /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #3  0x7fe91c4b28ca in google::LogMessage::Fail () from 
> /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #4  0x7fe91c4b2816 in google::LogMessage::SendToLog () from 
> /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #5  0x7fe91c4b2218 in google::LogMessage::Flush () from 
> /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #6  

[jira] [Assigned] (MESOS-4109) HTTPConnectionTest.ClosingResponse is flaky

2015-12-11 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler reassigned MESOS-4109:
--

Assignee: Benjamin Mahler

Thanks for filing! I introduced this, I'll fix it shortly.

> HTTPConnectionTest.ClosingResponse is flaky
> ---
>
> Key: MESOS-4109
> URL: https://issues.apache.org/jira/browse/MESOS-4109
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, test
>Affects Versions: 0.26.0
> Environment: ASF Ubuntu 14 
> {{--enable-ssl --enable-libevent}}
>Reporter: Joseph Wu
>Assignee: Benjamin Mahler
>Priority: Minor
>  Labels: flaky, flaky-test, newbie, test
>
> Output of the test:
> {code}
> [ RUN  ] HTTPConnectionTest.ClosingResponse
> I1210 01:20:27.048532 26671 process.cpp:3077] Handling HTTP event for process 
> '(22)' with path: '/(22)/get'
> ../../../3rdparty/libprocess/src/tests/http_tests.cpp:919: Failure
> Actual function call count doesn't match EXPECT_CALL(*http.process, get(_))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> [  FAILED  ] HTTPConnectionTest.ClosingResponse (43 ms)
> {code}





[jira] [Updated] (MESOS-4109) HTTPConnectionTest.ClosingResponse is flaky

2015-12-11 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4109:
--
  Sprint: Mesosphere Sprint 24
Story Points: 1
  Labels: flaky flaky-test mesosphere newbie test  (was: flaky 
flaky-test newbie test)

> HTTPConnectionTest.ClosingResponse is flaky
> ---
>
> Key: MESOS-4109
> URL: https://issues.apache.org/jira/browse/MESOS-4109
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, test
>Affects Versions: 0.26.0
> Environment: ASF Ubuntu 14 
> {{--enable-ssl --enable-libevent}}
>Reporter: Joseph Wu
>Assignee: Benjamin Mahler
>Priority: Minor
>  Labels: flaky, flaky-test, mesosphere, newbie, test
> Fix For: 0.27.0
>
>
> Output of the test:
> {code}
> [ RUN  ] HTTPConnectionTest.ClosingResponse
> I1210 01:20:27.048532 26671 process.cpp:3077] Handling HTTP event for process 
> '(22)' with path: '/(22)/get'
> ../../../3rdparty/libprocess/src/tests/http_tests.cpp:919: Failure
> Actual function call count doesn't match EXPECT_CALL(*http.process, get(_))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> [  FAILED  ] HTTPConnectionTest.ClosingResponse (43 ms)
> {code}





[jira] [Commented] (MESOS-3305) Getting Started docs for Ubuntu needs reference to libsasl2-modules

2015-12-11 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053477#comment-15053477
 ] 

Kevin Klues commented on MESOS-3305:


On a fresh install of an ubuntu/trusty64 image with vagrant I do not see this 
issue.  Can you confirm this is still an issue with the current Getting Started 
instructions?

I provision with the following script (adapted from 
http://mesos.apache.org/gettingstarted/):
#!/usr/bin/env bash

# Update the packages.
sudo apt-get update

# Install the latest OpenJDK.
sudo apt-get install -y openjdk-7-jdk

# Install autotools (Only necessary if building from git repository).
sudo apt-get install -y autoconf libtool

# Install other Mesos dependencies.
sudo apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev maven libapr1-dev libsvn-dev


> Getting Started docs for Ubuntu needs reference to libsasl2-modules
> ---
>
> Key: MESOS-3305
> URL: https://issues.apache.org/jira/browse/MESOS-3305
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.23.0
> Environment: Ubuntu 14.04
>Reporter: Andrew A Smith
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: documentation, mesosphere, newbie
>
> Following the Getting Started docs leads to an error during configure, due to 
> a missing dependency.
> Error during configure:
> checking SASL CRAM-MD5 support... configure: error: no
> ---
> We need CRAM-MD5 support for SASL authentication.
> ---





[jira] [Commented] (MESOS-4026) RegistryClientTest.SimpleRegistryPuller is flaky

2015-12-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053137#comment-15053137
 ] 

Joseph Wu commented on MESOS-4026:
--

Partially related, on some systems, the test will fail after 200 or so 
iterations, due to too many open FDs:
https://reviews.apache.org/r/41234/

> RegistryClientTest.SimpleRegistryPuller is flaky
> 
>
> Key: MESOS-4026
> URL: https://issues.apache.org/jira/browse/MESOS-4026
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Jojy Varghese
>  Labels: containerizer, flaky-test, mesosphere
>
> From ASF CI:
> https://builds.apache.org/job/Mesos/1289/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,OS=centos:7,label_exp=docker%7C%7CHadoop/console
> {code}
> [ RUN  ] RegistryClientTest.SimpleRegistryPuller
> I1127 02:51:40.235900   362 registry_client.cpp:511] Response status for url 
> 'https://localhost:57828/v2/library/busybox/manifests/latest': 401 
> Unauthorized
> I1127 02:51:40.249766   360 registry_client.cpp:511] Response status for url 
> 'https://localhost:57828/v2/library/busybox/manifests/latest': 200 OK
> I1127 02:51:40.251137   361 registry_puller.cpp:195] Downloading layer 
> '1ce2e90b0bc7224de3db1f0d646fe8e2c4dd37f1793928287f6074bc451a57ea' for image 
> 'busybox:latest'
> I1127 02:51:40.258514   354 registry_client.cpp:511] Response status for url 
> 'https://localhost:57828/v2/library/busybox/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4':
>  307 Temporary Redirect
> I1127 02:51:40.264171   367 libevent_ssl_socket.cpp:1023] Socket error: 
> Connection reset by peer
> ../../src/tests/containerizer/provisioner_docker_tests.cpp:1210: Failure
> (socket).failure(): Failed accept: connection error: Connection reset by peer
> [  FAILED  ] RegistryClientTest.SimpleRegistryPuller (349 ms)
> {code}
> Logs from a previous run that passed:
> {code}
> [ RUN  ] RegistryClientTest.SimpleRegistryPuller
> I1126 18:49:05.306396   349 registry_client.cpp:511] Response status for url 
> 'https://localhost:53492/v2/library/busybox/manifests/latest': 401 
> Unauthorized
> I1126 18:49:05.321362   347 registry_client.cpp:511] Response status for url 
> 'https://localhost:53492/v2/library/busybox/manifests/latest': 200 OK
> I1126 18:49:05.322720   352 registry_puller.cpp:195] Downloading layer 
> '1ce2e90b0bc7224de3db1f0d646fe8e2c4dd37f1793928287f6074bc451a57ea' for image 
> 'busybox:latest'
> I1126 18:49:05.331317   350 registry_client.cpp:511] Response status for url 
> 'https://localhost:53492/v2/library/busybox/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4':
>  307 Temporary Redirect
> I1126 18:49:05.370625   352 registry_client.cpp:511] Response status for url 
> 'https://127.0.0.1:53492/': 200 OK
> I1126 18:49:05.372102   355 registry_puller.cpp:294] Untarring layer 
> '1ce2e90b0bc7224de3db1f0d646fe8e2c4dd37f1793928287f6074bc451a57ea' downloaded 
> from registry to directory 'output_dir'
> [   OK ] RegistryClientTest.SimpleRegistryPuller (353 ms)
> {code}





[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2015-12-11 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053110#comment-15053110
 ] 

Sargun Dhillon commented on MESOS-4114:
---

Isn't the port on DiscoveryInfo.Ports.Port the local port (the one that 
Marathon requested from Mesos)? Otherwise, how do you know which DiscoveryInfo 
name correlates with which Mesos port?

Also, you may want to expose different services under different names, or IPs.

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.





[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2015-12-11 Thread Tobi Knaup (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053092#comment-15053092
 ] 

Tobi Knaup commented on MESOS-4114:
---

Can you explain why the VIP should go into Port vs. DiscoveryInfo?
If I have an app that has a public port (8080) and an admin port (8081) I'd 
expect to reach them both on the same VIP, so DiscoveryInfo seems like the 
right place.

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.





[jira] [Commented] (MESOS-4026) RegistryClientTest.SimpleRegistryPuller is flaky

2015-12-11 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053098#comment-15053098
 ] 

Artem Harutyunyan commented on MESOS-4026:
--

[~tnachen], could you please update the issue and close it?

> RegistryClientTest.SimpleRegistryPuller is flaky
> 
>
> Key: MESOS-4026
> URL: https://issues.apache.org/jira/browse/MESOS-4026
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Jojy Varghese
>  Labels: containerizer, flaky-test, mesosphere
>





[jira] [Commented] (MESOS-4003) Pass agent work_dir to isolator modules

2015-12-11 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053135#comment-15053135
 ] 

Greg Mann commented on MESOS-4003:
--

Thanks to [~jieyu] for presenting a much simpler solution for this: since 
{{work_dir}} is set at the command line anyway, we can just pass it to the 
modules via the {{parameters}} that they receive in their JSON command-line 
input.
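
As a rough sketch of that approach, a module could look the value up in its key/value parameters. The `Parameters` alias, the `findParameter` helper and the `work_dir` key name are assumptions made for illustration; the actual module parameter type in Mesos is a protobuf and is not reproduced here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the key/value parameters a module receives
// from its JSON command-line input.
using Parameters = std::vector<std::pair<std::string, std::string>>;

// Returns the value of the first parameter named `key`, or `fallback`
// when the operator did not supply that parameter.
std::string findParameter(
    const Parameters& parameters,
    const std::string& key,
    const std::string& fallback)
{
  for (const std::pair<std::string, std::string>& parameter : parameters) {
    if (parameter.first == key) {
      return parameter.second;
    }
  }
  return fallback;
}
```

An isolator module could then call something like `findParameter(parameters, "work_dir", "")` at creation time and mount volumes beneath the returned directory, with the operator passing the same path to the agent and to the module.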

Closing this ticket as a "won't fix".

> Pass agent work_dir to isolator modules
> ---
>
> Key: MESOS-4003
> URL: https://issues.apache.org/jira/browse/MESOS-4003
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: external-volumes, mesosphere
>
> Some isolator modules can benefit from access to the agent's {{work_dir}}. 
> For example, the DVD isolator (https://github.com/emccode/mesos-module-dvdi) 
> is currently forced to mount external volumes in a hard-coded directory. 
> Making the {{work_dir}} accessible to the isolator via 
> {{Isolator::recover()}} would allow the isolator to mount volumes within the 
> agent's {{work_dir}}. This can be accomplished by simply adding an overloaded 
> signature for {{Isolator::recover()}} which includes the {{work_dir}} as a 
> parameter.





[jira] [Commented] (MESOS-4122) slave should ignore attribute changes

2015-12-11 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053391#comment-15053391
 ] 

James Peach commented on MESOS-4122:


Ah I see your point :-/

> slave should ignore attribute changes
> -
>
> Key: MESOS-4122
> URL: https://issues.apache.org/jira/browse/MESOS-4122
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: James Peach
>Priority: Minor
>
> {{mesos-slave}} should ignore changes in attributes when it checks for 
> incompatible {{SlaveInfo}} changes.
> This is a trivial change and I'm going to carry this patch internally. Let's 
> have a discussion on what this means semantically. It is not clear to me 
> whether it is a generally correct change.
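
To make the proposed check concrete, here is a minimal sketch under assumptions: `AgentInfoSketch` and `compatible` are hypothetical names, and the struct keeps only a few illustrative fields rather than the real {{SlaveInfo}} protobuf. It shows the mechanics only and does not settle the semantic questions raised in the ticket.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-in for SlaveInfo.
struct AgentInfoSketch
{
  std::string hostname;
  std::map<std::string, double> resources;
  std::map<std::string, std::string> attributes;
};

// Treats two infos as compatible when everything except the attributes
// matches; attribute changes are deliberately ignored.
bool compatible(
    const AgentInfoSketch& previous,
    const AgentInfoSketch& current)
{
  return previous.hostname == current.hostname &&
         previous.resources == current.resources;
}
```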





[jira] [Commented] (MESOS-4026) RegistryClientTest.SimpleRegistryPuller is flaky

2015-12-11 Thread Jojy Varghese (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053141#comment-15053141
 ] 

Jojy Varghese commented on MESOS-4026:
--

The root cause of all the issues has been addressed in 
https://reviews.apache.org/r/41253. The failures seen on buildbot are not due to 
the FD leak, which is an unrelated issue, since the tests run only once there. 
Still, it's a good find.

> RegistryClientTest.SimpleRegistryPuller is flaky
> 
>
> Key: MESOS-4026
> URL: https://issues.apache.org/jira/browse/MESOS-4026
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Jojy Varghese
>  Labels: containerizer, flaky-test, mesosphere
>





[jira] [Commented] (MESOS-4122) slave should ignore attribute changes

2015-12-11 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053176#comment-15053176
 ] 

Vinod Kone commented on MESOS-4122:
---

This is not as simple as you might think!

See the linked ticket for previous discussions/design doc.

> slave should ignore attribute changes
> -
>
> Key: MESOS-4122
> URL: https://issues.apache.org/jira/browse/MESOS-4122
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: James Peach
>Priority: Minor
>
> {{mesos-slave}} should ignore changes in attributes when it checks for 
> incompatible {{SlaveInfo}} changes.
> This is a trivial change and I'm going to carry this patch internally. Let's 
> have a discussion on what this means semantically. It is not clear to me 
> whether it is a generally correct change.





[jira] [Commented] (MESOS-3892) Add a helper function to the Agent to retrieve the list of executors that are using optimistically offered, revocable resources.

2015-12-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053148#comment-15053148
 ] 

Joseph Wu commented on MESOS-3892:
--

Yes that's reasonable (and what we discussed in the [work group 
meeting|https://docs.google.com/document/d/1CKMelV6xD_HOsqwbqH3PM24P7ypS_G4oz_MDNxE85D8/edit#bookmark=id.xlfbqnql7ngq]).

Can you update the relevant JIRAs accordingly (rename, update descriptions, 
etc.)?

> Add a helper function to the Agent to retrieve the list of executors that are 
> using optimistically offered, revocable resources.
> 
>
> Key: MESOS-3892
> URL: https://issues.apache.org/jira/browse/MESOS-3892
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Klaus Ma
>  Labels: mesosphere
>
> {noformat}
> class Slave {
>   ...
>   // How the master currently keeps track of executors.
>   hashmap> executors;
>   ...
>   // Returns the list of executors that are using optimistically-
>   // offered, revocable resources.
>   list getEvictableExecutors() { ... }
>   ...
> }
> {noformat}
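
The template arguments in the snippet above were lost in the archive, so the following is only a sketch under assumptions: the nested map keyed by framework ID and executor ID, the `Executor` struct with a `usesRevocable` flag, and the `add` method are all hypothetical. It illustrates the kind of filtering such a helper would perform, not the actual Mesos types.

```cpp
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical executor record; the flag marks executors that run on
// optimistically offered, revocable resources.
struct Executor
{
  std::string id;
  bool usesRevocable;
};

class Slave
{
public:
  // Registers an executor under a framework ID (assumed key layout).
  void add(const std::string& frameworkId, const Executor& executor)
  {
    executors[frameworkId][executor.id] = executor;
  }

  // Returns the executors that are using revocable resources and could
  // therefore be evicted.
  std::list<Executor> getEvictableExecutors() const
  {
    std::list<Executor> evictable;
    for (const auto& framework : executors) {
      for (const auto& entry : framework.second) {
        if (entry.second.usesRevocable) {
          evictable.push_back(entry.second);
        }
      }
    }
    return evictable;
  }

private:
  // frameworkId -> (executorId -> executor); an assumed layout.
  std::unordered_map<
      std::string,
      std::unordered_map<std::string, Executor>> executors;
};
```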





[jira] [Updated] (MESOS-4127) Ensure `Content-Type` field is set for some responses

2015-12-11 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-4127:
-
Summary: Ensure `Content-Type` field is set for some responses  (was: 
Ensure `Conten-Type` field is set for some responses)

> Ensure `Content-Type` field is set for some responses
> -
>
> Key: MESOS-4127
> URL: https://issues.apache.org/jira/browse/MESOS-4127
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>  Labels: http, mesosphere, newbie++, tech-debt
>
> As pointed out by [~anandmazumdar] in https://reviews.apache.org/r/40905/, we 
> should make sure we set the {{Content-Type}} field for some responses.





[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2015-12-11 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052413#comment-15052413
 ] 

Adam B commented on MESOS-4114:
---

If we had labels from MESOS-3962, would you really need VIP as a first-class 
Port field?

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.





[jira] [Commented] (MESOS-4129) Ignore some files from eclipse

2015-12-11 Thread Benjamin Bannier (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052473#comment-15052473
 ] 

Benjamin Bannier commented on MESOS-4129:
-

Just wondering if we do plan to add e.g., backup files or project configs from 
vim, emacs and IDE/editor X as well? Right now {{/.gitignore-template}} only 
deals with files shared by all users (OK, assuming everybody is using autotools 
to build), and this addition would deviate from that.

I personally have been happy enough with {{.git/info/exclude}} for my personal 
blacklist.

> Ignore some files from eclipse
> --
>
> Key: MESOS-4129
> URL: https://issues.apache.org/jira/browse/MESOS-4129
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> When using Eclipse to edit Mesos code, the "git status" command always
> shows some Eclipse system files. It would be better to add those files
> to gitignore so that "git status" does not show them and the developer
> can simply use "git add ." to add all modified files.





[jira] [Created] (MESOS-4138) Document proposal for exclusive resources in Mesos

2015-12-11 Thread Ian Downes (JIRA)
Ian Downes created MESOS-4138:
-

 Summary: Document proposal for exclusive resources in Mesos
 Key: MESOS-4138
 URL: https://issues.apache.org/jira/browse/MESOS-4138
 Project: Mesos
  Issue Type: Improvement
  Components: isolation
Reporter: Ian Downes


Propose the concept of exclusivity for resources. An exclusive resource a) is 
not shared with any other task, b) employs stronger isolation for more 
predictable performance, and c) is consequently not oversubscribed (if 
enabled). Relative to normal resources, exclusive resources have greater 
resource priority while oversubscribed resources have lower priority.

Initial resources that could support the notion of exclusivity include cpu, 
network egress bandwidth, and IP addresses.

Please see this 
[document|https://docs.google.com/document/d/1Aby-U3-MPKE51s4aYd41L4Co2S97eM6LPtyzjyR_ecI/edit?usp=sharing].
 All comments welcome!





[jira] [Created] (MESOS-4139) Make escalationTimeout configurable

2015-12-11 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4139:
-

 Summary: Make escalationTimeout configurable
 Key: MESOS-4139
 URL: https://issues.apache.org/jira/browse/MESOS-4139
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Critical


At the moment, escalationTimeout is fixed at 3 seconds in the code. This means 
that if a task is shut down, there are only 3 seconds between the SIGTERM and 
the SIGKILL. If someone is running something like a Rails framework, that may 
be too short for the task to terminate cleanly.
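
The SIGTERM-then-SIGKILL escalation described above can be sketched with plain POSIX calls, with the grace period passed in as a parameter instead of being hard-coded. This is only an illustration under stated assumptions: `terminateWithEscalation` is a hypothetical name, the polling loop is a simplification, and none of this is the actual Mesos executor code.

```cpp
#include <chrono>
#include <thread>

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: sends SIGTERM to the child `pid`, waits up to
// `escalationTimeout` for it to exit, and escalates to SIGKILL if it has not.
// Returns the signal that terminated the child, 0 if it exited normally,
// or -1 on error. `pid` must be a direct child of the caller.
int terminateWithEscalation(pid_t pid,
                            std::chrono::milliseconds escalationTimeout)
{
  if (kill(pid, SIGTERM) != 0) {
    return -1;
  }

  const auto deadline = std::chrono::steady_clock::now() + escalationTimeout;
  int status = 0;

  // Poll until the child exits or the grace period runs out.
  while (std::chrono::steady_clock::now() < deadline) {
    const pid_t reaped = waitpid(pid, &status, WNOHANG);
    if (reaped == pid) {
      return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    }
    if (reaped == -1) {
      return -1;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }

  // Grace period expired: escalate to SIGKILL and reap the child.
  if (kill(pid, SIGKILL) != 0) {
    return -1;
  }
  if (waitpid(pid, &status, 0) != pid) {
    return -1;
  }
  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A caller that needs a longer grace period, for example for a slow-stopping web application, could then pass something like `std::chrono::seconds(30)` rather than relying on a fixed 3 seconds.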


