[ 
https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031847#comment-15031847
 ] 

Jan Schlicht edited comment on MESOS-3975 at 11/30/15 2:35 PM:
---------------------------------------------------------------

I could reproduce this on Fedora 23 by running {{./bin/mesos-tests.sh 
--gtest_repeat=-1}} on a build configured with {{--enable-libevent 
--enable-ssl}}, though for me it is DiskQuotaTest.DiskUsageExceedsQuota that 
fails.
It seems that the RegistryClientTest fixture sets some {{SSL_}} environment 
variables but doesn't unset them. This leaves SSL enabled for subsequent 
fixtures and causes libprocess initialization to fail with "Could not load 
cert file".
I don't know how GTest orders/shuffles its tests, but the ordering matters 
here. The following command reproduces the failure for me:
{{./bin/mesos-tests.sh --gtest_repeat=2 
--gtest_filter="RegistryClientTest.*:*.DiskUsageExceedsQuota" 
--gtest_break_on_failure}}
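
To illustrate the kind of cleanup that seems to be missing, here is a minimal 
sketch of an RAII guard that restores environment variables when it goes out 
of scope. It is not taken from the Mesos code base: ScopedEnvironment and 
runWithSslEnabled are hypothetical names, SSL_ENABLED only stands in for 
whichever {{SSL_}} variables the fixture actually sets, and setenv/unsetenv 
assume a POSIX system.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdlib.h>  // setenv/unsetenv (POSIX)
#include <string>
#include <vector>

// Hypothetical helper (not part of Mesos or GTest): remembers the previous
// state of every variable it sets and restores that state on destruction.
class ScopedEnvironment
{
public:
  ScopedEnvironment() = default;
  ScopedEnvironment(const ScopedEnvironment&) = delete;
  ScopedEnvironment& operator=(const ScopedEnvironment&) = delete;

  void set(const std::string& name, const std::string& value)
  {
    assert(!name.empty());  // setenv rejects an empty name.

    // Record whether the variable existed and, if so, its old value.
    const char* previous = ::getenv(name.c_str());
    saved.push_back(
        Entry{name, previous != nullptr, previous != nullptr ? previous : ""});

    ::setenv(name.c_str(), value.c_str(), 1);
  }

  ~ScopedEnvironment()
  {
    // Undo in reverse order so that setting one name twice unwinds correctly.
    for (auto it = saved.rbegin(); it != saved.rend(); ++it) {
      if (it->existed) {
        ::setenv(it->name.c_str(), it->value.c_str(), 1);
      } else {
        ::unsetenv(it->name.c_str());
      }
    }
  }

private:
  struct Entry
  {
    std::string name;
    bool existed;
    std::string value;
  };

  std::vector<Entry> saved;
};

// Stands in for a test body: the variable is set only while the guard lives.
// "SSL_ENABLED" is an illustrative name, not taken from the report.
bool runWithSslEnabled()
{
  ScopedEnvironment environment;
  environment.set("SSL_ENABLED", "true");
  return ::getenv("SSL_ENABLED") != nullptr;
}
```

In a GTest fixture, a guard like this could be held as a member so that it is 
destroyed together with the fixture after each test. Alternatively, the 
variables could be unset explicitly in TearDown().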



> SSL build of mesos causes flaky testsuite.
> ------------------------------------------
>
>                 Key: MESOS-3975
>                 URL: https://issues.apache.org/jira/browse/MESOS-3975
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.26.0
>         Environment: CentOS 7.1, Kernel 3.10.0-229.20.1.el7.x86_64, gcc 4.8.3, Docker 1.9
>            Reporter: Till Toenshoff
>            Assignee: Joseph Wu
>              Labels: mesosphere
>
> When running the tests of an SSL build of Mesos on CentOS 7.1, I see spurious 
> test failures that I have so far been unable to reproduce reliably.
> The following tests failed for me in complete runs but passed when run 
> individually, even when repeated.
> {noformat}
> DockerTest.ROOT_DOCKER_CheckPortResource
> {noformat}
> {noformat}
> ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> {noformat}
> {noformat}
> [ RUN      ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor
> 2015-11-20 19:08:38,826:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/
> + grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/.+ /proc/self/mountinfo
> + grep -v 2b98025c-74f1-41d2-b35a-ce2cdfae347e
> + cut '-d ' -f5
> + xargs --no-run-if-empty umount -l
> + mount -n --rbind /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/provisioner/containers/2b98025c-74f1-41d2-b35a-ce2cdfae347e/backends/copy/rootfses/bed11080-474b-4c69-8e7f-0ab85e895b0d /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/slaves/830e842e-c36a-4e4c-bff4-5b9568d7df12-S0/frameworks/830e842e-c36a-4e4c-bff4-5b9568d7df12-0000/executors/c735be54-c47f-4645-bfc1-2f4647e2cddb/runs/2b98025c-74f1-41d2-b35a-ce2cdfae347e/.rootfs
> Could not load cert file
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_FAILED
> Expected: TASK_RUNNING
> 2015-11-20 19:08:42,164:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> 2015-11-20 19:08:45,501:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> 2015-11-20 19:08:48,837:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> 2015-11-20 19:08:52,174:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure
> Failed to wait 15secs for statusFinished
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(&driver, _))...
>          Expected: to be called twice
>            Actual: called once - unsatisfied and active
> 2015-11-20 19:08:55,511:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
> *** Aborted at 1448046536 (unix time) try "date -d @1448046536" if you are using GNU date ***
> PC: @                0x0 (unknown)
> *** SIGSEGV (@0x0) received by PID 21380 (TID 0x7fa1549e68c0) from PID 0; stack trace: ***
>     @     0x7fa141796fbb (unknown)
>     @     0x7fa14179b341 (unknown)
>     @     0x7fa14f096130 (unknown)
> {noformat}
> Vagrantfile generator:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.hostname = "centos71"
>   config.vm.box = "bento/centos-7.1"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = 16384
>     vb.cpus = 8
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = 9216
>     vb.cpus = 4
>   end
>   config.vm.provision "shell", inline: <<-SHELL
>      sudo yum -y update systemd
>      sudo yum install -y tar wget
>      sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
>      sudo yum groupinstall -y "Development Tools"
>      sudo yum install -y apache-maven python-devel java-1.7.0-openjdk-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-util-devel
>      sudo yum install -y libevent-devel
>      sudo yum install -y git
>      sudo yum install -y docker
>      sudo service docker start
>      sudo docker info
>      #sudo wget -qO- https://get.docker.com/ | sh
>   SHELL
> end
> EOF
> vagrant up
> vagrant reload
> vagrant ssh -c "
> git clone  https://github.com/apache/mesos.git mesos
> cd mesos
> git checkout -b 0.26.0-rc1 0.26.0-rc1
> ./bootstrap
> mkdir build
> cd build
> ../configure --enable-libevent --enable-ssl
> GTEST_FILTER="" make check
> sudo ./bin/mesos-tests.sh
> "
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
