[ 
https://issues.apache.org/jira/browse/MESOS-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135602#comment-16135602
 ] 

Vinod Kone commented on MESOS-7218:
-----------------------------------

Saw a slightly different bug (double free or corruption) for this test in ASF CI.

{code}
[ RUN      ] ExamplesTest.PythonFramework
Using temporary directory '/tmp/ExamplesTest_PythonFramework_z8oLwZ'
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0818 23:26:03.783149  9113 process.cpp:1393] libprocess is initialized on 
172.17.0.2:45305 with 16 worker threads
I0818 23:26:03.783207  9113 logging.cpp:199] Logging to STDERR
I0818 23:26:03.973633  9113 leveldb.cpp:174] Opened db in 186.110939ms
I0818 23:26:04.006614  9113 leveldb.cpp:181] Compacted db in 32.951872ms
I0818 23:26:04.006680  9113 leveldb.cpp:196] Created db iterator in 20157ns
I0818 23:26:04.006700  9113 leveldb.cpp:202] Seeked to beginning of db in 8574ns
I0818 23:26:04.006712  9113 leveldb.cpp:271] Iterated through 0 keys in the db 
in 6739ns
I0818 23:26:04.006757  9113 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0818 23:26:04.007972  9132 recover.cpp:451] Starting replica recovery
I0818 23:26:04.008108  9132 recover.cpp:477] Replica is in EMPTY status
I0818 23:26:04.008359  9113 local.cpp:272] Creating default 'local' authorizer
I0818 23:26:04.008885  9130 replica.cpp:676] Replica in EMPTY status received a 
broadcasted recover request from __req_res__(1)@172.17.0.2:45305
I0818 23:26:04.010182  9131 recover.cpp:197] Received a recover response from a 
replica in EMPTY status
I0818 23:26:04.010478  9127 recover.cpp:568] Updating replica status to STARTING
I0818 23:26:04.011464  9139 master.cpp:442] Master 
82ba1c51-64a2-4202-9dac-081a35935d4e (8ca4e8552e66) started on 172.17.0.2:45305
I0818 23:26:04.011514  9139 master.cpp:444] Flags at startup: 
--acls="permissive: false
register_frameworks {
  principals {
    type: SOME
    values: "test-principal"
  }
  roles {
    type: SOME
    values: "*"
  }
}
run_tasks {
  principals {
    type: SOME
    values: "test-principal"
  }
  users {
    type: SOME
    values: "mesos"
  }
}
register_agents {
  principals {
    type: ANY
  }
  agents {
    type: ANY
  }
}
" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="false" --authenticate_frameworks="true" 
--authenticate_http_frameworks="false" --authenticate_http_readonly="false" 
--authenticate_http_readwrite="false" --authenticators="crammd5" 
--authorizers="local" 
--credentials="/tmp/ExamplesTest_PythonFramework_z8oLwZ/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
--max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="replicated_log" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="20secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-1.4.0/src/webui" 
--work_dir="/tmp/mesos-2Nvz1X/master" --zk_session_timeout="10secs"
I0818 23:26:04.012109  9139 master.cpp:494] Master only allowing authenticated 
frameworks to register
I0818 23:26:04.012122  9139 master.cpp:510] Master allowing unauthenticated 
agents to register
I0818 23:26:04.012131  9139 master.cpp:524] Master allowing HTTP frameworks to 
register without authentication
I0818 23:26:04.012143  9139 credentials.hpp:37] Loading credentials for 
authentication from '/tmp/ExamplesTest_PythonFramework_z8oLwZ/credentials'
W0818 23:26:04.012240  9139 credentials.hpp:52] Permissions on credentials file 
'/tmp/ExamplesTest_PythonFramework_z8oLwZ/credentials' are too open; it is 
recommended that your credentials file is NOT accessible by others
I0818 23:26:04.012315  9139 master.cpp:566] Using default 'crammd5' 
authenticator
I0818 23:26:04.012372  9139 authenticator.cpp:520] Initializing server SASL
I0818 23:26:04.014071  9139 auxprop.cpp:73] Initialized in-memory auxiliary 
property plugin
I0818 23:26:04.014174  9139 master.cpp:646] Authorization enabled
I0818 23:26:04.014312  9134 hierarchical.cpp:171] Initialized hierarchical 
allocator process
I0818 23:26:04.014315  9127 whitelist_watcher.cpp:77] No whitelist given
I0818 23:26:04.015699  9113 resolver.cpp:69] Creating default secret resolver
I0818 23:26:04.016268  9113 containerizer.cpp:246] Using isolation: 
filesystem/posix,posix/cpu,posix/mem,network/cni,environment_secret
W0818 23:26:04.016788  9113 backend.cpp:76] Failed to create 'aufs' backend: 
AufsBackend requires root privileges
W0818 23:26:04.016878  9113 backend.cpp:76] Failed to create 'bind' backend: 
BindBackend requires root privileges
I0818 23:26:04.016914  9113 provisioner.cpp:255] Using default backend 'copy'
I0818 23:26:04.019286  9140 slave.cpp:250] Mesos agent started on 
(1)@172.17.0.2:45305
I0818 23:26:04.019330  9140 slave.cpp:251] Flags at startup: 
--acls="permissive: false
register_frameworks {
  principals {
    type: SOME
    values: "test-principal"
  }
  roles {
    type: SOME
    values: "*"
  }
}
run_tasks {
  principals {
    type: SOME
    values: "test-principal"
  }
  users {
    type: SOME
    values: "mesos"
  }
}
register_agents {
  principals {
    type: ANY
  }
  agents {
    type: ANY
  }
}
" --appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" 
--authenticate_http_readwrite="false" --authenticatee="crammd5" 
--authentication_backoff_factor="1secs" --authorizer="local" 
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
--cgroups_root="mesos" --container_disk_watch_interval="15secs" 
--containerizers="mesos" --default_role="*" 
--disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins" 
--docker="docker" --docker_kill_orphans="true" 
--docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" 
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
--docker_store_dir="/tmp/mesos/store/docker" 
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" 
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins" 
--executor_reregistration_timeout="2secs" 
--executor_shutdown_grace_period="5secs" 
--fetcher_cache_dir="/tmp/mesos-2Nvz1X/agents/0/fetch" 
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" 
--gc_disk_headroom="0.1" --hadoop_home="" --help="false" 
--hostname_lookup="true" --http_command_executor="false" 
--http_heartbeat_interval="30secs" --initialize_driver_logging="true" 
--isolation="filesystem/posix,posix/cpu,posix/mem" --launcher="posix" 
--launcher_dir="/mesos/mesos-1.4.0/_build/src" --logbufsecs="0" 
--logging_level="INFO" --max_completed_executors_per_framework="150" 
--oversubscribed_resources_interval="15secs" --perf_duration="10secs" 
--perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" 
--quiet="false" --recover="reconnect" --recovery_timeout="15mins" 
--registration_backoff_factor="1secs" --resources="cpus:2;mem:10240" 
--revocable_cpu_low_priority="true" 
--runtime_dir="/tmp/mesos-2Nvz1X/agents/0/run" 
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" 
--systemd_enable_support="true" 
--systemd_runtime_directory="/run/systemd/system" --version="false" 
--work_dir="/tmp/mesos-2Nvz1X/agents/0/work"
I0818 23:26:04.021750  9113 resolver.cpp:69] Creating default secret resolver
I0818 23:26:04.022028  9113 containerizer.cpp:246] Using isolation: 
filesystem/posix,posix/cpu,posix/mem,network/cni,environment_secret
I0818 23:26:04.022400  9135 master.cpp:2163] Elected as the leading master!
I0818 23:26:04.022433  9135 master.cpp:1702] Recovering from registrar
W0818 23:26:04.022450  9113 backend.cpp:76] Failed to create 'aufs' backend: 
AufsBackend requires root privileges
W0818 23:26:04.022500  9113 backend.cpp:76] Failed to create 'bind' backend: 
BindBackend requires root privileges
I0818 23:26:04.022518  9136 registrar.cpp:347] Recovering registrar
I0818 23:26:04.022526  9113 provisioner.cpp:255] Using default backend 'copy'
I0818 23:26:04.024107  9138 slave.cpp:250] Mesos agent started on 
(2)@172.17.0.2:45305
I0818 23:26:04.024148  9138 slave.cpp:251] Flags at startup: 
--acls="permissive: false
register_frameworks {
  principals {
    type: SOME
    values: "test-principal"
  }
  roles {
    type: SOME
    values: "*"
  }
}
run_tasks {
  principals {
    type: SOME
    values: "test-principal"
  }
  users {
    type: SOME
    values: "mesos"
  }
}
register_agents {
  principals {
    type: ANY
  }
  agents {
    type: ANY
  }
}
" --appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" 
--authenticate_http_readwrite="false" --authenticatee="crammd5" 
--authentication_backoff_factor="1secs" --authorizer="local" 
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
--cgroups_root="mesos" --container_disk_watch_interval="15secs" 
--containerizers="mesos" --default_role="*" 
--disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins" 
--docker="docker" --docker_kill_orphans="true" 
--docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" 
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
--docker_store_dir="/tmp/mesos/store/docker" 
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" 
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins" 
--executor_reregistration_timeout="2secs" 
--executor_shutdown_grace_period="5secs" 
--fetcher_cache_dir="/tmp/mesos-2Nvz1X/agents/1/fetch" 
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" 
--gc_disk_headroom="0.1" --hadoop_home="" --help="false" 
--hostname_lookup="true" --http_command_executor="false" 
--http_heartbeat_interval="30secs" --initialize_driver_logging="true" 
--isolation="filesystem/posix,posix/cpu,posix/mem" --launcher="posix" 
--launcher_dir="/mesos/mesos-1.4.0/_build/src" --logbufsecs="0" 
--logging_level="INFO" --max_completed_executors_per_framework="150" 
--oversubscribed_resources_interval="15secs" --perf_duration="10secs" 
--perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" 
--quiet="false" --recover="reconnect" --recovery_timeout="15mins" 
--registration_backoff_factor="1secs" --resources="cpus:2;mem:10240" 
--revocable_cpu_low_priority="true" 
--runtime_dir="/tmp/mesos-2Nvz1X/agents/1/run" 
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" 
--systemd_enable_support="true" 
--systemd_runtime_directory="/run/systemd/system" --version="false" 
--work_dir="/tmp/mesos-2Nvz1X/agents/1/work"
I0818 23:26:04.026486  9113 resolver.cpp:69] Creating default secret resolver
I0818 23:26:04.026756  9113 containerizer.cpp:246] Using isolation: 
filesystem/posix,posix/cpu,posix/mem,network/cni,environment_secret
W0818 23:26:04.027266  9113 backend.cpp:76] Failed to create 'aufs' backend: 
AufsBackend requires root privileges
W0818 23:26:04.027323  9113 backend.cpp:76] Failed to create 'bind' backend: 
BindBackend requires root privileges
I0818 23:26:04.027364  9113 provisioner.cpp:255] Using default backend 'copy'
I0818 23:26:04.032902  9133 slave.cpp:250] Mesos agent started on 
(3)@172.17.0.2:45305
I0818 23:26:04.033041  9133 slave.cpp:251] Flags at startup: 
--acls="permissive: false
register_frameworks {
  principals {
    type: SOME
    values: "test-principal"
  }
  roles {
    type: SOME
    values: "*"
  }
}
run_tasks {
  principals {
    type: SOME
    values: "test-principal"
  }
  users {
    type: SOME
    values: "mesos"
  }
}
register_agents {
  principals {
    type: ANY
  }
  agents {
    type: ANY
  }
}
" --appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" 
--authenticate_http_readwrite="false" --authenticatee="crammd5" 
--authentication_backoff_factor="1secs" --authorizer="local" 
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
--cgroups_root="mesos" --container_disk_watch_interval="15secs" 
--containerizers="mesos" --default_role="*" 
--disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins" 
--docker="docker" --docker_kill_orphans="true" 
--docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" 
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
--docker_store_dir="/tmp/mesos/store/docker" 
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" 
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins" 
--executor_reregistration_timeout="2secs" 
--executor_shutdown_grace_period="5secs" 
--fetcher_cache_dir="/tmp/mesos-2Nvz1X/agents/2/fetch" 
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" 
--gc_disk_headroom="0.1" --hadoop_home="" --help="false" 
--hostname_lookup="true" --http_command_executor="false" 
--http_heartbeat_interval="30secs" --initialize_driver_logging="true" 
--isolation="filesystem/posix,posix/cpu,posix/mem" --launcher="posix" 
--launcher_dir="/mesos/mesos-1.4.0/_build/src" --logbufsecs="0" 
--logging_level="INFO" --max_completed_executors_per_framework="150" 
--oversubscribed_resources_interval="15secs" --perf_duration="10secs" 
--perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" 
--quiet="false" --recover="reconnect" --recovery_timeout="15mins" 
--registration_backoff_factor="1secs" --resources="cpus:2;mem:10240" 
--revocable_cpu_low_priority="true" 
--runtime_dir="/tmp/mesos-2Nvz1X/agents/2/run" 
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" 
--systemd_enable_support="true" 
--systemd_runtime_directory="/run/systemd/system" --version="false" 
--work_dir="/tmp/mesos-2Nvz1X/agents/2/work"
I0818 23:26:04.020632  9140 slave.cpp:565] Agent resources: 
[{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":10240.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":3701220.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0818 23:26:04.025518  9138 slave.cpp:565] Agent resources: 
[{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":10240.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":3701220.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0818 23:26:04.034517  9138 slave.cpp:573] Agent attributes: [  ]
I0818 23:26:04.034548  9138 slave.cpp:582] Agent hostname: 8ca4e8552e66
I0818 23:26:04.034569  9140 slave.cpp:573] Agent attributes: [  ]
I0818 23:26:04.034377  9133 slave.cpp:565] Agent resources: 
[{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":10240.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":3701220.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0818 23:26:04.034627  9135 status_update_manager.cpp:177] Pausing sending 
status updates
I0818 23:26:04.034641  9133 slave.cpp:573] Agent attributes: [  ]
I0818 23:26:04.034626  9140 slave.cpp:582] Agent hostname: 8ca4e8552e66
I0818 23:26:04.034656  9133 slave.cpp:582] Agent hostname: 8ca4e8552e66
I0818 23:26:04.034734  9129 status_update_manager.cpp:177] Pausing sending 
status updates
I0818 23:26:04.034765  9129 status_update_manager.cpp:177] Pausing sending 
status updates
I0818 23:26:04.035097  9113 sched.cpp:232] Version: 1.4.0
I0818 23:26:04.035392  9136 sched.cpp:336] New master detected at 
master@172.17.0.2:45305
I0818 23:26:04.035468  9136 sched.cpp:407] Authenticating with master 
master@172.17.0.2:45305
I0818 23:26:04.035487  9136 sched.cpp:414] Using default CRAM-MD5 authenticatee
I0818 23:26:04.035712  9126 authenticatee.cpp:97] Initializing client SASL
*** Error in `/usr/bin/python': double free or corruption (fasttop): 
0x00002b225c003c50 ***
Enabling authentication for the framework
../../src/tests/script.cpp:83: Failure
Failed
python_framework_test.sh terminated with signal Aborted
[  FAILED  ] ExamplesTest.PythonFramework (6643 ms)
[----------] 12 tests from ExamplesTest (85729 ms total)

{code}
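
For triage across many CI runs, a log like the one above can be scanned mechanically for the two signals that matter here: the glibc heap-corruption abort and the gtest failure marker. The following is only a minimal sketch. The function name `summarize_failures` and both regular expressions are hypothetical, written against the excerpt above rather than against any documented glibc or gtest output format.

```python
import re

# Assumption: patterns are inferred from the log excerpt above, not from
# a documented format, so they may need adjusting for other glibc/gtest versions.
GLIBC_ABORT = re.compile(
    r"\*\*\* Error in `(?P<binary>[^']+)': "
    r"(?P<reason>[^:]+?)(?: \((?P<detail>\w+)\))?:"
)
GTEST_FAILED = re.compile(r"\[\s+FAILED\s+\]\s+(?P<test>[\w.]+)")


def summarize_failures(log_text):
    """Return (failed_tests, heap_errors) found in a gtest log.

    failed_tests: test names from '[  FAILED  ]' lines, de-duplicated.
    heap_errors:  (binary, reason) pairs from glibc '*** Error in' lines.
    """
    failed_tests = []
    heap_errors = []
    for line in log_text.splitlines():
        abort = GLIBC_ABORT.search(line)
        if abort:
            heap_errors.append((abort.group("binary"), abort.group("reason")))
        failed = GTEST_FAILED.search(line)
        if failed and failed.group("test") not in failed_tests:
            failed_tests.append(failed.group("test"))
    return failed_tests, heap_errors
```

Matching line by line keeps the sketch tolerant of the mail wrapping seen above, where the address following the glibc message lands on the next line.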

> ExamplesTest.PythonFramework is flaky.
> --------------------------------------
>
>                 Key: MESOS-7218
>                 URL: https://issues.apache.org/jira/browse/MESOS-7218
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: Ubuntu (12.04, 14.04, 16.04), MacOS Sierra
>            Reporter: Till Toenshoff
>              Labels: flaky, mesosphere, test
>
> The test appears to be highly flaky (failure rate > 50%) on all tested Ubuntu 
> distros. Right now it is unclear to me if this is an issue of the CI I am 
> using or an actual test problem or even a proper bug when using Mesos on that 
> distribution. 
> Also, it fails on Mac OS Sierra with the following stack trace:
> {code}
> [ RUN      ] ExamplesTest.PythonFramework
> Using temporary directory 
> '/var/folders/v4/0jg38yfd44nb7mzz21c663_40000gn/T/ExamplesTest_PythonFramework_Ic9qX4'
> ../../src/tests/script.cpp:81: Failure
> Failed
> python_framework_test.sh terminated with signal Segmentation fault: 11
> *** Aborted at 1491951380 (unix time) try "date -d @1491951380" if you are 
> using GNU date ***
> PC: @        0x10e9f1170 testing::UnitTest::AddTestPartResult()
> *** SIGSEGV (@0x0) received by PID 710 (TID 0x7fffd5caa3c0) stack trace: ***
>     @     0x7fffccf4ab3a _sigtramp
>     @                0x8 (unknown)
>     @        0x10e9f0977 testing::internal::AssertHelper::operator=()
>     @        0x10e3c8bfd mesos::internal::tests::execute()
>     @        0x10d3e9241 ExamplesTest_PythonFramework_Test::TestBody()
>     @        0x10ea524aa 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
>     @        0x10ea02df7 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @        0x10ea02ca5 testing::Test::Run()
>     @        0x10ea04898 testing::TestInfo::Run()
>     @        0x10ea05de7 testing::TestCase::Run()
>     @        0x10ea15c5c testing::internal::UnitTestImpl::RunAllTests()
>     @        0x10ea548aa 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
>     @        0x10ea15687 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @        0x10ea15558 testing::UnitTest::Run()
>     @        0x10d9b8971 RUN_ALL_TESTS()
>     @        0x10d9b44dd main
>     @     0x7fffccd3b235 start
> Segmentation fault: 11
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
