My apologies, I am in China and have not been able to connect to the VPN for the past few days.
I will check as soon as I am back.

On Fri, Mar 31, 2017 at 4:20 PM, Alex Rukletsov <a...@mesosphere.com> wrote:

> Cool, looking forward to it!
>
> On Fri, Mar 31, 2017 at 4:30 AM, tommy xiao <xia...@gmail.com> wrote:
>
>> Alex, yes, let me give it a try.
>>
>> 2017-03-31 3:16 GMT+08:00 Alex Rukletsov <a...@mesosphere.com>:
>>
>>> This is https://issues.apache.org/jira/browse/MESOS-7210. Deshi, do you
>>> want to send the patch? I or Haosdent can shepherd.
>>>
>>> A.
>>>
>>> On Thu, Mar 30, 2017 at 12:27 PM, tommy xiao <xia...@gmail.com> wrote:
>>>
>>>> That is an interesting case.
>>>>
>>>> 2017-03-30 7:52 GMT+08:00 Jie Yu <yujie....@gmail.com>:
>>>>
>>>>> + AlexR, haosdent
>>>>>
>>>>> For posterity, the root cause of this problem is that when the agent is
>>>>> running inside a Docker container and the `--docker_mesos_image` flag is
>>>>> specified, the pid namespace of the executor container (which initiates the
>>>>> health check) is different from the root pid namespace. Therefore, getting
>>>>> the network namespace handle using `/proc/<pid>/ns/net` does not work,
>>>>> because the 'pid' here is in the root pid namespace (reported by the Docker
>>>>> daemon).
>>>>>
>>>>> Alex and haosdent, I think we should fix this issue. As suggested
>>>>> above, we can launch the executor container with --pid=host if
>>>>> `--docker_mesos_image` is specified.
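
The suggested fix could look roughly like the sketch below. This is an illustration in Python, not the actual Mesos implementation: the function `launch_command`, the executor command, and the image name in the example are hypothetical. The only Docker option relied on is `--pid=host`, which makes the container share the host's pid namespace.

```python
def launch_command(executor_command, docker_mesos_image=None):
    """Return the argv used to start the executor (hypothetical helper).

    Without a Mesos image the executor command is returned unchanged.
    With one, it is wrapped in `docker run --pid=host` so that pids reported
    by the Docker daemon are visible under /proc inside the executor container.
    """
    if docker_mesos_image is None:
        return list(executor_command)
    # `docker run` expects options before the image name and the command after it.
    return ["docker", "run", "--pid=host", docker_mesos_image] + list(executor_command)


# Example with made-up names for the image and the executor binary.
print(launch_command(["my-executor"], docker_mesos_image="example/mesos-agent"))
```

Sharing the host pid namespace weakens the isolation of the executor container, so applying it only when `--docker_mesos_image` is set keeps the change as narrow as possible.
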
>>>>>
>>>>> - Jie
>>>>>
>>>>> On Wed, Mar 29, 2017 at 3:56 AM, tommy xiao <xia...@gmail.com> wrote:
>>>>>
>>>>>> It was resolved by adding --pid=host. Thanks to the community for the
>>>>>> support. Thanks a lot.
>>>>>>
>>>>>> 2017-03-29 9:52 GMT+08:00 tommy xiao <xia...@gmail.com>:
>>>>>>
>>>>>>> My environment is as follows:
>>>>>>>
>>>>>>> Mesos 1.2, running containerized in Docker.
>>>>>>>
>>>>>>> I launched a sample nginx Docker container with a Mesos native health check.
>>>>>>>
>>>>>>> Then I got a core dump in the sandbox.
>>>>>>>
>>>>>>> I dug up some more information for your reference:
>>>>>>>
>>>>>>> In the Mesos slave container, I can only see the task container pid, but I
>>>>>>> can't find the nginx process pid.
>>>>>>>
>>>>>>> On the host console, however, I can find the nginx pid. So how can I get the
>>>>>>> pid in the container?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2017-03-28 13:49 GMT+08:00 tommy xiao <xia...@gmail.com>:
>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/MESOS-6184
>>>>>>>>
>>>>>>>> Can anyone give me a hint?
>>>>>>>>
>>>>>>>> ```
>>>>>>>>
>>>>>>>> I0328 11:48:12.922181 48 exec.cpp:162] Version: 1.2.0
>>>>>>>> I0328 11:48:12.929252 54 exec.cpp:237] Executor registered on agent a29dc3a5-3e3f-4058-8ab4-dd7de2ae58d1-S4
>>>>>>>> I0328 11:48:12.931640 54 docker.cpp:850] Running docker -H unix:///var/run/docker.sock run --cpu-shares 10 --memory 33554432 --env-file /tmp/gvqGyb -v /data/mesos/slaves/a29dc3a5-3e3f-4058-8ab4-dd7de2ae58d1-S4/frameworks/d7ef5d2b-f924-42d9-a274-c020afba6bce-0000/executors/0-hc-xychu-datamanmesos-2f3b47f9ffc048539c7b22baa6c32d8f/runs/458189b8-2ff4-4337-ad3a-67321e96f5cb:/mnt/mesos/sandbox --net bridge --label=USER_NAME=xychu --label=GROUP_NAME=groupautotest --label=APP_ID=hc --label=VCLUSTER=clusterautotest --label=USER=xychu --label=CLUSTER=datamanmesos --label=SLOT=0 --label=APP=hc -p 31000:80/tcp --name mesos-a29dc3a5-3e3f-4058-8ab4-dd7de2ae58d1-S4.458189b8-2ff4-4337-ad3a-67321e96f5cb nginx
>>>>>>>> I0328 11:48:16.145714 53 health_checker.cpp:196] Ignoring failure as health check still in grace period
>>>>>>>> W0328 11:48:26.289958 49 health_checker.cpp:202] Health check failed 1 times consecutively: HTTP health check failed: curl returned terminated with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190): Failed to execute Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid 18596 does not exist
>>>>>>>> *** Aborted at 1490672906 (unix time) try "date -d @1490672906" if you are using GNU date ***
>>>>>>>> PC: @ 0x7f26bfb485f7 __GI_raise
>>>>>>>> *** SIGABRT (@0x4a) received by PID 74 (TID 0x7f26ba152700) from PID 74; stack trace: ***
>>>>>>>>     @ 0x7f26c0703100 (unknown)
>>>>>>>>     @ 0x7f26bfb485f7 __GI_raise
>>>>>>>>     @ 0x7f26bfb49ce8 __GI_abort
>>>>>>>>     @ 0x7f26c315778e _Abort()
>>>>>>>>     @ 0x7f26c31577cc _Abort()
>>>>>>>>     @ 0x7f26c237a4b6 process::internal::childMain()
>>>>>>>>     @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>>>>>>     @ 0x7f26c2379e53 process::internal::defaultClone()
>>>>>>>>     @ 0x7f26c237b951 process::internal::cloneChild()
>>>>>>>>     @ 0x7f26c237954f process::subprocess()
>>>>>>>>     @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>>>>>>     @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>>>>>>     @ 0x7f26c2331389 process::ProcessManager::resume()
>>>>>>>>     @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>>>>>>>>     @ 0x7f26c04a1220 (unknown)
>>>>>>>>     @ 0x7f26c06fbdc5 start_thread
>>>>>>>>     @ 0x7f26bfc0928d __clone
>>>>>>>> W0328 11:48:36.340055 55 health_checker.cpp:202] Health check failed 2 times consecutively: HTTP health check failed: curl returned terminated with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190): Failed to execute Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid 18596 does not exist
>>>>>>>> *** Aborted at 1490672916 (unix time) try "date -d @1490672916" if you are using GNU date ***
>>>>>>>> PC: @ 0x7f26bfb485f7 __GI_raise
>>>>>>>> *** SIGABRT (@0x4b) received by PID 75 (TID 0x7f26b9951700) from PID 75; stack trace: ***
>>>>>>>>     @ 0x7f26c0703100 (unknown)
>>>>>>>>     @ 0x7f26bfb485f7 __GI_raise
>>>>>>>>     @ 0x7f26bfb49ce8 __GI_abort
>>>>>>>>     @ 0x7f26c315778e _Abort()
>>>>>>>>     @ 0x7f26c31577cc _Abort()
>>>>>>>>     @ 0x7f26c237a4b6 process::internal::childMain()
>>>>>>>>     @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>>>>>>     @ 0x7f26c2379e53 process::internal::defaultClone()
>>>>>>>>     @ 0x7f26c237b951 process::internal::cloneChild()
>>>>>>>>     @ 0x7f26c237954f process::subprocess()
>>>>>>>>     @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>>>>>>     @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>>>>>>     @ 0x7f26c2331389 process::ProcessManager::resume()
>>>>>>>>     @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>>>>>>>>     @ 0x7f26c04a1220 (unknown)
>>>>>>>>     @ 0x7f26c06fbdc5 start_thread
>>>>>>>>     @ 0x7f26bfc0928d __clone
>>>>>>>> W0328 11:48:46.386533 49 health_checker.cpp:202] Health check failed 3 times consecutively: HTTP health check failed: curl returned terminated with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190): Failed to execute Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid 18596 does not exist
>>>>>>>> *** Aborted at 1490672926 (unix time) try "date -d @1490672926" if you are using GNU date ***
>>>>>>>> PC: @ 0x7f26bfb485f7 __GI_raise
>>>>>>>> *** SIGABRT (@0x4c) received by PID 76 (TID 0x7f26ba152700) from PID 76; stack trace: ***
>>>>>>>>     @ 0x7f26c0703100 (unknown)
>>>>>>>>     @ 0x7f26bfb485f7 __GI_raise
>>>>>>>>     @ 0x7f26bfb49ce8 __GI_abort
>>>>>>>>     @ 0x7f26c315778e _Abort()
>>>>>>>>     @ 0x7f26c31577cc _Abort()
>>>>>>>>     @ 0x7f26c237a4b6 process::internal::childMain()
>>>>>>>>     @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>>>>>>     @ 0x7f26c2379e53 process::internal::defaultClone()
>>>>>>>>     @ 0x7f26c237b951 process::internal::cloneChild()
>>>>>>>>     @ 0x7f26c237954f process::subprocess()
>>>>>>>>     @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>>>>>>     @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>>>>>>     @ 0x7f26c2331389 process::ProcessManager::resume()
>>>>>>>>     @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>>>>>>>>     @ 0x7f26c04a1220 (unknown)
>>>>>>>>     @ 0x7f26c06fbdc5 start_thread
>>>>>>>>     @ 0x7f26bfc0928d __clone
>>>>>>>> W0328 11:48:56.531623 53 health_checker.cpp:202] Health check failed 4 times consecutively: HTTP health check failed: curl returned terminated with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190): Failed to execute Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid 18596 does not exist
>>>>>>>> *** Aborted at 1490672936 (unix time) try "date -d @1490672936" if you are using GNU date ***
>>>>>>>> PC: @ 0x7f26bfb485f7 __GI_raise
>>>>>>>> *** SIGABRT (@0x4d) received by PID 77 (TID 0x7f26b814e700) from PID 77; stack trace: ***
>>>>>>>>     @ 0x7f26c0703100 (unknown)
>>>>>>>>     @ 0x7f26bfb485f7 __GI_raise
>>>>>>>>     @ 0x7f26bfb49ce8 __GI_abort
>>>>>>>>     @ 0x7f26c315778e _Abort()
>>>>>>>>     @ 0x7f26c31577cc _Abort()
>>>>>>>>     @ 0x7f26c237a4b6 process::internal::childMain()
>>>>>>>>     @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>>>>>>     @ 0x7f26c2379e53 process::internal::defaultClone()
>>>>>>>>     @ 0x7f26c237b951 process::internal::cloneChild()
>>>>>>>>     @ 0x7f26c237954f process::subprocess()
>>>>>>>>     @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>>>>>>     @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>>>>>>     @ 0x7f26c2331389 process::ProcessManager::resume()
>>>>>>>>     @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>>>>>>>>     @ 0x7f26c04a1220 (unknown)
>>>>>>>>     @ 0x7f26c06fbdc5 start_thread
>>>>>>>>     @ 0x7f26bfc0928d __clone
>>>>>>>> W0328 11:49:06.678515 50 health_checker.cpp:202] Health check failed 5 times consecutively: HTTP health check failed: curl returned terminated with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190): Failed to execute Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid 18596 does not exist
>>>>>>>> *** Aborted at 1490672946 (unix time) try "date -d @1490672946" if you are using GNU date ***
>>>>>>>> PC: @ 0x7f26bfb485f7 __GI_raise
>>>>>>>> *** SIGABRT (@0x4e) received by PID 78 (TID 0x7f26b9951700) from PID 78; stack trace: ***
>>>>>>>>     @ 0x7f26c0703100 (unknown)
>>>>>>>>     @ 0x7f26bfb485f7 __GI_raise
>>>>>>>>     @ 0x7f26bfb49ce8 __GI_abort
>>>>>>>>     @ 0x7f26c315778e _Abort()
>>>>>>>>     @ 0x7f26c31577cc _Abort()
>>>>>>>>     @ 0x7f26c237a4b6 process::internal::childMain()
>>>>>>>>     @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>>>>>>>>     @ 0x7f26c2379e53 process::internal::defaultClone()
>>>>>>>>     @ 0x7f26c237b951 process::internal::cloneChild()
>>>>>>>>     @ 0x7f26c237954f process::subprocess()
>>>>>>>>     @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::httpHealthCheck()
>>>>>>>>     @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::performSingleCheck()
>>>>>>>>     @ 0x7f26c2331389 process::ProcessManager::resume()
>>>>>>>>     @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>>>>>>>>     @ 0x7f26c04a1220 (unknown)
>>>>>>>>     @ 0x7f26c06fbdc5 start_thread
>>>>>>>>     @ 0x7f26bfc0928d __clone
>>>>>>>> I0328 11:49:06.678840 50 health_checker.cpp:130] Health checking stopped
>>>>>>>> I0328 11:49:06.880620 49 health_checker.cpp:130] Health checking stopped
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ```
>>>>>>>>
>>>>>>>> --
>>>>>>>> Deshi Xiao
>>>>>>>> Twitter: xds2000
>>>>>>>> E-mail: xiaods(AT)gmail.com
>>>>>>>>


-- 
Best Regards,
Haosdent Huang
