Re: mesos container cluster came across health check coredump log

2017-03-28 Thread tommy xiao
My environment is as follows:

Mesos 1.2, running containerized in Docker.

I launch a sample nginx Docker container with a Mesos native health check.

The health check then leaves a core dump in the sandbox.

I have dug up some more information for your reference:

Inside the Mesos slave container, I can only see the PID of the task
container; I cannot find the PID of the nginx process.

On the host console, however, I can find the nginx PID. So how can I get
that PID inside the container?
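
One way to look at the difference between the two views: a Docker container normally gets its own PID namespace, so a PID that exists on the host has no entry under /proc inside another container. On kernels that expose it (Linux 4.1 and later, as far as I know), the NSpid line of /proc/<pid>/status lists a process's PID in each nested namespace. The sketch below only parses that line and checks for a /proc entry; the helper names and the sample status text are invented for illustration and were not taken from this cluster.

```python
import os

def parse_nspid(status_text):
    """Return the PIDs on the NSpid line (outermost namespace first), or [] if absent."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []

def pid_visible(pid, proc_root="/proc"):
    """True if the PID namespace seen through proc_root has an entry for pid."""
    return os.path.isdir(os.path.join(proc_root, str(pid)))

# Invented sample: a process that is PID 18596 on the host and PID 1 in its container.
sample = "Name:\tnginx\nPid:\t18596\nNSpid:\t18596\t1\n"
print(parse_nspid(sample))
```

If the agent container has to resolve host PIDs, sharing the host PID namespace (Docker's `--pid=host` option) is one setting worth checking; whether it resolves this particular crash is not confirmed in this thread.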




2017-03-28 13:49 GMT+08:00 tommy xiao :

> https://issues.apache.org/jira/browse/MESOS-6184
>
> Can anyone give me a hint?
>
> ```
>
> I0328 11:48:12.922181 48 exec.cpp:162] Version: 1.2.0
> I0328 11:48:12.929252 54 exec.cpp:237] Executor registered on agent
> a29dc3a5-3e3f-4058-8ab4-dd7de2ae58d1-S4
> I0328 11:48:12.931640 54 docker.cpp:850] Running docker -H
> unix:///var/run/docker.sock run --cpu-shares 10 --memory 33554432
> --env-file /tmp/gvqGyb -v /data/mesos/slaves/a29dc3a5-
> 3e3f-4058-8ab4-dd7de2ae58d1-S4/frameworks/d7ef5d2b-f924-
> 42d9-a274-c020afba6bce-/executors/0-hc-xychu-datamanmesos-
> 2f3b47f9ffc048539c7b22baa6c32d8f/runs/458189b8-2ff4-4337-
> ad3a-67321e96f5cb:/mnt/mesos/sandbox --net bridge --label=USER_NAME=xychu
> --label=GROUP_NAME=groupautotest --label=APP_ID=hc 
> --label=VCLUSTER=clusterautotest
> --label=USER=xychu --label=CLUSTER=datamanmesos --label=SLOT=0
> --label=APP=hc -p 31000:80/tcp --name mesos-a29dc3a5-3e3f-4058-8ab4-
> dd7de2ae58d1-S4.458189b8-2ff4-4337-ad3a-67321e96f5cb nginx
> I0328 11:48:16.145714 53 health_checker.cpp:196] Ignoring failure as
> health check still in grace period
> W0328 11:48:26.289958 49 health_checker.cpp:202] Health check failed 1
> times consecutively: HTTP health check failed: curl returned terminated
> with signal Aborted (core dumped): ABORT: (../../../3rdparty/libprocess/
> include/process/posix/subprocess.hpp:190): Failed to execute
> Subprocess::ChildHook: Failed to enter the net namespace of pid 18596: Pid
> 18596 does not exist
>
> *** Aborted at 1490672906 (unix time) try "date -d @1490672906" if
>  you are using GNU date ***
>  PC: @ 0x7f26bfb485f7 __GI_raise
> *** SIGABRT (@0x4a) received by PID 74 (TID 0x7f26ba152700) from
>  PID 74; stack trace: ***
>  @ 0x7f26c0703100 (unknown)
>  @ 0x7f26bfb485f7 __GI_raise
>  @ 0x7f26bfb49ce8 __GI_abort
>  @ 0x7f26c315778e _Abort()
>  @ 0x7f26c31577cc _Abort()
>  @ 0x7f26c237a4b6 process::internal::childMain()
>  @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>  @ 0x7f26c2379e53 process::internal::defaultClone()
>  @ 0x7f26c237b951 process::internal::cloneChild()
>  @ 0x7f26c237954f process::subprocess()
>  @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::
>  httpHealthCheck()
>  @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::
>  performSingleCheck()
>  @ 0x7f26c2331389 process::ProcessManager::resume()
>  @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_
>  simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_
>  M_runEv
>  @ 0x7f26c04a1220 (unknown)
>  @ 0x7f26c06fbdc5 start_thread
>  @ 0x7f26bfc0928d __clone
>  W0328 11:48:36.340055 55 health_checker.cpp:202] Health check
>  failed 2 times consecutively: HTTP health check failed: curl returned
>  terminated with signal Aborted (core dumped): ABORT:
>  (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:190):
>  Failed to execute Subprocess::ChildHook: Failed to enter the net namespace
>  of pid 18596: Pid 18596 does not exist
> *** Aborted at 1490672916 (unix time) try "date -d @1490672916" if
>  you are using GNU date ***
>  PC: @ 0x7f26bfb485f7 __GI_raise
> *** SIGABRT (@0x4b) received by PID 75 (TID 0x7f26b9951700) from
>  PID 75; stack trace: ***
>  @ 0x7f26c0703100 (unknown)
>  @ 0x7f26bfb485f7 __GI_raise
>  @ 0x7f26bfb49ce8 __GI_abort
>  @ 0x7f26c315778e _Abort()
>  @ 0x7f26c31577cc _Abort()
>  @ 0x7f26c237a4b6 process::internal::childMain()
>  @ 0x7f26c2379e9c std::_Function_handler<>::_M_invoke()
>  @ 0x7f26c2379e53 process::internal::defaultClone()
>  @ 0x7f26c237b951 process::internal::cloneChild()
>  @ 0x7f26c237954f process::subprocess()
>  @ 0x7f26c15a9fb1 mesos::internal::checks::HealthCheckerProcess::
>  httpHealthCheck()
>  @ 0x7f26c15ababd mesos::internal::checks::HealthCheckerProcess::
>  performSingleCheck()
>  @ 0x7f26c2331389 process::ProcessManager::resume()
>  @ 0x7f26c233a3f7 _ZNSt6thread5_ImplISt12_Bind_
>  simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_
>  M_runEv
>  @ 0x7f26c04a1220 (unknown)
>  @ 0x7f26c06fbdc5 start_thread
>  @ 0x7f26bfc0928d __clone
>  W0328 1

Re: Mesos (and Marathon) port mapping

2017-03-28 Thread Dick Davies
Try setting your hostPort to 0, to tell Mesos to select one
(which it will allocate out of the pool the mesos slave is set to use).

This works for me for redis:


{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "redis",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 6379, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  },
  "ports": [0],
  "instances": 1,
  "cpus": 0.1,
  "mem": 128,
  "uris": []
}

(caveat: haven't run marathon or mesos for a little while)
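
If you generate app definitions from a script, the same idea can be sketched in Python. This is only an illustration that rebuilds the redis definition with hostPort 0; the function name `bridge_app` is hypothetical, and it uses no fields beyond the ones already shown.

```python
import json

def bridge_app(image, container_port, cpus=0.1, mem=128):
    """Build a Marathon app dict that leaves the host port choice to Mesos (hostPort 0)."""
    return {
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                "portMappings": [
                    {"containerPort": container_port, "hostPort": 0, "protocol": "tcp"}
                ],
            },
        },
        "ports": [0],
        "instances": 1,
        "cpus": cpus,
        "mem": mem,
        "uris": [],
    }

# Print the definition as JSON, ready to be posted to Marathon.
print(json.dumps(bridge_app("redis", 6379), indent=2))
```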

On 28 March 2017 at 17:53, Tomek Janiszewski  wrote:
> 1. Mentioned port range is the Mesos Agent resource setting, so if you don't
> explicitly define port range it would be used.
> https://github.com/apache/mesos/blob/1.2.0/src/slave/constants.hpp#L86
>
> 2. With ports mapping two or more applications could attach to same
> container port but will be exposed under different host port.
>
> 3. I'm not sure if ports mappings works in Host mode. Try with require ports
> option enabled.
> https://github.com/mesosphere/marathon/blob/v1.3.9/docs/docs/ports.md
>
> 4. Yes, service ports are only for Marathon and don't propagate to Mesos.
> http://stackoverflow.com/a/39468348/1387612
>
>
> On Tue, 28.03.2017 at 18:16, Thomas HUMMEL wrote:
>>
>> Hello,
>>
>> [Sorry if this post may seem more Marathon-oriented. It still contains
>> Mesos specific questions.]
>>
>> I'm in the process of discovering/testing/trying to understand Mesos and
>> Marathon.
>>
>> After having read some books and docs, I set up a small environment (9
>> linux
>> CentOS 7.3 VMs) consisting of :
>>
>>. 3 Mesos master - quorum = 2
>>. 3 Zookeepers servers running on the same host as the mesos servers
>>. 2 Mesos slaves
>>. 3 Marathon servers
>>. 1 HAproxy facing the Mesos servers
>>
>> Mesos has been installed from sources (1.2.0 version) and Marathon is
>> the 1.3.9
>> tarball comming from mesosphere
>>
>> I've deployed :
>>
>>. mesos-dns as a Marathon (not dockerized) application on one of the
>>  slaves (with a constraint) configured with my site DNS as resolvers
>> and only
>>  "host" as IPSources
>>
>>. marathon-lb as a Marathon dockerized app ("network": "HOST") with the
>>  simple (containerPort: 9090, hostPort: 9090, servicePort: 1)
>> portMapping,
>>  on the same slave using a constraint
>>
>> Everything works fine so far.
>> I've read :
>>
>>https://mesosphere.github.io/marathon/docs/ports.html
>>
>> and
>>
>>http://mesos.apache.org/documentation/latest/port-mapping-isolator/
>>
>> but I'm still quite confused by the following port-related questions :
>>
>> [Note : I'm not using "network/port_mapping" isolation for now. I sticked
>> to
>>
>>export MESOS_containerizers=docker,mesos]
>>
>> 1. for such a simple dockerized app :
>>
>> {
>>"id": "http-server",
>>"cmd": "python3 -m http.server 8080",
>>"cpus": 0.5,
>>"mem": 32.0,
>>"container": {
>>  "type": "DOCKER",
>>  "docker": {
>>"image": "python:3",
>>"network": "BRIDGE",
>>"portMappings": [
>>  { "containerPort": 8080, "hostPort": 31000, "servicePort": 5000 }
>>]
>>  }
>>},
>>"labels":{
>>  "HAPROXY_GROUP":"external"
>>}
>> }
>>
>> a) in HOST mode ("network": "HOST"), any hostPort seems to work (or at
>> least, let say 9090)
>>
>> b) in BRIDGE mode ("network": "BRIDGE"), the valid hostPort range seems
>> to be
>> [31000 - 32000], which seems to match the Mesos non-ephemeral port range
>> given
>> as en example in
>>
>>http://mesos.apache.org/documentation/latest/port-mapping-isolator/
>>
>> But I don't quite understand why since
>>
>>- I'm not using network/port_mapping isolation
>>- I didn't configured any port range anywhere in Mesos
>>
>> 2. Obviously in my setup, 2 apps on the same slave cannot have the same
>> hostPort. Would it be the same with network/port_mapping activated
>> since the
>> doc says : "he agent assigns each container a non-overlapping range
>> of the
>> ports"
>>
>> Am I correct assuming that a Marathon hostPort is to be understood
>> as taken among the non-ephemeral Mesos ports ?
>>
>> With network/port_mapping isolation, could 2 apps have the same
>> non-ephemeal port ? same question with ephemeral-port ? I doubt it but...
>> Is what is described in this doc valid for a dockerized container also
>> ?
>>
>> 3. the portMapping I configured for the dockerized ("network": "HOST")
>> marathon-lb app is
>>
>> "portMappings": [
>>{
>>  "containerPort": 9090,
>>  "hostPort": 9090,
>>  "servicePort": 1,
>>  "protocol": "tcp"
>>
>> on the slave I can verify :
>>
>># lsof -i :9090
>>COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>>haproxy 29610 root6u  IPv4 461745  0t0  TCP *:websm (LISTEN)
>> But Marathon tells that my app is running on :
>>
>>mesos-slave1.it.pasteur.fr:31830
>>
>> I don't u

Re: Mesos (and Marathon) port mapping

2017-03-28 Thread Tomek Janiszewski
1. The port range you mention is the Mesos agent's default resource setting,
so it is used if you don't explicitly define a port range.
https://github.com/apache/mesos/blob/1.2.0/src/slave/constants.hpp#L86

2. With port mappings, two or more applications can bind to the same
container port but will be exposed on different host ports.

3. I'm not sure whether port mappings work in host mode. Try with the
requirePorts option enabled.
https://github.com/mesosphere/marathon/blob/v1.3.9/docs/docs/ports.md

4. Yes, service ports are only for Marathon and don't propagate to Mesos.
http://stackoverflow.com/a/39468348/1387612
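
To make point 1 concrete, here is a small Python sketch that checks whether a requested hostPort falls inside a port range such as the [31000-32000] discussed in this thread. The bracketed string format and both function names are assumptions made for illustration; this is not how Mesos or Marathon implement the check.

```python
import re

def parse_ranges(text):
    """Parse a string like '[31000-32000, 8000-8100]' into (low, high) tuples."""
    return [(int(lo), int(hi)) for lo, hi in re.findall(r"(\d+)\s*-\s*(\d+)", text)]

def port_offered(port, ranges):
    """True if port falls inside any inclusive (low, high) range."""
    return any(lo <= port <= hi for lo, hi in ranges)

ranges = parse_ranges("[31000-32000]")
print(port_offered(31000, ranges), port_offered(9090, ranges))
```

With only that range on offer, a fixed hostPort of 9090 would not be satisfiable, which matches what was observed in BRIDGE mode.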

On Tue, 28.03.2017 at 18:16, Thomas HUMMEL wrote:

Hello,

[Sorry if this post may seem more Marathon-oriented. It still contains
Mesos specific questions.]

I'm in the process of discovering/testing/trying to understand Mesos and
Marathon.

After having read some books and docs, I set up a small environment (9 linux
CentOS 7.3 VMs) consisting of :

   . 3 Mesos master - quorum = 2
   . 3 Zookeepers servers running on the same host as the mesos servers
   . 2 Mesos slaves
   . 3 Marathon servers
   . 1 HAproxy facing the Mesos servers

Mesos has been installed from sources (1.2.0 version) and Marathon is
the 1.3.9
tarball comming from mesosphere

I've deployed :

   . mesos-dns as a Marathon (not dockerized) application on one of the
 slaves (with a constraint) configured with my site DNS as resolvers
and only
 "host" as IPSources

   . marathon-lb as a Marathon dockerized app ("network": "HOST") with the
 simple (containerPort: 9090, hostPort: 9090, servicePort: 1)
portMapping,
 on the same slave using a constraint

Everything works fine so far.
I've read :

   https://mesosphere.github.io/marathon/docs/ports.html

and

   http://mesos.apache.org/documentation/latest/port-mapping-isolator/

but I'm still quite confused by the following port-related questions :

[Note : I'm not using "network/port_mapping" isolation for now. I sticked to

   export MESOS_containerizers=docker,mesos]

1. for such a simple dockerized app :

{
   "id": "http-server",
   "cmd": "python3 -m http.server 8080",
   "cpus": 0.5,
   "mem": 32.0,
   "container": {
 "type": "DOCKER",
 "docker": {
   "image": "python:3",
   "network": "BRIDGE",
   "portMappings": [
 { "containerPort": 8080, "hostPort": 31000, "servicePort": 5000 }
   ]
 }
   },
   "labels":{
 "HAPROXY_GROUP":"external"
   }
}

a) in HOST mode ("network": "HOST"), any hostPort seems to work (or at
least, let say 9090)

b) in BRIDGE mode ("network": "BRIDGE"), the valid hostPort range seems
to be
[31000 - 32000], which seems to match the Mesos non-ephemeral port range
given
as en example in

   http://mesos.apache.org/documentation/latest/port-mapping-isolator/

But I don't quite understand why since

   - I'm not using network/port_mapping isolation
   - I didn't configured any port range anywhere in Mesos

2. Obviously in my setup, 2 apps on the same slave cannot have the same
hostPort. Would it be the same with network/port_mapping activated
since the
doc says : "he agent assigns each container a non-overlapping range
of the
ports"

Am I correct assuming that a Marathon hostPort is to be understood
as taken among the non-ephemeral Mesos ports ?

With network/port_mapping isolation, could 2 apps have the same
non-ephemeal port ? same question with ephemeral-port ? I doubt it but...
Is what is described in this doc valid for a dockerized container also ?

3. the portMapping I configured for the dockerized ("network": "HOST")
marathon-lb app is

"portMappings": [
   {
 "containerPort": 9090,
 "hostPort": 9090,
 "servicePort": 1,
 "protocol": "tcp"

on the slave I can verify :

   # lsof -i :9090
   COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
   haproxy 29610 root6u  IPv4 461745  0t0  TCP *:websm (LISTEN)
But Marathon tells that my app is running on :

   mesos-slave1.it.pasteur.fr:31830

I don't understand where this port comes from, especially when I see
nobody's listening on it :

   lsof -i :31830

like if Marathon gave me a fake hostPort ?

4. My understanding is that Marathon service port are bound to only by apps
like marathon-lb. As a matter of fact, it doesn't seem to bother Mesos that
Marathon deploys 2 apps on the same slave with the same servicePort. Am
I correct ?

Thanks for your help

--
Thomas HUMMEL


Mesos (and Marathon) port mapping

2017-03-28 Thread Thomas HUMMEL

Hello,

[Sorry if this post may seem more Marathon-oriented. It still contains 
Mesos specific questions.]


I'm in the process of discovering/testing/trying to understand Mesos and 
Marathon.


After having read some books and docs, I set up a small environment (9 Linux
CentOS 7.3 VMs) consisting of:

  . 3 Mesos masters - quorum = 2
  . 3 ZooKeeper servers running on the same hosts as the Mesos servers
  . 2 Mesos slaves
  . 3 Marathon servers
  . 1 HAProxy facing the Mesos servers

Mesos has been installed from source (version 1.2.0) and Marathon is
the 1.3.9 tarball from Mesosphere.

I've deployed:

  . mesos-dns as a Marathon (not dockerized) application on one of the
    slaves (with a constraint), configured with my site DNS as resolvers
    and only "host" as IPSources

  . marathon-lb as a Marathon dockerized app ("network": "HOST") with the
    simple (containerPort: 9090, hostPort: 9090, servicePort: 1)
    portMapping, on the same slave using a constraint

Everything works fine so far.
I've read :

  https://mesosphere.github.io/marathon/docs/ports.html

and

  http://mesos.apache.org/documentation/latest/port-mapping-isolator/

but I'm still quite confused by the following port-related questions :

[Note: I'm not using "network/port_mapping" isolation for now. I stuck to

  export MESOS_containerizers=docker,mesos]

1. for such a simple dockerized app :

{
  "id": "http-server",
  "cmd": "python3 -m http.server 8080",
  "cpus": 0.5,
  "mem": 32.0,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "python:3",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 31000, "servicePort": 5000 }
      ]
    }
  },
  "labels": {
    "HAPROXY_GROUP": "external"
  }
}

a) in HOST mode ("network": "HOST"), any hostPort seems to work (or at
least, let's say, 9090)

b) in BRIDGE mode ("network": "BRIDGE"), the valid hostPort range seems
to be [31000 - 32000], which seems to match the Mesos non-ephemeral port
range given as an example in

  http://mesos.apache.org/documentation/latest/port-mapping-isolator/

But I don't quite understand why, since

  - I'm not using network/port_mapping isolation
  - I didn't configure any port range anywhere in Mesos

2. Obviously in my setup, 2 apps on the same slave cannot have the same
   hostPort. Would it be the same with network/port_mapping activated,
   since the doc says: "The agent assigns each container a non-overlapping
   range of the ports"?

   Am I correct in assuming that a Marathon hostPort is to be understood
   as taken from the non-ephemeral Mesos ports?

   With network/port_mapping isolation, could 2 apps have the same
   non-ephemeral port? Same question for ephemeral ports. I doubt it but...

   Is what is described in this doc also valid for a dockerized container?

3. The portMapping I configured for the dockerized ("network": "HOST")
marathon-lb app is

"portMappings": [
  {
    "containerPort": 9090,
    "hostPort": 9090,
    "servicePort": 1,
    "protocol": "tcp"

On the slave I can verify:

  # lsof -i :9090
  COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
  haproxy 29610 root    6u  IPv4 461745      0t0  TCP *:websm (LISTEN)

But Marathon tells me that my app is running on:

  mesos-slave1.it.pasteur.fr:31830

I don't understand where this port comes from, especially since I see that
nobody is listening on it:

  lsof -i :31830

as if Marathon gave me a fake hostPort?

4. My understanding is that Marathon service ports are only bound by apps
like marathon-lb. As a matter of fact, it doesn't seem to bother Mesos that
Marathon deploys 2 apps on the same slave with the same servicePort. Am
I correct?


Thanks for your help

--
Thomas HUMMEL




Re: SSL certificate error (1.1.0 OK, 1.1.1 NOK: Error downloading resource: Problem with the SSL CA cert (path? access rights?))

2017-03-28 Thread Adam Cecile

Hello,

Actually it's way more complicated:

It's related to the use of libcurl-nss: libnss does not support PEM
certificates, so it cannot use any of the ones available on a regular
Debian system.
Moreover, there's no good way to add this support, so I applied a dirty
workaround to fix this, but I think you should really reconsider the use
of the NSS variant of libcurl...


More information here: 
https://github.com/mesosphere/mesos-deb-packaging/pull/104
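
For anyone wanting to check which TLS library a curl build links against, the first line of `curl --version` lists the libraries it was built with. The Python sketch below just picks the TLS token out of such a line. The function name is hypothetical, and the sample line is invented rather than captured from an affected machine. Note also that this inspects the curl command-line tool, which may link a different libcurl variant than the Mesos package does.

```python
def tls_backend(version_line):
    """Return the TLS library token (e.g. 'NSS/3.26') from a `curl --version` first line, or None."""
    known = ("OpenSSL", "NSS", "GnuTLS", "LibreSSL", "BoringSSL", "mbedTLS", "wolfSSL")
    for token in version_line.split():
        name = token.split("/", 1)[0]
        if name in known:
            return token
    return None

# Invented sample line for illustration only.
sample = "curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0 NSS/3.26 zlib/1.2.8"
print(tls_backend(sample))
```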


Adam.

On 03/28/2017 10:09 AM, Adam Bordelon wrote:
I found https://issues.apache.org/jira/browse/MESOS-7133 but that was 
supposedly fixed in 1.1.1 (broken in 1.1.0)


On Tue, Mar 28, 2017 at 12:23 AM, Adam Cecile wrote:


Hello,


Here is a simple test showing why I cannot upgrade Mesos anymore
(Debian Jessie, using mesosphere packages):


apt-get install mesos=1.1.0-2.0.107.debian81

MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [
{ "uri": { "value": "https://path/to/file" }, "action":
"BYPASS_CACHE" } ] }' /usr/libexec/mesos/mesos-fetcher

[...] 0328 09:20:39.361690 129351 fetcher.cpp:547] Fetched
'https://path/to/file' to '/tmp/abc/file'


apt-get install mesos=1.1.1-2.0.1

MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [
{ "uri": { "value": "https://path/to/file" }, "action":
"BYPASS_CACHE" } ] }' /usr/libexec/mesos/mesos-fetcher

[...] Failed to fetch 'https://path/to/file': Error downloading
resource: Problem with the SSL CA cert (path? access rights?)



Obviously the URL is working just fine when being retreive with
CURL, WGET or a JAVA app from the server itself.

Any hint would be appreciated !


Best regards, Adam.






Re: SSL certificate error (1.1.0 OK, 1.1.1 NOK: Error downloading resource: Problem with the SSL CA cert (path? access rights?))

2017-03-28 Thread Adam Bordelon
I found https://issues.apache.org/jira/browse/MESOS-7133 but that was
supposedly fixed in 1.1.1 (broken in 1.1.0)

On Tue, Mar 28, 2017 at 12:23 AM, Adam Cecile  wrote:

> Hello,
>
>
> Here is a simple test showing why I cannot upgrade Mesos anymore (Debian
> Jessie, using mesosphere packages):
>
>
> apt-get install mesos=1.1.0-2.0.107.debian81
>
> MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [ { "uri":
> { "value": "https://path/to/file" }, "action": "BYPASS_CACHE" } ] }'
> /usr/libexec/mesos/mesos-fetcher
>
> [...] 0328 09:20:39.361690 129351 fetcher.cpp:547] Fetched '
> https://path/to/file' to '/tmp/abc/file'
>
>
> apt-get install mesos=1.1.1-2.0.1
>
> MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [ { "uri":
> { "value": "https://path/to/file" }, "action": "BYPASS_CACHE" } ] }'
> /usr/libexec/mesos/mesos-fetcher
>
> [...] Failed to fetch 'https://path/to/file': Error downloading resource:
> Problem with the SSL CA cert (path? access rights?)
>
>
>
> Obviously the URL is working just fine when being retreive with CURL, WGET
> or a JAVA app from the server itself.
>
> Any hint would be appreciated !
>
>
> Best regards, Adam.
>


SSL certificate error (1.1.0 OK, 1.1.1 NOK: Error downloading resource: Problem with the SSL CA cert (path? access rights?))

2017-03-28 Thread Adam Cecile

Hello,


Here is a simple test showing why I cannot upgrade Mesos anymore (Debian 
Jessie, using mesosphere packages):



apt-get install mesos=1.1.0-2.0.107.debian81

MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [ {
"uri": { "value": "https://path/to/file" }, "action": "BYPASS_CACHE" } ]
}' /usr/libexec/mesos/mesos-fetcher


[...] 0328 09:20:39.361690 129351 fetcher.cpp:547] Fetched 
'https://path/to/file' to '/tmp/abc/file'



apt-get install mesos=1.1.1-2.0.1

MESOS_FETCHER_INFO='{ "sandbox_directory": "/tmp/abc", "items": [ {
"uri": { "value": "https://path/to/file" }, "action": "BYPASS_CACHE" } ]
}' /usr/libexec/mesos/mesos-fetcher


[...] Failed to fetch 'https://path/to/file': Error downloading 
resource: Problem with the SSL CA cert (path? access rights?)




Obviously the URL works just fine when retrieved with curl, wget or a
Java app from the server itself.


Any hint would be appreciated !


Best regards, Adam.


Re: Debugging mesos-fetcher binary

2017-03-28 Thread Adam Cecile

Sorry, I just figured out:

{ "sandbox_directory": "/tmp/abc",
  "items": [
    { "uri": { "value": "https://path/to/file" }, "action": "BYPASS_CACHE" }
  ]
}
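
For repeated debugging runs, the working item format above can also be generated from a script so that quoting mistakes cannot creep in. This is a minimal Python sketch: the function names are hypothetical, the fetcher path is the one used elsewhere on this list and may differ on your system, and `run_fetcher` is defined but deliberately not called here.

```python
import json
import os
import subprocess

FETCHER = "/usr/libexec/mesos/mesos-fetcher"  # assumed install path; adjust as needed

def build_fetcher_info(uri, sandbox="/tmp/abc"):
    """Serialize the fetcher JSON; json.dumps keeps quotes and braces balanced."""
    info = {
        "sandbox_directory": sandbox,
        "items": [{"uri": {"value": uri}, "action": "BYPASS_CACHE"}],
    }
    return json.dumps(info)

def run_fetcher(uri, sandbox="/tmp/abc"):
    """Run mesos-fetcher with MESOS_FETCHER_INFO set and return its exit code."""
    env = dict(os.environ, MESOS_FETCHER_INFO=build_fetcher_info(uri, sandbox))
    return subprocess.run([FETCHER], env=env).returncode

print(build_fetcher_info("https://path/to/file"))
```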

On 03/28/2017 09:11 AM, Adam Cecile wrote:

Hello,

I tried something like this:

{ "sandbox_directory": "/tmp/abc",
  "items": [
    { "uri": "https://path/to/file", "action": "BYPASS_CACHE" }
  ]
}

But it's still failing. The item definition is not correct but I 
cannot understand what is expected (according to 
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L630)


Thanks...

Adam.

On 03/28/2017 08:52 AM, Adam Bordelon wrote:

See the FetcherInfo protobuf definition:
https://github.com/apache/mesos/blob/master/include/mesos/fetcher/fetcher.proto#L33

On Mon, Mar 27, 2017 at 11:46 PM, Adam Cecile wrote:


Hello,

Since a few version I'm not able to upgrade Mesos because it do
not want to download packages from my Nexus server anymore.

It complains about the SSL certificate, which is perfectly valid...


Anyway... I'd like to run mesos-fetcher by hand to trigger the
issue easily and checks system calls because I have some serious
doubts regarding the CAs being used.

It seems the tool expect a JSON file describing the file to be
dowloaded as MESOS_FETCHER_INFO environment variable. Can you
help me figuring out which format it is ?


Thanks in advance,


Regards, Adam.








Re: Debugging mesos-fetcher binary

2017-03-28 Thread Adam Cecile

Hello,

I tried something like this:

{ "sandbox_directory": "/tmp/abc",
  "items": [
    { "uri": "https://path/to/file", "action": "BYPASS_CACHE" }
  ]
}

But it's still failing. The item definition is not correct but I cannot 
understand what is expected (according to 
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L630)


Thanks...

Adam.

On 03/28/2017 08:52 AM, Adam Bordelon wrote:

See the FetcherInfo protobuf definition:
https://github.com/apache/mesos/blob/master/include/mesos/fetcher/fetcher.proto#L33

On Mon, Mar 27, 2017 at 11:46 PM, Adam Cecile wrote:


Hello,

Since a few version I'm not able to upgrade Mesos because it do
not want to download packages from my Nexus server anymore.

It complains about the SSL certificate, which is perfectly valid...


Anyway... I'd like to run mesos-fetcher by hand to trigger the
issue easily and checks system calls because I have some serious
doubts regarding the CAs being used.

It seems the tool expect a JSON file describing the file to be
dowloaded as MESOS_FETCHER_INFO environment variable. Can you help
me figuring out which format it is ?


Thanks in advance,


Regards, Adam.






Re:RE: framework for short living tasks (1-15 seconds)

2017-03-28 Thread vvshvv
Radek, I thought about the approach where jobs commit offsets after processing, but I haven't tried it yet (will it work?).

As for submitting back, I am afraid that a failure might kill the container before I even notice. I think I need to monitor the containers and their exit status codes and, in case of failure, relaunch them or tell the user that something went wrong.
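
The monitoring idea could look roughly like the Python sketch below. It is not tied to any Mesos framework API: `launch` is a hypothetical callable that starts one task and returns its exit code, and the retry limit is an arbitrary choice for illustration.

```python
def run_with_relaunch(launch, max_attempts=3):
    """Call launch() until it returns exit code 0 or the attempts run out.

    Returns (succeeded, exit_codes) so the caller can report failures to the user.
    """
    exit_codes = []
    for _ in range(max_attempts):
        code = launch()
        exit_codes.append(code)
        if code == 0:
            return True, exit_codes
    return False, exit_codes

# Stand-in for a container launch: fails twice, then succeeds.
outcomes = iter([137, 1, 0])
print(run_with_relaunch(lambda: next(outcomes)))
```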

Regards,
Uladzimir



On hubert.asa...@dlr.de, Mar 28, 2017 8:05 AM wrote:



Another approach (cli-based & short lived framework) for running multiple arbitrary one-off tasks…
 
https://github.com/asamerh4/mesos-batch
 
Cheers,
Hubert
 
From: vvshvv [mailto:vvs...@gmail.com]

Sent: Monday, 27 March 2017 20:53
To: user@mesos.apache.org
Subject: framework for short living tasks (1-15 seconds)
 

Hi,

I want to run some image processing tasks on a Mesos cluster, but there will be a lot of tasks at the same time (1k-10k).

What framework do you suggest using in that case?

Regards,
Uladzimir