[MesosCon][Slides] Any slides sharing plan for MesosCon

2016-06-03 Thread Guangya Liu
Unlike after last year's MesosCon, the slides have not been shared so far at
http://events.linuxfoundation.org/events/mesoscon-north-america . Is there any
plan to share them?

Thanks,

Guangya


Re: Completed executors presented as alive

2016-06-03 Thread haosdent
> 13:33:39.031054  [slave.cpp:2643] Got registration for executor
'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
f65b163c-0faf-441f-ac14-91739fa4394c- from executor(1)@
10.55.97.170:60083

Yes, according to your log, your executor is still running. If your
executor is http_command_executor,
you could use
https://github.com/apache/mesos/blob/master/docs/executor-http-api.md#shutdown
to shut it down.
If it is another type of executor, there does not seem to be an API to shut it
down, as far as I know. I am not sure whether killing the executor on the
agent would resolve your problem.
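
To find other executors in the same situation, a script along these lines
could scan the agent's state output. This is only a sketch: it assumes the
response has the shape of the truncated JSON quoted below and that each
framework entry carries an `id` field (not visible in the truncated output).
The helper name `idle_executors`, the set of terminal states, and the sample
values are made up for illustration.

```python
import json

# Task states treated as terminal here; this set is an assumption.
TERMINAL_STATES = {
    "TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST", "TASK_ERROR",
}


def idle_executors(state):
    """Return (framework_id, executor_id) pairs for executors that are still
    listed under "executors" but only have terminal completed tasks."""
    found = []
    for framework in state.get("frameworks", []):
        for executor in framework.get("executors", []):
            # Anything queued or running means the executor is legitimately alive.
            if executor.get("queued_tasks") or executor.get("tasks"):
                continue
            completed = executor.get("completed_tasks", [])
            if completed and all(
                task.get("state") in TERMINAL_STATES for task in completed
            ):
                found.append((framework.get("id"), executor.get("id")))
    return found


# Hypothetical, shortened sample in the shape of the agent state response.
sample_state = json.loads("""
{
  "frameworks": [
    {
      "id": "framework-1",
      "executors": [
        {
          "id": "service.example",
          "queued_tasks": [],
          "tasks": [],
          "completed_tasks": [{"id": "service.example", "state": "TASK_KILLED"}]
        }
      ]
    }
  ]
}
""")

for framework_id, executor_id in idle_executors(sample_state):
    print(framework_id, executor_id)
```

In practice the input would be the JSON saved from the agent's state endpoint
rather than the inline sample.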

On Fri, Jun 3, 2016 at 4:33 PM, Tomek Janiszewski  wrote:

> Here is truncated response from slave(1)/state
>
> {
> "attributes": {...},
> "completed_frameworks": [],
> "flags": {...},
> "frameworks": [
> {
> "checkpoint": true,
> "completed_executors": [...],
> "executors": [
>   {
>   "queued_tasks": [],
>   "tasks": [],
>   "completed_tasks": [
>   {
>   "discovery": {...},
>   "executor_id": "",
>   "framework_id":
> "f65b163c-0faf-441f-ac14-91739fa4394c-",
>   "id":
> "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
>   "labels": [...],
>   "name": "service",
>   "resources": {...},
>   "slave_id":
> "ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13",
>   "state": "TASK_KILLED",
>   "statuses": []
>   }
>   ],
>   "container": "ead42e63-ac92-4ad0-a99c-4af9c3fa5e31",
>   "directory": "...",
>   "id": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
>   "name": "Command Executor (Task:
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e) (Command: sh -c 'cd
> service...')",
>   "resources": {...},
>   "source": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e"
>
>   },
>   ...
> ],
> }
> ],
> "git_sha": "961edbd82e691a619a4c171a7aadc9c32957fa73",
> "git_tag": "0.28.0",
> "version": "0.28.0",
> ...
> }
>
> Here is the log for this container:
>
> > 13:33:19.479182  [slave.cpp:1361] Got assigned task
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
> f65b163c-0faf-441f-ac14-91739fa4394c-
> > 13:33:19.482566  [slave.cpp:1480] Launching task
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
> f65b163c-0faf-441f-ac14-91739fa4394c-
> > 13:33:19.483921  [paths.cpp:528] Trying to chown
>
> '/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
> to user 'mesosuser'
> > 13:33:19.504173  [slave.cpp:5367] Launching executor
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> f65b163c-0faf-441f-ac14-91739fa4394c- with resources cpus(*):0.1;
> mem(*):32 in work directory
>
> '/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
> > 13:33:19.505537  [containerizer.cpp:666] Starting container
> 'ead42e63-ac92-4ad0-a99c-4af9c3fa5e31' for executor
> 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> 'f65b163c-0faf-441f-ac14-91739fa4394c-'
> > 13:33:19.505734  [slave.cpp:1698] Queuing task
> 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' for executor
> 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> f65b163c-0faf-441f-ac14-91739fa4394c-
> ...
> > 13:33:19.977483  [containerizer.cpp:1118] Checkpointing executor's forked
> pid 25576 to
>
> '/tmp/mesos/meta/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31/pids/forked.pid'
> > 13:33:35.775195  [slave.cpp:1891] Asked to kill task
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> f65b163c-0faf-441f-ac14-91739fa4394c-
> > 13:33:35.775645  [slave.cpp:3002] Handling status update TASK_KILLED
> (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
> service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> f65b163c-0faf-441f-ac14-91739fa4394c- from @0.0.0.0:0
> > 13:33:35.778105  [cpushare.cpp:389] Updated 'cpu.shares' to 102 (cpus
> 0.1) for container ead42e63-ac92-4ad0-a99c-4af9c3fa5e31
> > 13:33:35.778488  [disk.cpp:169] Updating the disk resources for container
> ead42e63-ac92-4ad0-a99c-4af9c3fa5e31 to cpus(*):0.1; mem(*):32
> > 13:33:35.780349  [mem.cpp:353] Updated 

Re: Documentation about debugging mesos-master : newbie

2016-06-03 Thread Vinit Mahedia
To close the loop on this: I got it working with a fresh vagrant VM. Earlier
I was trying on a VM that I was also using for other things.

Thank you *Gilbert* and others for your time and help :-)

I revived an old thread for the kafka issue; refer to that if you are interested.

On Fri, Jun 3, 2016 at 2:21 PM, Vinit Mahedia 
wrote:

> Guangya -
>
> I just checked that. I am using v7.10 which has the fix for that issue. I
> also tried to
> build it on ubuntu(vagrant machine) 14.04, how Gilbert and others are
> doing it, but
> there I got this linking error so I think here I might have to work out
> some issues as
> there are already others who have it functioning unlike Mac OS:
>
>
> Error on ubuntu :"make"(during link phase)
>
> libtool: link: `slave/libmesos_no_3rdparty_la-validation.lo' is not a
> valid libtool object
>
>
> Benno -
>
> That is the reason I tried to build the whole thing as static by passing
> --enable-static
> flag, although that did not work either. With default setting, shared
> library, it gives that
> infamous error of "memory address not accessible". The only place where I
> could set
> break point was "main" which I believe was due to GTEST macro. Now, I am
> ditching the
> setup on mac os and trying to get it working on ubuntu like others here.
>
> About the issue I am facing with Kafka, I found this thread where someone
> else is running into
> same issue as me, registering the framework. So I will be replying to
> that.Here it is -
> mesos/kafka framework registration issue
> ,
> for curious folks.
>
>
> Thank you for your time, guys.
>
>
>
>
>
>
>
>
> On Fri, Jun 3, 2016 at 6:45 AM, Shuai Lin  wrote:
>
> +1 for setting CFLAGS and CXXFLAGS, I used to call configure like this:
>>
>> ```
>> CFLAGS=-ggdb3 CXXFLAGS=-ggdb3 ../configure ...
>> ```
>>
>> On Fri, Jun 3, 2016 at 6:13 PM, Evers Benno 
>> wrote:
>>
>> > A random guess, but gdb tends to load shared libraries only after you do
>> > "run" for the first time, maybe that's what's missing?
>> >
>> > Apart from this, after installation mesos is just a normal c++ binary,
>> > so you could bypass libtool by installing in some custom prefix (not
>> > sure how to do this on mac) and then using gdb manually with `gdb --args
>> > /usr/local/bin/mesos-master --work_dir= ...`
>> >
>> > Also, I vaguely remember some build issues with flags not being passed
>> > correctly to all third-party dependencies, so you probably want
>> > "CFLAGS=-g3 CXXFLAGS=-g3" in your environment in addition to
>> > --enable-debug.
>> >
>> > Best regards,
>> > Benno
>> >
>> >
>> > On 03.06.2016 02:23, Guangya Liu wrote:
>> > > Hi Vinit,
>> > >
>> > > Please check if you are encountering this issue:
>> > > https://github.com/Homebrew/homebrew-dupes/issues/221
>> > >
>> > > Thanks,
>> > >
>> > > Guangya
>> > >
>> > > On Fri, Jun 3, 2016 at 2:24 AM, Vinit Mahedia > >
>> > > wrote:
>> > >
>> > >> Hi Gilbert,
>> > >>
>> > >> Thank you for replying.
>> > >>
>> > >> Yes, I did that.
>> > >>
>> > >>
>> > >>1.  ./configure --enable-debug --disable-java --disable-python
>> > >>2.  make
>> > >>3. ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=.
>> > >>
>> > >> Although even after setting source directory, I can not set
>> breakpoint I
>> > >> get warning like this
>> > >>
>> > >> (gdb) break master.cpp:2481
>> > >> Cannot access memory at address 0x714d40
>> > >>
>> > >>
>> > >> I also tried few things, passing "static" flag to libtool, passing
>> > >>  "--enable-static"
>> > >>
>> > >> Although I got linker error, where I saw libtool was not using
>> --static
>> > >> flag and I do
>> > >> not know if doing that will fix it. I forgot to mention that am
>> building
>> > >> this on Mac OS.
>> > >>
>> > >> Thank you.
>> > >>
>> > >>
>> > >>
>> > >> On Thu, Jun 2, 2016 at 12:33 PM, Gilbert Song > >
>> > >> wrote:
>> > >>
>> > >>> Hi Vinit,
>> > >>>
>> > >>> Did you configure with debug mode (e.g., ../configure
>> --enable-debug)?
>> > >>>
>> > >>> Assuming you have the gdb installed, you should be able to debug
>> mesos
>> > >>> master
>> > >>> in gdb:
>> > >>>
>> > >>> ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
>> > >>>
>> > >>>
>> > >>> Gilbert
>> > >>>
>> > >>> On Thu, Jun 2, 2016 at 9:30 AM, Vinit Mahedia <
>> vinitmahe...@gmail.com>
>> > >>> wrote:
>> > >>>
>> >  I have been trying to debug mesos-master using gdb-mesos-master.sh
>> > >>> although
>> >  it does not load symbols or sources. I tried to set those paths as
>> > well
>> > >>> but
>> >  since it thinks mesos-master, libtool script, is the main binary.
>> > 
>> >  I just want to set the dev environment and try to fix a very stupid
>> > bug
>> > >>> to
>> >  learn the work flow of test/debug/commit.
>> > 
>> > 

Re: mesos/kafka issues (reviving old thread)

2016-06-03 Thread Vinit Mahedia
Justin,

There's certainly a bug somewhere, either in the mesos kafka framework or in
mesos itself. If I can get the mesos master running under a debugger, this
would be over in maybe a few hours or a few days.

Kafka is a stateful service so can't run on marathon directly, there has to
be some framework/wrapper
which manages brokers - Read here

.

On Fri, Jun 3, 2016 at 4:36 PM, Justin Ryan  wrote:

> inline
>
> On 6/3/16, 2:19 PM, "Vinit Mahedia"  wrote:
>
> >Justin,
> >Yeah - as long as everything is on a single box (mesos-kafka scheduler,
> mesos-master, zk etc.)
> >things work just fine, which is what I meant by local setup.
> >
> >
> >I did a local cluster setup as well, 3 vagrant machines, where it does
> >not work. So it does not work at all
> >if you have multi node setup (vagrant machines or bare metal).
> >
>
> I do have two working clusters, but the mesos-kafka scheduler must run on
> the active mesos-master leader, which may also be an indication of another
> symptom that just *happens* to coincide with this.
>
> >
> >others,
> >
> >Is there any alternative for getting kafka up and running on mesos  other
> than this -
> >mesos/kafka  ?
> >
>
> I thought about this for a while, but digging into the code, this is just
> an extension of MesosFramework, which is core mesos functionality.
>
> We’re either missing something or have exposed a bug.
>
> Of course, a person could just run kafka brokers with marathon, but I
> think this is worth the trouble and is fundamentally how things should be
> done.
>
>
> 
>
> Please consider the environment before printing this e-mail
>
> The information in this electronic mail message is the sender's
> confidential business and may be legally privileged. It is intended solely
> for the addressee(s). Access to this internet electronic mail message by
> anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it is prohibited and may be unlawful. The sender
> believes that this E-mail and any attachments were free of any virus, worm,
> Trojan horse, and/or malicious code when sent. This message and its
> attachments could have been infected during transmission. By reading the
> message and opening any attachments, the recipient accepts full
> responsibility for taking protective and remedial action about viruses and
> other defects. The sender's employer is not liable for any loss or damage
> arising in any way.
>



-- 
~Vinit


Re: mesos/kafka issues (reviving old thread)

2016-06-03 Thread Vinit Mahedia
Justin,
Yeah - as long as everything is on a single box (mesos-kafka scheduler,
mesos-master, zk etc.)
things work just fine, which is what I meant by local setup.

I did a local cluster setup as well, 3 vagrant machines, where it does *not*
work. So it does not work at all
if you have a multi-node setup (vagrant machines or bare metal).

others,

Is there any alternative for getting kafka up and running on mesos  other
than this - mesos/kafka  ?



On Fri, Jun 3, 2016 at 3:53 PM, Justin Ryan  wrote:

> Hiya Vinit,
>
> I’ve made some progress, I have a conditionally working setup, and another
> setup which was working now failing in new ways.
>
> It does sound like your captures are similar to mine, what I found is that
> if I run the scheduler on, say, zk01 (which is also a mesos-master), while
> it is the leader, things work fine.  If mesos-master fails over to zk02 or
> zk03, the scheduler running on zk01 stops working, though its’ config
> points at all three machines.  Obviously this makes it difficult to run the
> scheduler itself as a mesos task.
>
> Though I wish neither of us were having this problem, it’s good to not
> feel like I’m crazy, I kinda ran out of things to test until one day it
> occurred to me to check if it matters whether the scheduler runs on the
> active mesos-master.  I can think of a couple reasons this would be so, but
> haven’t had time to narrow it down.
>
> Cheers!
>
> Justin
>
> On 6/3/16, 12:52 PM, "Vinit Mahedia"  wrote:
>
> >
> >
> >Hey Justin,
> >
> >
> >I am running in the same issues as you mentioned in this old
> > thread <
> https://mail-archives.apache.org/mod_mbox/mesos-user/201604.mbox/%3c89398c43-d45c-4653-8c0a-5ac987395...@ziprealty.com%3E>,
> did you resolve it?
> >
> >
> >I see that kafka framework sends a POST request to register itself but
> mesos master logs does not even show that the request was received, packet
> capture does. My guess was that something is
> > wrong in HTTP request but doing this same thing on local setup works
> fine so that can't be the case. I can capture the requests on the master
> node so there's no network issues either, just like in your case. I also
> verified that two machines can communicate
> > on using netcat as well.
> >
> >
> >
> >Like you mentioned in your thread, once it registered with mesos but that
> was the only time that happened, the brokers did not start even then -
> "start" timed out.
> >
> >
> >^that behavior hints at possible bug in mesos, where it sits on this
> request for too long and some times under some conditions, it gets through,
> that's when it works.
> >
> >
> >PS I have network capture showing POST from framework to master, if
> anyone wants to take a look at it.
> >
> >
> >--
> >Vinit
> >
> >
>
>



-- 
~Vinit


Re: WebUI authentication in 1.0.0-rc1

2016-06-03 Thread Michael Park
Hello, I'm not exactly sure about whether the behavior is undesired or not.

But I think the ACL that you're missing is `GetEndpoint`:
https://github.com/apache/mesos/blob/master/include/mesos/authorizer/acls.proto#L183-L190
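
As a rough sketch of what such an entry might look like, the snippet below
builds an ACL document with one `get_endpoints` entry. The field names
(`get_endpoints`, `principals`, `paths`) are my reading of the linked
acls.proto and should be checked against your version. The helper name
`with_get_endpoint` is made up, and whether an `ANY` principal also covers
requests made with `--no-authenticate_http` is an assumption, not something
confirmed in this thread.

```python
import copy
import json


def with_get_endpoint(acls, paths):
    """Return a copy of `acls` with one more get_endpoints entry that lets
    any principal GET the given endpoint paths."""
    updated = copy.deepcopy(acls)
    updated.setdefault("get_endpoints", []).append(
        {"principals": {"type": "ANY"}, "paths": {"values": list(paths)}}
    )
    return updated


# Existing ACLs as described in the thread; the other entries are elided there.
existing = {"permissive": "false"}
acls = with_get_endpoint(existing, ["/metrics/snapshot"])

# The printed JSON is what would be passed to the master via --acls.
print(json.dumps(acls, indent=2))
```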

Hope that helps,

MPark

On 3 June 2016 at 12:36, Evers Benno  wrote:

>
> I just tried building and running the 1.0.0-rc1, and it seems that the
> web UI is broken due to /metrics/snapshot returning a 403. (There's a
> popup continuously displaying "Failed to connect to
> mesos-master.example.org:5050!"
>
> I'm running mesos-master with options `--no-authenticate_http
> --acls={"permissive": "false", [...]}`, so I'm not completely sure if
> this behaviour is as desired or not. (although it's certainly unexpected)
>
> Regardless, I looked around for a while, but I couldn't figure out what
> to add to the ACL to restore unauthorized viewing access for everyone?
>
> Best regards,
> Benno
>


mesos/kafka issues (reviving old thread)

2016-06-03 Thread Vinit Mahedia
Hey Justin,

I am running into the same issue you mentioned in this old thread.
Did you resolve it?

I see that the kafka framework sends a POST request to register itself, but
the mesos master logs do not even show that the request was received, although
a packet capture does. My guess was that something is wrong in the HTTP
request, but doing the same thing on a local setup works fine, so that can't
be the case. I can capture the requests on the master node, so there are no
network issues either, just like in your case. I also verified with netcat
that the two machines can communicate.
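
For anyone who wants to repeat the reachability check without netcat, a
minimal sketch follows. It only tests whether a TCP connection can be opened,
which is roughly what the netcat test shows. The function name `can_connect`
is made up, and the demo connects to a throwaway local listener rather than a
real master.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Demo against a temporary local listener so the snippet runs anywhere.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(can_connect("127.0.0.1", demo_port))
listener.close()
```

On a real cluster this would be run from each machine towards the other, for
example against the master's port 5050.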

As you mentioned in your thread, it registered with mesos once, but that
was the only time that happened, and the brokers did not start even then:
"start" timed out.

That behavior hints at a possible bug in mesos, where it sits on this
request for too long and sometimes, under some conditions, it gets through;
that's when it works.

PS I have network capture showing POST from framework to master, if anyone
wants to take a look at it.

-- 
Vinit


Re: Documentation about debugging mesos-master : newbie

2016-06-03 Thread Vinit Mahedia
Guangya -

I just checked that. I am using v7.10, which has the fix for that issue. I
also tried to build it on ubuntu 14.04 (a vagrant machine), the way Gilbert
and others are doing it, but there I got the linking error below. I think I
should work out the issues there, since others already have it functioning on
ubuntu, unlike on Mac OS:


Error on ubuntu during "make" (link phase):

libtool: link: `slave/libmesos_no_3rdparty_la-validation.lo' is not a valid
libtool object


Benno -

That is the reason I tried to build the whole thing as static by passing the
--enable-static flag, although that did not work either. With the default
setting, shared library, it gives that infamous error of "memory address not
accessible". The only place where I could set a breakpoint was "main", which I
believe was due to the GTEST macro. Now I am ditching the setup on Mac OS and
trying to get it working on ubuntu like others here.

About the issue I am facing with Kafka, I found this thread where someone
else is running into the same issue as me, registering the framework, so I
will be replying to that. Here it is:
mesos/kafka framework registration issue,
for curious folks.


Thank you for your time, guys.








On Fri, Jun 3, 2016 at 6:45 AM, Shuai Lin  wrote:

+1 for setting CFLAGS and CXXFLAGS, I used to call configure like this:
>
> ```
> CFLAGS=-ggdb3 CXXFLAGS=-ggdb3 ../configure ...
> ```
>
> On Fri, Jun 3, 2016 at 6:13 PM, Evers Benno  wrote:
>
> > A random guess, but gdb tends to load shared libraries only after you do
> > "run" for the first time, maybe that's what's missing?
> >
> > Apart from this, after installation mesos is just a normal c++ binary,
> > so you could bypass libtool by installing in some custom prefix (not
> > sure how to do this on mac) and then using gdb manually with `gdb --args
> > /usr/local/bin/mesos-master --work_dir= ...`
> >
> > Also, I vaguely remember some build issues with flags not being passed
> > correctly to all third-party dependencies, so you probably want
> > "CFLAGS=-g3 CXXFLAGS=-g3" in your environment in addition to
> > --enable-debug.
> >
> > Best regards,
> > Benno
> >
> >
> > On 03.06.2016 02:23, Guangya Liu wrote:
> > > Hi Vinit,
> > >
> > > Please check if you are encountering this issue:
> > > https://github.com/Homebrew/homebrew-dupes/issues/221
> > >
> > > Thanks,
> > >
> > > Guangya
> > >
> > > On Fri, Jun 3, 2016 at 2:24 AM, Vinit Mahedia 
> > > wrote:
> > >
> > >> Hi Gilbert,
> > >>
> > >> Thank you for replying.
> > >>
> > >> Yes, I did that.
> > >>
> > >>
> > >>1.  ./configure --enable-debug --disable-java --disable-python
> > >>2.  make
> > >>3. ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=.
> > >>
> > >> Although even after setting source directory, I can not set
> breakpoint I
> > >> get warning like this
> > >>
> > >> (gdb) break master.cpp:2481
> > >> Cannot access memory at address 0x714d40
> > >>
> > >>
> > >> I also tried few things, passing "static" flag to libtool, passing
> > >>  "--enable-static"
> > >>
> > >> Although I got linker error, where I saw libtool was not using
> --static
> > >> flag and I do
> > >> not know if doing that will fix it. I forgot to mention that am
> building
> > >> this on Mac OS.
> > >>
> > >> Thank you.
> > >>
> > >>
> > >>
> > >> On Thu, Jun 2, 2016 at 12:33 PM, Gilbert Song 
> > >> wrote:
> > >>
> > >>> Hi Vinit,
> > >>>
> > >>> Did you configure with debug mode (e.g., ../configure
> --enable-debug)?
> > >>>
> > >>> Assuming you have the gdb installed, you should be able to debug
> mesos
> > >>> master
> > >>> in gdb:
> > >>>
> > >>> ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
> > >>>
> > >>>
> > >>> Gilbert
> > >>>
> > >>> On Thu, Jun 2, 2016 at 9:30 AM, Vinit Mahedia <
> vinitmahe...@gmail.com>
> > >>> wrote:
> > >>>
> >  I have been trying to debug mesos-master using gdb-mesos-master.sh
> > >>> although
> >  it does not load symbols or sources. I tried to set those paths as
> > well
> > >>> but
> >  since it thinks mesos-master, libtool script, is the main binary.
> > 
> >  I just want to set the dev environment and try to fix a very stupid
> > bug
> > >>> to
> >  learn the work flow of test/debug/commit.
> > 
> >  If I can get it working, I can help to write if such documentation
> > does
> > >>> not
> >  exist. I also tried to set it up on eclipse CDT but it can't handle
> > >>> libtool
> >  scripts.
> > 
> >  Thank you.
> > 
> > >>>
> > >>
> > >
> >
>


WebUI authentication in 1.0.0-rc1

2016-06-03 Thread Evers Benno

I just tried building and running 1.0.0-rc1, and it seems that the
web UI is broken due to /metrics/snapshot returning a 403. (There's a
popup continuously displaying "Failed to connect to
mesos-master.example.org:5050!")

I'm running mesos-master with options `--no-authenticate_http
--acls={"permissive": "false", [...]}`, so I'm not completely sure if
this behaviour is as desired or not (although it's certainly unexpected).

Regardless, I looked around for a while, but I couldn't figure out what
to add to the ACL to restore unauthorized viewing access for everyone.

Best regards,
Benno


Re: Documentation about debugging mesos-master : newbie

2016-06-03 Thread Shuai Lin
+1 for setting CFLAGS and CXXFLAGS, I used to call configure like this:

```
CFLAGS=-ggdb3 CXXFLAGS=-ggdb3 ../configure ...
```

On Fri, Jun 3, 2016 at 6:13 PM, Evers Benno  wrote:

> A random guess, but gdb tends to load shared libraries only after you do
> "run" for the first time, maybe that's what's missing?
>
> Apart from this, after installation mesos is just a normal c++ binary,
> so you could bypass libtool by installing in some custom prefix (not
> sure how to do this on mac) and then using gdb manually with `gdb --args
> /usr/local/bin/mesos-master --work_dir= ...`
>
> Also, I vaguely remember some build issues with flags not being passed
> correctly to all third-party dependencies, so you probably want
> "CFLAGS=-g3 CXXFLAGS=-g3" in your environment in addition to
> --enable-debug.
>
> Best regards,
> Benno
>
>
> On 03.06.2016 02:23, Guangya Liu wrote:
> > Hi Vinit,
> >
> > Please check if you are encountering this issue:
> > https://github.com/Homebrew/homebrew-dupes/issues/221
> >
> > Thanks,
> >
> > Guangya
> >
> > On Fri, Jun 3, 2016 at 2:24 AM, Vinit Mahedia 
> > wrote:
> >
> >> Hi Gilbert,
> >>
> >> Thank you for replying.
> >>
> >> Yes, I did that.
> >>
> >>
> >>1.  ./configure --enable-debug --disable-java --disable-python
> >>2.  make
> >>3. ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=.
> >>
> >> Although even after setting source directory, I can not set breakpoint I
> >> get warning like this
> >>
> >> (gdb) break master.cpp:2481
> >> Cannot access memory at address 0x714d40
> >>
> >>
> >> I also tried few things, passing "static" flag to libtool, passing
> >>  "--enable-static"
> >>
> >> Although I got linker error, where I saw libtool was not using --static
> >> flag and I do
> >> not know if doing that will fix it. I forgot to mention that am building
> >> this on Mac OS.
> >>
> >> Thank you.
> >>
> >>
> >>
> >> On Thu, Jun 2, 2016 at 12:33 PM, Gilbert Song 
> >> wrote:
> >>
> >>> Hi Vinit,
> >>>
> >>> Did you configure with debug mode (e.g., ../configure --enable-debug)?
> >>>
> >>> Assuming you have the gdb installed, you should be able to debug mesos
> >>> master
> >>> in gdb:
> >>>
> >>> ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
> >>>
> >>>
> >>> Gilbert
> >>>
> >>> On Thu, Jun 2, 2016 at 9:30 AM, Vinit Mahedia 
> >>> wrote:
> >>>
>  I have been trying to debug mesos-master using gdb-mesos-master.sh
> >>> although
>  it does not load symbols or sources. I tried to set those paths as
> well
> >>> but
>  since it thinks mesos-master, libtool script, is the main binary.
> 
>  I just want to set the dev environment and try to fix a very stupid
> bug
> >>> to
>  learn the work flow of test/debug/commit.
> 
>  If I can get it working, I can help to write if such documentation
> does
> >>> not
>  exist. I also tried to set it up on eclipse CDT but it can't handle
> >>> libtool
>  scripts.
> 
>  Thank you.
> 
> >>>
> >>
> >
>


Re: Documentation about debugging mesos-master : newbie

2016-06-03 Thread Evers Benno
A random guess, but gdb tends to load shared libraries only after you do
"run" for the first time, maybe that's what's missing?

Apart from this, after installation mesos is just a normal c++ binary,
so you could bypass libtool by installing in some custom prefix (not
sure how to do this on mac) and then using gdb manually with `gdb --args
/usr/local/bin/mesos-master --work_dir= ...`

Also, I vaguely remember some build issues with flags not being passed
correctly to all third-party dependencies, so you probably want
"CFLAGS=-g3 CXXFLAGS=-g3" in your environment in addition to --enable-debug.
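
Put together, the steps above could be scripted roughly as follows. This is a
sketch under assumptions, not a tested build recipe: the helper names, the
example prefix `/opt/mesos-debug`, the example work directory and the location
of the installed binary are all hypothetical, and `--prefix` is the standard
autoconf option rather than anything Mesos-specific.

```python
import os
import shlex


def debug_build_env(base_env=None, debug_flag="-g3"):
    """Copy the environment and force debug info for C and C++ builds."""
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = debug_flag
    env["CXXFLAGS"] = debug_flag
    return env


def configure_command(prefix):
    """configure invocation for a debug build installed into a custom prefix."""
    return ["../configure", "--enable-debug", f"--prefix={prefix}"]


def gdb_command(binary, work_dir):
    """Run gdb directly on the installed binary, bypassing the libtool script."""
    return ["gdb", "--args", binary, f"--work_dir={work_dir}"]


# Hypothetical paths; adjust to wherever "make install" puts mesos-master.
prefix = "/opt/mesos-debug"
env = debug_build_env()
print("CFLAGS=%s CXXFLAGS=%s" % (env["CFLAGS"], env["CXXFLAGS"]),
      " ".join(shlex.quote(arg) for arg in configure_command(prefix)))
print(" ".join(shlex.quote(arg)
               for arg in gdb_command("/path/to/installed/mesos-master",
                                      "/tmp/mesos-work")))
```

The command lists can be handed to `subprocess.run(..., env=env)` from the
build directory once the paths are right.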

Best regards,
Benno


On 03.06.2016 02:23, Guangya Liu wrote:
> Hi Vinit,
> 
> Please check if you are encountering this issue:
> https://github.com/Homebrew/homebrew-dupes/issues/221
> 
> Thanks,
> 
> Guangya
> 
> On Fri, Jun 3, 2016 at 2:24 AM, Vinit Mahedia 
> wrote:
> 
>> Hi Gilbert,
>>
>> Thank you for replying.
>>
>> Yes, I did that.
>>
>>
>>1.  ./configure --enable-debug --disable-java --disable-python
>>2.  make
>>3. ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=.
>>
>> Although even after setting source directory, I can not set breakpoint I
>> get warning like this
>>
>> (gdb) break master.cpp:2481
>> Cannot access memory at address 0x714d40
>>
>>
>> I also tried few things, passing "static" flag to libtool, passing
>>  "--enable-static"
>>
>> Although I got linker error, where I saw libtool was not using --static
>> flag and I do
>> not know if doing that will fix it. I forgot to mention that am building
>> this on Mac OS.
>>
>> Thank you.
>>
>>
>>
>> On Thu, Jun 2, 2016 at 12:33 PM, Gilbert Song 
>> wrote:
>>
>>> Hi Vinit,
>>>
>>> Did you configure with debug mode (e.g., ../configure --enable-debug)?
>>>
>>> Assuming you have the gdb installed, you should be able to debug mesos
>>> master
>>> in gdb:
>>>
>>> ./bin/gdb-mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
>>>
>>>
>>> Gilbert
>>>
>>> On Thu, Jun 2, 2016 at 9:30 AM, Vinit Mahedia 
>>> wrote:
>>>
 I have been trying to debug mesos-master using gdb-mesos-master.sh
>>> although
 it does not load symbols or sources. I tried to set those paths as well
>>> but
 since it thinks mesos-master, libtool script, is the main binary.

 I just want to set the dev environment and try to fix a very stupid bug
>>> to
 learn the work flow of test/debug/commit.

 If I can get it working, I can help to write if such documentation does
>>> not
 exist. I also tried to set it up on eclipse CDT but it can't handle
>>> libtool
 scripts.

 Thank you.

>>>
>>
> 


Re: Completed executors presented as alive

2016-06-03 Thread Tomek Janiszewski
Here is the truncated response from slave(1)/state

{
  "attributes": {...},
  "completed_frameworks": [],
  "flags": {...},
  "frameworks": [
    {
      "checkpoint": true,
      "completed_executors": [...],
      "executors": [
        {
          "queued_tasks": [],
          "tasks": [],
          "completed_tasks": [
            {
              "discovery": {...},
              "executor_id": "",
              "framework_id": "f65b163c-0faf-441f-ac14-91739fa4394c-",
              "id": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
              "labels": [...],
              "name": "service",
              "resources": {...},
              "slave_id": "ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13",
              "state": "TASK_KILLED",
              "statuses": []
            }
          ],
          "container": "ead42e63-ac92-4ad0-a99c-4af9c3fa5e31",
          "directory": "...",
          "id": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
          "name": "Command Executor (Task: service.a3b609b8-27ec-11e6-8044-02c89eb9127e) (Command: sh -c 'cd service...')",
          "resources": {...},
          "source": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e"
        },
        ...
      ],
    }
  ],
  "git_sha": "961edbd82e691a619a4c171a7aadc9c32957fa73",
  "git_tag": "0.28.0",
  "version": "0.28.0",
  ...
}

Here is the log for this container:

> 13:33:19.479182  [slave.cpp:1361] Got assigned task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
f65b163c-0faf-441f-ac14-91739fa4394c-
> 13:33:19.482566  [slave.cpp:1480] Launching task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
f65b163c-0faf-441f-ac14-91739fa4394c-
> 13:33:19.483921  [paths.cpp:528] Trying to chown
'/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
to user 'mesosuser'
> 13:33:19.504173  [slave.cpp:5367] Launching executor
service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c- with resources cpus(*):0.1;
mem(*):32 in work directory
'/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
> 13:33:19.505537  [containerizer.cpp:666] Starting container
'ead42e63-ac92-4ad0-a99c-4af9c3fa5e31' for executor
'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
'f65b163c-0faf-441f-ac14-91739fa4394c-'
> 13:33:19.505734  [slave.cpp:1698] Queuing task
'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' for executor
'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
f65b163c-0faf-441f-ac14-91739fa4394c-
...
> 13:33:19.977483  [containerizer.cpp:1118] Checkpointing executor's forked
pid 25576 to
'/tmp/mesos/meta/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31/pids/forked.pid'
> 13:33:35.775195  [slave.cpp:1891] Asked to kill task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c-
> 13:33:35.775645  [slave.cpp:3002] Handling status update TASK_KILLED
(UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c- from @0.0.0.0:0
> 13:33:35.778105  [cpushare.cpp:389] Updated 'cpu.shares' to 102 (cpus
0.1) for container ead42e63-ac92-4ad0-a99c-4af9c3fa5e31
> 13:33:35.778488  [disk.cpp:169] Updating the disk resources for container
ead42e63-ac92-4ad0-a99c-4af9c3fa5e31 to cpus(*):0.1; mem(*):32
> 13:33:35.780349  [mem.cpp:353] Updated 'memory.soft_limit_in_bytes' to
32MB for container ead42e63-ac92-4ad0-a99c-4af9c3fa5e31
> 13:33:35.782573  [status_update_manager.cpp:320] Received status update
TASK_KILLED (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c-
> 13:33:35.783860  [status_update_manager.cpp:824] Checkpointing UPDATE for
status update TASK_KILLED (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for
task service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c-
> 13:33:35.788767  [slave.cpp:3400] Forwarding the update TASK_KILLED
(UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
f65b163c-0faf-441f-ac14-91739fa4394c- to master@10.82.24.138:5050
> 13:33:35.917932