I added the jenkins user on the slave, and that was the missing piece. I'll add this to my PR for the readme. I'm getting much further now, but the fetch returns a 403:
/jenkins/computer/mesos-jenkins-6f4719c8-1c61-4b28-b5ab-ba298e846840/slave-agent.jnlp: 403 Forbidden
    at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:261)
    at hudson.remoting.Launcher.run(Launcher.java:215)

and corresponding log on jenkins master:

Nov 7, 2013 2:38:39 PM winstone.Logger logInternal
INFO: While serving http://localhost:8080/jenkins/computer/mesos-jenkins-6f4719c8-1c61-4b28-b5ab-ba298e846840/slave-agent.jnlp: hudson.security.AccessDeniedException2: anonymous is missing the Slave/Connect permission

Going to look into what this means.


On Thu, Nov 7, 2013 at 2:21 PM, Vinod Kone <vinodk...@gmail.com> wrote:

> I looked at the code and it looks there are few places the executor might
> fail before it fetches the URI. Most of them have to do with incorrect
> permissions. The code was written to have any errors reported either in
> slave log or console or executor logs (there might be a bug here if we are
> in fact swallowing errors). IIUC, the executor log directory is empty in
> your case which suggests the executor died before it could even create
> "stdout" or "stderr" files in its sandbox (Is this true?).
>
> Couple of questions:
>
> What user is Jenkins master running as? Is that user known to the host on
> which mesos slave is running?
>
> How are you starting the mesos slave (e.g., cmd line flags)?
>
>
> On Thu, Nov 7, 2013 at 11:00 AM, Whitney Sorenson <wsoren...@hubspot.com> wrote:
>
>> The gist was compiled from that log. Here is the complete log from
>> toggling the jenkins plugin on / off (you see the ping statements
>> inbetween):
>>
>> https://gist.github.com/wsorenson/8bf64e44fd42da354fa0
>>
>>
>> On Thu, Nov 7, 2013 at 1:57 PM, Vinod Kone <vinodk...@gmail.com> wrote:
>>
>>> What does mesos-slave.err say?
>>>
>>>
>>> On Thu, Nov 7, 2013 at 10:49 AM, Whitney Sorenson <wsoren...@hubspot.com> wrote:
>>>
>>>> Hi Vinod,
>>>>
>>>> It's 0.14.0-rc4 in both.
>>>>
>>>> I believe we have logging working:
>>>>
>>>> -rw-r--r-- 1 root root 0 Oct 22 23:48 mesos-slave.out
>>>> lrwxrwxrwx 1 root root 63 Oct 22 23:48 mesos-slave.INFO -> mesos-slave.carousel.invalid-user.log.INFO.20131022-234823.5797
>>>> lrwxrwxrwx 1 root root 66 Oct 22 23:49 mesos-slave.WARNING -> mesos-slave.carousel.invalid-user.log.WARNING.20131022-234954.5797
>>>> drwxr-xr-x 2 root root 4096 Oct 22 23:49 .
>>>> -rw-rw-r-- 1 root root 4827 Nov 1 20:34 mesos-slave.carousel.invalid-user.log.WARNING.20131022-234954.5797
>>>> -rw-rw-r-- 1 root root 10408140 Nov 7 18:44 mesos-slave.carousel.invalid-user.log.INFO.20131022-234823.5797
>>>> -rw-r--r-- 1 root root 53759705 Nov 7 18:45 mesos-slave.err
>>>>
>>>> Is there something else to check? Is it possible the executor is
>>>> failing before it even attempts to fetch URIs?
>>>>
>>>> Ray - Thanks - yeah I found the jenkins logs. I was able to wget the
>>>> slave.jar, and even run it. The mesos-jenkins slaves are dead now, so I
>>>> can't connect to their slave-agent - but the jar does run. Not sure if the
>>>> window for trying to connect to one of the mesos launched slaves is long
>>>> enough to try before it is terminated due to failures. Interestingly, when
>>>> I try to connect to one of the existing slaves I get a 403.
>>>>
>>>> -Whitney
>>>>
>>>>
>>>> On Thu, Nov 7, 2013 at 1:34 PM, Vinod Kone <vinodk...@gmail.com> wrote:
>>>>
>>>>> Hey Whitney,
>>>>>
>>>>> What version of mesos are you using (both in the cluster and the
>>>>> plugin)?
>>>>>
>>>>> The slave should print stuff to console when it is launching executor
>>>>> (e.g., "Fetching resources..."). I don't see that in the gist you pasted.
>>>>> Are you capturing stdout/stderr of the slave?
>>>>>
>>>>>
>>>>> On Thu, Nov 7, 2013 at 10:30 AM, Whitney Sorenson <wsoren...@hubspot.com> wrote:
>>>>>
>>>>>> Thanks Ray.
>>>>>>
>>>>>> I have very similar issue (empty executor directories) - but don't
>>>>>> have any issues curling the slave.jar URI - and I don't have any existing
>>>>>> JNLP process running. I don't have a jenkins user - is that the only setup
>>>>>> you did on the slave?
>>>>>>
>>>>>> -Whitney
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 7, 2013 at 1:16 PM, Ray Rodriguez <rayrod2...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Whitney I would have a look at this github issue where I work
>>>>>>> through some of my jenkins mesos-plugin issues with Vinod. Might be some
>>>>>>> of the same issues you are seeing.
>>>>>>> https://github.com/jenkinsci/mesos-plugin/issues/2
>>>>>>>
>>>>>>> Ray
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Nov 7, 2013 at 1:07 PM, Whitney Sorenson <wsoren...@hubspot.com> wrote:
>>>>>>>
>>>>>>>> Hi all!
>>>>>>>>
>>>>>>>> I am trying to get the Jenkins Mesos plugin functioning. I was able
>>>>>>>> to get it installed on our Jenkins master.
>>>>>>>>
>>>>>>>> However, it's unclear if there are any required steps for setting
>>>>>>>> up the slaves. When a framework task is launched, it fails instantly and
>>>>>>>> there are no logs in the runs folder.
>>>>>>>>
>>>>>>>> Here's a gist with relevant logs from the slave:
>>>>>>>>
>>>>>>>> https://gist.github.com/wsorenson/b3562c3e4a8992f9a46f/raw/ea5821c442d826456291330452208d8d7ac8418f/failing+jenkins+logs
>>>>>>>>
>>>>>>>> Any help on how to debug? At first, I thought maybe we needed
>>>>>>>> slave.jar or something but it looks like it's trying to fetch that from the
>>>>>>>> master using the URIs. To clarify, I have done no special jenkins related
>>>>>>>> setup (as per readme.md) on any of the slaves.
>>>>>>>>
>>>>>>>> -Whitney
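
For anyone hitting the same 403 on slave-agent.jnlp reported at the top of this thread, here is a rough sketch of how the fetch could be reproduced outside of Mesos, to confirm that the rejection comes from the Jenkins master itself and not from the executor. It is not part of the plugin. The helper names (jnlp_url, fetch_status, diagnose) are made up for illustration, and the base URL and node name are copied from the log above as placeholders. It makes a plain anonymous GET, which is assumed to match what the slave launcher does when no credentials are supplied.

```python
import urllib.error
import urllib.request


def jnlp_url(jenkins_base: str, node_name: str) -> str:
    """Build the slave-agent.jnlp URL for a node (same path as in the error above)."""
    return f"{jenkins_base.rstrip('/')}/computer/{node_name}/slave-agent.jnlp"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of an anonymous GET.

    4xx/5xx responses are returned as status codes instead of raising.
    Connection-level failures (urllib.error.URLError) still propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def diagnose(status: int) -> str:
    """Map a status code to a hint; the wording is illustrative only."""
    if status == 200:
        return "anonymous fetch works; the JNLP file is reachable"
    if status == 403:
        return ("master refused the anonymous request; check whether "
                "anonymous has the Slave/Connect permission")
    if status == 404:
        return "node not found; the mesos-jenkins slave may already be gone"
    return f"unexpected status {status}; check the Jenkins master log"


if __name__ == "__main__":
    # Placeholder values taken from the log in this thread; adjust for your setup.
    url = jnlp_url(
        "http://localhost:8080/jenkins",
        "mesos-jenkins-6f4719c8-1c61-4b28-b5ab-ba298e846840",
    )
    status = fetch_status(url)
    print(url, status, diagnose(status))
```

Since the mesos-launched slaves are short-lived, a 404 here may only mean the node was already removed, so it is worth running this while a task is being launched.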