Re: Mesos coarse mode not working (fine grained does)

2015-02-11 Thread Hans van den Bogert
Bumping a 1-on-1 conversation to the mailing list:

On 10 Feb 2015, at 13:24, Hans van den Bogert hansbog...@gmail.com wrote:

 
 It’s self-built; I can’t do otherwise, as I can’t install packages on the 
 cluster here.
 
 The problem seems to be with libtool. When compiling Mesos on a host with 
 apr-devel and apr-util-devel installed, the shared libraries are named 
 libapr*.so, without the version suffix (the versioned ones are also installed, 
 of course). On our compute nodes no *-devel packages are installed, just the 
 binary packages, whose libraries are named libapr*.so.0. But even the “make 
 install”-ed binaries still refer to the devel packages’ shared libraries. I’m 
 not sure whether this is intended behaviour on libtool’s part, because it is 
 the one that changes the binaries’ RPATH (which is initially well defined) at 
 start/run time so that they end up referring to libapr*.so. 
 
 But this is probably autoconf fu; I’m just hoping someone here has dealt with 
 the same issue.
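 
 For what it’s worth, a sketch of how I’ve been checking where the unversioned 
 libapr*.so reference comes from (the mesos-slave binary under 
 MESOS_HOME/build/src/.libs is just the build-tree location mentioned further 
 down; adjust the paths to your own layout):
 
   # which APR libraries does the binary actually request at run time?
   ldd $MESOS_HOME/build/src/.libs/mesos-slave | grep -i libapr
 
   # is an RPATH/RUNPATH baked in that points at the build host's devel libs?
   readelf -d $MESOS_HOME/build/src/.libs/mesos-slave | grep -E 'RPATH|RUNPATH'
 
   # libtool's .la files record absolute paths to the devel libraries
   find $MESOS_HOME/build -name '*.la' -exec grep -l '/usr/lib64/libapr' {} +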
 
 On 09 Feb 2015, at 20:37, Tim Chen t...@mesosphere.io wrote:
 
 I'm still trying to grasp what your environment setup is like; it's odd to 
 see g++ errors on stderr when you're running Mesos.
 
 Are you building Mesos yourself and running it, or have you installed it 
 through some package?
 
 Tim
 
 
 
 On Mon, Feb 9, 2015 at 1:03 AM, Hans van den Bogert hansbog...@gmail.com 
 wrote:
 Okay, I was being a bit ambiguous; I assume you mean this one:
 
 [vdbogert@node002 ~]$ cat 
 /local/vdbogert/var/lib/mesos/slaves/20150206-110658-16813322-5050-5515-S0/frameworks/20150208-200943-16813322-5050-26370-/executors/3/runs/latest/stdout
 [vdbogert@node002 ~]$
 
 it’s empty.
 
 On 09 Feb 2015, at 06:22, Tim Chen t...@mesosphere.io wrote:
 
 Hi Hans,
 
 I was referring to the stdout/stderr of the task, not the slave.
 
 Tim
 
 On Sun, Feb 8, 2015 at 1:21 PM, Hans van den Bogert hansbog...@gmail.com 
 wrote:
 
 
 
  Hi there,
 
  It looks like launching the executor (or one of the processes, like the 
  fetcher that fetches the URIs)
 The fetching, as well as the extracting, seems to have succeeded, since the 
 “spark-1.2.0-bin-hadoop2.4” directory exists in the slave sandbox. 
 Furthermore, the executor URI seems superfluous in my environment: I’ve 
 checked the code, and if a URI is not provided, the task will not refer to an 
 extracted distro but to a directory with the same path as the current Spark 
 distro, which makes sense in a cluster environment where the data is on a 
 network-shared disk. I’ve tried *not* supplying a spark.executor.uri, and 
 fine-grained mode still works fine. Coarse-grained mode still fails with the 
 same libapr* errors.
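 
 To be concrete, the two submits I’m comparing look roughly like this (a 
 sketch; the master host and application arguments are placeholders, not my 
 literal commands):
 
   # fine-grained mode (the default), no spark.executor.uri set -- works
   ./bin/spark-submit --master mesos://<master-host>:5050 <app-jar> <args>
 
   # coarse-grained mode, otherwise identical -- fails with the libapr* errors
   ./bin/spark-submit --master mesos://<master-host>:5050 \
     --conf spark.mesos.coarse=true <app-jar> <args>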
 
  was failing because of the dependency problem you see. Your mesos-slave 
  shouldn't be able to run, though; were you running a 0.20.0 slave and 
  upgraded to 0.21.0? We introduced the dependencies on libapr and libsvn in 
  Mesos 0.21.0.
 I’ve only ever tried compiling 0.21.0. I’ve checked all the binaries in 
 MESOS_HOME/build/src/.libs with ‘ldd’, and they all refer to a correct, 
 existing libapr*-1.so.0 (mind the trailing “.0”).
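 
 That check was essentially the following (a sketch of it, anyway):
 
   for b in $MESOS_HOME/build/src/.libs/*; do
     echo "== $b"
     ldd "$b" 2>/dev/null | grep -i libapr
   done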
 
  What's the stdout for the task like?
 
   The Mesos slaves' stdout files are empty.
 
 
 It’s a pity Spark’s logging in this case is pretty marginal, as is Mesos’. As 
 far as I can see, one can’t log the (raw) task descriptions, which would be 
 very helpful here.
 I could resort to building Spark from source as well and adding some logging, 
 but I’m afraid I would introduce other peculiarities. Do you think that’s my 
 only option?
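 
 One thing I could try first is turning up the driver-side logging instead of 
 rebuilding (a sketch; the logger name is my guess at the Spark 1.2 package 
 containing the Mesos scheduler backends, and it may still not surface the raw 
 task descriptions):
 
   cd $SPARK_HOME/conf
   cp log4j.properties.template log4j.properties
   # guess: the Mesos scheduler backends live under this package in Spark 1.2
   echo 'log4j.logger.org.apache.spark.scheduler.cluster.mesos=DEBUG' >> log4j.properties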
 
 Thanks,
 
 H.
 
  Tim
 
 
 
 
  On Mon, Feb 9, 2015 at 4:10 AM, Hans van den Bogert 
  hansbog...@gmail.com wrote:
  I wasn’t thorough; the complete stderr includes:
 
  g++: /usr/lib64/libaprutil-1.so: No such file or directory
  g++: /usr/lib64/libapr-1.so: No such file or directoryn
  (including that trailing ’n')
 
  Though I can’t figure out how the process indirection goes from the frontend 
  Spark application to the Mesos executors, or where this shared-library error 
  comes from.
 
  Hope someone can shed some light,
 
  Thanks
 
  On 08 Feb 2015, at 14:15, Hans van den Bogert hansbog...@gmail.com 
  wrote:
 
   Hi,
  
  
   I’m trying to get coarse mode to work under Mesos (0.21.0); I thought 
   this would be a trivial change, as Mesos was working well in 
   fine-grained mode.
  
   However, the Mesos tasks fail, and I can’t pinpoint where things go wrong.
  
   This is a mesos stderr log from a slave:
  
  Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
  I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading 
   'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to 
   '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
  I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource 
   

Mesos coarse mode not working (fine grained does)

2015-02-08 Thread Hans van den Bogert
Hi, 


I’m trying to get coarse mode to work under Mesos (0.21.0); I thought this would 
be a trivial change, as Mesos was working well in fine-grained mode.

However, the Mesos tasks fail, and I can’t pinpoint where things go wrong. 

This is a mesos stderr log from a slave:

Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading 
'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to 
'/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource 
'/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
 into 
'/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151’

The Mesos slaves' stdout files are empty.


And I can confirm the spark distro is correctly extracted:
$ ls
spark-1.2.0-bin-hadoop2.4  spark-1.2.0-bin-hadoop2.4.tgz  stderr  stdout

The spark-submit log is here:
http://pastebin.com/ms3uZ2BK

Mesos-master
http://pastebin.com/QH2Vn1jX

Mesos-slave
http://pastebin.com/DXFYemix


Can somebody point me to logs, etc., to investigate this further? I’m feeling 
kind of blind.
Furthermore, do the executors on Mesos inherit all configs from the Spark 
application/submit? E.g. I’ve given my executors 20 GB of memory through a 
spark-submit --conf parameter. Should these settings also be present in the 
spark-1.2.0-bin-hadoop2.4.tgz distribution’s configs?
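
For reference, my submit looks roughly like the following (a sketch; the 
master host and application arguments are placeholders, and 20g corresponds to 
the memory setting mentioned above):

  ./bin/spark-submit \
    --master mesos://<master-host>:5050 \
    --conf spark.mesos.coarse=true \
    --conf spark.executor.memory=20g \
    --conf spark.executor.uri=http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz \
    <app-jar> <args>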

If I need to present more logs, etc., in order to be helped here, please let me 
know.

Regards,

Hans van den Bogert



Re: Mesos coarse mode not working (fine grained does)

2015-02-08 Thread Hans van den Bogert
I wasn’t thorough; the complete stderr includes:

g++: /usr/lib64/libaprutil-1.so: No such file or directory
g++: /usr/lib64/libapr-1.so: No such file or directoryn
(including that trailing ’n')

Though I can’t figure out how the process indirection goes from the frontend 
Spark application to the Mesos executors, or where this shared-library error 
comes from.

Hope someone can shed some light, 

Thanks

On 08 Feb 2015, at 14:15, Hans van den Bogert hansbog...@gmail.com wrote:

 Hi, 
 
 
 I’m trying to get coarse mode to work under Mesos (0.21.0); I thought this 
 would be a trivial change, as Mesos was working well in fine-grained mode.
 
 However, the Mesos tasks fail, and I can’t pinpoint where things go wrong. 
 
 This is a mesos stderr log from a slave:
 
Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading 
 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to 
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource 
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
  into 
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151’
 
 The Mesos slaves' stdout files are empty.
 
 
 And I can confirm the spark distro is correctly extracted:
$ ls
spark-1.2.0-bin-hadoop2.4  spark-1.2.0-bin-hadoop2.4.tgz  stderr  stdout
 
 The spark-submit log is here:
 http://pastebin.com/ms3uZ2BK
 
 Mesos-master
 http://pastebin.com/QH2Vn1jX
 
 Mesos-slave
 http://pastebin.com/DXFYemix
 
 
 Can somebody point me to logs, etc., to investigate this further? I’m feeling 
 kind of blind.
 Furthermore, do the executors on Mesos inherit all configs from the Spark 
 application/submit? E.g. I’ve given my executors 20 GB of memory through a 
 spark-submit --conf parameter. Should these settings also be present in the 
 spark-1.2.0-bin-hadoop2.4.tgz distribution’s configs?
 
 If I need to present more logs, etc., in order to be helped here, please let 
 me know.
 
 Regards,
 
 Hans van den Bogert





Re: Mesos coarse mode not working (fine grained does)

2015-02-08 Thread Tim Chen
Hi there,

It looks like launching the executor (or one of the processes, like the
fetcher that fetches the URIs) was failing because of the dependency problem
you see. Your mesos-slave shouldn't be able to run, though; were you running a
0.20.0 slave and upgraded to 0.21.0? We introduced the dependencies on libapr
and libsvn in Mesos 0.21.0.
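
One quick check (a sketch; the library path is an assumption, adjust it to
your actual install prefix or build tree):

  # does the slave actually pull in APR/Subversion at run time?
  ldd $(which mesos-slave) | grep -Ei 'libapr|libsvn'
  ldd /usr/local/lib/libmesos-0.21.0.so | grep -Ei 'libapr|libsvn'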

What's the stdout for the task like?

Tim




On Mon, Feb 9, 2015 at 4:10 AM, Hans van den Bogert hansbog...@gmail.com
wrote:

 I wasn’t thorough; the complete stderr includes:

 g++: /usr/lib64/libaprutil-1.so: No such file or directory
 g++: /usr/lib64/libapr-1.so: No such file or directoryn
 (including that trailing ’n')

 Though I can’t figure out how the process indirection goes from the frontend
 Spark application to the Mesos executors, or where this shared-library error
 comes from.

 Hope someone can shed some light,

 Thanks

 On 08 Feb 2015, at 14:15, Hans van den Bogert hansbog...@gmail.com
 wrote:

  Hi,
 
 
  I’m trying to get coarse mode to work under Mesos (0.21.0); I thought
 this would be a trivial change, as Mesos was working well in fine-grained
 mode.
 
  However, the Mesos tasks fail, and I can’t pinpoint where things go wrong.
 
  This is a mesos stderr log from a slave:
 
 Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
 I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading '
 http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
 I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
 into
 '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151’
 
  The Mesos slaves' stdout files are empty.
 
 
  And I can confirm the spark distro is correctly extracted:
 $ ls
 spark-1.2.0-bin-hadoop2.4  spark-1.2.0-bin-hadoop2.4.tgz  stderr
 stdout
 
  The spark-submit log is here:
  http://pastebin.com/ms3uZ2BK
 
  Mesos-master
  http://pastebin.com/QH2Vn1jX
 
  Mesos-slave
  http://pastebin.com/DXFYemix
 
 
  Can somebody point me to logs, etc., to investigate this further? I’m
 feeling kind of blind.
  Furthermore, do the executors on Mesos inherit all configs from the
 Spark application/submit? E.g. I’ve given my executors 20 GB of memory
 through a spark-submit --conf parameter. Should these settings also be
 present in the spark-1.2.0-bin-hadoop2.4.tgz distribution’s configs?
 
  If I need to present more logs, etc., in order to be helped here, please
 let me know.
 
  Regards,
 
  Hans van den Bogert


 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org