For scheduling workflows of batch jobs, I would recommend looking into the
Chronos framework for Mesos:
https://github.com/airbnb/chronos
http://mesosphere.io/learn/run-chronos-on-mesos/
http://nerds.airbnb.com/introducing-chronos/
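
As a rough sketch (not tested against a live Chronos instance), the stages for one pair of files could be expressed as a chain of Chronos jobs, where the first stage is a scheduled job and each later stage lists the previous one as its parent. The endpoint paths and field names below are the ones I recall from the Chronos README, so please check them against the version you install. The host, port, owner address, binary names, flags and GCS paths are all made up for illustration. The script only builds and prints the JSON payloads; it does not send anything.

```python
import json

# Hypothetical values for illustration; none of these come from the thread.
CHRONOS_URL = "http://chronos.example.com:4400"  # assumed host and port
OWNER = "pipeline@example.com"                   # assumed owner address


def scheduled_job(name, command, schedule):
    """Payload for a job started by an ISO 8601 repeating-interval schedule."""
    return {
        "name": name,
        "command": command,
        "schedule": schedule,
        "epsilon": "PT15M",
        "owner": OWNER,
        "async": False,
    }


def dependent_job(name, command, parents):
    """Payload for a job that is triggered by its parent jobs, not by a schedule."""
    return {
        "name": name,
        "command": command,
        "parents": list(parents),
        "epsilon": "PT15M",
        "owner": OWNER,
        "async": False,
    }


def pipeline_jobs(pair_id, stage_commands, schedule):
    """Chain the stage commands for one file pair: stage k depends on stage k-1.

    Returns a list of (endpoint, payload) tuples in submission order.
    """
    jobs = []
    previous = None
    for index, command in enumerate(stage_commands):
        name = "%s-stage%d" % (pair_id, index)
        if previous is None:
            jobs.append(("/scheduler/iso8601",
                         scheduled_job(name, command, schedule)))
        else:
            jobs.append(("/scheduler/dependency",
                         dependent_job(name, command, [previous])))
        previous = name
    return jobs


if __name__ == "__main__":
    # Made-up binaries, flags and GCS paths standing in for the real stages.
    commands = [
        "./stage0 --in1=gs://bucket/a --in2=gs://bucket/b --out=gs://bucket/ab.s0",
        "./stage1 --in1=gs://bucket/ab.s0 --out=gs://bucket/ab.s1",
    ]
    schedule = "R1/2014-07-25T00:00:00Z/PT24H"  # run once, at the given time
    for endpoint, payload in pipeline_jobs("a_b", commands, schedule):
        print("POST %s%s" % (CHRONOS_URL, endpoint))
        print(json.dumps(payload, indent=2, sort_keys=True))
```

Each printed payload would be sent as an HTTP POST with a JSON content type to the endpoint shown above it. I have not checked how well this approach holds up at the number of executions mentioned further down the thread.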


On Thu, Jul 24, 2014 at 4:58 AM, Itamar Ostricher <ita...@yowza3d.com>
wrote:

> Not written in MPI. Each task is a stand-alone execution of a binary
> program that takes 1-2 data file paths as parameters (GCS paths), with
> the output stored in another GCS file (path passed as a flag).
> Different tasks do not need to communicate with others. Tasks only talk
> with GCS to read and write their data.
> The biggest bottleneck (for most tasks) is CPU. A few of the tasks do little
> processing, so in those cases the bottleneck is GCS latency.
>
> Our current solution is to run N services on each machine (N = number of
> cores on the machine), with the main Python script sending commands to
> available services (using sockets).
> We are not happy with this solution because it requires us to deal with
> too many low-level details, like tracking the status of the services,
> restarting lost tasks, collecting logs, etc.
>
>
> On Thu, Jul 24, 2014 at 11:37 AM, Tomas Barton <barton.to...@gmail.com>
> wrote:
>
>> Depends on the nature of your tasks. Is your code written in MPI? Do your
>> tasks need to communicate with each other? Will one task operate on all files,
>> some subset, or just one file? You might have:
>>      - one task per machine running on as many cores as possible
>>      - many smaller tasks starting in a dynamic manner depending on the
>> data
>>
>> What is the biggest bottleneck you have? disk read/write, network, CPU,
>> memory?
>>
>> Writing your own framework is possible, if you can take advantage of some
>> problem-specific property.
>>
>>
>> On 24 July 2014 07:34, Itamar Ostricher <ita...@yowza3d.com> wrote:
>>
>>> many: we have a processing pipeline with ~10 stages (one C++ program per
>>> stage usually), batch processing (almost-)all pairs of files in the
>>> dataset. The dataset contains >10K files at the moment, so a couple of
>>> hundred million program executions would be my definition of
>>> "many" in this case :-)
>>>
>>> I'll start with a few machines with deploy scripts and a small subset of
>>> the dataset just to get the hang of it.
>>> It's a bit difficult to comprehend the stack, with all the possible
>>> options and combinations, though.
>>> If I have a main Python script that generates all the processing
>>> pipeline commands (that can be simply executed via shell), should I use a
>>> specific framework (like Hydra)? Or maybe use raw mesos? Or maybe I should
>>> write my own framework?
>>>
>>>
>>> On Wed, Jul 23, 2014 at 2:25 PM, Tomas Barton <barton.to...@gmail.com>
>>> wrote:
>>>
>>>> Define many :) If you want to use some provisioning tools like Puppet,
>>>> Chef, Ansible... there are quite a few modules to do this job:
>>>>
>>>> http://mesosphere.io/learn/#tools
>>>>
>>>> If you have only a few machines, you might be fine with deploy scripts.
>>>>
>>>> An example of an MPI framework is here:
>>>>
>>>> https://github.com/mesosphere/mesos-hydra
>>>>
>>>>
>>>>
>>>>
>>>> On 23 July 2014 12:26, Itamar Ostricher <ita...@yowza3d.com> wrote:
>>>>
>>>>> Thanks Tomas.
>>>>>
>>>>> ldconfig didn't change anything. make still failed.
>>>>>
>>>>> But the Debian package installed like a charm, so I'm good :-)
>>>>> Now I just need to figure out how to use it...
>>>>> (going to start with [1], unless anyone chimes in with a better
>>>>> recommended starting point for a mesos-newbie who is trying to set up a
>>>>> cluster of GCE instances in order to distribute execution of *many* C++
>>>>> programs working on a large dataset that is currently stored in Google
>>>>> Cloud Storage.)
>>>>>
>>>>> [1] http://mesos.apache.org/documentation/latest/deploy-scripts/
>>>>>
>>>>>
>>>>> On Wed, Jul 23, 2014 at 11:55 AM, Tomas Barton <barton.to...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> that's quite strange. Try to run
>>>>>>
>>>>>> ldconfig
>>>>>>
>>>>>> and then run make again.
>>>>>>
>>>>>> You can find binary packages for Debian here:
>>>>>> http://mesosphere.io/downloads/
>>>>>>
>>>>>> Tomas
>>>>>>
>>>>>>
>>>>>> On 23 July 2014 10:09, Itamar Ostricher <ita...@yowza3d.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm trying to do a clean build of mesos from the 0.19.0 tarball.
>>>>>>> I was following the instructions from
>>>>>>> http://mesos.apache.org/gettingstarted/ step by step. Got to
>>>>>>> running `make`, which ran for quite a while, and exited with errors (see
>>>>>>> the end of the output below).
>>>>>>>
>>>>>>> Extra env info: I'm trying to do this build on a 64-bit Debian GCE
>>>>>>> instance:
>>>>>>> itamar@mesos-test-1:/tmp/mesos-0.19.0/build$ uname -a
>>>>>>> Linux mesos-test-1 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64
>>>>>>> GNU/Linux
>>>>>>>
>>>>>>> Assistance will be much appreciated!
>>>>>>> Alternatively, I don't mind using precompiled binaries, if anyone
>>>>>>> can point me in the direction of such binaries for the GCE environment I
>>>>>>> described :-)
>>>>>>>
>>>>>>> tail of make output:
>>>>>>> ----------------------------
>>>>>>>
>>>>>>> libtool: link: warning:
>>>>>>> `/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib/libgflags.la'
>>>>>>> seems to be moved
>>>>>>> *** Warning: Linking the shared library libmesos.la against the
>>>>>>> *** static library ../3rdparty/leveldb/libleveldb.a is not portable!
>>>>>>> libtool: link: warning:
>>>>>>> `/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib/libgflags.la'
>>>>>>> seems to be moved
>>>>>>> libtool: link: g++  -fPIC -DPIC -shared -nostdlib
>>>>>>> /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crti.o
>>>>>>> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbeginS.o  -Wl,--whole-archive
>>>>>>> ./.libs/libmesos_no_3rdparty.a ../3rdparty/libprocess/.libs/libprocess.a
>>>>>>> ./.libs/libjava.a -Wl,--no-whole-archive
>>>>>>>  ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/.libs/libprotobuf.a
>>>>>>> ../3rdparty/libprocess/3rdparty/glog-0.3.3/.libs/libglog.a
>>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib
>>>>>>> ../3rdparty/leveldb/libleveldb.a
>>>>>>> ../3rdparty/zookeeper-3.4.5/src/c/.libs/libzookeeper_mt.a
>>>>>>> /tmp/mesos-0.19.0/build/3rdparty/libprocess/3rdparty/glog-0.3.3/.libs/libglog.a
>>>>>>> /usr/lib/libgflags.so -lpthread
>>>>>>> /tmp/mesos-0.19.0/build/3rdparty/libprocess/3rdparty/libev-4.15/.libs/libev.a
>>>>>>> -lsasl2 /usr/lib/x86_64-linux-gnu/libcurl-nss.so -lz -lrt
>>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/4.7
>>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu
>>>>>>> -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu
>>>>>>> -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../.. -lstdc++ 
>>>>>>> -lm
>>>>>>> -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/4.7/crtendS.o
>>>>>>> /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crtn.o
>>>>>>>  -pthread -Wl,-soname -Wl,libmesos-0.19.0.so -o .libs/libmesos-0.19.0.so
>>>>>>> libtool: link: (cd ".libs" && rm -f "libmesos.so" && ln -s "libmesos-0.19.0.so" "libmesos.so")
>>>>>>> libtool: link: ( cd ".libs" && rm -f "libmesos.la" && ln -s "../libmesos.la" "libmesos.la" )
>>>>>>> g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\"
>>>>>>> -DPACKAGE_VERSION=\"0.19.0\" -DPACKAGE_STRING=\"mesos\ 0.19.0\"
>>>>>>> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\"
>>>>>>> -DVERSION=\"0.19.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
>>>>>>> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 
>>>>>>> -DHAVE_MEMORY_H=1
>>>>>>> -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
>>>>>>> -DHAVE_UNISTD_H=1
>>>>>>> -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD=1 
>>>>>>> -DMESOS_HAS_JAVA=1
>>>>>>> -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -DHAVE_LIBZ=1 
>>>>>>> -DHAVE_LIBCURL=1
>>>>>>> -DHAVE_LIBSASL2=1 -I. -I../../src   -Wall -Werror
>>>>>>> -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\"
>>>>>>> -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include
>>>>>>> -I../../3rdparty/libprocess/include
>>>>>>> -I../../3rdparty/libprocess/3rdparty/stout/include -I../include
>>>>>>> -I../3rdparty/libprocess/3rdparty/boost-1.53.0
>>>>>>> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src
>>>>>>> -I../3rdparty/libprocess/3rdparty/picojson-4f93734
>>>>>>> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src
>>>>>>> -I../3rdparty/leveldb/include 
>>>>>>> -I../3rdparty/zookeeper-3.4.5/src/c/include
>>>>>>> -I../3rdparty/zookeeper-3.4.5/src/c/generated   -pthread -g -g2 -O2 -MT
>>>>>>> local/mesos_local-main.o -MD -MP -MF local/.deps/mesos_local-main.Tpo 
>>>>>>> -c -o
>>>>>>> local/mesos_local-main.o `test -f 'local/main.cpp' || echo
>>>>>>> '../../src/'`local/main.cpp
>>>>>>> mv -f local/.deps/mesos_local-main.Tpo
>>>>>>> local/.deps/mesos_local-main.Po
>>>>>>> /bin/bash ../libtool  --tag=CXX   --mode=link g++ -pthread -g -g2
>>>>>>> -O2   -o mesos-local local/mesos_local-main.o libmesos.la -lsasl2
>>>>>>> -lcurl -lz  -lrt
>>>>>>> libtool: link: g++ -pthread -g -g2 -O2 -o .libs/mesos-local
>>>>>>> local/mesos_local-main.o  ./.libs/libmesos.so /usr/lib/libgflags.so
>>>>>>> -lpthread -lsasl2 /usr/lib/x86_64-linux-gnu/libcurl-nss.so -lz -lrt 
>>>>>>> -pthread
>>>>>>> ./.libs/libmesos.so: error: undefined reference to 'dlopen'
>>>>>>> ./.libs/libmesos.so: error: undefined reference to 'dlsym'
>>>>>>> ./.libs/libmesos.so: error: undefined reference to 'dlerror'
>>>>>>> collect2: error: ld returned 1 exit status
>>>>>>> make[2]: *** [mesos-local] Error 1
>>>>>>> make[2]: Leaving directory `/tmp/mesos-0.19.0/build/src'
>>>>>>> make[1]: *** [all] Error 2
>>>>>>> make[1]: Leaving directory `/tmp/mesos-0.19.0/build/src'
>>>>>>> make: *** [all-recursive] Error 1
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
