> On June 21, 2016, 2:29 a.m., Qian Zhang wrote:
> >

Thanks for the comments. I've responded to them, but the framework was rejected by the reviewers and won't be landing in the code base.


> On June 21, 2016, 2:29 a.m., Qian Zhang wrote:
> > src/examples/gpu_framework.cpp, line 203
> > <https://reviews.apache.org/r/48915/diff/2/?file=1423732#file1423732line203>
> >
> >     Why do we want to provide this flag to user? Since this is a GPU
> >     framework, I think we should always set
> >     `FrameworkInfo::Capability::GPU_RESOURCES` for this framework. Otherwise,
> >     what is the expected behavior when user set this flag to `false`? The
> >     framework will wait for GPU resources forever?

The purpose was for testing: to verify that the framework actually didn't get any GPU resources.


> On June 21, 2016, 2:29 a.m., Qian Zhang wrote:
> > src/examples/gpu_framework.cpp, line 113
> > <https://reviews.apache.org/r/48915/diff/2/?file=1423732#file1423732line113>
> >
> >     So here if we find the first offer can not satisfy task's resources, we
> >     will abort the framework, right? But since we are in a `for` loop here, I
> >     think we should try all the offers first, and if no offer can satisfy
> >     task's resources, we should let the framework wait rather than aborting.

Agreed. I put this in before adding the timeout.


> On June 21, 2016, 2:29 a.m., Qian Zhang wrote:
> > src/examples/gpu_framework.cpp, line 98
> > <https://reviews.apache.org/r/48915/diff/2/?file=1423732#file1423732line98>
> >
> >     I think we should log this message right before
> >     `driver->launchTasks(offer.id(), {task});`

Sure, that makes sense.


- Kevin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48915/#review138743
-----------------------------------------------------------


On June 19, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48915/
> -----------------------------------------------------------
> 
> (Updated June 19, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-5649
>     https://issues.apache.org/jira/browse/MESOS-5649
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This framework is designed to show how to build a GPU capable
> framework that can accept offers with GPUs and launch tasks that use
> them. The key thing to remember is that the GPU_RESOURCES capability
> must be set in `FrameworkInfo` in order for a framework to receive
> resource offers from agents that contain GPUs.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am a4931560f1a5b3fbe41ea181477341d3ac459b58
>   src/examples/gpu_framework.cpp PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/48915/diff/
> 
> 
> Testing
> -------
> 
> Run a master and an agent capable of handing out GPUs:
> ```
> $ sudo bin/mesos-master.sh --ip=127.0.0.1 --log_dir=/var/log/mesos
> --work_dir=/var/lib/mesos
> $ sudo bin/mesos-agent.sh --master=127.0.0.1:5050 --ip=127.0.0.1
> --log_dir=/var/log/mesos --work_dir=/var/lib/mesos
> --isolation="cgroups/devices,gpu/nvidia"
> ```
> 
> Run a couple of instances of the framework and verify the correct output:
> ```
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=0
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=1
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=4
> $ ./src/gpu-framework --master=127.0.0.1:5050 --no-allow_gpus --num_gpus=1
> ```
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>
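
For reference, here is a minimal sketch of the offer-handling change agreed to in the line 113 comment above: try every offer, and wait rather than abort when none fits. The `Offer` and `TaskResources` structs and the `pickOffer` helper are hypothetical stand-ins for illustration only; they are not the Mesos API and not the code under review.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a resource offer (the real type is mesos::Offer).
struct Offer {
  std::string id;
  double cpus;
  double mem;
  double gpus;
};

// Hypothetical description of what a single task needs.
struct TaskResources {
  double cpus;
  double mem;
  double gpus;
};

// Returns true if the offer covers every resource the task needs.
bool satisfies(const Offer& offer, const TaskResources& task) {
  return offer.cpus >= task.cpus &&
         offer.mem >= task.mem &&
         offer.gpus >= task.gpus;
}

// Scans all offers and returns a pointer to the first one that fits.
// Returns nullptr when none fits; the caller is then expected to keep
// waiting for further offers instead of aborting on the first mismatch.
const Offer* pickOffer(
    const std::vector<Offer>& offers,
    const TaskResources& task) {
  for (const Offer& offer : offers) {
    if (satisfies(offer, task)) {
      return &offer;
    }
  }
  return nullptr;
}
```

In a real scheduler callback, a null result would presumably mean leaving the task pending until the next round of offers (bounded by the timeout mentioned above); that wiring is assumed here and not shown.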
