Hi Loic,
Based on your feedback, a few action items emerged for improving this
containerized approach to running teuthology jobs:
1. use install-deps.sh for installing dependencies
2. modify the sshd configuration so that the ssh port is specified at
runtime via an environment variable. This makes it possible to use
--net=host, so more than one remote can run locally (for jobs with
multiple remotes); rough sketches of items 2-6 follow this list.
3. add an option to provide a sha1 so that the code gets checked out
and built as part of the container's entrypoint.
4. write a 'dockerize-config' script that takes a failed job's YAML
file and modifies it so that it can run with containers.
5. write a 'failed-devenv' script that, given a url to a failed job,
(a) fetches the YAML file, (b) runs the dockerize-config script, (c)
checks out the corresponding sha1, and (d) compiles the code.
6. write a 'run-failed-job' script that (a) re-builds the code, (b)
instantiates one container for each specified remote, and (c)
executes the job.
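
For item 2, the shape of the entrypoint change is roughly the
following (SSHD_PORT is an illustrative name here, not something the
base image defines):

```bash
# default to the stock port when SSHD_PORT is not passed in
SSHD_PORT=${SSHD_PORT:-22}
# rewrite the Port directive before starting the daemon
sed -i "s/^#\?Port .*/Port ${SSHD_PORT}/" /etc/ssh/sshd_config
exec /usr/sbin/sshd -D
```

Each container can then run with --net=host and a distinct port,
e.g. `docker run -d --net=host -e SSHD_PORT=2222 ... ivotron/cephdev`.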
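
For item 3, another entrypoint fragment; CEPH_SHA1 is an illustrative
variable name and the build steps are the usual autotools flow:

```bash
# clone and build a specific sha1 when one is provided
# (for the case where no local tree is bind-mounted at /ceph)
if [ -n "${CEPH_SHA1:-}" ]; then
  git clone --recursive https://github.com/ceph/ceph.git /ceph
  cd /ceph
  git checkout "$CEPH_SHA1"
  git submodule update --init --recursive
  ./install-deps.sh                                # item 1
  ./autogen.sh && ./configure && make -j"$(nproc)"
fi
```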
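
For item 4, what I have in mind is deliberately naive to start:
handle the install-task substitution and fail loudly (as you suggest
below) on anything it can't dockerize. A sketch, assuming tasks
appear as unindented '- task:' entries in the YAML:

```bash
#!/usr/bin/env bash
# usage: dockerize-config < config.yaml > docker-config.yaml
set -euo pipefail
config=$(cat)
# fail loudly on tasks we don't know how to run in containers yet
if grep -q '^- kernel:' <<< "$config"; then
  echo "dockerize-config: cannot handle the kernel task" >&2
  exit 1
fi
# the image already has all dependencies, so keep only ship_utilities
sed 's/^- install:/- install.ship_utilities:/' <<< "$config"
```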
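
For item 5, a sketch that assumes it runs from your ceph checkout,
that config.yaml is served under the job url (as qa-proxy.ceph.com
does), and that the YAML carries a top-level sha1 key:

```bash
#!/usr/bin/env bash
# usage: failed-devenv <url-of-failed-job>  (run from your ceph clone)
set -euo pipefail
url=$1
curl -fsSL "$url/config.yaml" > config.yaml          # (a)
dockerize-config < config.yaml > docker-config.yaml  # (b)
sha1=$(awk '/^sha1:/ {print $2}' config.yaml)
git checkout "$sha1"                                 # (c)
./install-deps.sh                                    # (d) item 1
./autogen.sh && ./configure && make -j"$(nproc)"
```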
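
And for item 6, a sketch in which NUM_REMOTES, the 2222+i port scheme
(via SSHD_PORT from item 2), and a docker-config.yaml that already
lists the containers as targets are all assumptions:

```bash
#!/usr/bin/env bash
# usage: run-failed-job  (from the checkout failed-devenv set up)
set -euo pipefail
NUM_REMOTES=${NUM_REMOTES:-1}
# (a) re-build the tree inside a throwaway container
docker run --rm -v "$PWD":/ceph ivotron/cephdev make -C /ceph
# (b) one container per remote, each on its own ssh port
for i in $(seq 0 $((NUM_REMOTES - 1))); do
  docker run -d --name "remote$i" --net=host \
    -e SSHD_PORT=$((2222 + i)) \
    -e AUTHORIZED_KEYS="$(cat ~/.ssh/id_rsa.pub)" \
    -v "$PWD":/ceph -v /dev:/dev \
    -v "/tmp/ceph_data/remote$i":/var/lib/ceph \
    --cap-add=SYS_ADMIN --privileged --device /dev/fuse \
    ivotron/cephdev
done
# (c) run the job against the dockerized targets
teuthology --archive /tmp/archive docker-config.yaml
```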
I've implemented 1-3 and am working on 4-6. In short, the goal of all
the above is to capture the dev/build/test loop and make it easier to
go from 'failed job' to 'working on a fix'. The high-level sequence is
(1) run 'failed-devenv' so you get the dev environment for the
failed job, (2) work on a fix, and (3) invoke 'run-failed-job' and
inspect the results (possibly going back to 2 if needed).
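
Concretely, a session would look something like this (job url
borrowed from the example further down the thread):

```bash
failed-devenv http://qa-proxy.ceph.com/teuthology/loic-2015-09-02_15:41:18-rbd-master---basic-multi/1042448
# ... work on a fix in the checked-out tree ...
run-failed-job    # rebuild, bring the remotes up, re-run the job
```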
Thoughts on 4-6?
cheers,
ivo
On Thu, Sep 3, 2015 at 3:23 PM, Loic Dachary <[email protected]> wrote:
>
>
> On 03/09/2015 23:45, Ivo Jimenez wrote:
>> On Thu, Sep 3, 2015 at 3:09 AM Loic Dachary <[email protected]> wrote:
>>>
>>>> 2. Initialize a `cephdev` container (the following assumes `$PWD` is
>>>> the folder containing the ceph code in your machine):
>>>>
>>>> ```bash
>>>> docker run \
>>>> --name remote0 \
>>>> -p 2222:22 \
>>>> -d -e AUTHORIZED_KEYS="`cat ~/.ssh/id_rsa.pub`" \
>>>> -v `pwd`:/ceph \
>>>> -v /dev:/dev \
>>>> -v /tmp/ceph_data/$RANDOM:/var/lib/ceph \
>>>> --cap-add=SYS_ADMIN --privileged \
>>>> --device /dev/fuse \
>>>> ivotron/cephdev
>>>> ```
>>>
>>> $PWD is ceph built from sources ? Could you share the dockerfile you used
>>> to create ivotron/cephdev ?
>>
>>
>> Yes, the idea is to wrap your ceph folder in a container so that it
>> becomes a target for teuthology. The link to the dockerfile:
>>
>> https://github.com/ivotron/docker-cephdev
>
> You may want to use install-deps.sh instead of apt-get build-dep to get the
> packages from sources instead of presumably older versions from the distro
> repositories.
>>
>>>
>>>>
>>>> Caveats:
>>>>
>>>> * only a single job can be executed and has to be manually
>>>> assembled. I plan to work on supporting suites, which, in short,
>>>> implies stripping out the `install` task from existing suites and
>>>> leaving only the `install.ship_utilities` subtask instead (the
>>>> container image has all the dependencies in it already).
>>>
>>> Maybe there could be a script to transform config files such as
>>> http://qa-proxy.ceph.com/teuthology/loic-2015-09-02_15:41:18-rbd-master---basic-multi/1042448/config.yaml
>>> into a config file suitable for this use case ?
>>
>>
>> that's what I have in mind but haven't looked into it yet. I was
>> thinking about extending teuthology-suite so that you can pass a
>> --filter-tasks flag to remove the unwanted tasks, similar to the
>> way --filter leaves some suites out.
>>
>>>
>>> Together with git clone -b $sha1 + make in the container, it would be a
>>> nice way to replay / debug a failed job using a single vm and without going
>>> through packages.
>>
>>
>> that'd be relatively straightforward to accomplish, at least the
>> docker side of things (a dockerfile that is given the $SHA1). Prior to
>> that, we'd need to have a script that extracts the failed job from
>> paddles (does this exist already?), creates a new sha1-predicated
>
> What do you mean by "extract the failed job" ? Do you expect paddles to have
> more information than the config.yaml file (
> loic-2015-09-02_15:41:18-rbd-master---basic-multi/1042448/config.yaml for
> instance) ?
>
>> container and passes the yaml file of the failed job to teuthology
>> (which would be invoked with the hypothetical --filter-tasks flag
>> mentioned above).
>
> It's probably more than just filtering out tasks. What about a script that
> would
>
> dockerize-config < config.yaml > docker-config.yaml
>
> and be smart enough to do whatever is necessary to transform an existing
> config.yaml so that it is suitable to run on docker targets. And fail loudly
> if it can't ;-)
>
>>
>>>
>>>> * I have only tried the above with the `radosbench` and `ceph-fuse`
>>>> tasks. Using `--cap-add=ALL` and `-v /lib/modules:/lib/modules`
>>>> flags allows a container to load kernel modules so, in principle,
>>>> it should work for `rbd` and `kclient` tasks but I haven't tried
>>>> it yet.
>>>> * For jobs specifying multiple remotes, multiple containers can be
>>>> launched (one per remote). While it is possible to run these
>>>> on the same docker host, the way ceph daemons dynamically
>>>> bind to ports in the 6800-7300 range makes it difficult to
>>>> determine which ports to expose from each container (exposing the
>>>> same port from multiple containers in the same host is not
>>>> allowed, for obvious reasons). So either each remote runs on a
>>>> distinct docker host machine, or a deterministic port assignment
>>>> is implemented such that, for example, 6800 is always assigned to
>>>> osd.0, regardless of where it runs.
>>>
>>> Would docker run --publish-all=true help ?
>>
>>
>> That option doesn't work with --net=container, which is what we are
>> using in this case since we remap the container's sshd port 22. In
>> other words, for --publish-all to work we need to use --net=host, but
>> that disables the virtual network that docker provides. An alternative
>> would be to configure the base image we're using
>> (https://github.com/tutumcloud/tutum-ubuntu/) so that the port that
>> sshd uses is passed in an env var.
>
> Why not use --net=host then ?
>
>>
>>>
>>>
>>> Clever hack, congrats :-)
>>
>>
>> thanks!
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>