As said, I only use persistent volumes with my own scheduler directly on Mesos, so I do not know exactly how this works in Marathon...

The persistent volume is created on a Mesos agent and basically ends up being a folder on that host's disk. So yes, you cannot use the volume from a different agent/slave. For Marathon you would need to set a hostname constraint that makes sure the same host is used when restarting the task. You won't be able to fail over to different agents; you can only have Marathon restart your task on the same host once it fails. Also, only one task at a time can have the volume bound.

Yes, you can achieve persistence in pretty much the same way by using a host path, but then you are using implicit knowledge about your environment and thus have a tighter coupling, which is not very clean in my opinion. The nice thing about persistent volumes is that they are managed by Mesos. I do not need to tell the Mesos admin that I need space at some location. I do not need to do anything special if I have multiple instances running, as they each get their own directory. And I can programmatically destroy the volume, and then the directory on the host gets deleted again (at least since Mesos 1.0). So in my opinion the usage of persistent volumes is much cleaner. But there are certainly use cases that do not really work with them, like being able to fail over to a different host. For that you would either need a shared network mount or storage like HDFS. Btw, the Mesos containerizer should also enforce disk quotas, so your task would not be able to fill the filesystem.
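
For reference, if I read the Marathon persistent volumes documentation (linked further down in this thread) correctly, an app definition with a local persistent volume would look roughly like the sketch below. Treat it as a sketch only, I have not run it through Marathon myself: the app id, volume size and paths are placeholders and the image is just the one from your example. The persistent volume itself gets a relative containerPath inside the sandbox, and for a Docker container a second, plain volume maps that directory to the absolute path the application expects. With "residency" set, Marathon should pin the task to the agent that holds the volume, so the hostname constraint should not be needed, and the "upgradeStrategy" prevents a second instance from being started elsewhere during a deployment.

{
  "id": "/storm-supervisor",
  "cpus": 1,
  "mem": 1024,
  "instances": 1,
  "residency": {
    "taskLostBehavior": "WAIT_FOREVER"
  },
  "upgradeStrategy": {
    "minimumHealthCapacity": 0.5,
    "maximumOverCapacity": 0
  },
  "container": {
    "type": "DOCKER",
    "volumes": [
      {
        "containerPath": "storm-local",
        "mode": "RW",
        "persistent": { "size": 512 }
      },
      {
        "containerPath": "/opt/storm/storm-local",
        "hostPath": "storm-local",
        "mode": "RW"
      }
    ],
    "docker": {
      "image": "xxx/storm-1.1.0",
      "network": "HOST",
      "forcePullImage": true
    }
  }
}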

On 27.11.2017 16:11, Dino Lokmic wrote:
Yes, I did. So I don't have to prepare it before the task? I can't use a volume created on slave A from slave B?

Once the task fails, where will it be restarted? Do I have to specify the host?

If I do, it means I can achieve "persistence" the same way I deploy now, by specifying a hostPath for the volume and the hostname

....
  "constraints": [
    [
      "hostname",
      "CLUSTER",
      "MYHOSTNAME"
    ]
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [
      {
        "containerPath": "/opt/storm/storm-local",
        "hostPath": "/opt/docker_data/storm/storm-local",
        "mode": "RW"
      },
      {
        "containerPath": "/opt/storm/logs",
        "hostPath": "/opt/docker_logs/storm/logs",
        "mode": "RW"
      },
      {
        "containerPath": "/home/xx/runtime/storm",
        "hostPath": "/home/xx/runtime/storm",
        "mode": "RO"
      }
    ],
    "docker": {
      "image": "xxx/storm-1.1.0",
      "network": "HOST",
      "portMappings": [],
      "privileged": false,
      "parameters": [],
      "forcePullImage": true
    }
  },

....



On Mon, Nov 27, 2017 at 3:05 PM, Hendrik Haddorp <[email protected]> wrote:

    I have my own scheduler that is performing a create operation. As
    you are using Marathon this call would have to be done by Marathon.
    Did you read
    https://mesosphere.github.io/marathon/docs/persistent-volumes.html ?

    On 27.11.2017 14:59, Dino Lokmic wrote:

        @hendrik

        How did you create this
        "my-volume-227927c2-3266-412b-8572-92c5c93c051a" volume?

        On Mon, Nov 27, 2017 at 7:59 AM, Hendrik Haddorp
        <[email protected]> wrote:

            Hi,

            I'm using persistent volumes directly on Mesos, without Marathon.
            For that the scheduler (like Marathon) has to first reserve disk
            space and then create a persistent volume with that. The next
            resource offer message then contains the volume in the "disk"
            resource part of the offer. Now you can start your task. In the
            request you would need to include the resources, and for the
            "container" part of the request you would have:
                volumes {
                    container_path: "/mount/point/in/container"
                    host_path:
        "my-volume-227927c2-3266-412b-8572-92c5c93c051a"
                    mode: RW
                }

            The container path is the mount point in your container and the
            host path is the id of your persistent volume.
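
            To make that a bit more concrete, with the v1 scheduler HTTP API
            the ACCEPT call that reserves the disk and creates the volume
            looks roughly like the sketch below. The framework id, offer id,
            role, principal, size and the relative container_path are
            placeholders, so treat it only as an illustration of the two
            operations:

                {
                  "framework_id": { "value": "<framework id>" },
                  "type": "ACCEPT",
                  "accept": {
                    "offer_ids": [ { "value": "<offer id>" } ],
                    "operations": [
                      {
                        "type": "RESERVE",
                        "reserve": {
                          "resources": [
                            {
                              "name": "disk",
                              "type": "SCALAR",
                              "scalar": { "value": 512 },
                              "role": "my-role",
                              "reservation": { "principal": "my-principal" }
                            }
                          ]
                        }
                      },
                      {
                        "type": "CREATE",
                        "create": {
                          "volumes": [
                            {
                              "name": "disk",
                              "type": "SCALAR",
                              "scalar": { "value": 512 },
                              "role": "my-role",
                              "reservation": { "principal": "my-principal" },
                              "disk": {
                                "persistence": {
                                  "id": "my-volume-227927c2-3266-412b-8572-92c5c93c051a"
                                },
                                "volume": {
                                  "container_path": "data",
                                  "mode": "RW"
                                }
                              }
                            }
                          ]
                        }
                      }
                    ]
                  }
                }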

            In case you use Marathon, the documentation should be this:
        https://mesosphere.github.io/marathon/docs/persistent-volumes.html

            regards,
            Hendrik


            On 23.11.2017 10:00, Dino Lokmic wrote:

                I have a few machines on Linode and I run Mesos there. Can
                someone explain to me how to set volumes up correctly?

                Now I run tasks via Marathon like this:

                ...

                "constraints": [
                    [
                      "hostname",
                      "CLUSTER",
                      "HOSTNAME"
                    ]
                  ],
                  "container": {
                    "type": "DOCKER",
                    "volumes": [
                      {
                        "containerPath": "/opt/storm/storm-local",
                        "hostPath": "/opt/docker_data/storm/storm-local",
                        "mode": "RW"
                      }
                    ],
                    "docker": {
                      "image": "xxxx",
                      "network": "HOST",
                      "portMappings": [],
                      "privileged": false,
                      "parameters": [],
                      "forcePullImage": true
                    }
                  },
                ...

                So if the task is restarted I can be sure it has access to
                previously used data.
                You can see I have a scaling problem and my task depends on
                this node.

                I would like my apps to be node independent and also to
                have redundant data.

                What is best practice for this?

                I want to scale the application to 2 instances, I1 and I2

                Instance I1 runs on agent A1 and uses volume V1
                Instance I2 runs on agent A2 and uses volume V2

                If agent A1 stops, I1 is restarted on A3 and uses V1
                If V1 fails, I1 uses a copy of the data from V3...


                Can someone point me to an article describing this, or at
                least give me a few "keywords"?


                Thanks






