[ 
https://issues.apache.org/jira/browse/MESOS-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cliff updated MESOS-5653:
-------------------------
    Description: 
When attempting to create a persistent volume via the /create-volumes operator 
endpoint, I get an HTTP 200 from the master, and in the logs on the master I 
see:

{noformat}
http.cpp:312] HTTP POST for /master/create-volumes from "172.16.10.11:40686 
with User-Agent='curl/7.29.0' "
{noformat}

The next line I see on the master is:
{noformat}
"master.cpp:6560] Sending checkpointed resources  to slave 
0ef7d2e1-8b0d-44d4-8db0-cc58ac2058af-S0 at slave(1)@172.16.10.4:5051"
{noformat}

Now if I look in the logs on the slave that was specified in the request to 
create a persistent volume, I see:
{noformat}
 "1572 slave.cpp:2327] Updated checkpointed resources from  to   "
{noformat}

Notice that the "from" and "to" values are both missing. Specifically, they 
should be the values of checkpointedResources and newCheckpointedResources, 
from here:
https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L2582
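
To spot this symptom in a longer log, a minimal sketch along the following 
lines may help. The function name is hypothetical, and the pattern is inferred 
only from the log lines quoted in this report, not from the Mesos source:

```shell
# Hypothetical helper: succeeds when a log line reports a checkpoint update
# whose "from" and "to" resource lists are both empty.
# The pattern is inferred from the log lines quoted in this report.
is_empty_checkpoint_update() {
  printf '%s\n' "$1" |
    grep -Eq 'Updated checkpointed resources from[[:space:]]+to[[:space:]]*"?[[:space:]]*$'
}
```

The same pattern can be passed to grep -E directly against the slave's INFO 
log to list every such line.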

I am currently running only one slave for troubleshooting purposes. The 
resource file on the slave with the disk resource looks like the following:

#resources=file:///etc/default/mesos.resources.json

{noformat}
[
   {
    "name": "disk",
    "type": "SCALAR",
    "scalar": {
      "value": 50000
    }
  },
   {
      "name":"disk",
      "type":"SCALAR",
      "scalar":{
         "value":1000000
      },
      "role":"testing",
      "disk":{
         "source":{
            "type":"MOUNT",
            "mount":{
               "root":"/data"
            }
         }
      }
   },
   {
      "name":"cpus",
      "type":"SCALAR",
      "scalar":{
         "value":16
      },
      "role":"testing"
   },
   {
      "name":"mem",
      "type":"SCALAR",
      "scalar":{
         "value":128000
      },
      "role":"testing"
   },
   {
      "name":"ports",
      "type":"RANGES",
      "ranges":{
         "range":[
            {
               "begin":31000,
               "end":32000
            }
         ]
      },
      "role":"testing"
   }
]
{noformat}

When I {{curl master:5050/slaves | jq '.'}} and look under the key 
{{reserved_resources_full}}, I see the above resources on that slave. 

Here is my request to the operator endpoint {{/create-volumes}}. I am trying 
to create a persistent volume on the disk of type MOUNT above, which is in 
{{/proc/mounts}} as {{/data}}:

{noformat}
curl -i  -d slaveId=0ee7d2e7-8b0d-44d4-8d80-cc58ac2058ae-S4 \
          -d volumes='[
          {
            "name": "testvol",
            "type": "SCALAR",
            "scalar": { "value": 10000 },
            "role": "testing",
            "disk": {
             "source": {
               "type" : "MOUNT",
                "path" : { "root" : "/data" }
             },
              "persistence": {
               "id" : "cliff"
             },
              "volume": {
               "mode": "RW",
               "container_path": "/data"
              }
            }
          }
        ]' -X POST http://master:5050/master/create-volumes
{noformat}
        
{noformat}
HTTP/1.1 200 OK
Date: Sun, 19 Jun 2016 04:38:45 GMT
{noformat}
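
One difference is visible between the two JSON documents above: the static 
resource file nests the root under "mount", while the request nests it under 
"path" even though its source type is MOUNT. Whether this is related to the 
silent no-op is not established here. The sketch below only flags such a 
mismatch before a request is sent. The function name is hypothetical, and the 
rule "the key is the lowercase form of the type" is an assumption drawn from 
the resource file:

```shell
# Hypothetical check: reads a resource JSON document on stdin and succeeds
# when the disk source "type" (MOUNT or PATH) has a matching lowercase key
# ("mount" or "path") that opens an object somewhere in the document.
source_key_matches_type() {
  _json=$(cat)
  _type=$(printf '%s\n' "$_json" |
    grep -Eo '"type"[[:space:]]*:[[:space:]]*"(MOUNT|PATH)"' |
    grep -Eo 'MOUNT|PATH' | head -n 1)
  # Nothing to check if the document has no disk source type.
  [ -n "$_type" ] || return 0
  _key=$(printf '%s' "$_type" | tr '[:upper:]' '[:lower:]')
  printf '%s\n' "$_json" | grep -Eq "\"$_key\"[[:space:]]*:[[:space:]]*\{"
}
```

Piping the volumes JSON from the request above into this function returns a 
non-zero status, while the disk entry from the resource file passes.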

If I look at the slave specified by the slaveId above via:

{noformat}
curl http://slave1:5051/state
{noformat}

I will not see the volume created. Also, there are no errors in the INFO logs 
on either the master or slave relating to this request. The only log entries 
are those that I have provided. 

The same problem/behavior seems to exist when trying to create persistent 
volumes on dynamically reserved resources as well.

My steps were:
systemctl stop mesos-slave
cd /var/mesos
rm -rf meta
systemctl start mesos-slave
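
The reset steps above can be sketched as a small function. This is only an 
illustration under assumptions: the function name and the dry-run default are 
not part of the original steps, and the service name and /var/mesos path are 
taken from this report.

```shell
# Sketch of the reset steps. Prints the commands by default; pass "apply"
# as the second argument to actually run them (requires root).
reset_agent_state() {
  work_dir=${1:-/var/mesos}
  if [ "${2:-dry-run}" = "apply" ]; then run=""; else run="echo"; fi
  # Stop the agent, remove its checkpointed metadata, then start it again.
  $run systemctl stop mesos-slave &&
    $run rm -rf "$work_dir/meta" &&
    $run systemctl start mesos-slave
}
```

Calling reset_agent_state with no arguments prints the three commands for 
/var/mesos without running them.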

Then I issued the following to the /reserve operator endpoint:

{noformat}

curl -i \
      -d slaveId=0ee7d2b7-7b0d-44d4-8d80-cc51ac2058ae-S0 \
      -d resources='[

        {
            "name": "disk",
            "type": "SCALAR",
            "scalar": { "value": 10000 },
            "disk": {
             "source": {
               "type" : "MOUNT",
                "path" : { "root" : "/data" }
             },
              "persistence": {
               "id" : "testing"
             },
              "volume": {
               "mode": "RW",
               "container_path": "/data"
              }
            }
          }
          ]' \
          -X POST http://master:5050/master/reserve
{noformat}

The volume never gets created, no error is logged anywhere on the master or 
slave, and I only see the following on the slave, the same as when attempting 
to create a persistent volume on statically defined resources:

{noformat}
5558 slave.cpp:2327] Updated checkpointed resources from  to
{noformat}

I also tried enabling auth to rule it out as a factor. Steps taken:

{noformat}
/etc/default/mesos-master:
 
export authenticate_http=true
export credentials="/etc/default/credentials.json"

/etc/default/credentials.json
{
   "credentials" : [
     {
       "principal": "test",
       "secret": "test"
     }
   ]
 }   
{noformat}

Then I restarted the masters with "systemctl restart mesos-master":

{noformat}
# curl -i \
>      -u test:test \
>      -d slaveId=af6e2f17-3d53-4656-a6ce-49658b6b4db3-S0 \
>      -d resources='[
>         {
>           "name": "disk",
>           "type": "SCALAR",
>           "scalar": { "value": 1024 },
>           "reservation": {
>             "principal": "test"
>            }
>         }
>       ]' \
>            -X POST http://master:5050/master/reserve
HTTP/1.1 200 OK

{noformat}

The result is the same. If I look at the output of:
{noformat}
http://master:5050/slaves 
{noformat}

I won't see anything reserved:

{noformat}
"reserved_resources_full": {},
{noformat}

and again in the logs on the one slave that is currently active I will see:

{noformat}
slave.cpp:2327] Updated checkpointed resources from  to
{noformat}

and no further information either on the slave agent or the master.
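
Since the HTTP 200 alone does not show whether anything was reserved, a check 
like the following could be run after each request. It is a rough sketch: the 
function name is hypothetical, it matches text rather than parsing JSON, and 
it assumes a single registered slave, as in this setup.

```shell
# Hypothetical helper: reads the JSON body of the master's /slaves endpoint
# on stdin and succeeds only if "reserved_resources_full" is present and is
# not an empty object. Assumes a single registered slave.
has_reserved_resources() {
  _body=$(cat)
  # The key must be present at all.
  printf '%s\n' "$_body" | grep -q '"reserved_resources_full"' || return 1
  # It must not map to an empty object.
  ! printf '%s\n' "$_body" |
    grep -Eq '"reserved_resources_full"[[:space:]]*:[[:space:]]*\{[[:space:]]*\}'
}
```

For example: curl -s http://master:5050/slaves | has_reserved_resources || 
echo "nothing reserved".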

Whether or not I specify a role doesn't have any effect:

{noformat}

curl -i \
     -u test:test \
       -d slaveId=af6e2f17-3d53-4656-a6ce-49658b6b4db3-S0 \
       -d resources='[

         {
             "name": "disk",
             "type": "SCALAR",
             "scalar": { "value": 10000 },
             "role": "test",
             "disk": {
              "source": {
                "type" : "MOUNT",
                 "path" : { "root" : "/data" }
              },
               "persistence": {
                "id" : "testing"
              },
               "volume": {
                "mode": "RW",
                "container_path": "/data"
               }
             }
           }
           ]' \
           -X POST http://master:5050/master/reserve

HTTP/1.1 200 OK
Date: Mon, 20 Jun 2016 21:32:17 GMT
Content-Length: 0


curl http://master:5050/slaves | jq '.' | grep full
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   590  100   590    0     0   158k      0 --:--:-- --:--:-- --:--:--  192k

      "reserved_resources_full": {},
      "used_resources_full": [],
      "offered_resources_full": []

{noformat}


> Creating a persistent volume through the operator endpoints fail and doesn't 
> produce meaningful logs.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-5653
>                 URL: https://issues.apache.org/jira/browse/MESOS-5653
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, volumes
>    Affects Versions: 0.28.2
>         Environment: Centos 7 - 3.10.0-327.13.1.el7.x86_64, Mesos 0.28.2
>            Reporter: cliff
>            Assignee: Greg Mann
>              Labels: persistent-volumes
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
