Can you please confirm the procedure I have written?

Thanks to all!

2016-04-09 0:28 GMT+02:00 Stefano Bianchi <[email protected]>:

> Yes, I imagined that; indeed I will not allocate that whole amount, but at
> least something higher than 920 MB.
> OK, so summarizing:
> 1) I stop the Mesos slave with: service mesos-slave stop
> 2) Then, as June suggests, I run: sudo sh -c "echo
> MESOS_WORK_DIR=/scratch.local/mesos >> /etc/default/mesos-slave"
> 3) Then, as Arjun suggests:
>
> rm -f /tmp/mesos/meta/slaves/latest
>
> mesos-slave --master=MASTER_ADDRESS:5050 --hostname=slave_public_IP_i_set
> --resources='cpus(*):1;mem(*):1000;disk(*):8000'
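>
> Put together, the whole sequence would be something like this (a sketch:
> the service name and /etc/default path assume the stock packaging, and
> MASTER_ADDRESS / slave_public_IP_i_set are placeholders):
>
> sudo service mesos-slave stop
> # persist the new work directory (step 2)
> sudo sh -c "echo MESOS_WORK_DIR=/scratch.local/mesos >> /etc/default/mesos-slave"
> # drop the checkpointed slave info so the new resources are accepted;
> # /tmp/mesos is the old default work_dir (step 3)
> sudo rm -f /tmp/mesos/meta/slaves/latest
> mesos-slave --master=MASTER_ADDRESS:5050 \
>     --hostname=slave_public_IP_i_set \
>     --resources='cpus(*):1;mem(*):1000;disk(*):8000'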
>
> Is this the correct procedure?
>
> 2016-04-08 23:57 GMT+02:00 Stefano Bianchi <[email protected]>:
>
>> I tried the command `free -m` and obtained this output:
>>
>>               total        used        free      shared  buff/cache   available
>> Mem:           1840         120        1407          40         312        1507
>> Swap:             0           0           0
>>
>> So there aren't 2048 MB of RAM? I'm sure OpenStack tells me that this
>> is a machine with 2048 MB of RAM...
>>
>> 2016-04-08 23:44 GMT+02:00 Arkal Arjun Rao <[email protected]>:
>>
>>> You set it up with 2048 MB, but you probably don't really get all of it
>>> (try `free -m` on the slave). Same with disk (look at the output of `df`).
>>> From the book "Building Applications on Mesos":
>>> "The slave will reserve 1 GB or 50% of detected memory, whichever is
>>> smaller, in order to run itself and other operating system services.
>>> Likewise, it will reserve 5 GB or 50% of detected disk, whichever is
>>> smaller."
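>>>
>>> Working backwards from the 920 MB of memory and 5112 MB of disk you
>>> currently see (a rough check; the detected totals below are inferred,
>>> not measured):
>>>
>>> mem:  detected ~1840 MB -> reserved = min(1024, 50% of 1840) = 920 MB
>>>       -> offered = 1840 - 920 = 920 MB
>>> disk: detected ~10224 MB -> reserved = min(5120, 50% of 10224) = 5112 MB
>>>       -> offered = 10224 - 5112 = 5112 MB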
>>>
>>> If you want to explicitly reserve a value, first ensure you have the
>>> resources you want per slave, then run this:
>>> <kill the mesos-slave process>
>>> rm -f /tmp/mesos/meta/slaves/latest
>>> mesos-slave --master=MASTER_ADDRESS:5050
>>> --hostname=slave_public_IP_i_set
>>> --resources='cpus(*):1;mem(*):2000;disk(*):9000'
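>>>
>>> Once the slave re-registers, you can check what it now advertises by
>>> querying its state endpoint, e.g. (the address is a placeholder):
>>>
>>> curl -s http://SLAVE_ADDRESS:5051/state.json | grep -o '"resources":{[^}]*}'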
>>>
>>> Arjun
>>>
>>> On Fri, Apr 8, 2016 at 2:23 PM, Stefano Bianchi <[email protected]>
>>> wrote:
>>>
>>>> What has to be clear is that I'm running virtual machines on OpenStack,
>>>> so I am not on bare metal.
>>>> All the VMs are OpenStack images, and each slave was built with
>>>> 2048 MB of RAM, so with 3 slaves I should see in Mesos something close
>>>> to 6144 MB, but Mesos shows only 2.7 GB.
>>>> If you look at the command output I posted in previous messages, the
>>>> current Mesos resource configuration allows 920 MB of RAM and 5112 MB of
>>>> disk space for each slave. I would like Mesos to see, for instance,
>>>> 2000 MB of RAM and 9000 MB of disk, and for this reason I ran: mesos-slave
>>>> --master=MASTER_ADDRESS:5050 --resources='cpu:1;mem:2000;disk:9000'
>>>>
>>>> June Taylor, I need to understand:
>>>> 1) What does the command you suggest do?
>>>> 2) Should I stop mesos-slave first, and then run your command?
>>>>
>>>> Thanks in advance.
>>>>
>>>> 2016-04-08 21:28 GMT+02:00 June Taylor <[email protected]>:
>>>>
>>>>> How much actual RAM do your slaves contain? You can only make
>>>>> available up to that amount, minus the bit that the slave reserves.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> June Taylor
>>>>> System Administrator, Minnesota Population Center
>>>>> University of Minnesota
>>>>>
>>>>> On Fri, Apr 8, 2016 at 1:29 PM, Stefano Bianchi <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi, I would like to join this mailing list.
>>>>>> I'm currently doing my Master's thesis on Mesos and Calico.
>>>>>> I'm working at INFN, the Italian National Institute for Nuclear
>>>>>> Physics. The goal of the thesis is to build a PaaS where Mesos is the
>>>>>> scheduler and Calico allows the interconnection between multiple
>>>>>> datacenters linked to CERN.
>>>>>>
>>>>>> I'm using an IaaS based on OpenStack, where I have created 6 virtual
>>>>>> machines, 3 masters and 3 slaves; one slave runs Mesos-DNS, launched
>>>>>> from Marathon.
>>>>>> Everything works perfectly: since I am on another network, I changed
>>>>>> the hostnames so that they are resolvable by Mesos, and I was able to
>>>>>> run from Marathon a simple HTTP server that scales across all my
>>>>>> machines.
>>>>>> So everything is fine and working.
>>>>>>
>>>>>> The only thing I don't like is that each of the 3 slaves has 1 CPU,
>>>>>> 10 GB of disk, and 2 GB of RAM, but Mesos currently shows, for each
>>>>>> one, only 5 GB of disk and 900 MB of RAM.
>>>>>> So, checking the documentation, I found the command to manage the
>>>>>> resources.
>>>>>> I stopped slave1, for instance, and ran this command:
>>>>>>
>>>>>> mesos-slave --master=MASTER_ADDRESS:5050
>>>>>> --resources='cpu:1;mem:2000;disk:9000'
>>>>>>
>>>>>> where I want to set 2000 MB of RAM and 9000 MB of disk.
>>>>>> The output is the following:
>>>>>>
>>>>>> I0408 15:11:00.915324  7892 main.cpp:215] Build: 2016-03-10 20:32:58 by root
>>>>>> I0408 15:11:00.915436  7892 main.cpp:217] Version: 0.27.2
>>>>>> I0408 15:11:00.915448  7892 main.cpp:220] Git tag: 0.27.2
>>>>>> I0408 15:11:00.915459  7892 main.cpp:224] Git SHA: 3c9ec4a0f34420b7803848af597de00fedefe0e2
>>>>>> I0408 15:11:00.923334  7892 systemd.cpp:236] systemd version `219` detected
>>>>>> I0408 15:11:00.923384  7892 main.cpp:232] Inializing systemd state
>>>>>> I0408 15:11:00.950050  7892 systemd.cpp:324] Started systemd slice `mesos_executors.slice`
>>>>>> I0408 15:11:00.951529  7892 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
>>>>>> I0408 15:11:00.963232  7892 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
>>>>>> I0408 15:11:00.965541  7892 main.cpp:320] Starting Mesos slave
>>>>>> I0408 15:11:00.966008  7892 slave.cpp:192] Slave started on 1)@192.168.100.56:5051
>>>>>> I0408 15:11:00.966023  7892 slave.cpp:193] Flags at startup:
>>>>>> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
>>>>>> --cgroups_cpu_enable_pids_and_tids_count="false"
>>>>>> --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup"
>>>>>> --cgroups_limit_swap="false" --cgroups_root="mesos"
>>>>>> --container_disk_watch_interval="15secs" --containerizers="mesos"
>>>>>> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
>>>>>> --docker_auth_server="https://auth.docker.io"
>>>>>> --docker_kill_orphans="true" --docker_puller_timeout="60"
>>>>>> --docker_registry="https://registry-1.docker.io"
>>>>>> --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
>>>>>> --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker"
>>>>>> --enforce_container_disk_quota="false"
>>>>>> --executor_registration_timeout="1mins"
>>>>>> --executor_shutdown_grace_period="5secs"
>>>>>> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB"
>>>>>> --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
>>>>>> --hadoop_home="" --help="false" --hostname_lookup="true"
>>>>>> --image_provisioner_backend="copy" --initialize_driver_logging="true"
>>>>>> --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos"
>>>>>> --logbufsecs="0" --logging_level="INFO" --master="192.168.100.55:5050"
>>>>>> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
>>>>>> --perf_interval="1mins" --port="5051"
>>>>>> --qos_correction_interval_min="0ns" --quiet="false"
>>>>>> --recover="reconnect" --recovery_timeout="15mins"
>>>>>> --registration_backoff_factor="1secs"
>>>>>> --resources="cpu:1;mem:2000;disk:9000"
>>>>>> --revocable_cpu_low_priority="true"
>>>>>> --sandbox_directory="/mnt/mesos/sandbox" --strict="true"
>>>>>> --switch_user="true" --systemd_enable_support="true"
>>>>>> --systemd_runtime_directory="/run/systemd/system" --version="false"
>>>>>> --work_dir="/tmp/mesos"
>>>>>> I0408 15:11:00.967485  7892 slave.cpp:463] Slave resources: cpu(*):1; mem(*):2000; disk(*):9000; cpus(*):1; ports(*):[31000-32000]
>>>>>> I0408 15:11:00.967547  7892 slave.cpp:471] Slave attributes: [  ]
>>>>>> I0408 15:11:00.967560  7892 slave.cpp:476] Slave hostname: slave1.openstacklocal
>>>>>> I0408 15:11:00.971304  7893 state.cpp:58] Recovering state from '/tmp/mesos/meta'
>>>>>>
>>>>>> *Failed to perform recovery: Incompatible slave info detected*.
>>>>>>
>>>>>> ------------------------------------------------------------
>>>>>>
>>>>>> Old slave info:
>>>>>>
>>>>>> hostname: "*slave_public_IP_i_set*"
>>>>>> resources {
>>>>>>   name: "cpus"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 1
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "mem"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 920
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "disk"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 5112
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "ports"
>>>>>>   type: RANGES
>>>>>>   ranges {
>>>>>>     range {
>>>>>>       begin: 31000
>>>>>>       end: 32000
>>>>>>     }
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> id {
>>>>>>   value: "ad490064-1a6e-415c-8536-daef0d8e3572-S7"
>>>>>> }
>>>>>> checkpoint: true
>>>>>> port: 5051
>>>>>>
>>>>>> ------------------------------------------------------------
>>>>>>
>>>>>> New slave info:
>>>>>>
>>>>>> hostname: "slave1.openstacklocal"
>>>>>> resources {
>>>>>>   name: "cpu"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 1
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "mem"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 2000
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "disk"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 9000
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "cpus"
>>>>>>   type: SCALAR
>>>>>>   scalar {
>>>>>>     value: 1
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> resources {
>>>>>>   name: "ports"
>>>>>>   type: RANGES
>>>>>>   ranges {
>>>>>>     range {
>>>>>>       begin: 31000
>>>>>>       end: 32000
>>>>>>     }
>>>>>>   }
>>>>>>   role: "*"
>>>>>> }
>>>>>> id {
>>>>>>   value: "ad490064-1a6e-415c-8536-daef0d8e3572-S7"
>>>>>> }
>>>>>> checkpoint: true
>>>>>> port: 5051
>>>>>>
>>>>>> ------------------------------------------------------------
>>>>>>
>>>>>> To remedy this do as follows:
>>>>>>
>>>>>> Step 1: rm -f /tmp/mesos/meta/slaves/latest
>>>>>>         This ensures slave doesn't recover old live executors.
>>>>>> Step 2: Restart the slave.
>>>>>>
>>>>>> I can notice two things:
>>>>>>
>>>>>> 1) the failure message;
>>>>>> 2) the hostname has changed; the right one is the public IP I set in
>>>>>> order to make the hostname resolvable for Mesos.
>>>>>>
>>>>>> As a consequence, when I start the slave the resources are exactly
>>>>>> the same; nothing has changed.
>>>>>>
>>>>>> Can you please help me?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Arjun Arkal Rao
>>>
>>> PhD Student,
>>> Haussler Lab,
>>> UC Santa Cruz,
>>> USA
>>>
>>> [email protected]
>>>
>>>
>>
>
