-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/
-----------------------------------------------------------

(Updated Sept. 14, 2016, 5:33 p.m.)


Review request for Aurora, Joshua Cohen and Maxim Khutornenko.


Bugs: AURORA-1688
    https://issues.apache.org/jira/browse/AURORA-1688


Repository: aurora


Description
-------

Change framework_name default value from 'TwitterScheduler' to 'Aurora'


Diffs
-----

  RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
  src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java 8a386bd208956eb0c8c2f48874b0c6fb3af58872 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 97677f24a50963178a123b420d7ac136e4fde3fe 

Diff: https://reviews.apache.org/r/51874/diff/


Testing (updated)
-------

./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Testing to verify backward compatibility:

# HEAD of master:

# Case 1: Rolling forward does not impact running tasks:
Renaming framework from 'TwitterScheduler' to 'Aurora':

The framework re-registers after restart (treated by the master as a failover)
and keeps the same framework ID. Running tasks remain unaffected.

Master log:
I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
failover
I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with fd 
28: Transport endpoint is not connected
I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'Aurora' at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora with 
checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:48:43.722225  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (Aurora) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083

Scheduler log:
I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-0000"
, master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}

# Case 2: Rolling backward does not impact running tasks:
Rolling back framework name from 'Aurora' to 'TwitterScheduler':

The framework re-registers after restart (treated by the master as a failover)
and keeps the same framework ID. Running tasks remain unaffected.

Master log:
I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
failover
I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'TwitterScheduler' at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
TwitterScheduler with checkpointing enabled and capabilities [ 
REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 16:51:49.615170  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083

Scheduler log:
I0914 16:51:50.249 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-0000"
, master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}

# Case 3: Restarting with old framework_name (rolling back config) does not 
impact running tasks:
Restarting the scheduler after updating the config from 'Aurora' to 
'TwitterScheduler':

The rename takes effect. The master re-registers the framework under the same
ID. Running tasks remain unaffected.

Master log:
I0914 20:34:58.059640 28176 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (Aurora) at 
scheduler-4a7c21b7-5d90-4218-936e-4142051b3444@192.168.33.7:8083 3weeks to 
failover
I0914 20:34:58.059675 28176 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 20:35:23.447479 28175 master.cpp:2424] Received SUBSCRIBE call for 
framework 'TwitterScheduler' at 
scheduler-cea31751-7cb5-46b2-8208-f9ab1d4fe86c@192.168.33.7:8083
I0914 20:35:23.447573 28175 master.cpp:2500] Subscribing framework 
TwitterScheduler with checkpointing enabled and capabilities [ 
REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 20:35:23.447592 28175 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 20:35:23.447615 28175 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at 
scheduler-4a7c21b7-5d90-4218-936e-4142051b3444@192.168.33.7:8083 failed over
I0914 20:35:23.447777 28175 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000
I0914 20:35:23.447968 28175 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at 
scheduler-cea31751-7cb5-46b2-8208-f9ab1d4fe86c@192.168.33.7:8083

Scheduler log:
I0914 20:35:24.000 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-0000"
, master: id: "848618fb-714d-4b00-ad80-950f6bdc70c6"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}
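
The three cases above share one check: the framework ID in the master log stays
the same across the failover while the framework name changes. A minimal sketch
of that check in Python, assuming the master log lines look like the excerpts
quoted here. The regular expressions and the failover_summary helper are
illustrative only and are not part of Aurora or Mesos.

```python
import re

# Patterns mirror the Mesos master log lines quoted above. They are
# assumptions about the log text, not a stable Mesos interface.
GIVING = re.compile(r"Giving framework (\S+) \((\w+)\) at")
FAILED_OVER = re.compile(r"Framework (\S+) \((\w+)\) at \S+ failed over")


def failover_summary(master_log: str) -> dict:
    """Return the framework id and name seen before and after a failover."""
    # Collapse whitespace so entries wrapped across lines still match.
    text = " ".join(master_log.split())
    before = GIVING.search(text)
    after = FAILED_OVER.search(text)
    if before is None or after is None:
        raise ValueError("log does not contain a complete failover sequence")
    return {
        "old_id": before.group(1),
        "old_name": before.group(2),
        "new_id": after.group(1),
        "new_name": after.group(2),
    }


# Two entries copied from the Case 1 master log, including the line wrapping.
SAMPLE = """\
I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework
071c44a1-b4d4-4339-a727-03a79f725851-0000 (TwitterScheduler) at
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to
failover
I0914 16:48:43.722256  9813 master.cpp:2577] Framework
071c44a1-b4d4-4339-a727-03a79f725851-0000 (Aurora) at
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
"""

summary = failover_summary(SAMPLE)
print(summary["old_name"], "->", summary["new_name"])
print("same id:", summary["old_id"] == summary["new_id"])
```

The helper only reads the first failover sequence in the text, so a log that
spans several restarts would need to be split per restart first.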

# Testing on older versions that use Mesos 0.28 (Aurora 0.15) and Mesos 0.27 
(Aurora 0.14)

# Aurora Version: 0.14
# Initial version (TwitterScheduler)
# https://git-wip-us.apache.org/repos/asf?p=aurora.git;a=commit;h=b0b598088847630f37c3f995db98a8edf9520b7e

git reset --hard b0b598088847630f37c3f995db98a8edf9520b7e # reset HEAD to v0.14
vagrant destroy
vagrant up

vagrant ssh -c "aurora job create devcluster/www-data/prod/hello aurora/examples/jobs/hello_world.aurora" # start some job

Verify the framework name (TwitterScheduler) and id (some id - XXX) 
http://192.168.33.7:5050/#/frameworks
Verify the task is running 
http://192.168.33.7:8081/scheduler/www-data/prod/hello

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"

# Roll forward
git apply -3 ~/Downloads/rb51874.patch # apply the framework name default change
vagrant ssh -c "aurorabuild scheduler" # rebuild

Verify the framework name (Aurora) and id (same id - XXX) 
http://192.168.33.7:5050/#/frameworks
Verify the task is still running 
http://192.168.33.7:8081/scheduler/www-data/prod/hello

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:33:19.336 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"

# Roll backward
git stash
vagrant ssh -c "aurorabuild scheduler" # rebuild

Verify the framework name (TwitterScheduler) and id (same id - XXX) 
http://192.168.33.7:5050/#/frameworks
Verify the task is still running 
http://192.168.33.7:8081/scheduler/www-data/prod/hello

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:33:19.336 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:35:28.734 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"

# Roll forward again
git apply -3 ~/Downloads/rb51874.patch # apply the framework name default change
vagrant ssh -c "aurorabuild scheduler" # rebuild

Verify the framework name (Aurora) and id (same id - XXX) 
http://192.168.33.7:5050/#/frameworks
Verify the task is still running 
http://192.168.33.7:8081/scheduler/www-data/prod/hello

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:33:19.336 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:35:28.734 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:36:29.195 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"

# Restart with old framework name
vagrant ssh
sudo vim /etc/init/aurora-scheduler.conf
# add -framework_name=TwitterScheduler after "exec bin/aurora-scheduler" and save
sudo stop aurora-scheduler
sudo start aurora-scheduler

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:33:19.336 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:35:28.734 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:36:29.195 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:39:46.118 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"

Verify the framework name (TwitterScheduler) and id (same id - XXX) 
http://192.168.33.7:5050/#/frameworks
Verify the task is still running 
http://192.168.33.7:8081/scheduler/www-data/prod/hello
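
The manual comparison of the grep output above can be automated. A minimal
sketch in Python, assuming the scheduler log contains lines of the form shown
in this section. The registered_ids and id_is_stable helpers are hypothetical
names introduced for illustration.

```python
import re

# Matches the scheduler log entry grepped for above. Whitespace is kept
# flexible because the value can wrap onto the next line in pasted output.
REGISTERED = re.compile(r'Registered with ID\s+value:\s+"([^"\s]+)\s*"')


def registered_ids(scheduler_log: str) -> list:
    """Return every framework id the scheduler registered with, in order."""
    return REGISTERED.findall(scheduler_log)


def id_is_stable(scheduler_log: str) -> bool:
    """True when at least one registration exists and all ids are identical."""
    ids = registered_ids(scheduler_log)
    return bool(ids) and len(set(ids)) == 1


# Entries copied from the Aurora 0.14 run above, including the line wrapping.
SAMPLE_LOG = """\
I0914 23:26:52.095 [Thread-11, MesosSchedulerImpl:151] Registered with ID
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:33:19.336 [Thread-11, MesosSchedulerImpl:151] Registered with ID
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
I0914 23:35:28.734 [Thread-11, MesosSchedulerImpl:151] Registered with ID
value: "317cd38a-edc1-4168-8c20-ca81d8306e04-0000"
"""

print(len(registered_ids(SAMPLE_LOG)), "registrations")
print("stable id:", id_is_stable(SAMPLE_LOG))
```

Each scheduler restart adds one entry, so the number of registrations also
gives a quick count of how many roll forward and rollback steps were run.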

# Aurora Version: 0.15

# Initial Version
# https://git-wip-us.apache.org/repos/asf?p=aurora.git;a=commit;h=e870884fb30bc4d960aa5ed4901df679edbafb34
git reset --hard e870884fb30bc4d960aa5ed4901df679edbafb34

vagrant destroy
vagrant up

# start some job
vagrant ssh -c "aurora job create devcluster/www-data/prod/hello aurora/examples/jobs/hello_world.aurora"

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0915 00:08:11.136 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"

vagrant@aurora:~$ aurora job status  devcluster
 INFO] Retrieving jobs for role None
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status: 
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):

# Roll forward:
git stash pop
vagrant ssh -c "aurorabuild scheduler"

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0915 00:08:11.136 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:12:33.395 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
vagrant@aurora:~$ aurora job status  devcluster
 INFO] Retrieving jobs for role None
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status: 
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):

# Rollback
git stash
vagrant ssh -c "aurorabuild scheduler"

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0915 00:08:11.136 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:12:33.395 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:14:49.374 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
vagrant@aurora:~$ aurora job status  devcluster
 INFO] Retrieving jobs for role None
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status: 
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):

# Roll forward:
git stash pop
vagrant ssh -c "aurorabuild scheduler"

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0915 00:08:11.136 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:12:33.395 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:14:49.374 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:16:14.004 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
vagrant@aurora:~$ aurora job status  devcluster
 INFO] Retrieving jobs for role None
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status: 
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):

# Restart with old framework name
vagrant ssh
sudo vim /etc/init/aurora-scheduler.conf
# add -framework_name=TwitterScheduler after "exec bin/aurora-scheduler" and save
sudo stop aurora-scheduler
sudo start aurora-scheduler

vagrant@aurora:~$ sudo grep 'Registered with ID value' /var/log/upstart/aurora-scheduler.log
I0915 00:08:11.136 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:12:33.395 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:14:49.374 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:16:14.004 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
I0915 00:18:16.200 [Thread-11, MesosSchedulerImpl:151] Registered with ID 
value: "308d7661-6bb1-4936-86b4-a01158bfa06b-0000"
vagrant@aurora:~$ aurora job status  devcluster
 INFO] Retrieving jobs for role None
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status: 
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):
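
In every "aurora job status" output above the task's event history is
identical, which is what shows the task was never restarted. A minimal sketch
of that comparison in Python, assuming the event lines keep the
"timestamp STATE: message" layout shown here. The task_events and
task_unaffected helpers are hypothetical names introduced for illustration.

```python
import re

# One event per line: "2016-09-15 00:08:36 PENDING: None". The layout is
# taken from the client output quoted above and is an assumption.
EVENT = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ([A-Z_]+):", re.MULTILINE
)


def task_events(status_output: str) -> list:
    """Return (timestamp, state) pairs in the order they are printed."""
    return EVENT.findall(status_output)


def task_unaffected(before: str, after: str) -> bool:
    """True when both snapshots list the same events and end in RUNNING."""
    old, new = task_events(before), task_events(after)
    return bool(old) and old == new and new[-1][1] == "RUNNING"


# Snapshot copied from the Aurora 0.15 run above.
SAMPLE_STATUS = """\
 INFO] Checking status of devcluster/www-data/prod/hello
Active tasks (1):
           Task role: www-data, env: prod, name: hello, instance: 0, status:
RUNNING on 192.168.33.7
             CPU: 1.0 core(s), RAM: 128 MB, Disk: 128 MB
             events:
              2016-09-15 00:08:36 PENDING: None
              2016-09-15 00:08:37 ASSIGNED: None
              2016-09-15 00:08:39 STARTING: Initializing sandbox.
              2016-09-15 00:08:39 RUNNING: None
Inactive tasks (0):
"""

print(len(task_events(SAMPLE_STATUS)), "events")
print("unaffected:", task_unaffected(SAMPLE_STATUS, SAMPLE_STATUS))
```

With a single active task this is enough. For jobs with several instances the
events would need to be grouped per instance before comparing.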


Thanks,

Santhosh Kumar Shanmugham
