Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Xudong Ni via Review Board


> On May 8, 2018, 12:08 a.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 4163 (patched)
> > 
> >
> > You missed a space between `result)` and `{` which I didn't catch 
> > initially but fixed up in a subsequent commit.

Thanks for the follow up commit


- Xudong


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202606
---


On May 7, 2018, 6:10 p.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 7, 2018, 6:10 p.m.)
> 
> 
> Review request for mesos, Joseph Wu and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/3/
> 
> 
> Testing
> ---
> 
> xujyan made a local test change to verify this patch:
> https://github.com/xujyan/mesos/commit/68051320a87431f6d2f3fbad6b0b97814200a731
> 
> Test output:
> F0504 20:43:36.530680 1858991 http.cpp:4304] CHECK_READY(result): is FAILED: 
> Failed to update registry: Failed to perform store within 5secs
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Jiang Yan Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202606
---


Fix it, then Ship it!




FYI committed with some fixes on the commit message (which comes from the 
review summary).


src/master/http.cpp
Lines 4163 (patched)


You missed a space between `result)` and `{` which I didn't catch initially 
but fixed up in a subsequent commit.


- Jiang Yan Xu


On May 7, 2018, 11:10 a.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 7, 2018, 11:10 a.m.)
> 
> 
> Review request for mesos, Joseph Wu and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/3/
> 
> 
> Testing
> ---
> 
> xujyan made a local test change to verify this patch:
> https://github.com/xujyan/mesos/commit/68051320a87431f6d2f3fbad6b0b97814200a731
> 
> Test output:
> F0504 20:43:36.530680 1858991 http.cpp:4304] CHECK_READY(result): is FAILED: 
> Failed to update registry: Failed to perform store within 5secs
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202578
---



PASS: Mesos patch 66919 was successfully built and tested.

Reviews applied: `['66919']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66919

- Mesos Reviewbot Windows


On May 7, 2018, 11:10 a.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 7, 2018, 11:10 a.m.)
> 
> 
> Review request for mesos, Joseph Wu and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/3/
> 
> 
> Testing
> ---
> 
> xujyan made a local test change to verify this patch:
> https://github.com/xujyan/mesos/commit/68051320a87431f6d2f3fbad6b0b97814200a731
> 
> Test output:
> F0504 20:43:36.530680 1858991 http.cpp:4304] CHECK_READY(result): is FAILED: 
> Failed to update registry: Failed to perform store within 5secs
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/
---

(Updated May 7, 2018, 6:10 p.m.)


Review request for mesos, Joseph Wu and Jiang Yan Xu.


Bugs: MESOS-8630
https://issues.apache.org/jira/browse/MESOS-8630


Repository: mesos


Description
---

When the registrar fails to update the registry it would abort the
actor and fail all future operations,However when the registrar
updates is requested by an operator API such as maintenance update,
the master process doesn't shut down (a 500 error is returned to the
client instead)and all subsequent operations will fail.

Review: https://reviews.apache.org/r/66919


Diffs
-

  src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 


Diff: https://reviews.apache.org/r/66919/diff/3/


Testing (updated)
---

xujyan made a local test change to verify this patch:
https://github.com/xujyan/mesos/commit/68051320a87431f6d2f3fbad6b0b97814200a731

Test output:
F0504 20:43:36.530680 1858991 http.cpp:4304] CHECK_READY(result): is FAILED: 
Failed to update registry: Failed to perform store within 5secs


Thanks,

Xudong Ni



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/
---

(Updated May 7, 2018, 5:20 p.m.)


Review request for mesos, Joseph Wu and Jiang Yan Xu.


Bugs: MESOS-8630
https://issues.apache.org/jira/browse/MESOS-8630


Repository: mesos


Description (updated)
---

When the registrar fails to update the registry it would abort the
actor and fail all future operations,However when the registrar
updates is requested by an operator API such as maintenance update,
the master process doesn't shut down (a 500 error is returned to the
client instead)and all subsequent operations will fail.

Review: https://reviews.apache.org/r/66919


Diffs (updated)
-

  src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 


Diff: https://reviews.apache.org/r/66919/diff/3/

Changes: https://reviews.apache.org/r/66919/diff/2-3/


Testing
---

No error found by running all unit tests(make check)


Thanks,

Xudong Ni



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-07 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/
---

(Updated May 7, 2018, 4:54 p.m.)


Review request for mesos, Joseph Wu and Jiang Yan Xu.


Bugs: MESOS-8630
https://issues.apache.org/jira/browse/MESOS-8630


Repository: mesos


Description (updated)
---

When the registrar fails to update the registry it would abort the
actor and fail all future operations,However when the registrar
updates is requested by an operator API such as maintenance update,
the master process doesn't shut down (a 500 error is returned to the
client instead)and all subsequent operations will fail.


Diffs (updated)
-

  src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 


Diff: https://reviews.apache.org/r/66919/diff/2/

Changes: https://reviews.apache.org/r/66919/diff/1-2/


Testing
---

No error found by running all unit tests(make check)


Thanks,

Xudong Ni



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-04 Thread Jiang Yan Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202496
---



The "process aborting" logic is a bit hard to write tests for but we can test 
it manually without checking in the code.

e.g., I changed one test a bit to show the check failure: 
https://github.com/xujyan/mesos/commit/68051320a87431f6d2f3fbad6b0b97814200a731

Could you update the "Testing Done" section with the link? That should cover 
it. :)

- Jiang Yan Xu


On May 4, 2018, 12:46 p.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 4, 2018, 12:46 p.m.)
> 
> 
> Review request for mesos, Joseph Wu and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/1/
> 
> 
> Testing
> ---
> 
> No error found by running all unit tests(make check)
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-04 Thread Jiang Yan Xu


> On May 4, 2018, 8:19 a.m., Jiang Yan Xu wrote:
> > src/master/http.cpp
> > Lines 4163-4165 (patched)
> > 
> >
> > The following will be more idiomatic.
> > 
> > ```
> >   .onAny([](const Future& result) {
> > CHECK_READY(result);
> >   })
> > ```
> > 
> > Note the spaces and const ref.
> > 
> > 
> > It'll be helpful to get familiarize with
> > 
> > https://github.com/apache/mesos/blob/master/docs/c%2B%2B-style-guide.md
> > 
> > which in turn references 
> > 
> > https://google.github.io/styleguide/cppguide.html
> 
> Jiang Yan Xu wrote:
> It's also helpful to attach a message to the check failure:
> 
> 
> ```
> CHECK_READY(result)
>   << "Failed to update maintenance schedule in the registry";
> ```

Also let's add a TODO above the `CHECK_READY`:

```
TODO(fiu): Consider changing/refactoring the registrar itself so the individual 
call sites don't need to handle this separately. All registrar failures that 
cause it to abort should instead abort the process.
```


- Jiang Yan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202416
---


On May 4, 2018, 12:46 p.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 4, 2018, 12:46 p.m.)
> 
> 
> Review request for mesos, Joseph Wu and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/1/
> 
> 
> Testing
> ---
> 
> No error found by running all unit tests(make check)
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-04 Thread Jiang Yan Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202416
---



LGTM

It's hard to write a test for this but we can probably modify a test in 
master_maintenance_tests.cpp to observe the check failure. 

Let's also check Joseph with the comment here.
https://github.com/apache/mesos/blob/520b729857223aeade345cbdf61209ec4f395ad9/src/master/maintenance.hpp#L39


src/master/http.cpp
Lines 4163-4165 (patched)


The following will be more idiomatic.

```
  .onAny([](const Future& result) {
CHECK_READY(result);
  })
```

Note the spaces and const ref.

It'll be helpful to get familiarize with

https://github.com/apache/mesos/blob/master/docs/c%2B%2B-style-guide.md

which in turn references 

https://google.github.io/styleguide/cppguide.html


- Jiang Yan Xu


On May 3, 2018, 9:43 a.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 3, 2018, 9:43 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/1/
> 
> 
> Testing
> ---
> 
> No error found by running all unit tests(make check)
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-02 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202327
---



PASS: Mesos patch 66919 was successfully built and tested.

Reviews applied: `['66919']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/66919

- Mesos Reviewbot Windows


On May 2, 2018, 11:38 p.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 2, 2018, 11:38 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/1/
> 
> 
> Testing
> ---
> 
> No error found by running all unit tests(make check)
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-02 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/#review202322
---



Bad review!

Reviews applied: []

Error:
No reviewers specified. Please find a reviewer by asking on JIRA or the mailing 
list.

- Mesos Reviewbot


On May 2, 2018, 5:38 p.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66919/
> ---
> 
> (Updated May 2, 2018, 5:38 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-8630
> https://issues.apache.org/jira/browse/MESOS-8630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the registrar fails to update the registry it would abort the
> actor and fail all future operations,However when the registrar
> updates is requested by an operator API such as maintenance update,
> the master process doesn't shut down (a 500 error is returned to the
> client instead)and all subsequent operations will fail.
> 
> Review: https://reviews.apache.org/r/66919
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 
> 
> 
> Diff: https://reviews.apache.org/r/66919/diff/1/
> 
> 
> Testing
> ---
> 
> No error found by running all unit tests(make check)
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 66919: Failure to update registry should abort the master process.

2018-05-02 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/
---

(Updated May 2, 2018, 9:38 p.m.)


Review request for mesos.


Bugs: MESOS-8630
https://issues.apache.org/jira/browse/MESOS-8630


Repository: mesos


Description (updated)
---

When the registrar fails to update the registry it would abort the
actor and fail all future operations,However when the registrar
updates is requested by an operator API such as maintenance update,
the master process doesn't shut down (a 500 error is returned to the
client instead)and all subsequent operations will fail.

Review: https://reviews.apache.org/r/66919


Diffs
-

  src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 


Diff: https://reviews.apache.org/r/66919/diff/1/


Testing
---

No error found by running all unit tests(make check)


Thanks,

Xudong Ni



Review Request 66919: Failure to update registry should abort the master process.

2018-05-02 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66919/
---

Review request for mesos.


Bugs: MESOS-8630
https://issues.apache.org/jira/browse/MESOS-8630


Repository: mesos


Description
---

When the registrar fails to update the registry it would abort the
actor and fail all future operations,However when the registrar
updates is requested by an operator API such as maintenance update,
the master process doesn't shut down (a 500 error is returned to the
client instead)and all subsequent operations will fail.


Diffs
-

  src/master/http.cpp 135ae4389623385a6638158f5f23d6daca14a0ad 


Diff: https://reviews.apache.org/r/66919/diff/1/


Testing
---

No error found by running all unit tests(make check)


Thanks,

Xudong Ni