Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-09 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review136956
---



Patch looks great!

Reviews applied: [46407, 48299, 45604, 46411, 48303, 45905]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On June 9, 2016, 11:55 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated June 9, 2016, 11:55 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-09 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated June 9, 2016, 4:55 p.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Renamed `allowed_terminations` to `launch_failures`.  Also touched up a related 
comment.


Bugs: MESOS-5174
https://issues.apache.org/jira/browse/MESOS-5174


Repository: mesos


Description
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
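
A minimal sketch of how terminal status updates might be tallied into the counters listed above. It assumes simplified stand-in enums rather than the real Mesos types; `State`, `Reason`, `Counters`, and `record` are hypothetical names and this is not the code in the patch. The launch-failure counter uses the `launch_failures` name adopted in this update.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the Mesos task state and status reason enums.
// Only the values needed for this sketch are listed.
enum class State { RUNNING, FINISHED, FAILED, LOST, KILLED };
enum class Reason { NONE, CONTAINER_LIMITATION_MEMORY, CONTAINER_LAUNCH_FAILED };

// One counter per metric from the description.
struct Counters {
  uint64_t tasks_finished = 0;
  uint64_t tasks_oomed = 0;
  uint64_t launch_failures = 0;
  uint64_t abnormal_terminations = 0;
};

inline bool isTerminal(State state) { return state != State::RUNNING; }

// Classifies one status update. Non-terminal updates are ignored.
inline void record(Counters& c, State state, Reason reason) {
  if (!isTerminal(state)) {
    return;
  }

  if (state == State::FINISHED) {
    ++c.tasks_finished;
  } else if (state == State::FAILED &&
             reason == Reason::CONTAINER_LIMITATION_MEMORY) {
    // Assumed to be how an OOM kill is reported in this sketch.
    ++c.tasks_oomed;
  } else if (reason == Reason::CONTAINER_LAUNCH_FAILED) {
    // Acceptable for infrastructure reasons, e.g. a failed URI fetch.
    ++c.launch_failures;
  } else {
    // Anything else is neither a clean finish nor an expected OOM.
    ++c.abnormal_terminations;
  }
}
```

Keeping the classification in one function means every terminal update lands in exactly one counter, so the four values should sum to the number of terminal updates seen.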


Diffs (updated)
---

  src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-07 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review136550
---


Fix it, then Ship it!





src/examples/balloon_framework.cpp (line 256)


s/allowed_terminations/launch_failures/ ?


- Vinod Kone


On June 7, 2016, 12:30 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated June 7, 2016, 12:30 a.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-06 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review136399
---



Patch looks great!

Reviews applied: [46407, 48299, 45604, 46411, 48303, 45905]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On June 7, 2016, 12:30 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated June 7, 2016, 12:30 a.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-06 Thread Joseph Wu


> On April 21, 2016, 2:40 p.m., Vinod Kone wrote:
> > src/examples/balloon_framework.cpp, line 389
> > 
> >
> > Why do you need to store this?

With the process split, we don't need to store it anymore.


> On April 21, 2016, 2:40 p.m., Vinod Kone wrote:
> > src/examples/balloon_framework.cpp, line 297
> > 
> >
> > Is a failure to fetch the URI the only reason we would get 
> > REASON_CONTAINER_LAUNCH_FAILED?

Not necessarily, but for a framework that constantly launches (and OOMs) tasks, 
this is the most common "unexpected" failure condition.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review129966
---


On June 6, 2016, 5:30 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated June 6, 2016, 5:30 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-06-06 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated June 6, 2016, 5:30 p.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Split up the framework into a process (new prior review /r/48303/ ).


Bugs: MESOS-5174
https://issues.apache.org/jira/browse/MESOS-5174


Repository: mesos


Description
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.


Diffs (updated)
---

  src/examples/balloon_framework.cpp 739fb504e93154bf032b4c621151fa3c99b60037 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-21 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review129966
---




src/examples/balloon_framework.cpp (line 296)


Is a failure to fetch the URI the only reason we would get 
REASON_CONTAINER_LAUNCH_FAILED?



src/examples/balloon_framework.cpp (line 350)


As discussed in the review for long lived framework, can you make this a 
pure class and make the balloon scheduler a process? 

Just feels weird to me if some class directly calls into a process object 
instead of dispatching to it.
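
To illustrate the distinction between calling into an object directly and dispatching to it, here is a minimal single-threaded sketch. It is not libprocess and not the code in the patch; `Mailbox`, `BalloonState`, and `BalloonScheduler` are hypothetical names, and the mailbox only mimics the idea that calls are queued and later run by the owner of the state.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical mailbox: callers enqueue work instead of touching state.
class Mailbox {
public:
  void dispatch(std::function<void()> f) { queue_.push(std::move(f)); }

  // Runs queued calls in FIFO order and returns how many were run.
  std::size_t drain() {
    std::size_t ran = 0;
    while (!queue_.empty()) {
      std::function<void()> f = std::move(queue_.front());
      queue_.pop();
      f();
      ++ran;
    }
    return ran;
  }

private:
  std::queue<std::function<void()>> queue_;
};

// State owned by the "process"; it is only mutated from drain().
struct BalloonState {
  int tasks_launched = 0;
};

// A plain scheduler class that forwards each callback to the mailbox
// rather than mutating the state itself.
class BalloonScheduler {
public:
  BalloonScheduler(Mailbox& mailbox, BalloonState& state)
    : mailbox_(mailbox), state_(state) {}

  void resourceOffer() {
    BalloonState* state = &state_;
    mailbox_.dispatch([state]() { ++state->tasks_launched; });
  }

private:
  Mailbox& mailbox_;
  BalloonState& state_;
};
```

In this sketch nothing changes until `drain()` runs, which is the property the comment is after: all mutations happen in one place, in order.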



src/examples/balloon_framework.cpp (line 388)


Why do you need to store this?


- Vinod Kone


On April 19, 2016, 9:52 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated April 19, 2016, 9:52 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-19 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review129682
---



Patch looks great!

Reviews applied: [46407, 45604, 46411, 45905]

Passed command: export OS='ubuntu:14.04' CONFIGURATION='--verbose' 
COMPILER='gcc' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker_build.sh

- Mesos ReviewBot


On April 19, 2016, 9:52 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated April 19, 2016, 9:52 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-19 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated April 19, 2016, 2:52 p.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Update/rebase based on changes in previous reviews.


Bugs: MESOS-5174
https://issues.apache.org/jira/browse/MESOS-5174


Repository: mesos


Description
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.


Diffs (updated)
---

  src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```
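
The two cluster runs above differ only in `--task_memory`. A rough sketch of the expectation they encode, assuming the executor grows to `--balloon_limit` and the task is OOM-killed only when that exceeds its memory allocation; `expectOom` is a hypothetical helper for illustration, not part of the framework:

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the balloon grows to balloonLimitMB, and the
// task is OOM-killed only when that is strictly more than taskMemoryMB.
inline bool expectOom(uint64_t balloonLimitMB, uint64_t taskMemoryMB) {
  return balloonLimitMB > taskMemoryMB;
}
```

With the flags above, `expectOom(256, 128)` is true for the first run and `expectOom(256, 256)` is false for the second, which matches the two comments in the testing notes.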


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-14 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review129012
---



Bad patch!

Reviews applied: [45905, 45604]

Failed command: ./support/apply-review.sh -n -r 45604

Error:
2016-04-14 22:23:18 URL:https://reviews.apache.org/r/45604/diff/raw/ 
[23649/23649] -> "45604.patch" [1]
error: patch failed: src/examples/balloon_executor.cpp:24
error: src/examples/balloon_executor.cpp: patch does not apply

Full log: https://builds.apache.org/job/mesos-reviewbot/12533/console

- Mesos ReviewBot


On April 14, 2016, 9:43 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated April 14, 2016, 9:43 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5174
> https://issues.apache.org/jira/browse/MESOS-5174
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-14 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated April 14, 2016, 2:43 p.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Accidentally broke the test (previous commit).  Fixed now.


Bugs: MESOS-5174
https://issues.apache.org/jira/browse/MESOS-5174


Repository: mesos


Description
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.


Diffs (updated)
---

  src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-11 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated April 11, 2016, 11:24 a.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Added JIRA.  This should also re-trigger ReviewBot, with the commit message fix 
(that MPark just committed).


Bugs: MESOS-5174
https://issues.apache.org/jira/browse/MESOS-5174


Repository: mesos


Description
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.


Diffs
---

  src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-08 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review127952
---



Bad patch!

Reviews applied: [45905, 45604]

Failed command: ./support/apply-review.sh -n -r 45604

Error:
2016-04-09 04:10:44 URL:https://reviews.apache.org/r/45604/diff/raw/ 
[23317/23317] -> "45604.patch" [1]
Total errors found: 0
Checking 2 files
Error: No line in the commit message summary may exceed 72 characters.

Full log: https://builds.apache.org/job/mesos-reviewbot/12418/console
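
For reference, a minimal sketch of the kind of check that error describes: find the first line of a message that exceeds a length limit. This is not the actual Mesos commit hook; `firstOverlongLine` is a hypothetical helper, and it counts bytes rather than display characters.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Returns the 1-based number of the first line longer than `limit`
// bytes, or 0 if every line fits.
inline std::size_t firstOverlongLine(const std::string& message,
                                     std::size_t limit = 72) {
  std::istringstream in(message);
  std::string line;
  std::size_t number = 0;

  while (std::getline(in, line)) {
    ++number;
    if (line.size() > limit) {
      return number;
    }
  }

  return 0;
}
```

Running something like this over a commit message before posting would flag the offending line locally instead of waiting for the bot.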

- Mesos ReviewBot


On April 9, 2016, 12:32 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated April 9, 2016, 12:32 a.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * allowed_terminations = Number of terminal status updates which
>   are acceptable due to infrastructure reasons.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-08 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/
---

(Updated April 8, 2016, 5:32 p.m.)


Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod 
Kone.


Changes
---

Removed the /framework/counters endpoint.  Added an extra metric for allowable 
failures (currently just one).


Repository: mesos


Description (updated)
---

Adds metrics to gauge the health of the framework.  This includes:

* uptime_secs = How long the framework has been running.
* registered = If the framework is registered.
* tasks_finished = Number of tasks finished (successfully).
* tasks_oomed = Number of tasks that were OOM killed.
* allowed_terminations = Number of terminal status updates which
  are acceptable due to infrastructure reasons.
* abnormal_terminations = Number of terminal status updates which 
  were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.


Diffs (updated)
---

  src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 

Diff: https://reviews.apache.org/r/45905/diff/


Testing
---

```
make check

sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"

# Also launched two instances on a cluster.
# This one OOMs:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=128MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"

# This one does not OOM:
./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
  --balloon_limit=256MB --task_memory=256MB \
  --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
  --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
```


Thanks,

Joseph Wu



Re: Review Request 45905: Added metrics to the balloon framework.

2016-04-07 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45905/#review127719
---



Bad patch!

Reviews applied: [45905, 45604]

Failed command: ./support/apply-review.sh -n -r 45604

Error:
2016-04-08 02:24:27 URL:https://reviews.apache.org/r/45604/diff/raw/ 
[18869/18869] -> "45604.patch" [1]
Total errors found: 0
Checking 2 files
Error: No line in the commit message summary may exceed 72 characters.

Full log: https://builds.apache.org/job/mesos-reviewbot/12397/console

- Mesos ReviewBot


On April 7, 2016, 10:59 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45905/
> ---
> 
> (Updated April 7, 2016, 10:59 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds metrics to gauge the health of the framework.  This includes:
> 
> * uptime_secs = How long the framework has been running.
> * registered = If the framework is registered.
> * tasks_finished = Number of tasks finished (successfully).
> * tasks_oomed = Number of tasks that were OOM killed.
> * abnormal_terminations = Number of terminal status updates which 
>   were not `TASK_FINISHED` or `TASK_FAILED` due to OOM.
> 
> Also adds an endpoint `/framework/counters` which returns the list of 
> metrics which are "counters".
> 
> 
> Diffs
> ---
> 
>   src/examples/balloon_framework.cpp 15c45612b777edaf97aea9b953439d4ad56920f3 
> 
> Diff: https://reviews.apache.org/r/45905/diff/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> sudo bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_BalloonFramework"
> 
> # Also launched two instances on a cluster.
> # This one OOMs:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=128MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> 
> # This one does not OOM:
> ./balloon-framework --master=zk://localhost:2181/mesos --checkpoint \
>   --balloon_limit=256MB --task_memory=256MB \
>   --executor_uri="https://s3.amazonaws.com/url/to/balloon-executor" \
>   --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./balloon-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>