[jira] [Commented] (YARN-8466) Add Chaos Monkey unit test framework for feature validation in scale

2018-06-26 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524485#comment-16524485
 ] 

Wangda Tan commented on YARN-8466:
--

Thanks [~cheersyang]. This JIRA was actually inspired by the distributed chaos
monkey framework you mentioned offline.

For the UT-like binary, the benefit is that we can run smoke tests in a
self-contained way. Without any env setup, we can do a sanity test in minutes,
and the mock framework allows apps and nodes to be started and stopped very
quickly.
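
A minimal sketch of what such a self-contained smoke run could look like,
written in Python for brevity. This is not the code in YARN-8466.poc.001.patch:
MockCluster, chaos_run and all method names are hypothetical, and an actual UT
would be written in Java against YARN's own mock classes. The point is only
that an in-memory mock needs no env setup and that a fixed seed makes a random
run replayable.

```python
import random


class MockCluster:
    """Hypothetical in-memory stand-in for a cluster: no daemons, no network."""

    def __init__(self):
        self.nodes = set()
        self.apps = set()
        self._next_id = 0

    def _new_id(self, prefix):
        self._next_id += 1
        return f"{prefix}-{self._next_id}"

    def add_node(self):
        self.nodes.add(self._new_id("node"))

    def remove_node(self, rng):
        # Removing from an empty cluster is a no-op.
        if self.nodes:
            self.nodes.remove(rng.choice(sorted(self.nodes)))

    def start_app(self):
        self.apps.add(self._new_id("app"))

    def stop_app(self, rng):
        if self.apps:
            self.apps.remove(rng.choice(sorted(self.apps)))


def chaos_run(steps, seed):
    """Apply `steps` random start/stop events; the same seed replays the same run."""
    rng = random.Random(seed)
    cluster = MockCluster()
    for _ in range(steps):
        action = rng.choice(["add_node", "remove_node", "start_app", "stop_app"])
        if action == "add_node":
            cluster.add_node()
        elif action == "remove_node":
            cluster.remove_node(rng)
        elif action == "start_app":
            cluster.start_app()
        else:
            cluster.stop_app(rng)
    return cluster
```

Calling chaos_run(1000, seed=42) twice yields the same final set of nodes and
apps, so a failing run can be reproduced from its seed alone.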

And I can definitely see the value of a distributed chaos monkey framework. If
we can make the test easy to run, it will be super useful to run before any
release!

[~sunilg],
To me, the UT does not necessarily need to use the same code base as the
distributed one (ideally they would share one, of course, but in practice that
could be hard).

> Add Chaos Monkey unit test framework for feature validation in scale
> 
>
> Key: YARN-8466
> URL: https://issues.apache.org/jira/browse/YARN-8466
> Project: Hadoop YARN
> Issue Type: Task
> Reporter: Wangda Tan
> Priority: Critical
> Attachments: YARN-8466.poc.001.patch
>
>
> Currently we don't have such a framework for testing.
> We need a framework to do this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8466) Add Chaos Monkey unit test framework for feature validation in scale

2018-06-26 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1652#comment-1652
 ] 

Sunil Govindan commented on YARN-8466:
--

Thanks [~leftnoteasy] for proposing this and [~cheersyang] for the good
insights. In my opinion, we have been seeing a few issues where metrics etc. go
wrong when nodes are added/removed under sync/async scheduling. So having a
system that checks various invariants will ensure that any new patch that comes
in does not break metrics etc. It would be great to collaborate and build this
as a common test system that can be extended to a distributed system.
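
A minimal sketch of the invariant-checking idea, in Python for brevity. The
model is hypothetical and much simpler than the real scheduler: ClusterState,
Node, check_invariants and the two metrics (total_mb, allocated_mb) are
assumptions made for illustration, not YARN classes or metric names. The
metrics are updated incrementally on each event, and the checker recomputes
them from the per-node ground truth, which is how a patch that forgets one
update on node removal would be caught.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    capacity_mb: int
    allocated_mb: int = 0


@dataclass
class ClusterState:
    """Hypothetical model: per-node ground truth plus incrementally kept metrics."""
    nodes: dict = field(default_factory=dict)
    total_mb: int = 0       # metric updated on node add/remove
    allocated_mb: int = 0   # metric updated on allocate and node remove

    def add_node(self, name, capacity_mb):
        self.nodes[name] = Node(capacity_mb)
        self.total_mb += capacity_mb

    def allocate(self, name, mb):
        node = self.nodes[name]
        if node.allocated_mb + mb > node.capacity_mb:
            return False  # request does not fit on this node
        node.allocated_mb += mb
        self.allocated_mb += mb
        return True

    def remove_node(self, name):
        node = self.nodes.pop(name)
        self.total_mb -= node.capacity_mb
        # Containers on a removed node are lost, so release their memory too.
        self.allocated_mb -= node.allocated_mb


def check_invariants(state):
    """Return the list of violated invariants; an empty list means consistent."""
    violations = []
    if state.total_mb != sum(n.capacity_mb for n in state.nodes.values()):
        violations.append("total_mb does not match the sum of node capacities")
    if state.allocated_mb != sum(n.allocated_mb for n in state.nodes.values()):
        violations.append("allocated_mb does not match the sum of node allocations")
    if not 0 <= state.allocated_mb <= state.total_mb:
        violations.append("allocated_mb is outside [0, total_mb]")
    return violations
```

Running check_invariants after every add/remove/allocate event in a random
test run pins a metric drift to the first event that caused it.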







[jira] [Commented] (YARN-8466) Add Chaos Monkey unit test framework for feature validation in scale

2018-06-26 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524431#comment-16524431
 ] 

Weiwei Yang commented on YARN-8466:
---

Hi [~leftnoteasy]

Having that in UT sounds like an easier approach, but I think having it in a
distributed env is more useful. It can be fairly env independent, and it often
takes a while until it breaks something. We have an internal system written in
Python that interacts with a YARN cluster to cause trouble, e.g. killing NM/RM
processes, containers and jobs, shutting down queues, etc. Maybe we can have
both; it seems HBase supports both. For the distributed approach, we would love
to contribute our work to the community if people think it is useful.
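
A minimal sketch of how a distributed fault injector of this kind might be
structured. It does not describe the internal system mentioned above, whose
design is not given here: plan_faults, run_plan and the fault names are
hypothetical. The fault kinds are the ones listed in the comment, and the
handlers that would actually touch a cluster are left to the caller, so the
sketch issues no real commands.

```python
import random

# Fault kinds taken from the comment above; the string names are illustrative.
FAULTS = ["kill_nm", "kill_rm", "kill_container", "kill_job", "shutdown_queue"]


def plan_faults(duration_s, mean_gap_s, seed):
    """Return a replayable list of (time_offset_s, fault) pairs within duration_s."""
    rng = random.Random(seed)
    plan, t = [], 0.0
    while True:
        t += rng.expovariate(1.0 / mean_gap_s)  # random gap with the given mean
        if t >= duration_s:
            return plan
        plan.append((t, rng.choice(FAULTS)))


def run_plan(plan, handlers):
    """Dispatch each planned fault to a caller-supplied handler and return a log.

    A real runner would also wait until each offset; that is omitted here.
    """
    log = []
    for offset, fault in plan:
        handler = handlers.get(fault)
        if handler is None:
            log.append((offset, fault, "skipped"))
        else:
            handler()
            log.append((offset, fault, "injected"))
    return log
```

Separating the plan from its execution means the same seeded plan can be
printed for review first, then run against a cluster with real handlers.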



