Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-05-14 Thread Nikolay Izhikov
Ticket for IEP.Phase1 created - 
https://issues.apache.org/jira/browse/IGNITE-11848


В Пн, 13/05/2019 в 18:06 +0300, Nikolay Izhikov пишет:
> Hello, Igniters.
> 
> We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, 
> Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> 
> Issues to address:
> 
> 1. Study experience of following libs, tools:
>   * OpenTracing
>   * OpenSensus
>   * DropWizard
> 
> 2. Support histogram sensor: Sensor that collects values that gets into 
> predefined segments 
> 
> 3. Use more widely used naming(like in OpenSensus?) 
> 
> 4. Consider the usage of OpenSensus as a default implementation for local 
> metric storage.
> 
> 5. To measure the performance penalty for metrics for 5_000 caches.
> 
> 6. Some metrics should be part of public API and others are not(may be 
> changed/removed in release without warnings).
> 
> My plan for Phase #1 is the following:
> 
> 1. Address the issues.
> 2. Prepare public API
> 3. Prepare PR for monitoring subsystem + existing metrics rewritten with it.
> 4. Prepare a PR with lists of each user API.
> 5. Collect feedback for a #4.
> 6. Design a log exposer. Consider the usage of JFR format or some other 
> widely used, tool compatible format.
> 
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> 
> В Чт, 02/05/2019 в 14:02 +0300, Nikolay Izhikov пишет:
> > Hello, Maxim.
> > 
> > > How will be recorded throughput sensor values which will require an 
> > > interval for the rate calculations?
> > 
> > I answered to this question in IEP "Design principles":
> > 
> > ```
> > Sensors should contain only raw values. No aggregation of numeric metrics 
> > on Ignite side. 
> > Min, max, avg and other functions are the matter of an external monitoring 
> > system.
> > ```
> > 
> > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > where S(t) is the sensor value in some point of time t.
> > 
> > Seems, throughput calculation is a responsibility of an external system.
> > 
> > What do you think?
> > 
> > > It seems to me that we can add an additional parameter of 
> > > `sensitivityLevel` to provide for the user a flexible sensor control 
> > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > 
> > For now, I think that all sensors and lists will be very(very!) lightweight.
> > So, we should be able to disable/enable it's, for sure.
> > 
> > But, we should turn off and turn on the whole Ignite subsystem 
> > for the case we have strong performance limitations for a particular 
> > workload.
> > 
> > So, we have two "level" of monitoring - INFO and DEBUG(for profiling: 
> > IEP-35 - Phase 3).
> > For example, AFAIK we can't disable current SQL system views(Why should we?)
> > 
> > В Вт, 30/04/2019 в 14:33 +0300, Maxim Muzafarov пишет:
> > > Hello Nikolay,
> > > 
> > > I've looked through your PRs changes.
> > > 
> > > > Sensors
> > > 
> > > How will be recorded throughput sensor values which will require an
> > > interval for the rate calculations? Do we have such an example? For
> > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > out of the scope of current PoC and IEP as they are not related to the
> > > user metrics, but it is a good example of a particular metric type.
> > > 
> > > It seems to me that we can add an additional parameter of
> > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > 
> > > It also seems that for the sensors getValue() the completely
> > > functional java approach can be used. Am I right?
> > > 
> > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov  wrote:
> > > > 
> > > > Hello, Vyacheslav.
> > > > 
> > > > Thanks for the feedback!
> > > > 
> > > > > HttpExposer with Jetty's dependencies should be detached> from the 
> > > > > core module.
> > > > 
> > > > Agreed. module hierarchy is the essence of the next steps.
> > > > For now it just a proof of my ideas for Ignite monitoring we can 
> > > > discuss.
> > > > 
> > > > > I like your approach with 'wrapper' for monitored objects, like don't 
> > > > > like using 'ServiceConfiguration' directly as a monitored object for 
> > > > > services
> > > > 
> > > > Agreed in general.
> > > > Seems, choosing the right data to expose is the matter of separate 
> > > > discussion for each Ignite entities.
> > > > I've planned to file tickets for each entity so anyone interested can 
> > > > share his vision in it.
> > > > 
> > > > > In my opinion, each sensor should have a timestamp.
> > > > 
> > > > I'm not sure that *every* sensor should have directly associated 
> > > > timestamp.
> > > > Seems, we should support sensors without timestamp for a current 
> > > > monitoring numbers at least.
> > > > 
> > > > > Also, it'd be great to have an ability to store a list of a fixed 
> > > > > size> of last N sensors
> > > > 
> > > > What use-cases do you know for such sensors?
> > > > We have plans to support fixed size lists to show "Last N SQL 

Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-05-13 Thread Nikolay Izhikov
Hello, Igniters.

We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey 
Gura, Alexey Scherbakov and Pavel Kovalenko.

Issues to address:

1. Study experience of following libs, tools:
* OpenTracing
* OpenSensus
* DropWizard

2. Support histogram sensor: Sensor that collects values that gets into 
predefined segments 

3. Use more widely used naming(like in OpenSensus?) 

4. Consider the usage of OpenSensus as a default implementation for local 
metric storage.

5. To measure the performance penalty for metrics for 5_000 caches.

6. Some metrics should be part of public API and others are not(may be 
changed/removed in release without warnings).

My plan for Phase #1 is the following:

1. Address the issues.
2. Prepare public API
3. Prepare PR for monitoring subsystem + existing metrics rewritten with it.
4. Prepare a PR with lists of each user API.
5. Collect feedback for a #4.
6. Design a log exposer. Consider the usage of JFR format or some other widely 
used, tool compatible format.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392

В Чт, 02/05/2019 в 14:02 +0300, Nikolay Izhikov пишет:
> Hello, Maxim.
> 
> > How will be recorded throughput sensor values which will require an 
> > interval for the rate calculations?
> 
> I answered to this question in IEP "Design principles":
> 
> ```
> Sensors should contain only raw values. No aggregation of numeric metrics on 
> Ignite side. 
> Min, max, avg and other functions are the matter of an external monitoring 
> system.
> ```
> 
> Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> where S(t) is the sensor value in some point of time t.
> 
> Seems, throughput calculation is a responsibility of an external system.
> 
> What do you think?
> 
> > It seems to me that we can add an additional parameter of 
> > `sensitivityLevel` to provide for the user a flexible sensor control (e.g., 
> > INFO, WARN, NOTICE, DEBUG).
> 
> For now, I think that all sensors and lists will be very(very!) lightweight.
> So, we should be able to disable/enable it's, for sure.
> 
> But, we should turn off and turn on the whole Ignite subsystem 
> for the case we have strong performance limitations for a particular workload.
> 
> So, we have two "level" of monitoring - INFO and DEBUG(for profiling: IEP-35 
> - Phase 3).
> For example, AFAIK we can't disable current SQL system views(Why should we?)
> 
> В Вт, 30/04/2019 в 14:33 +0300, Maxim Muzafarov пишет:
> > Hello Nikolay,
> > 
> > I've looked through your PRs changes.
> > 
> > > Sensors
> > 
> > How will be recorded throughput sensor values which will require an
> > interval for the rate calculations? Do we have such an example? For
> > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > out of the scope of current PoC and IEP as they are not related to the
> > user metrics, but it is a good example of a particular metric type.
> > 
> > It seems to me that we can add an additional parameter of
> > `sensitivityLevel` to provide for the user a flexible sensor control
> > (e.g., INFO, WARN, NOTICE, DEBUG).
> > 
> > It also seems that for the sensors getValue() the completely
> > functional java approach can be used. Am I right?
> > 
> > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov  wrote:
> > > 
> > > Hello, Vyacheslav.
> > > 
> > > Thanks for the feedback!
> > > 
> > > > HttpExposer with Jetty's dependencies should be detached> from the core 
> > > > module.
> > > 
> > > Agreed. module hierarchy is the essence of the next steps.
> > > For now it just a proof of my ideas for Ignite monitoring we can discuss.
> > > 
> > > > I like your approach with 'wrapper' for monitored objects, like don't 
> > > > like using 'ServiceConfiguration' directly as a monitored object for 
> > > > services
> > > 
> > > Agreed in general.
> > > Seems, choosing the right data to expose is the matter of separate 
> > > discussion for each Ignite entities.
> > > I've planned to file tickets for each entity so anyone interested can 
> > > share his vision in it.
> > > 
> > > > In my opinion, each sensor should have a timestamp.
> > > 
> > > I'm not sure that *every* sensor should have directly associated 
> > > timestamp.
> > > Seems, we should support sensors without timestamp for a current 
> > > monitoring numbers at least.
> > > 
> > > > Also, it'd be great to have an ability to store a list of a fixed size> 
> > > > of last N sensors
> > > 
> > > What use-cases do you know for such sensors?
> > > We have plans to support fixed size lists to show "Last N SQL queries" or 
> > > similar data.
> > > Essentially, a sensor is just a single value with the name and known 
> > > meaning.
> > > 
> > > > It'd be great if you provide a more extended test to show the work of> 
> > > > the system.
> > > 
> > > Sorry, for that :)
> > > When you run 'MonitoringSelfTest' you should open 
> > > http://localhost:8080/ignite/monitoring to view exposed info.
> > > I provide 

Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-05-02 Thread Nikolay Izhikov
Hello, Maxim.

> How will be recorded throughput sensor values which will require an interval 
> for the rate calculations?

I answered to this question in IEP "Design principles":

```
Sensors should contain only raw values. No aggregation of numeric metrics on 
Ignite side. 
Min, max, avg and other functions are the matter of an external monitoring 
system.
```

Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
where S(t) is the sensor value in some point of time t.

Seems, throughput calculation is a responsibility of an external system.

What do you think?

> It seems to me that we can add an additional parameter of `sensitivityLevel` 
> to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, 
> DEBUG).

For now, I think that all sensors and lists will be very(very!) lightweight.
So, we should be able to disable/enable it's, for sure.

But, we should turn off and turn on the whole Ignite subsystem 
for the case we have strong performance limitations for a particular workload.

So, we have two "level" of monitoring - INFO and DEBUG(for profiling: IEP-35 - 
Phase 3).
For example, AFAIK we can't disable current SQL system views(Why should we?)

В Вт, 30/04/2019 в 14:33 +0300, Maxim Muzafarov пишет:
> Hello Nikolay,
> 
> I've looked through your PRs changes.
> 
> > Sensors
> 
> How will be recorded throughput sensor values which will require an
> interval for the rate calculations? Do we have such an example? For
> instance, getAllocationRate() or getEvictionRate(). These metrics are
> out of the scope of current PoC and IEP as they are not related to the
> user metrics, but it is a good example of a particular metric type.
> 
> It seems to me that we can add an additional parameter of
> `sensitivityLevel` to provide for the user a flexible sensor control
> (e.g., INFO, WARN, NOTICE, DEBUG).
> 
> It also seems that for the sensors getValue() the completely
> functional java approach can be used. Am I right?
> 
> On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov  wrote:
> > 
> > Hello, Vyacheslav.
> > 
> > Thanks for the feedback!
> > 
> > > HttpExposer with Jetty's dependencies should be detached> from the core 
> > > module.
> > 
> > Agreed. module hierarchy is the essence of the next steps.
> > For now it just a proof of my ideas for Ignite monitoring we can discuss.
> > 
> > > I like your approach with 'wrapper' for monitored objects, like don't 
> > > like using 'ServiceConfiguration' directly as a monitored object for 
> > > services
> > 
> > Agreed in general.
> > Seems, choosing the right data to expose is the matter of separate 
> > discussion for each Ignite entities.
> > I've planned to file tickets for each entity so anyone interested can share 
> > his vision in it.
> > 
> > > In my opinion, each sensor should have a timestamp.
> > 
> > I'm not sure that *every* sensor should have directly associated timestamp.
> > Seems, we should support sensors without timestamp for a current monitoring 
> > numbers at least.
> > 
> > > Also, it'd be great to have an ability to store a list of a fixed size> 
> > > of last N sensors
> > 
> > What use-cases do you know for such sensors?
> > We have plans to support fixed size lists to show "Last N SQL queries" or 
> > similar data.
> > Essentially, a sensor is just a single value with the name and known 
> > meaning.
> > 
> > > It'd be great if you provide a more extended test to show the work of> 
> > > the system.
> > 
> > Sorry, for that :)
> > When you run 'MonitoringSelfTest' you should open 
> > http://localhost:8080/ignite/monitoring to view exposed info.
> > I provide this info in gist - 
> > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > 
> > I will extend this test to print results to console in the next iterations 
> > - stay tuned :)
> > 
> > В Вс, 28/04/2019 в 23:35 +0300, Vyacheslav Daradur пишет:
> > > Hi, Nikolay,
> > > 
> > > I looked through PR and IEP, and I have some comments:
> > > 
> > > It would be better to implement it as a separate module, I can't say
> > > if it is possible for the main part of monitoring or not, but I
> > > believe that HttpExposer with Jetty's dependencies should be detached
> > > from the core module.
> > > 
> > > I like your approach with 'wrapper' for monitored objects, like
> > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > directly as a monitored object for services. I believe we shouldn't
> > > mix approaches. It'd be better always use some kind of container with
> > > monitored object's information to work with such data.
> > > 
> > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > systems aggregate data and build graphics according to sensors
> > > timestamp.
> > > 
> > > Also, it'd be great to have an ability to store a list of a fixed size
> > > of last N sensors, not to miss them without pushing to an external
> > > monitoring system.
> > > 
> > > It'd be great if you provide a more extended test to show 

Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-04-30 Thread Maxim Muzafarov
Hello Nikolay,

I've looked through your PRs changes.

> Sensors

How will be recorded throughput sensor values which will require an
interval for the rate calculations? Do we have such an example? For
instance, getAllocationRate() or getEvictionRate(). These metrics are
out of the scope of current PoC and IEP as they are not related to the
user metrics, but it is a good example of a particular metric type.

It seems to me that we can add an additional parameter of
`sensitivityLevel` to provide for the user a flexible sensor control
(e.g., INFO, WARN, NOTICE, DEBUG).

It also seems that for the sensors getValue() the completely
functional java approach can be used. Am I right?

On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov  wrote:
>
> Hello, Vyacheslav.
>
> Thanks for the feedback!
>
> > HttpExposer with Jetty's dependencies should be detached> from the core 
> > module.
>
> Agreed. module hierarchy is the essence of the next steps.
> For now it just a proof of my ideas for Ignite monitoring we can discuss.
>
> > I like your approach with 'wrapper' for monitored objects, like don't like 
> > using 'ServiceConfiguration' directly as a monitored object for services
>
> Agreed in general.
> Seems, choosing the right data to expose is the matter of separate discussion 
> for each Ignite entities.
> I've planned to file tickets for each entity so anyone interested can share 
> his vision in it.
>
> > In my opinion, each sensor should have a timestamp.
>
> I'm not sure that *every* sensor should have directly associated timestamp.
> Seems, we should support sensors without timestamp for a current monitoring 
> numbers at least.
>
> > Also, it'd be great to have an ability to store a list of a fixed size> of 
> > last N sensors
>
> What use-cases do you know for such sensors?
> We have plans to support fixed size lists to show "Last N SQL queries" or 
> similar data.
> Essentially, a sensor is just a single value with the name and known meaning.
>
> > It'd be great if you provide a more extended test to show the work of> the 
> > system.
>
> Sorry, for that :)
> When you run 'MonitoringSelfTest' you should open 
> http://localhost:8080/ignite/monitoring to view exposed info.
> I provide this info in gist - 
> https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
>
> I will extend this test to print results to console in the next iterations - 
> stay tuned :)
>
> В Вс, 28/04/2019 в 23:35 +0300, Vyacheslav Daradur пишет:
> > Hi, Nikolay,
> >
> > I looked through PR and IEP, and I have some comments:
> >
> > It would be better to implement it as a separate module, I can't say
> > if it is possible for the main part of monitoring or not, but I
> > believe that HttpExposer with Jetty's dependencies should be detached
> > from the core module.
> >
> > I like your approach with 'wrapper' for monitored objects, like
> > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > directly as a monitored object for services. I believe we shouldn't
> > mix approaches. It'd be better always use some kind of container with
> > monitored object's information to work with such data.
> >
> > In my opinion, each sensor should have a timestamp. Usually monitoring
> > systems aggregate data and build graphics according to sensors
> > timestamp.
> >
> > Also, it'd be great to have an ability to store a list of a fixed size
> > of last N sensors, not to miss them without pushing to an external
> > monitoring system.
> >
> > It'd be great if you provide a more extended test to show the work of
> > the system. Everybody who looks to PR needs to run the test and get
> > the info manually to see the completeness of sensors, this might be
> > simplified by proper test.
> >
> > Thank you!
> >
> >
> >
> > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov  wrote:
> > >
> > > Hello, Igniters.
> > >
> > > I've prepared Proof of Concept for IEP-35 [1]
> > > PR can be found here - https://github.com/apache/ignite/pull/6510
> > >
> > > I've done following changes:
> > >
> > > 1. `GridMonitoringManager`  [2] - simple implementation of 
> > > manager to store all monitoring info
> > > 2. `HttpPullExposerSpi` [3] - pull exposer implementation that 
> > > can respond with JSON from http://localhost:8080/ignite/monitoring. JSON 
> > > content can be veiwed in gist [4]
> > > 3. Compute task start and finish monitoring in "compute" list [5]
> > > 4. Service registration are monitored in "service" list - [6]
> > > 5. Current `IgniteSpiMBeanAdapter` rewritten using 
> > > `GridMonitoringManager` [7]
> > >
> > > Design principles, monitoring subsystem details and new Ignite entities 
> > > can be found in IEP [1].
> > >
> > > My next steps will be:
> > >
> > > 1. Implementation of JMX exposer
> > > 2. Registration of all "lists" and "sensor groups" as a SQL 
> > > System view.
> > > 3. Add monitoring for all unmonitoring Ignite API. (described in 
> > > IEP).
> > >  

Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-04-29 Thread Nikolay Izhikov
Hello, Vyacheslav.

Thanks for the feedback!

> HttpExposer with Jetty's dependencies should be detached> from the core 
> module.

Agreed. module hierarchy is the essence of the next steps.
For now it just a proof of my ideas for Ignite monitoring we can discuss.

> I like your approach with 'wrapper' for monitored objects, like don't like 
> using 'ServiceConfiguration' directly as a monitored object for services

Agreed in general.
Seems, choosing the right data to expose is the matter of separate discussion 
for each Ignite entities.
I've planned to file tickets for each entity so anyone interested can share his 
vision in it.

> In my opinion, each sensor should have a timestamp.

I'm not sure that *every* sensor should have directly associated timestamp.
Seems, we should support sensors without timestamp for a current monitoring 
numbers at least.

> Also, it'd be great to have an ability to store a list of a fixed size> of 
> last N sensors

What use-cases do you know for such sensors?
We have plans to support fixed size lists to show "Last N SQL queries" or 
similar data.
Essentially, a sensor is just a single value with the name and known meaning.

> It'd be great if you provide a more extended test to show the work of> the 
> system. 

Sorry, for that :)
When you run 'MonitoringSelfTest' you should open 
http://localhost:8080/ignite/monitoring to view exposed info.
I provide this info in gist - 
https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d

I will extend this test to print results to console in the next iterations - 
stay tuned :)

В Вс, 28/04/2019 в 23:35 +0300, Vyacheslav Daradur пишет:
> Hi, Nikolay,
> 
> I looked through PR and IEP, and I have some comments:
> 
> It would be better to implement it as a separate module, I can't say
> if it is possible for the main part of monitoring or not, but I
> believe that HttpExposer with Jetty's dependencies should be detached
> from the core module.
> 
> I like your approach with 'wrapper' for monitored objects, like
> 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> directly as a monitored object for services. I believe we shouldn't
> mix approaches. It'd be better always use some kind of container with
> monitored object's information to work with such data.
> 
> In my opinion, each sensor should have a timestamp. Usually monitoring
> systems aggregate data and build graphics according to sensors
> timestamp.
> 
> Also, it'd be great to have an ability to store a list of a fixed size
> of last N sensors, not to miss them without pushing to an external
> monitoring system.
> 
> It'd be great if you provide a more extended test to show the work of
> the system. Everybody who looks to PR needs to run the test and get
> the info manually to see the completeness of sensors, this might be
> simplified by proper test.
> 
> Thank you!
> 
> 
> 
> On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov  wrote:
> > 
> > Hello, Igniters.
> > 
> > I've prepared Proof of Concept for IEP-35 [1]
> > PR can be found here - https://github.com/apache/ignite/pull/6510
> > 
> > I've done following changes:
> > 
> > 1. `GridMonitoringManager`  [2] - simple implementation of manager 
> > to store all monitoring info
> > 2. `HttpPullExposerSpi` [3] - pull exposer implementation that can 
> > respond with JSON from http://localhost:8080/ignite/monitoring. JSON 
> > content can be veiwed in gist [4]
> > 3. Compute task start and finish monitoring in "compute" list [5]
> > 4. Service registration are monitored in "service" list - [6]
> > 5. Current `IgniteSpiMBeanAdapter` rewritten using 
> > `GridMonitoringManager` [7]
> > 
> > Design principles, monitoring subsystem details and new Ignite entities can 
> > be found in IEP [1].
> > 
> > My next steps will be:
> > 
> > 1. Implementation of JMX exposer
> > 2. Registration of all "lists" and "sensor groups" as a SQL System 
> > view.
> > 3. Add monitoring for all unmonitoring Ignite API. (described in 
> > IEP).
> > 4. Rewrite existing jmx metrics using GridMonitoringManager.
> > 
> > Please, share you thoughts.
> > 
> > Part of JSON file:
> > ```
> > "COMPUTE": {
> >   "tasks": {
> > "name": "tasks",
> > "rows": [
> >   {
> > "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a90",
> > "data": {
> >   "taskClasName": 
> > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> >   "startTime": 1556287337944,
> >   "timeout": 9223372036854776000,
> >   "execName": null
> > },
> > "name": "anotherBroadcast"
> >   }
> > ```
> > 
> > [1] 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > [2] 
> > 

Re: [IEP-35] Monitoring & Profiling. Proof of concept

2019-04-28 Thread Vyacheslav Daradur
Hi, Nikolay,

I looked through PR and IEP, and I have some comments:

It would be better to implement it as a separate module, I can't say
if it is possible for the main part of monitoring or not, but I
believe that HttpExposer with Jetty's dependencies should be detached
from the core module.

I like your approach with 'wrapper' for monitored objects, like
'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
directly as a monitored object for services. I believe we shouldn't
mix approaches. It'd be better always use some kind of container with
monitored object's information to work with such data.

In my opinion, each sensor should have a timestamp. Usually monitoring
systems aggregate data and build graphics according to sensors
timestamp.

Also, it'd be great to have an ability to store a list of a fixed size
of last N sensors, not to miss them without pushing to an external
monitoring system.

It'd be great if you provide a more extended test to show the work of
the system. Everybody who looks to PR needs to run the test and get
the info manually to see the completeness of sensors, this might be
simplified by proper test.

Thank you!



On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov  wrote:
>
> Hello, Igniters.
>
> I've prepared Proof of Concept for IEP-35 [1]
> PR can be found here - https://github.com/apache/ignite/pull/6510
>
> I've done following changes:
>
> 1. `GridMonitoringManager`  [2] - simple implementation of manager to 
> store all monitoring info
> 2. `HttpPullExposerSpi` [3] - pull exposer implementation that can 
> respond with JSON from http://localhost:8080/ignite/monitoring. JSON content 
> can be veiwed in gist [4]
> 3. Compute task start and finish monitoring in "compute" list [5]
> 4. Service registration are monitored in "service" list - [6]
> 5. Current `IgniteSpiMBeanAdapter` rewritten using 
> `GridMonitoringManager` [7]
>
> Design principles, monitoring subsystem details and new Ignite entities can 
> be found in IEP [1].
>
> My next steps will be:
>
> 1. Implementation of JMX exposer
> 2. Registration of all "lists" and "sensor groups" as a SQL System 
> view.
> 3. Add monitoring for all unmonitoring Ignite API. (described in IEP).
> 4. Rewrite existing jmx metrics using GridMonitoringManager.
>
> Please, share you thoughts.
>
> Part of JSON file:
> ```
> "COMPUTE": {
>   "tasks": {
> "name": "tasks",
> "rows": [
>   {
> "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a90",
> "data": {
>   "taskClasName": 
> "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
>   "startTime": 1556287337944,
>   "timeout": 9223372036854776000,
>   "execName": null
> },
> "name": "anotherBroadcast"
>   }
> ```
>
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> [2] 
> https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> [3] 
> https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> [4] https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> [5] 
> https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> [6] 
> https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> [7] 
> https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61



-- 
Best Regards, Vyacheslav D.


[IEP-35] Monitoring & Profiling. Proof of concept

2019-04-26 Thread Nikolay Izhikov
Hello, Igniters.

I've prepared Proof of Concept for IEP-35 [1]
PR can be found here - https://github.com/apache/ignite/pull/6510

I've done following changes:

1. `GridMonitoringManager`  [2] - simple implementation of manager to 
store all monitoring info
2. `HttpPullExposerSpi` [3] - pull exposer implementation that can 
respond with JSON from http://localhost:8080/ignite/monitoring. JSON content 
can be veiwed in gist [4] 
3. Compute task start and finish monitoring in "compute" list [5]
4. Service registration are monitored in "service" list - [6]
5. Current `IgniteSpiMBeanAdapter` rewritten using 
`GridMonitoringManager` [7]

Design principles, monitoring subsystem details and new Ignite entities can be 
found in IEP [1].

My next steps will be:

1. Implementation of JMX exposer
2. Registration of all "lists" and "sensor groups" as a SQL System view.
3. Add monitoring for all unmonitoring Ignite API. (described in IEP).
4. Rewrite existing jmx metrics using GridMonitoringManager.

Please, share you thoughts.

Part of JSON file:
```
"COMPUTE": {
  "tasks": {
"name": "tasks",
"rows": [
  {
"id": "0798817a-eeec-4386-9af7-94edb39ffced",
"sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a90",
"data": {
  "taskClasName": 
"org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
  "startTime": 1556287337944,
  "timeout": 9223372036854776000,
  "execName": null
},
"name": "anotherBroadcast"
  }
```

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
[2] 
https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
[3] 
https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
[4] https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
[5] 
https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
[6] 
https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
[7] 
https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61


signature.asc
Description: This is a digitally signed message part