[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872031#comment-13872031
 ] 

Bertrand Delacretaz commented on SLING-3278:


Is the 500ms caching time used to expire results from the cache, and is it 
the same for all results?

If yes I think results will need different times to live depending on their 
nature - I'd suggest adding a method to the HealthCheckExecution result that is 
used to expire results from the cache, maybe getTimeToLiveMsec(). For now this 
can be based on this default value, but it will allow results to provide 
appropriate values later on, without changing the API.
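
To make the suggestion concrete, here is a minimal sketch of a cache that expires each entry by the time to live carried on the result itself. Only the getTimeToLiveMsec() name comes from the comment above; TimedResult, SimpleTimedResult, TtlResultCache and getFinishedAtMsec() are hypothetical names used for illustration and are not the Sling Health Check API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical result view: only the getTimeToLiveMsec() name comes from the thread. */
interface TimedResult {
    long getFinishedAtMsec();
    long getTimeToLiveMsec();
}

/** Immutable illustration of a result that carries its own time to live. */
class SimpleTimedResult implements TimedResult {
    private final long finishedAtMsec;
    private final long timeToLiveMsec;

    SimpleTimedResult(long finishedAtMsec, long timeToLiveMsec) {
        this.finishedAtMsec = finishedAtMsec;
        this.timeToLiveMsec = timeToLiveMsec;
    }

    public long getFinishedAtMsec() { return finishedAtMsec; }
    public long getTimeToLiveMsec() { return timeToLiveMsec; }
}

/** Cache that expires every entry according to the TTL of the stored result. */
class TtlResultCache {
    private final Map<Long, TimedResult> entries = new ConcurrentHashMap<>();

    void put(long checkId, TimedResult result) {
        entries.put(checkId, result);
    }

    /** Returns the cached result, or null if it is absent or older than its own TTL. */
    TimedResult getValid(long checkId, long nowMsec) {
        TimedResult cached = entries.get(checkId);
        if (cached == null) {
            return null;
        }
        if (nowMsec - cached.getFinishedAtMsec() >= cached.getTimeToLiveMsec()) {
            entries.remove(checkId); // expired: drop it so the check runs again
            return null;
        }
        return cached;
    }
}
```

A result could initially return the global default from getTimeToLiveMsec(), which would keep today's behaviour while leaving room for per-result values later.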

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Carsten Ziegeler
 Fix For: Health Check Core 1.0.8

 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-21-withExecutorResult.patch,
  SLING-3278-hc.webconsole-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-21.patch, 
 SLING-3278-more-explicit-use-of-constructor.patch, hc-it.patch


 Goals:
 * Be able to get an overall (aggregated) result as quickly as possible 
 (ideally 2sec)
 * Whenever possible, return most current results (e.g. for a memory check)
 * Provide a declarative way for async checks (async checks should be the 
 exception though) 
 Approach
 * Run checks in parallel
 * Make sure long running (or even stuck) checks are timed out
 * If a health check must run asynchronously (because its execution time 
 cannot be optimized), it should be enough to just specify a service property 
 (e.g. hc.async).
 See also
 http://apache-sling.73963.n3.nabble.com/Health-Check-Improvements-td4029330.html#a4029402
 http://apache-sling.73963.n3.nabble.com/Health-checks-execution-service-td4028477.html
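
The "run checks in parallel" and "time out long running checks" points of the approach above could be sketched with the standard java.util.concurrent API as follows. This is an illustration under assumptions, not the attached patch: ParallelCheckRunner, the String results and the "TIMED_OUT" marker are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical runner: each check is modelled as a Callable returning a status string. */
class ParallelCheckRunner {

    /** Runs all checks in parallel; a check that is not done in time yields "TIMED_OUT". */
    static List<String> runAll(List<Callable<String>> checks, long timeoutMsec) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // invokeAll waits at most timeoutMsec and cancels tasks that have not completed.
            List<Future<String>> futures = pool.invokeAll(checks, timeoutMsec, TimeUnit.MILLISECONDS);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    results.add("TIMED_OUT");
                    continue;
                }
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add("FAILED: " + e.getCause());
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running checks", e);
        } finally {
            pool.shutdownNow(); // interrupt any check that is still running
        }
    }
}
```

A real executor would reuse a long-lived thread pool instead of creating one per call; the per-call pool only keeps the sketch self-contained.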



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872036#comment-13872036
 ] 

Carsten Ziegeler commented on SLING-3278:
-

Thanks for your patch, I've just applied it.

We should at least cache for 1500ms - this was the time used within the jmx 
implementation; right now we have 2000ms, so I think this is fine.
In general I personally would even increase the caching time :)



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872059#comment-13872059
 ] 

Georg Henzler commented on SLING-3278:
--

OK, let's settle for 2000ms; the most important thing is that it is configurable :)



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872164#comment-13872164
 ] 

Bertrand Delacretaz commented on SLING-3278:


Any thoughts about my getTimeToLiveMsec() suggestion? See my comment above, from 
2 hours ago.



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872179#comment-13872179
 ] 

Carsten Ziegeler commented on SLING-3278:
-

I'm not sure how a getter method would help in having different TTLs. As a 
client of the executor service, when you get a result you usually do not care 
what the TTL of the check is. There is not much you can do with this value.



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872209#comment-13872209
 ] 

Bertrand Delacretaz commented on SLING-3278:


The executor needs it to set a per-result (or per-HC) time to live, see the 
"add resultCacheTtlInMs as property..." comment in HealthCheckResultCache. 

But actually looking at it again it's only HealthCheckMetadata that needs to 
provide that, it makes sense that this is linked to a particular HealthCheck 
instead of a particular result. So I think what we need is a 
HealthCheckMetadata.getResultTTLMsec() method, and if that returns zero the 
HealthCheckResultCache uses the globally configured TTL.





[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872234#comment-13872234
 ] 

Carsten Ziegeler commented on SLING-3278:
-

Got it, so we define a new service property for this, yes sounds good to me.
What about defining that a provided value of less than 1 means no caching at 
all?



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872246#comment-13872246
 ] 

Bertrand Delacretaz commented on SLING-3278:


We need two special values:

1. do not cache, can be zero?
2. use default TTL, can be less than zero? I'd make this the default value, if 
the HC service doesn't provide a value via a service property.



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-15 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873087#comment-13873087
 ] 

Georg Henzler commented on SLING-3278:
--

+1 for adding resultCacheTtlInMs as a service property + 
HealthCheckMetadata.getResultTTLMsec()

Regarding the special values: using zero for "do not cache" is good. For the 
second one, "use default TTL", we could maybe use the type Long (instead of 
long) and the value null (and an empty string in the console config UI). 
"Leave empty for using the global default" reads better in the documentation 
than "Use -1 for the global default".
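
The special values discussed here could be resolved roughly as follows. This is a sketch under assumptions: TtlPolicy and its method names are hypothetical, and treating negative values like zero follows the earlier "less than 1 means no caching" suggestion rather than anything decided in this thread.

```java
/** Hypothetical helper that resolves the per-check TTL property against the global default. */
class TtlPolicy {

    /**
     * Returns the effective TTL in msec.
     * null (property left empty) selects the global default, 0 means "do not cache".
     */
    static long effectiveTtlMsec(Long configuredTtlMsec, long globalDefaultMsec) {
        if (configuredTtlMsec == null) {
            return globalDefaultMsec;
        }
        // Assumption: negative values are treated like zero, i.e. no caching.
        return Math.max(0L, configuredTtlMsec);
    }

    /** A result is only worth caching when the effective TTL is positive. */
    static boolean isCacheable(Long configuredTtlMsec, long globalDefaultMsec) {
        return effectiveTtlMsec(configuredTtlMsec, globalDefaultMsec) > 0L;
    }
}
```

With a nullable Long the "use the default" case needs no magic number, which is the documentation benefit mentioned above.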



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2014-01-14 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870750#comment-13870750
 ] 

Carsten Ziegeler commented on SLING-3278:
-

As discussed in the mailing list, we switched back to execute(String) and 
merged the jmx implementation into the core in order to use the 
caching/executor service.

I think with the current implementation we're fine and can close this issue.

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Carsten Ziegeler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-21-withExecutorResult.patch,
  SLING-3278-hc.webconsole-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-21.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-31 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859422#comment-13859422
 ] 

Bertrand Delacretaz commented on SLING-3278:


I have tried to clarify the use cases, and also suggested API changes, at

https://cwiki.apache.org/confluence/display/SLING/Health+Checks+Executor+Design



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-27 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857487#comment-13857487
 ] 

Bertrand Delacretaz commented on SLING-3278:


Also, for my use cases I'll need a way to selectively clear cached results and 
specify a timeout when executing HCs. We can discuss this separately (started 
that in [1]), just wanted to mention it here so we don't forget.

[1] http://sling.markmail.org/thread/xg4k2pu4ii7xdgbw



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-24 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856225#comment-13856225
 ] 

Bertrand Delacretaz commented on SLING-3278:


I still think execute(ServiceReference) is an unnecessary leak of 
implementation details, will discuss it on the dev list.



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-23 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855715#comment-13855715
 ] 

Carsten Ziegeler commented on SLING-3278:
-

Thanks again Georg for your updated patch - I've committed a modified version 
in rev 1553133 in order to make going forward much easier. I left out the async 
stuff, used different names/signatures for the executor service and also did 
some minor code cleanups/changes.

Now, let's discuss things - as a first step I think we should focus on the 
API.
1) I renamed the runXX methods to execute - as the service is named Executor - 
and changed the result to CollectionHealthCheckResult.
2) I totally agree that there is rarely a need to call this executor from 
within one's own code, so JMX and the web console are the number one clients 
for this. Therefore I think we can go with an OSGi-free interface and directly 
use the service. The JMX code already has the service anyway, and doing it once 
in the web console code is not too hard either. So we don't make it harder for 
users, as they don't use this anyway :)
3) I removed the async execution for now. As Bertrand suggested, let's discuss 
this on the list separately and keep the focus on the executor service.
4) I think the HealthCheckResult interface is fine for now; we might need to 
tweak it a little bit before we close this issue.



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-23 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855763#comment-13855763
 ] 

Carsten Ziegeler commented on SLING-3278:
-

I've updated the implementation a little bit - the service.id is now used as a 
cache key. With this we don't need to hold any service reference objects (or 
anything else) anymore.
Therefore I now agree to use the ServiceReference as the single argument for 
executing a health check, so forget comment 2) from above. Without the service 
reference we don't have anything we could use as a cache key.
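
As a rough illustration of keying the cache by service.id: the sketch below uses a plain property map as a stand-in for what ServiceReference.getProperty("service.id") would return, so that it stays free of OSGi dependencies. ServiceIdKeyedCache and the String results are hypothetical and not the committed implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache keyed by the numeric service.id instead of the service reference. */
class ServiceIdKeyedCache {
    /** Name of the standard OSGi service property holding the unique service id. */
    static final String SERVICE_ID = "service.id";

    private final Map<Long, String> resultsByServiceId = new ConcurrentHashMap<>();

    /** Extracts the cache key; the map stands in for the properties of a ServiceReference. */
    static long keyOf(Map<String, Object> serviceProperties) {
        Object id = serviceProperties.get(SERVICE_ID);
        if (!(id instanceof Long)) {
            throw new IllegalArgumentException("Missing or non-numeric " + SERVICE_ID);
        }
        return (Long) id;
    }

    void put(Map<String, Object> serviceProperties, String result) {
        resultsByServiceId.put(keyOf(serviceProperties), result);
    }

    /** Returns the cached result for this service, or null if none is stored. */
    String get(Map<String, Object> serviceProperties) {
        return resultsByServiceId.get(keyOf(serviceProperties));
    }
}
```

Because only the Long id is stored, the cache holds no reference to the service or its ServiceReference, and two instances of the same factory component still get distinct entries.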



[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-20 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13854799#comment-13854799
 ] 

Georg Henzler commented on SLING-3278:
--

Find another patch attached that uses a bundle-private ExecutionResult. Changes 
to the existing API are really minimal now (the biggest change being an 
interface HealthCheckResult, but I think that's good). Other than that, from a 
HealthCheck implementor's point of view nothing has changed, as Result is 
still the class to construct and return. 

Other than that I left 
org.apache.sling.hc.api.HealthCheckExecutor.run(ServiceReference) with the 
service reference for now, as I believe it's still the cleanest way. Having had 
a closer look, neither name nor class name can be used, as they are not unique 
for factory components like ScriptableHealthCheck (only the service PID would 
be unique, and that's certainly something that is OSGi-specific itself, and a 
service reference would be needed on the client side to retrieve it).



 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-21-withExecutorResult.patch,
  SLING-3278-hc.webconsole-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-21.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-19 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852731#comment-13852731
 ] 

Georg Henzler commented on SLING-3278:
--

1) Maybe we should rather go with run(String fullyQualifiedClassname) then - 
although getting the service is easy, it's also easy to get it wrong. And it's 
just a waste to get the service if you don't need to (the executor does not get 
the service if there is a cache hit, there are async results, or a future is 
already running).
2) I agree we should get rid of the method setHealthCheckDescriptor(), but 
adding it as a constructor argument is not an easy option (the health check 
itself constructs the Result, but the descriptor is set by the executor later). 
If we add it as a constructor parameter we need a clone constructor 
Result(resultFromCheckItself, hcDescriptor, finishedDate, elapsedTime). Or we 
put the class Result behind an implementation class (that would be the nicer 
option IMHO); the problem with that is that new Result(...) is used directly in 
the checks and is part of the current API, so the clone constructor is probably 
still the better solution.
3) The descriptor currently contains the hc name and the tags - this meta 
information is useful in the UI (it is shown in the web console). Before my 
patch the service reference was used directly in the web console (using a lot 
more code, e.g. dealing with the fact that OSGi array props appear as a 
simple string if they contain only one element). Now IMHO it is cleanly 
separated and the HC meta information is available to the UI (without being 
tied to the OSGi API, also see your first point ;-)). Another option would be 
to copy name and tags as plain properties to the Result, but IMHO it is cleaner 
and more extensible to leave the constant meta data (not changing across 
multiple hc executions) in a separate class (it is also a really useful key 
class for the executor; if we got rid of it in the API we should at least leave 
it in the impl.executor package).
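
The cache-hit argument in point 1 can be sketched roughly as follows. This is 
an illustration only, not code from the patch: ResultCacheSketch, its run() 
method and the String results are hypothetical names, and the Supplier stands 
in for "get the service and execute the check".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch: a result cache keyed by a stable check id. */
final class ResultCacheSketch {

    /** A cached result together with the time it was stored. */
    private static final class Entry {
        final String result;
        final long storedAtMs;

        Entry(String result, long storedAtMs) {
            this.result = result;
            this.storedAtMs = storedAtMs;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMs;

    ResultCacheSketch(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    /**
     * Returns the cached result while it is younger than the TTL; otherwise
     * calls the supplier (assumed to look up the service and run the check)
     * and stores what it returns.
     */
    String run(String checkId, long nowMs, Supplier<String> execute) {
        Entry cached = cache.get(checkId);
        if (cached != null && nowMs - cached.storedAtMs < ttlMs) {
            return cached.result; // cache hit: the service is never fetched
        }
        String fresh = execute.get();
        cache.put(checkId, new Entry(fresh, nowMs));
        return fresh;
    }

    public static void main(String[] args) {
        ResultCacheSketch executor = new ResultCacheSketch(2000);
        int[] lookups = {0};
        Supplier<String> check = () -> {
            lookups[0]++; // counts how often the "service" was really used
            return "OK";
        };
        executor.run("memory", 0, check);
        executor.run("memory", 1000, check); // served from the cache
        executor.run("memory", 2500, check); // expired, executed again
        System.out.println("service lookups: " + lookups[0]);
    }
}
```

Whatever is used as the key (class name, name, descriptor) has to be unique 
per check; the sketch simply assumes such an id exists.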

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-19.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-19 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852761#comment-13852761
 ] 

Bertrand Delacretaz commented on SLING-3278:


I'm also for having an executor method that takes a single HealthCheck - 
getting the service is not expensive, I don't think that's a problem, and 
that's consistent with how we generally do things in Sling.

bq. We cannot use a created date in the Result constructor because the instance 
of the result is created by the implementing class...

It's created when the check is done executing, as the Result is immutable 
there's no other way. So I think storing the creation timestamp at that point 
is fine. If a health check starts at T, runs for 2 minutes and has a time to 
live of 1 minute, you want to kill it at T+3, not T+1. So IMO we should set the 
Result creation time in its constructor, have a method to set the time to live 
and an isExpired method that becomes true at T+3 in my example.
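
The T+3 example can be written down as a small sketch. ExpirySketch is a 
hypothetical class, not the actual Result API; the creation time is passed in 
explicitly here so the arithmetic can be checked, whereas a real constructor 
would presumably record the current time itself.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of a result that expires relative to its creation time. */
final class ExpirySketch {

    private final Instant createdAt;   // when the check finished and built the result
    private final Duration timeToLive; // how long the result may be served from the cache

    ExpirySketch(Instant createdAt, Duration timeToLive) {
        this.createdAt = createdAt;
        this.timeToLive = timeToLive;
    }

    /** True from createdAt + timeToLive onwards. */
    boolean isExpired(Instant now) {
        return !now.isBefore(createdAt.plus(timeToLive));
    }

    public static void main(String[] args) {
        Instant t = Instant.EPOCH;                          // check starts at T
        Instant finished = t.plus(Duration.ofMinutes(2));   // it runs for 2 minutes
        ExpirySketch result = new ExpirySketch(finished, Duration.ofMinutes(1));

        System.out.println(result.isExpired(t.plus(Duration.ofSeconds(150))));
        System.out.println(result.isExpired(t.plus(Duration.ofMinutes(3))));
    }
}
```

With these numbers the first call prints false (T+2:30) and the second prints 
true (T+3), which is the behaviour described above.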

In the meantime I've thought about a fluent API that looks better for the 
executor, and started a thread on our dev list to discuss that ([RT] Fluent API 
for HealthCheckExecutor). That would have little impact on the core of your 
patch, but would provide a more flexible API, including per-call execution 
timeouts and the ability to clear some or all cached results at will.

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-19.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-19 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852762#comment-13852762
 ] 

Carsten Ziegeler commented on SLING-3278:
-

1) The class name is an implementation detail, no one knows it - I can live with 
it if we go with the health check name, so if I want to execute a hc via the 
executor, it has to have a name or a tag. Deal?
2) The Executor could return an ExecutionResult (or maybe there is a better 
name), containing the additional information and as a field the result
3) Ok, I'm fine with having this descriptor as long as it is a simple data 
object not containing any complex objects like a ServiceReference. The idea 
behind all of this is to make this serializable and be able to serialize it to 
a remote machine. As long as these are data objects, everything is fine, but a 
ServiceReference can never be transferred

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-19.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-19 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852998#comment-13852998
 ] 

Georg Henzler commented on SLING-3278:
--

Re "Fluent API" as well as "The class name is implementation detail, no one 
knows it": I think we have a disconnect in how we think the health check 
executor could/should be used. I think implementation projects really only want 
to be able to add checks and run them via web console or JMX; they would never 
want to execute the checks in a custom fashion other than configuring timeouts 
or restricting it to certain tags (also see mailing list post).

@Carsten / 1): I think you need the run(hc-class) mainly for the JMX module: 
Don't you have access to the class name via the property component.name there? 
The problem with the name is that it may contain spaces and is therefore not 
really a nice id to use as a parameter.

2) The ExecutionResult is a good idea - that way all timing data can go there. 
Also, ExecutionResult should be an interface and ExecutionResultImpl can be 
hidden in the impl.executor package - that way ExecutionResultImpl does not 
have to be immutable and the timing data can be collected correctly (also 
solving Bertrand's concerns).

3) If we create the interface ExecutionResult, we can hide the existence of the 
HealthCheckDescriptor and move it to impl.executor. The field serviceReference 
can be marked transient and is thereby excluded from serialization (if 
that is ever really needed; I think probably not).
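
Points 2 and 3 could look roughly like the sketch below. All names are 
assumptions for illustration; in particular a plain Object stands in for the 
OSGi ServiceReference so that the example runs without any OSGi dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical sketch: public interface, mutable bundle-private implementation. */
final class ExecutionResultSketch {

    /** What API clients would see: read-only accessors. */
    interface ExecutionResult {
        String getCheckName();
        boolean isOk();
        long getElapsedTimeMs();
    }

    /** What the executor would use internally; it may set timing data after construction. */
    static final class ExecutionResultImpl implements ExecutionResult, Serializable {
        private static final long serialVersionUID = 1L;

        private final String checkName;
        private final boolean ok;
        private long elapsedTimeMs;
        // Stand-in for the ServiceReference: transient, so it is skipped on serialization.
        private transient Object serviceReference;

        ExecutionResultImpl(String checkName, boolean ok, Object serviceReference) {
            this.checkName = checkName;
            this.ok = ok;
            this.serviceReference = serviceReference;
        }

        void setElapsedTimeMs(long elapsedTimeMs) { this.elapsedTimeMs = elapsedTimeMs; }
        Object getServiceReference() { return serviceReference; }

        @Override public String getCheckName() { return checkName; }
        @Override public boolean isOk() { return ok; }
        @Override public long getElapsedTimeMs() { return elapsedTimeMs; }
    }

    /** Serializes and deserializes the result in memory. */
    static ExecutionResultImpl roundTrip(ExecutionResultImpl in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream oin =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ExecutionResultImpl) oin.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutionResultImpl result = new ExecutionResultImpl("memory", true, new Object());
        result.setElapsedTimeMs(42);
        ExecutionResultImpl copy = roundTrip(result);
        System.out.println(copy.getCheckName() + " " + copy.getElapsedTimeMs()
            + " ref=" + copy.getServiceReference());
    }
}
```

After the round trip the data fields survive while the transient reference 
comes back as null, so a receiver on another machine would only ever see the 
plain data.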



 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-19.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-18 Thread Bertrand Delacretaz (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851862#comment-13851862
 ] 

Bertrand Delacretaz commented on SLING-3278:


Competition is good! That's how we get excellent code ;-)

Thanks Georg for your revised patch. I see your point about parallel execution 
of the HC and agree that it's a good thing; that's the cause of some 
additional complexity w.r.t. my variant, but it's worth it.

Here are my comments in no particular order. 

I agree with the reformatting issue that Carsten mentions.

IMO the Result timing information should be: 
* Result creation time, set automatically in constructor
* Optional time to live, can be set with a method but not changed after that
* Optional HC execution elapsed time, can be set with a method but not changed 
after that
* isExpired() method that uses creation time + time to live

We do not use @author tags, as is customary in Apache projects. You will get 
credit in the commit message but we don't want code to appear like it belongs 
to specific people, especially as over time this changes.

The it tests fail, I'll attach a patch that fixes them.

I'd still like to remove the async execution from this patch, and rediscuss on 
list. My use case would be to execute some HC at regular intervals based on 
their tags, instead of based on individual HC configurations. This can then be 
implemented in the support bundle.

We need to keep the CompositeHealthCheck as we already released it, I agree 
that it can lead to executing some health checks several times if you don't 
choose tags wisely. That's not really a problem, and even less with this 
improved execution mechanism as results are cached. It shouldn't be hard to 
adapt CompositeHealthCheck to use the new executor.

The maven-sling-plugin shouldn't be added to the pom.xml in this patch - the 
Sling parent pom has an autoInstallBundle profile which does that already.

The SlowHealthCheck demo HC from my patch should be included once we apply your 
patch, with its config as that's a useful demo for caching results and 
execution timeout.


 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-18.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
 SLING-3278-hc.webconsole-2013-12-18.patch, SLING-3278-hc.webconsole-v0.5.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-18 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852213#comment-13852213
 ] 

Georg Henzler commented on SLING-3278:
--

Competition in general is not that bad, I agree :)

Find attached my patch that addresses the comments from above:
* Removed author tags, reformatted lines and the sling plugin from the pom
* I added the method Result run(ServiceReference healthCheckReference) to 
execute a single health check with caching/timeout checks in place. The 
parameter is not of type HealthCheck as this would push the responsibility for 
getting/ungetting the service to the user of the interface
* I tried to not use HealthCheckDescriptor in the result (and make it private to 
the bundle by moving it to impl.executor)... however, I think the code gets 
worse then (an extra map would be needed to keep references by descriptor). In 
general I think it's good design to have the descriptor: It is a different type 
of attribute and for the user it is immediately clear that these attributes 
won't change over time. Also ServiceReferences can safely be cached and reused 
(as opposed to the service class itself) and HealthCheckDescriptor is immutable.
* Timing information: We cannot use a created date in the Result constructor 
because the instance of the result is created by the implementing class and 
there is no guarantee that new Result(..) is called at the beginning of a 
check (rather it will normally be called at the very end!). 



 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-18.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
 SLING-3278-hc.webconsole-2013-12-18.patch, 
 SLING-3278-hc.webconsole-v0.5.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-18 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852634#comment-13852634
 ] 

Carsten Ziegeler commented on SLING-3278:
-

Thanks for the updated patch. I think we're making good progress!
Now, some comments:
- run(ServiceReference) looks convenient but creates an API that is tied to 
the OSGi API - we should avoid that. Getting a service is fairly easy - and by 
passing in the service object we ensure that the client is allowed/able to get 
the service.
- Result must be immutable - that's the contract we had before and we have to 
keep it. So no setter methods - if the constructor approach gets ugly I suggest 
a builder-based approach, like createResult(log).name(bla)...build();
- I don't think we need the health check descriptor - so far I don't see a need 
for this information in client code. And if there is one, why not simply copy 
the information from the service reference into an immutable data object? But I 
would go without this.
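
To make the builder suggestion concrete, here is a minimal sketch. It is not 
the existing Result class: the Status values, field names and the builder() 
factory are assumptions chosen for illustration.

```java
/** Hypothetical sketch: an immutable result assembled through a builder. */
final class ResultBuilderSketch {

    enum Status { OK, WARN, CRITICAL }

    /** Immutable once built: only final fields and getters, no setters. */
    static final class Result {
        private final Status status;
        private final String name;
        private final long elapsedTimeMs;

        private Result(Builder b) {
            this.status = b.status;
            this.name = b.name;
            this.elapsedTimeMs = b.elapsedTimeMs;
        }

        Status getStatus() { return status; }
        String getName() { return name; }
        long getElapsedTimeMs() { return elapsedTimeMs; }

        /** Entry point, in the spirit of createResult(...).name(...).build(). */
        static Builder builder(Status status) {
            return new Builder(status);
        }
    }

    /** Collects optional values; the mutable state lives here, not in Result. */
    static final class Builder {
        private final Status status;
        private String name = "";
        private long elapsedTimeMs;

        private Builder(Status status) { this.status = status; }

        Builder name(String name) { this.name = name; return this; }
        Builder elapsedTimeMs(long ms) { this.elapsedTimeMs = ms; return this; }
        Result build() { return new Result(this); }
    }

    public static void main(String[] args) {
        Result r = Result.builder(Status.WARN).name("disk space").elapsedTimeMs(12).build();
        System.out.println(r.getName() + " " + r.getStatus() + " " + r.getElapsedTimeMs());
    }
}
```

The point of the pattern is that extra optional values can be added later as 
builder methods without growing the constructor or adding setters to Result.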


 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-19.patch, 
 SLING-3278-hc.webconsole-2013-12-19.patch, hc-it.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-17 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851218#comment-13851218
 ] 

Georg Henzler commented on SLING-3278:
--

Hi Bertrand, 

now we have two competing implementations :( I was hoping you would base your 
work on my patch or just give feedback. 

So I have an improved version of the patch that makes as few changes as 
possible (but still is more capable than the version from your patch):

* HealthCheckExecutor.runAllForTags() acts as a facade to the user interface; 
implementation details (timeouts, hc lookup etc.) are handled in the executor, 
and the user of the interface does not need to know them. Also the design is 
cleaner, as in the web console you don't have to read service reference 
properties (the results have the property HealthCheckDescriptor)
* The tests are truly run in parallel and the latest results are returned (as 
HealthCheckExecutorImpl.runAllForTags(String...) waits for the futures)
* Caching is in place (configurable, 2sec default)
* Async execution can easily be achieved by specifying a property (this could 
easily be taken out if necessary)

In contrast, your patch has the following disadvantages:
* The checks are somewhat triggered for parallel execution, but usually you 
receive the result from the last call (if the last call to the check is 2 hours 
in the past, then the result will be 2h old). I really think the 
HealthCheckExecutor needs to be the broker for futures in order to be able to 
achieve the goals as stated in the issue description (e.g. look at 
HealthCheckExecutorImpl.waitForFuturesRespectingTimeout() for what you cannot 
achieve with your design)
* The HealthCheckExecutor is only capable of running one check at a time - I 
believe the main use case for the HC in general is to run all tests and get a 
current system health as quickly as possible. No matter what the actual 
implementation ends up looking like, I would really like to see the signature 
Set<Result> runAllForTags(String... tags) in the interface HealthCheckExecutor 
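
The "broker for futures" idea can be sketched with java.util.concurrent as 
below. This is not the code of HealthCheckExecutorImpl; ParallelRunSketch, 
runAll() and the String results are hypothetical, and the checks are plain 
Callables instead of HealthCheck services.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: run all checks in parallel and wait with one overall timeout. */
final class ParallelRunSketch {

    static Map<String, String> runAll(Map<String, Callable<String>> checks, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, checks.size()));
        try {
            // Submit everything first so that all checks start at (roughly) the same time.
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<String>> e : checks.entrySet()) {
                futures.put(e.getKey(), pool.submit(e.getValue()));
            }

            // One shared deadline: each wait only uses the time that is left.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
                long remaining = Math.max(0, deadline - System.nanoTime());
                try {
                    results.put(e.getKey(), e.getValue().get(remaining, TimeUnit.NANOSECONDS));
                } catch (TimeoutException ex) {
                    e.getValue().cancel(true); // stuck or slow check: give up on it
                    results.put(e.getKey(), "TIMED_OUT");
                } catch (ExecutionException ex) {
                    results.put(e.getKey(), "ERROR: " + ex.getCause());
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    results.put(e.getKey(), "INTERRUPTED");
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> checks = new LinkedHashMap<>();
        checks.put("memory", () -> "OK");
        checks.put("slow", () -> { Thread.sleep(5000); return "OK"; });
        System.out.println(runAll(checks, 300));
    }
}
```

A real executor would additionally consult the cache before submitting and 
would reuse a future that is still running, as discussed in this thread.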

My latest patch is attached, and I think it's better to use that version as a 
base for further work.






 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-18.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
 SLING-3278-hc.webconsole-2013-12-18.patch, SLING-3278-hc.webconsole-v0.5.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-17 Thread Carsten Ziegeler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851397#comment-13851397
 ] 

Carsten Ziegeler commented on SLING-3278:
-

Hi, thanks for your patch - unfortunately it also reformats a lot of the 
existing code, could you please redo the patch without reformatting?

I'm missing a method to execute a single specific health check, the advantage 
of the executor in that case should be that it takes care of caching etc. even 
for a single check

The Result has a reference to a new api class HealthCheckDescriptor which in 
turn holds a service object - we should avoid this and keep the result as a 
simple data object

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: SLING-3278-bertrand.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-2013-12-18.patch, 
 SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
 SLING-3278-hc.webconsole-2013-12-18.patch, SLING-3278-hc.webconsole-v0.5.patch




[jira] [Commented] (SLING-3278) Provide a HealthCheckExecutor service

2013-12-12 Thread Georg Henzler (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846328#comment-13846328
 ] 

Georg Henzler commented on SLING-3278:
--

The property for async execution can make sense when you want to make 
sure a check is executed less often than the health check is called (e.g. only 
twice a day).

I'm pretty much done; No 2 of Bertrand's list and unit tests are still missing. 
If you like, you can have a look at the patches to give feedback before I 
submit a final one.

Impl Notes:
* The main entry method is 
org.apache.sling.hc.core.executor.HealthCheckExecutor.runAllForTags(String...)
* Results now have a HealthCheckDescriptor that contains meta info for the 
check (also used in the executor as cache key etc.) 
* Async is supported by attribute hc.async.cronExpression, a service listener 
is in place for registering/unregistering of jobs 
(org.apache.sling.hc.core.executor.AsyncHealthCheckExecutor)
* I did add a natural order to results (failed tests first, then by name 
alphabetically) - if not using this the order would be arbitrary (depending on 
execution time)
* The result has an additional finishDate and elapsedTime (I think finish date 
is more interesting for caching than the start date!)
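
The natural order mentioned in the notes above (failed first, then by name) 
could be expressed as a comparator along these lines. NamedResult and the 
boolean ok flag are simplifications for illustration, not the types from the 
patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: order results with failed checks first, then by name. */
final class ResultOrderSketch {

    static final class NamedResult {
        final String name;
        final boolean ok;

        NamedResult(String name, boolean ok) {
            this.name = name;
            this.ok = ok;
        }
    }

    /** Failed results sort before successful ones; ties are broken alphabetically. */
    static final Comparator<NamedResult> FAILED_FIRST_THEN_NAME = (a, b) -> {
        if (a.ok != b.ok) {
            return a.ok ? 1 : -1;
        }
        return a.name.compareTo(b.name);
    };

    /** Returns the names of the given results in display order. */
    static List<String> sortedNames(List<NamedResult> results) {
        List<NamedResult> copy = new ArrayList<>(results);
        copy.sort(FAILED_FIRST_THEN_NAME);
        List<String> names = new ArrayList<>();
        for (NamedResult r : copy) {
            names.add(r.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<NamedResult> results = new ArrayList<>();
        results.add(new NamedResult("memory", true));
        results.add(new NamedResult("disk", false));
        results.add(new NamedResult("cpu", true));
        results.add(new NamedResult("bundles", false));
        System.out.println(sortedNames(results)); // [bundles, disk, cpu, memory]
    }
}
```

Because the order no longer depends on which future finishes first, repeated 
calls render the same list in the UI.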

Other thoughts (not in patch):
* I'm not sure if the CompositeHealthCheck makes sense - is this not a grouping 
competing with the tags? It is easy to configure it in a way that some checks 
are executed twice, especially if you run all checks without giving a tag (and 
the HealthCheckExecutor cannot prevent it as the CompositeHealthCheck looks 
like any other check to it)
* Exceptions: The result should be able to carry an exception - I would even go 
as far as adding throws Exception to the execute() signature (this would not 
break any existing implementation classes) and generically add a last critical 
log if the HC happens to throw an exception
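
A rough sketch of that last idea, with hypothetical types (ThrowingCheck and 
Outcome are not part of the API): the caller wraps execute() and turns any 
exception into a critical outcome that carries the exception.

```java
/** Hypothetical sketch: turn an exception thrown by a check into a critical outcome. */
final class ExceptionCaptureSketch {

    /** Stand-in for a health check whose execute() may throw. */
    interface ThrowingCheck {
        String execute() throws Exception;
    }

    /** Simplified outcome: a status, a message and the exception, if any. */
    static final class Outcome {
        final String status;
        final String message;
        final Exception exception;

        Outcome(String status, String message, Exception exception) {
            this.status = status;
            this.message = message;
            this.exception = exception;
        }
    }

    /** Runs the check; an exception becomes a final CRITICAL entry instead of propagating. */
    static Outcome runSafely(ThrowingCheck check) {
        try {
            return new Outcome("OK", check.execute(), null);
        } catch (Exception e) {
            return new Outcome("CRITICAL", "Health check threw: " + e, e);
        }
    }

    public static void main(String[] args) {
        Outcome fine = runSafely(() -> "all good");
        Outcome failed = runSafely(() -> { throw new IllegalStateException("boom"); });
        System.out.println(fine.status + " / " + failed.status + ": " + failed.message);
    }
}
```

With something like this in the executor, individual checks would not need 
their own try/catch just to report an unexpected failure.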

 Provide a HealthCheckExecutor service
 -

 Key: SLING-3278
 URL: https://issues.apache.org/jira/browse/SLING-3278
 Project: Sling
  Issue Type: New Feature
  Components: Health Check
Reporter: Georg Henzler
Assignee: Georg Henzler
 Attachments: 
 SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
 SLING-3278-hc.webconsole-v0.5.patch

