Re: [tc-dev] [tc-users] [tc-forge-dev] Cluster Events update

Taylor Gautier Mon, 19 Jan 2009 08:35:43 -0800

Yes, agreed, this effort is geared entirely toward application usage, not 
monitoring.



----- Original Message ----- 
From: "Geert Bevin" <gbe...@terracottatech.com> 
To: tc-us...@lists.terracotta.org 
Cc: "tc-forge-dev" <tc-forge-...@lists.terracotta.org>, "tc-dev" 
<tc-dev@lists.terracotta.org> 
Sent: Monday, January 19, 2009 4:45:22 AM GMT -08:00 US/Canada Pacific 
Subject: Re: [tc-dev] [tc-users] [tc-forge-dev]  Cluster Events update 

We sort of started migrating away from monitoring during our latest   
discussion about cluster events since there's already an overlap in   
functionality between the JMX API that the admin console uses and the   
cluster events that are currently being sent out. We were thinking of   
gearing the new cluster events towards the programmer of the   
application, i.e. something that is used through a Java API and   
annotations from within a DSO application that is part of the cluster.   
We can then independently clean up the admin console JMX API for   
public consumption and have two clearly distinct approaches without   
overlap. 

On 16 Jan 2009, at 20:31, Kunal wrote: 

> This would be generic monitoring and would benefit all use cases to   
> ease monitoring in production, IMHO. 
> 
> Kunal. 
> 
> 
> 
> From: Steven Harris <st...@terracottatech.com> 
> Date: Fri, 16 Jan 2009 11:00:37 -0800 
> To: Kunal Bhasin <kbha...@terracottatech.com> 
> Cc: Untitled 5 <tc-us...@lists.terracotta.org>, tc-forge-dev 
> <tc-forge-...@lists.terracotta.org 
> >, tc-dev <tc-dev@lists.terracotta.org> 
> Subject: Re: [tc-forge-dev] [tc-users] Cluster Events update 
> 
> For which use case? 
> 
> 
> Cheers, 
> Steve Harris 
> "Terracotta.  It's ten pounds of awesome in a five pound sack. 
> <http://www.miketec.org/serendipity/index.php?/archives/7-Oracle-and-Postgres-Redux.html>" 
> 
> 
> 
> 
> 
> On Jan 16, 2009, at 10:57 AM, kbha...@terracottatech.com wrote: 
> 
>> I would also recommend adding notifications for the L2 processes   
>> starting and stopping. At least broadcasting the status of the L2   
>> based on L2 group comm would be a good start. 
>> 
>> Also, sending the ip/hostname of the machine along with the   
>> clientid would be useful. 
>> 
>> Kunal 
>> --Sent while mobile-- 
>> 
>> On Jan 16, 2009, at 9:04 AM, Taylor Gautier <tgaut...@terracottatech.com 
>> > wrote: 
>> 
>>> For the upcoming release of Terracotta, we are considering   
>>> changing the eventing mechanism.  So far, our design discussions   
>>> have identified some core use cases, and we think we have   
>>> identified the general strategy.  I'd love to hear any comments   
>>> from our users on this new direction. 
>>> 
>>> ============ DRAFT REQUIREMENTS/DESIGN FOR CLUSTER EVENTS =================== 
>>> MOTIVATION 
>>> =================================================== 
>>> 1) The current eventing mechanism has race conditions 
>>> 
>>> 2) The implementation is slightly confusing: a disconnected event   
>>> reaches a node on two occasions. a) If a node is disconnected,   
>>> that node gets a this.disconnected event. This event can never   
>>> have perfect knowledge, and the node may reconnect at some time.   
>>> b) If another node is quarantined (never to rejoin), the node gets   
>>> a nodeX disconnected event. 
>>> 
>>> Since these two events are named similarly, they overload the   
>>> meaning of "disconnected". In the first case the connection has   
>>> been severed, but may be restored, while the second is an absolute   
>>> measure of the membership of the cluster - the server sent the   
>>> message so it is definitive. 
>>> 
>>> USE CASES 
>>> =================================================== 
>>> 1) Client needs to change behavior when TC is no longer able to   
>>> service operations - e.g. kill itself 
>>> 
>>> 2) Map evictor use case 
>>>   - needs to know when a node has left the system. 
>>>   - needs to query the system to know from a list of objects what   
>>> objects are not faulted into any client (it is accepted that this   
>>> query is async and the response is guaranteed to be out of date) 
>>> 
>>> 3) Clustered async use case 
>>>   - needs to know if a node has left the system. 
>>>   - needs to query the system to know from a list of objects which   
>>> ones are "orphaned" - e.g. are no longer accepted (this may be   
>>> identical to 2a) 
>>> 
>>> 4) Master/Worker 
>>>   - needs to know when a node has joined the system to re-balance   
>>> work across all nodes (although this can easily be coded with   
>>> wait/notify) 
>>>   - needs to know when a node (and which node) has left the system   
>>> to re-balance work from that node across remaining nodes 
>>> 
>>> 5) Location Aware Cache 
>>>   - needs to execute work on the node where an object is faulted 
>>>   - needs to query the system about where an object, or a list of   
>>> objects, is faulted 
>>> 
>>> NOTE: 
>>> The use case of switching to local data could be construed as #1,   
>>> but it is much more involved, so while someone could use the   
>>> solution for Use Case #1, we aren't specifically targeting that   
>>> capability. 
>>> 
>>> SUGGESTED SOLUTIONS 
>>> =================================================== 
>>> Roughly the following "things" seem to solve the use cases: 
>>> 
>>> 1) Topology Change Events 
>>>   - node joined ? - no "real" use case for it - regular code   
>>> techniques can be used 
>>>   - node left 
>>> 
>>> 2) Cluster Operational Events 
>>>   - tc operations are enabled 
>>>   - tc operations are disabled 
>>> 
>>> 3) Data Aware Information 
>>>   - a list of nodes where an object is faulted 
>>>   - a list of lists of nodes where a list of objects is faulted 
>>>   - out of this list of objects, which ones are not faulted anywhere 
>>> 
>>> ============ DRAFT IMPLEMENTATION THOUGHTS =================== 
>>> 
>>> At today's meeting we decided that the use cases, events and   
>>> data aware operations are sufficient and appropriate. 
>>> 
>>> We also decided to focus on a non-JMX API, because JMX: 
>>> * artificially introduces a whole infrastructure that is not   
>>> needed and that is leaky 
>>> * makes it more difficult for the developer to integrate 
>>> * is not a compile-time API, which means less safety 
>>> 
>>> We're going to design a POJO API with compile-time safety that is   
>>> based on dependency injection. The first idea is to annotate a   
>>> field as being a 'DSOClusterUtil' (or whatever name). This would   
>>> then cause DSO to inject a local instance of that utility class,   
>>> providing the developers with API methods to perform listener   
>>> registration and data locality inspection. 
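
To make the annotated-field idea concrete, here is a minimal sketch that uses plain reflection in place of whatever DSO would actually do. The annotation name (InjectedClusterUtil), the ClusterUtil interface with its localNodeId method, WorkBalancer and SketchInjector are all assumptions made up for illustration; the thread itself leaves the name and the API methods open.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical marker annotation; the real name was still undecided in the thread.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectedClusterUtil {}

// Hypothetical utility type standing in for 'DSOClusterUtil'.
interface ClusterUtil {
    String localNodeId();
}

// Application class: it declares the dependency but never constructs it.
class WorkBalancer {
    @InjectedClusterUtil
    private ClusterUtil cluster;

    String describe() {
        return "running on " + cluster.localNodeId();
    }
}

// Reflection-based stand-in for the injection step (an assumption, not DSO's mechanism).
class SketchInjector {
    static void inject(Object target, ClusterUtil util) {
        for (Field field : target.getClass().getDeclaredFields()) {
            boolean marked = field.isAnnotationPresent(InjectedClusterUtil.class);
            boolean compatible = field.getType().isAssignableFrom(ClusterUtil.class);
            if (marked && compatible) {
                try {
                    field.setAccessible(true); // the field is private in WorkBalancer
                    field.set(target, util);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("could not inject " + field.getName(), e);
                }
            }
        }
    }
}
```

Because WorkBalancer refers to ClusterUtil by type, a misspelled method name fails at compile time, which is the kind of safety the JMX route was said to lack.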
>>> 
>>> _______________________________________________ 
>>> tc-users mailing list 
>>> tc-us...@lists.terracotta.org 
>>> http://lists.terracotta.org/mailman/listinfo/tc-users 
>> _______________________________________________ 
>> tc-forge-dev mailing list 
>> tc-forge-...@lists.terracotta.org 
>> http://lists.terracotta.org/mailman/listinfo/tc-forge-dev 
> 
> 
> _______________________________________________ 
> tc-users mailing list 
> tc-us...@lists.terracotta.org 
> http://lists.terracotta.org/mailman/listinfo/tc-users 

-- 
Geert Bevin 
Terracotta - http://www.terracotta.org 
Uwyn "Use what you need" - http://uwyn.com 
RIFE Java application framework - http://rifers.org 
Flytecase Band - http://flytecase.be 
Music and words - http://gbevin.com 

_______________________________________________ 
tc-dev mailing list 
tc-dev@lists.terracotta.org 
http://lists.terracotta.org/mailman/listinfo/tc-dev
