Updates:
        Labels: -Version-2.0.00 -Release-Type-Candidate Version-2.1.00

Comment #7 on issue 940 by EMMartins: Electors covering different cache configuration + covering different conf in listener
http://code.google.com/p/mobicents/issues/detail?id=940

The buddy group config still has a lot of grey zones, with JBoss Cache behavior
that is unexpected or not logically understandable, so its inclusion in Mobicents
JAIN SLEE 2.x is postponed to version 2.1.

Here is a log of the current state:

[16:20] <baranowb> lets consider two scenarios
[16:20] <baranowb> 1. no gravitation
[16:21] <baranowb> in this scenario each cache node acts as standalone, so from the point
of view of a cache user there is no HA, only FT - nodes dont see data of others
[16:21] <baranowb> Ive told YOu about it, thought there is a way to overcome this.
[16:22] <martins> how can you have ft
[16:22] <martins> if nothin is replicated
[16:22] <baranowb> no,
[16:22] <baranowb> see
[16:22] <baranowb> lets say we have two nodes
[16:22] <baranowb> N_1 and N_2
[16:23] <baranowb> each has
[16:23] <baranowb> data
[16:23] <baranowb> in /ac,/timers, bla bla bal
[16:23] <baranowb> right?
[16:23] <martins> y
[16:24] <baranowb> but cahce structure looks like
[16:24] <baranowb> damn
[16:24] <baranowb> _/ac
[16:24] <baranowb> _/timers
[16:24] <baranowb> _/_BUDDY_BACKUP_/N_#/ac
[16:24] <baranowb> _/_BUDDY_BACKUP_/N_#/timers
[16:24] <baranowb> now on N_1 we dont see part thats in _BACKUP_
[16:25] <baranowb> that is, we dont see N_2 data, the same applies to view from N_2
[16:25] <baranowb> following ?
[16:25] <martins> y
[16:25] <baranowb> ok
[16:25] <baranowb> what happens on N_1 failure is that
[16:26] <baranowb> N_2 takes ownership of this data in  _/_BUDDY_BACKUP_/N_1
[16:26] <baranowb> and now its visible
[16:26] <martins> right
[16:26] <baranowb> and everythign is recreated and works
[16:27] <martins> ok, so why you don't have HA
[16:28] <baranowb> ok, consider, that with cluster wide replciation
[16:28] <baranowb> if we remove something from _/ac
[16:28] <baranowb> it is removed from each node
[16:29] <baranowb> with case above each node is considered as data owner, and only
owning node can remove
[16:30] <martins> correct
[16:30] <baranowb> "can" - its not entirely true
[16:30] <baranowb> so see
[16:30] <baranowb> N_1 creates service, acs, timers
[16:30] <baranowb> now we fire on some ac in other container
[16:31] <baranowb> ac is not in _/ac
[16:31] <martins> ok, I think I understand your misunderstanding :)
[16:31] <baranowb> so we would have to lookup _BACKUP_ and move it to _/
[16:31] <baranowb> lol
[16:31] <martins> that is not "no HA"
[16:31] <baranowb> so what did I mis :)
[16:31] <martins> you have HA
[16:32] <martins> because HA is simply load balancing
[16:32] <baranowb> ach, ok, I was thinkiing more of "create on 1, fire to 2"
[16:32] <martins> what you don't have is support for "loose" load balancing
[16:32] <martins> you need afinity
[16:32] <baranowb> ok, thats ql, and this will work
[16:33] <baranowb> so if this is what we want, than lets move to second scenario Ive
been trying to make work
[16:33] <martins> but that was expected right from the beggining, that is why there
is also gravitation option
[16:33] <martins> right? :)
[16:34] <baranowb> y, had something else in mind by HA term than :)
[16:34] <baranowb> ok about case #2
[16:34] <martins> ok, so that one works as you explain, it just requires balanicng
affinity to support fail over?
[16:34] <baranowb> y
[16:34] <martins> ok
[16:34] <baranowb> and everything works like a charm
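
Scenario 1 as described above (buddy replication without gravitation) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: it is not JBoss Cache code, and `Node`, `put`, `visible` and `take_over` are hypothetical names chosen for the sketch.

```python
# Toy model of scenario 1 (buddy replication, no data gravitation).
# Not JBoss Cache code: Node, put, visible and take_over are hypothetical names.

class Node:
    def __init__(self, name):
        self.name = name
        self.root = set()    # fqns this node owns, e.g. "/ac/a1"
        self.backup = {}     # owner name -> fqns held under /_BUDDY_BACKUP_/<owner>
        self.buddies = []

    def put(self, fqn):
        """Write locally and replicate into each buddy's backup subtree."""
        self.root.add(fqn)
        for buddy in self.buddies:
            buddy.backup.setdefault(self.name, set()).add(fqn)

    def visible(self, fqn):
        """Without gravitation a lookup only sees the node's own root."""
        return fqn in self.root

    def take_over(self, failed_name):
        """On buddy failure, move the failed node's backup into the local root."""
        self.root |= self.backup.pop(failed_name, set())


n1, n2 = Node("N_1"), Node("N_2")
n1.buddies, n2.buddies = [n2], [n1]

n1.put("/ac/a1")
n1.put("/timers/t1")

# "create on 1, fire to 2" misses: N_2 holds the data only as backup.
print(n2.visible("/ac/a1"))   # False

# N_1 fails, N_2 takes ownership and the data becomes visible.
n2.take_over("N_1")
print(n2.visible("/ac/a1"))   # True
```

The first lookup failing is the reason load-balancing affinity is needed in this scenario: until a failover happens, only the creating node can see its own data.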
[16:34] <martins> now with gravitation
[16:34] <baranowb> ok, with gravitation it sucks :)
[16:35] <baranowb> it sucks at point where new buddy joins
[16:35] <baranowb> data ownwership gets screwed
[16:35] <baranowb> totaly
[16:35] <baranowb> consider scenario
[16:35] <baranowb> N_1,N_2, N_3
[16:35] <baranowb> N_1, fails, N_2, takes over, as expected,
[16:36] <baranowb> ok, at this point, N_2 is owner, has running timers, whatever
[16:37] <baranowb> we can still fire on N_3, and it works, N_3 just has some entries
for new acs _/ac's
[16:37] <baranowb> this is quite good imho
[16:37] <baranowb> but now consider taht N_1 rejoins
[16:37] <baranowb> now everything gets screwed due to gravitation
[16:38] <baranowb> N_1 cache takes over all data from buddy group, I mean totaly
everything
[16:38] <baranowb> no matter how I set overrides, invoce CacheData init
[16:38] <martins> so it gets its data back
[16:38] <martins> ?
[16:38] <baranowb> it takes everything
[16:38] <martins> what is everything
[16:39] <baranowb> everything that is in buddy group
[16:39] <baranowb> so it gets data from N_2 and N_3
[16:39] <baranowb> leaving their cache _/ empty
[16:39] <martins> wow
[16:40] <martins> that sounds like a bad config or a bug
[16:40] <baranowb> posted on jbc about it(not sure why timestamp shows 2 days ago,
when forums did not work)
[16:40] <baranowb> http://community.jboss.org/thread/85420
[16:41] <baranowb> 1.st there is some page in wiki which says that with gravitation
cache should have structure like
[16:41] <baranowb> _/node_address/ac
[16:41] <baranowb> found it by accident, dunno why its not in user guide
[16:41] <baranowb> 2. nd thing, if cache is set to local mode ||
overrideoption.setSkipDataGravitation(true)
[16:41] <baranowb> it should not happen
[16:42] <baranowb> but somewhere in debug log I see that overrideoption are cleaned
and everything is fetched
[16:43] <martins> 1. useless
[16:43] <baranowb> y, we should be able to work with override options
[16:44] <martins> I can't quickly understand your post
[16:44] <baranowb> and its the same thing as having data in backup without gravitation
[16:44] <martins> you are complaining that once N2 starts N1 gets its data ?
[16:44] <martins> in /BACKUP_...
[16:44] <martins> ?
[16:45] <baranowb> hmm, maybe I should rephrase it than, its the same scenario
[16:45] <martins> you need to provide a bit more simple use cases
[16:45] <baranowb> when another node starts and joins, it gets whole data of buddy group
[16:45] <baranowb> that is
[16:45] <martins> lets say every NODE writes a /ac { x = ip_address }
[16:45] <martins> just data
[16:45] <martins> just that
[16:45] <baranowb> ok
[16:45] <baranowb> so
[16:46] <baranowb> N_1 has
[16:46] <baranowb> (lets consider fqns, not data)
[16:46] <baranowb> N_1
[16:46] <martins> ok then ac/ip_address
[16:46] <baranowb> _/ac/N1_IP
[16:46] <martins> is what each writes
[16:46] <martins> right
[16:47] <baranowb> _/_BUDDY_BACKUP_/_N2_/ac/N_2_IP
[16:47] <baranowb> _/_BUDDY_BACKUP_/_N3_/ac/N_3_IP
[16:47] <baranowb> N_2
[16:47] <baranowb>  _/ac/N2_IP
[16:47] <baranowb> _/_BUDDY_BACKUP_/_N1_/ac/N_1_IP
[16:47] <baranowb> _/_BUDDY_BACKUP_/_N3_/ac/N_3_IP
[16:47] <baranowb> similar N_@
[16:48] <baranowb> 3
[16:48] <baranowb> right ?
[16:48] <martins> 2 backup nodes of N3 ?
[16:48] <baranowb> N2 and N1
[16:49] <baranowb> N_3
[16:49] <baranowb> _/ac/N3_IP
[16:49] <baranowb> _/_BUDDY_BACKUP_/_N1_/ac/N_1_IP
[16:49] <martins> that is a bit off topic, but in a cluster with N = 3 each node has
2 backups?
[16:49] <baranowb> _/_BUDDY_BACKUP_/_N2_/ac/N_2_IP
[16:49] <baranowb> its configurable, depends on nmber of buddies
[16:49] <martins> ok
[16:49] <baranowb> You can have 1, 2,3....
[16:50] <baranowb> so N_1 dies
[16:50] <baranowb> N_2 is elected as owner
[16:50] <baranowb> so data looks like
[16:50] <baranowb> _/ac/N2_IP
[16:50] <baranowb> _/ac/N1_IP
[16:50] <baranowb> _/_BUDDY_BACKUP_/_N3_/ac/N_3_IP
[16:50] <baranowb> N_3 data
[16:50] <baranowb> _/ac/N3_IP
[16:50] <baranowb> _/_BUDDY_BACKUP_/_N2_/ac/N_2_IP
[16:50] <baranowb> _/_BUDDY_BACKUP_/_N2_/ac/N_1_IP
[16:51] <baranowb> right?
[16:51] <martins> y
[16:51] <baranowb> ok,. now N_1 starts, and after start data looks like
[16:51] <baranowb> N_1 data
[16:51] <baranowb> _/ac/N2_IP
[16:51] <baranowb> _/ac/N1_IP
[16:51] <baranowb> _/ac/N3_IP
[16:51] <baranowb> no backup
[16:52] <baranowb> on N2, or N3
[16:52] <baranowb> _/ac/
[16:52] <baranowb> _/_BUDDY_BACKUP_/_N1_/ac/N_1_IP
[16:52] <baranowb> _/_BUDDY_BACKUP_/_N1_/ac/N_2_IP
[16:52] <baranowb> _/_BUDDY_BACKUP_/_N1_/ac/N_3_IP
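
The three layouts listed above (all nodes up, after N_1 fails, after N_1 rejoins) can be reproduced with a toy model. This sketch only replays the behavior reported in the chat; it does not explain why JBoss Cache behaves this way, and `make_cluster`, `fail_over` and `rejoin_as_observed` are hypothetical names.

```python
# Toy reconstruction of the three layouts listed above. Not JBoss Cache code;
# make_cluster, fail_over and rejoin_as_observed are hypothetical names.

def make_cluster(names):
    """Every node owns /ac/<name>_IP and backs up every other node."""
    return {
        n: {"root": {f"/ac/{n}_IP"},
            "backup": {o: {f"/ac/{o}_IP"} for o in names if o != n}}
        for n in names
    }

def fail_over(cluster, failed, winner):
    """The elected winner promotes the failed node's backup into its own root."""
    del cluster[failed]
    w = cluster[winner]
    w["root"] |= w["backup"].pop(failed)
    for name, node in cluster.items():
        if name != winner:
            # Other buddies now hold that data as part of the winner's backup.
            node["backup"][winner] |= node["backup"].pop(failed)

def rejoin_as_observed(cluster, name):
    """Reported behavior with gravitation: the rejoining node ends up owning all data."""
    everything = set().union(*(n["root"] for n in cluster.values()))
    for node in cluster.values():
        node["root"] = set()
        node["backup"] = {name: set(everything)}
    cluster[name] = {"root": everything, "backup": {}}

cluster = make_cluster(["N1", "N2", "N3"])
fail_over(cluster, failed="N1", winner="N2")
after_failure = {n: sorted(v["root"]) for n, v in cluster.items()}
rejoin_as_observed(cluster, "N1")
after_rejoin = {n: sorted(v["root"]) for n, v in cluster.items()}

print(after_failure)  # N2 owns N1's and its own data, N3 only its own
print(after_rejoin)   # N1 owns everything, N2 and N3 have empty roots
```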
[16:54] <martins> that is the normal behavior you get with default buddy group +
datagravitation on config ?
[16:55] <baranowb> y, this is somewhat default way, I mean if there is override used
[16:55] <baranowb> to control gravitation
[16:55] <martins> lets stick to default one first
[16:56] <baranowb> so first scenario is goal  for now ?
[16:56] <martins> what is important to know
[16:56] <martins> is if this behavior is expected
[16:56] <martins> if it is not, who is at fault
[16:56] <martins> if it is, what is the reason, since it doesn't seem logical at first
[16:57] <martins> is there any prob you get by using this default buddy groups with
data gravitation ?
[16:57] <baranowb> y, I will add simpler explanation to post and @ manik directly
[16:57] <baranowb> wdym?
[16:57] <martins> I will ask him and add you to cc, I have something else I need to
clarify with him
[16:58] <martins> I mean, is there any issue because when N1 rejoins it sucks all data ?
[16:58] <baranowb> y
[16:58] <baranowb> see, on failure we look through _BACKUP_
[16:58] <baranowb> and reinit local resources
[16:59] <baranowb> (this happens on wining node)
[16:59] <baranowb> now, if N_2 fails (real owner before N_1 sucks)
[16:59] <martins> why we do that, doesn't jboss cache move the data on its own
[16:59] <martins> ?
[17:00] <baranowb> 1. there is bug :), it does not happen always, its because on 3.1.0
there is that bug which causes jbc not to fire BG events (hence it does not
update internals and move data)
[17:00] <baranowb> Ive told about it and it seems fixed in 3.2.1
[17:00] <baranowb> our listener handles everything, basically it does what cache should
[17:00] <martins> this also happens without data gravitation
[17:00] <martins> ?
[17:01] <baranowb> y
[17:01] <baranowb> its buddy group membership logic
[17:01] <martins> hmm I don't like that
[17:01] <martins> maybe we should move this to post ga
[17:01] <baranowb> 2. its more efficient to perform it only locally, no network traffic
[17:02] <martins> if jbc internals are not working correctly with current AS version
[17:02] <baranowb> and we iterate only data that needs to be inspected
[17:02] <martins> for budy groups
[17:02] <martins> it's a  big reason to skip it
[17:02] <baranowb> afaik its the only thing, atleast that Ive noticed
[17:03] <baranowb> well one thing is
[17:03] <martins> do you thin that this solution going over what jbc does
[17:04] <martins> can be significantly better than cluster wide
[17:04] <martins> for this first version ?
[17:04] <baranowb> our code has one advantage - only one node is owner of data
[17:04] <martins> I mean, do you think it is worth
[17:04] <baranowb> in jbc impl, data is copied to _/ of each node
[17:04] <baranowb> and imho this is not what we want
[17:05] <baranowb> data owner is consistent with election policy
[17:05] <baranowb> see
[17:05] <baranowb> in jbc impl
[17:05] <baranowb> on buddy failure, each node moved _/BACKUP_/N/
[17:06] <baranowb> to its _/
[17:06] <baranowb> so if node fails, and there are two buddies, each has copy of
failed buddy at its root after failure
[17:07] <martins> that sounds like another bug
[17:07] <baranowb> and iirc impl of cluster we have expects only elected node to have
it, right?
[17:08] <martins> well, does jbc updates all nodes when one changes that data ?
[17:08] <baranowb> hmm, dont think so, did not see this happen
[17:09] <martins> I mean, if one takes ownership in mobicents side, such as restoring a
timer, and when the timer fires it deletes its data, does jbc delete the data from
all nodes where it restored the data?
[17:09] <martins> that is a huge leak
[17:09] <martins> if it doesn't delete in all
[17:10] <baranowb> hehe, dont know, cause I assumed in impl that others should not
get it, so only one node has data :)
[17:10] <baranowb> possibly gravitation removes it, but cant say for sure
[17:11] <martins> well, must do the same when a node gets data through gravitation
[17:11] <baranowb> since its direct call, it should gravitate remove operation
[17:11] <martins> when it deletes must delete from all
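
The leak concern raised above can be illustrated by comparing the two takeover policies in a toy model. The sketch assumes a delete is applied only on the node where the timer fires, which is exactly the open question in the chat and not confirmed JBoss Cache behavior. `survivors`, `promote_on_all`, `promote_on_elected` and `owners` are hypothetical names.

```python
# Toy comparison of the two takeover policies discussed above. Not JBoss Cache code;
# survivors, promote_on_all, promote_on_elected and owners are hypothetical names.
# Assumption: a delete is local only (not replicated), the unanswered question above.

def survivors():
    """N_2 and N_3 each hold a backup copy of a timer owned by the failed N_1."""
    return {name: {"root": set(), "backup": {"N_1": {"/timers/t1"}}}
            for name in ("N_2", "N_3")}

def promote_on_all(nodes, failed):
    """Behavior attributed to jbc in the chat: every buddy moves the backup to its root."""
    for node in nodes.values():
        node["root"] |= node["backup"].pop(failed)

def promote_on_elected(nodes, failed, elected):
    """Listener-style policy: only the elected node takes ownership."""
    for name, node in nodes.items():
        data = node["backup"].pop(failed)
        if name == elected:
            node["root"] |= data
        else:
            # Non-elected buddies keep the data as backup of the new owner.
            node["backup"].setdefault(elected, set()).update(data)

def owners(nodes, fqn):
    """Names of the nodes that hold fqn in their own root."""
    return sorted(name for name, node in nodes.items() if fqn in node["root"])

all_copy = survivors()
promote_on_all(all_copy, "N_1")
all_copy["N_2"]["root"].discard("/timers/t1")   # timer fires on N_2, local delete

elected = survivors()
promote_on_elected(elected, "N_1", elected="N_2")
elected["N_2"]["root"].discard("/timers/t1")    # same local delete

print(owners(all_copy, "/timers/t1"))  # ['N_3'] : a stale root copy survives
print(owners(elected, "/timers/t1"))   # []      : no root copy is left
```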
[17:13] <martins> I'm not getting into this buddy group thing with that much "grey" zones
[17:14] <martins> in that N1,N2,N3
[17:14] <martins> when N1 rejoins
[17:15] <martins> does JBC really copy all to N1 and deletes data from N2 and N3, or
is it our code doing the N2 and N3 deletes?
[17:15] <baranowb> its jbc, Ive tried it - I removed our calls to data to see what is
going on
[17:15] <baranowb> besides there is jbc log
[17:16] <baranowb> about gravitation
[17:16] <martins> does it invoke any callback in N2 and N3when it does  that
[17:16] <baranowb> #
[17:16] <baranowb> 16:04:08,234 TRACE [MVCCNodeHelper] Node /ac is not in context,
fetching from container.
[17:16] <baranowb> #
[17:16] <baranowb> 16:04:08,234 TRACE [DataGravitatorInterceptor] Checking local
existence of requested fqn /ac
[17:16] <baranowb> #
[17:16] <baranowb> 16:04:08,234 TRACE [DataGravitatorInterceptor] Gravitating from
local backup tree
[17:17] <baranowb> #
[17:17] <baranowb> 16:04:08,234 TRACE [CallInterceptor] Executing command:
GravitateDataCommand{fqn=/ac, searchSubtrees=true}.
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [InvocationContextInterceptor] Invoked with command EvictCommand{fqn=/_BUDDY_BACKUP_/127.0.0.1_3273/ac, recursive=true} and InvocationContext [InvocationContext{transaction=TransactionImple < ac, BasicAction:
-560196f5:ce0:4b225f4b:63 status: ActionStatus.RUNNING >,
globalTransaction=GlobalTransaction:<127.0.0.1:3306>:0,
transactionContext=TransactionEntry
[17:18] <baranowb> #
[17:18] <baranowb> modificationList: null, optionOverrides=Option{failSilently=false,
cacheModeLocal=false, dataVersion=null, suppressLocking=false,
lockAcquisitionTimeout=-1, forceDataGravitation=false, skipDataGravitation=false,
forceAsynchronous=false, forceSynchronous=false, suppressPersistence=false,
suppressEventNotification=false}, originLocal=true, bypassUnmarshalling=false}]
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [InvocationContextInterceptor] Setting up
transactional context.
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [InvocationContextInterceptor] Setting tx as
TransactionImple < ac, BasicAction: -560196f5:ce0:4b225f4b:63 status:
ActionStatus.RUNNING > and gtx as GlobalTransaction:<127.0.0.1:3306>:0
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [TxInterceptor] local transaction exists -
registering global tx if not present for Thread[main,5,jboss]
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [TxInterceptor] Associated gtx in txTable is
GlobalTransaction:<127.0.0.1:3306>:0
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [TxInterceptor] Transaction TransactionImple < ac, BasicAction: -560196f5:ce0:4b225f4b:63 status: ActionStatus.RUNNING > is already
registered.
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=.....
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCNodeHelper] Retrieving wrapped node
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=.....
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=..../...
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=.../.../...
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=.../...
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=/../.../...
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCLockManager] Attempting to lock
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=...
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [MVCCNodeHelper] Retrieving wrapped node
/_BUDDY_BACKUP_/127.0.0.1_3273/ac/ACH=SERVICE>>ServiceID[name=TimerExampleService,vendor=org.mobicents,version=1.0]
[17:18] <baranowb> #
[17:18] <baranowb> 16:04:08,265 TRACE [CallInterceptor] Executing command:
EvictCommand{fqn=/_BUDDY_BACKUP_/127.0.0.1_3273/ac, recursive=true}.
[17:18] <baranowb> bla bla bla and so on
[17:19] <baranowb> I dont have log for other nodes
[17:19] <martins> which node is that
[17:19] <baranowb> N_1
[17:19] <martins> what is 3273
[17:19] <martins> N3?
[17:19] <baranowb> port
[17:19] <baranowb> y, buddies are stored as
[17:19] <baranowb> IP_PO
[17:19] <baranowb> PORT
[17:19] <martins> I know it is port :p
[17:19] <baranowb> :)
[17:20] <baranowb> its N3 I think
[17:20] <martins> that is on rejoin ?
[17:20] <baranowb> y
[17:21] <martins> you are fetching data from N3
[17:21] <martins> and it has the data on back tree
[17:21] <martins> what is wrong there
[17:21] <baranowb> its local call, and gravitation should be supresed
[17:22] <baranowb> for this case
[17:22] <martins> why is it local
[17:22] <baranowb> cause we should not get /ac
[17:22] <baranowb> from all other nodes
[17:23] <martins> you need to get AC to get a child
[17:23] <baranowb> well thats not entirely true - see
[17:23] <baranowb> when all nodes are running
[17:23] <baranowb> one dies
[17:23] <baranowb> N2 takes over
[17:23] <baranowb> we can create on N3 ubnder /ac
[17:24] <baranowb> nothing happens, no data sucking, no nothing, just creating
something under /ac
[17:24] <martins> that is because ac already exists?
[17:25] <baranowb> y
[17:25] <baranowb> atleast I suspect thats the cause
[17:25] <baranowb> there is /ac in local cache instance
[17:26] <martins> baranowb, I think there are much to uncover and work on this jbc config
[17:27] <martins> I feel it is far from a stable usable config for our HA
[17:27] <martins> do you agree?
[17:27] <baranowb> with gravitation i agree,
[17:28] <baranowb> first case could be of use, it seems to work ok
[17:28] <baranowb> and result is deterministic
[17:29] <martins> I'm not sure
[17:29] <baranowb> but if there is no urge, we can leave it, I still have few valid fixes
[17:29] <martins> it looks like a use case that may be more trouble making than life
savior
[17:29] <martins> would be preferable to split a cluster in smaller clusters
[17:30] <baranowb> ok, I agree, we can freeze this issue with patch
[17:30] <baranowb> yes, exactly
[17:30] <martins> don't freeze, we need to continue working on this in parallel
[17:30] <baranowb> freeze for this release I mean :)
[17:31] <martins> till we have our doubts all answered
[17:31] <martins> the buddy group + gravitation has the potentical to become the
default setup
[17:31] <martins> potential
[17:31] <martins> and thus we should make it a big priority in HA
[17:32] <martins> well, we should point it for what kind of release, 2.1 ?
[17:33] <baranowb> y

