akki edited a comment on issue #1047:
URL: https://github.com/apache/helix/issues/1047#issuecomment-637292194


   Hi, @pkuwm
   Thanks for your reply.
   I actually tried subscribing to the mailing list & sending my mail and [this time it worked](http://mail-archives.apache.org/mod_mbox/helix-user/202006.mbox/browser).
   
   
   ---------------------------------------------
   Copy-pasting it here as well, in case this is the preferred medium:
   
   Hi Helix community
   
   Nice to e-meet you guys. I am pretty new to this project and it is my first 
time writing to this mailing list - I apologize in advance for any mistakes.
   
   I am trying to implement a system's state model requirement here but am not 
able to achieve it. Hoping anyone here could point me in the right direction.
   
   
   ### GOAL
   My system is a typical multi-node + multi-resource system with the following 
properties:
   1. Any partition should have one & only one ONLINE replica at any given point of time.
   2. The _ONLINE -> OFFLINE_ transition is not instantaneous (typically takes minutes).
   3. Offline replicas have no special role - they can be dropped as soon as they become offline.
   
   If it helps in understanding better, my application is a tool which copies 
data from Kafka to Hadoop.
   And having two ONLINE partitions at the same time means I am duplicating 
this data in Hadoop.
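
   Properties 1 and 2 together imply a strict ordering whenever a partition moves from one node to another. A minimal sketch of that ordering in plain Java follows; it involves no Helix API, and `HandoffOrder` and `nextTransition` are hypothetical names used only for illustration. It describes the behaviour wanted here, not what Helix is known to do.

```java
/** Hypothetical helper (not Helix API): which transition may be issued next during a handoff. */
final class HandoffOrder {
    /**
     * @param oldOwnerState state of the replica on the node giving up the partition
     * @param newOwnerState state of the replica on the node taking over the partition
     * @return the only transition that is safe to issue next, or "NONE"
     */
    static String nextTransition(String oldOwnerState, String newOwnerState) {
        // While the old owner is still ONLINE, the only safe step is to drain it.
        if ("ONLINE".equals(oldOwnerState)) {
            return "OLD: ONLINE -> OFFLINE";
        }
        // Only once the old owner is no longer ONLINE may the new owner come up.
        if ("OFFLINE".equals(newOwnerState)) {
            return "NEW: OFFLINE -> ONLINE";
        }
        return "NONE";
    }
}
```

   With this rule the new owner is never brought ONLINE while the old owner is still ONLINE, which is exactly the window in which the data would be duplicated.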
   
   
   ### WHAT I HAVE TRIED
   I was able to successfully modify the [Quickstart](https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/examples/Quickstart.java) script to imitate my use-case so I believe Helix can handle this scenario.
   But when I do it in my application I see that Helix fires the ONLINE -> 
OFFLINE & OFFLINE -> ONLINE transitions (to the corresponding 2 nodes) almost 
simultaneously. I want Helix to signal "ONLINE -> OFFLINE", then wait until the 
partition goes offline and only then fire the "OFFLINE -> ONLINE" transition to 
the new upcoming node.
   I have implemented my `@Transition(from = "ONLINE", to = "OFFLINE")` function in such a way that it waits for the partition to go offline (using [`latch.await()`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CountDownLatch.html#await--)) and only then returns (I have confirmed this from application logs).
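
   For illustration, the blocking pattern described above can be sketched in plain Java roughly as follows. This is a simplified stand-in, not the code from the linked gist: the Helix `@Transition` annotation and handler signature are left out, and `DrainingPartition`, `markDrained` and `onBecomeOfflineFromOnline` are hypothetical names. A timeout is added here as an assumption so that the sketch cannot block forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a replica whose ONLINE -> OFFLINE handler blocks until it is drained. */
final class DrainingPartition {
    private final CountDownLatch drained = new CountDownLatch(1);
    private volatile String state = "ONLINE";

    /** Called by the (hypothetical) copy worker once its last batch has been written. */
    void markDrained() {
        drained.countDown();
    }

    /**
     * Body of the ONLINE -> OFFLINE handler: waits for the drain to finish.
     * Returns true only if the replica actually reached OFFLINE within the timeout.
     */
    boolean onBecomeOfflineFromOnline(long timeout, TimeUnit unit) {
        try {
            if (!drained.await(timeout, unit)) {
                return false; // drain not finished yet; the replica stays ONLINE
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        state = "OFFLINE";
        return true;
    }

    String state() {
        return state;
    }
}
```

   The point of the pattern is that the handler does not return successfully until the replica has really stopped writing, so anything that waits for the handler to return also waits for the drain.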
   
   My application is different from my Quickstart trial app in the following 
ways (or at least, these are the ones known to me, I am building upon someone 
else's project so there might be code that I am not aware of):
   1. The rebalancing algo is **not** AUTO - I am using my own custom logic to 
distribute partitions among nodes
   2. I have enabled nodes to auto-join i.e. `props.put(ZKHelixManager.ALLOW_PARTICIPANT_AUTO_JOIN, String.valueOf(true));`

   Is it possible for me to achieve this system with these settings enabled?
   
   
   ### DEBUG LOGS / CODE
   If it helps, this is what I see in ZooKeeper after adding a 2nd node to my cluster which had 1 node with 1 resource with 6 partitions - https://gist.github.com/akki/1d80c97463198275b3abe39350688bda#file-zookeeper-output-txt
   [As you can see](https://gist.github.com/akki/1d80c97463198275b3abe39350688bda#file-zookeeper-output-txt-L15), there are a few partitions which have 2 ONLINE replicas at the same time (after a while the draining replica goes away but in that duration, my data gets duplicated, which is the problem I want to overcome). I cannot understand how this is possible when I have set up [these bounds](https://gist.github.com/akki/1d80c97463198275b3abe39350688bda#file-onlineofflinestatemodel-java-L36) in [my model definition](https://gist.github.com/akki/1d80c97463198275b3abe39350688bda#file-onlineofflinestatemodel-java).
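
   The condition being violated can be stated as a small check over a snapshot of replica states. The sketch below is plain Java and does not read from ZooKeeper or use any Helix class; the map layout (partition name -> instance name -> state) and the names `OnlineReplicaCheck` and `partitionsWithMultipleOnline` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: finds partitions that break the "one & only one ONLINE replica" rule. */
final class OnlineReplicaCheck {
    /**
     * @param stateByPartition partition name -> (instance name -> state), e.g. copied from a state snapshot
     * @return names of partitions with more than one ONLINE replica, in sorted order
     */
    static List<String> partitionsWithMultipleOnline(Map<String, Map<String, String>> stateByPartition) {
        List<String> offenders = new ArrayList<>();
        // TreeMap gives a deterministic, sorted iteration order over partition names.
        for (Map.Entry<String, Map<String, String>> entry : new TreeMap<>(stateByPartition).entrySet()) {
            long online = entry.getValue().values().stream()
                    .filter("ONLINE"::equals)
                    .count();
            if (online > 1) {
                offenders.add(entry.getKey());
            }
        }
        return offenders;
    }
}
```

   Running such a check periodically against the observed states would show how long each overlap window lasts, which may help when correlating with the application logs.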
   
   
   I would really appreciate it if anyone here could give me any clues about what I might be doing wrong (or whether what I am trying to achieve is even possible with Helix).
   
   Thank you so much for building such a wonderful tool and having this mailing 
list to help us out.
   
   
   Regards
   Akshesh Doshi

