What code do you have in your shutdown hook? Are you disconnecting gracefully from the cluster and waiting until the liveinstance znode disappears?
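
For reference, here is a rough sketch of the kind of shutdown hook I mean: disconnect first, then poll until the live instance entry is gone or a timeout expires. Everything named here (GracefulShutdown, ClusterConnection, the liveInstanceExists probe, the timeout values) is hypothetical and only for illustration; none of it is a Helix API. In a real participant, disconnect() would presumably wrap HelixManager.disconnect(), and the probe would check the instance's live instance znode through a separate ZooKeeper client, since the disconnected manager can no longer be used for reads.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class GracefulShutdown {

    /** Hypothetical stand-in for the cluster connection; not a Helix type. */
    interface ClusterConnection {
        void disconnect();
    }

    /**
     * Disconnects, then polls the (assumed) probe until the live instance
     * entry is gone. Returns true if it disappeared before the timeout.
     */
    static boolean disconnectAndWait(ClusterConnection conn,
                                     BooleanSupplier liveInstanceExists,
                                     long timeoutMs,
                                     long pollMs) throws InterruptedException {
        conn.disconnect();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (liveInstanceExists.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still registered when the timeout expired
            }
            Thread.sleep(pollMs);
        }
        return true;
    }

    /** Builds a thread suitable for Runtime.getRuntime().addShutdownHook(...). */
    static Thread newShutdownHook(ClusterConnection conn,
                                  BooleanSupplier liveInstanceExists,
                                  long timeoutMs,
                                  long pollMs) {
        return new Thread(() -> {
            try {
                if (!disconnectAndWait(conn, liveInstanceExists, timeoutMs, pollMs)) {
                    System.err.println("live instance entry still present after timeout");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        });
    }
}
```

It would be registered once at startup with something like Runtime.getRuntime().addShutdownHook(GracefulShutdown.newShutdownHook(conn, probe, 30000, 200)), where the 30 second timeout and 200 ms poll interval are arbitrary values picked for the example.
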

thanks,
Kishore G

On Mon, Jan 5, 2015 at 4:11 PM, Varun Sharma <[email protected]> wrote:

> But then the nodes would restart and not have the assigned partitions
> since the controller would not write out the messages to open partitions
> which should have been on the restarting node?
>
> On Mon, Jan 5, 2015 at 4:08 PM, kishore g <[email protected]> wrote:
>
>> Try pausing the cluster controller before restarting and unpause after
>> restart.
>> On Jan 5, 2015 3:41 PM, "Varun Sharma" <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> When I do a cluster wide restart, I see the following errors being
>>> logged:
>>>
>>> 2015-01-05 22:08:27,526 [main] (ParticipantManagerHelper.java:234)
>>> INFO *Carrying over old session: 149a14ada0d0323*, resource:
>>> $terrapin$data$meta_board_join$1415863274925 to current session:
>>> *149a14ada0d0324*
>>>
>>> This is then followed by a large number of errors:
>>>
>>> 2015-01-05 22:08:30,321 [main] (HelixTaskExecutor.java:559) WARN
>>> SessionId does NOT match. *expected sessionId: 149a14ada0d0324*,
>>> tgtSessionId in message: *149a14ada0d0323*, messageId:
>>> da2ce3df-b797-4a27-9916-862c27af290a
>>>
>>>
>>> Does this signify a problem? It happens every time I do a restart.
>>>
>>>
>>> Thanks
>>>
>>> Varun
>>>
>>
>
