Hi Steph,

When the NPE occurs, do you get the state transition callbacks?

thanks,
Kishore G

On Sun, Feb 15, 2015 at 11:23 PM, Steph Meslin-Weber <[email protected]> wrote:

> Unfortunately it appears that when the NPE occurs, dropping the
> participant no longer cleans up the related INSTANCE node. Perhaps some
> state is lost?
>
> Thanks,
> Steph
> On 16 Feb 2015 06:52, "Zhen Zhang" <[email protected]> wrote:
>
>> I think the NPE is not fatal. It happens when no message handler factory
>> is registered for this message type. The message will not be removed and
>> will remain in UNREAD state. Later, when the message handler factory is
>> registered via
>> DefaultMessagingService#registerMessageHandlerFactory, we will send a NOP
>> message, which will in turn trigger HelixTaskExecutor to process all UNREAD
>> messages. We should definitely fix this by logging a warning message
>> instead of throwing an NPE.
>>
>> Thanks,
>> Jason
>>
>>
>> On Sun, Feb 15, 2015 at 7:30 PM, kishore g <[email protected]> wrote:
>>
>>> Controller assuming the state transition occurred is even more dangerous.
>>>
>>>
>>> On Sun, Feb 15, 2015 at 7:18 PM, [email protected] <[email protected]> wrote:
>>>
>>>> In my experience it was fatal. The callback would not be called but the
>>>> controller would somehow assume the state transition occurred.
>>>> On Feb 15, 2015 7:13 PM, "kishore g" <[email protected]> wrote:
>>>>
>>>> > Thanks Vlad. That explains the problem. That also explains why adding
>>>> > a sleep of 3 seconds works.
>>>> >
>>>> > Jason, is this exception fatal? Will the message be processed again
>>>> > after the handler is added?
>>>> >
>>>> > thanks,
>>>> > Kishore G
>>>> >
>>>> > On Sun, Feb 15, 2015 at 6:41 PM, [email protected] <[email protected]> wrote:
>>>> >
>>>> >> https://issues.apache.org/jira/browse/HELIX-548
>>>> >> On Feb 15, 2015 6:38 PM, "kishore g" <[email protected]> wrote:
>>>> >>
>>>> >> > Hi Vlad,
>>>> >> >
>>>> >> > Was there any jira associated with it?
>>>> >> >
>>>> >> > thanks.
>>>> >> > Kishore G
>>>> >> >
>>>> >> > On Sun, Feb 15, 2015 at 4:36 PM, [email protected] <[email protected]> wrote:
>>>> >> >
>>>> >> >> Looks like the same problem we encountered recently.
>>>> >> >>
>>>> >> >> Regards,
>>>> >> >> Vlad
>>>> >> >> On Feb 15, 2015 4:35 PM, "kishore g" <[email protected]> wrote:
>>>> >> >>
>>>> >> >> > Steph described this problem on IRC.
>>>> >> >> >
>>>> >> >> > He is using 0.7.1. On connecting to the cluster he gets this NPE:
>>>> >> >> >
>>>> >> >> > http://pastebin.com/YE3fwK5i
>>>> >> >> >
>>>> >> >> > java.lang.NullPointerException
>>>> >> >> >     at org.apache.helix.messaging.handling.HelixTaskExecutor.createMessageHandler(HelixTaskExecutor.java:661)
>>>> >> >> >     at org.apache.helix.messaging.handling.HelixTaskExecutor.onMessage(HelixTaskExecutor.java:581)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkCallbackHandler.invoke(ZkCallbackHandler.java:202)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkCallbackHandler.init(ZkCallbackHandler.java:336)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkCallbackHandler.<init>(ZkCallbackHandler.java:130)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixConnection.addListener(ZkHelixConnection.java:533)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixConnection.addMessageListener(ZkHelixConnection.java:267)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixParticipant.setupMsgHandler(ZkHelixParticipant.java:347)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixParticipant.init(ZkHelixParticipant.java:383)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixParticipant.onConnected(ZkHelixParticipant.java:401)
>>>> >> >> >     at org.apache.helix.manager.zk.ZkHelixParticipant.start(ZkHelixParticipant.java:428)
>>>> >> >> >     at com.example.ProtostuffServerNode.spinUpParticipant(ProtostuffServerNode.java:134)
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > Here is his connection code.
>>>> >> >> >
>>>> >> >> > http://pastebin.com/QRfVU1tc
>>>> >> >> >
>>>> >> >> > private static HelixParticipant spinUpParticipant(HelixAdmin admin, ParticipantId participantId) {
>>>> >> >> >     LOGGER.info("Starting up "+participantId);
>>>> >> >> >     HelixConnection connection = new ZkHelixConnection(ZK_ADDRESS);
>>>> >> >> >     connection.connect();
>>>> >> >> >     HelixParticipant participant = connection.createParticipant(CLUSTER_ID, participantId);
>>>> >> >> >     StateMachineEngine stateMach = participant.getStateMachineEngine();
>>>> >> >> >
>>>> >> >> >     StateTransitionHandlerFactory<LocalTransitionHandler> transitionHandlerFactory = new OnlineOfflineHandlerFactory();
>>>> >> >> >     stateMach.registerStateModelFactory(STATE_MODEL_NAME, transitionHandlerFactory);
>>>> >> >> >     participant.start();
>>>> >> >> >
>>>> >> >> >     admin.enableInstance(CLUSTER_NAME, participantId.toString(), true);
>>>> >> >> >
>>>> >> >> >     return participant;
>>>> >> >> > }
>>>> >> >> >
>>>> >> >> > Adding a 3s sleep after registerStateModelFactory works. Any idea what
>>>> >> >> > is happening?
>>>> >> >> >
>>>> >> >> > thanks,
>>>> >> >> > Kishore G

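For reference, the behaviour Jason describes (warn instead of throwing an NPE, keep the message UNREAD, and pick it up again once a handler is registered) could look roughly like the sketch below. This is a self-contained model only, not the Helix implementation: the class UnreadMessageSketch, its Message type, the Consumer-based handler map, and the "STATE_TRANSITION" label are all hypothetical names made up for illustration, and the re-scan on registration merely stands in for the NOP message mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.logging.Logger;

// Hypothetical model of the discussed behaviour; none of these are Helix classes.
class UnreadMessageSketch {
    private static final Logger LOG = Logger.getLogger("UnreadMessageSketch");

    enum State { UNREAD, READ }

    static final class Message {
        final String id;
        final String type;
        State state = State.UNREAD;

        Message(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    // Message type -> handler; stands in for a registered message handler factory.
    private final Map<String, Consumer<Message>> handlers = new HashMap<>();
    private final List<Message> inbox = new ArrayList<>();

    // Called when a new message arrives.
    synchronized void onMessage(Message message) {
        inbox.add(message);
        processUnread();
    }

    // Registering a handler re-scans the inbox (assumed stand-in for the NOP message).
    synchronized void registerHandler(String type, Consumer<Message> handler) {
        handlers.put(type, handler);
        processUnread();
    }

    synchronized long unreadCount() {
        return inbox.stream().filter(m -> m.state == State.UNREAD).count();
    }

    private void processUnread() {
        for (Message message : inbox) {
            if (message.state != State.UNREAD) {
                continue;
            }
            Consumer<Message> handler = handlers.get(message.type);
            if (handler == null) {
                // Guard instead of dereferencing null: warn and leave the message UNREAD.
                LOG.warning("No handler registered for message type " + message.type
                        + "; leaving message " + message.id + " UNREAD");
                continue;
            }
            handler.accept(message);
            message.state = State.READ;
        }
    }

    public static void main(String[] args) {
        UnreadMessageSketch executor = new UnreadMessageSketch();
        // The message arrives before any handler is registered.
        executor.onMessage(new Message("m1", "STATE_TRANSITION"));
        System.out.println("unread before registration: " + executor.unreadCount());
        // Late registration triggers processing of the pending message.
        executor.registerHandler("STATE_TRANSITION", m -> System.out.println("handled " + m.id));
        System.out.println("unread after registration: " + executor.unreadCount());
    }
}
```

The point of the sketch is only the ordering: a message that arrives before registration is neither dropped nor marked as handled, so nothing upstream should conclude that the transition took place until the handler has actually run.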