Unfortunately, it appears that when the NPE occurs, dropping the
participant no longer cleans up the related INSTANCE node. Perhaps some
state is lost?

Thanks,
Steph
On 16 Feb 2015 06:52, "Zhen Zhang" <[email protected]> wrote:

> I think the NPE is not fatal. It happens when no message handler factory
> is registered for this message type. The message will not be removed and
> will remain in UNREAD state. Later, when the message handler factory is
> registered via:
> DefaultMessagingService#registerMessageHandlerFactory, we will send a NOP
> message, which will in turn trigger HelixTaskExecutor to process all UNREAD
> messages. We should definitely fix this by logging a warning message
> instead of throwing an NPE.
>
> Thanks,
> Jason
>
>
> On Sun, Feb 15, 2015 at 7:30 PM, kishore g <[email protected]> wrote:
>
>> Controller assuming the state transition occurred is even more dangerous.
>>
>>
>>
>>
>>
>> On Sun, Feb 15, 2015 at 7:18 PM, [email protected] <[email protected]>
>> wrote:
>>
>>> In my experience it was fatal. The callback would not be called, but the
>>> controller would somehow assume the state transition occurred.
>>> On Feb 15, 2015 7:13 PM, "kishore g" <[email protected]> wrote:
>>>
>>> > Thanks Vlad. That explains the problem. That also explains why adding a
>>> > sleep of 3 seconds works.
>>> >
>>> > Jason, is this exception fatal? Will the message be processed again after
>>> > the handler is added?
>>> >
>>> > thanks,
>>> > Kishore G
>>> >
>>> > On Sun, Feb 15, 2015 at 6:41 PM, [email protected] <[email protected]>
>>> > wrote:
>>> >
>>> >> https://issues.apache.org/jira/browse/HELIX-548
>>> >> On Feb 15, 2015 6:38 PM, "kishore g" <[email protected]> wrote:
>>> >>
>>> >> > Hi Vlad,
>>> >> >
>>> >> > Was there any jira associated with it?
>>> >> >
>>> >> > thanks.
>>> >> > Kishore G
>>> >> >
>>> >> > On Sun, Feb 15, 2015 at 4:36 PM, [email protected] <[email protected]>
>>> >> > wrote:
>>> >> >
>>> >> >> Looks like the same problem we encountered recently.
>>> >> >>
>>> >> >> Regards,
>>> >> >> Vlad
>>> >> >> On Feb 15, 2015 4:35 PM, "kishore g" <[email protected]> wrote:
>>> >> >>
>>> >> >> > Steph described this problem on IRC.
>>> >> >> >
>>> >> >> > He is using 0.7.1. On connecting to the cluster he gets this NPE:
>>> >> >> >
>>> >> >> > http://pastebin.com/YE3fwK5i
>>> >> >> >
>>> >> >> > java.lang.NullPointerException
>>> >> >> >         at org.apache.helix.messaging.handling.HelixTaskExecutor.createMessageHandler(HelixTaskExecutor.java:661)
>>> >> >> >         at org.apache.helix.messaging.handling.HelixTaskExecutor.onMessage(HelixTaskExecutor.java:581)
>>> >> >> >         at org.apache.helix.manager.zk.ZkCallbackHandler.invoke(ZkCallbackHandler.java:202)
>>> >> >> >         at org.apache.helix.manager.zk.ZkCallbackHandler.init(ZkCallbackHandler.java:336)
>>> >> >> >         at org.apache.helix.manager.zk.ZkCallbackHandler.<init>(ZkCallbackHandler.java:130)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixConnection.addListener(ZkHelixConnection.java:533)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixConnection.addMessageListener(ZkHelixConnection.java:267)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixParticipant.setupMsgHandler(ZkHelixParticipant.java:347)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixParticipant.init(ZkHelixParticipant.java:383)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixParticipant.onConnected(ZkHelixParticipant.java:401)
>>> >> >> >         at org.apache.helix.manager.zk.ZkHelixParticipant.start(ZkHelixParticipant.java:428)
>>> >> >> >         at com.example.ProtostuffServerNode.spinUpParticipant(ProtostuffServerNode.java:134)
>>> >> >> >
>>> >> >> >
>>> >> >> > Here is his connection code.
>>> >> >> >
>>> >> >> > http://pastebin.com/QRfVU1tc
>>> >> >> >
>>> >> >> > private static HelixParticipant spinUpParticipant(HelixAdmin admin,
>>> >> >> >         ParticipantId participantId) {
>>> >> >> >     LOGGER.info("Starting up " + participantId);
>>> >> >> >     HelixConnection connection = new ZkHelixConnection(ZK_ADDRESS);
>>> >> >> >     connection.connect();
>>> >> >> >     HelixParticipant participant =
>>> >> >> >             connection.createParticipant(CLUSTER_ID, participantId);
>>> >> >> >     StateMachineEngine stateMach = participant.getStateMachineEngine();
>>> >> >> >
>>> >> >> >     StateTransitionHandlerFactory<LocalTransitionHandler> transitionHandlerFactory =
>>> >> >> >             new OnlineOfflineHandlerFactory();
>>> >> >> >     stateMach.registerStateModelFactory(STATE_MODEL_NAME, transitionHandlerFactory);
>>> >> >> >     participant.start();
>>> >> >> >
>>> >> >> >     admin.enableInstance(CLUSTER_NAME, participantId.toString(), true);
>>> >> >> >
>>> >> >> >     return participant;
>>> >> >> > }
>>> >> >> >
>>> >> >> > Adding a 3s sleep after registerStateModelFactory works. Any idea what is
>>> >> >> > happening?
>>> >> >> >
>>> >> >> > thanks,
>>> >> >> > Kishore G
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >>
>>> >> >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>
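For readers of the archive, the behaviour Jason describes in the thread (a message whose type has no registered handler factory stays UNREAD, a warning is logged instead of throwing an NPE, and the message is picked up once a handler is registered) can be sketched roughly as below. This is a toy model only, not Helix code: the class UnreadMessageSketch, its Message type, and the methods registerHandler, processUnread and unreadCount are all hypothetical names invented for illustration, and the direct call to processUnread() stands in for the NOP message mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.logging.Logger;

/** Toy model of the UNREAD-message behaviour discussed in the thread. NOT Helix code. */
class UnreadMessageSketch {
    private static final Logger LOG =
            Logger.getLogger(UnreadMessageSketch.class.getName());

    enum State { UNREAD, READ }

    /** Hypothetical message type; real Helix messages carry far more state. */
    static final class Message {
        final String type;
        final String payload;
        State state = State.UNREAD;

        Message(String type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    // Hypothetical stand-in for the registry of message handler factories.
    private final Map<String, Consumer<Message>> handlers = new HashMap<>();
    private final List<Message> queue = new ArrayList<>();

    /** A new message arrives; try to process everything that is still UNREAD. */
    void onMessage(Message message) {
        queue.add(message);
        processUnread();
    }

    /** Registering a handler triggers another pass over UNREAD messages. */
    void registerHandler(String type, Consumer<Message> handler) {
        handlers.put(type, handler);
        // Assumption: in the thread this pass is triggered by a NOP message.
        processUnread();
    }

    void processUnread() {
        for (Message message : queue) {
            if (message.state != State.UNREAD) {
                continue;
            }
            Consumer<Message> handler = handlers.get(message.type);
            if (handler == null) {
                // Log a warning and leave the message UNREAD rather than
                // dereferencing a null handler.
                LOG.warning("No handler registered for message type "
                        + message.type + "; leaving message UNREAD");
                continue;
            }
            handler.accept(message);
            message.state = State.READ;
        }
    }

    long unreadCount() {
        return queue.stream().filter(m -> m.state == State.UNREAD).count();
    }
}
```

In this sketch, a message delivered before its handler exists is simply retried on the next pass, which is the ordering that the 3 second sleep in the original report appears to work around. It does not model the controller-side concern raised by Vlad and Kishore, nor the INSTANCE node cleanup issue reported by Steph.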
