Are there any additional messages prior to this, indicating what might
have caused it?

Marcin

On 04.12.2017 09:32, Chaigneau, Nicolas wrote:
> Hello,
> 
> 
> Got another occurence of the issue, on 2017 December 1st at 09:12.
> 
> This time I had the server running with debug logs.
> I noticed that, not only did the LFC stopped being executed, but also the 
> lease reclaiming mechanism (both at the same time).
> 
> So I think that the issue is bigger, that all timers stop working. :(
> 
> 
> egrep "DHCPSRV_MEMFILE_LFC_START|DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION" 
> kea1-open-20171202.log
> 
> (...)
> 2017-12-01 09:12:42.376 DEBUG [kea-dhcp4.dhcpsrv/32013] 
> DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: memfile-lfc
> 2017-12-01 09:12:42.376 INFO  [kea-dhcp4.dhcpsrv/32013] 
> DHCPSRV_MEMFILE_LFC_START starting Lease File Cleanup
> 2017-12-01 09:12:43.240 DEBUG [kea-dhcp4.dhcpsrv/32013] 
> DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: 
> reclaim-expired-leases
> (...)
> 2017-12-01 09:13:40.255 DEBUG [kea-dhcp4.dhcpsrv/32013] 
> DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: 
> reclaim-expired-leases
> 2017-12-01 09:13:41.256 DEBUG [kea-dhcp4.dhcpsrv/32013] 
> DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: 
> reclaim-expired-leases
> 
> No more timer-related logs after this point.
> (Note that the server is still processing DHCP packets correctly.)
> 
> 
> 
> 
> Regards,
> Nicolas.
> 
>>
>> OK, thanks for the update!
>>
>> For the record, the issue can appear whether Kea receives DHCP packets or 
>> not.
>> So you can just start the server and wait :)
>>
>>
>>
>>
>>> I've been running Kea with LFC for 2 days, simulating DHCP requests at
>>> the rate of 1 lease/sec with valid lifetime of 60 seconds. I haven't
>>> reproduced the problem so far.
>>>
>>> Marcin
>>>
>>> On 30.11.2017 09:28, Chaigneau, Nicolas wrote:
>>>>
>>>>
>>>>
>>>> I've restarted Kea a week ago with debug logs, in the hope of seeing 
>>>> something when the issue manifests.
>>>>
>>>> No "luck" so far. Still waiting...
>>>>
>>>>
>>>>
>>>>  
>>>>> The worker thread seems to be running.
>>>>> The following command (on the Kea process not executing LFC) shows two 
>>>>> threads:
>>>>>
>>>>> ps -T -p <pid of Kea>
>>>>>   PID  SPID TTY          TIME CMD
>>>>> 28268 28268 ?        00:00:03 kea-dhcp4
>>>>> 28268 28269 ?        00:00:03 kea-dhcp4
>>>>>
>>>>>
>>>>> The issue also happened on our lab, so I could try to upgrade to 1.3.0.
>>>>>
>>>>> But I would really prefer to find out the issue with 1.2.0 and fix this 
>>>>> version.
>>>>> We've just deployed 1.2.0 on production, having to upgrade now would 
>>>>> be... difficult :/
>>>>>
>>>>>
>>>>>
>>>>>> It would be useful to check if the worker thread triggering timers is
>>>>>> still running or died. The kea-dhcp4 process should run main thread and
>>>>>> the second thread for timers. You could use "ps" or "top" to verify this.
>>>>>>
>>>>>> In the Kea 1.3.0, we have removed the worker thread and made several
>>>>>> significant updates to the timers scheduling. It would be useful it you
>>>>>> could verify whether Kea 1.3.0 also has this issue, but if it is
>>>>>> production environment, it is not good for experiments.
>>>>>>
>>>>>> Marcin
>>>>>>
>>>>>> On 20.11.2017 10:21, Chaigneau, Nicolas wrote:
>>>>>>> The Kea server not executing LFC is still running, so if there's 
>>>>>>> something you want me to check...  feel free to ask :)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Message d'origine-----
>>>>>>> De : Marcin Siodelski [mailto:[email protected]] 
>>>>>>>
>>>>>>> Nicolas,
>>>>>>>
>>>>>>> Thanks for creating the ticket. We'll have a look and see if we can
>>>>>>> reproduce it here.
>>>>>>>
>>>>>>> Marcin Siodelski
>>>>>>> ISC
>>>>>>>
>>>>>>> On 20.11.2017 09:07, Chaigneau, Nicolas wrote:
>>>>>>>>
>>>>>>>> Observed a 4th occurrence of the issue.
>>>>>>>>
>>>>>>>> Server started on 2017, November 15th at 17:20:47
>>>>>>>> Last time LFC was executed: on November 19th at 04:50:47
>>>>>>>>
>>>>>>>> I'll open a ticket.
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nicolas.
>>>>>>>>  
>>>>>>>>>
>>>>>>>>> So this is confirmed: LFC is really not executed anymore at some 
>>>>>>>>> point, for a reason yet to be determined.
>>>>>>>>>
>>>>>>>>> I'll have a look at the changes between Kea 0.9.1 and 1.2.0.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you think of anything that could help, I'm all ears :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nicolas.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hello Marcin,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks for your answer.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I was too hasty to suspect the log rotation, this has nothing to do 
>>>>>>>>>> with it. Rotation occurs at night between 3 and 4am, and the last 
>>>>>>>>>> log I have was in the middle of the day.
>>>>>>>>>>
>>>>>>>>>> Furthermore, I think the issue manifesting after 24H is just a 
>>>>>>>>>> coincidence. Before that I've had Kea run for several days before 
>>>>>>>>>> seeing the issue. So it seems random...
>>>>>>>>>>
>>>>>>>>>> I don't think there's an issue with logging either, because when I 
>>>>>>>>>> stopped the server after the LFC issue appeared, then I got the 
>>>>>>>>>> normal shutdown logs.
>>>>>>>>>>
>>>>>>>>>> I think the LFC is not executed because of the two observations:
>>>>>>>>>> - No log from Kea saying that LFC is launched
>>>>>>>>>> - Lease files are not modified anymore (and their last modification 
>>>>>>>>>> date coincides with the last "DHCPSRV_MEMFILE_LFC_START" log from 
>>>>>>>>>> Kea)
>>>>>>>>>>
>>>>>>>>>> But you're right, I'm not 100% sure LFC is not executed...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So I've launched a little script that does loops over a "ps" to try 
>>>>>>>>>> and catch the LFC process running.
>>>>>>>>>> It's not fully reliable because if LFC has nothing to do, it may 
>>>>>>>>>> start and end so fast that the ps won't see it.
>>>>>>>>>> I tried that with LFC scheduled every 10 seconds. The script catches 
>>>>>>>>>> about 2 LFC executions out of 3.
>>>>>>>>>>
>>>>>>>>>> So on the live system on which the LFC appears to not run anymore, 
>>>>>>>>>> I'll let the script run for an hour. In this time frame LFC is 
>>>>>>>>>> normally executed 6 times.
>>>>>>>>>> If I don't see anything, I'll conclude that LFC is indeed not 
>>>>>>>>>> executed.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nicolas.
>>>>>>>>>>
>>>>>>>>>> -----Message d'origine-----
>>>>>>>>>> De : Marcin Siodelski [mailto:[email protected]] 
>>>>>>>>>>
>>>>>>>>>> Nicolas,
>>>>>>>>>>
>>>>>>>>>> I think we should start from checking if the issue doesn't stem from 
>>>>>>>>>> the
>>>>>>>>>> log rotation. If I understand correctly, you're rotating the log file
>>>>>>>>>> daily. Per your message below, things go weird after around 24 hours
>>>>>>>>>> since server startup. Is it possible to verify that rotating every 
>>>>>>>>>> 12h
>>>>>>>>>> would cause the issue to begin after 12h rather than 24h?
>>>>>>>>>>
>>>>>>>>>> I can't see anything in the Kea code that could prevent periodic
>>>>>>>>>> execution of LFC, other than server reconfiguration. You're saying 
>>>>>>>>>> that
>>>>>>>>>> LFC is not executed because the lease files aren't modified. But, is
>>>>>>>>>> there any other log output present there? Perhaps the LFC is not
>>>>>>>>>> executed because of some trouble with logging being a result of log 
>>>>>>>>>> file
>>>>>>>>>> rotation? We should see if logging generally works fine after log 
>>>>>>>>>> file
>>>>>>>>>> rotation in the first place. Could you enable more verbose logging to
>>>>>>>>>> see if any log is produced when the LFC starting message is gone?
>>>>>>>>>>
>>>>>>>>>> Marcin Siodelski
>>>>>>>>>> ISC
>>>>>>>>>>
>>>>>>>>>> On 17.11.2017 10:41, Chaigneau, Nicolas wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I've observed the issue again.
>>>>>>>>>>>
>>>>>>>>>>> With a Kea server that does not receive any DHCP packet:
>>>>>>>>>>> At some point, Kea is still running but LFC is no longer executed.
>>>>>>>>>>>
>>>>>>>>>>> In Kea log file we don't have "DHCPSRV_MEMFILE_LFC_START" messages 
>>>>>>>>>>> anymore.
>>>>>>>>>>> And the lease files are not modified on disk (they would be if LFC 
>>>>>>>>>>> was executed, so this is not just a logging issue).
>>>>>>>>>>>
>>>>>>>>>>> Kea was started on 2017, November 15th at 16:20:23.
>>>>>>>>>>> LFC was first executed 10 minutes later, at 16:30:23.
>>>>>>>>>>> And the last execution of LFC: November 16th, at 16:40:23.
>>>>>>>>>>> Nothing after that...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Nicolas.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Wait... when I restarted the server, I got logs again... before 
>>>>>>>>>>>> the shutdown!
>>>>>>>>>>>> So logging actually still worked. I just didn't have anything 
>>>>>>>>>>>> being logged.
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-11-15 17:20:45.551 INFO  [kea-dhcp4.dhcp4/29754] 
>>>>>>>>>>>> DHCP4_SHUTDOWN server shutdown
>>>>>>>>>>>> 2017-11-15 17:20:47.603 INFO  [kea-dhcp4.dhcp4/28141] 
>>>>>>>>>>>> DHCP4_STARTING Kea DHCPv4 server version 1.2.0 starting
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The only logs I normally have in Kea logs is LFC execution every 
>>>>>>>>>>>> 10 minutes:
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-11-16 02:10:47.617 INFO  [kea-dhcp4.dhcpsrv/28141] 
>>>>>>>>>>>> DHCPSRV_MEMFILE_LFC_START starting Lease File Cleanup
>>>>>>>>>>>> 2017-11-16 02:10:47.617 INFO  [kea-dhcp4.dhcpsrv/28141] 
>>>>>>>>>>>> DHCPSRV_MEMFILE_LFC_EXECUTE executing Lease File Cleanup  (...)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I don't have these logs anymore after some time.
>>>>>>>>>>>> So... either LFC is no longer executed after a while, or Kea stops 
>>>>>>>>>>>> writing these log messages ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> A few more detail:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This worked fine before (with the same logrotate configuration), 
>>>>>>>>>>>>> with:
>>>>>>>>>>>>> - Kea 0.9.1
>>>>>>>>>>>>> - RHEL 6.4
>>>>>>>>>>>>>
>>>>>>>>>>>>> The issue now appears with:
>>>>>>>>>>>>> - Kea 1.2.0
>>>>>>>>>>>>> - RHEL 7.1
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Should I open a ticket for this ? (I'm not sure the problem is 
>>>>>>>>>>>>> with Kea, maybe it's logrotate...)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've set up Kea log rotation with logrotate.
>>>>>>>>>>>>>> This works for some time, and then it stops working: the new log 
>>>>>>>>>>>>>> file stays empty, until the server is restarted.
>>>>>>>>>>>>>> I'm not sure what's going on... has anyone else had this issue ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm using the following options for logrotate:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     daily
>>>>>>>>>>>>>>     rotate 30
>>>>>>>>>>>>>>     missingok
>>>>>>>>>>>>>>     nocompress
>>>>>>>>>>>>>>     copytruncate
>>>>>>>>>>>>>>     dateext
>>>>>>>>>>>>>>     extension .log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I thought "copytruncate" would be good enough, but apparently it 
>>>>>>>>>>>>>> is not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it possible to send a signal to Kea so that it reopens its 
>>>>>>>>>>>>>> log file ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Nicolas.
>>
> 
> This message contains information that may be privileged or confidential and 
> is the property of the Capgemini Group. It is intended only for the person to 
> whom it is addressed. If you are not the intended recipient, you are not 
> authorized to read, print, retain, copy, disseminate, distribute, or use this 
> message or any part thereof. If you receive this message in error, please 
> notify the sender immediately and delete all copies of this message.
> 

_______________________________________________
Kea-users mailing list
[email protected]
https://lists.isc.org/mailman/listinfo/kea-users

Reply via email to