I've been running Kea with LFC for 2 days, simulating DHCP requests at
the rate of 1 lease/sec with valid lifetime of 60 seconds. I haven't
reproduced the problem so far.

Marcin

On 30.11.2017 09:28, Chaigneau, Nicolas wrote:
> 
> 
> 
> I've restarted Kea a week ago with debug logs, in the hope of seeing 
> something when the issue manifests.
> 
> No "luck" so far. Still waiting...
> 
> 
> 
>  
>> The worker thread seems to be running.
>> The following command (on the Kea process not executing LFC) shows two 
>> threads:
>>
>> ps -T -p <pid of Kea>
>>   PID  SPID TTY          TIME CMD
>> 28268 28268 ?        00:00:03 kea-dhcp4
>> 28268 28269 ?        00:00:03 kea-dhcp4
>>
>>
>> The issue also happened on our lab, so I could try to upgrade to 1.3.0.
>>
>> But I would really prefer to find out the issue with 1.2.0 and fix this 
>> version.
>> We've just deployed 1.2.0 on production, having to upgrade now would be... 
>> difficult :/
>>
>>
>>
>>> It would be useful to check if the worker thread triggering timers is
>>> still running or died. The kea-dhcp4 process should run main thread and
>>> the second thread for timers. You could use "ps" or "top" to verify this.
>>>
>>> In the Kea 1.3.0, we have removed the worker thread and made several
>>> significant updates to the timers scheduling. It would be useful it you
>>> could verify whether Kea 1.3.0 also has this issue, but if it is
>>> production environment, it is not good for experiments.
>>>
>>> Marcin
>>>
>>> On 20.11.2017 10:21, Chaigneau, Nicolas wrote:
>>>> The Kea server not executing LFC is still running, so if there's something 
>>>> you want me to check...  feel free to ask :)
>>>>
>>>>
>>>>
>>>> -----Message d'origine-----
>>>> De : Marcin Siodelski [mailto:[email protected]] 
>>>>
>>>> Nicolas,
>>>>
>>>> Thanks for creating the ticket. We'll have a look and see if we can
>>>> reproduce it here.
>>>>
>>>> Marcin Siodelski
>>>> ISC
>>>>
>>>> On 20.11.2017 09:07, Chaigneau, Nicolas wrote:
>>>>>
>>>>> Observed a 4th occurrence of the issue.
>>>>>
>>>>> Server started on 2017, November 15th at 17:20:47
>>>>> Last time LFC was executed: on November 19th at 04:50:47
>>>>>
>>>>> I'll open a ticket.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Nicolas.
>>>>>  
>>>>>>
>>>>>> So this is confirmed: LFC is really not executed anymore at some point, 
>>>>>> for a reason yet to be determined.
>>>>>>
>>>>>> I'll have a look at the changes between Kea 0.9.1 and 1.2.0.
>>>>>>
>>>>>>
>>>>>> If you think of anything that could help, I'm all ears :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Nicolas.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hello Marcin,
>>>>>>>
>>>>>>>
>>>>>>> Thanks for your answer.
>>>>>>>
>>>>>>>
>>>>>>> I was too hasty to suspect the log rotation, this has nothing to do 
>>>>>>> with it. Rotation occurs at night between 3 and 4am, and the last log I 
>>>>>>> have was in the middle of the day.
>>>>>>>
>>>>>>> Furthermore, I think the issue manifesting after 24H is just a 
>>>>>>> coincidence. Before that I've had Kea run for several days before 
>>>>>>> seeing the issue. So it seems random...
>>>>>>>
>>>>>>> I don't think there's an issue with logging either, because when I 
>>>>>>> stopped the server after the LFC issue appeared, then I got the normal 
>>>>>>> shutdown logs.
>>>>>>>
>>>>>>> I think the LFC is not executed because of the two observations:
>>>>>>> - No log from Kea saying that LFC is launched
>>>>>>> - Lease files are not modified anymore (and their last modification 
>>>>>>> date coincides with the last "DHCPSRV_MEMFILE_LFC_START" log from Kea)
>>>>>>>
>>>>>>> But you're right, I'm not 100% sure LFC is not executed...
>>>>>>>
>>>>>>>
>>>>>>> So I've launched a little script that does loops over a "ps" to try and 
>>>>>>> catch the LFC process running.
>>>>>>> It's not fully reliable because if LFC has nothing to do, it may start 
>>>>>>> and end so fast that the ps won't see it.
>>>>>>> I tried that with LFC scheduled every 10 seconds. The script catches 
>>>>>>> about 2 LFC executions out of 3.
>>>>>>>
>>>>>>> So on the live system on which the LFC appears to not run anymore, I'll 
>>>>>>> let the script run for an hour. In this time frame LFC is normally 
>>>>>>> executed 6 times.
>>>>>>> If I don't see anything, I'll conclude that LFC is indeed not executed.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nicolas.
>>>>>>>
>>>>>>> -----Message d'origine-----
>>>>>>> De : Marcin Siodelski [mailto:[email protected]] 
>>>>>>>
>>>>>>> Nicolas,
>>>>>>>
>>>>>>> I think we should start from checking if the issue doesn't stem from the
>>>>>>> log rotation. If I understand correctly, you're rotating the log file
>>>>>>> daily. Per your message below, things go weird after around 24 hours
>>>>>>> since server startup. Is it possible to verify that rotating every 12h
>>>>>>> would cause the issue to begin after 12h rather than 24h?
>>>>>>>
>>>>>>> I can't see anything in the Kea code that could prevent periodic
>>>>>>> execution of LFC, other than server reconfiguration. You're saying that
>>>>>>> LFC is not executed because the lease files aren't modified. But, is
>>>>>>> there any other log output present there? Perhaps the LFC is not
>>>>>>> executed because of some trouble with logging being a result of log file
>>>>>>> rotation? We should see if logging generally works fine after log file
>>>>>>> rotation in the first place. Could you enable more verbose logging to
>>>>>>> see if any log is produced when the LFC starting message is gone?
>>>>>>>
>>>>>>> Marcin Siodelski
>>>>>>> ISC
>>>>>>>
>>>>>>> On 17.11.2017 10:41, Chaigneau, Nicolas wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>>
>>>>>>>> I've observed the issue again.
>>>>>>>>
>>>>>>>> With a Kea server that does not receive any DHCP packet:
>>>>>>>> At some point, Kea is still running but LFC is no longer executed.
>>>>>>>>
>>>>>>>> In Kea log file we don't have "DHCPSRV_MEMFILE_LFC_START" messages 
>>>>>>>> anymore.
>>>>>>>> And the lease files are not modified on disk (they would be if LFC was 
>>>>>>>> executed, so this is not just a logging issue).
>>>>>>>>
>>>>>>>> Kea was started on 2017, November 15th at 16:20:23.
>>>>>>>> LFC was first executed 10 minutes later, at 16:30:23.
>>>>>>>> And the last execution of LFC: November 16th, at 16:40:23.
>>>>>>>> Nothing after that...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nicolas.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Wait... when I restarted the server, I got logs again... before the 
>>>>>>>>> shutdown!
>>>>>>>>> So logging actually still worked. I just didn't have anything being 
>>>>>>>>> logged.
>>>>>>>>>
>>>>>>>>> 2017-11-15 17:20:45.551 INFO  [kea-dhcp4.dhcp4/29754] DHCP4_SHUTDOWN 
>>>>>>>>> server shutdown
>>>>>>>>> 2017-11-15 17:20:47.603 INFO  [kea-dhcp4.dhcp4/28141] DHCP4_STARTING 
>>>>>>>>> Kea DHCPv4 server version 1.2.0 starting
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The only logs I normally have in Kea logs is LFC execution every 10 
>>>>>>>>> minutes:
>>>>>>>>>
>>>>>>>>> 2017-11-16 02:10:47.617 INFO  [kea-dhcp4.dhcpsrv/28141] 
>>>>>>>>> DHCPSRV_MEMFILE_LFC_START starting Lease File Cleanup
>>>>>>>>> 2017-11-16 02:10:47.617 INFO  [kea-dhcp4.dhcpsrv/28141] 
>>>>>>>>> DHCPSRV_MEMFILE_LFC_EXECUTE executing Lease File Cleanup  (...)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I don't have these logs anymore after some time.
>>>>>>>>> So... either LFC is no longer executed after a while, or Kea stops 
>>>>>>>>> writing these log messages ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> A few more detail:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This worked fine before (with the same logrotate configuration), 
>>>>>>>>>> with:
>>>>>>>>>> - Kea 0.9.1
>>>>>>>>>> - RHEL 6.4
>>>>>>>>>>
>>>>>>>>>> The issue now appears with:
>>>>>>>>>> - Kea 1.2.0
>>>>>>>>>> - RHEL 7.1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Should I open a ticket for this ? (I'm not sure the problem is with 
>>>>>>>>>> Kea, maybe it's logrotate...)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I've set up Kea log rotation with logrotate.
>>>>>>>>>>> This works for some time, and then it stops working: the new log 
>>>>>>>>>>> file stays empty, until the server is restarted.
>>>>>>>>>>> I'm not sure what's going on... has anyone else had this issue ?
>>>>>>>>>>>
>>>>>>>>>>> I'm using the following options for logrotate:
>>>>>>>>>>>
>>>>>>>>>>>     daily
>>>>>>>>>>>     rotate 30
>>>>>>>>>>>     missingok
>>>>>>>>>>>     nocompress
>>>>>>>>>>>     copytruncate
>>>>>>>>>>>     dateext
>>>>>>>>>>>     extension .log
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I thought "copytruncate" would be good enough, but apparently it is 
>>>>>>>>>>> not.
>>>>>>>>>>>
>>>>>>>>>>> Is it possible to send a signal to Kea so that it reopens its log 
>>>>>>>>>>> file ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Nicolas.
> 
> This message contains information that may be privileged or confidential and 
> is the property of the Capgemini Group. It is intended only for the person to 
> whom it is addressed. If you are not the intended recipient, you are not 
> authorized to read, print, retain, copy, disseminate, distribute, or use this 
> message or any part thereof. If you receive this message in error, please 
> notify the sender immediately and delete all copies of this message.
> 

_______________________________________________
Kea-users mailing list
[email protected]
https://lists.isc.org/mailman/listinfo/kea-users

Reply via email to