Re: Strange named freezing

2021-12-27 Thread Gregory Sloop
Entropy is not RAM or CPU.
 
VMs or jails can often have difficulty accessing truly good sources of random 
events, and thus may not have enough available entropy to handle 
encryption/cryptography functions in a timely manner.
 
See:
https://www.google.com/search?q=entropy
 
 
  


> More than enough. During the last freeze the server had ~30 GB of free RAM, 
> ~2-3% CPU load, and more than 200 GB of free storage space for this jail. The 
> DC jail doesn't have any resource limits. It's very strange, because when I 
> previously ran the DC in a similar jail on this server I didn't have this trouble.

> 27.12.2021 11:07, Ondřej Surý wrote:

>> Does the jail have enough entropy? That would be my first guess…

>> --
>> Ondřej Surý — ISC (He/Him)

>> My working hours and your working hours may be different. Please do not feel 
>> obligated to reply outside your normal working hours.
>>> On 13. 12. 2021, at 7:18, Nikita Druba  wrote:

>>> What could be wrong here? How can I narrow down the problem further?


> ___
> Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
> from this list

> ISC funds the development of this software with paid support subscriptions. 
> Contact us at https://www.isc.org/contact/ for more information.


> bind-users mailing list
> bind-users@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users

-- 
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x121
EMail: gr...@sloop.net
http://www.sloop.net


Re: Strange named freezing

2021-12-27 Thread Matus UHLAR - fantomas

On 27.12.21 17:04, Nikita Druba wrote:
More than enough. During the last freeze the server had ~30 GB of free RAM, 
~2-3% CPU load, and more than 200 GB of free storage space for this jail. 
The DC jail doesn't have any resource limits. It's very strange, because 
when I previously ran the DC in a similar jail on this server I didn't 
have this trouble.


you don't know what entropy is, right?
on linux do:

# cat /proc/sys/kernel/random/entropy_avail
3940

if this number gets to 0, you'll have problems using /dev/random (which
is a blocking device), and that leads to problems like the ones you have described.

using /dev/urandom instead should help.
there are daemons like haveged that can help you provide entropy.

some HW random number generators can also serve as an entropy source.
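
The check above can also be scripted. This is a minimal sketch in Python, assuming a Linux-style procfs; the function name read_entropy_avail is made up for illustration, and on systems that lack this file (quite possibly the original poster's FreeBSD jail, which is an assumption) it simply reports that the value is unavailable.

```python
from pathlib import Path
from typing import Optional

# Linux-specific path, taken from the cat command shown above.
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path: str = ENTROPY_PATH) -> Optional[int]:
    """Return the kernel's entropy estimate, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux system) or unexpected content.
        return None


value = read_entropy_avail()
if value is None:
    print("entropy_avail is not available on this system")
else:
    print(f"entropy_avail = {value}")
```

Running it periodically while named is frozen would show whether the estimate really drops at those moments.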



27.12.2021 11:07, Ondřej Surý wrote:

Does the jail have enough entropy? That would be my first guess…



On 13. 12. 2021, at 7:18, Nikita Druba  wrote:

What could be wrong here? How can I narrow down the problem further?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
42.7 percent of all statistics are made up on the spot.


Re: Strange named freezing

2021-12-27 Thread Nikita Druba
More than enough. During the last freeze the server had ~30 GB of free RAM, 
~2-3% CPU load, and more than 200 GB of free storage space for this jail. 
The DC jail doesn't have any resource limits. It's very strange, because 
when I previously ran the DC in a similar jail on this server I didn't 
have this trouble.


27.12.2021 11:07, Ondřej Surý wrote:

Does the jail have enough entropy? That would be my first guess…

--
Ondřej Surý — ISC (He/Him)

My working hours and your working hours may be different. Please do not feel 
obligated to reply outside your normal working hours.


On 13. 12. 2021, at 7:18, Nikita Druba  wrote:

What could be wrong here? How can I narrow down the problem further?





Re: Strange named freezing

2021-12-27 Thread Ondřej Surý
Does the jail have enough entropy? That would be my first guess…

--
Ondřej Surý — ISC (He/Him)

My working hours and your working hours may be different. Please do not feel 
obligated to reply outside your normal working hours.

> On 13. 12. 2021, at 7:18, Nikita Druba  wrote:
> 
> What could be wrong here? How can I narrow down the problem further?


Re: Strange named freezing

2021-12-27 Thread Nikita Druba
I apologize for the persistence, but are there any recommendations 
for debugging?


13.12.2021 7:18, Nikita Druba wrote:

Hi!

My system: FreeBSD 12.2 with ZFS. Samba 4.13.14 runs in a jail with 
BIND 9.16.23 as its DNS backend. I also have BIND 9.16.23 on another 
server, working as a secondary DNS. The secondary BIND gets zones from 
the DC by zone transfer with a TSIG key. The DC also listens on several 
subnetworks (loopback and 3 others).


Some time ago I moved the DC from one jail to another, and BIND now 
behaves strangely on the new DC.


When I set another DNS server in resolv.conf on the new DC, for example 
the old DC or the secondary BIND, everything works fine. The new DC 
successfully resolves any records with the nslookup or host commands, 
from itself or from another host.


When I set localhost or its own internal IP in resolv.conf on the new 
DC, BIND periodically freezes, following this pattern:


- BIND stops replying to requests for ~5 minutes. Then it starts 
working again without a service restart, and later freezes again.


- In the daytime (when employees are in the office) it freezes after 
less than 1 minute of work; at night, after 10-15 minutes.


- If I change resolv.conf from the secondary BIND to the internal IP, 
there is no need to restart BIND or Samba for the periodic freezing to 
start or stop. Just change the nameserver record and wait. If it was 
frozen when resolv.conf was changed, it stays frozen until ~5 minutes 
after the freeze began and then works fine.


- If I change resolv.conf from the secondary BIND to loopback, then I 
NEED to restart BIND for the freezing to start or stop.


- When BIND is frozen, the service stop command does not stop it and a 
default kill does not kill it; only kill -9 works.


- The internal Samba DNS works fine and doesn't freeze when resolv.conf 
points to localhost.


- Sometimes BIND does not freeze for all subnetworks. It can freeze for 
localhost and 2 subnetworks while, in the one remaining subnetwork, the 
DC's BIND still successfully resolves any records from any subnetwork. 
But I saw this situation only once and can't reproduce it for now.


- There are no special BIND log records with "debug 50" at or before 
the time of freezing. It freezes after arbitrary messages, and I see 
all of the same messages in the log when BIND works without freezing.


- I tried running BIND with logging to the terminal, but didn't see any 
additional information when it froze. The terminal shows the same as 
the log files.


- rndc freezes as well.
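
To put timestamps on the freeze pattern listed above, a small external probe can help. The following is only a sketch: it hand-builds a minimal DNS A query (RFC 1035 wire format) with the Python standard library and reports whether a reply arrives within a timeout. The function names, the server address and the query name are placeholders for illustration, not anything taken from the setup described here.

```python
import socket
import struct
import time
from typing import Optional


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def probe(server: str, name: str, port: int = 53,
          timeout: float = 2.0) -> Optional[float]:
    """Return the round-trip time in seconds, or None if no valid reply came."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return None
        if reply[:2] != query[:2]:  # reply ID must match the query ID
            return None
        return time.monotonic() - start


def watch(server: str, name: str, interval: float = 5.0,
          rounds: int = 10) -> None:
    """Probe repeatedly and print one timestamped line per attempt."""
    for _ in range(rounds):
        rtt = probe(server, name)
        stamp = time.strftime("%H:%M:%S")
        print(stamp, "no reply" if rtt is None else f"{rtt * 1000:.1f} ms")
        time.sleep(interval)
```

Calling, for example, watch("127.0.0.1", "example.com") from inside the jail, and the same against the internal IP, would show when each address stops answering and for how long.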

I found one way to work around this problem. The server hosting the DC 
jail has 40 CPUs (20 cores, 40 threads). So when I start named, it 
creates 40 workers for every listen IP, i.e. 40 TCP and 40 UDP for 
every IP.


Because that is too many for my configuration, I intuitively decided 
to try reducing the number of named workers to 10 with "-n 10". 
Everything has worked without freezing, with the correct resolv.conf, 
for the last 2 weeks.
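
As a rough illustration of the scale involved, the arithmetic can be written out. This sketch assumes four listen addresses (loopback plus the 3 other subnetworks mentioned earlier) and one UDP plus one TCP listener per worker per address, which is the description given in this message rather than something checked against named itself; the function name is made up.

```python
def listener_sockets(workers: int, listen_addresses: int) -> int:
    """Listeners if each worker opens one UDP and one TCP socket per address."""
    return workers * listen_addresses * 2


# 4 listen addresses is an assumption: loopback plus 3 other subnetworks.
for workers in (40, 10):
    print(f"-n {workers}: {listener_sockets(workers, 4)} listening sockets")
```

Under those assumptions the automatic setting gives 320 listeners and "-n 10" gives 80.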


Afterwards I tried setting "-n 40", the same value named picks 
automatically. After the restart named froze again. Maybe it was a 
coincidence, but with other settings named did not stop freezing. I 
also noticed that when named works without freezing, the "number of 
zones" in the "rndc status" output decreases from 9 to 3. It seems that 
named is missing the Samba zones, yet resolving records from them works 
fine.


I tried to collect some logs with ktrace and caught the moment of a 
freeze. After the last record of the usual log (when BIND freezes), 
kdump shows the following records repeated many times:


 36460 named    CALL  nanosleep(0x7fffea30,0)
 36460 named    RET   nanosleep 0

What could be wrong here? How can I narrow down the problem further?
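
One way to narrow this down is to summarize a longer kdump capture rather than reading it by eye. The sketch below counts CALL records per system call, assuming the line layout shown in the excerpt above (pid, command, CALL, then the syscall name); the regular expression and function name are illustrative and may need adjusting for other kdump options.

```python
import re
from collections import Counter

# Matches lines such as: " 36460 named    CALL  nanosleep(0x7fffea30,0)"
CALL_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+CALL\s+(\w+)\(")


def count_syscalls(kdump_text: str) -> Counter:
    """Count CALL records per system call name in kdump output."""
    counts = Counter()
    for line in kdump_text.splitlines():
        match = CALL_RE.match(line)
        if match:
            counts[match.group(3)] += 1
    return counts


sample = """\
 36460 named    CALL  nanosleep(0x7fffea30,0)
 36460 named    RET   nanosleep 0
 36460 named    CALL  nanosleep(0x7fffea30,0)
 36460 named    RET   nanosleep 0
"""
print(count_syscalls(sample).most_common(5))
```

If nanosleep dominates a capture taken during a freeze, that may hint that threads are waiting rather than doing network I/O, though the count alone does not identify the cause.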


