On Fri, Jun 26, 2020 at 5:22 PM Stuart Henderson <s...@spacehopper.org> wrote:
> On 2020/06/26 15:30, sven falempin wrote:
> > behavior confirmed on current.
> >
> > Once the process stalls, ( could be anything writing to the vnconfig disk,
> > cp , umount )
> > a few other calls like df , or ps, etc may hang, never the same
> > sp or mp kernel, reproduced on today's snapshots.
>
> vnconfig is used as part of "make release", many builds are done every
> week using this so it's not a general problem with vnconfig.
>
> Can you show some commands or a script to trigger the behaviour?
>

The Perl script uses system() to call:

vnconfig
mount
umount   <- seen hanging
cp       <- seen hanging
tar      <- seen hanging
svn up   <- seen hanging
dd
newfs

Really nothing fancy; only commands writing to disk got stuck.
At one point the script does a chroot, but it never hangs near that;
most of the time it hangs before.

The script has been run about 1000 times on 6.0 and maybe twice as many
times on 6.4.

I have absolutely no idea what the 'needbuf' shown by top means.

The script hangs at a random position, always while writing to the
vnconfig disk.

I have no idea how to reproduce it outside the Perl script, so maybe it is
related to some devious Perl stdin/stdout buffering.

Nevertheless, there is about a 5% chance that the script will work (slowly).

Most of the system() calls go through a logging routine:

    sub debug_system
    {
        $logger->debug('running: '.join(' ', @_));
        return system(@_);
    }

so I can easily put things inside it to try to understand the issue.

It is really strange behavior, and the device has to be power-cycled.

Something really odd: I run syslogd with a memory buffer, and syslogc on
that buffer gets stuck too when the device is stuck (but it is supposed to
be mostly already-allocated memory).
It's really as if the vm system does not want to hand out any more buckets
(I don't know what I'm talking about here, but it looks like anything that
doesn't malloc is OK: the computer replies to ping, can do a few things for
a while, and then hangs completely).

I ran the 6.7 release on a virtual machine somewhere and on another device
with many Perl scripts, and they work. Only this one fails 95% of the time
and is VERY VERY slow when it does work.

Compared to what I saw in /usr/src, the vnconfig image is big (I forgot to
copy the df -h output), about 2GB.

--
--
---------------------------------------------------------------------------------------------------------------------
Knowing is not enough; we must apply. Willing is not enough; we must do