I've got an RH 7.2 machine running as an SMB client, writing backup files to a Win98 box that serves as backup storage. Sometimes the Win98 machine goes down and, obviously, the remote files are not accessible.
The problem is that the RH machine has some cron scripts that write to the SMB shares. When the Win98 machine goes down, the scripts hang when they can't write to the shared file. `Cron` dutifully continues to run the script every 15 minutes, each run trying to write to the same file, and a "traffic jam" of these processes starts to accumulate: they halt, but don't die, while trying unsuccessfully to write to the file. The scripts continue to exist as processes, which can be seen piling up with `ps` (edited for clarity):

    root     1695   937  0 01:00 ?  00:00:00 CROND
    manager  1696  1695  0 01:00 ?  00:00:00 /bin/sh -c b_up.sh
    manager  1700  1696  0 01:00 ?  00:00:00 cp LOCAL REMOTE
    root     1711   937  0 01:15 ?  00:00:00 CROND
    manager  1712  1711  0 01:15 ?  00:00:00 /bin/sh -c b_up.sh
    manager  1716  1712  0 01:15 ?  00:00:00 cp LOCAL REMOTE
    root     1721   937  0 01:30 ?  00:00:00 CROND
    manager  1722  1721  0 01:30 ?  00:00:00 /bin/sh -c b_up.sh
    manager  1726  1722  0 01:30 ?  00:00:00 cp LOCAL REMOTE
    root     1727   937  0 01:45 ?  00:00:00 CROND
    manager  1728  1727  0 01:45 ?  00:00:00 /bin/sh -c b_up.sh
    manager  1732  1728  0 01:45 ?  00:00:00 cp LOCAL REMOTE
    ...etc, ad infinitum

But even after the Win98 machine comes back up, the scripts don't complete and are still left in a state of suspension.

I tried adding some error handling, but in the following code `cp` never returns an error; it just locks up forever after trying to access the inaccessible REMOTE file:

    cp "${LOCAL}" "${REMOTE}"
    ERR=$?
    if [ $ERR -ne 0 ]
    then
        echo "error copying file" | mail -s ERROR root
        exit
    fi

Additionally, the process jam seems to lock up my email subsystem so that no email error messages can get out, and I can't ftp to the RH box either.
Logwatch reports the following:

    --------------------- sendmail Begin ------------------------
    264707 bytes transferred
    7 messages sent

    **Unmatched Entries**
    rejecting connections on daemon MTA: load average: 169
    rejecting connections on daemon MTA: load average: 169
    rejecting connections on daemon MTA: load average: 169
    rejecting connections on daemon MTA: load average: 169
    ...etc, ad infinitum

So... does anyone have ideas on how I can resolve this SMB lockup problem, which seems to cascade into other problems? Any assistance will be greatly appreciated.

Thanks!
Cosmo Lee
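
P.S. A rough sketch of one possible stopgap for the pile-up described above: have the script take a lock before copying, so that a later cron run skips itself while an earlier one is still stuck. The lock path, the 60-second limit, and the function names (`acquire_lock`, `release_lock`, `copy_with_limit`, `backup_once`) are all assumptions made for illustration, not part of the real b_up.sh. The time limit uses the GNU coreutils `timeout` command only if it happens to be installed. A `cp` blocked inside the kernel on a dead SMB mount may ignore the signal, in which case the lock is simply never released and later runs keep skipping instead of stacking up.

```shell
#!/bin/sh
# Sketch only: LOCKDIR, the 60-second limit and all function names are
# assumptions for illustration, not taken from the original b_up.sh.

LOCKDIR=${LOCKDIR:-/tmp/b_up.lock}

# Take the lock by creating a directory. mkdir either creates it or
# fails, so only one caller can hold the lock at a time.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# Drop the lock again once the copy has returned.
release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

# Copy with a time limit when the coreutils `timeout` command exists;
# otherwise fall back to a plain cp. timeout exits with 124 if the
# limit was hit.
copy_with_limit() {
    if command -v timeout >/dev/null 2>&1; then
        timeout 60 cp "$1" "$2"
    else
        cp "$1" "$2"
    fi
}

# Run one backup copy. Returns 2 without copying if a previous run
# still holds the lock, otherwise returns the status of the copy.
backup_once() {
    if ! acquire_lock; then
        echo "previous backup still running, skipping" >&2
        return 2
    fi
    rc=0
    copy_with_limit "$1" "$2" || rc=$?
    release_lock
    return $rc
}
```

The script would then call `backup_once "$LOCAL" "$REMOTE"` in place of the bare `cp` and mail root on a non-zero status other than 2. If a run is killed from outside, the lock directory is left behind and has to be removed by hand before backups resume.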