Re: [CentOS] everything seems to hang, but system is idle?
On Sun, 2010-04-11 at 14:49 +0200, Rudi Ahlers wrote:

[r...@intranet ~]# yum install strace -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * local-addons: 192.168.1.250
 * local-base: 192.168.1.250
 * local-extras: 192.168.1.250
 * local-updates: 192.168.1.250
 * rpmforge: apt.sw.be
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package strace.x86_64 0:4.5.18-5.el5_4.4 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

==================================================================
 Package    Arch      Version             Repository       Size
==================================================================
Installing:
 strace     x86_64    4.5.18-5.el5_4.4    local-updates    177 k

Transaction Summary
==================================================================
Install      1 Package(s)
Upgrade      0 Package(s)

Total size: 177 k
Downloading Packages:
Running rpm_check_debug
Running Transaction Test

And that's where it sits and does nothing. The system's load isn't very high:

I saw this exact symptom once, on a server that had a dead NFS mount.

HTH,
Calin

Key fingerprint = 37B8 0DA5 9B2A 8554 FB2B 4145 5DC1 15DD A3EF E857
=
How many retired bricklayers from FLORIDA are out purchasing PENCIL SHARPENERS right NOW??

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, 2010-04-11 at 15:48 +0200, Rudi Ahlers wrote:

Yes, I think it could be a problematic iscsi config. Now that I think of it, the server wasn't rebooted in about 2 or 3 months and I did some iscsi testing a while ago, but a recent power outage could have enabled a faulty configuration. I just rebooted the server again, managed to remove iscsi this time, and will see if this solves the problem.

---

Well, along with that I would do a file system check. Then, if it keeps happening, stop just the nmbd service (not smbd, which is needed for CIFS sharing).

John
Re: [CentOS] everything seems to hang, but system is idle? [SOLVED]
On Mon, Apr 12, 2010 at 1:50 PM, JohnS jse...@gmail.com wrote:

On Sun, 2010-04-11 at 15:48 +0200, Rudi Ahlers wrote:

Yes, I think it could be a problematic iscsi config. Now that I think of it, the server wasn't rebooted in about 2 or 3 months and I did some iscsi testing a while ago, but a recent power outage could have enabled a faulty configuration. I just rebooted the server again, managed to remove iscsi this time, and will see if this solves the problem.

---

Well, along with that I would do a file system check. Then, if it keeps happening, stop just the nmbd service (not smbd, which is needed for CIFS sharing).

John

So far everything seems to be fine, apart from my Cobbler problem (see other thread). rsync has been running very well for the past few hours (I have been deleting and restoring a lot of stuff to check), so it was probably the improper iscsi mount that was giving issues.

Thanx for all the help :)

--
Kind Regards
Rudi Ahlers
SoftDux
Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532
Re: [CentOS] everything seems to hang, but system is idle? [SOLVED]
On Apr 12, 2010, at 10:25 AM, Rudi Ahlers rudiahl...@gmail.com wrote:

So far everything seems to be fine, apart from my Cobbler problem (see other thread). rsync has been running very well for the past few hours (I have been deleting and restoring a lot of stuff to check), so it was probably the improper iscsi mount that was giving issues.

Thanx for all the help :)

Don't mount a page-cached iSCSI target over loopback or you'll deadlock the page cache. Sounds like that is what happened.

-Ross
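Ross's warning concerns an initiator logged in to a target served by the same machine. A minimal sketch of how one might spot that, assuming the usual `iscsiadm -m session` line layout (`tcp: [N] ip:port,tpgt target-name`); the function name and the addresses in the usage line are made up for illustration, and the list of local addresses must be supplied by the caller:

```shell
#!/bin/sh
# Print the iSCSI session lines whose portal IP is one of this host's own
# addresses. Session lines are read on stdin; the local addresses are passed
# as arguments. Assumes the portal is the third field, e.g. 192.168.1.5:3260,1
local_iscsi_sessions() {
    awk -v locals="$*" '
        BEGIN {
            n = split(locals, a, " ")
            for (i = 1; i <= n; i++) is_local[a[i]] = 1
        }
        {
            portal = $3
            sub(/:[0-9]+,[0-9]+$/, "", portal)   # strip ":port,tpgt"
            if (portal in is_local) print
        }'
}

# Example (addresses are placeholders):
#   iscsiadm -m session | local_iscsi_sessions 127.0.0.1 192.168.1.5
```

This only shows that a session loops back to the local host. Whether the target behind it is page cached cannot be read from this output and has to be checked in the target's own configuration.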
[CentOS] everything seems to hang, but system is idle?
Hi All,

One of my servers recently started acting very weird. At first I couldn't import any images with cobbler, as rsync crashed the whole time. I was told to ask on the cobbler list (maybe it's not supported here?) but I left it at that (I am subscribed to far too many lists already).

Yesterday I wanted to copy some stuff from a USB disk to the server (using rsync to update changed and new files), but it seems that rsync hangs without any errors. Now, when I try to update CentOS, I get the same problem: yum hangs.

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs. For example:

=============================================================================
 Package                  Arch     Version                  Repository   Size
=============================================================================
Removing:
 iscsi-initiator-utils    x86_64   6.2.0.871-0.12.el5_4.1   installed    1.9 M
Removing for dependencies:
 gnome-applet-vm          x86_64   0.1.2-1.el5              installed    121 k
 libvirt                  x86_64   0.6.3-20.1.el5_4         installed    7.1 M
 libvirt-python           x86_64   0.6.3-20.1.el5_4         installed    431 k
 python-virtinst          noarch   0.400.3-5.el5            installed    1.4 M
 virt-manager             x86_64   0.6.1-8.el5              installed    4.9 M
 virt-viewer              x86_64   0.0.2-3.el5              installed     48 k
 xen                      x86_64   3.0.3-94.el5_4.3         installed    4.7 M

Transaction Summary
=============================================================================
Remove        8 Package(s)
Reinstall     0 Package(s)
Downgrade     0 Package(s)

Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test

[r...@intranet Torrents]# mount /dev/sdc1 /mnt/

Even running top does the same, and I can't kill top with CTRL+C, or even with killall -9 top from a new SSH session.

[r...@intranet ~]# ps ax | grep top
20825 pts/6    R+     0:00 grep top
[r...@intranet ~]# ps ax | grep rsync
 6536 ?        D      3:10 rsync -avz --progress /mnt/usb-backup/backups/current/home/www/linux/centos /home/www/linux/
 6537 ?        Z      0:00 [rsync] <defunct>
 8630 ?        S      0:00 rsync -avz --progress --stats /mnt/usb-backup/backups/current/home/www/linux/ /home/www/linux/
 8631 ?        Z      0:00 [rsync] <defunct>
20827 pts/6    R+     0:00 grep rsync
[r...@intranet ~]# ps ax | grep yum
20390 pts/4    S+     0:02 /usr/bin/python /usr/bin/yum remove iscsi-initiator-utils
20829 pts/6    R+     0:00 grep yum
[r...@intranet ~]#

/var/log/messages doesn't show me any errors.

Any suggestions on this?

--
Kind Regards
Rudi Ahlers
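In the ps listing above, rsync PID 6536 is in state D (uninterruptible sleep). A process in that state is waiting inside the kernel, usually on I/O, and does not act on signals until the wait ends, which would explain why kill -9 has no effect. A minimal sketch for listing such processes; the function name is made up for illustration, and the `wchan:32` column width is an assumption about the installed procps version:

```shell
#!/bin/sh
# Keep only the lines whose second column (STAT) starts with "D".
# Expects "PID STAT ..." lines on stdin, so it works on live ps output
# as well as on a saved listing.
filter_dstate() {
    awk '$2 ~ /^D/ { print }'
}

# Live use; wchan names the kernel function the task is sleeping in:
#   ps -eo pid,stat,wchan:32,args | filter_dstate
```

The wchan column often hints at the subsystem involved (block layer, NFS, and so on), which is information /var/log/messages will not necessarily contain.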
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, 2010-04-11 at 12:58 +0200, Rudi Ahlers wrote:

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs.

---

Try killing off those rsyncs and try it again. You need to provide some other type of error messages. Use strace. Run tail /var/log/messages and paste the output in your reply. Even if you don't see anything in it, that does not mean someone else can't. You may need a reboot.

John
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 2:04 PM, JohnS jse...@gmail.com wrote:

On Sun, 2010-04-11 at 12:58 +0200, Rudi Ahlers wrote:

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs.

---

Try killing off those rsyncs and try it again. You need to provide some other type of error messages. Use strace. Run tail /var/log/messages and paste the output in your reply. Even if you don't see anything in it, that does not mean someone else can't. You may need a reboot.

John

John, I already said I can't kill the process, and tail -f /var/log/messages *really* doesn't show me anything. I am running tail -f /var/log/messages in one SSH window, and at the same time I killed and re-ran yum remove iscsi-initiator-utils -y in another SSH window. /var/log/messages has *nothing* to report.

[r...@intranet ~]# tail -f /var/log/messages
Apr 11 14:18:50 intranet nmbd[4310]: find_domain_master_name_query_fail:
Apr 11 14:18:50 intranet nmbd[4310]: Unable to find the Domain Master Browser name SOFTDUX1b for the workgroup SOFTDUX.
Apr 11 14:18:50 intranet nmbd[4310]: Unable to sync browse lists in this workgroup.
Apr 11 14:18:50 intranet nmbd[4310]: [2010/04/11 14:18:50, 0] nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(351)
Apr 11 14:18:50 intranet nmbd[4310]: find_domain_master_name_query_fail:
Apr 11 14:18:50 intranet nmbd[4310]: Unable to find the Domain Master Browser name SOFTDUX1b for the workgroup SOFTDUX.
Apr 11 14:18:50 intranet nmbd[4310]: Unable to sync browse lists in this workgroup.
Apr 11 14:19:14 intranet snmpd[3912]: error scanning interface data (expected 10, got 0)
Apr 11 14:20:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:22:14 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:23:44 intranet snmpd[3912]: last message repeated 6 times

The ONLY fix is a reboot, but I don't want to reboot every few minutes (I have done it a few times already today). The server is a dev server, and everyone else working on it (mainly web development) has to wait, which cuts into production time.

--
Kind Regards
Rudi Ahlers
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 2:25 PM, Rudi Ahlers rudiahl...@gmail.com wrote:

[snip]

The ONLY fix is a reboot, but I don't want to reboot every few minutes (I have done it a few times already today). The server is a dev server, and everyone else working on it (mainly web development) has to wait, which cuts into production time.

I can't install strace either:

[r...@intranet ~]# yum install strace -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * local-addons: 192.168.1.250
 * local-base: 192.168.1.250
 * local-extras: 192.168.1.250
 * local-updates: 192.168.1.250
 * rpmforge: apt.sw.be
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package strace.x86_64 0:4.5.18-5.el5_4.4 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

==================================================================
 Package    Arch      Version             Repository       Size
==================================================================
Installing:
 strace     x86_64    4.5.18-5.el5_4.4    local-updates    177 k

Transaction Summary
==================================================================
Install      1 Package(s)
Upgrade      0 Package(s)

Total size: 177 k
Downloading Packages:
Running rpm_check_debug
Running Transaction Test

And that's where it sits and does nothing.

The system's load isn't very high:

[r...@intranet ~]# uptime
 14:32:25 up 1 day, 2:05, 6 users, load average: 2.02, 2.02, 2.01

and again /var/log/messages reports nothing related to this problem:

[r...@intranet ~]# tail -f /var/log/messages
Apr 11 14:19:14 intranet snmpd[3912]: error scanning interface data (expected 10, got 0)
Apr 11 14:20:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:22:14 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:23:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:25:14 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:26:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:28:14 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:29:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:31:14 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:32:44 intranet snmpd[3912]: last message repeated 6 times
Apr 11 14:33:23 intranet snmpd[3912]: last message repeated 3 times
Apr 11 14:33:23 intranet snmpd[3912]: Received TERM or STOP signal... shutting down...

I stopped snmpd since it's not being used. After that, no other errors came up that would tell me what causes this.

--
Kind Regards
Rudi Ahlers
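A note on the load figure: on Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a load that sits at about 2.0 on an idle machine is consistent with two tasks stuck in D rather than with CPU work. A minimal sketch for counting both kinds; the function name is made up for illustration:

```shell
#!/bin/sh
# Count tasks by the first letter of their state, for the two states that
# contribute to the Linux load average: R (runnable) and D (uninterruptible).
# Expects one STAT value per line on stdin.
count_load_states() {
    awk '$1 ~ /^[RD]/ { n[substr($1, 1, 1)]++ }
         END { printf "R=%d D=%d\n", n["R"], n["D"] }'
}

# Live use; "stat=" suppresses the header line:
#   ps -eo stat= | count_load_states
```

The ps process running the query shows up as R itself, so the R count is normally at least 1.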
Re: [CentOS] everything seems to hang, but system is idle?
Check dmesg. The kernel may be reporting disk or filesystem I/O problems that are not going to syslog.

-geoff

-
Geoff Galitz
Blankenheim NRW, Germany
http://www.galitz.org/
http://german-way.com/blog/

-----Original Message-----
From: centos-boun...@centos.org [mailto:centos-boun...@centos.org] On Behalf Of Rudi Ahlers
Sent: Sunday, 11 April 2010 14:49
To: CentOS mailing list
Subject: Re: [CentOS] everything seems to hang, but system is idle?

On Sun, Apr 11, 2010 at 2:25 PM, Rudi Ahlers rudiahl...@gmail.com wrote:
On Sun, Apr 11, 2010 at 2:04 PM, JohnS jse...@gmail.com wrote:
On Sun, 2010-04-11 at 12:58 +0200, Rudi Ahlers wrote:

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs.

---

Try killing off those rsyncs and try it again. You need to provide some other type of error messages. Use strace. Run tail /var/log/messages and paste the output in your reply. Even if you don't see anything in it, that does not mean someone else can't. You may need a reboot.

...
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 3:00 PM, Geoff Galitz ge...@galitz.org wrote:

Check dmesg. The kernel may be reporting disk or filesystem I/O problems that are not going to syslog.

-geoff

Thanx Geoff,

Already checked that, without any decent lead either:

[r...@intranet ~]# tail -f /var/log/dmesg
md: ... autorun DONE.
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-0, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 8388600k swap on /dev/nas/swap. Priority:-1 extents:1 across:8388600k
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (8192 buckets, 65536 max) - 304 bytes per conntrack
device vif0.0 entered promiscuous mode
xenbr0: topology change detected, propagating
xenbr0: port 1(vif0.0) entering forwarding state
r8169: peth0: link up
device peth0 entered promiscuous mode
xenbr0: topology change detected, propagating
xenbr0: port 2(peth0) entering forwarding state
virbr0: no IPv6 routers present
r8169: eth1: link up
eth0: no IPv6 routers present
eth1: no IPv6 routers present
fuse init (API version 7.10)
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for reconstruction.
md: using 128k window, over a total of 104320 blocks.
md: syncing RAID array md2
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for reconstruction.
md: using 128k window, over a total of 244195904 blocks.
md: md2: sync done.
RAID1 conf printout:
 --- wd:1 rd:2
 disk 1, wo:0, o:1, dev:sda1
md: md0: sync done.
RAID1 conf printout:
 --- wd:2 rd:2
 disk 0, wo:0, o:1, dev:hda1
 disk 1, wo:0, o:1, dev:hdb1
usb 1-1: new high speed USB device using ehci_hcd and address 3
usb 1-1: configuration #1 chosen from 1 choice
scsi3 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
  Vendor: Kingston  Model: DT Mini Slim  Rev: 1.00
  Type: Direct-Access  ANSI SCSI revision: 02
SCSI device sdc: 31506432 512-byte hdwr sectors (16131 MB)
sdc: Write Protect is off
sdc: Mode Sense: 23 00 00 00
sdc: assuming drive cache: write through
SCSI device sdc: 31506432 512-byte hdwr sectors (16131 MB)
sdc: Write Protect is off
sdc: Mode Sense: 23 00 00 00
sdc: assuming drive cache: write through
 sdc: sdc1
sd 3:0:0:0: Attached scsi removable disk sdc
sd 3:0:0:0: Attached scsi generic sg2 type 0
usb-storage: device scan complete
SCSI device sdc: 31506432 512-byte hdwr sectors (16131 MB)
sdc: Write Protect is off
sdc: Mode Sense: 23 00 00 00
sdc: assuming drive cache: write through
 sdc:
SCSI device sdc: 31506432 512-byte hdwr sectors (16131 MB)
sdc: Write Protect is off
sdc: Mode Sense: 23 00 00 00
sdc: assuming drive cache: write through
 sdc:

/dev/sdc is a faulty USB memory stick, which I put in just before posting this, hence the errors. But the hangs were already happening before I put it in.

--
Kind Regards
Rudi Ahlers
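One caveat about the check above: /var/log/dmesg is, as far as I know, typically a snapshot written at boot, while the dmesg command reads the kernel's live ring buffer, so the two can differ on a machine that has been up for a while. A minimal sketch for scanning the live buffer; the function name and the pattern list are only illustrative and far from exhaustive, and the "blocked for more than" message appears only on kernels that include the hung-task detector:

```shell
#!/bin/sh
# Print kernel log lines that commonly accompany stuck I/O.
# Reads log text on stdin; the patterns are examples, not a complete list.
scan_kernel_log() {
    grep -E -i 'blocked for more than|I/O error|iscsi|nfs: server'
}

# Live use against the ring buffer rather than the boot-time file:
#   dmesg | scan_kernel_log
```

No output from this filter does not prove the storage stack is healthy. It only means none of these particular messages were logged.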
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 6:58 AM, Rudi Ahlers rudiahl...@gmail.com wrote:

[snip]

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs.

Is the server mounting any remote filesystems?
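One way to answer this question without relying on memory is to read the mount table. A minimal sketch, assuming the standard /proc/mounts field order (device, mount point, filesystem type, options); the function name and the list of filesystem types are illustrative:

```shell
#!/bin/sh
# Print "fstype device mountpoint" for network filesystems.
# Expects /proc/mounts-style lines on stdin.
list_remote_mounts() {
    awk '$3 ~ /^(nfs|nfs4|cifs|smbfs)$/ { print $3, $1, $2 }'
}

# Live use:
#   list_remote_mounts < /proc/mounts
```

iSCSI-backed disks show up as ordinary block devices with a local filesystem type such as ext3, so this check does not flag them. They need a separate look, for example with iscsiadm -m session.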
Re: [CentOS] everything seems to hang, but system is idle?
Thanx Geoff, Already checked that, without any decent lead either:

Have you tried iostat, vmstat or sar to see if there is unusual activity? Were there any changes to the kernel lately (such as an update or a new module)? Or perhaps an NFS/CIFS mount has gone wonky and is causing blocking?

-geoff

-
Geoff Galitz
Blankenheim NRW, Germany
http://www.galitz.org/
http://german-way.com/blog/
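Of the tools named here, vmstat gives the quickest signal for this kind of hang: its "b" column is the number of processes in uninterruptible sleep, and "wa" is the share of CPU time spent waiting for I/O. A minimal sketch that keeps only the samples with blocked processes, assuming the common column layout in which "b" is the 2nd field and "wa" the 16th; the function name is made up for illustration:

```shell
#!/bin/sh
# From vmstat output on stdin, report samples where at least one process
# is blocked. Header lines are skipped because their first field is not
# a number.
blocked_samples() {
    awk '$1 ~ /^[0-9]+$/ && $2 > 0 { print "blocked=" $2, "wa=" $16 }'
}

# Live use, six samples five seconds apart:
#   vmstat 5 6 | blocked_samples
```

A "b" value that stays above zero while "bi" and "bo" stay near zero would suggest processes waiting on I/O that never completes, which fits a dead mount better than a busy disk.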
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 3:33 PM, Kwan Lowe kwan.l...@gmail.com wrote:

On Sun, Apr 11, 2010 at 6:58 AM, Rudi Ahlers rudiahl...@gmail.com wrote:

[snip]

At the same time I can open a new SSH session and do whatever I like. But it seems that any command which takes time to complete hangs.

Is the server mounting any remote filesystems?

Nope, but I did notice that iscsi was giving errors, probably left over from an iscsi test about 2 months ago. The iscsi settings had been removed, and I just rebooted in order to uninstall iscsi, which was successful this time.

--
Kind Regards
Rudi Ahlers
Re: [CentOS] everything seems to hang, but system is idle?
On Sun, Apr 11, 2010 at 3:32 PM, Geoff Galitz ge...@galitz.org wrote:

Thanx Geoff, Already checked that, without any decent lead either:

Have you tried iostat, vmstat or sar to see if there is unusual activity? Were there any changes to the kernel lately (such as an update or a new module)? Or perhaps an NFS/CIFS mount has gone wonky and is causing blocking?

-geoff

-
Geoff Galitz
Blankenheim NRW, Germany
http://www.galitz.org/
http://german-way.com/blog/

Yes, I think it could be a problematic iscsi config. Now that I think of it, the server wasn't rebooted in about 2 or 3 months and I did some iscsi testing a while ago, but a recent power outage could have enabled a faulty configuration. I just rebooted the server again, managed to remove iscsi this time, and will see if this solves the problem.

--
Kind Regards
Rudi Ahlers