Bugs item #773585, was opened at 2003-07-18 08:04
Message generated for change (Comment added) made by mchasal
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=773585&group_id=9368

Category: Installation
Group: 2.3
>Status: Closed
>Resolution: Invalid
Priority: 8
Submitted By: Jeff Squyres (jsquyres)
Assigned to: Michael Chase-Salerno (mchasal)
Summary: mksiimage hangs while making client image

Initial Comment:
This is on a vanilla RH 8.0 system in vmware.

Step XXX hangs when attempting to make the client image. The huge, long list of RPMs is output, but then it just sits there -- no RPMs are installed. No obvious errors have occurred. Here's a snippet of the output:

-----
18: 2003-6-18 7:32:57 [SystemInstaller::Package::Rpm :: Line 223] Performing RPM stage 1 install, command is:
19: 2003-6-18 7:32:57 [SystemInstaller::Package::Rpm :: Line 224] cd /tftpboot/rpm;rpm -ir /var/lib/systemimager/images/oscarimage -v --percent filesystem-2.1.6-5.noarch.rpm setup-2.5.20-1.noarch.rpm basesystem-8.0-1.noarch.rpm glibc-common-2.2.93-5.i386.rpm glibc-2.2.93-5.i686.rpm libtermcap-2.0.8-31.i386.rpm termcap-11.0.1-13.noarch.rpm
warning: filesystem-2.1.6-5.noarch.rpm: V3 DSA signature: NOKEY, key ID db42a60e
20: Preparing packages for installation...
21: setup-2.5.20-1
22: filesystem-2.1.6-5
23: basesystem-8.0-1
warning: pciutils-2.1.10-2.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
24: termcap-11.0.1-13
25: glibc-common-2.2.93-5
26: glibc-2.2.93-5
27: libtermcap-2.0.8-31
28: 2003-6-18 7:33:31 [SystemInstaller::Package::Rpm :: Line 223] Performing RPM stage 2 install, command is:
29: 2003-6-18 7:33:31 [SystemInstaller::Package::Rpm :: Line 224] cd /tftpboot/rpm;rpm -ir /var/lib/systemimager/images/oscarimage -v --percent pciutils-2.1.10-2.i386.rpm raidtools-1.00.2-3.3.i386.rpm libtiff-3.5.7-7.i386.rpm iproute-2.4.7
......lots more RPMs............
-----

and then it just sits there (full oscarinstall.log is attached; note that it's the second run of install_cluster because of the eth0/missing interface name in ODA problem).

Doing a "ps -auxww", I can see that mksiimage has been run with the following arguments:

-----
/usr/bin/perl /usr/bin/mksiimage -A --name oscarimage --location /tftpboot/rpm --filename /root/oscar/oscarsamples/redhat-8.0-i386.rpmlist --arch i686 --path /var/lib/systemimager/images/oscarimage --verbose --filename=/tmp/oscar-install-rpmlist.4946
-----

I can also see an "rpm" command, but it doesn't list any of its arguments in ps:

-----
[snipped]
root 5602 0.7 9.3 15996 11776 pts/0 S 07:33 0:10 rpm
root 5632 0.0 0.5 2608 692 pts/1 R 07:58 0:00 ps auxwwwww
-----

Indeed, double-checking /proc/5602/cmdline, it only shows "rpm" as well.

Wild guess/speculation: could we have an RPM list that is now so long that it has overflowed some buffer? And perhaps the rpm command is waiting for some kind of input?

----------------------------------------------------------------------

>Comment By: Michael Chase-Salerno (mchasal)
Date: 2003-07-22 19:46

Message:
Logged In: YES
user_id=99742

This is due to a bug in rpm that Jeff found.

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-22 15:04

Message:
Logged In: YES
user_id=11722

No, the rest of the VM appears to be running fine (i.e., I can log in and run other programs just fine). Disk space and RAM appear to be far less than full.

----------------------------------------------------------------------

Comment By: Michael Chase-Salerno (mchasal)
Date: 2003-07-22 14:31

Message:
Logged In: YES
user_id=99742

It sounds like an issue with rpm itself: if rpm isn't returning, mksiimage can't determine that it ended, or how. I don't know how we would be able to deal with this gracefully. I also don't know how an exported or unexported variable would affect this, especially since the stage 1 rpm install works. Is it possible that you are running out of resources in the VM? Maybe running out of memory or swap or something?

----------------------------------------------------------------------

Comment By: Brian Elliott Finley (brianfinley)
Date: 2003-07-22 14:22

Message:
Logged In: YES
user_id=140

Or simply a variable that is not exported, but needs to be?

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-22 07:42

Message:
Logged In: YES
user_id=11722

More clues:

It looks like many/all the RPMs from the massive "rpm -ir ..." line were installed. I.e., it looks like rpm did a bunch of stuff (and is still running, according to ps), but then effectively closed out. The rpm command then still hung, and SIS is not noticing.

I still have no output from the rpm -ir command (the progress bar on the tcl widget has not moved).

Any further ideas?

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-21 15:07

Message:
Logged In: YES
user_id=11722

The activity that I see seems to indicate that rpm is doing *something* for at least several minutes (i.e., load is up, disk is spinning, etc.), but there is no output during this time. Eventually, the load goes down to 0, the disk stops, etc. Hence, I'm guessing that *something* happened, but then caused it to hang.

Even weirder -- after it has hung (and while it is still hanging), I can run rpm commands (including installing and uninstalling RPMs), indicating that the RPM database is not locked.

----------------------------------------------------------------------

Comment By: Michael Chase-Salerno (mchasal)
Date: 2003-07-21 14:06

Message:
Logged In: YES
user_id=99742

Hmm, that's an interesting comment. I've always seen rpm output come a line at a time, no buffering at all. That may be part of the issue here. Tksis is looping on lines of output to update the status bar. If there is no output, it would appear to hang. If you're seeing rpm buffer output, that would be a problem.

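The line-at-a-time loop described above can be sketched as follows. This is only a minimal illustration in Python, not the Tksis or mksiimage code (which is Perl), and the function name `run_with_stall_detection`, its parameters, and the timeout values are all hypothetical. It shows one way a caller could tell "the child is silent" apart from "the child has exited": a reader thread feeds output lines into a queue, and the main loop gives up if no line arrives within a chosen interval. Note that many programs switch from line buffering to block buffering when their output is a pipe rather than a terminal, so a silent child is not necessarily a hung one.

```python
import queue
import subprocess
import sys
import threading


def run_with_stall_detection(cmd, stall_timeout, on_line=print):
    """Run cmd, pass each output line to on_line, and stop waiting if the
    child stays silent (or fails to exit) for stall_timeout seconds.

    Returns (exit_code, stalled); exit_code is None if the child was killed.
    All names here are illustrative, not part of SIS or OSCAR.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    lines = queue.Queue()

    def pump():
        # Forward every line from the child; None marks end of output.
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            line = lines.get(timeout=stall_timeout)
        except queue.Empty:
            # No output for stall_timeout seconds: treat as a stall.
            proc.kill()
            proc.wait()
            return None, True
        if line is None:
            # Output is closed; the child may still fail to exit.
            try:
                return proc.wait(timeout=stall_timeout), False
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
                return None, True
        on_line(line)  # e.g. advance a progress bar here


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('step 1'); print('step 2')"]
    print(run_with_stall_detection(child, stall_timeout=10))
```

Killing the child on a stall is a design choice made for the sketch; a real installer might prefer to log the stall and keep waiting, since a long package install can legitimately be quiet for a while.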
----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-21 12:22

Message:
Logged In: YES
user_id=11722

Is there a way to flush the output from rpm somehow? It seems like there's a whole lot of output from rpm all at once (when it succeeds), suggesting that the output is being buffered and flushed at some point.

I ask because I'm wondering if rpm gets halfway through and then dies/hangs, and there's some kind of useful output that we're not seeing because it's still buffered.

----------------------------------------------------------------------

Comment By: Michael Chase-Salerno (mchasal)
Date: 2003-07-21 12:01

Message:
Logged In: YES
user_id=99742

This has been discussed fairly extensively on IRC; here's a summary.

Currently we feel that mksiimage is properly unmounting /proc. The debug message is there in the output and there are no errors from it. This indicates that either umount is failing quietly, or that something later on is causing a remount. I'm guessing that umount is working, so Jason is investigating whether something else is causing a remount.

----------------------------------------------------------------------

Comment By: Michael Chase-Salerno (mchasal)
Date: 2003-07-21 11:21

Message:
Logged In: YES
user_id=99742

I'd be surprised if this were a buffer issue, since it's never been hit before and some very large images have been built. I'm not sure if something in VMware could be causing this, but it seems like the best possibility (at least from my perspective ;) ). Since this hasn't been seen outside of VMware, I'm inclined to attribute it to that. Does this only happen sporadically? I got that impression from the history here.

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-19 10:07

Message:
Logged In: YES
user_id=11722

Actually -- I lied. It just happened to me again in RH 8.0 under VMware 4.
So I guess this is still an active issue. :-(

[shrug]

----------------------------------------------------------------------

Comment By: Jeff Squyres (jsquyres)
Date: 2003-07-18 13:40

Message:
Logged In: YES
user_id=11722

I upgraded to VMware 4 and this problem went away. So I don't know if this is a real SIS problem or a VMware problem. [shrug]

Sean says he heard something about this on IRC the other day, so I'll downgrade the severity and leave this for the SIS folks to either close or fix...

----------------------------------------------------------------------

Comment By: Jason Brechin (brechin)
Date: 2003-07-18 11:50

Message:
Logged In: YES
user_id=274641

I have not seen this problem... I got through an installation just now... it was slow (VMware), but it ended up working.

----------------------------------------------------------------------

_______________________________________________
Oscar-devel mailing list
https://lists.sourceforge.net/lists/listinfo/oscar-devel
