Hi Michael,

Your analysis was exactly right. After replacing the Ethernet cables with 
new ones and moving the switch to a more suitable place (it had been 
sitting on top of a hot computer, which may have caused it to 
malfunction), the bizarre errors were gone and the client 
installation finished within a minute. My new cluster is now up 
and running.

I cannot thank you enough for the help. I had assumed all along 
that the problem was in the computers themselves. It never 
occurred to me that the problem was actually in the Ethernet cables 
and/or the switch!
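
For anyone who finds this thread with similar symptoms: a bad cable or 
switch port often shows up as non-zero error counters on the network 
interface. Below is a minimal sketch of one way to check, assuming a 
Linux host where /proc/net/dev is readable. The function names 
(parse_net_dev, interfaces_with_errors) are my own for illustration and 
are not part of OSCAR.

```python
from pathlib import Path

# Column positions in /proc/net/dev after the "iface:" prefix.
# Receive:  bytes packets errs drop fifo frame compressed multicast
# Transmit: bytes packets errs drop fifo colls carrier compressed
FIELDS = {
    "rx_errs": 2,
    "rx_drop": 3,
    "rx_frame": 5,
    "tx_errs": 10,
    "tx_drop": 11,
    "tx_carrier": 14,
}


def parse_net_dev(text):
    """Return {interface: {counter_name: value}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        values = data.split()
        if len(values) < 16:
            continue  # unexpected layout, ignore the line
        stats[name.strip()] = {key: int(values[i]) for key, i in FIELDS.items()}
    return stats


def interfaces_with_errors(stats):
    """List interfaces (sorted) that have at least one non-zero counter."""
    return sorted(name for name, c in stats.items() if any(c.values()))


if __name__ == "__main__":
    proc = Path("/proc/net/dev")
    if proc.exists():
        stats = parse_net_dev(proc.read_text())
        for name in interfaces_with_errors(stats):
            print(name, stats[name])
```

Running it once, pushing some data between the head node and a client, 
and running it again should show whether the counters keep climbing. If 
they do, the cabling, switch, or NIC is a more likely suspect than the 
OSCAR setup.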

Shiang-Tai Lin

Michael Edwards wrote:
> Have you tried a different switch?  Or maybe even a crossover cable...
> That would eliminate the network fabric as the problem.
> This is probably a long shot, but it looks like you are having network 
> issues of some kind.
>
> Also, check /var/log/messages to see if there are any obvious network 
> errors.
>
> You could also try and find some 10/100 ethernet cards and try imaging 
> using those (the network drivers are better).  That would let you know 
> if the network card drivers are bad.
>
> If the network cards are on the motherboard, you could check for 
> firmware/BIOS updates at your motherboard manufacturer's website.  Some 
> network cards are known to be flaky with Linux as well, so you might check 
> message boards and Google for your network card model and your Linux 
> distribution.
>
> On Mon, Apr 28, 2008 at 10:46 PM, stlin <[EMAIL PROTECTED]> wrote:
>
>     Hi Michael:
>
>     I followed your suggestion and did a clean install of the OS (FC8)
>     and OSCAR (5.1b2). The client installation again got stuck at
>     "Quietly installing image..." and failed after about 2 hrs. Note that
>     I turned off the firewall on the head node during the OS install.
>     The OSCAR log file is posted at
>     from beginning to step 3: http://pastebin.ca/1001279
>     steps 4 to 6: http://pastebin.ca/1001283
>     and the client terminal output: http://pastebin.ca/1001286
>
>     Although the client installation did not complete, I can ssh from the
>     client to the server. So for some unknown reason, data transfer
>     between the head and client nodes seems to be extremely slow, and
>     rsync seemed to fail after several hundred minutes.
>
>     Thanks a lot for looking into this problem for me.
>
>     Sincerely,
>     Shiang-Tai Lin
>
>     Michael Edwards wrote:
>     > Personally I would start from a clean OS install.  I might even try
>     > downloading the repositories again.  It sounds like something is
>     > very broken.  When you try again, try running the wizard by doing
>     >
>     > env OSCAR_VERBOSE=3 ./install_cluster eth0
>     >
>     > I wouldn't trust your current setup though, it seems like it has
>     > gotten beyond useful debugging.
>     >
>     > On Sat, Apr 26, 2008 at 1:50 AM, stlin <[EMAIL PROTECTED]> wrote:
>     >
>     >     Hi Michael,
>     >
>     >     Thanks for your reply.
>     >
>     >     I tried "/etc/init.d/mysqld status" and the result showed that
>     >     mysql was running. Even after I execute "/etc/init.d/mysqld
>     >     restart", I still get "DBD::mysql::st execute failed: MySQL
>     >     server has gone away at /opt/oscar/lib/OSCAR/oda.pm line
>     >     802." when I click on the "Setup Networking" button.
>     >
>     >     I then tried to start over (/opt/oscar/scripts/start_over),
>     >     rebooted the server, and launched the OSCAR wizard. But this
>     >     time the client installation got stuck after receiving
>     >     boel_binaries.tar.gz. The terminal output from the client is at
>     >     http://pastebin.ca/998123 and the OSCAR log file is posted at
>     >     http://pastebin.ca/998121.
>     >
>     >     Any hint will be highly appreciated. Thank you.
>     >
>     >     Sincerely,
>     >
>     >     Shiang-Tai
>     >
>     >     Michael Edwards wrote:
>     >>     It looks like your mysql daemon on the head node died or
>     >>     was not started.
>     >>
>     >>     Try doing "/etc/init.d/mysqld restart" and try reimaging
>     >>     the nodes.
>     >>
>     >>     On Fri, Apr 25, 2008 at 11:09 AM, Shiang-Tai Lin
>     >>     <[EMAIL PROTECTED]> wrote:
>     >>
>     >>         Hi Michael,
>     >>
>     >>         Thanks for the instruction.
>     >>
>     >>         The oscar log is posted on http://pastebin.ca/997165
>     >>
>     >>         The complete client terminal message is posted on
>     >>         http://pastebin.ca/997170
>     >>
>     >>         Thanks in advance for any hints.
>     >>
>     >>         Shiang-Tai
>     >>
>     >>         Michael Edwards wrote:
>     >>         > Logs etc would be helpful.  Post them at pastebin.ca
>     >>         > and post a link here.
>     >>         >
>     >>         > On Fri, Apr 25, 2008 at 10:30 AM, Shiang-Tai Lin
>     >>         > <[EMAIL PROTECTED]> wrote:
>     >>         >
>     >>         >     Hi,
>     >>         >
>     >>         >     My attempt to set up a cluster using OSCAR 5.1b2 with
>     >>         >     Fedora Core 8 failed during the client installation
>     >>         >     (Step 6, Monitoring Cluster Deployment). The
>     >>         >     installation of the client image seems to be
>     >>         >     abnormally slow (data transfer speed was about
>     >>         >     50 Kb/s) and failed after about 150 minutes with nc
>     >>         >     and rsync errors. The last few lines from the client
>     >>         >     terminal output are
>     >>         >
>     >>         >     ************quote from the client terminal********************************
>     >>         >     Quietly installing image...
>     >>         >     /rsync -aHS --exclude=lost+found/ --exclude=/proc/* --numeric-ids 10.0.3.230::oscarimage/ /a/
>     >>         >     /nc: connect: No route to host
>     >>         >     -|/-|/-|/-nc: connect: No route to host
>     >>         >     |/-|/-|/-nc: connect: No route to host
>     >>         >     |/-|/-|/-|nc: connect: No route to host
>     >>         >     -|/-|/-|/-|/-|/-|/-|/-|/rsync error: timeout in data send/receive (code 30) at io.c(165) [sender=2.6.9]
>     >>         >     rsync: read error: Connection reset by peer (104)
>     >>         >     rsync error: error in rsync protocol data stream (code 12) at io.c(759) [receiver=3.0.0pre6]
>     >>         >     rsync: connection unexpectedly closed (776458 bytes received so far) a
>     >>         >     rsync error: error in rsync protocol data stream (code 12) at io.c(600) a
>     >>         >     Killing off running processes.
>     >>         >
>     >>         >     write_variables
>     >>         >
>     >>         >     **************************************************************
>     >>         >
>     >>         >     The hardware specs (for both the server and the client) are
>     >>         >     CPU: dual Intel Xeon E5345 (quad core)
>     >>         >     Motherboard: Tyan S2692 Tempest i5000XL #D1796-100
>     >>         >     Network Card: Intel(R) PRO/1000 Gigabit Server Adapter
>     >>         >     (Intel GbE from ESB2(w/ single port "Gilgal")-ASF2.0)
>     >>         >     Hard Drive: 250GB SATA2
>     >>         >
>     >>         >     I have upgraded the BIOS to the latest version and
>     >>         >     used "UYOK" in network setup. I have also stopped the
>     >>         >     firewall (service iptables stop).
>     >>         >     Any hints to solve the problem are greatly
>     >>         >     appreciated. Please let me know if it is necessary to
>     >>         >     post the complete OSCAR log file and the client
>     >>         >     terminal messages (they are over 50 Kb). Thanks a lot.
>     >>         >
>     >>         >     Sincerely,
>     >>         >     Shiang-Tai Lin
>     >>         >
>     >>         >
>     >>         >
>     >>         >
>     >>         >
>     >>        
>     -------------------------------------------------------------------------
>     >>         >     This SF.net email is sponsored by the 2008
>     JavaOne(SM)
>     >>         Conference
>     >>         >     Don't miss this year's exciting event. There's still
>     >>         time to save
>     >>         >     $100.
>     >>         >     Use priority code J8TL2D2.
>     >>         >
>     >>        
>     
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
>     >>         >     _______________________________________________
>     >>         >     Oscar-users mailing list
>     >>         >     Oscar-users@lists.sourceforge.net
>     <mailto:Oscar-users@lists.sourceforge.net>
>     >>         <mailto:Oscar-users@lists.sourceforge.net
>     <mailto:Oscar-users@lists.sourceforge.net>>
>     >>         >     <mailto:Oscar-users@lists.sourceforge.net
>     <mailto:Oscar-users@lists.sourceforge.net>
>     >>         <mailto:Oscar-users@lists.sourceforge.net
>     <mailto:Oscar-users@lists.sourceforge.net>>>
>     >>         >    
>     https://lists.sourceforge.net/lists/listinfo/oscar-users
>     >>         >
>     >>         >
>     >>         >
>     >>         >
>     >>
>     >>
>     >>
>     >>
>     >>
>     >
>     >
>     >
>     >
>     >
>     >
>
>
>
>
>   


