Hello again,

It seems like many people on this list are having network issues. I thought that I would do my part to add to the confusion.

We are running 20 Red Hat 7.2 systems with a patched vanilla kernel on a single HiperSockets guest LAN. We recently migrated all of our Linux images from z/VM 4.2 to 4.4. The transition went smoothly, but after a few days we noticed that images began to experience slight connection problems, as reported by our Big Brother monitor. The notifications that BB sent indicated that the machine would reply on the second or third attempt, so we suspected it could be a product of our images finally dropping from Q3 into dormancy and subsequently taking longer to reply.

Unfortunately, we soon noticed that these connection errors were becoming more pronounced on random images. Keypresses during interactive ssh sessions would take 2 to 8 seconds to be echoed back to the screen. We were able to work around this by stopping the network, unloading the qeth and qdio modules with rmmod, then bringing the network back up. In the last few days, all of our images have begun to exhibit this behavior, and bouncing the network on an image no longer makes much of a difference. We are seeing pings within the guest LAN where 30-40% of the traffic is being dropped.

With the idea that this could be an artifact of older OCO modules, I moved a subset of our images to the 2.4.21 kernel with the GPL'd network drivers, but that had no discernible effect. My colleague has been searching for relevant APARs but has not found any so far. Our guest LAN is set to infinite connections, "q osa" and "q nic" return normal results, and I haven't found any descriptive errors on the Linux side yet.

I have seen the following error in a few images, but not across the board:

qdio : sense data available on qdio channel.
qdio : irb: 01 c2 60 17 00 fc 21 38 0e 00 00 00 00 80 00 00
qdio : irb: 01 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00
qdio : sense data: 80 00 1f 07 00 00 00 00 00 00 00 00 00 00 00 00
qdio : sense data: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
qdio : received check condition on activate queues on irq 0x3 (cs=x0, ds=xe).
qeth: activate queues on irq 0x3: dstat=0xe, cstat=0x0
qeth: recovery was scheduled on irq 0x1 (hsi0) with problem 0x3
qdio : Did not get interrupt on halt_IO, irq=0x3.

I think that I've seen this error previously, when everything was working fine, so I don't consider it significant. Hopefully, someone can confirm or deny that.

Any ideas?

--
Michael Lambert <[EMAIL PROTECTED]>
