Re: Synchronous option for chccwdev -- was there a resolution?

Florian Bilek Fri, 27 Jul 2012 14:17:27 -0700

Dear all,

I can confirm that udev settle always returns zero. I have set the timeout
to as much as 60 seconds, with an exit as soon as the device node is
available, and that is still not enough because udev exits anyway. Without
the exit condition I had set the timeout to 30 seconds. Only when the loop
runs twice does the chance of success become high.
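
The loop described above could look roughly like the following sketch. It
assumes the --timeout and --exit-if-exists options of 'udevadm settle'; the
function name settle_for_node, the UDEVADM override and the default of two
attempts are illustrative names and values, not the actual exec. Because
settle returns zero either way, the sketch trusts only the presence of the
node.

```shell
#!/bin/sh
# Sketch: call 'udevadm settle' up to N times and trust only the
# presence of the device node, not the exit status of settle.
# UDEVADM is a hypothetical override so the function can be tried
# without a real udev.
UDEVADM=${UDEVADM:-udevadm}

settle_for_node() {
    node=$1               # device node to wait for
    attempts=${2:-2}      # how often to run settle
    timeout=${3:-30}      # seconds per settle call
    i=1
    while [ "$i" -le "$attempts" ]; do
        # settle may return 0 although the node is still missing
        "$UDEVADM" settle --timeout="$timeout" --exit-if-exists="$node"
        if [ -e "$node" ]; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

Possible use after 'chccwdev -e 0.0.0302':

    settle_for_node /dev/disk/by-path/ccw-0.0.0302-part1 2 30 ||
        echo "device node still missing" >&2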

As I wrote in my first mail, I ran into this problem a long time ago. I
have always found workarounds, but with every kernel update there is a
chance that the race condition comes back.

In any case, there seems to be no reliable check that the device is
really usable. It is always a gamble whether your procedure succeeds or not.
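
One approximation, which narrows the window but does not close it, is to
poll instead of sleeping for a fixed time: wait until the 'online' attribute
of the CCW device in sysfs reads 1 and the device node exists. The sketch
below assumes the usual /sys/bus/ccw/devices/<busid>/online attribute; the
function name wait_usable and the SYSFS_CCW override are hypothetical. Even
when both conditions hold, a following dasdfmt or fdasd can still fail.

```shell
#!/bin/sh
# Sketch: poll until the CCW device reports online in sysfs AND its
# device node exists. This is a heuristic, not a proof of usability.
# SYSFS_CCW is a hypothetical override for trying this without sysfs.
SYSFS_CCW=${SYSFS_CCW:-/sys/bus/ccw/devices}

wait_usable() {
    busid=$1          # e.g. 0.0.0302
    node=$2           # e.g. /dev/disk/by-path/ccw-0.0.0302
    tries=${3:-30}    # one try per second
    while [ "$tries" -gt 0 ]; do
        # an unreadable attribute counts as "not online"
        state=$(cat "$SYSFS_CCW/$busid/online" 2>/dev/null) || state=
        if [ "$state" = "1" ] && [ -e "$node" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Possible use:

    wait_usable 0.0.0302 /dev/disk/by-path/ccw-0.0.0302 30 ||
        echo "0.0.0302 did not come online" >&2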

Since working on the clone procedure I have seen that the critical path is
the number of steps necessary to make one device usable:

1. attach it (vmcp)
2. vary it online (chccwdev -e)
3. format it (dasdfmt)
4. partition it (fdasd)
5. make the file system
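
The five steps above, each wrapped so that a failing step is rerun, might be
sketched as follows. Everything concrete here is an assumption for
illustration: the device number 0.0.0302, the by-path node names, the 4096
byte block size, ext3 as the file system, three attempts with a 5 second
pause, and the function names retry and prepare_dasd. The CP attach command
depends on the installation, so it is passed in as arguments. dasdfmt is
called with -f for the node, as in the s390-tools of that time.

```shell
#!/bin/sh
# Sketch: rerun a failing step a few times before giving up.
# usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max=$1; delay=$2; shift 2
    n=1
    until "$@"; do
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# The five steps for one DASD; the return code names the failing step.
# Arguments are handed to vmcp unchanged (the site-specific CP command).
prepare_dasd() {
    busid=0.0.0302                         # assumed device number
    dev=/dev/disk/by-path/ccw-$busid       # assumed node name
    retry 3 5 vmcp "$@"                    || return 1  # 1. attach
    retry 3 5 chccwdev -e "$busid"         || return 2  # 2. vary online
    retry 3 5 dasdfmt -b 4096 -y -f "$dev" || return 3  # 3. format
    retry 3 5 fdasd -a "$dev"              || return 4  # 4. partition
    retry 3 5 mkfs.ext3 "$dev-part1"       || return 5  # 5. file system
}
```

Blind retries hide the race rather than remove it, but they keep the delay
short in the common case where the first attempt succeeds.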

The most critical steps are 2 and 3. In my exec I see that most of the time
both steps fail and need to be rerun. This happens with the current kernel
version on SLES 11 SP2. With SP1 usually only one of the two steps
failed.

On SLES 10 the problem was that the partition node did not show up, or that
after a certain number of (successful) chccwdevs the kernel could not bring
the device online any more and a reboot of the guest was required.

sync and the other tricks I know do not really solve the problem, and udev
is disappointing too, since it reports that everything is safe, which is not
the case. I used dasd_config from SLES 11 but it did not solve the problem
either.

The chances are high that the situation arises between any two of these
steps. A fixed 30-second delay between all the steps makes the process,
including formatting and creation of the file system, quite long. Personally
I would say unacceptably long.

We use an IBM z/10 with DS 8700 for the disks here. z/VM is 6.2 at the
latest RSU. So it is original and fast equipment, and the situation still
appears there.

Kind regards,
Florian

On Fri, Jul 27, 2012 at 8:23 PM, David Boyes <[email protected]> wrote:

> > I believe that's the piece that's missing (for most people).  I can easily
> > reproduce the problem on my SLES11 SP2 system with this script:
> > vmcp define vfb-512 302 2000
> > date +%H:%M:%S.%N
> > chccwdev -e 0.0.0302
> > mkswap /dev/disk/by-path/ccw-0.0.0302-part1
>
> Yeah, that's pretty much guaranteed to fail. If you insert a 'udevadm
> settle' after the 'chccwdev -e', I still get a failure about 3 times out of
>  100 attempts, though.
>
> Alan may be on to something with the timeout value for udev for that type
> of device.
>
> ----------------------------------------------------------------------
> For LINUX-390 subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO LINUX-390 or
> visit
> http://www.marist.edu/htbin/wlvindex?LINUX-390
> ----------------------------------------------------------------------
> For more information on Linux on System z, visit
> http://wiki.linuxvm.org/
>

