Re: SCSI init discussion/SAN problem (not interesting)

2006-11-28 Thread Evan Rempel

Bernd Eckenfels wrote:

In article <[EMAIL PROTECTED]> you wrote:


Was this post just not interesting enough, or is it the lack of access to
hardware to test this on that prevented it from being picked up by someone?



see google, for example: http://christophe.varoqui.free.fr/multipath.html



While that information is accurate, it is not new to me.

I must have been unclear in my description of how the scsi device registration
with the kernel causes multipath devices to function inefficiently.

When a device has multiple paths, the kernel will see multiple scsi devices,
even though there is only one physical device. For each of the scsi devices
that the kernel can see, the partition table (or some other IO that I am
unaware of) is read from the device, meaning IO is generated on ALL paths to
the device. This isn't a problem for some devices, but on others it can
initiate a failover process which can take many seconds, only to have the
process repeated as IO is generated on a third path to the device.

Is it unreasonable for the scsi initialization routines to be aware that some
kernel scsi devices are really the same physical device and register them with
the kernel WITHOUT generating any IO on the physical device?

Done this way, there would be a maximum of one failover per physical device
during the boot sequence. This one failover could be eliminated if the scsi
initialization code were aware of "active" paths and only generated IO on
active paths, rather than on the first path.
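
To illustrate the failover counts I am describing, here is a minimal
user-space sketch in Python. It is not kernel code. It assumes a simplified
active/passive array in which any IO arriving on a controller that does not
currently own the device forces a failover to that controller; the function
name and the controller labels are made up for the example.

```python
def count_failovers(probe_order, owner):
    """Count failovers for one physical device.

    probe_order: controller id for each path, in the order IO is issued.
    owner: controller that owns the device before any IO is issued.
    Assumption: IO on a non-owning controller always triggers a failover.
    """
    failovers = 0
    for controller in probe_order:
        if controller != owner:
            owner = controller      # ownership moves to the probed controller
            failovers += 1
    return failovers


# Today: every path is probed, so ownership can bounce back and forth.
all_paths = count_failovers(["A", "B", "A", "B"], owner="A")

# Proposed: IO only on the first path, which may or may not be active.
first_path_passive = count_failovers(["B"], owner="A")

# Proposed, with active-path awareness: IO only on an active path.
active_path_only = count_failovers(["A"], owner="A")
```

Under these assumptions the four-path device fails over three times when
every path is probed, at most once when only the first path is probed, and
not at all when only an active path is probed.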

All of this is before device mapper or multipath get their hands on the scsi
devices. It is completely within the scope of the scsi initialization code in
the kernel.

Is this clearer? If not, could someone ask for clarification of the fuzzy
parts?

Evan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: SCSI init discussion/SAN problem (not interesting)

2006-11-27 Thread Evan Rempel


Was this post just not interesting enough, or is it the lack of access to
hardware to test this on that prevented it from being picked up by someone?

If it is lack of access to hardware, I am sure my organization can provide
a solution.

Evan.



SCSI init discussion/SAN problem

2006-11-17 Thread Evan Rempel


I have a problem with the order in which the SCSI subsystem attaches disk
devices. It shows up in a multipath environment.


My understanding is that during the finishing phase of the SCSI
subsystem the partition table is read from the drive and the bare drive
and each partition are registered with the kernel. Please correct me if
I am wrong, because I am not a kernel developer even at the tinkering level.


The problem shows up in a multipath environment where the same physical
device has its partition table read and then registered with the kernel
*for each path on which it is available*. I understand the requirement for
the second (and possibly more) devices registered with the kernel, and I
want this behaviour to continue (how else would multipath work?).
The problem is that reading the partition table on each of the paths
causes I/O to be generated to the physical disk on each of the paths.
For some disk controllers (any with active/passive controllers) this
will initiate a failover event from the active to the passive controller.
This failover can take a few seconds, but multipathing may result in
hundreds of such paths and failover events, which makes the boot time very
long. I have a machine that takes close to 1 hr to boot due to this behavior.


What I would like to have considered is the ability to get the serial 
number/WWName of the device prior to reading the partition table. If the 
serial number/WWName has already been registered under a different SCSI 
ID, then just use the partition table that was used to load the first 
instance. This will result in I/O only on the first path to each disk.
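
As a rough illustration of the idea, here is a small user-space sketch in
Python. It is not kernel code, and every name in it (`ScsiPath`, `Registry`,
`read_partition_table`) is hypothetical. It also assumes the serial
number/WWName can be obtained without triggering a failover, which I have not
verified for every controller.

```python
from dataclasses import dataclass, field


@dataclass
class ScsiPath:
    name: str   # hypothetical device name for this path, e.g. "sda"
    wwid: str   # serial number / WWName reported on this path


@dataclass
class Registry:
    tables: dict = field(default_factory=dict)      # wwid -> partition table
    io_paths: list = field(default_factory=list)    # paths that saw real I/O
    registered: dict = field(default_factory=dict)  # path name -> partition table

    def read_partition_table(self, path):
        # Stand-in for the real disk read; records that I/O hit this path.
        self.io_paths.append(path.name)
        return [f"{path.wwid}-part1", f"{path.wwid}-part2"]

    def register(self, path):
        # Reuse the table already read for this physical device, if any.
        table = self.tables.get(path.wwid)
        if table is None:
            table = self.read_partition_table(path)
            self.tables[path.wwid] = table
        self.registered[path.name] = table


# Two physical disks, each visible on two paths.
paths = [ScsiPath("sda", "disk-1"), ScsiPath("sdb", "disk-2"),
         ScsiPath("sdc", "disk-1"), ScsiPath("sdd", "disk-2")]
registry = Registry()
for p in paths:
    registry.register(p)
```

In this sketch all four paths end up registered, but the partition table is
read only through `sda` and `sdb`, the first path seen for each disk.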


Another thing that might make things even better is to do something like
what the mp_prio utils of multipathing do: determine which paths are
active paths, and only read the partition table from the active paths.
This may require a two-pass device registration mechanism, because it may
be possible that none of the paths are active paths, meaning that the
device did not get registered by the end of the device list. We would
have to go back to the beginning of the list and, for any device that was
not yet registered with the kernel, read the serial number/WWName and
partition table, register it with the kernel and then determine if any of
the other paths are for the same device to load them into the kernel.
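
The two-pass mechanism could look roughly like the following Python sketch.
Again this is user-space illustration only, not kernel code; the function
name, the tuple layout and the `is_active` flag are assumptions made for the
example, and how the kernel would learn which path is active is left open.

```python
def two_pass_register(paths):
    """paths: list of (name, wwid, is_active) tuples in discovery order.

    Returns (io_paths, registered): the paths on which the partition table
    was actually read, and the order in which paths were registered.
    """
    tables = {}        # wwid -> partition table already read
    io_paths = []      # paths that saw real I/O
    registered = []    # registration order
    deferred = []      # non-active paths whose device has no table yet

    def read_table(name, wwid):
        # Stand-in for the real partition table read.
        io_paths.append(name)
        tables[wwid] = f"table-of-{wwid}"

    # Pass 1: read only through active paths, reuse known tables.
    for name, wwid, is_active in paths:
        if wwid in tables:
            registered.append(name)
        elif is_active:
            read_table(name, wwid)
            registered.append(name)
        else:
            deferred.append((name, wwid))

    # Pass 2: devices with no active path fall back to a deferred path.
    for name, wwid in deferred:
        if wwid not in tables:
            read_table(name, wwid)
        registered.append(name)

    return io_paths, registered


# disk-1 has an active path (sdb); disk-2 has no active path at all.
io_paths, registered = two_pass_register([
    ("sda", "disk-1", False),
    ("sdb", "disk-1", True),
    ("sdc", "disk-2", False),
    ("sdd", "disk-2", False),
])
```

Here disk-1 is read only through its active path `sdb`, while disk-2, which
has no active path, is picked up in the second pass through `sdc`.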


I hope this is clear enough to start a dialog on how to make the scsi
initialization faster for large systems on multipath hardware.


Evan Rempel
Senior Programmer Analyst
University of Victoria

