I'm new to this field, but because I have to face this problem in the real
world, I want to contribute my experience and some questions to this
discussion.

Fact #1:
Reset wars do happen. Booting a Linux system in a multi-initiator
environment often causes an infinite bus-reset loop, even when there is only
one Linux system and the other hosts are NT and Sun.
Sometimes it ends with a crash of all hosts (including NT and Sun).
Almost every time, some period of time after the Linux boot, all hosts lose
their view of the network.

Fact #2:
With only 2 Linux systems on the same FC network, an infinite bus-reset loop
usually does not start.

Fact #3:
This behavior doesn't seem to be related to the low-level driver:
we have tested 2 QLogic HBA drivers and one Emulex HBA driver.

Now some questions:
Which level is responsible for sending the bus reset? Is it the middle level
or the low level?
If it is the middle level, can the low-level driver filter it out?
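
To make the "filter it out" question concrete, here is a rough sketch of the
kind of filter I have in mind. It is only an illustration, not code from any
real driver: the structure, the function names and the quiet-period idea are
all my own assumptions. The low-level driver remembers when it last saw a
reset on the bus (its own or another initiator's) and refuses to send another
one until a quiet period has passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bus state kept by a low-level driver.
 * This is NOT a real kernel structure; all names are assumptions. */
struct reset_filter {
    long last_reset;    /* time of the last reset seen on the bus, in seconds */
    long min_interval;  /* quiet period required before issuing another reset */
    bool seen_reset;    /* false until the first reset has been recorded */
};

/* Record a reset observed on the bus, whether we issued it
 * or another initiator did. */
void reset_filter_note(struct reset_filter *f, long now)
{
    assert(f != NULL);
    f->last_reset = now;
    f->seen_reset = true;
}

/* Decide whether a reset requested by the upper layer may go out.
 * Returns false when the bus was reset less than min_interval ago. */
bool reset_filter_allow(struct reset_filter *f, long now)
{
    assert(f != NULL);
    if (f->seen_reset && now - f->last_reset < f->min_interval)
        return false;           /* bus was reset recently: suppress */
    reset_filter_note(f, now);  /* our own reset also counts */
    return true;
}
```

With a 10 second quiet period, a reset at t=100 is allowed, one at t=105 is
suppressed, and if another host resets the bus at t=108 the next reset from
this host is held back until t=118. Whether the middle level would tolerate
having its reset request refused is exactly what I am asking.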

Which level should deal with hot swapping and adding new devices?
Until this discussion I was sure that this is the low-level driver's job.

If a bus reset does happen, it shouldn't affect I/O operations from the
high-level driver's point of view, because retries should handle it anyway,
right?
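
What I mean by "retries should handle it" is roughly the following. This is a
self-contained simulation, not the actual mid-layer code: the status codes,
the simulated device and the retry limit are all made up for illustration. A
command that comes back aborted by a bus reset is simply resubmitted, up to a
fixed number of times, so the caller only sees a failure if the resets keep
coming.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion codes; these are NOT the values
 * used by the Linux SCSI layer. */
enum cmd_status { CMD_OK, CMD_BUS_RESET, CMD_ERROR };

/* Simulated device: the first `resets_pending` submissions are
 * aborted by a bus reset, later ones succeed. */
struct sim_device {
    int resets_pending;
    int submissions;    /* total number of submissions seen */
};

enum cmd_status sim_submit(struct sim_device *dev)
{
    dev->submissions++;
    if (dev->resets_pending > 0) {
        dev->resets_pending--;
        return CMD_BUS_RESET;
    }
    return CMD_OK;
}

/* Submit once, then resubmit after each bus reset, allowing at most
 * max_retries resubmissions. Any other status is returned as is. */
enum cmd_status submit_with_retry(struct sim_device *dev, int max_retries)
{
    assert(dev != NULL && max_retries >= 0);

    enum cmd_status st = sim_submit(dev);
    int retries = 0;

    while (st == CMD_BUS_RESET && retries < max_retries) {
        retries++;
        st = sim_submit(dev);
    }
    return st;
}
```

With two resets in a row and a limit of three retries the command succeeds on
the third submission. With five resets in a row the retries run out and the
caller gets CMD_BUS_RESET back, which is the reset-war case from Fact #1.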


Thank you.
Sergey Vichik.
StoreAge




----- Original Message -----
From: David Teigland <[EMAIL PROTECTED]>
To: Mark Veteikis <[EMAIL PROTECTED]>; Kurt Garloff <[EMAIL PROTECTED]>;
    Chris Meadors <[EMAIL PROTECTED]>; Martin Peschke <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, August 11, 2000 6:16 PM
Subject: Re: shared SCSI buses


> On Fri, Aug 11, 2000 at 09:35:19AM -0500, Mark Veteikis wrote:
> > >
> > >
> > > I'm interested in combining two or more active hosts with multiple devices
> > > on a single parallel SCSI bus.  I've successfully done this, but don't know
> > > the extent of problems which could arise when hosts or disks are added or
> > > removed (crashed) on the in-use bus.
> > >
> > >  A) How likely is it that the scsi driver(s) will see errors when nodes and
> > >  drives come and go and are there specific cases which are bad?
> > >
> > >  B) What are the possibilities of a node surviving if it sees scsi errors?
> > >
> > >  C) How much work would it take to make all these odd cases reliable?
> > >
> > > I'm interested in the status on both 2.2 and 2.4.  Thanks.
> >
> > Have you looked at Fibre Channel? Linux has support. Or are your target
> > devices/HBAs locked into SCSI?
>
> Thanks to all for the input.  I should have provided some more background
> information.  I work on the GFS project and we primarily use Fibre Channel.  I
> know SCA parallel SCSI drives are the way to go, but it still sounds like a
> touchy issue.  I've seen my share of scsi mid-layer errors which lock up
> the machine, so I wanted to try and get a clearer picture of things.
>
> - Hot-swapping SCA disks on the bus should be relatively reliable if it's done
>   with care.  It sounds like if any transfer is happening during a swap you're
>   in serious danger of crashing everything.  The scsi drivers can be prompted to
>   add or remove devices.  I wonder if multiple hosts put a wrench in things
>   here.
>
> - The other important issue is hosts which crash at any time, including during
>   a transfer.  It sounds like the drivers on other machines will currently
>   start a reset-war, but the drivers could be improved to avoid this and
>   hopefully keep using the devices as they were.
>
> - A similar problem for devices which crash abruptly.
>
> - How about adding machines to the bus and then booting them up?
>
> By the look of things here, it is not reasonable to use GFS with multiple hosts
> on a shared SCSI bus if you're interested in HA.  If any machine or disk
> crashes, all your devices are probably in trouble.  Stopping all machines' I/O
> (and maybe unmounting everyone) to add or remove storage would also be
> prohibitive.
>
> Thanks.
>
> --
> Dave Teigland  <[EMAIL PROTECTED]>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]

