I'm not sure how I got a copy of this. The headers look like it came from
linux-scsi, but they weren't in a To: or Cc: field. No matter, I didn't
mind getting it at all, as I'm doing some of the same work right now.
On Fri, 11 Aug 2000, Kurt Garloff wrote:
> On Thu, Aug 10, 2000 at 07:33:50PM -0500, David Teigland wrote:
> > A) How likely is it that the scsi driver(s) will see errors when nodes and
> > drives come and go and are there specific cases which are bad?
>
> SCSI bus is electrically NOT designed for hot-swapping. If your bus is idle
> at the time you connect or disconnect a device, nothing bad happens, but
> this can not be guaranteed in case where you have data being transfered over
> the bus.
> But there are designs (SCA connector), where some care for the electrical
> problems is taken.
If you are giving any thought at all to hot-swapping drives, go for SCA.
I cannot say enough wonderful things about it; actually, it's more that I
can't say enough horrible things about the swappable canister mounting
bracket setup. A good SCA backplane can cost almost as much as a hard
drive, but it is worth it.
> > B) What are the possibilities of a node surviving if it sees scsi errors?
>
> Error handling in Linux SCSI strongly depends on the situation and the
> adapter driver being used. Possible reaction range from retrying a command
> that failed or reporting EIO to the app to serious things such as bus
> resets, which can cause problems to your SCSI subsystem.
That is one place where I'm still not so happy: Linux's ability to deal
with SCSI errors. There was just a discussion on linux-kernel about
adding and removing devices on the SCSI chain while it is hot. It seems
a little work is needed there too.
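For the record, the way I've been poking devices in and out on 2.2/2.4 is
the /proc/scsi/scsi interface. Here's a small sketch; the host/bus/id/lun
numbers are just examples, not anything from a real setup:

```shell
# Build the line we feed to /proc/scsi/scsi.
# Args: action (add|remove), host, bus, target id, lun.
scsi_line() {
    echo "scsi $1-single-device $2 $3 $4 $5"
}

# As root on a live system you would do something like:
#   scsi_line add    0 0 2 0 > /proc/scsi/scsi
#   scsi_line remove 0 0 2 0 > /proc/scsi/scsi
# Here we just print the line that would be written:
scsi_line add 0 0 2 0
```

Of course this only tells the midlayer about the device; it does nothing
for the electrical side, which is where SCA comes in.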
> > C) How much work would it take to make all these odd cases reliable?
>
> General SCSI error handling improvements. Talk to Eric Youngdale about this
> and discuss it here, please.
I'd be interested in discussions on improving error handling. I have
enough experience with it. The first thing I saw when I attempted to
install Linux for the first time was my SCSI CD-ROM drive hanging the
whole machine because of a SCSI error. To my knowledge a workaround has
never been found. I still have the drive, but it isn't in use.
> > I'm interested in the status on both 2.2 and 2.4.
>
> Funnily, you did not mention the filesystem problems:
> If you mount any partition read-write with one machine, you are not supposed
> to mount it from any other machine. Not even ro, as the kernel does caching
> and will see an inconsistent FS, which can cause problems.
> Have a look at GlobalFilesystem, if you look for a solution for this problem.
I think this message was originally cross-posted to the GFS mailing list
(doesn't Sistina host that list?).
But yeah, if two machines are going to be looking at the filesystem and
even one of them is going to be changing it, you are going to need a
clustered filesystem.
> Sharing partitions ro works perfectly just out of the box.
Which is what I'm going to be doing for /usr (while /home will be GFS).
But the one problem I see is upgrading or installing new software on
/usr. The procedure I was thinking of was to remount /usr read-write on
one node, upgrade the software there, and remount it read-only. What I
have done in testing is to umount and then re-mount the partition on the
other node(s). But on a live machine the filesystem will surely be in
use, so I can't just unmount it. Is there an easy way to invalidate
caches from userspace? Sort of a reverse sync.
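To make that concrete, here's the procedure I'm imagining, written as a
dry run that only prints the commands (the device and mount point are
placeholders, and I'm only assuming that blockdev --flushbufs, i.e. the
BLKFLSBUF ioctl, invalidates enough of the cache; please correct me if
it doesn't):

```shell
DEV=/dev/sdb1   # placeholder: the shared device holding /usr
MNT=/usr        # placeholder: the shared mount point

# On the node doing the upgrade:
echo "mount -o remount,rw $MNT"
echo "# ...upgrade software on $MNT..."
echo "mount -o remount,ro $MNT"

# On every other node, instead of umount/mount, flush and
# invalidate the device's buffer cache without unmounting:
echo "blockdev --flushbufs $DEV"
```

If BLKFLSBUF isn't enough to throw away cached metadata on a mounted
filesystem, then I'm back to needing a real "reverse sync".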
Thinking I'm in for some fun,
Chris
--
Two penguins were walking on an iceberg. The first one said to the
second, "you look like you are wearing a tuxedo." The second one said,
"I might be..."
--David Lynch, Twin Peaks
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]