On Monday 29 June 2009 10:53:40 James Harper wrote:
> > > I'll investigate doing some deeper error code retrieval - maybe even try
> > > and get the sense data although I don't know if that's possible. There
> > > may be some condition we can test for to know that it's ok to retry vs
> > > knowing when it's a hard error.
> >
> > Yes, well, you should never retry if it is a hard error. The Bacula code in
> > block.c retries only if it gets an EBUSY status. In fact, Bacula should
> > never get an EBUSY status, but there are some stupid (Unix/Linux) OSes out
> > there, so we resolved the problems with the retry loop.
>
> How sure are you about that?
>
> do {
> ...
> stat = dev->write(block->buf, (size_t)wlen);
> } while (stat == -1 && (errno == EBUSY || errno == EIO) && retry++ < 3);
Yes, you are right. I don't remember why it is programmed that way. It is
not something I would do today. As I say, it is tricky.
Kern
>
> It definitely tests for EIO there. Curiously though, the '...' bit that
> I omitted for clarity only does a sleep and clrerror() on EBUSY, not on
> EIO.
>
> Without the test for EIO, my backups would never complete successfully
> under Windows. With 3 retries it is enough to get them working most of
> the time. But if there is Windows specific retry code required it should
> probably go in mtops.cpp and the EIO test there should be removed.
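
To make the EBUSY-versus-EIO question concrete, here is a minimal sketch of a retry loop in which the retryable errno values are an explicit policy, so a platform layer could opt in to retrying EIO while the generic code retries only EBUSY. This is not the Bacula code: write_with_retry, write_fn, backoff_fn, the retry_eio flag and the fake_dev test device are all hypothetical names made up for illustration, and the backoff hook merely stands in for the sleep and clrerror() step mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for dev->write(): returns bytes written, or -1
 * with errno set. */
typedef long (*write_fn)(void *ctx, const void *buf, size_t len);

/* Hypothetical hook run before each retry (e.g. sleep + clear error). */
typedef void (*backoff_fn)(void *ctx, int err);

enum { MAX_RETRIES = 3 };   /* same bound as the quoted loop */

/* EBUSY is always retryable; EIO only when the caller opts in. */
static int is_retryable(int err, int retry_eio)
{
    return err == EBUSY || (retry_eio && err == EIO);
}

/* Write once, then retry up to MAX_RETRIES times on retryable errors. */
long write_with_retry(write_fn do_write, backoff_fn backoff, void *ctx,
                      const void *buf, size_t len, int retry_eio)
{
    long stat;
    int retry = 0;
    int err;

    assert(do_write != NULL);
    for (;;) {
        errno = 0;
        stat = do_write(ctx, buf, len);
        err = errno;            /* save before the hook can clobber it */
        if (stat != -1 || !is_retryable(err, retry_eio) ||
            retry >= MAX_RETRIES)
            break;
        retry++;
        if (backoff != NULL)
            backoff(ctx, err);
    }
    if (stat == -1)
        errno = err;            /* report the error from the last write */
    return stat;
}

/* Simulated device: fails `failures` times with `err`, then succeeds. */
struct fake_dev { int failures; int err; int calls; };

static long fake_write(void *ctx, const void *buf, size_t len)
{
    struct fake_dev *d = ctx;
    (void)buf;
    d->calls++;
    if (d->failures > 0) {
        d->failures--;
        errno = d->err;
        return -1;
    }
    return (long)len;
}
```

With retry_eio set to 0, an EIO from the simulated device is treated as a hard error and returned after a single attempt; with it set to 1, the same device gets the same bounded retries as EBUSY.
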
>
> I added some code to mtops.cpp to get the scsi error sense data and it
> always comes up clean, but I'm not yet sure if that's because it was
> cleared by Windows, or if the error is not scsi related.
>
> I also tried adding a call to GetTapeStatus before every write to see if
> the device is ready for io, but it always says it is.
>
> I'm still wondering if some other part of Windows is poking at the drive
> every so often and interfering with us, even though we have the device
> opened exclusively. Normally I'd use procmon from sysinternals but it
> doesn't seem to like me under 64 bit - it never shows any tape drive
> access.
>
> > > And yet with all the WHQL testing that you have to do (and it's painful)
> > > it's _supposed_ to be the other way around :)
> >
> > Well, that is the inevitable result when the source is not available.
>
> Having done application programming and kernel programming on both
> Windows and Linux, they couldn't be more dissimilar. Linux is barely
> documented but as long as you can read the code you'll probably get
> where you are going. Also, there is no such thing as an undocumented call.
> Windows on the other hand is actually really well documented, provided
> the only things you want to do are things that the designers thought of.
> And you must never ever use an undocumented call or rely on undocumented
> behaviour, or bad things might happen. *sigh*.
>
> James