Hello,

I would hesitate to make any changes to Bacula, because it has been
functioning correctly now for every device I know of for 14 years.  This
is particularly true of "ignoring" an I/O error status. That
doesn't mean that we couldn't consider it ...

It would be much preferable to check the Linux tape specification,
which if I remember right is virtually identical to the Solaris tape
driver specification.  From memory (please check), when the tape
driver hits the end of the tape, it should return an EOF.  If you read a
second time, it should still return an EOF, which implies two consecutive
EOF marks, which is the standard way (historically) of reporting an
end-of-tape condition.  If a third read is done, then it is up to the tape
driver to decide what to return -- typically it is an I/O error, but
returning an EOF would also be valid.

I am not sure about what Zmanda writes, because one would have to read their
complete document in context, but I can tell you that I do not consider
them to be the tape experts. Until recently (and I am not sure they do it
right), they could not span tapes on writing -- probably because they did
not know how.  The standard is what is in the Linux man pages, which I
think is:

   man st

Here is what I find that describes the condition you are reporting (excerpt
from my man st output).

        When a filemark is encountered while reading, the following
        happens.  If there are data remaining in the buffer when the
        filemark is found, the buffered data is returned.  The next read
        returns zero bytes.  The following read returns data from the
        next file.  The end of recorded data is signaled by returning
        zero bytes for two consecutive read calls.  The third read
        returns an error.

My view, as written above, is that if the driver does not conform to
this, it is broken.
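
To make the man st wording concrete, here is a minimal sketch in C of how
a caller could classify successive read() results under those rules. It is
not Bacula code; the names tape_event, eod_tracker and classify_read are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical names; not taken from Bacula or from the st driver. */
enum tape_event {
    TAPE_DATA,      /* read returned a block of data */
    TAPE_FILEMARK,  /* first zero-byte read: end of the current file */
    TAPE_EOD,       /* second consecutive zero-byte read, or an error after it */
    TAPE_ERROR      /* error that was not preceded by two zero-byte reads */
};

struct eod_tracker {
    int zero_reads;  /* consecutive zero-byte reads seen so far */
};

/* Classify one read() return value according to the man st excerpt:
 * one zero-byte read marks a filemark, two in a row mark end of
 * recorded data, and an error after that is the expected third read. */
static enum tape_event classify_read(struct eod_tracker *t, ssize_t nread)
{
    assert(t != NULL);

    if (nread > 0) {
        t->zero_reads = 0;          /* data resets the filemark count */
        return TAPE_DATA;
    }
    if (nread == 0) {
        t->zero_reads++;
        return t->zero_reads >= 2 ? TAPE_EOD : TAPE_FILEMARK;
    }
    /* nread < 0: only expected once end of data was already signaled */
    return t->zero_reads >= 2 ? TAPE_EOD : TAPE_ERROR;
}
```

With this sketch, a driver that returns an error before two zero-byte
reads have been seen ends up in TAPE_ERROR, which is the nonconforming
case described above.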

So from what you write, I would say that lin_tape handles the condition
incorrectly and should be fixed, but to determine that would require
finding the correct man page and perhaps looking at how the other
kernel tape drivers work.

If there is an ioctl() that Bacula could call that is implemented on all
OSes, then I would consider anything specific that you could propose, but
Bacula doesn't do SCSI I/O and I hope we never have to.
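
As one example of the kind of call that might be proposed, Linux defines
the MTIOCGET ioctl and the GMT_EOD() status macro in <sys/mtio.h>. The
sketch below is only an illustration under stated assumptions: the helper
name tape_at_eod is hypothetical, the GMT_* macros are Linux-specific
rather than available on all OSes, and nothing in this thread establishes
whether lin_tape sets the end-of-data flag in its MTIOCGET status.

```c
#include <sys/ioctl.h>
#include <sys/mtio.h>

/* Hypothetical helper, not part of Bacula.
 * Returns 1 if the driver reports end of recorded data, 0 if it does
 * not, and -1 if the MTIOCGET ioctl itself fails (errno is left set). */
static int tape_at_eod(int fd)
{
    struct mtget status;

    if (ioctl(fd, MTIOCGET, &status) < 0) {
        return -1;
    }
    /* GMT_EOD() tests the end-of-data bit in the generic status word. */
    return GMT_EOD(status.mt_gstat) ? 1 : 0;
}
```

A caller could consult this after a failed read() to tell a genuine I/O
error from a read past end of data, but only on drivers confirmed to
report the flag.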

If you have some specific suggestions for btape, please let me know.
What you are asking is a bit vague for me, because I cannot reproduce
the error so I don't know what code needs fixing or where comments
should be put.

Please let me know.

Best regards,
Kern


On 07/03/2013 03:30 PM, Mariusz Mazur wrote:
> When doing a goto End Of Data operation using an fsf(1) loop, sd does a read()
> on each iteration to check whether it reached EOD yet. It expects two things:
> 1. read() to return 0 bytes, meaning EOF.
> 2. read to return error ENOSPC which is, according to the code comment, what
> IBM drivers tended to do.
>
> Problem is, according to this text
> http://wiki.zmanda.com/index.php/Tape_Driver_Semantics#Read
>
> "What happens when you try to read past EOM varies by kernel driver. Some will
> continue to return EOF. Others will return an error (typically EIO)."
>
> An EIO error would also be likely. And that's exactly what ibm's open source
> lin_tape driver (which I'm using for my ts2900 autoloader) is doing.
>
> It would be possible to patch up lin_tape to return something saner than EIO,
> however that's not that great a solution – it'd be better if bacula supported
> such tape drivers out of the box.
>
> What I did for my own purposes amounts to this:
>
> --- bacula-5.2.13.orig/src/stored/dev.c 2013-02-19 20:21:35.000000000 +0100
> +++ bacula-5.2.13/src/stored/dev.c  2013-07-02 17:19:27.775246812 +0200
> @@ -1198,6 +1198,8 @@
>                */
>               } else if (at_eof() && errno == ENOSPC) {
>                  stat = 0;
> +            } else if (errno == EIO) {   /* && has_cap(CAP_IOERRATEOM) */
> +               stat = 0;
>               } else {
>                  berrno be;
>                  set_eot();
>
> however that has the potential to hide actual io errors, so were it mergeable,
> CAP_IOERRATEOM would need to default to off.
>
> The preferred solution imho would be for sd to check scsi sense data on EIO,
> cause there's a flag there that would clearly indicate whether the problem is
> real or if it's just trying to read past EOD. However I don't know if linux
> has a generic sense data reading api, not to mention there existing a cross-
> platform one.
>
> Or am I missing something?
>
> Also, no matter what the preferred solution here would be, btape should, on
> detecting those EIO's (and it sees them when running the standard tape test
> checks), suggest possible solutions.
>
> --mmazur
>
>
>
>
> (Below are some keywords for people with similar issues to be able to find
> this email via google, please ignore.)
>
>
> ibm ts2900 ts3310
>
>
> lin_taped:
> tape_check_result SenseKey: 08 ASC: 00 ASCQ: 05.
> tape_check_result: Information field: 64512
> tape_check_result Encountered EOD.
>
>
> btape:
> btape: dev.c:716-0 Enter eod
> btape: dev.c:598-0 rewind res=0 fd=3 "tapeDrive" (/dev/IBMtape0n)
> btape: dev.c:809-0 eod: doing fsf 1
> btape: dev.c:1136-0 fsf
> btape: dev.c:1181-0 FSF has cap_fsf
> btape: dev.c:1191-0 Doing read before fsf
> btape: dev.c:1229-0 Doing MTFSF
> btape: dev.c:1262-0 Return 0 from FSF
> btape: dev.c:1264-0 ST_EOF set on exit FSF
> btape: dev.c:1269-0 Return from FSF file=1
> btape: dev.c:809-0 eod: doing fsf 1
> btape: dev.c:1133-0 ST_EOF set on entry to FSF
> btape: dev.c:1136-0 fsf
> btape: dev.c:1181-0 FSF has cap_fsf
> btape: dev.c:1191-0 Doing read before fsf
> btape: dev.c:1206-0 Set ST_EOT read errno=5. ERR=Input/output error
> btape: dev.c:1209-0 dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
>
>
> dir job email:
> JobId 411: 3304 Issuing autochanger "load slot 1, drive 0" command.
> JobId 411: 3305 Autochanger "load slot 1, drive 0", status is OK.
> JobId 411: Error: dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
> JobId 411: Volume "269BCML5" previously written, moving to end of data.
> JobId 411: Error: Unable to position to end of data on device "tapeDrive"
> (/dev/IBMtape0n): ERR=dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
> JobId 411: Marking Volume "269BCML5" in Error in Catalog.
> JobId 411: 3307 Issuing autochanger "unload slot 1, drive 0" command.
>
>
> dir job email with fixed block size (beats me why):
> bak-sd JobId 533: Error: Bacula cannot write on tape Volume "269BCML5"
> because: The number of files mismatch! Volume=5 Catalog=6
>
>
> Also google: [Bacula-devel] TS3310 - possible solution Mariusz Czulada 2007
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>

