Hello,

I would hesitate to make any changes to Bacula, because it has been functioning correctly for 14 years with every device I know of. This is particularly true of "ignoring" an I/O error status. That doesn't mean that we couldn't consider it ...

What would be much preferable would be to check the Linux tape specification, which if I remember right is virtually identical to the Solaris tape driver specification. From memory (please check), when the tape driver hits the end of the tape, it should return an EOF. If you read a second time, it should still return an EOF, which implies two consecutive EOF marks, which is the standard way (historically) of reporting an End of tape condition. If a third read is done, then it is up to the tape driver to decide what to return: typically it is an I/O error, but returning an EOF would also be valid.

I am not sure about what Zmanda writes, because one would have to read their complete document in context, but I can tell you that I do not consider them to be the tape experts. Until recently, and I am not sure they do it right, they could not span tapes on writing, probably because they did not know how.

The standard is what is in the Linux man pages, which I think is:

   man st

Here is what I find that describes the condition you are reporting (excerpt from my man st output):

   When a filemark is encountered while reading, the following happens.
   If there are data remaining in the buffer when the filemark is found,
   the buffered data is returned. The next read returns zero bytes. The
   following read returns data from the next file. The end of recorded
   data is signaled by returning zero bytes for two consecutive read
   calls. The third read returns an error.

My view, as written above, is that if the driver does not conform to the above, it is broken. So from what you write, I would say that lin_tape handles the condition in an incorrect manner and should be fixed, but to determine that would require finding the correct man page and perhaps looking at how the other kernel tape drivers work.

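The read sequence in the man st excerpt can be sketched as a small state tracker. This is only an illustration of those semantics, not Bacula code: the names eod_track, struct eod_tracker and the TAPE_* values are made up for this example, and the decision to treat any error that arrives before two consecutive zero-byte reads as a real I/O error is an assumption drawn from the excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical names for illustration; none of these exist in Bacula. */
enum tape_event {
    TAPE_DATA,      /* read() returned data */
    TAPE_FILEMARK,  /* first zero-byte read: a filemark (EOF) */
    TAPE_EOD,       /* second consecutive zero-byte read: end of recorded data */
    TAPE_PAST_EOD,  /* error after EOD was already signaled (expected per man st) */
    TAPE_IO_ERROR   /* error at any other point (assumed to be a real I/O error) */
};

struct eod_tracker {
    int zero_reads; /* consecutive zero-byte reads seen so far, capped at 2 */
};

/* Classify the return value of one read() call on the tape device. */
enum tape_event eod_track(struct eod_tracker *t, ssize_t nread)
{
    assert(t != NULL);

    if (nread > 0) {
        t->zero_reads = 0;          /* data resets the filemark count */
        return TAPE_DATA;
    }
    if (nread == 0) {
        if (t->zero_reads < 2) {
            t->zero_reads++;
        }
        return (t->zero_reads >= 2) ? TAPE_EOD : TAPE_FILEMARK;
    }
    /* nread < 0: only an error following two zero-byte reads is "expected" */
    return (t->zero_reads >= 2) ? TAPE_PAST_EOD : TAPE_IO_ERROR;
}
```

A caller would start with a zeroed struct eod_tracker and pass in the result of each read() on the device. With a driver that returns an error before the second zero-byte read, the sketch reports TAPE_IO_ERROR rather than TAPE_EOD, which is the nonconforming case described above.
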
If there is an ioctl() that Bacula could call that is implemented on all OSes, then I would consider anything specific that you could propose, but Bacula doesn't do SCSI I/O and I hope we never have to.

If you have some specific suggestions for btape, please let me know. What you are asking is a bit vague for me: I cannot reproduce the error, so I don't know what code needs fixing or where comments should be put.

Best regards,
Kern

On 07/03/2013 03:30 PM, Mariusz Mazur wrote:
> When doing a goto End Of Data operation using an fsf(1) loop, sd does a read()
> on each iteration to check whether it reached EOD yet. It expects two things:
> 1. read() to return 0 bytes, meaning EOF.
> 2. read to return error ENOSPC which is, according to the code comment, what
> IBM drivers tended to do.
>
> Problem is, according to this text
> http://wiki.zmanda.com/index.php/Tape_Driver_Semantics#Read
>
> "What happens when you try to read past EOM varies by kernel driver. Some will
> continue to return EOF. Others will return an error (typically EIO)."
>
> An EIO error would also be likely. And that's exactly what ibm's open source
> lin_tape driver (which I'm using for my ts2900 autoloader) is doing.
>
> It would be possible to patch up lin_tape to return something saner than EIO,
> however that's not that great a solution – it'd be better if bacula supported
> such tape drivers out of the box.
>
> What I did for my own purposes amounts to this:
>
> --- bacula-5.2.13.orig/src/stored/dev.c 2013-02-19 20:21:35.000000000 +0100
> +++ bacula-5.2.13/src/stored/dev.c 2013-07-02 17:19:27.775246812 +0200
> @@ -1198,6 +1198,8 @@
>            */
>        } else if (at_eof() && errno == ENOSPC) {
>           stat = 0;
> +      } else if (errno == EIO) { /* && has_cap(CAP_IOERRATEOM) */
> +         stat = 0;
>        } else {
>           berrno be;
>           set_eot();
>
> however that has the potential to hide actual io errors, so were it mergeable,
> CAP_IOERRATEOM would need to default to off.
>
> The preferred solution imho would be for sd to check scsi sense data on EIO,
> cause there's a flag there that would clearly indicate whether the problem is
> real or if it's just trying to read past EOD. However I don't know if linux
> has a generic sense data reading api, not to mention there existing a
> cross-platform one.
>
> Or am I missing something?
>
> Also, no matter what the preferred solution here would be, btape should, on
> detecting those EIO's (and it sees them when running the standard tape test
> checks), suggest possible solutions.
>
> --mmazur
>
>
> (Below are some keywords for people with similar issues to be able to find
> this email via google, please ignore.)
>
> ibm ts2900 ts3310
>
> lin_taped:
> tape_check_result SenseKey: 08 ASC: 00 ASCQ: 05.
> tape_check_result: Information field: 64512
> tape_check_result Encountered EOD.
>
> btape:
> btape: dev.c:716-0 Enter eod
> btape: dev.c:598-0 rewind res=0 fd=3 "tapeDrive" (/dev/IBMtape0n)
> btape: dev.c:809-0 eod: doing fsf 1
> btape: dev.c:1136-0 fsf
> btape: dev.c:1181-0 FSF has cap_fsf
> btape: dev.c:1191-0 Doing read before fsf
> btape: dev.c:1229-0 Doing MTFSF
> btape: dev.c:1262-0 Return 0 from FSF
> btape: dev.c:1264-0 ST_EOF set on exit FSF
> btape: dev.c:1269-0 Return from FSF file=1
> btape: dev.c:809-0 eod: doing fsf 1
> btape: dev.c:1133-0 ST_EOF set on entry to FSF
> btape: dev.c:1136-0 fsf
> btape: dev.c:1181-0 FSF has cap_fsf
> btape: dev.c:1191-0 Doing read before fsf
> btape: dev.c:1206-0 Set ST_EOT read errno=5. ERR=Input/output error
> btape: dev.c:1209-0 dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
>
> dir job email:
> JobId 411: 3304 Issuing autochanger "load slot 1, drive 0" command.
> JobId 411: 3305 Autochanger "load slot 1, drive 0", status is OK.
> JobId 411: Error: dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
> JobId 411: Volume "269BCML5" previously written, moving to end of data.
> JobId 411: Error: Unable to position to end of data on device "tapeDrive"
> (/dev/IBMtape0n): ERR=dev.c:1208 read error on "tapeDrive" (/dev/IBMtape0n).
> ERR=Input/output error.
> JobId 411: Marking Volume "269BCML5" in Error in Catalog.
> JobId 411: 3307 Issuing autochanger "unload slot 1, drive 0" command.
>
> dir job email with fixed block size (beats me why):
> bak-sd JobId 533: Error: Bacula cannot write on tape Volume "269BCML5"
> because: The number of files mismatch! Volume=5 Catalog=6
>
> Also google: [Bacula-devel] TS3310 - possible solution Mariusz Czulada 2007
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel