-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Dan,
See below ...

On 20.05.2015 21:59, Dan Langille wrote:
>
>> On May 20, 2015, at 3:37 AM, Kern Sibbald <k...@sibbald.com> wrote:
>>
>> Hello Dan,
>>
>> We recently discussed a problem with logical end of tape markers on the
>> newest FreeBSD system. That reminded me of the previous time we had
>> problems, and it took us a very long time to find the problem 3-4 months
>> if I remember right.
>
> Thanks. I will try running that test ASAP, but I'm loaded up at present.

OK, I understand ...

>
>
> The current situation arises on these second hand tapes I have obtained.
> The drive writes 200 or 300 MB to the tape and then the errors arise:

Hmm. I would be suspicious of second hand tapes.

>
>
> Going back to a job which exhibits the problem:
>
>>> ###
>>> 01-May 09:39 crey-sd JobId 205441: End of Volume "FAI022" at 11:11326 on device "DTL03" (/dev/nsa0). Write of 64512 bytes got 49152.
>>> 01-May 09:39 crey-sd JobId 205441: Error: Error writing final EOF to tape. This Volume may not be readable.
>>> tape_dev.c:941 ioctl MTWEOF error on "DTL03" (/dev/nsa0). ERR=Input/output error.
>>> 01-May 09:39 crey-sd JobId 205441: End of medium on Volume "FAI022" Bytes=10,784,406,528 Blocks=167,168 at 01-May-2015 09:39.
>>> 01-May 09:39 crey-sd JobId 205441: 3307 Issuing autochanger "unload slot 2, drive 0" command.
>>> ###
>
> I don't know enough about tape to comment, so I'm speculating. Could a previous use of this tape put an end of medium marker on the tape?

I don't think that is possible. The logical end of tape is a physical
marker on the tape (a little strip of reflecting metallic film, if I am
not mistaken).

>
>
> FreeBSD sees this, reports it back to Bacula. Bacula says: No, thanks, I know what I'm doing, and continues to write data.

Once Bacula gets back a status of -1 it will attempt to write an EOF or
perhaps 2 EOFs depending on your configuration. There should be many
megabytes of space left on the tape before the end of tape, so the write
should succeed. It may not succeed if the driver writer did not
understand logical end of tape markers -- that is, he may attempt to
force no more writing. That is the kind of subtle implementation mistake
that causes Bacula to object as it is doing.

>
>
> FreeBSD says: no, no, you can't do this, I told you so... and everything stops.
>
> Is that probable? Likely?

That is possible. If this problem started occurring after an upgrade of
your OS, then it has a much higher probability and someone should look at
the st tape driver code to see what changed.

If this problem started after getting second hand tapes, there is a high
probability that the tapes are just worn or bad, or it is even possible
that some of them have a manufacturing defect or an age defect that
results in no logical end of tape marker. If there is no end of tape
marker, then Bacula will write to the end of the physical tape (not good)
and the tape drive will correctly refuse to let Bacula write even an EOF
mark.

Knowing which of these possibilities (or others) is correct is
non-trivial.

Best regards,
Kern

>
>
> This is all moot until I run the test program.
>
>>>
>>> See also, this from /var/log/messages:
>>>
>>> ###
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 01 00
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x28b4b
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 02 00
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x28b4b
>>> May 1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 01 00
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x11fb
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 02 00
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x11fb
>>> May 1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> ###

Hmm. Clearly something is wrong. If you don't succeed in getting it to
work, let me know and I will try to build it on my Ubuntu system and run
it against my tape drive.

>>>
>>
>> Anyway, here is your email on the subject and the source of the file
>> that we used to diagnose the problem. I would be a bit surprised if the
>> current problem is the same thing, but it is probably worth trying. I
>> haven't tried compiling the program recently so let me know if you have
>> problems.
>>
>> Best regards,
>> Kern
>>
>> PS: It took me a long time to find these attachments because I thought
>> they were in the test environment, but it looks like I committed them to
>> the bacula/platforms/freebsd directory.
>>
>> <pthreads-fix.txt><tapetest.c>
>> ------------------------------------------------------------------------------
>> One dashboard for servers and applications across Physical-Virtual-Cloud
>> Widest out-of-the-box monitoring support with 50+ applications
>> Performance metrics, stats and reports that give you Actionable Insights
>> Deep dive visibility with transaction tracing using APM Insight.
>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>> _______________________________________________
>> Bacula-devel mailing list
>> Bacula-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>
> --
> Dan Langille
> http://langille.org/
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlVdesIACgkQNgfoSvWqwEjRzACgrgM49Nsz3hzKahNlq8Qm+lfy
dnYAn3is17IvOZAf1ZuTF8TMQm5jzoNa
=+frE
-----END PGP SIGNATURE-----
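
A note on reading the kernel log quoted above: the failing commands appear with two different CDBs, one ending "01 00" and one ending "02 00". The sketch below decodes such a WRITE FILEMARKS(6) command block, assuming the usual SCSI stream-command layout (opcode 0x10 in byte 0, the IMMED flag in bit 0 of byte 1, a 24-bit big-endian filemark count in bytes 2-4, and a control byte last). The function name decode_write_filemarks6 is made up for illustration; it is not part of Bacula, tapetest.c, or the FreeBSD driver.

```python
def decode_write_filemarks6(cdb_hex: str) -> dict:
    """Decode a 6-byte WRITE FILEMARKS CDB given as hex text, e.g. '10 00 00 00 01 00'.

    Assumed layout (illustrative, see the note above):
      byte 0    opcode, 0x10 for WRITE FILEMARKS(6)
      byte 1    bit 0 is the IMMED flag
      bytes 2-4 filemark count, big-endian
      byte 5    control byte
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is accepted
    if len(cdb) != 6:
        raise ValueError(f"expected 6 bytes, got {len(cdb)}")
    if cdb[0] != 0x10:
        raise ValueError(f"not a WRITE FILEMARKS(6) opcode: {cdb[0]:#04x}")
    return {
        "immed": bool(cdb[1] & 0x01),
        "count": int.from_bytes(cdb[2:5], "big"),
        "control": cdb[5],
    }


if __name__ == "__main__":
    # The two CDB variants that appear in the /var/log/messages excerpt.
    for text in ("10 00 00 00 01 00", "10 00 00 00 02 00"):
        print(text, "->", decode_write_filemarks6(text))
```

Read this way, the first CDB in each pair requests one filemark and the second requests two, which would be consistent with an attempt to write one EOF followed by an attempt to write two. Both are rejected with the same MEDIUM ERROR, so the decode by itself does not distinguish a driver problem from a bad tape.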