-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Dan,

See below ...

On 20.05.2015 21:59, Dan Langille wrote:
>
>> On May 20, 2015, at 3:37 AM, Kern Sibbald <k...@sibbald.com> wrote:
>>
>> Hello Dan,
>>
>> We recently discussed a problem with logical end of tape markers on the
>> newest FreeBSD system.  That reminded me of the previous time we had
>> problems, and it took us a very long time to find the problem, 3-4 months
>> if I remember right.
>
> Thanks.  I will try running that test ASAP, but I'm loaded up at present.

OK, I understand ...
>
>
> The current situation arises on these second hand tapes I have obtained.
> The drive writes 200 or 300 MB to the tape and then the errors arise:

Hmm.  I would be suspicious of second hand tapes.

>
>
> Going back to a job which exhibits the problem:
>
>>> ###
>>> 01-May 09:39 crey-sd JobId 205441: End of Volume "FAI022" at 11:11326 on device "DTL03" (/dev/nsa0). Write of 64512 bytes got 49152.
>>> 01-May 09:39 crey-sd JobId 205441: Error: Error writing final EOF to tape. This Volume may not be readable.
>>> tape_dev.c:941 ioctl MTWEOF error on "DTL03" (/dev/nsa0). ERR=Input/output error.
>>> 01-May 09:39 crey-sd JobId 205441: End of medium on Volume "FAI022" Bytes=10,784,406,528 Blocks=167,168 at 01-May-2015 09:39.
>>> 01-May 09:39 crey-sd JobId 205441: 3307 Issuing autochanger "unload slot 2, drive 0" command.
>>> ###
>
> I don't know enough about tape to comment, so I'm speculating. Could a
> previous use of this tape put an end of medium marker on the tape?

I don't think that is possible.  The logical end of tape is a physical
marker on the tape (a little strip of reflecting metallic film, if I am
not mistaken).

>
>
> FreeBSD sees this, reports it back to Bacula.  Bacula says: No,
> thanks, I know what I'm doing, and continues to write data.

Once Bacula gets back a status of -1 it will attempt to write an EOF or
perhaps 2 EOFs depending on your configuration.  There should be many
megabytes of space left between the logical end of tape marker and the
physical end of tape, so the write should succeed.  It may not succeed
if the driver writer did not understand logical end of tape markers --
that is, the driver may refuse all further writes.  That is the kind of
subtle implementation mistake that causes Bacula to object as it is doing.
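
To make the sequence above concrete, here is a minimal sketch in C.  It
is not Bacula's code; the helper names write_filemarks() and
write_block() are made up for illustration, and it assumes the standard
mtio interface (MTIOCTOP with MTWEOF), which is what the "ioctl MTWEOF
error" message in the job log refers to.  A failed or short write is
treated as end of volume, and the code then tries to close the volume
with one or two filemarks.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <unistd.h>

/* Hypothetical helper: ask the tape driver to write `count` filemarks.
 * Returns 0 on success, -1 on failure with errno set by ioctl(). */
static int write_filemarks(int fd, int count)
{
    struct mtop op;

    assert(count > 0);
    memset(&op, 0, sizeof(op));
    op.mt_op = MTWEOF;      /* "write end-of-file mark(s)" */
    op.mt_count = count;    /* 1 or 2, depending on configuration */
    return ioctl(fd, MTIOCTOP, &op);
}

/* Hypothetical helper: write one block.
 * Returns  1 if the whole block was written,
 *          0 if the write failed or was short and the volume was then
 *            closed successfully with `eof_marks` filemarks,
 *         -1 if even the filemarks could not be written. */
static int write_block(int fd, const void *buf, size_t len, int eof_marks)
{
    ssize_t n = write(fd, buf, len);

    if (n == (ssize_t)len)
        return 1;

    /* -1 or a short write (e.g. "Write of 64512 bytes got 49152"):
     * assume logical end of tape and try to terminate the volume. */
    if (write_filemarks(fd, eof_marks) == 0)
        return 0;
    return -1;
}
```

In the job log quoted earlier, the case that shows up is the last one:
the short write is followed by a failed MTWEOF, so the sketch would
return -1.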
>
>
> FreeBSD says: no, no, you can't do this, I told you so... and
> everything stops.
>
> Is that probable? Likely?

That is possible.  If this problem started occurring after an upgrade of
your OS, then it has a much higher probability and someone should look at
the sa tape driver code to see what changed.  If this problem started
after getting second hand tapes, there is a high probability that the
tapes are just worn or bad, or it is even possible that some of them
have a manufacturing defect or an age defect that results in no logical
end of tape marker.  If there is no end of tape marker, then Bacula will
write to the end of the physical tape (not good) and the tape drive will
correctly refuse to let Bacula write even an EOF mark.

Knowing which of these possibilities (or others) is correct is non-trivial.

Best regards,
Kern

>
>
> This is all moot until I run the test program.
>
>>>
>>> See also, this from /var/log/messages:
>>>
>>> ###
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 01 00
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x28b4b
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 02 00
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x28b4b
>>> May  1 09:39:01 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 01 00
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x11fb
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): WRITE FILEMARKS(6). CDB: 10 00 00 00 02 00
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): CAM status: SCSI Status Error
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI status: Check Condition
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): SCSI sense: MEDIUM ERROR asc:c,0 (Write error)
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Command Specific Info: 0x11fb
>>> May  1 09:42:41 knew kernel: (sa0:sym0:0:1:0): Error 5, Retries exhausted
>>> ###

Hmm. Clearly something is wrong.  If you don't succeed in getting the
test program to work, let me know and I will try to build it on my
Ubuntu system and run it against my tape drive.
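
As an aid to reading the kernel messages quoted above, here is a small
C sketch that pulls the filemark count out of a WRITE FILEMARKS(6) CDB.
The function name filemark_count() is made up for illustration.  It
relies on the SCSI layout of this command as I understand it: opcode
0x10 in byte 0 and the number of filemarks as a big-endian value in
bytes 2-4.  Applied to the two CDBs in the log it gives 1 and 2, which
would be consistent with an attempt to write one EOF followed by an
attempt to write two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WRITE_FILEMARKS_6 0x10  /* SCSI opcode for WRITE FILEMARKS(6) */

/* Hypothetical helper: return the number of filemarks requested by a
 * 6-byte WRITE FILEMARKS CDB, or -1 if the bytes are not such a CDB. */
static long filemark_count(const uint8_t *cdb, size_t len)
{
    assert(cdb != NULL);
    if (len != 6 || cdb[0] != WRITE_FILEMARKS_6)
        return -1;

    /* Bytes 2..4 hold the count, most significant byte first. */
    return ((long)cdb[2] << 16) | ((long)cdb[3] << 8) | (long)cdb[4];
}
```

For example, passing the bytes 10 00 00 00 01 00 from the first log
entry yields a count of 1.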

>>>
>>
>> Anyway, here is your email on the subject and the source of the file
>> that we used to diagnose the problem.  I would be a bit surprised if the
>> current problem is the same thing, but it is probably worth trying.  I
>> haven't tried compiling the program recently so let me know if you have
>> problems.
>>
>> Best regards,
>> Kern
>>
>> PS: It took me a long time to find these attachments because I thought
>> they were in the test environment, but it looks like I committed them to
>> the bacula/platforms/freebsd directory.
>>
>> <pthreads-fix.txt><tapetest.c>
>> ------------------------------------------------------------------------------
>> One dashboard for servers and applications across Physical-Virtual-Cloud
>> Widest out-of-the-box monitoring support with 50+ applications
>> Performance metrics, stats and reports that give you Actionable Insights
>> Deep dive visibility with transaction tracing using APM Insight.
>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>> _______________________________________________
>> Bacula-devel mailing list
>> Bacula-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>
> —
> Dan Langille
> http://langille.org/
>
>
>
>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlVdesIACgkQNgfoSvWqwEjRzACgrgM49Nsz3hzKahNlq8Qm+lfy
dnYAn3is17IvOZAf1ZuTF8TMQm5jzoNa
=+frE
-----END PGP SIGNATURE-----

