Hey Andreas,

sorry for taking so long to get back to you. Did this restore issue happen more than once, i.e. for more than one copy/backup?
The hint with `Maximum Concurrent Jobs = 20` is something I will look into.

Kind Regards
Sebastian Sura

On 18.06.25 at 18:21, 'Andreas R' via bareos-users wrote:
Here is a little update.
I have created more devices in the sd for parallel job execution, each of them with "Maximum Concurrent Jobs = 1". Previously we had a single device with 20 concurrent jobs. That solved the problem for us, at least for new backups.

Still, something seems to be wrong with copy jobs and the crashing fd.
Let me know if I can provide any more information to get this sorted out.

I will be on vacation until end of next week.
Best wishes,
Andreas
Andreas R wrote on Tuesday, 3 June 2025 at 14:53:05 UTC+2:

    Hi Sebastian,

    the bscan output with the modified bsr was uploaded to the shared
    folder.

    I did some more debugging.

    First I created a new storage and a new disk pool.
    Then I copied the initial full job to the new disk pool. (disk > disk)
    Selection Pattern = "SELECT 212964 AS jobid;"

    The restore from that pool also failed. So it seems the problem is
    not related to tape.

    With the debug traces I was able to identify affected files. There
    is some kind of pattern:
    host1:
    - /var/adm/backup/rpmdb/Packages-20250517.gz
    - /var/adm/backup/rpmdb/Packages-20250520.gz
    - /var/lib/ca-certificates/openssl/OISTE_WISeKey_Global_Root_GC_CA.pem
    host2:
    - /var/adm/backup/rpmdb/Packages-20250517.gz
    - /etc/vmware-tools/vgauth/schemas/XMLSchema.xsd
    host3:
    - /etc/vmware-tools/vgauth/schemas/XMLSchema.xsd
    host4:
    - /var/lib/ca-certificates/openssl/DIGITALSIGN_GLOBAL_ROOT_ECDSA_CA.pem
    - /var/lib/sss/mc/initgroups
    etc.
    All these jobs run simultaneously to a single pool.

    Have a nice vacation,
    Andreas

    Sebastian Sura wrote on Tuesday, 3 June 2025 at 09:45:07 UTC+2:

        Hi Andreas,

        thanks for the help!  Diffing those files yielded:

        -bscan: stored/bscan.cc:496-0 Record: ... Stream=20 len=262144
        +bscan: stored/bscan.cc:496-0 Record: ... Stream=20 len=209312

        This is very weird.  It looks like some of the data was not
        copied correctly.  I will come back to this after my
        vacation.  It definitely looks weird.
        Could you modify the copy.bsr by deleting the
        VolSessionId=,VolSessionTime=,FileIndex=,Count= lines and
        running bscan again like before?
        I am wondering if some other job somehow cut off that part.
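
The bsr edit described above can also be scripted. What follows is only a minimal sketch and not part of Bareos: `strip_session_filters` is a hypothetical helper name, and it assumes the bsr file holds one `key=value` entry per line. It drops the four keys named above and leaves every other line (volume, device, address ranges) untouched.

```python
# Keys to remove from the bsr, as listed in the message above.
DROP_KEYS = {"volsessionid", "volsessiontime", "fileindex", "count"}


def strip_session_filters(bsr_text):
    """Return bsr_text without the session/file selection lines.

    Assumption: each bsr entry sits on its own line as key=value.
    Lines whose key is not in DROP_KEYS are kept verbatim.
    """
    kept = []
    for line in bsr_text.splitlines():
        key = line.split("=", 1)[0].strip().lower()
        if key in DROP_KEYS:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

One way to use it is to read /tmp/copy.bsr, pass its text through the function, and write the result to a new file, so that the original bsr stays available for comparison.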

        Kind Regards
        Sebastian Sura

        On 02.06.25 at 07:48, Sebastian Sura wrote:

        Hi Andreas,

        I want to check why the copy is not restorable. Could you do
        the following for me?
        1) Grab the bsr of the (working) full and the (not working)
        copy.  You can do this via

        * restore jobid=<full/copy id> bsr=/path/to/the/file.bsr all done

        Bareos then writes the bsr to the given file.  Let's say the
        bsrs are now in /tmp/full.bsr and /tmp/copy.bsr.

        2) We now want to use bscan to see what data is getting sent
        to the fd:

        $ bscan -b /path/to/the/file.bsr --list-records -c path/to/config ... <your device>

        This should output a list like the following:

        bscan: stored/butil.cc:327-0 Using device: "FileStorage2" for reading.
        02-Jun 07:37 bscan JobId 0: Ready to read from volume "Copy-0002" on device "FileStorage2" (storage).
        02-Jun 07:37 bscan JobId 0: Forward spacing Volume "Copy-0002" to file:block 0:216.
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=-4 Stream=5 len=164
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=1 len=184
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=22 len=640
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=20 len=8624
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=20 len=16
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=1998 len=81
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=19 len=322
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=40 len=16
        bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=2 Stream=1 len=185
        ...

        Could you send the two bsrs and the two lists to me?
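
Once both record lists exist, they can be compared record by record instead of by eye. The following is a minimal sketch under stated assumptions, not a Bareos tool: `parse_records` and `first_mismatch` are hypothetical helper names, the regular expression is modelled only on the `Record:` lines shown above, and it assumes that a copy keeps FileIndex, Stream and len of every data record. Session id and session time are ignored because they are expected to differ between the full and the copy, and records with a negative FileIndex (labels) are skipped.

```python
import re
from itertools import zip_longest

# Matches the "Record:" part of a bscan --list-records line. \s+ between
# fields also tolerates lines that were wrapped by a mail client.
RECORD = re.compile(
    r"Record:\s+SessId=(\d+)\s+SessTim=(\d+)\s+"
    r"FileIndex=(-?\d+)\s+Stream=(-?\d+)\s+len=(\d+)"
)


def parse_records(text):
    """Return (FileIndex, Stream, len) tuples in the order they were printed.

    Label records (negative FileIndex) are skipped; this is an assumption
    made here to keep session-specific labels out of the comparison.
    """
    records = []
    for _sess_id, _sess_time, file_index, stream, length in RECORD.findall(text):
        if int(file_index) > 0:
            records.append((int(file_index), int(stream), int(length)))
    return records


def first_mismatch(full_text, copy_text):
    """Return (position, full_record, copy_record) for the first difference.

    Returns None when both lists contain the same records. A missing
    record on either side shows up as None in the returned tuple.
    """
    pairs = zip_longest(parse_records(full_text), parse_records(copy_text))
    for position, (full_rec, copy_rec) in enumerate(pairs):
        if full_rec != copy_rec:
            return position, full_rec, copy_rec
    return None
```

Feeding it the saved bscan output of the full and of the copy points at the first record whose stream or length differs, which narrows down which file is affected.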

        Kind Regards
        Sebastian Sura

        On 30.05.25 at 13:08, 'Andreas R' via bareos-users wrote:
        I have sent you the debug trace. Let me know if I can
        provide further information.
        Kind Regards
        Andreas
        Sebastian Sura wrote on Wednesday, 28 May 2025 at 09:45:20
        UTC+2:

            Thanks for that traceback.  Something really weird is
            happening.  It looks like the fd tries to decrypt your
            encrypted backup, and it thinks it succeeds, but it
            actually went wrong.

            Could you redo the restore, but with debug tracing
            enabled? I.e. run

            setdebug client=<clientname> level=500 trace=1

            before the restore.
            This command should print a filename where the debug
            messages will be stored.  It would be great if you could
            send this file to me (after the filedaemon crashed).

            I created an internal issue to track this as there is
            clearly something going wrong here.

            Kind Regards
            Sebastian Sura

            On 27.05.25 at 13:23, 'Andreas R' via bareos-users wrote:
            Thank you for looking into this matter.
            Here is the debug report.

            Best Regards,
            Andreas

            Sebastian Sura wrote on Tuesday, 27 May 2025 at
            10:07:07 UTC+2:

                Thanks for the crash report.  This looks very
                weird.  I have not seen this kind of crash before.
                Would it be possible for you to install the debug
                packages and recreate the crash?

                See here on how to install the debug symbol
                packages:
                https://docs.bareos.org/Appendix/Debugging.html#installing-debug-symbols-packages

                Kind Regards
                Sebastian Sura

                On 26.05.25 at 16:42, 'Andreas R' via
                bareos-users wrote:
                Hi Sebastian,

                thank you for your reply.
                I have attached both files.

                Kind Regards,
                Andreas

                Sebastian Sura wrote on Monday, 26 May 2025 at
                14:38:07 UTC+2:

                    Hi Andreas,

                    you attached the `.bactrace` file that the fd
                    created.  It would be very helpful if you
                    could also send us the `.traceback` file that
                    was created during the crash, as that file
                    contains the stacktrace.
                    Without it we would have to guess where the
                    problem occurred.

                    As this problem occurred on a restore, could you

                    1) check if this is reproducible, and if so,
                    2) send us the bootstrap record file of that
                    restore job?

                    If you give the restore command the option
                    `bootstrap=<path>`, then bareos will write the
                    bsr file to that path and will not delete it.

                    Kind Regards
                    Sebastian Sura

                    On 26.05.25 at 12:23, 'Andreas R' via
                    bareos-users wrote:
                    Hi,

                    I have trouble restoring from tape. Jobs
                    start as expected, but at some point during
                    the restore, the filedaemon is killed with
                    signal 11.

                    *restore jobid=213438 client=prestore01-fd all done yes

                    May 23 05:16:57 prestore01 bareos-fd[30717]: bareos-fd, prestore01-fd got signal 11 - Segmentation violation. Attempting traceback.
                    May 23 05:16:57 prestore01 bareos-fd[30717]: exepath=/usr/sbin/
                    May 23 05:16:57 prestore01 bareos-fd[30717]: BAREOS interrupted by signal 11: Segmentation violation
                    May 23 05:16:57 prestore01 bareos-fd[30917]: Calling: /usr/sbin/btraceback /usr/sbin/bareos-fd 30717 /var/lib/bareos
                    May 23 05:16:57 prestore01 bareos-fd[30924]: bsmtp: tools/bsmtp.cc:455-0 Failed to connect to mailhost localhost
                    May 23 05:16:57 prestore01 bareos-fd[30717]: The btraceback call returned 1
                    May 23 05:16:57 prestore01 bareos-fd[30717]: Dumping: /var/lib/bareos/prestore01-fd.30717.bactrace

                    cat /var/lib/bareos/prestore01-fd.30717.bactrace
                    Attempt to dump current JCRs. njcrs=1
                    threadid=0x00007f399fdfe6c0 JobId=213439 JobStatus=R jcr=0x7f3998047ec0 name=RestoreFiles.2025-05-23_10.16.37_28
                    threadid=0x00007f399fdfe6c0 killable=1 JobId=213439 JobStatus=R jcr=0x7f3998047ec0 name=RestoreFiles.2025-05-23_10.16.37_28
                           UseCount=1
                           JobType=R JobLevel=
                           sched_time=23-May-2025 05:16 start_time=23-May-2025 05:16
                           end_time=31-Dec-1969 18:00 wait_time=31-Dec-1969 18:00
                           db=(nil) db_batch=(nil) batch_started=0

                    Steps to reproduce:
                    1. Full backup to disk
                    2. Copy to tape via next pool
                    3. Restore from disk is ok
                    4. Restore from tape is not ok

                    What I tried without success so far:
                    - Deleted the jobs from tape and copied them
                      again. The error occurred after the same
                      number of restored files.
                    - Tried a different tape
                    - Tried other fd versions: 22 (debian),
                      23 (suse) and 24 (suse)
                    - Changed the blocksize to 512 in the sd
                    - Disabled compression and reran everything

                    Client {
                     Name = prestore01-fd
                     #Maximum Concurrent Jobs = 20
                     FDport = 9102
                     PKI Signatures = Yes
                     PKI Encryption = Yes
                     PKI Keypair = "/etc/bareos/master.pem"
                     PKI Master Key = "/etc/bareos/prestore01.cert"
                     PkiCipher = AES256
                    }

                    Pool {
                    Name = Full
                    Pool Type = Backup
                    Recycle = Yes
                    Volume Retention = 12 months
                    Maximum Volumes = 125
                    Maximum Volume Bytes = 125G
                    Next Pool = "TapeFull"
                    Label Format = "Full-"
                    Storage = LocalStorage
                    }

                    Pool {
                    Name = TapeFull
                    Pool Type = Backup
                    Recycle = Yes
                    Volume Retention = 13 month
                    Storage = TL1000
                    Cleaning Prefix = CLN
                    }

                    Job {
                    Name = CopyFull2Tape
                    JobDefs = "CycleJob"
                    Type = Copy
                    Selection Type = PoolUncopiedJobs
                    Level = Full
                    Pool = Full
                    Messages = Standard
                    Client = pbackup01-fd
                    FileSet = "SuseBase"
                    Storage = "LocalStorage"
                    Schedule = "CopyFull2Tape"
                    }

                    System Info:
                    Bareos: 24.0.4~pre0.1014be830-74
                    OS: openSUSE Leap 15.6
                    Catalog: Postgresql
                    Tape: LTO8

                    Thanks in advance

                    --
                    You received this message because you are
                    subscribed to the Google Groups
                    "bareos-users" group.
                    To unsubscribe from this group and stop
                    receiving emails from it, send an email to
                    bareos-users...@googlegroups.com.
                    To view this discussion visit
                    https://groups.google.com/d/msgid/bareos-users/08776ca6-2a98-4901-a228-524922713a9en%40googlegroups.com

-- Sebastian Sura sebasti...@bareos.com
                      Bareos GmbH & Co. KG            Phone: +49 221 630693-0
                      https://www.bareos.com
                      Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
                      Komplementär: Bareos Verwaltungs-GmbH
                      Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
