On Wed, Jul 29, 2015 at 5:17 PM Michael Mol <[email protected]> wrote:
> On Mon, Jul 27, 2015 at 5:03 PM Ryan Clough <[email protected]> wrote:
>> Hello,
>>
>> I have cross-posted this question in the bareos-users mailing list.
>>
>> Wondering if anyone has tried this, because I am unable to back up data
>> that is mounted via Gluster FUSE or Gluster NFS. Basically, I have the
>> Gluster volume mounted on the Bareos Director, which also has the tape
>> changer attached.
>>
>> Here is some information about versions:
>> Bareos version 14.2.2
>> Gluster version 3.7.2
>> Scientific Linux version 6.6
>>
>> Our Gluster volume consists of two nodes in distribute only. Here is the
>> configuration of our volume:
>>
>> [root@hgluster02 ~]# gluster volume info
>>
>> Volume Name: export_volume
>> Type: Distribute
>> Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: hgluster01:/gluster_data
>> Brick2: hgluster02:/gluster_data
>> Options Reconfigured:
>> performance.io-thread-count: 24
>> server.event-threads: 20
>> client.event-threads: 4
>> performance.readdir-ahead: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: off
>> auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*,10.2.0.*,10.0.60.*
>> server.allow-insecure: on
>> server.root-squash: on
>> performance.read-ahead: on
>> features.quota-deem-statfs: on
>> diagnostics.brick-log-level: WARNING
>>
>> When I try to back up a directory from the Gluster FUSE or Gluster NFS
>> mount and monitor the network communication, I only see data being pulled
>> from the hgluster01 brick. When the job finishes, Bareos thinks that it
>> completed without error, but the messages for the job include lots and
>> lots of permission-denied errors like this:
>>
>> 15-Jul 02:03 ripper.red.dsic.com-fd JobId 613: Cannot open "/export/rclough/psdv-2014-archives-2/scan_111.tar.bak": ERR=Permission denied.
>> 15-Jul 02:03 ripper.red.dsic.com-fd JobId 613: Cannot open "/export/rclough/psdv-2014-archives-2/run_219.tar.bak": ERR=Permission denied.
>> 15-Jul 02:03 ripper.red.dsic.com-fd JobId 613: Cannot open "/export/rclough/psdv-2014-archives-2/scan_112.tar.bak": ERR=Permission denied.
>> 15-Jul 02:03 ripper.red.dsic.com-fd JobId 613: Cannot open "/export/rclough/psdv-2014-archives-2/run_220.tar.bak": ERR=Permission denied.
>> 15-Jul 02:03 ripper.red.dsic.com-fd JobId 613: Cannot open "/export/rclough/psdv-2014-archives-2/scan_114.tar.bak": ERR=Permission denied.
>>
>> At first I thought this might be a root-squash problem, but if I try to
>> read/copy a file as the root user from the Bareos server that is doing
>> the backup, I can read files just fine.
>>
>> When the job finishes, it reports that it finished "OK -- with warnings",
>> but, again, the log for the job is filled with "ERR=Permission denied"
>> messages. In my opinion, this job did not finish OK and should be marked
>> Failed. Some of the files from the hgluster02 brick are backed up, but
>> the ones with permission errors are not. When I restore the job, all of
>> the files with permission errors are empty.
>>
>> Has anyone successfully used Bareos to back up data from Gluster mounts?
>> This is an important use case for us, because this is the largest single
>> volume we have for preparing large amounts of data to be archived.
>>
>> Thank you for your time,
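
(Interjecting on Ryan's root-squash theory before getting to my own post
below: server.root-squash squashes the *client's* root uid on the server
side, and bareos-fd typically runs as root, so an interactive read test as
root isn't conclusive, especially if those particular files happen to be
world-readable. A sketch of a more direct experiment, using the volume name
from the output above, if you can tolerate briefly relaxing the setting:

  # gluster volume set export_volume server.root-squash off
  ... re-run the backup job and watch for ERR=Permission denied ...
  # gluster volume set export_volume server.root-squash on

If the errors vanish with root-squash off, that might also explain why only
data on one brick was affected, since distribute hashes files across bricks
and the two servers could be squashing differently.)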
> How did I not see this earlier? I'm seeing a very similar problem. I
> just posted this to the bareos-user list:
>
> Help! I've run out of know-how while trying to fix this myself...
>
> Environment: CentOS 7, x86_64
> Bareos version: 14.2.2-46.1.el7 (via
> http://download.bareos.org/bareos/release/14.2/CentOS_7/ repo)
> Gluster version: 3.7.3-1.el7 (via
> http://download.gluster.org/pub/gluster/glusterfs/3.7/LATEST/EPEL.repo/epel-$releasever/$basearch/ repo)
>
> Symptom: Bareos attempts to mount a volume and spits back a Permission
> Denied error, as though it didn't have permission to access the relevant
> file.
>
> I've been seeing this at least since Gluster 3.7.2, which I updated to
> owing to a need to expand my backend storage; 3.7.1 (which otherwise
> worked fine) had a bug that broke bricks while rebalancing.
>
> I've verified that the bareos storage daemon is running as the bareos
> user, and I've also, by way of a FUSE mount into the gluster volume,
> verified ownership of the volume:
>
> # ls -l Email-Incremental-0155
> -rw-r-----. 1 bareos bareos 1073728379 Jun 10 21:04 Email-Incremental-0155
>
> And uid/gid, for reference:
>
> # ls -ln Email-Incremental-0155
> -rw-r-----. 1 997 995 1073728379 Jun 10 21:04 Email-Incremental-0155
>
> And in the gluster volume, the storage owner-{uid,gid}:
>
> # gluster volume info bareos
>
> Volume Name: bareos
> Type: Distribute
> Volume ID: f4cb7aac-3631-41cc-9afa-f182a514d116
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: backup-stor-1[censored]:/var/gluster/bareos/brick-bareos
> Brick2: backup-stor-2[censored]:/var/gluster/bareos/brick-bareos
> Options Reconfigured:
> server.allow-insecure: on
> performance.readdir-ahead: off
> nfs.disable: on
> performance.cache-size: 128MB
> performance.write-behind-window-size: 256MB
> performance.cache-refresh-timeout: 10
> performance.io-thread-count: 16
> performance.cache-max-file-size: 4TB
> performance.flush-behind: on
> performance.client-io-threads: on
> storage.owner-uid: 997
> storage.owner-gid: 995
> features.bitrot: off
> features.scrub: Inactive
> features.scrub-freq: daily
> features.scrub-throttle: lazy
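
(Another aside, since uid/gid mismatches are the usual suspect here: with
mode rw-r----- and owner 997:995, only uid 997 can open that file
read-write; group 995 only gets read. A quick sketch for double-checking
what the daemon actually runs as, assuming a FUSE mount of the volume at the
stand-in path /mnt/bareos-test, and noting that bareos-sd reaches the volume
through libgfapi rather than FUSE, so this only approximates its access path:

  # ps -o user,group,cmd -C bareos-sd
  # id bareos
  # sudo -u bareos head -c1 /mnt/bareos-test/Email-Incremental-0155 >/dev/null && echo read ok

If those all look right, the problem is more likely on the gfapi side.)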
> In this run, the storage daemon and the file daemon happen to be on the
> same node. Here's trace output at level 200, obtained by running "tail -f
> *.trace" in bareos-sd's cwd:
>
> ==> backup-director-sd.trace <==
> backup-director-sd: fd_cmds.c:219-0 <filed: append open session
> backup-director-sd: fd_cmds.c:303-0 Append open session: append open session
> backup-director-sd: fd_cmds.c:314-0 >filed: 3000 OK open ticket = 1
> backup-director-sd: fd_cmds.c:219-0 <filed: append data 1
> backup-director-sd: fd_cmds.c:265-0 Append data: append data 1
> backup-director-sd: fd_cmds.c:267-0 <filed: append data 1
> backup-director-sd: append.c:69-0 Start append data. res=1
> backup-director-sd: acquire.c:369-0 acquire_append device is disk
> backup-director-sd: acquire.c:404-0 jid=924 Do mount_next_write_vol
> backup-director-sd: mount.c:71-0 Enter mount_next_volume(release=0) dev="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos)
> backup-director-sd: mount.c:84-0 mount_next_vol retry=0
> backup-director-sd: mount.c:604-0 No swap_dev set
> backup-director-sd: askdir.c:246-0 >dird CatReq Job=server2-email.2015-07-29_16.32.34_09 GetVolInfo VolName=Email-Incremental-0155 write=1
> backup-director-sd: askdir.c:175-0 <dird 1000 OK VolName=Email-Incremental-0155 VolJobs=0 VolFiles=0 VolBlocks=0 VolBytes=1 VolMounts=3 VolErrors=0 VolWrites=16646 MaxVolBytes=1073741824 VolCapacityBytes=0 VolStatus=Recycle Slot=0 MaxVolJobs=0 MaxVolFiles=0 InChanger=0 VolReadTime=0 VolWriteTime=8455280 EndFile=0 EndBlock=1073728378 LabelType=0 MediaId=156 EncryptionKey= MinBlocksize=0 MaxBlocksize=0
> backup-director-sd: askdir.c:211-0 do_get_volume_info return true slot=0 Volume=Email-Incremental-0155, VolminBlocksize=0 VolMaxBlocksize=0
> backup-director-sd: askdir.c:213-0 setting dcr->VolMinBlocksize(0) to vol.VolMinBlocksize(0)
> backup-director-sd: askdir.c:215-0 setting dcr->VolMaxBlocksize(0) to vol.VolMaxBlocksize(0)
> backup-director-sd: mount.c:122-0 After find_next_append. Vol=Email-Incremental-0155 Slot=0
> backup-director-sd: autochanger.c:99-0 Device "GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos) is not an autochanger
> backup-director-sd: mount.c:144-0 autoload_dev returns 0
> backup-director-sd: mount.c:175-0 want vol=Email-Incremental-0155 devvol= dev="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos)
> backup-director-sd: dev.c:536-0 open dev: type=5 dev_name="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos) vol=Email-Incremental-0155 mode=OPEN_READ_WRITE
> backup-director-sd: dev.c:540-0 call open_device mode=OPEN_READ_WRITE
> backup-director-sd: dev.c:941-0 Enter mount
> backup-director-sd: dev.c:610-0 open disk: mode=OPEN_READ_WRITE open(gluster://backup-stor-1[censored]/bareos/bareos/Email-Incremental-0155, 0x2, 0640)
>
> ==> backup-director-fd.trace <==
>
> ==> backup-director-sd.trace <==
> backup-director-sd: dev.c:617-0 open failed: dev.c:616 Could not open: gluster://backup-stor-1[censored]/bareos/bareos/Email-Incremental-0155, ERR=Permission denied
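
(For what it's worth, the 0x2 in that failing open() is O_RDWR, so this is
libgfapi's equivalent of open(path, O_RDWR, 0640) returning EACCES. A sketch
for reproducing that exact open mode as the daemon's numeric uid/gid,
against the same stand-in FUSE mount point as above; sudo's '#uid' form runs
a command as a raw uid, and bash's 3<> redirection opens the file
read-write. A FUSE mount exercises a different client path than the SD's
libgfapi one, so treat a pass here as narrowing, not proof:

  # sudo -u '#997' -g '#995' bash -c 'exec 3<>/mnt/bareos-test/Email-Incremental-0155 && echo O_RDWR open ok'

If that succeeds while the SD still gets EACCES, the next place I'd look is
the brick logs for the gfapi client's requests.)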
In response to Pranith's suggestion to Ryan to look at logs, I did find this
interesting in root-bareos.log when I FUSE-mounted the volume. (Interesting,
because everything is running the same version of gluster, at least as far
as the packages are telling me.)

==> root-bareos.log <==
[2015-07-29 21:26:39.465191] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-bareos-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-29 21:26:39.465737] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-bareos-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-29 21:26:39.465935] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-bareos-client-1: Connected to bareos-client-1, attached to remote volume '/var/gluster/bareos/brick-bareos'.
[2015-07-29 21:26:39.465999] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-bareos-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-29 21:26:39.466319] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-bareos-client-0: Connected to bareos-client-0, attached to remote volume '/var/gluster/bareos/brick-bareos'.
[2015-07-29 21:26:39.466344] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-bareos-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-29 21:26:39.471772] I [fuse-bridge.c:5053:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-07-29 21:26:39.471953] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-bareos-client-1: Server lk version = 1
[2015-07-29 21:26:39.472000] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-bareos-client-0: Server lk version = 1
[2015-07-29 21:26:39.473230] I [fuse-bridge.c:3979:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22

On both bricks, there's this or something similar, but the timestamps don't
correlate with the bareos errors:

The message "W [MSGID: 101095] [xlator.c:143:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.7.3/xlator/features/bitrot.so: cannot open shared object file: No such file or directory" repeated 3 times between [2015-07-29 19:50:34.593333] and [2015-07-29 19:50:34.593486]
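
(The bitrot.so warning is probably a red herring, since features.bitrot is
off on this volume, but it does smell like packaging leftovers from the
3.7.2 -> 3.7.3 upgrade. A quick consistency check I'd run on each brick
node, with the path taken from the warning itself:

  # rpm -qa | grep -i glusterfs
  # ls /usr/lib64/glusterfs/3.7.3/xlator/features/ | grep -i bitrot

If the package versions agree everywhere and the xlator really is missing,
reinstalling whichever glusterfs package ships it and then restarting
glusterd would be my next hedge.)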
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
