Hi Avra,

On 20 February 2017 at 02:51, Avra Sengupta <[email protected]> wrote:
> Hi D,
>
> It seems you tried to take a clone of a snapshot, when that snapshot was
> not activated.

Correct. As per my command history, I noticed the issue, checked the
snapshot's status, and activated it. I included this in my command history
just to clear up any doubts from the logs.

> However, in this scenario the cloned volume should not be in an
> inconsistent state. I will try to reproduce this and see if it's a bug.
> Meanwhile, could you please answer the following queries:
>
> 1. How many nodes were in the cluster?

There are 4 nodes in a (2+1)x2 setup: s0 replicates to s1, with an arbiter
on s2, and s2 replicates to s3, with an arbiter on s0.

> 2. How many bricks does the snapshot data-bck_GMT-2017.02.09-14.15.43 have?

6 bricks, including the 2 arbiters.

> 3. Was the snapshot clone command issued from a node which did not have
> any bricks for the snapshot data-bck_GMT-2017.02.09-14.15.43?

All commands were issued from s0. All volumes have bricks on every node in
the cluster.

> 4. I see you tried to delete the new cloned volume. Did the new cloned
> volume land in this state after failure to create the clone, or failure
> to delete the clone?

I noticed there was something wrong as soon as I created the clone. The
clone command completed, but I was then unable to do anything with it
because the clone didn't exist on s1-s3.

> If you want to remove the half-baked volume from the cluster, please
> proceed with the following steps:
>
> 1. Bring down glusterd on all nodes by running the following command on
>    all nodes:
>    $ systemctl stop glusterd
>    Verify that glusterd is down on all nodes by running the following
>    command on all nodes:
>    $ systemctl status glusterd
> 2. Delete the following repo from all the nodes (whichever nodes it
>    exists on):
>    /var/lib/glusterd/vols/data-teste

The repo only exists on s0, but stopping glusterd on only s0 and deleting
the directory didn't work; the directory was restored as soon as glusterd
was restarted.
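For anyone following along, the cleanup Avra describes can be sketched as a small script. This is only a sketch under assumptions not stated in the thread: that the peers are reachable as s0-s3, that root SSH from the admin node works, and that removing the stale directory on every node before any glusterd restarts is sufficient. Set DRY_RUN=1 to preview the commands without touching the cluster.

```shell
#!/bin/sh
# Sketch of the half-baked-volume cleanup (hostnames and SSH access are
# assumptions; adapt to your environment before running for real).

NODES="s0 s1 s2 s3"
VOLDIR=/var/lib/glusterd/vols/data-teste

# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cleanup_half_baked_volume() {
    # 1. Stop glusterd on ALL nodes first, so no running peer can push
    #    the stale volume definition back after the delete.
    for n in $NODES; do
        run ssh "root@$n" systemctl stop glusterd
    done
    # 2. Remove the volume's definition directory wherever it exists
    #    (rm -rf is a no-op on nodes that don't have it).
    for n in $NODES; do
        run ssh "root@$n" rm -rf "$VOLDIR"
    done
    # 3. Bring glusterd back up everywhere.
    for n in $NODES; do
        run ssh "root@$n" systemctl start glusterd
    done
}

# Preview the commands without touching the cluster:
DRY_RUN=1 cleanup_half_baked_volume
```

The ordering matters: as Doug found, deleting the directory on one node while the other glusterd daemons are still up just gets it restored when the peers sync their configuration.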
I haven't yet tried stopping glusterd on *all* nodes before doing this,
although I'll need to plan for that, as it'll take the entire cluster off
the air.

Thanks for the reply,
Doug

> Regards,
> Avra
>
> On 02/16/2017 08:01 PM, Gambit15 wrote:
>
> Hey guys,
> I tried to create a new volume from a cloned snapshot yesterday, however
> something went wrong during the process & I'm now stuck with the new
> volume being created on the server I ran the commands on (s0), but not
> on the rest of the peers. I'm unable to delete this new volume from the
> server, as it doesn't exist on the peers.
>
> What do I do?
> Any insights into what may have gone wrong?
>
> CentOS 7.3.1611
> Gluster 3.8.8
>
> The command history & extract from etc-glusterfs-glusterd.vol.log are
> included below.
>
> gluster volume list
> gluster snapshot list
> gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43
> gluster volume status data-teste
> gluster volume delete data-teste
> gluster snapshot create teste data
> gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
> gluster snapshot status
> gluster snapshot activate teste_GMT-2017.02.15-12.44.04
> gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
>
> [2017-02-15 12:43:21.667403] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data-teste
> [2017-02-15 12:43:21.682530] E [MSGID: 106301] [glusterd-syncop.c:1297:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume data-teste is not started
> [2017-02-15 12:43:43.633031] I [MSGID: 106495] [glusterd-handler.c:3128:__glusterd_handle_getwd] 0-glusterd: Received getwd req
> [2017-02-15 12:43:43.640597] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2) [0x7ffb396a14b2] -->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65) [0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7ffb44ec31c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post --volname=data-teste
> [2017-02-15 13:05:20.103423] E [MSGID: 106122] [glusterd-snapshot.c:2397:glusterd_snapshot_clone_prevalidate] 0-management: Failed to pre validate
> [2017-02-15 13:05:20.103464] E [MSGID: 106443] [glusterd-snapshot.c:2413:glusterd_snapshot_clone_prevalidate] 0-management: One or more bricks are not running. Please run snapshot status command to see brick status. Please start the stopped brick and then issue snapshot clone command
> [2017-02-15 13:05:20.103481] W [MSGID: 106443] [glusterd-snapshot.c:8563:glusterd_snapshot_prevalidate] 0-management: Snapshot clone pre-validation failed
> [2017-02-15 13:05:20.103492] W [MSGID: 106122] [glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed
> [2017-02-15 13:05:20.103503] E [MSGID: 106122] [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node
> [2017-02-15 13:05:20.103514] E [MSGID: 106122] [glusterd-mgmt.c:2243:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
> [2017-02-15 13:05:20.103531] E [MSGID: 106027] [glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone data-teste volinfo
> [2017-02-15 13:05:20.103542] W [MSGID: 106444] [glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed
> [2017-02-15 13:05:20.103561] W [MSGID: 106121] [glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
> [2017-02-15 13:05:20.103572] E [MSGID: 106121] [glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
> [2017-02-15 13:05:20.103582] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
> [2017-02-15 13:11:15.862858] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick1-data-brick not found [Invalid argument]
> [2017-02-15 13:11:16.314759] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick on port 49452
> [2017-02-15 13:11:16.316090] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2017-02-15 13:11:16.348867] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick6-data-arbiter not found [Invalid argument]
> [2017-02-15 13:11:16.558878] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter on port 49453
> [2017-02-15 13:11:16.559883] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2017-02-15 13:11:23.279721] E [MSGID: 106030] [glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking snapshot of the brick (/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick) of device /dev/mapper/v0.dc0.cte--g0-c3ceae3889484e96ab8bed69593cf6d3_0 failed
> [2017-02-15 13:11:23.279790] E [MSGID: 106030] [glusterd-snapshot.c:5135:glusterd_take_brick_snapshot] 0-management: Failed to take snapshot of brick s0:/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick
> [2017-02-15 13:11:23.279806] E [MSGID: 106030] [glusterd-snapshot.c:6484:glusterd_take_brick_snapshot_task] 0-management: Failed to take backend snapshot for brick s0:/run/gluster/snaps/data-teste/brick1/data/brick volume(data-teste)
> [2017-02-15 13:11:23.286678] E [MSGID: 106030] [glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking snapshot of the brick (/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter) of device /dev/mapper/v0.dc0.cte--g0-c3ceae3889484e96ab8bed69593cf6d3_1 failed
> [2017-02-15 13:11:23.286735] E [MSGID: 106030] [glusterd-snapshot.c:5135:glusterd_take_brick_snapshot] 0-management: Failed to take snapshot of brick s0:/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter
> [2017-02-15 13:11:23.286749] E [MSGID: 106030] [glusterd-snapshot.c:6484:glusterd_take_brick_snapshot_task] 0-management: Failed to take backend snapshot for brick s0:/run/gluster/snaps/data-teste/brick6/data/arbiter volume(data-teste)
> [2017-02-15 13:11:23.286793] E [MSGID: 106030] [glusterd-snapshot.c:6626:glusterd_schedule_brick_snapshot] 0-management: Failed to create snapshot
> [2017-02-15 13:11:23.286813] E [MSGID: 106441] [glusterd-snapshot.c:6796:glusterd_snapshot_clone_commit] 0-management: Failed to take backend snapshot data-teste
> [2017-02-15 13:11:25.530666] E [MSGID: 106442] [glusterd-snapshot.c:8308:glusterd_snapshot] 0-management: Failed to clone snapshot
> [2017-02-15 13:11:25.530721] W [MSGID: 106123] [glusterd-mgmt.c:272:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed
> [2017-02-15 13:11:25.530735] E [MSGID: 106123] [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
> [2017-02-15 13:11:25.530749] E [MSGID: 106123] [glusterd-mgmt.c:2304:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
> [2017-02-15 13:11:25.532312] E [MSGID: 106027] [glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone data-teste volinfo
> [2017-02-15 13:11:25.532339] W [MSGID: 106444] [glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed
> [2017-02-15 13:11:25.532353] W [MSGID: 106121] [glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
> [2017-02-15 13:11:25.532367] E [MSGID: 106121] [glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
> [2017-02-15 13:11:25.532381] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
> [2017-02-15 13:29:53.779020] E [MSGID: 106062] [glusterd-snapshot-utils.c:2391:glusterd_snap_create_use_rsp_dict] 0-management: failed to get snap UUID
> [2017-02-15 13:29:53.779073] E [MSGID: 106099] [glusterd-snapshot-utils.c:2507:glusterd_snap_use_rsp_dict] 0-glusterd: Unable to use rsp dict
> [2017-02-15 13:29:53.779096] E [MSGID: 106108] [glusterd-mgmt.c:1305:gd_mgmt_v3_commit_cbk_fn] 0-management: Failed to aggregate response from node/brick
> [2017-02-15 13:29:53.779136] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s3. Please check log file for details.
> [2017-02-15 13:29:54.136196] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s1. Please check log file for details.
> The message "E [MSGID: 106108] [glusterd-mgmt.c:1305:gd_mgmt_v3_commit_cbk_fn] 0-management: Failed to aggregate response from node/brick" repeated 2 times between [2017-02-15 13:29:53.779096] and [2017-02-15 13:29:54.535080]
> [2017-02-15 13:29:54.535098] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s2. Please check log file for details.
> [2017-02-15 13:29:54.535320] E [MSGID: 106123] [glusterd-mgmt.c:1490:glusterd_mgmt_v3_commit] 0-management: Commit failed on peers
> [2017-02-15 13:29:54.535370] E [MSGID: 106123] [glusterd-mgmt.c:2304:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
> [2017-02-15 13:29:54.539708] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s1. Please check log file for details.
> [2017-02-15 13:29:54.539797] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s3. Please check log file for details.
> [2017-02-15 13:29:54.539856] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s2. Please check log file for details.
> [2017-02-15 13:29:54.540224] E [MSGID: 106121] [glusterd-mgmt.c:1713:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed on peers
> [2017-02-15 13:29:54.540256] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
> The message "E [MSGID: 106062] [glusterd-snapshot-utils.c:2391:glusterd_snap_create_use_rsp_dict] 0-management: failed to get snap UUID" repeated 2 times between [2017-02-15 13:29:53.779020] and [2017-02-15 13:29:54.535075]
> The message "E [MSGID: 106099] [glusterd-snapshot-utils.c:2507:glusterd_snap_use_rsp_dict] 0-glusterd: Unable to use rsp dict" repeated 2 times between [2017-02-15 13:29:53.779073] and [2017-02-15 13:29:54.535078]
> [2017-02-15 13:31:14.285666] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
> [2017-02-15 13:32:17.827422] E [MSGID: 106027] [glusterd-handler.c:4670:glusterd_get_volume_opts] 0-management: Volume cluster.locking-scheme does not exist
> [2017-02-15 13:34:02.635762] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s1. Volume data-teste does not exist
> [2017-02-15 13:34:02.635838] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s2. Volume data-teste does not exist
> [2017-02-15 13:34:02.635889] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s3. Volume data-teste does not exist
> [2017-02-15 13:34:02.636092] E [MSGID: 106122] [glusterd-mgmt.c:947:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed on peers
> [2017-02-15 13:34:02.636132] E [MSGID: 106122] [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre Validation Failed
> [2017-02-15 13:34:20.313228] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Error: Volume data-teste does not exist
> [2017-02-15 13:34:20.313320] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s1. Error: Volume data-teste does not exist
> [2017-02-15 13:34:20.313377] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s3. Error: Volume data-teste does not exist
> [2017-02-15 13:34:36.796455] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s1. Error: Volume data-teste does not exist
> [2017-02-15 13:34:36.796830] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s3. Error: Volume data-teste does not exist
> [2017-02-15 13:34:36.796896] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Error: Volume data-teste does not exist
>
> Many thanks!
> D
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
