You'd basically have to copy the content of /var/lib/glusterd from fs001 to fs003 without overwriting fs003's node-specific details. Please ensure you don't touch the glusterd.info file or the content of /var/lib/glusterd/peers on fs003; the rest can be copied. After that I expect glusterd will come up.
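For example, something along these lines on fs003 should do it. This is just a rough, untested sketch: the hostname (fs001) and the service commands are assumptions, so adjust them to your environment, and keep a backup copy of fs003's /var/lib/glusterd so you can roll back if needed:

  # on fs003: stop glusterd and back up its current configuration
  systemctl stop glusterd          # or: service glusterd stop
  cp -a /var/lib/glusterd /var/lib/glusterd.bak

  # copy the configuration over from fs001, leaving fs003's own
  # glusterd.info and peers/ directory untouched (no --delete, so
  # the excluded files stay exactly as they are on fs003)
  rsync -av --exclude=glusterd.info --exclude=peers \
      fs001:/var/lib/glusterd/ /var/lib/glusterd/

  # start glusterd again and verify the cluster state
  systemctl start glusterd         # or: service glusterd start
  gluster peer status
  gluster volume info hpcscratch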
On Fri, 26 May 2017 at 20:30, Jarsulic, Michael [CRI] <[email protected]> wrote:

> Here is some further information on this issue:
>
> The version of gluster we are using is 3.7.6.
>
> Also, the error I found in the cmd history is:
>
> [2017-05-26 04:28:28.332700] : volume remove-brick hpcscratch cri16fs001-ib:/data/brick1/scratch commit : FAILED : Commit failed on cri16fs003-ib. Please check log file for details.
>
> I did not notice this at the time and made an attempt to remove the next brick to migrate the data off of the system. This left the servers in the following state.
>
> fs001 - /var/lib/glusterd/vols/hpcscratch/info
>
> type=0
> count=3
> status=1
> sub_count=0
> stripe_count=1
> replica_count=1
> disperse_count=0
> redundancy_count=0
> version=42
> transport-type=0
> volume-id=80b8eeed-1e72-45b9-8402-e01ae0130105
> …
> op-version=30700
> client-op-version=3
> quota-version=0
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> server.event-threads=8
> performance.client-io-threads=on
> client.event-threads=8
> performance.cache-size=32MB
> performance.readdir-ahead=on
> brick-0=cri16fs001-ib:-data-brick2-scratch
> brick-1=cri16fs003-ib:-data-brick5-scratch
> brick-2=cri16fs003-ib:-data-brick6-scratch
>
> fs003 - cat /var/lib/glusterd/vols/hpcscratch/info
>
> type=0
> count=4
> status=1
> sub_count=0
> stripe_count=1
> replica_count=1
> disperse_count=0
> redundancy_count=0
> version=35
> transport-type=0
> volume-id=80b8eeed-1e72-45b9-8402-e01ae0130105
> …
> op-version=30700
> client-op-version=3
> quota-version=0
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> performance.readdir-ahead=on
> performance.cache-size=32MB
> client.event-threads=8
> performance.client-io-threads=on
> server.event-threads=8
> brick-0=cri16fs001-ib:-data-brick1-scratch
> brick-1=cri16fs001-ib:-data-brick2-scratch
> brick-2=cri16fs003-ib:-data-brick5-scratch
> brick-3=cri16fs003-ib:-data-brick6-scratch
>
> fs001 - /var/lib/glusterd/vols/hpcscratch/node_state.info
>
> rebalance_status=5
> status=4
> rebalance_op=0
> rebalance-id=00000000-0000-0000-0000-000000000000
> brick1=cri16fs001-ib:/data/brick2/scratch
> count=1
>
> fs003 - /var/lib/glusterd/vols/hpcscratch/node_state.info
>
> rebalance_status=1
> status=0
> rebalance_op=9
> rebalance-id=0184577f-eb64-4af9-924d-91ead0605a1e
> brick1=cri16fs001-ib:/data/brick1/scratch
> count=1
>
> --
> Mike Jarsulic
>
> On 5/26/17, 8:22 AM, "[email protected] on behalf of Jarsulic, Michael [CRI]" <[email protected] on behalf of [email protected]> wrote:
>
> Recently, I had some problems with the OS hard drives in my glusterd servers and took one of my systems down for maintenance. The first step was to remove one of the bricks (brick1) hosted on the server (fs001). The data migrated successfully and completed last night. After that, I went to commit the changes and the commit failed. Afterwards, glusterd will not start on one of my servers (fs003).
>
> When I check the glusterd logs on fs003 I get the following errors whenever glusterd starts:
>
> [2017-05-26 04:37:21.358932] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
> [2017-05-26 04:37:21.382630] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
> [2017-05-26 04:37:21.382712] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /var/lib/glusterd as working directory
> [2017-05-26 04:37:21.422858] I [MSGID: 106228] [glusterd.c:433:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system [No such file or directory]
> [2017-05-26 04:37:21.450123] I [MSGID: 106513] [glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30706
> [2017-05-26 04:37:21.463812] E [MSGID: 101032] [store.c:434:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/hpcscratch/bricks/cri16fs001-ib:-data-brick1-scratch. [No such file or directory]
> [2017-05-26 04:37:21.463866] E [MSGID: 106201] [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: hpcscratch
> [2017-05-26 04:37:21.463919] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
> [2017-05-26 04:37:21.463943] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
> [2017-05-26 04:37:21.463970] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
> [2017-05-26 04:37:21.466703] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xda) [0x405cba] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x116) [0x405b96] -->/usr/sbin/glusterd(cleanup_and_exit+0x65) [0x4059d5] ) 0-: received signum (0), shutting down
>
> The volume is distribution only. The problem to me looks like it is still expecting brick1 on fs001 to be available in the volume. Is there any way to recover from this? Is there any more information that I can provide?
>
> --
> Mike Jarsulic

--
- Atin (atinm)
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
