On Thu, Dec 8, 2016 at 11:25 PM, Pranith Kumar Karampuri <[email protected]> wrote:
>
> On Thu, Dec 8, 2016 at 11:17 PM, Pranith Kumar Karampuri <[email protected]> wrote:
>
>> On Thu, Dec 8, 2016 at 10:22 PM, Ravishankar N <[email protected]> wrote:
>>
>>> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>>>
>>>> I was able to fix the sync by rsync-ing all the directories, then the
>>>> heal started. Now the next problem :) As soon as there are files on the
>>>> new brick, the gluster mount starts serving this brick to clients too,
>>>> but the new brick is not ready yet, as the sync is not yet done, so it
>>>> results in missing files on the client side. I temporarily removed the
>>>> new brick, now I am running a manual rsync and will add the brick again,
>>>> hope this will work.
>>>>
>>>> What mechanism is managing this issue? I guess there is something built
>>>> in to make a replica brick available only once the data is completely
>>>> synced.
>>>>
>>> This mechanism was introduced in 3.7.9 or 3.7.10
>>> (http://review.gluster.org/#/c/13806/). Before that version, you
>>> manually needed to set some xattrs on the bricks so that healing could
>>> happen in parallel while the client would still serve reads from the
>>> original brick. I can't find the link to the doc which describes these
>>> steps for setting the xattrs. :-(
>>>
>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
>>
> Oh, this is addition of bricks? Just do the following:
>
> 1) Bring the new brick down by killing it.
> 2) On the root of the mount directory (let's call it /mnt) do:
>
>    mkdir /mnt/<name-of-nonexistent-dir>
>    rmdir /mnt/<name-of-nonexistent-dir>
>    setfattr -n trusted.non-existent-key -v abc /mnt
>    setfattr -x trusted.non-existent-key /mnt
>
> 3) Start the volume using: "gluster volume start <volname> force"
>
> This will trigger the heal, which will make sure everything is healed and
> the application will only see the correct data.
>
> Since you did an explicit rsync, there is no guarantee that things will
> work as expected. We will be adding the steps above to documentation.

Please note that you need to do these steps exactly. If you do the
mkdir/rmdir/setfattr steps after bringing down the good brick instead of the
new one, a reverse heal will happen and the data will be removed.
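Put in one place, the sequence above would look roughly like the sketch below. It is only a sketch: the volume name, mount point and brick PID are placeholders to be taken from your own setup and from `gluster volume status`.

    #!/bin/sh
    # Sketch of the reset-and-heal sequence described above; placeholder
    # values, not a verified script.
    VOLNAME="<volname>"       # the replica volume, e.g. "storage"
    MNT="/mnt"                # a client mount point of that volume
    NEW_BRICK_PID="<pid>"     # pid of the *new* brick's glusterfsd, from 'gluster volume status'

    # 1) Bring the new brick down by killing its brick process.
    kill "$NEW_BRICK_PID"

    # 2) With only the good brick up, these operations on the mount root make
    #    the good brick record pending heal for the down (new) brick, so the
    #    good brick becomes the heal source.
    mkdir "$MNT/name-of-nonexistent-dir"
    rmdir "$MNT/name-of-nonexistent-dir"
    setfattr -n trusted.non-existent-key -v abc "$MNT"
    setfattr -x trusted.non-existent-key "$MNT"

    # 3) "start ... force" brings the killed brick back up and triggers the heal.
    gluster volume start "$VOLNAME" force

Heal progress can then be followed with `gluster volume heal <volname> info`.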
>
>>
>>> Calling it a day,
>>> Ravi
>>>
>>>> - Kindest regards,
>>>>
>>>> Milos Cuculovic
>>>> IT Manager
>>>>
>>>> ---
>>>> MDPI AG
>>>> Postfach, CH-4020 Basel, Switzerland
>>>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>>>> Tel. +41 61 683 77 35
>>>> Fax +41 61 302 89 18
>>>> Email: [email protected]
>>>> Skype: milos.cuculovic.mdpi
>>>>
>>>> On 08.12.2016 16:17, Ravishankar N wrote:
>>>>
>>>>> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>>>>
>>>>>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Ah, damn! I found the issue. On the storage server, the storage2 IP
>>>>>>> address was wrong, I had inverted two digits in the /etc/hosts file,
>>>>>>> sorry for that :(
>>>>>>>
>>>>>>> I was able to add the brick now, I started the heal, but still no
>>>>>>> data transfer visible.
>>>>>>>
>>>>> 1. Are the files getting created on the new brick though?
>>>>> 2. Can you provide the output of `getfattr -d -m . -e hex
>>>>>    /data/data-cluster` on both bricks?
>>>>> 3. Is it possible to attach gdb to the self-heal daemon on the original
>>>>>    (old) brick and get a backtrace?
>>>>>    `gdb -p <pid of self-heal daemon on the original brick>`
>>>>>    thread apply all bt  --> share this output
>>>>>    quit gdb.
>>>>>
>>>>> -Ravi
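Gathered non-interactively, those three checks might look like the sketch below. The self-heal daemon PID is a placeholder to be read from `gluster volume status`, and the gdb -batch/-ex invocation is just one way to capture the backtrace, not necessarily the exact steps Ravi had in mind.

    # Sketch of the requested diagnostics; run on the relevant servers.

    # 1. Are files appearing on the new brick at all?
    ls /data/data-cluster | head

    # 2. Extended attributes of the brick root, on both bricks:
    getfattr -d -m . -e hex /data/data-cluster

    # 3. Backtrace of the self-heal daemon on the original (old) brick; the
    #    PID is shown in the "Self-heal Daemon" line of 'gluster volume status'.
    SHD_PID="<pid-of-self-heal-daemon>"   # placeholder
    gdb -p "$SHD_PID" -batch -ex "thread apply all bt" > shd-backtrace.txt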
>>>>>> @Ravi/Pranith - can you help here?
>>>>>>
>>>>>>> By doing gluster volume status, I have
>>>>>>>
>>>>>>> Status of volume: storage
>>>>>>> Gluster process                        TCP Port  RDMA Port  Online  Pid
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Brick storage2:/data/data-cluster      49152     0          Y       23101
>>>>>>> Brick storage:/data/data-cluster       49152     0          Y       30773
>>>>>>> Self-heal Daemon on localhost          N/A       N/A        Y       30050
>>>>>>> Self-heal Daemon on storage            N/A       N/A        Y       30792
>>>>>>>
>>>>>>> Any idea?
>>>>>>>
>>>>>>> On storage I have:
>>>>>>> Number of Peers: 1
>>>>>>>
>>>>>>> Hostname: 195.65.194.217
>>>>>>> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>
>>>>>>> - Kindest regards,
>>>>>>>
>>>>>>> Milos Cuculovic
>>>>>>> IT Manager
>>>>>>>
>>>>>>> On 08.12.2016 13:55, Atin Mukherjee wrote:
>>>>>>>
>>>>>>>> Can you resend the attachment as zip? I am unable to extract the
>>>>>>>> content. We shouldn't have a 0-byte info file. What does gluster
>>>>>>>> peer status output say?
>>>>>>>>
>>>>>>>> On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I hope you received my last email Atin, thank you!
>>>>>>>>>
>>>>>>>>> - Kindest regards,
>>>>>>>>>
>>>>>>>>> Milos Cuculovic
>>>>>>>>> IT Manager
>>>>>>>>>
>>>>>>>>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>>>>>>>>
>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>> From: Atin Mukherjee <[email protected]>
>>>>>>>>>> Date: Thu, Dec 8, 2016 at 11:56 AM
>>>>>>>>>> Subject: Re: [Gluster-users] Replica brick not working
>>>>>>>>>> To: Ravishankar N <[email protected]>
>>>>>>>>>> Cc: Miloš Čučulović - MDPI <[email protected]>, Pranith Kumar
>>>>>>>>>> Karampuri <[email protected]>, gluster-users
>>>>>>>>>> <[email protected]>
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>>>>>>>>>>
>>>>>>>>>>>> From the log snippet:
>>>>>>>>>>>>
>>>>>>>>>>>> [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
>>>>>>>>>>>> [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
>>>>>>>>>>>> [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>>>>>>>>>>>
>>>>>>>>>>>> The last log entry indicates that we hit this code path in
>>>>>>>>>>>> gd_addbr_validate_replica_count ():
>>>>>>>>>>>>
>>>>>>>>>>>>     if (replica_count == volinfo->replica_count) {
>>>>>>>>>>>>             if (!(total_bricks % volinfo->dist_leaf_count)) {
>>>>>>>>>>>>                     ret = 1;
>>>>>>>>>>>>                     goto out;
>>>>>>>>>>>>             }
>>>>>>>>>>>>     }
>>>>>>>>>>>>
>>>>>>>>>>> It seems unlikely that this snippet was hit, because we print the
>>>>>>>>>>> E [MSGID: 106291] in the above message only if ret == -1.
>>>>>>>>>>> gd_addbr_validate_replica_count() returns -1 and yet does not
>>>>>>>>>>> populate err_str only when volinfo->type doesn't match any of the
>>>>>>>>>>> known volume types, so volinfo->type is corrupted perhaps?
>>>>>>>>>>
>>>>>>>>>> You are right, I missed that ret is set to 1 here in the above
>>>>>>>>>> snippet.
>>>>>>>>>>
>>>>>>>>>> @Milos - Can you please provide us the volume info file from
>>>>>>>>>> /var/lib/glusterd/vols/<volname>/ from all the three nodes to
>>>>>>>>>> continue the analysis?
>>>>>>>>>>
>>>>>>>>>>> -Ravi
>>>>>>>>>>>
>>>>>>>>>>>> @Pranith, Ravi - Milos was trying to convert a dist (1 X 1)
>>>>>>>>>>>> volume to a replicate (1 X 2) using add-brick and hit this issue
>>>>>>>>>>>> where add-brick failed. The cluster is operating with 3.7.6.
>>>>>>>>>>>> Could you help on what scenario this code path can be hit? One
>>>>>>>>>>>> straightforward issue I see here is the missing err_str in this
>>>>>>>>>>>> path.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> ~ Atin (atinm)

--
Pranith
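For completeness, the volume info files Atin asks for could be gathered and compared along the lines of the sketch below. The host list is a placeholder, since only "storage" and "storage2" are named in this thread, and the volume name is taken from the status output above.

    # Sketch: collect the info file from each node for comparison.
    VOLNAME=storage
    for host in storage storage2; do
        ssh "$host" cat "/var/lib/glusterd/vols/$VOLNAME/info" > "info-$host.txt"
    done

    # volinfo->type is persisted in that file as the "type=" line, so
    # comparing it across nodes shows whether one glusterd has a bad value.
    grep -H '^type=' info-*.txt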
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
