Hi Thomas,

Thanks again for your replies and patience :)

We also have offline backups of the files.

So, just to verify that I understood this correctly: deleting a gfid file under .glusterfs does not inherently carry the risk of losing the complete brick, right?

I saw that you already applied this yourself and it worked for you, but I'd like to confirm. It is of course fully understood that the operational risk is on our side.

It is just an "information-wise" question :)

Best regards
Ilias

On 17.08.22 at 12:47, Thomas Bätzler wrote:

Hello Ilias,

Please note that you can and should back up all of the file(s) involved in the split-brain by accessing them via the brick root instead of the gluster mount. That is also the reason why you're not in danger of a failure cascade wiping out your data.

Be careful when replacing bricks, though. You want that heal to go in the right direction 😉

Best regards,

i.A. Thomas Bätzler

--

BRINGE Informationstechnik GmbH

Zur Seeplatte 12

D-76228 Karlsruhe

Germany

Fon: +49 721 94246-0

Fon: +49 171 5438457

Fax: +49 721 94246-66

Web: http://www.bringe.de/

Geschäftsführer: Dipl.-Ing. (FH) Martin Bringe

Ust.Id: DE812936645, HRB 108943 Mannheim

*From:* Gluster-users <gluster-users-boun...@gluster.org> *On Behalf Of* Ilias Chasapakis forumZFD
*Sent:* Wednesday, 17 August 2022 11:18
*To:* gluster-users@gluster.org
*Subject:* Re: [Gluster-users] Directory in split brain does not heal - Gfs 9.2

Thanks for the suggestions. My question is whether the risk is limited to losing the affected file/dir, or whether it could create inconsistencies that span the bricks and "break everything". Of course we have to take action anyway so that this does not spread (we already have a second entry that developed an "unhealable" directory split-brain), so it is just a matter of evaluating the risk before acting.

On 12.08.22 at 18:12, Thomas Bätzler wrote:

    On 12.08.2022 at 17:12, Ilias Chasapakis forumZFD wrote:

        Dear fellow gluster users,

        we are facing a problem with our replica 3 setup. Glusterfs
        version is 9.2.

        We have a problem with a directory that is in split-brain and
        we cannot manage to heal with:

            gluster volume heal gfsVol split-brain latest-mtime /folder

        The command throws the following error: "failed: Transport
        endpoint is not connected."

        So the split-brain directory entry remains, the whole healing
        process never completes, and other entries get stuck.

        I saw there is a python script available:
        https://github.com/joejulian/glusterfs-splitbrain
        Would that be a good solution to try? To be honest, we are a
        bit concerned about deleting the gfid and the files from the
        brick manually, as it seems this can create inconsistencies
        and break things... I can of course give you more information
        about our setup and situation, but if you already have a tip,
        that would be fantastic.

    You could at least verify what's going on: go to your brick roots
    and list /folder on each. You have 3n bricks forming n replica
    sets. Find the replica set where you can spot a difference; it's
    most likely a file or directory that's missing or different. If
    it's a file, do an "ls -ain" on the file on each brick in the
    replica set. It'll report an inode number. Then do a "find
    .glusterfs -inum <inode>" from the brick root. You'll likely see
    that you have different gfid files.
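For background, the gfid file under .glusterfs is a hard link to the data file, so both names share one inode; that is why "find -inum" locates the matching gfid file. A throwaway sketch (all paths and the gfid name below are made up, not real gluster output) illustrates the mechanism:

```shell
# Sketch only: mimic a brick's layout in a temp dir (all names made up).
tmp=$(mktemp -d)
mkdir -p "$tmp/.glusterfs/ab/cd"
echo "payload" > "$tmp/somefile"
# GlusterFS keeps the gfid file as a hard link to the data file,
# so both directory entries share a single inode:
ln "$tmp/somefile" "$tmp/.glusterfs/ab/cd/abcd1234-fake-gfid"
# ls -ain prints the inode number in the first column:
inum=$(ls -ain "$tmp/somefile" | awk '{print $1}')
# find -inum maps that inode number back to the gfid path:
match=$(cd "$tmp" && find .glusterfs -inum "$inum")
echo "$match"
rm -rf "$tmp"
```

On a real brick you would run the ls/find pair against the actual file instead of this mock layout.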

    To fix the problem, you have to help gluster along by cleaning up
    the mess. This is completely "do it at your own risk, it worked
    for me, ymmv": cp (not mv!) a copy of the file you want to keep.
    On each brick in the replica set, delete the gfid file and the
    data file. Try a heal on the volume and verify that you can
    access the path in question using the glusterfs mount. Then copy
    back your salvaged file using the glusterfs mount.
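As a shell sketch, the cleanup steps above might look like the following; the brick path, mount point, volume name and filenames are placeholder assumptions for illustration, not commands to paste verbatim, and step 2 must be repeated on every brick of the affected replica set:

```shell
# All paths and names below are hypothetical placeholders.
BRICK=/data/glusterfs/brick1
FILE=folder/broken-file

# 1. Save a copy of the version you want to keep (cp, not mv!):
cp "$BRICK/$FILE" /root/salvaged-file

# 2. On each brick of the replica set: delete the matching gfid
#    hard link under .glusterfs, then the data file itself.
inum=$(ls -ain "$BRICK/$FILE" | awk '{print $1}')
(cd "$BRICK" && find .glusterfs -inum "$inum" -delete)
rm "$BRICK/$FILE"

# 3. Trigger a heal and check the path over the glusterfs mount:
gluster volume heal gfsVol
ls /mnt/gfsVol/folder

# 4. Copy the salvaged file back in over the glusterfs mount:
cp /root/salvaged-file "/mnt/gfsVol/$FILE"
```

This is an operational runbook fragment that needs a live cluster, so treat it as a map of the procedure rather than a tested script.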

    We had this happen quite often on a heavily loaded glusterfs
    shared filesystem that held a mail spool. There would be parallel
    accesses trying to mv files, and sometimes we'd end up with
    mismatched data on the bricks of the replica set. I reported this
    on GitHub, but apparently it wasn't seen as a serious problem.
    We've moved on to CephFS now. That surely has bugs too, but
    hopefully none as aggravating.

    Best regards,

    i.A. Thomas Bätzler




    ________

    Community Meeting Calendar:

    Schedule -

    Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC

    Bridge:https://meet.google.com/cpu-eiue-hvk

    Gluster-users mailing list

    Gluster-users@gluster.org

    https://lists.gluster.org/mailman/listinfo/gluster-users

--
forumZFD
Entschieden für Frieden | Committed to Peace
Ilias Chasapakis
Referent IT | IT Consultant
Forum Ziviler Friedensdienst e.V. | Forum Civil Peace Service
Am Kölner Brett 8 | 50825 Köln | Germany
Tel 0221 91273243 | Fax 0221 91273299 | http://www.forumZFD.de
Vorstand nach § 26 BGB, einzelvertretungsberechtigt | Executive Board:
Oliver Knabe (Vorsitz | Chair), Jens von Bargen, Alexander Mauz
VR 17651 Amtsgericht Köln
Spenden | Donations: IBAN DE37 3702 0500 0008 2401 01 BIC BFSWDE33XXX


