13.07.2016 09:50, Pranith Kumar Karampuri wrote:


On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <[email protected] <mailto:[email protected]>> wrote:

    13.07.2016 09:36, Pranith Kumar Karampuri wrote:


    On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <[email protected]
    <mailto:[email protected]>> wrote:

        13.07.2016 09:26, Pranith Kumar Karampuri wrote:


        On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov
        <[email protected] <mailto:[email protected]>> wrote:

            13.07.2016 09:16, Pranith Kumar Karampuri wrote:


            On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov
            <[email protected] <mailto:[email protected]>> wrote:

                13.07.2016 09:04, Pranith Kumar Karampuri wrote:


                On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov
                <[email protected] <mailto:[email protected]>> wrote:

                    13.07.2016 08:56, Pranith Kumar Karampuri wrote:


                    On Wed, Jul 13, 2016 at 10:23 AM, Dmitry
                    Melekhov <[email protected]
                    <mailto:[email protected]>> wrote:

                        13.07.2016 08:46, Pranith Kumar Karampuri
                        wrote:


                        On Wed, Jul 13, 2016 at 10:10 AM, Dmitry
                        Melekhov <[email protected]
                        <mailto:[email protected]>> wrote:

                            13.07.2016 08:36, Pranith Kumar
                            Karampuri wrote:


                            On Wed, Jul 13, 2016 at 9:35 AM,
                            Dmitry Melekhov <[email protected]
                            <mailto:[email protected]>> wrote:

                                13.07.2016 01:52, Anuradha
                                Talur wrote:


                                    ----- Original Message -----

                                        From: "Dmitry Melekhov"
                                        <[email protected]
                                        <mailto:[email protected]>>
                                        To: "Pranith Kumar
                                        Karampuri"
                                        <[email protected]
                                        <mailto:[email protected]>>
                                        Cc: "gluster-users"
                                        <[email protected]
                                        <mailto:[email protected]>>
                                        Sent: Tuesday, July 12,
                                        2016 9:27:17 PM
                                        Subject: Re:
                                        [Gluster-users] 3.7.13,
                                        index healing broken?



                                        12.07.2016 17:39,
                                        Pranith Kumar Karampuri
                                        wrote:



                                        Wow, what are the steps
                                        to recreate the problem?

                                        Just set the file length to
                                        zero; it is always reproducible.

                                    If you are setting the file
                                    length to 0 on one of the
                                    bricks (looks like
                                    that is the case), it is
                                    not a bug.

                                    Index heal relies on
                                    failures seen from the
                                    mount point(s)
                                    to identify the files that
                                    need heal. It won't be able
                                    to recognize any file
                                    modification done directly
                                    on bricks. The same goes for
                                    the heal info command, which
                                    is why heal info also
                                    shows 0 entries.


                                Well, this makes self-heal
                                useless then: if any file is
                                accidentally corrupted or deleted
                                (yes! if a file is deleted
                                directly from a brick, this is not
                                recognized by index heal either),
                                then it will not be
                                self-healed, because self-heal
                                uses index heal.


                            It is better to look into bit-rot
                            feature if you want to guard
                            against these kinds of problems.

                            Bit rot detection finds bit-level
                            problems, not missing files or wrong
                            lengths, i.e. it is too much overhead
                            for such a simple task.


                        It detects wrong length. Because
                        checksum won't match anymore.

                        Yes, sure. I guess that it will detect
                        missing files too. But doesn't it need far
                        more resources than just comparing
                        directories in bricks?

                        Which use case that you are trying out
                        leads to changing things directly on
                        the brick?

                        I'm trying to test gluster failure
                        tolerance and right now I'm not happy
                        with it...


                    Which cases of fault tolerance are you not
                    happy with? Making changes directly on the
                    brick or anything else as well?

                    I'll repeat:
                    As I already said, if for some reason (in a real
                    case it can only be by accident) I delete a
                    file, this will not be detected by the self-heal
                    daemon and will thus lead to a lower
                    replication level, i.e. lower failure tolerance.


                To prevent such accidents you need to set SELinux
                policies so that files under the brick are not
                modified by accident by any user. At least that is
                the solution I remember when this was discussed
                3-4 years back.

                So the only supported platform is Linux? Or maybe it
                is better to improve self-healing to detect missing
                or wrong-length files; I guess this is a very low-cost
                operation in terms of host resources.
                Just a suggestion; maybe we need to look at
                alternatives in the near future....

            This is a corner case; from a design perspective it is
            generally not a good idea to optimize for the corner
            case. It is better to protect ourselves from the corner
            case (SELinux etc.), or you can also use snapshots to
            protect against these kinds of mishaps.

            Sorry, I don't agree.
            As you know, when a missing or wrong-length file is
            accessed from a FUSE client, it is restored (healed), i.e.
            gluster recognizes that the file is wrong and heals it, so
            I do not see any reason not to provide the same function
            in self-healing.
            Thank you!

        Ah! Now how do you suggest we keep track of which of 10s of
        millions of files the user accidentally deleted from the
        brick without gluster's knowledge? Once it comes to
        gluster's knowledge we can do something. But how does
        gluster become aware of something it is not keeping track
        of? At the time you access it gluster knows something went
        wrong so it restores it. If you change something on the
        bricks, even by accident, all the data gluster keeps (similar
        to a journal) is wasted. Even disk filesystems will ask
        you to run fsck if something unexpected happens, so a full
        self-heal is a similar operation.

        You are absolutely right. The question is why gluster does
        not become aware of such a problem in the case of self-healing?


    Because the operations that are performed directly on the brick
    do not go through the gluster stack.

    OK, I'll repeat:
    As you know, when a missing or wrong-length file is accessed from
    a FUSE client, it is restored (healed), i.e. gluster recognizes
    that the file is wrong and heals it, so I do not see any reason
    not to provide the same function in self-healing.


For that you need to access the file.
That's right.
For that you need a full crawl. You can't detect a modification that doesn't go through the stack, so this is the only possibility.

OK, then, if self-heal is really useless here and no way to get this will be provided, I guess we'll use an external script to check the consistency of the brick directories;
I don't think ls and diff will take many resources.
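
A minimal sketch of such an external consistency check, written in Python rather than with ls and diff. It assumes a plain replicated volume (every brick should hold the same files), that both brick directories are readable from the machine running the script, and that no writes are in flight. The function names and brick paths are hypothetical. It compares only file names and sizes and skips the .glusterfs metadata directory at the brick root.

```python
import os


def list_brick(root, skip=(".glusterfs",)):
    """Map each file path (relative to the brick root) to its size in bytes."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune gluster's internal metadata directory at the brick root only.
        if dirpath == root:
            dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat: do not follow symlinks, report the entry itself.
            entries[os.path.relpath(full, root)] = os.lstat(full).st_size
    return entries


def compare_bricks(root_a, root_b):
    """Return (missing_in_a, missing_in_b, size_mismatch) as sorted lists."""
    a, b = list_brick(root_a), list_brick(root_b)
    missing_in_a = sorted(set(b) - set(a))
    missing_in_b = sorted(set(a) - set(b))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_in_a, missing_in_b, size_mismatch


if __name__ == "__main__":
    # Hypothetical brick paths; replace with the real brick directories.
    missing_a, missing_b, mismatched = compare_bricks("/bricks/b1", "/bricks/b2")
    for label, paths in (("missing in first brick", missing_a),
                         ("missing in second brick", missing_b),
                         ("size differs", mismatched)):
        for path in paths:
            print(f"{label}: {path}")
```

The script only reports differences. As discussed in this thread, accessing the reported files from a client mount, or running a full heal, is what would actually restore them.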

Thank you!

p.s.
still can't understand why it can't be implemented in gluster... :-(





-- Pranith

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
