13.07.2016 10:10, Pranith Kumar Karampuri wrote:

On Wed, Jul 13, 2016 at 11:27 AM, Dmitry Melekhov <[email protected]> wrote:

    13.07.2016 09:50, Pranith Kumar Karampuri wrote:

    On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <[email protected]> wrote:

        13.07.2016 09:36, Pranith Kumar Karampuri wrote:

        On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <[email protected]> wrote:

            13.07.2016 09:26, Pranith Kumar Karampuri wrote:

            On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <[email protected]> wrote:

                13.07.2016 09:16, Pranith Kumar Karampuri wrote:

                On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <[email protected]> wrote:

                    13.07.2016 09:04, Pranith Kumar Karampuri wrote:

                    On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <[email protected]> wrote:

                        13.07.2016 08:56, Pranith Kumar Karampuri wrote:

                        On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <[email protected]> wrote:

                            13.07.2016 08:46, Pranith Kumar Karampuri wrote:

                            On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <[email protected]> wrote:

                                13.07.2016 08:36, Pranith Kumar Karampuri wrote:

                                On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <[email protected]> wrote:

                                    13.07.2016 01:52, Anuradha Talur wrote:


                                        ----- Original Message -----

                                            From: "Dmitry Melekhov" <[email protected]>
                                            To: "Pranith Kumar Karampuri" <[email protected]>
                                            Cc: "gluster-users" <[email protected]>
                                            Sent: Tuesday, July 12, 2016 9:27:17 PM
                                            Subject: Re: [Gluster-users] 3.7.13, index healing broken?

                                            12.07.2016 17:39, Pranith Kumar Karampuri wrote:



                                            Wow, what are the
                                            steps to recreate
                                            the problem?

                                            just set file
                                            length to zero,
                                            always reproducible.

                                        If you are setting the
                                        file length to 0 on
                                        one of the bricks
                                        (looks like
                                        that is the case), it
                                        is not a bug.

                                        Index heal relies on
                                        failures seen from the
                                        mount point(s)
                                        to identify the files
                                        that need heal. It
                                        won't be able to
                                        recognize any file
                                        modification done
                                        directly on bricks.
                                        Same goes for heal
                                        info command which
                                        is the reason heal
                                        info also shows 0 entries.
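
The point above can be illustrated with a toy model. This is a minimal sketch, not GlusterFS code: the class `ToyReplica` and all of its methods are hypothetical names invented here, and the "index" is just a Python dict. It only shows why an index populated by failed client writes never learns about a change made directly on a brick.

```python
class ToyReplica:
    """Toy two-brick replica. Hypothetical illustration, not GlusterFS code."""

    def __init__(self):
        self.bricks = [{}, {}]      # each brick maps path -> file contents
        self.up = [True, True]      # whether each brick is reachable
        self.index = {}             # path -> set of brick ids that missed a write

    def client_write(self, path, data):
        """Write through the 'mount'; a write missed by a down brick is indexed."""
        if not any(self.up):
            raise IOError("no brick is reachable")
        for i, brick in enumerate(self.bricks):
            if self.up[i]:
                brick[path] = data
            else:
                self.index.setdefault(path, set()).add(i)

    def heal_info(self):
        """List the paths the index knows about (the toy 'heal info')."""
        return sorted(self.index)

    def index_heal(self):
        """Heal only indexed paths, copying from a brick that saw the write."""
        for path, stale in list(self.index.items()):
            good = next(i for i in range(len(self.bricks)) if i not in stale)
            for i in list(stale):
                if self.up[i]:
                    self.bricks[i][path] = self.bricks[good][path]
                    stale.discard(i)
            if not stale:
                del self.index[path]


replica = ToyReplica()
replica.client_write("a", "hello")
replica.bricks[1]["a"] = ""         # truncate directly on the brick, bypassing the client
print(replica.heal_info())          # the index is empty: nothing recorded the change
```

In this model only `client_write` ever touches the index, so the direct truncation is invisible to both `heal_info` and `index_heal`.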


                                    Well, this makes self-heal
                                    useless then: if any file
                                    is accidentally corrupted or
                                    deleted (yes! if a file is
                                    deleted directly from the
                                    brick, this is not
                                    recognized by index heal
                                    either), then it will not be
                                    self-healed, because
                                    self-heal uses index heal.


                                It is better to look into
                                bit-rot feature if you want to
                                guard against these kinds of
                                problems.

                                Bit rot detects bit problems,
                                not missing files or their
                                wrong length, i.e. it is
                                overhead for such a simple task.


                            It detects wrong length, because the
                            checksum won't match anymore.

                            Yes, sure. I guess that it will
                            detect missing files too. But doesn't it
                            need far more resources than just
                            comparing directories in bricks?

                            What use case are you trying out that
                            leads to changing things directly
                            on the brick?

                            I'm trying to test gluster failure
                            tolerance and right now I'm not
                            happy with it...


                        Which cases of fault tolerance are you
                        not happy with? Making changes directly
                        on the brick or anything else as well?

                        I'll repeat:
                        As I already said, if I for some reason
                        (in a real case, only by accident)
                        delete a file, this will not be detected by
                        the self-heal daemon and will thus lead to
                        a lower replication level, i.e. lower
                        failure tolerance.


                    To prevent such accidents you need to set
                    SELinux policies so that files under the
                    brick are not modified by accident by any
                    user. At least that is the solution I
                    remember from when this was discussed 3-4 years back.

                    So the only supported platform is Linux? Or
                    maybe it is better to improve self-healing to
                    detect missing or wrong-length files; I guess
                    this is a very low-cost operation in terms of
                    host resources.
                    Just a suggestion. Maybe we need to look at
                    alternatives in the near future...

                This is a corner case. From a design perspective it
                is generally not a good idea to optimize for the
                corner case. It is better to protect ourselves
                from the corner case (SELinux etc.), or you can also
                use snapshots to protect against these kinds of
                mishaps.

                Sorry, I don't agree.
                As you know, when a missing or wrong-length file is
                accessed from a fuse client it is restored (healed), i.e.
                gluster recognizes that the file is wrong and heals it, so I
                do not see any reason not to provide the same function
                in self-healing.
                Thank you!

            Ah! Now how do you suggest we keep track of which of
            the tens of millions of files the user accidentally deleted
            from the brick without gluster's knowledge? Once it
            comes to gluster's knowledge we can do something. But
            how does gluster become aware of something it is not
            keeping track of? At the time you access it, gluster
            knows something went wrong, so it restores it. If you
            change something on the bricks, even by accident, all the
            data gluster keeps (similar to a journal) is wasted.
            Even disk filesystems will ask you to run fsck if
            something unexpected happens, so a full self-heal is
            a similar operation.

            You are absolutely right. The question is why gluster does
            not become aware of such a problem in the case of self-healing?


        Because operations that are performed directly on the brick
        do not go through the gluster stack.

        OK, I'll repeat:
        As you know, when a missing or wrong-length file is accessed
        from a fuse client it is restored (healed), i.e. gluster recognizes
        that the file is wrong and heals it, so I do not see any reason
        not to provide the same function in self-healing.


    For which you need to access the file.

    That's right.

    For which you need a full crawl. You can't detect a modification
    that doesn't go through the stack, so this is the only possibility.

    OK, then, if self-heal is really useless here and no way to fix
    this will be provided, I guess we'll use an external script to
    check the consistency of the brick directories;
    I don't think ls and diff will take many resources.
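
The external check mentioned above (the "ls and diff" idea) could look roughly like the following. This is a minimal sketch under stated assumptions: the function names `snapshot` and `compare` are hypothetical, both brick directories are assumed to be readable from the machine running the script, and only relative paths and file sizes are compared, not contents. The top-level `.glusterfs` directory, which holds gluster's internal metadata on a brick, is skipped.

```python
import os


def snapshot(brick_root, skip=(".glusterfs",)):
    """Map each file's path (relative to brick_root) to its size in bytes."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if dirpath == brick_root:
            # Prune internal metadata directories at the top of the brick only.
            dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat so that symlinks are not followed.
            sizes[os.path.relpath(full, brick_root)] = os.lstat(full).st_size
    return sizes


def compare(snap_a, snap_b):
    """Return (paths only in a, paths only in b, paths whose sizes differ)."""
    only_a = sorted(set(snap_a) - set(snap_b))
    only_b = sorted(set(snap_b) - set(snap_a))
    size_diff = sorted(
        p for p in set(snap_a) & set(snap_b) if snap_a[p] != snap_b[p]
    )
    return only_a, only_b, size_diff
```

Since replica bricks normally live on different servers, `snapshot` would in practice be run on each server and the resulting mappings compared in one place. The script only reads the bricks; differences can also show up for files that are being written or healed at the moment of the scan, so a reported mismatch is a hint to investigate rather than proof of corruption.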


How is this different from full self-heal?

Self-heal does not detect deleted or wrong-length files.
Why can't it? I don't know; you just said it is impossible in gluster because it can only track changes made through gluster, i.e. bricks can have different file sets and this is not recognized (true) because, as I understand it, gluster's self-heal assumes that a brick's underlying filesystem can't be corrupted by the server admin (not true; I can say this as an engineer with almost 25 years of experience, i.e. I have done this several times ;-) ).




    Thank you!

    P.S.
    I still can't understand why it can't be implemented in gluster... :-(





-- Pranith





