13.07.2016 10:24, Pranith Kumar Karampuri wrote:


On Wed, Jul 13, 2016 at 11:49 AM, Dmitry Melekhov <[email protected]> wrote:

    13.07.2016 10:10, Pranith Kumar Karampuri wrote:


    On Wed, Jul 13, 2016 at 11:27 AM, Dmitry Melekhov <[email protected]> wrote:

        13.07.2016 09:50, Pranith Kumar Karampuri wrote:


        On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <[email protected]> wrote:

            13.07.2016 09:36, Pranith Kumar Karampuri wrote:


            On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <[email protected]> wrote:

                13.07.2016 09:26, Pranith Kumar Karampuri wrote:


                On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <[email protected]> wrote:

                    13.07.2016 09:16, Pranith Kumar Karampuri wrote:


                    On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <[email protected]> wrote:

                        13.07.2016 09:04, Pranith Kumar Karampuri wrote:


                        On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <[email protected]> wrote:

                            13.07.2016 08:56, Pranith Kumar Karampuri wrote:


                            On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <[email protected]> wrote:

                                13.07.2016 08:46, Pranith Kumar Karampuri wrote:


                                On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <[email protected]> wrote:

                                    13.07.2016 08:36, Pranith Kumar Karampuri wrote:


                                    On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <[email protected]> wrote:

                                        13.07.2016 01:52, Anuradha Talur wrote:


                                            ----- Original Message -----

                                                From: "Dmitry Melekhov" <[email protected]>
                                                To: "Pranith Kumar Karampuri" <[email protected]>
                                                Cc: "gluster-users" <[email protected]>
                                                Sent: Tuesday, July 12, 2016 9:27:17 PM
                                                Subject: Re: [Gluster-users] 3.7.13, index healing broken?



                                                12.07.2016 17:39, Pranith Kumar Karampuri wrote:



                                                Wow, what are
                                                the steps to
                                                recreate the
                                                problem?

                                                just set file
                                                length to
                                                zero, always
                                                reproducible.

                                            If you are
                                            setting the file
                                            length to 0 on
                                            one of the bricks
                                            (looks like
                                            that is the
                                            case), it is not
                                            a bug.

                                            Index heal relies
                                            on failures seen
                                            from the mount
                                            point(s)
                                            to identify the
                                            files that need
                                            heal. It won't be
                                            able to recognize
                                            any file
                                            modification done
                                            directly on
                                            bricks. Same goes
                                            for heal info
                                            command which
                                            is the reason
                                            heal info also
                                            shows 0 entries.


                                        Well, this makes self-heal
                                        useless then: if any file is
                                        accidentally corrupted or
                                        deleted (yes, if a file is
                                        deleted directly from a brick,
                                        this is not recognized by
                                        index heal either), it will
                                        not be self-healed, because
                                        self-heal uses index heal.


                                    It is better to look into
                                    bit-rot feature if you
                                    want to guard against
                                    these kinds of problems.

                                    Bit rot detects bit-level
                                    problems, not missing files or
                                    wrong lengths, i.e. it is too
                                    much overhead for such a simple
                                    task.


                                It detects a wrong length, because
                                the checksum won't match anymore.

                                Yes, sure. I guess that it will
                                detect missing files too. But
                                doesn't it need far more resources
                                than just comparing directories in
                                bricks?

                                What use case are you trying out
                                that leads to changing things
                                directly on the brick?

                                I'm trying to test gluster
                                failure tolerance and right now
                                I'm not happy with it...


                            Which cases of fault tolerance are
                            you not happy with? Making changes
                            directly on the brick or anything
                            else as well?

                            I'll repeat:
                            As I already said, if for some reason
                            (in a real case it can only be by
                            accident) I delete a file, this will
                            not be detected by the self-heal
                            daemon and thus will lead to a lower
                            replication level, i.e. lower
                            failure tolerance.


                        To prevent such accidents you need to
                        set selinux policies so that files under
                        the brick are not modified by accident
                        by any user. At least that is the
                        solution I remember when this was
                        discussed 3-4 years back.

                        So the only supported platform is Linux? Or
                        maybe it is better to improve self-healing
                        to detect missing or wrong-length files; I
                        guess this is a very low-cost operation in
                        terms of host resources.
                        Just a suggestion. Maybe we need to look at
                        alternatives in the near future....

                    This is a corner case, and from a design
                    perspective it is generally not a good idea
                    to optimize for the corner case. It is better
                    to protect ourselves from the corner case
                    (SELinux etc.), or you can also use snapshots
                    to protect against these kinds of mishaps.

                    Sorry, I don't agree.
                    As you know, when a missing or wrong-length file
                    is accessed from a fuse client it is restored
                    (healed), i.e. gluster recognizes that the file is
                    wrong and heals it, so I do not see any reason
                    not to provide such a function in self-healing.
                    Thank you!

                Ah! Now how do you suggest we keep track of which
                of 10s of millions of files the user accidentally
                deleted from the brick without gluster's
                knowledge? Once it comes to gluster's knowledge we
                can do something. But how does gluster become
                aware of something it is not keeping track of? At
                the time you access it gluster knows something
                went wrong so it restores it. If you change
                something on the bricks even by accident all the
                data gluster keeps (similar to journal) is a
                waste. Even disk filesystems will ask you to
                run fsck if something unexpected happens, so a full
                self-heal is a similar operation.

                You are absolutely right. The question is why gluster
                does not become aware of such a problem in the case of
                self-healing?


            Because the operations that are performed directly on
            brick do not go through gluster stack.

            OK, I'll repeat:
            As you know, when a missing or wrong-length file is
            accessed from a fuse client it is restored (healed), i.e.
            gluster recognizes that the file is wrong and heals it, so
            I do not see any reason not to provide such a function in
            self-healing.


        For which you need to access the file.

        That's right.

        For which you need a full crawl. You can't detect a
        modification that doesn't go through the stack, so this is
        the only possibility.

        OK, then, if self-heal is really useless and no possible way
        to fix it will be provided, I guess we'll use an external
        script to check the consistency of the brick directories.
        I don't think ls and diff will take many resources.
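The external consistency check proposed above could look roughly like the following. This is only a sketch under several assumptions not stated in the thread: the volume is a plain replica (every brick is expected to hold every file), the `.glusterfs` directory at the brick root is internal metadata that should be skipped, and the listings are produced on each server and brought together for comparison. The function names `brick_listing` and `compare_listings` are hypothetical. Files being written at the moment of the scan may legitimately differ in size, so any hit is a candidate to inspect, not proof of damage.

```python
import os


def brick_listing(brick_root):
    """Map each regular file's path (relative to brick_root) to its size."""
    listing = {}
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Assumption: skip gluster's internal metadata directory at the brick root.
        if dirpath == brick_root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Only regular files are compared; symlinks and special files are ignored.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            listing[os.path.relpath(full, brick_root)] = os.path.getsize(full)
    return listing


def compare_listings(a, b):
    """Return (only_in_a, only_in_b, size_mismatch) as sorted lists of paths."""
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_in_a, only_in_b, size_mismatch
```

Comparing names and sizes only is about as cheap as the "ls and diff" idea; it would catch the deleted and zero-length cases discussed here, but not content corruption that preserves the length.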


    How is this different from full self-heal?

    Self-heal does not detect deleted or wrong-length files.


It detects them when you do a full crawl, which essentially is an ls -laR kind of thing on the whole volume. You don't need any external scripts; maybe just keep doing a full crawl once in a while?
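As a rough illustration of running a full crawl once in a while, the sketch below assumes that the crawl meant here is the one requested with the CLI command `gluster volume heal <VOLNAME> full`; the thread itself does not name the command. The volume name `pool` is taken from the mount command later in this message, and the helper names are hypothetical.

```python
import subprocess


def full_heal_command(volume):
    """Build the argv that asks gluster for a full self-heal crawl of a volume."""
    return ["gluster", "volume", "heal", volume, "full"]


def start_full_heal(volume):
    """Run the command and return its output.

    Raises subprocess.CalledProcessError if the gluster CLI reports a failure.
    """
    result = subprocess.run(full_heal_command(volume), check=True,
                            capture_output=True, text=True)
    return result.stdout

# Example call on a server that is part of the cluster:
#   start_full_heal("pool")
```

Something like this could be scheduled periodically, at the cost of a crawl over the whole volume each time.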

You mean on fuse mount?

It doesn't work:

[root@father ~]# mount -t glusterfs localhost:/pool gluster

[root@father ~]#

Then I make the file zero-length in the brick:

[root@father gluster]# > /wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm
[root@father gluster]#


[root@father gluster]# ls -laR  /root/gluster/
/root/gluster/:
total 122153384
drwxr-xr-x   4 qemu qemu        4096 Jul 11 13:36 .
dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
-rw-r--r--   1 root root  8589934592 May 11 09:14 csr1000v1.img
-rw-r--r--   1 root root           0 Jul 13 10:34 gstatus-0.64-3.el7.x86_64.rpm


As you can see gstatus-0.64-3.el7.x86_64.rpm has 0 length
But:

[root@father gluster]# touch /root/gluster/gstatus-0.64-3.el7.x86_64.rpm
[root@father gluster]# ls -laR  /root/gluster/
/root/gluster/:
total 122153436
drwxr-xr-x   4 qemu qemu        4096 Jul 11 13:36 .
dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
-rw-r--r--   1 root root  8589934592 May 11 09:14 csr1000v1.img
-rw-r--r--   1 root root       52268 Jul 13 10:36 gstatus-0.64-3.el7.x86_64.rpm


I.e. if I do some I/O on the file, then it is back.


By the way, the same problem occurs if I delete the file directly in the brick:

[root@father gluster]# rm /wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm
rm: remove regular file ‘/wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm’? y
[root@father gluster]# ls -laR  /root/gluster/
/root/gluster/:
total 122153384
drwxr-xr-x   4 qemu qemu        4096 Jul 13 10:38 .
dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
-rw-r--r--   1 root root  8589934592 May 11 09:14 csr1000v1.img
-rw-r--r--   1 qemu qemu 43692064768 Jul 13 10:38 infimonitor.img


I don't see it in the directory on the fuse mount at all until I touch it, which restores the file too.


If you need any performance improvements here, we will be happy to help. Please give us feedback.

Your recipe doesn't work :-( If there is a difference between the bricks' directories due to direct brick manipulation, it leads to problems.


All I was saying is that it is not possible to detect them through index heal, because for the index to be populated the operations need to go through the gluster stack.

    Why can't it? I don't know, you just said it is impossible in
    gluster because it can only track changes made through
    gluster, i.e. bricks can have different sets of files and this is
    not recognized (true) because, as I understand it, gluster's
    self-heal assumes that a brick's underlying filesystem can't be
    corrupted by the server admin (not true; I can say this as an
    engineer with almost 25 years of experience, i.e. I did this
    several times ;-) ).




        Thank you!

        p.s.
        still can't understand why it can't be implemented in
        gluster... :-(





-- Pranith





_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
