Re: [Gluster-users] Bitrot strange behavior

2018-04-18 Thread FNU Raghavendra Manjunath
Hi Cedric,

The 120 seconds is given to provide a window for things to settle, i.e.
imagine the following situation:

1) open file (fd1 as file descriptor)
2) modify the file via fd1
3) close the file descriptor (fd1)
4) open the file again (fd2)
5) modify the file via fd2

In the above sequence of operations, by the time the bitrot daemon tries to
calculate the signature after the 1st fd (fd1) is closed, active IO could be
happening again on the new file descriptor (fd2), and a signature calculated
while active IO is in progress might not be correct. So in Gluster the bitrot
daemon waits for 120 seconds to sign the file after all the file descriptors
associated with that file are closed.

So with the 120-second wait, what happens is this: once all the file
descriptors associated with a file are closed (by the application), the brick
receives an operation called "release", and the brick process then sends a
notification to the bitrot daemon that an object (to be precise, a file,
along with details about it) has been modified. The bitrot daemon waits for
120 seconds after receiving that notification. If, before the file is signed
(i.e. within the 120 seconds of wait time), someone opens and modifies it
again, the brick process lets the bitrot daemon know about it, so the bitrot
daemon won't attempt to sign the file (as it is actively being modified).
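
To make the timing concrete, here is a minimal illustration (only a sketch:
the client mount point /mnt/vol1 and the file name are assumptions, the brick
path is taken from the volume info further down in this thread):

# write through the client mount and let the fd get closed
echo "hello" > /mnt/vol1/file1

# right after the release, the brick copy carries no signature yet
getfattr -d -m . -e hex /data/brick1/file1

# roughly 120 seconds after the last close, bitd signs the object and
# trusted.bit-rot.signature shows up in the xattr listing
sleep 130
getfattr -d -m . -e hex /data/brick1/file1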

The above value is configurable and can be changed to a different value
using the following command:

"gluster volume set <volname> features.expiry-time <time-in-seconds>"

But as you said, the comparison done by the scrubber is currently local, i.e.
while scrubbing it calculates the checksum of the file and compares it with
the stored checksum (kept as an extended attribute) to determine whether the
object is corrupted or not. So yes, if the object is corrupted before the
signing happens, then as of now the scrubber does not have a mechanism to
detect that.
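
If you want to exercise the scrubber without waiting for the hourly schedule,
something along these lines should work (a sketch; "scrub ondemand" is present
in the 3.12 series, but please double check it is available on your build):

# start a scrub run immediately
gluster volume bitrot vol1 scrub ondemand

# per-node summary, including the number of scrubbed and corrupted objects
gluster volume bitrot vol1 scrub status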

Regards,
Raghavendra


On Wed, Apr 18, 2018 at 2:20 PM, Cedric Lemarchand wrote:

> Hi Sweta,
>
> Thanks, this raises some more questions for me:
>
> 1. What is the reason for delaying signature creation?
>
> 2. As the same file (replicated or dispersed) having different signatures
> across bricks is by definition an error, it would be good to detect it
> during a scrub, or with a different tool. Is something like this planned?
>
> Cheers
>
> —
> Cédric Lemarchand
>
> On 18 Apr 2018, at 07:53, Sweta Anandpara  wrote:
>
> Hi Cedric,
>
> Any file is picked up for signing by the bitd process after the
> predetermined wait of 120 seconds. This default value is captured in the
> volume option 'features.expiry-time' and is configurable - in your case, it
> can be set to 0 or 1.
>
> Point 2 is correct. A file corrupted before the bitrot signature is
> generated will not be successfully detected by the scrubber. That would
> require admin/manual intervention to explicitly heal the corrupted file.
>
> -Sweta
>
> On 04/16/2018 10:42 PM, Cedric Lemarchand wrote:
>
> Hello,
>
> I am playing around with the bitrot feature and have some questions:
>
> 1. when a file is created, the "trusted.bit-rot.signature" attribute
> seems to be created only approximately 120 seconds after its creation
> (the cluster is idle and there is only one file living on it). Why?
> Is there a way to have this attribute generated at the same time as
> the file creation?
>
> 2. corrupting a file (adding a 0 locally on a brick) before the
> creation of the "trusted.bit-rot.signature" attribute does not produce
> any warning: its signature is different from the 2 other copies on the
> other bricks. Starting a scrub did not show anything. I would think
> that Gluster compares signatures between bricks for this particular
> use case, but it seems the check is only local, so a file corrupted
> before its bitrot signature creation stays corrupted, and thus could
> be served to clients with bad data?
>
> Gluster 3.12.8 on Debian Stretch, bricks on ext4.
>
> Volume Name: vol1
> Type: Replicate
> Volume ID: 85ccfaf2-5793-46f2-bd20-3f823b0a2232
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster-01:/data/brick1
> Brick2: gluster-02:/data/brick2
> Brick3: gluster-03:/data/brick3
> Options Reconfigured:
> storage.build-pgfid: on
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> features.bitrot: on
> features.scrub: Active
> features.scrub-throttle: aggressive
> features.scrub-freq: hourly
>
> Cheers,
>
> Cédric

Re: [Gluster-users] Bitrot strange behavior

2018-04-18 Thread Cedric Lemarchand
Hi Sweta,

Thanks, this raises some more questions for me:

1. What is the reason for delaying signature creation?

2. As the same file (replicated or dispersed) having different signatures across
bricks is by definition an error, it would be good to detect it during a
scrub, or with a different tool. Is something like this planned?

Cheers

—
Cédric Lemarchand

> On 18 Apr 2018, at 07:53, Sweta Anandpara  wrote:
> 
> Hi Cedric,
> 
> Any file is picked up for signing by the bitd process after the predetermined 
> wait of 120 seconds. This default value is captured in the volume option 
> 'features.expiry-time' and is configurable - in your case, it can be set to 0 
> or 1.
> 
> Point 2 is correct. A file corrupted before the bitrot signature is generated 
> will not be successfully detected by the scrubber. That would require 
> admin/manual intervention to explicitly heal the corrupted file.
> 
> -Sweta
> 
> On 04/16/2018 10:42 PM, Cedric Lemarchand wrote:
>> Hello,
>> 
>> I am playing around with the bitrot feature and have some questions:
>> 
>> 1. when a file is created, the "trusted.bit-rot.signature" attribute
>> seems to be created only approximately 120 seconds after its creation
>> (the cluster is idle and there is only one file living on it). Why?
>> Is there a way to have this attribute generated at the same time as
>> the file creation?
>> 
>> 2. corrupting a file (adding a 0 locally on a brick) before the
>> creation of the "trusted.bit-rot.signature" attribute does not produce
>> any warning: its signature is different from the 2 other copies on the
>> other bricks. Starting a scrub did not show anything. I would think
>> that Gluster compares signatures between bricks for this particular
>> use case, but it seems the check is only local, so a file corrupted
>> before its bitrot signature creation stays corrupted, and thus could
>> be served to clients with bad data?
>> 
>> Gluster 3.12.8 on Debian Stretch, bricks on ext4.
>> 
>> Volume Name: vol1
>> Type: Replicate
>> Volume ID: 85ccfaf2-5793-46f2-bd20-3f823b0a2232
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster-01:/data/brick1
>> Brick2: gluster-02:/data/brick2
>> Brick3: gluster-03:/data/brick3
>> Options Reconfigured:
>> storage.build-pgfid: on
>> performance.client-io-threads: off
>> nfs.disable: on
>> transport.address-family: inet
>> features.bitrot: on
>> features.scrub: Active
>> features.scrub-throttle: aggressive
>> features.scrub-freq: hourly
>> 
>> Cheers,
>> 
>> Cédric

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bitrot strange behavior

2018-04-17 Thread Sweta Anandpara

Hi Cedric,

Any file is picked up for signing by the bitd process after the 
predetermined wait of 120 seconds. This default value is captured in the 
volume option 'features.expiry-time' and is configurable - in your case, 
it can be set to 0 or 1.


Point 2 is correct. A file corrupted before the bitrot signature is 
generated will not be successfully detected by the scrubber. That would 
require admin/manual intervention to explicitly heal the corrupted file.
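
For that manual intervention, a rough outline on a replicate volume could look
like the following (a sketch only: the file name, the mount point and the gfid
hard link path are examples, the latter depends on the file's actual gfid):

# on the node holding the bad copy, remove it from the brick together with
# its gfid hard link under .glusterfs (replace <aa>/<bb>/<gfid> accordingly)
rm /data/brick1/file1
rm /data/brick1/.glusterfs/<aa>/<bb>/<gfid>

# then look the file up from a client mount so self-heal recreates the copy
stat /mnt/vol1/file1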


-Sweta

On 04/16/2018 10:42 PM, Cedric Lemarchand wrote:

Hello,

I am playing around with the bitrot feature and have some questions:

1. when a file is created, the "trusted.bit-rot.signature" attribute
seems to be created only approximately 120 seconds after its creation
(the cluster is idle and there is only one file living on it). Why?
Is there a way to have this attribute generated at the same time as
the file creation?

2. corrupting a file (adding a 0 locally on a brick) before the
creation of the "trusted.bit-rot.signature" attribute does not produce
any warning: its signature is different from the 2 other copies on the
other bricks. Starting a scrub did not show anything. I would think
that Gluster compares signatures between bricks for this particular
use case, but it seems the check is only local, so a file corrupted
before its bitrot signature creation stays corrupted, and thus could
be served to clients with bad data?

Gluster 3.12.8 on Debian Stretch, bricks on ext4.

Volume Name: vol1
Type: Replicate
Volume ID: 85ccfaf2-5793-46f2-bd20-3f823b0a2232
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster-01:/data/brick1
Brick2: gluster-02:/data/brick2
Brick3: gluster-03:/data/brick3
Options Reconfigured:
storage.build-pgfid: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-throttle: aggressive
features.scrub-freq: hourly

Cheers,

Cédric

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Bitrot strange behavior

2018-04-16 Thread Cedric Lemarchand
Hello,

I am playing around with the bitrot feature and have some questions:

1. when a file is created, the "trusted.bit-rot.signature" attribute
seems to be created only approximately 120 seconds after its creation
(the cluster is idle and there is only one file living on it). Why?
Is there a way to have this attribute generated at the same time as
the file creation?

2. corrupting a file (adding a 0 locally on a brick) before the
creation of the "trusted.bit-rot.signature" attribute does not produce
any warning: its signature is different from the 2 other copies on the
other bricks. Starting a scrub did not show anything. I would think
that Gluster compares signatures between bricks for this particular
use case, but it seems the check is only local, so a file corrupted
before its bitrot signature creation stays corrupted, and thus could
be served to clients with bad data?
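
(For reference, the test above boils down to something like this; the file
name is just an example and the brick path comes from the volume info below:)

# corrupt the copy directly on one brick, before bitd has signed it
echo 0 >> /data/brick1/file1

# then run a scrub and look at what it reports
gluster volume bitrot vol1 scrub ondemand
gluster volume bitrot vol1 scrub status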

Gluster 3.12.8 on Debian Stretch, bricks on ext4.

Volume Name: vol1
Type: Replicate
Volume ID: 85ccfaf2-5793-46f2-bd20-3f823b0a2232
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster-01:/data/brick1
Brick2: gluster-02:/data/brick2
Brick3: gluster-03:/data/brick3
Options Reconfigured:
storage.build-pgfid: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-throttle: aggressive
features.scrub-freq: hourly

Cheers,

Cédric
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users