On 2019/1/15 8:03 PM, David Sterba wrote:
> On Tue, Jan 15, 2019 at 07:48:47PM +0800, Qu Wenruo wrote:
>> Super nice move, it shows the corruption and the cause.
>>
>>      item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
>>      item 67 key (10510212874240 METADATA_ITEM 0) itemoff 13283 itemsize 42
>>      item 68 key (1714119868416 METADATA_ITEM 0) itemoff 13250 itemsize 33
> 
> A key order error is the most frequent and also a very reliable report
> of memory bitflips. I think we should add an unconditional check before
> a leaf or node is written so we catch such errors before the bad data
> hits the disk.

I'm super happy with that.
Although I need to do some extra checks before simply removing that
#ifdef/#endif pair.
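
As a rough illustration of what a key order check looks at (a Python sketch,
not the kernel code), the three keys from the dump above can be compared as
(objectid, type, offset) tuples. The helper name is hypothetical; 169 is the
numeric value of METADATA_ITEM, and since all three keys share it, only the
objectid decides the order here.

```python
from typing import List, Optional, Tuple

Key = Tuple[int, int, int]  # (objectid, type, offset)

METADATA_ITEM = 169  # same type for all three keys in the dump

# Keys copied from items 66..68 of the quoted leaf dump.
leaf_keys: List[Key] = [
    (1714119835648, METADATA_ITEM, 0),   # item 66
    (10510212874240, METADATA_ITEM, 0),  # item 67
    (1714119868416, METADATA_ITEM, 0),   # item 68
]


def first_key_order_violation(keys: List[Key]) -> Optional[int]:
    """Return the index of the first key that is not strictly greater
    than its predecessor, or None if the keys are strictly ascending."""
    for i in range(1, len(keys)):
        if keys[i] <= keys[i - 1]:
            return i
    return None


# Index 2 corresponds to item 68 in the dump.
violation_index = first_key_order_violation(leaf_keys)
```

Note that a check like this flags item 68 (the first key smaller than its
predecessor), even though item 67 is the one that looks out of place.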

> 
> This seems to happen way too often,

Right, but I don't know whether it's some bad kernel driver poking the
memory, or really just a hardware memory bit flip.
(Especially with an ultrabook like the one the reporter is using,
soldered memory will really be a pain in the ass.)
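
For what it's worth, the bad objectid in item 67 is consistent with a single
flipped bit. A small Python sketch (the helper name is hypothetical) that
tries every one-bit flip and keeps the candidates that would sort between
the neighbouring keys:

```python
from typing import List, Tuple

# Objectids copied from items 66, 67 and 68 of the quoted leaf dump.
PREV_ID = 1714119835648
SUSPECT_ID = 10510212874240
NEXT_ID = 1714119868416


def single_bit_repairs(prev_id: int, suspect_id: int, next_id: int,
                       bits: int = 64) -> List[Tuple[int, int]]:
    """Return (bit, candidate) pairs where flipping exactly one bit of
    suspect_id yields a value strictly between prev_id and next_id."""
    repairs = []
    for bit in range(bits):
        candidate = suspect_id ^ (1 << bit)
        if prev_id < candidate < next_id:
            repairs.append((bit, candidate))
    return repairs


repairs = single_bit_repairs(PREV_ID, SUSPECT_ID, NEXT_ID)
# Only one candidate survives: clearing bit 43 gives 1714119852032,
# which is 16384 above item 66 and 16384 below item 68.
```

This only shows that one flipped bit explains the value; it cannot tell a
hardware flip apart from a driver scribbling over memory.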

> I believe the check overhead would
> be acceptable and at least give early warning.

The problem is that check_leaf_relaxed() is currently called too frequently.
It runs not at leaf write time, but every time btrfs_mark_buffer_dirty() is
called.

It may cause some performance regression.

I need to look into a better location for such a check.
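
To make the frequency concern concrete, here is a toy Python model (purely
illustrative; ToyLeaf and count_checks are made-up names and nothing here
mirrors the real btrfs code paths). It counts how often a key order check
runs when it is done on every "mark dirty" versus once before the "write".

```python
class ToyLeaf:
    """Hypothetical stand-in for a tree leaf, holding sorted integer keys."""

    def __init__(self) -> None:
        self.keys = []
        self.checks_run = 0

    def check_key_order(self) -> bool:
        # O(n) scan over all keys, counted so the two policies can be compared.
        self.checks_run += 1
        return all(a < b for a, b in zip(self.keys, self.keys[1:]))


def count_checks(modifications: int, check_on_dirty: bool) -> int:
    """Modify a leaf `modifications` times, then write it once.
    Return how many key order checks were run in total."""
    leaf = ToyLeaf()
    for i in range(modifications):
        leaf.keys.append(i)          # modify the leaf, then "mark it dirty"
        if check_on_dirty:
            assert leaf.check_key_order()
    if not check_on_dirty:
        assert leaf.check_key_order()  # single check right before the "write"
    return leaf.checks_run


checks_per_dirty = count_checks(1000, check_on_dirty=True)
checks_per_write = count_checks(1000, check_on_dirty=False)
```

In this model the per-dirty policy runs one full scan per modification,
while the write-time policy runs one scan in total and still sees the leaf
before it would reach the disk.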

Thanks,
Qu
