Re: Can btrfs silently repair read-error in raid1

2012-05-09 Thread Atila

On 08-05-2012 18:47, Hubert Kario wrote:

On Tuesday 08 of May 2012 04:45:51 cwillu wrote:

On Tue, May 8, 2012 at 1:36 AM, Fajar A. Nugraha l...@fajar.net wrote:

On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer linuxhi...@gmail.com wrote:

Hi,

I have a quite unreliable SSD here which develops some bad blocks from
time to time which result in read-errors.
Once the block is written to again, it's remapped internally and
everything is fine again for that block.

Would it be possible to create 2 btrfs partitions on that drive and
use it in RAID1 - with btrfs silently repairing read-errors when they
occur?
Would it require special settings, to not fall back to read-only mode
when a read-error occurs?

The problem would be how the SSD (and Linux) behaves when it
encounters bad blocks (not bad disks, which is easier).

If it does "oh, I can't read this block, I'll just return an error
immediately", then it's good.

However, in most situations it would be more like "hmmm, I can't read this
block, let me retry that. What? Still an error? Then let's retry it
again, and again", which could take several minutes for a single bad
block. And during that time Linux (the kernel) would do something like
"hey, the disk is not responding. Why don't we try some stuff? Let's
try resetting the link. If that doesn't work, try downgrading the link
speed."

In short, if you KNOW the SSD is already showing signs of bad blocks,
better just throw it away.

The excessive number of retries (basically, the kernel repeating the
work the drive already attempted) is being addressed in the block
layer.

[PATCH] libata-eh don't waste time retrying media errors (v3), I
believe this is queued for 3.5

I just hope they don't remove retries completely; I've seen the second or
third try return correct data on multiple disks from different vendors.
(Which allowed me to use dd to write the data back to force relocation.)

But yes, Linux is a bit overzealous with regard to retries...

Regards,
I hope they do. If you want a retry, you can force one by simply running
your command again. That decision should be made at a higher level.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Jan Schmidt
On Tue, May 08, 2012 at 09:13 (+0200), Clemens Eisserer wrote:
 Would it be possible to create 2 btrfs partitions on that drive and
 use it in RAID1 - with btrfs silently repairing read-errors when they
 occur?
 Would it require special settings, to not fall back to read-only mode
 when a read-error occurs?

No special settings required, that's exactly what btrfs is doing by
default. Though, it's not completely silent: it will tell you in your
kernel log about the repair it did.
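
To make the idea concrete, here is a toy model in Python of what
"repair from the other copy" means. It is only a sketch, not btrfs code:
the class MirroredStore, its two in-memory "mirrors", and the log list
standing in for the kernel log are all made up for illustration, and
zlib.crc32 is just a convenient stand-in for the filesystem's checksum.

```python
import zlib


class MirroredStore:
    """Toy two-copy store: each block is kept twice, plus one checksum."""

    def __init__(self):
        self.copies = [{}, {}]   # block number -> bytes, one dict per mirror
        self.checksums = {}      # block number -> checksum of the good data
        self.log = []            # hypothetical stand-in for the kernel log

    def write(self, block, data):
        for copy in self.copies:
            copy[block] = data
        self.checksums[block] = zlib.crc32(data)

    def read(self, block):
        expected = self.checksums[block]
        for index, copy in enumerate(self.copies):
            data = copy.get(block)   # None models an unreadable block
            if data is not None and zlib.crc32(data) == expected:
                # Rewrite every mirror that failed before this one worked.
                for bad in range(index):
                    self.copies[bad][block] = data
                    self.log.append(f"repaired block {block} on mirror {bad}")
                return data
        raise IOError(f"block {block}: no good copy left")
```

The read still succeeds for the caller, the bad copy gets rewritten, and
the only visible trace is the log entry. If both copies are bad, the
read fails, which is the case a single flaky drive cannot rule out.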

-Jan


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Fajar A. Nugraha
On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
 Hi,

 I have a quite unreliable SSD here which develops some bad blocks from
 time to time which result in read-errors.
 Once the block is written to again, it's remapped internally and
 everything is fine again for that block.

 Would it be possible to create 2 btrfs partitions on that drive and
 use it in RAID1 - with btrfs silently repairing read-errors when they
 occur?
 Would it require special settings, to not fall back to read-only mode
 when a read-error occurs?

The problem would be how the SSD (and Linux) behaves when it
encounters bad blocks (not bad disks, which is easier).

If it does "oh, I can't read this block, I'll just return an error
immediately", then it's good.

However, in most situations it would be more like "hmmm, I can't read this
block, let me retry that. What? Still an error? Then let's retry it
again, and again", which could take several minutes for a single bad
block. And during that time Linux (the kernel) would do something like
"hey, the disk is not responding. Why don't we try some stuff? Let's
try resetting the link. If that doesn't work, try downgrading the link
speed."
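
A back-of-the-envelope sketch of why this adds up to minutes: the retries
multiply, because every attempt by the kernel triggers a full round of
attempts inside the drive. The function name and the numbers used below
are invented for illustration; real drives and kernels have their own
counts and timeouts.

```python
def worst_case_stall(kernel_attempts, drive_attempts, seconds_per_attempt):
    """Seconds spent on one bad block if every single attempt fails.

    Each kernel-level attempt makes the drive run its own internal
    retry loop, so the two counts multiply rather than add.
    """
    return kernel_attempts * drive_attempts * seconds_per_attempt
```

With made-up values of 5 kernel attempts, 10 internal drive attempts and
2 seconds per attempt, worst_case_stall(5, 10, 2.0) gives 100.0 seconds
for a single block, before any link resets are counted.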

In short, if you KNOW the SSD is already showing signs of bad blocks,
better just throw it away.

-- 
Fajar


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread cwillu
On Tue, May 8, 2012 at 1:36 AM, Fajar A. Nugraha l...@fajar.net wrote:
 On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
 Hi,

 I have a quite unreliable SSD here which develops some bad blocks from
 time to time which result in read-errors.
 Once the block is written to again, it's remapped internally and
 everything is fine again for that block.

 Would it be possible to create 2 btrfs partitions on that drive and
 use it in RAID1 - with btrfs silently repairing read-errors when they
 occur?
 Would it require special settings, to not fall back to read-only mode
 when a read-error occurs?

 The problem would be how the SSD (and Linux) behaves when it
 encounters bad blocks (not bad disks, which is easier).

 If it does "oh, I can't read this block, I'll just return an error
 immediately", then it's good.

 However, in most situations it would be more like "hmmm, I can't read this
 block, let me retry that. What? Still an error? Then let's retry it
 again, and again", which could take several minutes for a single bad
 block. And during that time Linux (the kernel) would do something like
 "hey, the disk is not responding. Why don't we try some stuff? Let's
 try resetting the link. If that doesn't work, try downgrading the link
 speed."

 In short, if you KNOW the SSD is already showing signs of bad blocks,
 better just throw it away.

The excessive number of retries (basically, the kernel repeating the
work the drive already attempted) is being addressed in the block
layer.

[PATCH] libata-eh don't waste time retrying media errors (v3), I
believe this is queued for 3.5


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Hubert Kario
On Tuesday 08 of May 2012 04:45:51 cwillu wrote:
 On Tue, May 8, 2012 at 1:36 AM, Fajar A. Nugraha l...@fajar.net wrote:
  On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
  Hi,
  
  I have a quite unreliable SSD here which develops some bad blocks from
  time to time which result in read-errors.
  Once the block is written to again, it's remapped internally and
  everything is fine again for that block.
  
  Would it be possible to create 2 btrfs partitions on that drive and
  use it in RAID1 - with btrfs silently repairing read-errors when they
  occur?
  Would it require special settings, to not fall back to read-only mode
  when a read-error occurs?
  
  The problem would be how the SSD (and Linux) behaves when it
  encounters bad blocks (not bad disks, which is easier).
  
  If it does "oh, I can't read this block, I'll just return an error
  immediately", then it's good.
  
  However, in most situations it would be more like "hmmm, I can't read this
  block, let me retry that. What? Still an error? Then let's retry it
  again, and again", which could take several minutes for a single bad
  block. And during that time Linux (the kernel) would do something like
  "hey, the disk is not responding. Why don't we try some stuff? Let's
  try resetting the link. If that doesn't work, try downgrading the link
  speed."
  
  In short, if you KNOW the SSD is already showing signs of bad blocks,
  better just throw it away.
 
 The excessive number of retries (basically, the kernel repeating the
 work the drive already attempted) is being addressed in the block
 layer.
 
 [PATCH] libata-eh don't waste time retrying media errors (v3), I
 believe this is queued for 3.5

I just hope they don't remove retries completely; I've seen the second or
third try return correct data on multiple disks from different vendors.
(Which allowed me to use dd to write the data back to force relocation.)
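
For anyone who wants to script the same trick, a minimal Python sketch of
"re-read until it works, then write the data back in place" could look
like this. The function name reread_and_rewrite and its parameters are
made up for this example. It uses os.pread and os.pwrite (Unix only) and
works on any path you can open read-write; on a real block device the
page cache may answer the read unless you bypass it, and writing to the
wrong offset destroys data, so treat it as an illustration only.

```python
import os


def reread_and_rewrite(path, offset, length, attempts=3):
    """Try to read one block several times; on success write it back.

    Returns the number of the attempt that succeeded, or None if
    every attempt failed or came back short.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        for attempt in range(1, attempts + 1):
            try:
                data = os.pread(fd, length, offset)
            except OSError:
                continue              # read error: try again
            if len(data) != length:
                continue              # short read: treat it as a failure
            os.pwrite(fd, data, offset)   # same bytes, same place
            os.fsync(fd)                  # push the write out to the device
            return attempt
        return None
    finally:
        os.close(fd)
```

The write-back changes nothing from the filesystem's point of view; the
point is only that the drive sees a write to that block and gets the
chance to remap it.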

But yes, Linux is a bit overzealous with regard to retries...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Chris Samuel
On 08/05/12 17:13, Clemens Eisserer wrote:

 Would it be possible to create 2 btrfs partitions on that drive and
 use it in RAID1 - with btrfs silently repairing read-errors when they
 occur?

You can, I tried that in 2009:

http://www.csamuel.org/2009/01/04/btrfs-raid1-benchmark-on-dell-e4200-with-128gb-ssd

But as Chris Mason pointed out on the list at that time, it doesn't
necessarily mean you're any safer, due to the way that FTLs (flash
translation layers) work:

http://permalink.gmane.org/gmane.comp.file-systems.btrfs/2575

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC