Re: NVRAM support

2006-02-20 Thread Mirko Benz

Hello,

We have applications where large data sets (e.g. 100 MB) are sequentially 
written.
Software RAID could do a full stripe update (without reading/using 
existing data).
Does this happen in parallel? If yes, isn't that data vulnerable when a 
crash occurs?


Thanks,
Mirko

Neil Brown wrote:

On Wednesday February 15, [EMAIL PROTECTED] wrote:
  

Hi,

My intention was not to use a NVRAM device for swap.

Enterprise storage systems use NVRAM for better data protection/faster 
recovery in case of a crash.
Modern CPUs can do RAID calculation very fast. But Linux RAID is 
vulnerable when a crash during a write operation occurs.
E.g. Data and parity write requests are issued in parallel but only one 
finishes. This will
lead to inconsistent data. It will be undetected and can not be 
repaired. Right?



Wrong.  Well, maybe 5% right.

If the array is degraded, then the inconsistency cannot be detected.
If the array is fully functioning, then any inconsistency will be
corrected by a 'resync'.

  

How can journaling be implemented within linux-raid?



With a fair bit of work. :-)

  

I have seen a paper that tries this in cooperation with a file system:
"Journal-guided Resynchronization for Software RAID"
www.cs.wisc.edu/adsl/Publications



This is using the ext3 journal to make the 'resync' (mentioned above)
faster.  Write-intent bitmaps can achieve similar speedups with
different costs.

  
But I would rather see a solution within md so that other file systems 
or LVM can be used on top of md.



Currently there is no solution to the "crash while writing and
degraded on restart means possible silent data corruption" problem.
However, it is in reality a very small problem (unless you regularly
run with a degraded array - don't do that).

The only practical fix at the filesystem level is, as you suggest,
journalling to NVRAM.  There is work underway to restructure md/raid5
to be able to off-load the xor and raid6 calculations to dedicated
hardware.  This restructure would also make it a lot easier to journal
raid5 updates thus closing this hole (and also improving write
latency).

NeilBrown

  


-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NVRAM support

2006-02-20 Thread Neil Brown
On Monday February 20, [EMAIL PROTECTED] wrote:
> Hello,
> 
> We have applications where large data sets (e.g. 100 MB) are sequentially 
> written.
> Software RAID could do a full stripe update (without reading/using 
> existing data).
> Does this happen in parallel? If yes, isn't that data vulnerable when a 
> crash occurs?

md/raid5 does full stripe writes about 80% of the time when I've
measured it while doing large writes.  I don't know why it is not
closer to 100%.  I suspect some subtle scheduling issue that I
haven't managed to get to the bottom of yet (I should get back to
that).
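
As an illustration of the difference between a full stripe write and a partial one, here is a minimal Python sketch. It is a toy model and not md code; the function names and block contents are made up for the example. A full stripe write computes parity from the new data alone, while a single-block update has to read the old data and old parity first.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(new_data_blocks):
    # Every data block in the stripe is being written, so parity is
    # computed from the new data only; nothing is read from disk.
    return xor_blocks(new_data_blocks)


def rmw_parity(old_parity, old_block, new_block):
    # Read-modify-write: only one block changes, so the old block and
    # the old parity must be read before the new parity can be computed.
    return xor_blocks([old_parity, old_block, new_block])


stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_stripe_parity(stripe)

new_block = b"\xff\x00"
updated = [stripe[0], new_block, stripe[2]]

# Both paths must arrive at the same parity for the updated stripe.
assert rmw_parity(parity, stripe[1], new_block) == full_stripe_parity(updated)
```

The full stripe path avoids the extra reads, which is why large sequential writes benefit when they line up with whole stripes.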

Data is only vulnerable if, after the crash, the array is degraded.
If the array is still complete after the crash, then there is no loss
of data.
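
A toy Python model of that distinction, assuming a single stripe with three data blocks and one XOR parity block (the block contents and names are invented for the example, and this is not how md is implemented). The crash is simulated by updating one data block without updating parity.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# A consistent stripe: parity matches the three data blocks.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Simulated crash: the new d0 reached its disk, the new parity did not.
d0 = b"ZZZZ"  # parity on disk is now stale

# Case 1: the array is complete after the crash. Data blocks are read
# directly from their disks, and a resync recomputes parity from them.
resynced_parity = xor_blocks(d0, d1, d2)

# Case 2: the array is degraded after the crash and the disk holding d1
# is missing. d1 has to be rebuilt from the survivors and stale parity.
rebuilt_d1 = xor_blocks(d0, d2, parity)

print("rebuilt d1 matches original:", rebuilt_d1 == d1)
```

In the degraded case the rebuilt block is wrong even though d1 itself was never being written, which is what makes the corruption silent.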

NeilBrown


Re: NVRAM support

2006-02-16 Thread Mario 'BitKoenig' Holbe
Neil Brown [EMAIL PROTECTED] wrote:
> On Wednesday February 15, [EMAIL PROTECTED] wrote:
> > E.g. Data and parity write requests are issued in parallel but only one 
> > finishes. This will
> > lead to inconsistent data. It will be undetected and can not be 
> If the array is degraded, then the inconsistency cannot be detected.

Hmm, if the array is degraded, then there is no redundancy at all, so
there is no chance for any inconsistency.

Btw., this reminds me... now when you have raid6 - when is a raid6
defined to be degraded? Perhaps you have equal issues there as with
raid1 2 mirrors some months ago (resync was not started when 3rd
mirror failed and 1st and 2nd were inconsistent)?

> If the array is fully functioning, then any inconsistency will be
> corrected by a 'resync'.

Yes, because the redundancy is ignored and rebuilt.


regards
   Mario
-- 
Why did the tachyon cross the road?
Because it was on the other side.



Re: NVRAM support

2006-02-15 Thread Mirko Benz

Hi,

My intention was not to use a NVRAM device for swap.

Enterprise storage systems use NVRAM for better data protection/faster 
recovery in case of a crash.
Modern CPUs can do RAID calculation very fast. But Linux RAID is 
vulnerable when a crash during a write operation occurs.
E.g. Data and parity write requests are issued in parallel but only one 
finishes. This will
lead to inconsistent data. It will be undetected and can not be 
repaired. Right?


How can journaling be implemented within linux-raid?

I have seen a paper that tries this in cooperation with a file system:
„Journal-guided Resynchronization for Software RAID“
www.cs.wisc.edu/adsl/Publications

But I would rather see a solution within md so that other file systems 
or LVM can be used on top of md.


Regards,
Mirko

Erik Mouw wrote:

On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
  

On Fri, 10 Feb 2006, Bill Davidsen wrote:


Erik Mouw wrote:
  

You could use it for an external journal, or you could use it as a swap
device.
 


Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
you do a lot of operations which generate journal entries, file create,
delete, etc, then it will double your performance in some cases. Otherwise it
really doesn't help much, use as a swap device might be more helpful depending
on your config.
  
it doesn't seem to make any sense at all to use a non-volatile external 
memory for swap... swap has no purpose past a power outage.



No, but it is a very fast swap device. Much faster than a hard drive.


Erik

  




Re: NVRAM support

2006-02-15 Thread Neil Brown
On Wednesday February 15, [EMAIL PROTECTED] wrote:
> Hi,
> 
> My intention was not to use a NVRAM device for swap.
> 
> Enterprise storage systems use NVRAM for better data protection/faster 
> recovery in case of a crash.
> Modern CPUs can do RAID calculation very fast. But Linux RAID is 
> vulnerable when a crash during a write operation occurs.
> E.g. Data and parity write requests are issued in parallel but only one 
> finishes. This will
> lead to inconsistent data. It will be undetected and can not be 
> repaired. Right?

Wrong.  Well, maybe 5% right.

If the array is degraded, then the inconsistency cannot be detected.
If the array is fully functioning, then any inconsistency will be
corrected by a 'resync'.
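
A rough Python sketch of what a resync amounts to, under the assumption of simple XOR parity and with stripes held in memory as dictionaries (both are simplifications for illustration; the names are hypothetical and this is not md's code). The data blocks are trusted and parity is rebuilt from them wherever it does not match.

```python
from functools import reduce
from operator import xor


def parity_of(data_blocks):
    """XOR parity over equal-length data blocks."""
    return bytes(reduce(xor, column) for column in zip(*data_blocks))


def resync(stripes):
    """Rebuild parity from data for every stripe that is inconsistent.

    `stripes` is a list of {"data": [bytes, ...], "parity": bytes}.
    Returns the indices of the stripes whose parity was rewritten.
    """
    repaired = []
    for index, stripe in enumerate(stripes):
        expected = parity_of(stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected  # data wins, parity is regenerated
            repaired.append(index)
    return repaired


stripes = [
    {"data": [b"\x01", b"\x02"], "parity": b"\x03"},  # consistent
    {"data": [b"\x0f", b"\x02"], "parity": b"\x03"},  # stale parity after a crash
]
print("repaired stripes:", resync(stripes))
```

This only works while every data block can still be read, which is why the degraded case is the problematic one.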

 
> How can journaling be implemented within linux-raid?

With a fair bit of work. :-)

 
> I have seen a paper that tries this in cooperation with a file system:
> "Journal-guided Resynchronization for Software RAID"
> www.cs.wisc.edu/adsl/Publications

This is using the ext3 journal to make the 'resync' (mentioned above)
faster.  Write-intent bitmaps can achieve similar speedups with
different costs.
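
The idea behind a write-intent bitmap can be sketched in a few lines of Python. This is a simplified in-memory model with hypothetical names and sizes, not md's on-disk bitmap format: a region is marked before any write to it starts and cleared once all writes to it have finished, so after a crash only the marked regions need a resync.

```python
class WriteIntentBitmap:
    """Toy write-intent bitmap: tracks regions with writes in flight."""

    def __init__(self, array_size, region_size):
        self.region_size = region_size
        region_count = -(-array_size // region_size)  # ceiling division
        # A per-region counter, so overlapping writes do not clear each other.
        self.pending = [0] * region_count

    def _regions(self, offset, length):
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        return range(first, last + 1)

    def begin_write(self, offset, length):
        # In a real system this must reach stable storage before the data write.
        for region in self._regions(offset, length):
            self.pending[region] += 1

    def end_write(self, offset, length):
        for region in self._regions(offset, length):
            self.pending[region] -= 1

    def regions_to_resync(self):
        # After a crash, only these regions can hold inconsistent stripes.
        return [i for i, count in enumerate(self.pending) if count > 0]


bitmap = WriteIntentBitmap(array_size=1000, region_size=100)
bitmap.begin_write(offset=150, length=100)
print("dirty while writing:", bitmap.regions_to_resync())
bitmap.end_write(offset=150, length=100)
print("dirty after completion:", bitmap.regions_to_resync())
```

The cost is the extra bitmap update ahead of each write to a clean region; the benefit is that a resync no longer has to scan the whole array.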

 
> But I would rather see a solution within md so that other file systems 
> or LVM can be used on top of md.

Currently there is no solution to the "crash while writing and
degraded on restart means possible silent data corruption" problem.
However, it is in reality a very small problem (unless you regularly
run with a degraded array - don't do that).

The only practical fix at the filesystem level is, as you suggest,
journalling to NVRAM.  There is work underway to restructure md/raid5
to be able to off-load the xor and raid6 calculations to dedicated
hardware.  This restructure would also make it a lot easier to journal
raid5 updates thus closing this hole (and also improving write
latency).
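
To show how journalling stripe updates could close the hole, here is a hedged Python sketch. It assumes a hypothetical NVRAM log modelled as a plain list and an array modelled as two dictionaries; none of this reflects an actual md interface. Each update is logged in full before it touches the disks and is retired only after both data and parity are written, so a crash in between can be repaired by replaying the log.

```python
from functools import reduce
from operator import xor


def parity_of(data_blocks):
    """XOR parity over equal-length data blocks."""
    return bytes(reduce(xor, column) for column in zip(*data_blocks))


class JournaledArray:
    def __init__(self):
        self.nvram_journal = []  # hypothetical NVRAM log of (stripe, data, parity)
        self.data = {}           # stripe number -> list of data blocks on disk
        self.parity = {}         # stripe number -> parity block on disk

    def write_stripe(self, stripe, data_blocks, crash_after_data=False):
        entry = (stripe, list(data_blocks), parity_of(data_blocks))
        self.nvram_journal.append(entry)        # 1. log the complete update
        self.data[stripe] = list(data_blocks)   # 2. data reaches the disks
        if crash_after_data:
            return                              # simulated crash: parity is stale
        self.parity[stripe] = entry[2]          # 3. parity reaches the disk
        self.nvram_journal.remove(entry)        # 4. retire the journal entry

    def recover(self):
        """Replay every update that was logged but never retired."""
        replayed = []
        for stripe, data_blocks, parity in self.nvram_journal:
            self.data[stripe] = data_blocks
            self.parity[stripe] = parity
            replayed.append(stripe)
        self.nvram_journal.clear()
        return replayed


array = JournaledArray()
array.write_stripe(0, [b"\x01", b"\x02"])
array.write_stripe(0, [b"\x0f", b"\x02"], crash_after_data=True)
print("replayed stripes:", array.recover())
```

Because the journal holds the full new contents of the stripe, replay works even if a disk is missing on restart, which a parity resync alone cannot offer.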

NeilBrown


Re: NVRAM support

2006-02-14 Thread Erik Mouw
On Mon, Feb 13, 2006 at 11:54:44AM +, Andy Smith wrote:
> On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> > On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > > it doesn't seem to make any sense at all to use a non-volatile external 
> > > memory for swap... swap has no purpose past a power outage.
> > 
> > No, but it is a very fast swap device. Much faster than a hard drive.
> 
> Wouldn't the same amount of money be better spent on RAM then?

Sure, but when you happen to have such a device lying idle, this is a
way to use it.

(note that you can also use unused memory on your video adapter as a
fast swap device).


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: NVRAM support

2006-02-13 Thread Andy Smith
On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > it doesn't seem to make any sense at all to use a non-volatile external 
> > memory for swap... swap has no purpose past a power outage.
> 
> No, but it is a very fast swap device. Much faster than a hard drive.

Wouldn't the same amount of money be better spent on RAM then?

-- 
http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
Encrypted mail welcome - keyid 0x604DE5DB




RE: NVRAM support

2006-02-13 Thread Guy
Not the same amount!  Match the size of the NV RAM disk with RAM at a
fraction of the cost.  With the money saved, buy a computer for the kids.
:)

} -----Original Message-----
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Andy Smith
} Sent: Monday, February 13, 2006 6:55 AM
} To: linux-raid@vger.kernel.org
} Subject: Re: NVRAM support
} 
} On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
}  On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
}   it doesn't seem to make any sense at all to use a non-volatile
} external
}   memory for swap... swap has no purpose past a power outage.
} 
}  No, but it is a very fast swap device. Much faster than a hard drive.
} 
} Wouldn't the same amount of money be better spent on RAM then?
} 
} --
} http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
} Encrypted mail welcome - keyid 0x604DE5DB



Re: NVRAM support

2006-02-10 Thread Erik Mouw
On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
> Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI 
> card that exports battery backed memory.

Unless it's very large (i.e.: as large as one of your disks), it
doesn't make sense. It will probably break less often, but it doesn't
help you in case a disk really breaks. It also won't speed up an MD
device much.

> Could that significantly improve write speed for RAID 5/6 (e.g. via an 
> external journal, asynchronous operation and write caching)?

You could use it for an external journal, or you could use it as a swap
device.

> What changes would be required?

None, ext3 supports external journals. Look for the -O option in the
mke2fs manual page. Using the NVRAM device as swap is no different
from using a normal swap partition.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: NVRAM support

2006-02-10 Thread Bill Davidsen

Erik Mouw wrote:


On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
 

Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI 
card that exports battery backed memory.
   



Unless it's very large (i.e.: as large as one of your disks), it
doesn't make sense. It will probably break less often, but it doesn't
help you in case a disk really breaks. It also won't speed up an MD
device much.

 

Could that significantly improve write speed for RAID 5/6 (e.g. via an 
external journal, asynchronous operation and write caching)?
   



You could use it for an external journal, or you could use it as a swap
device.
 



Let me concur, I used external journal on SSD a decade ago with jfs 
(AIX). If you do a lot of operations which generate journal entries, 
file create, delete, etc, then it will double your performance in some 
cases. Otherwise it really doesn't help much, use as a swap device might 
be more helpful depending on your config.


 


What changes would be required?
   



None, ext3 supports external journals. Look for the -O option in the
mke2fs manual page. Using the NVRAM device as swap is no different
from using a normal swap partition.


Erik

 




--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: NVRAM support

2006-02-10 Thread Paul Clements

Mirko Benz wrote:

Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI 
card that exports battery backed memory.


Sure. There are a couple ways I can think of using such a thing:

1) put an md intent bitmap on the NVRAM device for faster resyncs

2) use the NVRAM as a write journal for md to make md raid4/5/6 reliable 
(if the system crashes while an md raid5 is degraded, i.e., missing a 
disk, there is a chance of silent data corruption). The md driver does 
not currently do write journalling, so this would require some code changes.


--
Paul


Re: NVRAM support

2006-02-10 Thread dean gaudet
On Fri, 10 Feb 2006, Bill Davidsen wrote:

> Erik Mouw wrote:
> 
> > You could use it for an external journal, or you could use it as a swap
> > device.
> 
> Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
> you do a lot of operations which generate journal entries, file create,
> delete, etc, then it will double your performance in some cases. Otherwise it
> really doesn't help much, use as a swap device might be more helpful depending
> on your config.

it doesn't seem to make any sense at all to use a non-volatile external 
memory for swap... swap has no purpose past a power outage.

-dean