Re: OT - was Hammer or ZFS based backup, encryption

2009-02-25 Thread Matthew Dillon
Generally speaking the idea with HAMMER's snapshotting and mirroring
is that everything is based on transaction-ids stored in the B-Tree.

The mirroring functionality does not require snapshotting per se,
because EVERY sync HAMMER does to the media (including the automatic
filesystem syncs done by the kernel every 30-60 seconds) is effectively
a snapshot.

There is a downside to the way HAMMER manages its historical data store
and it is unclear how much of a burden this will wind up being without some
specific tests.  The downside is that the historical information is stored
in the HAMMER B-Tree side-by-side with current information.

If you make 50,000 modifications to the same offset within a file,
for example, with an fsync() in between each one, and assuming you don't
prune the filesystem, then you will have 50,000 records for that HAMMER
data block in the B-Tree.  This can be optimized... HAMMER doesn't have
to scan 50,000 B-Tree elements.  It can seek to the last (most current)
one when it traverses the tree.  I may not be doing that yet but there is
no data structure limitation that would prevent it.  Even with the
optimization there will certainly be some overhead.

The mitigating factor is, of course, that the HAMMER B-Tree is pruned
every night to match the requested snapshot policy.
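
As a rough sketch, that nightly policy run boils down to something like
the following (paths are placeholders and the exact directives depend on
your release; see hammer(8)):

  # take a dated snapshot softlink, then prune away history that is not
  # reachable from the softlinks kept in the snapshot directory
  hammer snapshot /home/snaps
  hammer prune /home/snaps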

--

It would be cool if someone familiar with both ZFS's mirroring and
HAMMER's mirroring could test the feature and performance set.  What
I like most about HAMMER's mirroring is that the mirroring target can
have a different history retention policy than the master.

HAMMER's current mirror streaming feature is also pretty cool if I do
say so myself.  Since incremental mirroring is so fast, the hammer
utility can poll for changes every few seconds, and since the stream
isn't queued it can be killed and restarted at any time.  Network
outages don't really affect it.

I also added a very cool feature to the hammer mirror-stream directive
which allows you to limit the bandwidth, preventing the mirroring
operation from interfering with production performance.
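
A rough sketch of the invocation (PFS paths, host, and the rate value
are placeholders, and I'm assuming the limit is exposed as the hammer(8)
-b option -- check the man page on your release):

  # continuously mirror a master PFS to a remote slave, limited to ~5 MB/s
  hammer -b 5m mirror-stream /pfs/master backuphost:/pfs/slave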

-Matt



Re: Hammer or ZFS based backup, encryption

2009-02-23 Thread Freddie Cash
On Sun, Feb 22, 2009 at 11:10 PM, Bill Hacker  wrote:
> Freddie Cash wrote:
>> On Sat, Feb 21, 2009 at 10:39 AM, Csaba Henk  wrote:
>>> I need to setup a backup machine, and I intend to utilize today's
>>> snapshotty filesystems (which boils down to Dfly+Hammer or FBSD+ZFS --
>>> btrfs is not there yet, and I don't feel like delving into Solaris).
>>> Set up such an OS with such an fs, and backup by syncing to the
>>> snapshotty fs and create a snapshot.
>>>
>>> I wonder about the following things:
>>>
>>> 1) Any idea how does this approach scale related to more conventional
>>> solutions,
>>> like rdiff-backup or dump(8)? I see the pros, but are there any
>>> cons? How effective is taking regular snapshots space-wise?
>>>
>>> 2) Is there any practical argument for choosing between Dfly+Hammer and
>>> FBSD+ZFS? (Feel free to give biased answers :) )
>>>
>>> 3) I'd like to encrypt stuff, either at device or fs level. For
>>> FreeBSD there is geli(8). I haven't found anything for DragonFly.
>>> Is there any way to get at it on DragonFly?
>>
>> We do this at work, using FreeBSD 7.1 and ZFS, for backing up over 80
>> remote Linux and FreeBSD servers, and 1 Windows station.  We have two
>> servers, one that does the backups every night, and another that
>> mirrors the backups during the day.
>
> *trimmed* (description of a quite decent ZFS approach)
>
>>
>> After the initial rsync of a server, which can take several days as it
>> can easily max out an ADSL link's upload bandwidth, the daily run
>> takes about 6 hours, most of which is waiting for rsync to generate
>> the file listing.
>>
>
> *snipped*
>
> But there we are, Startup 'seeding' is unavoidable, but thereafter ... among
> other things, looking to reduce the reliance on rsync (and similar CVS'ish
> or git'ish techniques) having to 'inventory' stuff at a high per-file level
> that a 'hammer mirror-stream' (or GMIRROR to networked RAID) could do 'as
> you go along' at a lower level - closer to the actual blocks as they are
> being written.
>
> How, and how well, would ZFS handle redundant pools on separate sites?
>
> And can that be a streaming process - even if that means the redundancy
> target is r/o for 'the duration', as a hammer slave would be?

ZFS includes "snapshot send" and "snapshot recv" commands for
streaming snapshots across the network.  However, there is currently a
bottleneck in the recv code that prevents it from saturating even a
100 Mbit link.  This affects ZFS v6 which is in FreeBSD 7.x, and I'm
fairly certain that it hasn't been fixed in ZFS v13 which is in
FreeBSD 8.  Once that has been fixed, then it can be used for
near-real-time streaming of snapshots across the network.
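
For reference, the send/recv pairing looks roughly like this (a sketch
only; pool, dataset, snapshot, and host names are placeholders):

  zfs snapshot storage/backups@2009-02-23
  zfs send storage/backups@2009-02-23 | ssh slave zfs recv -d storage
  # once both sides share a snapshot, send only the delta:
  zfs send -i storage/backups@2009-02-23 storage/backups@2009-02-24 | \
      ssh slave zfs recv -d storage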

Until then, you can rsync snapshots from the master to the slave
server(s), which is what we're doing.

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Freddie Cash
On Mon, Feb 23, 2009 at 8:44 AM, Jeremy Chadwick  wrote:
> On Mon, Feb 23, 2009 at 05:19:14PM +0100, Jasse Jansson wrote:
>> On Feb 23, 2009, at 11:54 AM, Bill Hacker wrote:
>>> Robert Luciani wrote:
 Freddie Cash wrote:
> Booting FreeBSD 7.1 into a full KDE 4.2
> desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
> drives in a single raidz1.
 Wow 5 minutes?!
 I don't think I'd be pushing it if I said that seems really slow. :S
 On such a fast machine I'd be irritated if it took over a minute to
 boot. On my
 3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting
 services, and Gnome
 in about a minute with my two 500Gb hammer drives.
>>>
>>> One minute 45 seconds into Xfce4 for a VIA C7 @ 1.5 GHz, 2 GB DDR-533,
>>> all-hammerfs on 2 natacontrol RAID1 'antique' 60 GB PATA UDMA 100 HDD.
>>>
>>> One minute 4 seconds into Xfce4 for an Intel T2130 @ 1.86 GHz 2 GB ?
>>> RAM, 1 X 120 GB 2.5" HDD, DFLY on 33 GB ad0s1, UFS2 with one hammer
>>> partition.
>>>
>>> RAIDZ looks to be the wall-time hog
>>
>> RAIDZ is known to be slow, even the main developers admit it if you
>> force them to it.
>
> I'd recommend the individual seeing ~5 minute boot times try disabling
> ZFS prefetching.  Place vfs.zfs.prefetch_disable="1" in
> /boot/loader.conf and reboot the machine.

Prefetch is disabled.  The delays in the boot on this system are from
using a USB stick for the / partition.  From kernel load to desktop
load is about 2 minutes.   It takes about 2 minutes to get the kernel
and modules loaded, though.

However, my Debian Lenny station at work takes almost 5 minutes to get
to a KDE 4.2 desktop, and that's using a single SATA harddrive with
ext3 and xfs.  Doesn't bother me, though, as I reboot maybe once per
month, sometimes only once per quarter.  I've never really understood
the reasoning for 30-second boot times.

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Freddie Cash
On Mon, Feb 23, 2009 at 2:35 AM, Robert Luciani  wrote:
> Freddie Cash wrote:
>> Booting FreeBSD 7.1 into a full KDE 4.2
>> desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
>> drives in a single raidz1.
>
> Wow 5 minutes?!
> I don't think I'd be pushing it if I said that seems really slow. :S
> On such a fast machine I'd be irritated if it took over a minute to boot. On 
> my
> 3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting services, and 
> Gnome
> in about a minute with my two 500Gb hammer drives.

That includes the BIOS/POST delays to find the USB stick, a
boot.config delay, the boot menu delay, loading the kernel off a USB
stick delay, loading a bunch of kernel modules off the USB stick, the
SCSI sub-system delay, starting a bunch of daemons including Samba,
DHCPd, Apache, NFSd, and KDM, and then logging in.  The time from the
kernel being loaded to the desktop appearing is about 2 minutes.  It
just takes a long time to get to the point where the kernel is loaded.
 :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Ulrich Spörlein
On Sun, 22.02.2009 at 22:45:40 +0100, Michael Neumann wrote:
> On Sun, 22 Feb 2009 06:33:44 -0800, Jeremy Chadwick wrote:
> > What you're trying to describe won't work, for the same reason I
> > described above (with your "zpool add tank ad8s1" command).  You can
> > split the disk into two pieces if you want, but it's not going to
> > change the fact that you cannot *grow* a zpool.  You literally have to
> > destroy it and recreate it for the pool to increase in size.
> 
> Ah okay, that's probably because the filesystem and RAID system are too
> tightly bundled in ZFS. So if I understand correctly, you can't grow a
> ZFS RAID-5 pool or anything similar to RAID-5.
> Now the ZFS filesystem probably can only use blocks from one pool, so
> the result is that you can't grow a ZFS filesystem living on a RAID-5+
> pool as well. A bad example of coupling...

No, no, no! You really think Sun would come up with this shiny new
technology in 2005 (?) and it cannot grow a filesystem? Jesus ...

Please read the other mails in this thread, thanks!

Cheers,
Ulrich Spörlein
-- 
None are more hopelessly enslaved than those who falsely believe they are free
-- Johann Wolfgang von Goethe


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Bill Hacker

Jeremy Chadwick wrote:

On Mon, Feb 23, 2009 at 05:19:14PM +0100, Jasse Jansson wrote:

On Feb 23, 2009, at 11:54 AM, Bill Hacker wrote:

Robert Luciani wrote:

Freddie Cash wrote:

Booting FreeBSD 7.1 into a full KDE 4.2
desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
drives in a single raidz1.

Wow 5 minutes?!
I don't think I'd be pushing it if I said that seems really slow. :S
On such a fast machine I'd be irritated if it took over a minute to 
boot. On my
3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting  
services, and Gnome

in about a minute with my two 500Gb hammer drives.
One minute 45 seconds into Xfce4 for a VIA C7 @ 1.5 GHz, 2 GB DDR-533, 
all-hammerfs on 2 natacontrol RAID1 'antique' 60 GB PATA UDMA 100 HDD.


One minute 4 seconds into Xfce4 for an Intel T2130 @ 1.86 GHz 2 GB ? 
RAM, 1 X 120 GB 2.5" HDD, DFLY on 33 GB ad0s1, UFS2 with one hammer  
partition.


RAIDZ looks to be the wall-time hog
RAIDZ is known to be slow, even the main developers admit it if you  
force them to it.


I'd recommend the individual seeing ~5 minute boot times try disabling
ZFS prefetching.  Place vfs.zfs.prefetch_disable="1" in
/boot/loader.conf and reboot the machine.



For same-box use, I'm happy to use hardware RAID or even a low-level 
(block) pseudo RAID, such as [n]atacontrol.


A good LSi controller or such with decent cache (and battery) can make 
even a marginal fs work faster..


'back in the days of .. '  100 Mbps TCNS (ARCnet), an EISA-bus Novell fs 
with twin CDC Wren IV on twin SCSI controllers (duplexed) was faster 
over-the-wire than any local HDD we could buy for ISA-bus 386'en.


OTOH, Netware had grown up on ARCnet and cheated by assuming zero-error 
links. Which TCNS, on either fibre-optic or coax actually delivered. 
Unlike 10-Base-T Ethernet of that era...


Bill




Re: Hammer or ZFS based backup, encryption

2009-02-23 Thread Bill Hacker

Jasse Jansson wrote:


On Feb 23, 2009, at 8:10 AM, Bill Hacker wrote:


*trimmed*



But there we are, Startup 'seeding' is unavoidable, but thereafter ... 
among other things, looking to reduce the reliance on rsync (and 
similar CVS'ish or git'ish techniques) having to 'inventory' stuff at 
a high per-file level that a 'hammer mirror-stream' (or GMIRROR to 
networked RAID) could do 'as you go along' at a lower level - closer 
to the actual blocks as they are being written.


How, and how well, would ZFS handle redundant pools on separate sites?

And can that be a streaming process - even if that means the 
redundancy target is r/o for 'the duration', as a hammer slave would be?


I have not tried it myself, but there is a ZFS send/receive command,
that might do the trick.





Kaiser Jasse -- Authorized Stealth Oracle



Looks promising. VERY promising...:

http://docs.huihoo.com/opensolaris/solaris-zfs-administration-guide/html/ch06s03.html

Thanks for pointing that out

Bill


Re: Hammer or ZFS based backup, encryption

2009-02-23 Thread Kyle Butt
On Mon, Feb 23, 2009 at 05:18:09PM +0100, Jasse Jansson wrote:
>
> On Feb 23, 2009, at 8:10 AM, Bill Hacker wrote:
>
>> Freddie Cash wrote:
*snipped*
>>
>> But there we are, Startup 'seeding' is unavoidable, but thereafter ... 
>> among other things, looking to reduce the reliance on rsync (and 
>> similar CVS'ish or git'ish techniques) having to 'inventory' stuff at a 
>> high per-file level that a 'hammer mirror-stream' (or GMIRROR to 
>> networked RAID) could do 'as you go along' at a lower level - closer to 
>> the actual blocks as they are being written.
>>
>> How, and how well, would ZFS handle redundant pools on separate sites?
>>
>> And can that be a streaming process - even if that means the  
>> redundancy target is r/o for 'the duration', as a hammer slave would 
>> be?
>
> I have not tried it myself, but there is a ZFS send/receive command,
> that might do the trick.
>

I've set up a ZFS mirror remotely. We take snapshots once a minute via cron
and send them. There's a practical upper limit on the number of snapshots that
can be kept; beyond that, the receive tends to slow way down and things like
'zfs list' start to take a long time.

But yes, it does the trick to have a live backup.
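
A minimal sketch of that kind of cron-driven incremental send (the
dataset, host, and bookkeeping file are assumptions, not our exact
script):

  #!/bin/sh
  # run from cron once a minute: snapshot, send the delta, remember the name
  DS=tank/data
  NOW=$(date +%Y%m%d-%H%M%S)
  PREV=$(cat /var/db/last-snap)
  zfs snapshot ${DS}@${NOW}
  zfs send -i ${DS}@${PREV} ${DS}@${NOW} | ssh mirrorhost zfs recv -F ${DS}
  echo ${NOW} > /var/db/last-snap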




Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Jeremy Chadwick
On Mon, Feb 23, 2009 at 05:19:14PM +0100, Jasse Jansson wrote:
> On Feb 23, 2009, at 11:54 AM, Bill Hacker wrote:
>> Robert Luciani wrote:
>>> Freddie Cash wrote:
 Booting FreeBSD 7.1 into a full KDE 4.2
 desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
 drives in a single raidz1.
>>> Wow 5 minutes?!
>>> I don't think I'd be pushing it if I said that seems really slow. :S
>>> On such a fast machine I'd be irritated if it took over a minute to 
>>> boot. On my
>>> 3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting  
>>> services, and Gnome
>>> in about a minute with my two 500Gb hammer drives.
>>
>> One minute 45 seconds into Xfce4 for a VIA C7 @ 1.5 GHz, 2 GB DDR-533, 
>> all-hammerfs on 2 natacontrol RAID1 'antique' 60 GB PATA UDMA 100 HDD.
>>
>> One minute 4 seconds into Xfce4 for an Intel T2130 @ 1.86 GHz 2 GB ? 
>> RAM, 1 X 120 GB 2.5" HDD, DFLY on 33 GB ad0s1, UFS2 with one hammer  
>> partition.
>>
>> RAIDZ looks to be the wall-time hog
>
> RAIDZ is known to be slow, even the main developers admit it if you  
> force them to it.

I'd recommend the individual seeing ~5 minute boot times try disabling
ZFS prefetching.  Place vfs.zfs.prefetch_disable="1" in
/boot/loader.conf and reboot the machine.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Jasse Jansson


On Feb 23, 2009, at 11:54 AM, Bill Hacker wrote:


Robert Luciani wrote:

Freddie Cash wrote:

Booting FreeBSD 7.1 into a full KDE 4.2
desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
drives in a single raidz1.

Wow 5 minutes?!
I don't think I'd be pushing it if I said that seems really slow. :S
On such a fast machine I'd be irritated if it took over a minute  
to boot. On my
3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting  
services, and Gnome

in about a minute with my two 500Gb hammer drives.


One minute 45 seconds into Xfce4 for a VIA C7 @ 1.5 GHz, 2 GB  
DDR-533, all-hammerfs on 2 natacontrol RAID1 'antique' 60 GB PATA  
UDMA 100 HDD.


One minute 4 seconds into Xfce4 for an Intel T2130 @ 1.86 GHz 2 GB ? 
RAM, 1 X 120 GB 2.5" HDD, DFLY on 33 GB ad0s1, UFS2 with one hammer  
partition.


RAIDZ looks to be the wall-time hog


RAIDZ is known to be slow, even the main developers admit it if you  
force them to it.





Kaiser Jasse -- Authorized Stealth Oracle

The axioms of wisdom:
1. Go the SPARC way of life
2. You can't conquer the universe without the knowledge of FORTRAN
3. In the Unix realm, 10% of work fixes 90% of the problems





Re: Hammer or ZFS based backup, encryption

2009-02-23 Thread Jasse Jansson


On Feb 23, 2009, at 8:10 AM, Bill Hacker wrote:


Freddie Cash wrote:
On Sat, Feb 21, 2009 at 10:39 AM, Csaba Henk   
wrote:

I need to setup a backup machine, and I intend to utilize today's
snapshotty filesystems (which boils down to Dfly+Hammer or FBSD 
+ZFS --
btrfs is not there yet, and I don't feel like dwelving into  
Solaris).

Set up such an OS with such an fs, and backup by syncing to the
snapshotty fs and create a snapshot.

I wonder about the following things:

1) Any idea how does this approach scale related to more  
conventional solutions,

like rdiff-backup or dump(8)? I see the the pros, but are there any
cons? How effective is taking regular snapshots space-wise?

2) Is there any practical argument for choosing between Dfly 
+Hammer and

FBSD+ZFS? (Feel free to give biased answers :) )

3) I'd like to encrypt stuff, either at device or fs level. For
FreeBSD there is geli(8). I haven't found anything for DragonFly.
Is there any way to get at it on DragonFly?

We do this at work, using FreeBSD 7.1 and ZFS, for backing up over 80
remote Linux and FreeBSD servers, and 1 Windows station.  We have two
servers, one that does the backups every night, and another that
mirrors the backups during the day.


*trimmed* (description of a quite decent ZFS approach)

After the initial rsync of a server, which can take several days  
as it

can easily max out an ADSL link's upload bandwidth, the daily run
takes about 6 hours, most of which is waiting for rsync to generate
the file listing.



*snipped*

But there we are, Startup 'seeding' is unavoidable, but  
thereafter ... among other things, looking to reduce the reliance  
on rsync (and similar CVS'ish or git'ish techniques) having to  
'inventory' stuff at a high per-file level that a 'hammer mirror- 
stream' (or GMIRROR to networked RAID) could do 'as you go along'  
at a lower level - closer to the actual blocks as they are being  
written.


How, and how well, would ZFS handle redundant pools on separate sites?

And can that be a streaming process - even if that means the  
redundancy target is r/o for 'the duration', as a hammer slave  
would be?


I have not tried it myself, but there is a ZFS send/receive command,
that might do the trick.





Kaiser Jasse -- Authorized Stealth Oracle

The axioms of wisdom:
1. Go the SPARC way of life
2. You can't conquer the universe without the knowledge of FORTRAN
3. In the Unix realm, 10% of work fixes 90% of the problems





Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Bill Hacker

Robert Luciani wrote:

Freddie Cash wrote:

Booting FreeBSD 7.1 into a full KDE 4.2
desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
drives in a single raidz1.


Wow 5 minutes?!
I don't think I'd be pushing it if I said that seems really slow. :S
On such a fast machine I'd be irritated if it took over a minute to boot. On my
3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting services, and Gnome
in about a minute with my two 500Gb hammer drives.



One minute 45 seconds into Xfce4 for a VIA C7 @ 1.5 GHz, 2 GB DDR-533, 
all-hammerfs on 2 natacontrol RAID1 'antique' 60 GB PATA UDMA 100 HDD.


One minute 4 seconds into Xfce4 for an Intel T2130 @ 1.86 GHz 2 GB ?RAM, 
1 X 120 GB 2.5" HDD, DFLY on 33 GB ad0s1, UFS2 with one hammer partition.


RAIDZ looks to be the wall-time hog

Bill



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-23 Thread Robert Luciani
Freddie Cash wrote:
> Booting FreeBSD 7.1 into a full KDE 4.2
> desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
> drives in a single raidz1.

Wow 5 minutes?!
I don't think I'd be pushing it if I said that seems really slow. :S
On such a fast machine I'd be irritated if it took over a minute to boot. On my
3Ghz Athlon X2 w/ 2Gb RAM, DragonFly boots vkernels, hosting services, and Gnome
in about a minute with my two 500Gb hammer drives.

-- 
Robert Luciani
Chalmers University of Technology, SWE
Department of Computer Science and Engineering
http://www.rluciani.com/rluciani.asc


Re: Hammer or ZFS based backup, encryption

2009-02-22 Thread Bill Hacker

Freddie Cash wrote:

On Sat, Feb 21, 2009 at 10:39 AM, Csaba Henk  wrote:

I need to setup a backup machine, and I intend to utilize today's
snapshotty filesystems (which boils down to Dfly+Hammer or FBSD+ZFS --
btrfs is not there yet, and I don't feel like delving into Solaris).
Set up such an OS with such an fs, and backup by syncing to the
snapshotty fs and create a snapshot.

I wonder about the following things:

1) Any idea how does this approach scale related to more conventional solutions,
like rdiff-backup or dump(8)? I see the pros, but are there any
cons? How effective is taking regular snapshots space-wise?

2) Is there any practical argument for choosing between Dfly+Hammer and
FBSD+ZFS? (Feel free to give biased answers :) )

3) I'd like to encrypt stuff, either at device or fs level. For
FreeBSD there is geli(8). I haven't found anything for DragonFly.
Is there any way to get at it on DragonFly?


We do this at work, using FreeBSD 7.1 and ZFS, for backing up over 80
remote Linux and FreeBSD servers, and 1 Windows station.  We have two
servers, one that does the backups every night, and another that
mirrors the backups during the day.


*trimmed* (description of a quite decent ZFS approach)



After the initial rsync of a server, which can take several days as it
can easily max out an ADSL link's upload bandwidth, the daily run
takes about 6 hours, most of which is waiting for rsync to generate
the file listing.



*snipped*

But there we are, Startup 'seeding' is unavoidable, but thereafter ... 
among other things, looking to reduce the reliance on rsync (and similar 
CVS'ish or git'ish techniques) having to 'inventory' stuff at a high 
per-file level that a 'hammer mirror-stream' (or GMIRROR to networked 
RAID) could do 'as you go along' at a lower level - closer to the actual 
blocks as they are being written.


How, and how well, would ZFS handle redundant pools on separate sites?

And can that be a streaming process - even if that means the redundancy 
target is r/o for 'the duration', as a hammer slave would be?


Bill


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Freddie Cash
On Sat, Feb 21, 2009 at 7:17 PM, Jeremy Chadwick  wrote:
> On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:
>> On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker  wrote:
>> > Hopefully more 'good stuff' will be ported out of Solaris before it hits 
>> > the
>> > 'too costly vs the alternatives' wall and is orphaned.
>>
>> Btrfs has been merged into mainline Linux now, and although it's
>> pretty far behind ZFS in completeness at the moment, it represents a
>> far greater degree of flexibility and power. In a couple of years when
>> it's stable and user friendly, high-end storage solutions will move
>> back to Linux, after having given Sun a lot of contracts due
>> specifically to ZFS.
>
> The fact that btrfs offers grow/shrink capability puts it ahead of ZFS
> with regards to home users who desire a NAS.  I can't stress this point
> enough.  ZFS's lack of this capability limits its scope.  As it stands
> now, if you replace a disk with a larger one, you have to go through
> this extremely fun process to make use of the new space available:
>
> - Offload all of your data somewhere (read: not "zfs export"); rsync
>  is usually what people end up using -- if you have multiple ZFS
>  filesystems, this can take some time
> - zpool destroy
> - zpool create
> - zfs create

According to the ZFS Admin manual, doing an online "replace" of the
drive with a larger one, then a zpool export and zpool import, is all
that's needed to make use of the extra space.

In theory, one can replace all the drives in the storage array one at
a time, allowing the resilvers to complete each time, and then just
export/import once, and have a massively larger pool to use.
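
A sketch of that sequence (pool and device names are placeholders):

  zpool replace tank olddisk newdisk   # repeat for each drive, letting
                                       # each resilver finish first
  zpool export tank
  zpool import tank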

> And if you add a new disk to the system, it's impossible to add that
> disk to the existing pool -- you can, of course, create an entirely
> new zpool which uses that disk, but that has nothing to do with the
> existing zpool.  So you get to do the above dance.

You can add vdevs to a pool at any time.  Data will be striped across
all vdevs in the pool.  What you can't do is extend a raidz vdev.  But
you can add more raidz vdevs to a pool.

If you create a pool with a 3-drive raidz vdev, you can later extend
the pool by adding another 3-drive raidz vdev.  Or by adding a
mirrored vdev.  Or by just adding a single drive (although then you
lose the redundancy of the entire pool).

I've done this several times when playing around with ZFS on FreeBSD
7.1 on a test system with 24-drives.  Started with a 12-drive raidz2
vdev.  Then added a 6-drive raidz2.  Then another 6-drive raidz2.
Then played around with 3x 8-drive raidz2 vdevs.  And a bunch of other
setups, just to see what the limitations were.  The only limitation is that
you can't start with an X-drive raidz2 and later extend that single
raidz2 vdev out to Y-drives, like you can with some hardware RAID
controllers.

> I'll also point out that ZFS on FreeBSD (at least 7.x) performs very
> differently than on Solaris 10.  We use Solaris 10 x86 + ZFS at my
> workplace, and the overall usability of the system during heavy disk I/O
> is much more refined (read: smooth) than on FreeBSD.  It's interesting
> to do something like "zpool iostat 1" on FreeBSD compared to Solaris 10;
> FreeBSD will show massive write bursts (e.g. 0MB, 0MB, 0MB, 70MB, 0MB,
> 0MB, 0MB, 67MB, etc.), while Solaris behaves more appropriately (50MB,
> 60MB, 70MB, 40MB, etc.).  "zpool scrub" is a great way to test this.

Hrm, we haven't run into that, but we're mostly limited by network
speeds in our setup.  "zpool iostat" shows a fairly constant 2 MB read
or write to each of the 24-drives in the servers.  But that's all
rsync usage, and limited by the ADSL links we use.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Hammer or ZFS based backup, encryption

2009-02-22 Thread Freddie Cash
On Sat, Feb 21, 2009 at 10:39 AM, Csaba Henk  wrote:
> I need to setup a backup machine, and I intend to utilize today's
> snapshotty filesystems (which boils down to Dfly+Hammer or FBSD+ZFS --
> btrfs is not there yet, and I don't feel like delving into Solaris).
> Set up such an OS with such an fs, and backup by syncing to the
> snapshotty fs and create a snapshot.
>
> I wonder about the following things:
>
> 1) Any idea how does this approach scale related to more conventional 
> solutions,
> like rdiff-backup or dump(8)? I see the pros, but are there any
> cons? How effective is taking regular snapshots space-wise?
>
> 2) Is there any practical argument for choosing between Dfly+Hammer and
> FBSD+ZFS? (Feel free to give biased answers :) )
>
> 3) I'd like to encrypt stuff, either at device or fs level. For
> FreeBSD there is geli(8). I haven't found anything for DragonFly.
> Is there any way to get at it on DragonFly?

We do this at work, using FreeBSD 7.1 and ZFS, for backing up over 80
remote Linux and FreeBSD servers, and 1 Windows station.  We have two
servers, one that does the backups every night, and another that
mirrors the backups during the day.

The main backup server is:
  - Chenbro 5U rackmount case with 24 drive bays and hot-swappable
SATA backplane
  - Tyan h2000M motherboard
  - 2x Opteron 2200-series CPUs @ 2.6 GHz (dual-core)
  - 8 GB ECC DDR2-667 SDRAM
  - 3Ware 9650SE PCIe RAID controller
  - 3Ware 9550SXU PCI-X RAID controller
  - 24x 500 GB SATA HDs (Seagate and Maxtor)
  - 1350W 4-way hot-swappable PSU

All 24 drives are configured as "Single Disk" arrays, and show up as
24 separate SCSI devices in the OS.  This allows the RAID controller
to use the 256 MB of onboard RAM as another level of cache and allows
us to use the 3dm2 management console (as opposed to JBOD-mode where
it becomes just a dumb SATA controller).

The main backup server has 2x 2GB CompactFlash in IDE adapters that
house the base OS install.  The second backup server has 2x 2GB USB
sticks for the OS install.  All space on all 24 drives is used for the
ZFS pool.

The main backup server has a single 24-drive raidz2 dataset.  Not
optimal, but we didn't know any better back then.  :)  The second
server has 3x 8-drive raidz2 datasets.  Not as much usable space, but
better performance and redundancy.

/ and /usr are on the CF/USB.  Having /usr on there makes single-user
mode a lot nicer to use.  /usr/local, /home, /usr/ports,
/usr/ports/distfiles, /usr/src, /usr/obj, /var, /tmp, and /backups are
ZFS filesystems with various properties set (like compression on
/usr/src, /usr/ports, and /backups).
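
A tiny sketch of how those per-filesystem properties get set (the pool
and dataset names here are placeholders):

  zfs create storage/backups
  zfs set mountpoint=/backups storage/backups
  zfs set compression=on storage/backups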

We have a custom rsync script that does the backups of the remote
servers every night, and that creates a snapshot named after the date.
And another custom script that does an rsync from a snapshot on the
main server to the backups directory on the second server, and then
creates a snapshot on there.
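
In outline, the nightly script amounts to something like this (the host
list, paths, dataset, and rsync options are assumptions, not our actual
script):

  #!/bin/sh
  TODAY=$(date +%Y-%m-%d)
  # pull each remote server into its own directory under /backups
  for host in $(cat /usr/local/etc/backup-hosts); do
      rsync -aH --delete --numeric-ids ${host}:/ /backups/${host}/
  done
  # freeze tonight's state as a dated snapshot
  zfs snapshot storage/backups@${TODAY}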

After the initial rsync of a server, which can take several days as it
can easily max out an ADSL link's upload bandwidth, the daily run
takes about 6 hours, most of which is waiting for rsync to generate
the file listing.

This setup works wonders, and has been used to re-image servers,
restore files from backups, and even re-set the permissions/ownerships
for /home after a teacher did a "chown -R" on /home by mistake.  Being
able to cd into a snapshot directory and directly access the files is
a god-send.

We've been running this setup since August 2008.  Disk usage so far is
just over 6 TB.  Daily snapshots average <10 GB.  With ~10 TB of drive
space in each server, we won't run out of space for a while yet.  And
when we get under 1 TB of free space, we just start swapping drives
out for larger ones and the extra space is automatically added into
the pool.  Theoretically, we can put just under 50 TB of disk into
these systems.  :)

For the curious, these boxes cost under $10,000 CDN.  We like to
mention that when the storage vendors call with their $50,000 US
"budget" storage systems with 5 TB of disk space.  :D  They tend to
not call back.

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Freddie Cash
On Sun, Feb 22, 2009 at 7:50 AM, Bill Hacker  wrote:
> Side issue again - just brought up DFLY 2.3.0, default all-hammer layout.
>
> - atop a natacontrol RAID1 on a pair of salvaged 60 GB IBM 'Deathstars'.
> - VIA C7 1.5 GHz CPU (the el-cheapo MB aimed at Wal-Mart)
> - pulled half the RAM, leaving only 1GB
>
> Periodically, the aged IBM's sound like squirrels having a go in a gravel
> pit, Xfce4 is *very* slow, but it hadn't started swapping, and based on what
> 'top' is showing should actually work decently for basic file, web, or mail
> serving, especially as the OpenSSL recognizes the VIA padlock engine.
>  Plenty fast enough with 2GB, BTW.
>
> ZFS is not famed for tolerating that meagre a resource ration as well as
> HAMMER

ZFS on FreeBSD 7.1 runs well with 2 GB of RAM and a 2.6 GHz P4 CPU
(32-bit).  It just requires tuning of a couple of kernel tunables via
loader.conf.  It runs better with a 64-bit CPU and more RAM, but it is
perfectly usable with 2 GB.  Booting FreeBSD 7.1 into a full KDE 4.2
desktop takes less than 5 minutes.  This is using 3x 120 GB SATA
drives in a single raidz1.

ZFS will try to use every bit of RAM that it can as a cache.  But you
can limit how much it can use.  There are reports on the FreeBSD
mailing lists of people using it on laptops (slow drives), and on
systems with only 768 MB of RAM.
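
For what it's worth, the loader.conf tuning on a 2 GB i386 box looks
roughly like this (the values are illustrative assumptions, not a
recommendation):

  vm.kmem_size="512M"
  vm.kmem_size_max="512M"
  vfs.zfs.arc_max="128M"
  vfs.zfs.prefetch_disable="1"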

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Freddie Cash
On Sun, Feb 22, 2009 at 6:33 AM, Jeremy Chadwick  wrote:
> On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:
>> Okay "zpool remove" doesn't seem to work as expected, but it should
>> work well at least for RAID-1 (which probably no one uses for large
>> storage systems ;-). Maybe "zfs replace" works, if you replace an old
>> disk, with a larger disk, and split it into two partitions, the one
>> equally sized to the old, and the other containing the remainder of the
>> space. Then do:
>>
>>   zfs replace tank old_device new_device_equally_sized
>>   zfs add tank new_device_remainder
>>
>> But you probably know more about ZFS than me ;-)
>
> In this case, yes (that I know more about ZFS than you :-) ).  What
> you're trying to do there won't work.
>
> The "zfs" command manages filesystems (e.g. pieces under a zpool).  You
> cannot do anything with devices (disks) with "zfs".  I think you mean
> "zpool", especially since the only "replace" command is "zpool replace".
>
> What you're trying to describe won't work, for the same reason I
> described above (with your "zpool add tank ad8s1" command).  You can
> split the disk into two pieces if you want, but it's not going to
> change the fact that you cannot *grow* a zpool.  You literally have to
> destroy it and recreate it for the pool to increase in size.
>
> I've been through this procedure twice in the past year, as I replaced
> 250GB disks with 500GB, and then 500GB disks with 750GB.  It's a *huge*
> pain, and I cannot imagine anyone in an enterprise environment using ZFS
> to emulate a filer -- it simply won't work.  For individual servers
> (where disks are going to remain the same size unless the box is
> formatted, etc.), oh yes, ZFS is absolutely fantastic.

This is patently false, and you've been creating unnecessary work for
yourself.  :)

You most definitely can add drives to a pool, thus increasing the
total amount of storage space available in the pool.  It's as simple
as:
  zpool add <...>

That's the whole point of the "add" keyword ... you add storage to the
pool.  For example, you can create a pool using a 6-disk raidz2 vdev
like so:
  zpool create pool raidz2 da0 da1 da2 da3 da4 da5

Later, you can add another raidz2 vdev like so:
  zpool add pool raidz2 da6 da7 da8 da9

Your pool has now become, effectively, a RAID60:  a RAID0 stripe made
up of two RAID6 arrays.

You can later add a mirrored vdev to the pool using:
  zpool add pool mirror da10 da11

And data will be striped across the three different vdevs.  This is
the whole point of the pooled storage setup ... you just keep adding
storage to the pool, and it gets striped across it all.

You're getting tripped up by the same thing that I did when I first
started with ZFS:  you can't extend raidz vdevs (ie you can't start
with a 6-drive raidz2 and then later expand it into a 10-drive
raidz2).  But there's nothing stopping you from adding more raidz2
vdevs to the pool.

One of the servers we have at work uses 3x 8-drive raidz2 vdevs:

[fc...@thehive  ~]$ zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
storage   ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
label/disk01  ONLINE   0 0 0
label/disk02  ONLINE   0 0 0
label/disk03  ONLINE   0 0 0
label/disk04  ONLINE   0 0 0
label/disk13  ONLINE   0 0 0
label/disk14  ONLINE   0 0 0
label/disk15  ONLINE   0 0 0
label/disk16  ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
label/disk05  ONLINE   0 0 0
label/disk06  ONLINE   0 0 0
label/disk07  ONLINE   0 0 0
label/disk08  ONLINE   0 0 0
label/disk17  ONLINE   0 0 0
label/disk18  ONLINE   0 0 0
label/disk19  ONLINE   0 0 0
label/disk20  ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
label/disk09  ONLINE   0 0 0
label/disk10  ONLINE   0 0 0
label/disk11  ONLINE   0 0 0
label/disk12  ONLINE   0 0 0
label/disk21  ONLINE   0 0 0
label/disk22  ONLINE   0 0 0
label/disk23  ONLINE   0 0 0
label/disk24  ONLINE   0 0 0

errors: No known data errors

[fc...@thehive  ~]$ zpool list
NAME      SIZE   USED    AVAIL   CAP  HEALTH  ALTROOT
storage   10.9T  3.90T   6.98T   35%  ONLINE  -

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Freddie Cash
On Sun, Feb 22, 2009 at 6:33 AM, Jeremy Chadwick  wrote:
> I've been through this procedure twice in the past year, as I replaced
> 250GB disks with 500GB, and then 500GB disks with 750GB.  It's a *huge*
> pain, and I cannot imagine anyone in an enterprise environment using ZFS
> to emulate a filer -- it simply won't work.  For individual servers
> (where disks are going to remain the same size unless the box is
> formatted, etc.), oh yes, ZFS is absolutely fantastic.

Oh, and you don't have to re-create the pool in order to use the extra
space.  Just "zpool replace" the drives with larger ones, then "zpool
export" the pool, and "zpool import" the pool, and the extra space
becomes available.  I did this on my home server a couple of months
ago.  I started with 3x 120 GB drives split into 3 slices (2 GB for
use by gmirror for /, 1 GB for use as swap, rest for use by ZFS).
Later I moved / onto a 2 GB USB key.  A simple iteration of "zpool
replace pool ad4s1 ad4"  for each of the three drives, an export and
import, and I had an extra 9 GB of space in my pool.

-- 
Freddie Cash
fjwc...@gmail.com


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Bill Hacker

Jeremy Chadwick wrote:

On Mon, Feb 23, 2009 at 04:53:35AM +0800, Bill Hacker wrote:

Jeremy Chadwick wrote:

*snip*


The problem I was attempting to describe: all pool members must be the
same size, otherwise all members are considered to be equal to the size
of the smallest.  In English: you cannot "mix-and-match" different sized
disks.


*TILT*

C'mon guys - that has nuthin to do with ZFS or any other RAID [1].

{snip}


Circling back to the near-start of the thread (specifically Dmitri's
comment): the point was that Linux has btrfs and a few other filesystems
that offer some really fantastic features (including what I've
described).  Commercial filers (see: Network Appliance, Pillar Axiom)
also offer mix-matched disk sizes and grow/shrink capability.  (NetApp
is actually based on BSD, but naturally all the FS stuff is proprietary)

How/why does this matter to us?

Because users are commonly using *BSD as a form of inexpensive filer for
their servers (not everyone can afford a NetApp or Axiom), or as an OS
on their home NAS (which includes pfSense and m0n0wall).  In both of
these cases, expanding/growing the array isn't possible, which greatly
limits the user-base scope -- and sadly, users usually don't find this
out until they've already made their choice, swap/upgrade a disk, then
post "WTF!" on a mailing list or forum somewhere.

ZFS happens to be incredibly easy to manage (from an administrative POV)
and solves many shortcomings.  It's significantly easier to understand
and use than Linux LVM (Linux md/mdadm is simple, it's the LVM part that
adds excessive complexity).  HAMMER also appears to be pretty easy to
manage and also solves many shortcomings, in a significantly different
way than ZFS (obviously).  These are excellent improvements in the BSD
world, but there's still a few things folks really want which will
ultimately improve on what BSD is being used for today.  That's all I'm
trying to say.  :-)



Well said ..

With the improvements in reliability of HDD, redundancy in the HDD 
within any one given box is no longer our 'hot spot' - and may never be 
again.


A SAN or NAS doesn't improve that - it potentially makes it worse w/r 
single point of failure.


Going forward, it makes more sense to *us* to drop each of our 
traditionally RAID1 or RAID5 boxes back to multiple, non-RAID HDD, 
connect the boxen to each other with Gig-E or better, and let each 
support the redundancy for one or more of the others.


Lower risk of outage from CPU or PSU fan failures... or 'fat fingers'.
Potential for IP-failover HA configuration.

Hence my testing as to how happy HAMMER is with a lowly C3 (fair), C7 
(quite decent), C9/Nano ('to do', but should be as good as it needs to 
be... 'gamers' or GUI-centric we are not).


...and the research on Gfarm, Gluster, Dfarm, Chiron, Seph ... yada, 
yada... But Linux - where most of these distributed tools presently 
perch - is just not an option for *our* use. 'Wetware' cost is too high.


FreeBSD's GEOM/GMIRROR has had the counterpart to hammer mirror-stream 
over the link for quite some time. But UFS(2) only, and while that fs 
has never done me any harm, 'snapshots' on UFS are an add-on, not inherent.
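
For anyone who hasn't seen it, the GEOM path alluded to is roughly this
(a sketch; hostnames and devices are placeholders, and the remote box
runs ggated(8) with the disk exported in /etc/gg.exports):

  ggatec create -o rw remotebox /dev/ad2   # appears locally as /dev/ggate0
  gmirror label -v gm0 /dev/ad4 /dev/ggate0
  newfs -U /dev/mirror/gm0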


Enter HAMMER not because it is necessarily any better (yet) than 
some form of 'grid farm' w/r distributed storage


.. but because HAMMER was optimized from the ground up for inherent 
ease of snapshot management, and to a higher degree than anything else 
since FOSSIL/VENTI - which did not scale well... and *did* break now 
and then.


Likewise, hammer mirror-stream (so far) looks to be very good at steady 
working - not likely to overload the primary or the b/w.


JM2CW, but I suspect we are not alone in wanting to stop sending to the 
landfill so many RAID HDD that haven't actually failed.


Or paying to heat a CPU, head-positioner, and NIC with constant rsyncing 
calculations.


JM2CW

Bill


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Jeremy Chadwick
On Mon, Feb 23, 2009 at 04:53:35AM +0800, Bill Hacker wrote:
> Jeremy Chadwick wrote:
>
> *snip*
>
>>
>> The problem I was attempting to describe: all pool members must be the
>> same size, otherwise all members are considered to be equal to the size
>> of the smallest.  In English: you cannot "mix-and-match" different sized
>> disks.
>>
>
> *TILT*
>
> C'mon guys - that has nuthin to do with ZFS or any other RAID [1].
>
> {snip}

Circling back to the near-start of the thread (specifically Dmitri's
comment): the point was that Linux has btrfs and a few other filesystems
that offer some really fantastic features (including what I've
described).  Commercial filers (see: Network Appliance, Pillar Axiom)
also offer mix-matched disk sizes and grow/shrink capability.  (NetApp
is actually based on BSD, but naturally all the FS stuff is proprietary)

How/why does this matter to us?

Because users are commonly using *BSD as a form of inexpensive filer for
their servers (not everyone can afford a NetApp or Axiom), or as an OS
on their home NAS (which includes pfSense and m0n0wall).  In both of
these cases, expanding/growing the array isn't possible, which greatly
limits the user-base scope -- and sadly, users usually don't find this
out until they've already made their choice, swap/upgrade a disk, then
post "WTF!" on a mailing list or forum somewhere.

ZFS happens to be incredibly easy to manage (from an administrative POV)
and solves many shortcomings.  It's significantly easier to understand
and use than Linux LVM (Linux md/mdadm is simple, it's the LVM part that
adds excessive complexity).  HAMMER also appears to be pretty easy to
manage and also solves many shortcomings, in a significantly different
way than ZFS (obviously).  These are excellent improvements in the BSD
world, but there's still a few things folks really want which will
ultimately improve on what BSD is being used for today.  That's all I'm
trying to say.  :-)

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Michael Neumann
On Sun, 22 Feb 2009 06:33:44 -0800, Jeremy Chadwick wrote:

> On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:
> > On Sat, 21 Feb 2009 19:17:11 -0800, Jeremy Chadwick wrote:
> > 
> > > On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:
> > > > On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker
> > > >  wrote:
> > > > > Hopefully more 'good stuff' will be ported out of Solaris
> > > > > before it hits the 'too costly vs the alternatives' wall and
> > > > > is orphaned.
> > > > 
> > > > Btrfs has been merged into mainline Linux now, and although it's
> > > > pretty far behind ZFS in completeness at the moment, it
> > > > represents a far greater degree of flexibility and power. In a
> > > > couple of years when it's stable and user friendly, high-end
> > > > storage solutions will move back to Linux, after having given
> > > > Sun a lot of contracts due specifically to ZFS.
> > > 
> > > The fact that btrfs offers grow/shrink capability puts it ahead
> > > of ZFS with regards to home users who desire a NAS.  I can't
> > > stress this point enough.  ZFS's lack of this capability limits
> > > its scope.  As it stands now, if you replace a disk with a larger
> > > one, you have to go through this extremely fun process to make
> > > use of the new space available:
> > > 
> > > - Offload all of your data somewhere (read: not "zfs export");
> > > rsync is usually what people end up using -- if you have multiple
> > > ZFS filesystems, this can take some time
> > > - zpool destroy
> > > - zpool create
> > > - zfs create
> > > 
> > > And if you add a new disk to the system, it's impossible to add
> > > that disk to the existing pool -- you can, of course, create an
> > > entirely new zpool which uses that disk, but that has nothing to
> > > do with the existing zpool.  So you get to do the above dance.
> > 
> > Hm, I thought that would work easily with ZFS, and at least in
> > theory I think that should work well with ZFS. Or what is wrong
> > with:
> > 
> >   zpool add tank /dev/ad8s1
> 
> This will only work how you expect if you're using a ZFS mirror.  With
> RAIDZ, it doesn't work -- you're forced to add the new disk into a new
> zpool.  This is one of the shortcomings of ZFS (and it is documented,
> but only lightly so).
> 
> > Okay "zpool remove" doesn't seem to work as expected, but it should
> > work well at least for RAID-1 (which probably no one uses for large
> > storage systems ;-). Maybe "zfs replace" works, if you replace an
> > old disk, with a larger disk, and split it into two partitions, the
> > one equally sized to the old, and the other containing the
> > remainder of the space. Then do:
> > 
> >   zfs replace tank old_device new_device_equally_sized
> >   zfs add tank new_device_remainder
> > 
> > But you probably know more about ZFS than me ;-)
> 
> In this case, yes (that I know more about ZFS than you :-) ).  What
> you're trying to do there won't work.
> 
> The "zfs" command manages filesystems (e.g. pieces under a zpool).
> You cannot do anything with devices (disks) with "zfs".  I think you
> mean "zpool", especially since the only "replace" command is "zpool
> replace".

Oops, yep, that was of course a typo of mine ;-)
 
> What you're trying to describe won't work, for the same reason I
> described above (with your "zpool add tank ad8s1" command).  You can
> split the disk into two pieces if you want, but it's not going to
> change the fact that you cannot *grow* a zpool.  You literally have to
> destroy it and recreate it for the pool to increase in size.

Ah okay, that's probably because the filesystem and RAID system are too
tightly bundled in ZFS. So if I understand correctly, you can't grow a
ZFS RAID-5 pool or anything similar to RAID-5.
Now the ZFS filesystem probably can only use blocks from one pool, so
the result is that you can't grow a ZFS filesystem living on a RAID-5+
pool as well. A bad example of coupling...

With Hammer the situation is different. You can let vinum
manage a RAID-5 pool (don't know if this is stable, but that's not my
point) and add the storage to a Hammer FS. If you need more space you
have two choices:

  1) Replace a disk with a larger one, splitting it into two subdisks
 (as I described in the last post).

  2) simply create a new RAID-5 pool (built using some new
 disks) and add it as well to the same filesystem. If you reblock
 everything to the new RAID-5 pool you could then remove the old
 RAID-5 pool completely.

Regards,

  Michael


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Bill Hacker

Jeremy Chadwick wrote:

*snip*



The problem I was attempting to describe: all pool members must be the
same size, otherwise all members are considered to be equal to the size
of the smallest.  In English: you cannot "mix-and-match" different sized
disks.



*TILT*

C'mon guys - that has nuthin to do with ZFS or any other RAID [1].

Grocery-store rithmetic says you can't put 2 liters of wine into a 
one-liter bottle.


(though there IS a MIL-SPEC exercise called a 'blivet' that tries to get 
ten liters of sh** into a one-liter container .. ...with predictable 
mess to be dealt with, fecal matter having rather poor compression 
algorithms)


;-)


Bill

[1] Though some 'grid farm' and cousins attack the problem differently - 
sort of a JBOD confederation with (some of) the contents replicated one 
or more times and in one or more places - wherever there is space, or 
least-usage, or greatest 'quality', or most eager 'volunteer', or 


Whole 'nuther subject, those.

And neither ZFS not HAMMER is of that family.

Not *yet*


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Jeremy Chadwick
On Sun, Feb 22, 2009 at 08:12:36PM +0100, Ulrich Spörlein wrote:
> On Sun, 22.02.2009 at 06:33:44 -0800, Jeremy Chadwick wrote:
> > On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:
> > > Okay "zpool remove" doesn't seem to work as expected, but it should
> > > work well at least for RAID-1 (which probably no one uses for large
> > > storage systems ;-). Maybe "zfs replace" works, if you replace an old
> > > disk, with a larger disk, and split it into two partitions, the one
> > > equally sized to the old, and the other containing the remainder of the
> > > space. Then do:
> > > 
> > >   zfs replace tank old_device new_device_equally_sized
> > >   zfs add tank new_device_remainder
> > > 
> > > But you probably know more about ZFS than me ;-)
> > 
> > In this case, yes (that I know more about ZFS than you :-) ).  What
> > you're trying to do there won't work.
> > 
> > The "zfs" command manages filesystems (e.g. pieces under a zpool).  You
> > cannot do anything with devices (disks) with "zfs".  I think you mean
> > "zpool", especially since the only "replace" command is "zpool replace".
> > 
> > What you're trying to describe won't work, for the same reason I
> > described above (with your "zpool add tank ad8s1" command).  You can
> > split the disk into two pieces if you want, but it's not going to
> > change the fact that you cannot *grow* a zpool.  You literally have to
> > destroy it and recreate it for the pool to increase in size.
> > 
> > I've been through this procedure twice in the past year, as I replaced
> > 250GB disks with 500GB, and then 500GB disks with 750GB.  It's a *huge*
> > pain, and I cannot imagine anyone in an enterprise environment using ZFS
> > to emulate a filer -- it simply won't work.  For individual servers
> > (where disks are going to remain the same size unless the box is
> > formatted, etc.), oh yes, ZFS is absolutely fantastic.
> 
> This is nonsense, of course. Here's proof (running on FreeBSD 7.1)
>
> {snip}

You're correct -- my statement is incorrect/inaccurate.

The problem I was attempting to describe: all pool members must be the
same size, otherwise all members are considered to be equal to the size
of the smallest.  In English: you cannot "mix-and-match" different sized
disks.

Example (with real disks):

da1: 65536MB (134217728 512 byte sectors: 255H 63S/T 8354C)
da2: 65536MB (134217728 512 byte sectors: 255H 63S/T 8354C)
da3: 65536MB (134217728 512 byte sectors: 255H 63S/T 8354C)
da4: 262144MB (536870912 512 byte sectors: 255H 63S/T 33418C)

testbox# zpool create tank raidz1 da1 da2 da3
testbox# df -k /tank
Filesystem   1024-blocks  Used      Avail  Capacity  Mounted on
tank           131303936     0  131303936        0%  /tank
testbox# zpool list
NAME   SIZE   USED   AVAIL   CAP  HEALTH  ALTROOT
tank   191G   192K    191G    0%  ONLINE  -
testbox# zpool offline tank da2
Bringing device da2 offline
testbox# zpool replace tank da2 da4

Wait a few moments for the resilvering to take place...

testbox# zpool status | grep scrub
 scrub: resilver completed with 0 errors on Sun Feb 22 11:32:15 2009
testbox# df -k /tank
Filesystem   1024-blocks  Used      Avail  Capacity  Mounted on
tank           131303936     0  131303936        0%  /tank

If da1 and da3 were replaced with 256GB disks, the pool should grow.

In this example, essentially 192GB of space on da4 is "wasted" (unused
and unavailable) due to what I've described.

zpool destroy/create *will not* fix this situation either, so I was also
wrong in that regard.  Case point:

testbox# zpool destroy tank
testbox# zpool create tank raidz1 da1 da3 da4
testbox# df -k /tank
Filesystem   1024-blocks  Used      Avail  Capacity  Mounted on
tank           131303936     0  131303936        0%  /tank

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Ulrich Spörlein
On Sun, 22.02.2009 at 06:33:44 -0800, Jeremy Chadwick wrote:
> On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:
> > Okay "zpool remove" doesn't seem to work as expected, but it should
> > work well at least for RAID-1 (which probably no one uses for large
> > storage systems ;-). Maybe "zfs replace" works, if you replace an old
> > disk, with a larger disk, and split it into two partitions, the one
> > equally sized to the old, and the other containing the remainder of the
> > space. Then do:
> > 
> >   zfs replace tank old_device new_device_equally_sized
> >   zfs add tank new_device_remainder
> > 
> > But you probably know more about ZFS than me ;-)
> 
> In this case, yes (that I know more about ZFS than you :-) ).  What
> you're trying to do there won't work.
> 
> The "zfs" command manages filesystems (e.g. pieces under a zpool).  You
> cannot do anything with devices (disks) with "zfs".  I think you mean
> "zpool", especially since the only "replace" command is "zpool replace".
> 
> What you're trying to describe won't work, for the same reason I
> described above (with your "zpool add tank ad8s1" command).  You can
> split the disk into two pieces if you want, but it's not going to
> change the fact that you cannot *grow* a zpool.  You literally have to
> destroy it and recreate it for the pool to increase in size.
> 
> I've been through this procedure twice in the past year, as I replaced
> 250GB disks with 500GB, and then 500GB disks with 750GB.  It's a *huge*
> pain, and I cannot imagine anyone in an enterprise environment using ZFS
> to emulate a filer -- it simply won't work.  For individual servers
> (where disks are going to remain the same size unless the box is
> formatted, etc.), oh yes, ZFS is absolutely fantastic.

This is nonsense, of course. Here's proof (running on FreeBSD 7.1)

r...@roadrunner: ~# mdconfig -atswap -s128m
md1
r...@roadrunner: ~# mdconfig -atswap -s128m
md2
r...@roadrunner: ~# mdconfig -atswap -s256m
md3
r...@roadrunner: ~# mdconfig -atswap -s256m
md4
r...@roadrunner: ~# zpool create foo mirror md1 md2
r...@roadrunner: ~# zpool status foo
  pool: foo
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
foo ONLINE   0 0 0
  mirrorONLINE   0 0 0
md1 ONLINE   0 0 0
md2 ONLINE   0 0 0

errors: No known data errors
r...@roadrunner: ~# zfs list foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    106K  90.9M    18K  /foo
r...@roadrunner: ~# zpool replace foo md1 md3
r...@roadrunner: ~# zpool scrub foo
r...@roadrunner: ~# zpool status foo
  pool: foo
 state: ONLINE
 scrub: scrub completed with 0 errors on Sun Feb 22 20:06:18 2009
config:

NAMESTATE READ WRITE CKSUM
foo ONLINE   0 0 0
  mirrorONLINE   0 0 0
md3 ONLINE   0 0 0
md2 ONLINE   0 0 0

errors: No known data errors
r...@roadrunner: ~# zpool replace foo md2 md4
r...@roadrunner: ~# zpool scrub foo
r...@roadrunner: ~# zpool status foo
  pool: foo
 state: ONLINE
 scrub: scrub completed with 0 errors on Sun Feb 22 20:06:35 2009
config:

NAMESTATE READ WRITE CKSUM
foo ONLINE   0 0 0
  mirrorONLINE   0 0 0
md3 ONLINE   0 0 0
md4 ONLINE   0 0 0

errors: No known data errors
r...@roadrunner: ~# zfs list foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    110K  90.9M    18K  /foo
r...@roadrunner: ~# zpool export foo; zpool import foo
r...@roadrunner: ~# zfs list foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    110K   219M    18K  /foo

The export/import dance might be a problem in a HA environment of
course. But at least it works for RAIDZ, too.

r...@roadrunner: ~# zpool create foo raidz md1 md2 md5
r...@roadrunner: ~# zfs list foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    122K   214M  24.0K  /foo
r...@roadrunner: ~# zpool replace foo md1 md3
r...@roadrunner: ~# zpool replace foo md2 md4
r...@roadrunner: ~# zpool replace foo md5 md6
r...@roadrunner: ~# zfs list foo; zpool export foo; zpool import foo; zfs list foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    122K   214M  24.0K  /foo
NAME   USED  AVAIL  REFER  MOUNTPOINT
foo    126K   470M  24.0K  /foo
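
For reference, the grow-by-replacement sequence above boils down to
something like this (a sketch only, with hypothetical device names; it
assumes every member of the vdev gets swapped for a larger device and
each resilver is allowed to finish first):

  # replace each member with a larger device, letting the resilver
  # complete in between ("zpool status" shows progress)
  zpool replace tank da1 da5
  zpool replace tank da2 da6
  # on FreeBSD 7.x the pool only picks up the new size after an
  # export/import cycle, as demonstrated above
  zpool export tank
  zpool import tank
  zfs list tank        # AVAIL should now reflect the larger devices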


Cheers,
Ulrich Spörlein
-- 
None are more hopelessly enslaved than those who falsely believe they are free
-- Johann Wolfgang von Goethe


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Bill Hacker

Jeremy Chadwick wrote:

On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:

On Sat, 21 Feb 2009 19:17:11 -0800, Jeremy Chadwick wrote:


*snip*


The one thing we all agree on is that, generally speaking, UFS isn't
cutting it.  :-)



*I* don't agree.

Mind, we don't currently run any single-store over 2 TB with it here, 
but there is a great deal of life left in FFS2/UFS2.


I don't view either ZFS or HAMMER as coming into being because UFS
doesn't work as a file system - but rather because UFS and the necessary
'add-ons' (+ snapshots + incremental backups + network bandwidth
limitations + peak loading + admin labor costs) could all be improved
upon.


IOW - the 'package'.

HAMMER - or ZFS - are sets of compromises - and all such things are 
compromises - that address that 'package' need  - itself dictated by the 
astonishing growth in drive sizes, and the even more rapid growth in the 
all-too-frequently outright garbage the human race chooses to save on 
those...


;-)

IOW HAMMER or ZFS or  [Linux flavor of the day fs] are 'nice to have 
maybe' today. The 'essential' part is still a ways off for most of us.


But coming fast.

Look, for example, at the data stream Dfarm was designed to capture...

Now think about a *serious* load. Perhaps Interactive gamer scoring...

:-)

Bill


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Bill Hacker

Michael Neumann wrote:

On Sat, 21 Feb 2009 19:17:11 -0800, Jeremy Chadwick wrote:


On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:

On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker 
wrote:

Hopefully more 'good stuff' will be ported out of Solaris before
it hits the 'too costly vs the alternatives' wall and is orphaned.

Btrfs has been merged into mainline Linux now, and although it's
pretty far behind ZFS in completeness at the moment, it represents a
far greater degree of flexibility and power. In a couple of years
when it's stable and user friendly, high-end storage solutions will
move back to Linux, after having given Sun a lot of contracts due
specifically to ZFS.

The fact that btrfs offers grow/shrink capability puts it ahead of ZFS
with regards to home users who desire a NAS.  I can't stress this
point enough.  ZFS's lack of this capability limits its scope.  As it
stands now, if you replace a disk with a larger one, you have to go
through this extremely fun process to make use of the new space
available:

- Offload all of your data somewhere (read: not "zfs export"); rsync
  is usually what people end up using -- if you have multiple ZFS
  filesystems, this can take some time
- zpool destroy
- zpool create
- zfs create

And if you add a new disk to the system, it's impossible to add that
disk to the existing pool -- you can, of course, create an entirely
new zpool which uses that disk, but that has nothing to do with the
existing zpool.  So you get to do the above dance.


Hm, I thought that would work easily with ZFS, and at least in theory I
think that should work well with ZFS. Or what is wrong with:

  zpool add tank /dev/ad8s1

Okay "zpool remove" doesn't seem to work as expected, but it should
work well at least for RAID-1 (which probably no one uses for large
storage systems ;-). Maybe "zfs replace" works, if you replace an old
disk, with a larger disk, and split it into two partitions, the one
equally sized to the old, and the other containing the remainder of the
space. Then do:

  zfs replace tank old_device new_device_equally_sized
  zfs add tank new_device_remainder

But you probably know more about ZFS than me ;-)

As for Hammer, I worked on some patches that will allow it to expand a
Hammer FS while mounted. It's actually very easy to implement (~100
LoC). And the shrink case should be at least in theory pretty easy to
implement, thanks to reblocking. So with very little work, we can make
Hammer grow/shrink natively (maybe it's in the next release). 


Regards,

  Michael


Side issue again - just brought up DFLY 2.3.0, default all-hammer layout.

- atop a natacontrol RAID1 on a pair of salvaged 60 GB IBM 'Deathstars'.

- VIA C7 1.5 GHz CPU (the el-cheapo MB aimed at Wal-Mart)

- pulled half the RAM, leaving only 1GB

Periodically, the aged IBMs sound like squirrels having a go in a
gravel pit, and Xfce4 is *very* slow, but it hadn't started swapping,
and based on what 'top' is showing it should actually work decently for
basic file, web, or mail serving, especially as OpenSSL recognizes the
VIA Padlock engine.  Plenty fast enough with 2GB, BTW.


ZFS is not famed for tolerating that meagre a resource ration as well
as HAMMER does.


Hardly a scientific test, but I can go off and grab lunch while 
OpenSolaris and ZFS boot to a GUI on 2 to 4 GB of RAM and a Core-D 2.6 
GHz and 'recent' WD SATA drives.


Not sure I have enough years left to try it on the VIA - at least 'til I 
can get the 64-bit Nano...


;-)

Bill


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Jeremy Chadwick
On Sun, Feb 22, 2009 at 01:36:28PM +0100, Michael Neumann wrote:
> On Sat, 21 Feb 2009 19:17:11 -0800, Jeremy Chadwick wrote:
> 
> > On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:
> > > On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker 
> > > wrote:
> > > > Hopefully more 'good stuff' will be ported out of Solaris before
> > > > it hits the 'too costly vs the alternatives' wall and is orphaned.
> > > 
> > > Btrfs has been merged into mainline Linux now, and although it's
> > > pretty far behind ZFS in completeness at the moment, it represents a
> > > far greater degree of flexibility and power. In a couple of years
> > > when it's stable and user friendly, high-end storage solutions will
> > > move back to Linux, after having given Sun a lot of contracts due
> > > specifically to ZFS.
> > 
> > The fact that btrfs offers grow/shrink capability puts it ahead of ZFS
> > with regards to home users who desire a NAS.  I can't stress this
> > point enough.  ZFS's lack of this capability limits its scope.  As it
> > stands now, if you replace a disk with a larger one, you have to go
> > through this extremely fun process to make use of the new space
> > available:
> > 
> > - Offload all of your data somewhere (read: not "zfs export"); rsync
> >   is usually what people end up using -- if you have multiple ZFS
> >   filesystems, this can take some time
> > - zpool destroy
> > - zpool create
> > - zfs create
> > 
> > And if you add a new disk to the system, it's impossible to add that
> > disk to the existing pool -- you can, of course, create an entirely
> > new zpool which uses that disk, but that has nothing to do with the
> > existing zpool.  So you get to do the above dance.
> 
> Hm, I thought that would work easily with ZFS, and at least in theory I
> think that should work well with ZFS. Or what is wrong with:
> 
>   zpool add tank /dev/ad8s1

This will only work how you expect if you're using a ZFS mirror.  With
RAIDZ, it doesn't work -- you're forced to add the new disk into a new
zpool.  This is one of the shortcomings of ZFS (and it is documented,
but only lightly so).
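
(For what it's worth, "zpool add" will accept the disk even on a raidz
pool, but it attaches it as a new top-level vdev striped alongside the
raidz vdev rather than growing the raidz itself -- roughly, with
hypothetical device names:

  zpool add tank ad8s1     # may need -f because the redundancy differs
  zpool status tank        # shows the raidz1 vdev plus the lone new disk

so the new disk carries no redundancy, which is usually not what people
want.)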

> Okay "zpool remove" doesn't seem to work as expected, but it should
> work well at least for RAID-1 (which probably no one uses for large
> storage systems ;-). Maybe "zfs replace" works, if you replace an old
> disk, with a larger disk, and split it into two partitions, the one
> equally sized to the old, and the other containing the remainder of the
> space. Then do:
> 
>   zfs replace tank old_device new_device_equally_sized
>   zfs add tank new_device_remainder
> 
> But you probably know more about ZFS than me ;-)

In this case, yes (that I know more about ZFS than you :-) ).  What
you're trying to do there won't work.

The "zfs" command manages filesystems (e.g. pieces under a zpool).  You
cannot do anything with devices (disks) with "zfs".  I think you mean
"zpool", especially since the only "replace" command is "zpool replace".

What you're trying to describe won't work, for the same reason I
described above (with your "zpool add tank ad8s1" command).  You can
split the disk into two pieces if you want, but it's not going to
change the fact that you cannot *grow* a zpool.  You literally have to
destroy it and recreate it for the pool to increase in size.

I've been through this procedure twice in the past year, as I replaced
250GB disks with 500GB, and then 500GB disks with 750GB.  It's a *huge*
pain, and I cannot imagine anyone in an enterprise environment using ZFS
to emulate a filer -- it simply won't work.  For individual servers
(where disks are going to remain the same size unless the box is
formatted, etc.), oh yes, ZFS is absolutely fantastic.

> As for Hammer, I worked on some patches that will allow it to expand a
> Hammer FS while mounted. It's actually very easy to implement (~100
> LoC). And the shrink case should be at least in theory pretty easy to
> implement, thanks to reblocking. So with very little work, we can make
> Hammer grow/shrink natively (maybe it's in the next release). 

That's very cool.  I'm hoping HAMMER takes off, although the popularity
of it would greatly improve if *BSD could be made to work with it.  I
know this has been mentioned in the past (and Matt has responded as to
why it would be difficult -- not impossible, just a lot of work), but
really all of the BSDs could benefit from something great and new.

The one thing we all agree on is that, generally speaking, UFS isn't
cutting it.  :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Jeremy Chadwick
On Sun, Feb 22, 2009 at 02:46:15PM +1100, Dmitri Nikulin wrote:
> On Sun, Feb 22, 2009 at 2:17 PM, Jeremy Chadwick  wrote:
> > I'm a little surprised to see BSD put alongside Linux when it comes to
> > "development pace".
> 
> Certainly, I don't mean that BSD is at all keeping pace with Linux in
> the grand scheme, but for small to medium spec server installations,
> FreeBSD in particular is very competitive. Having a stable btrfs will
> push Linux very far ahead again, in my opinion.

Understood, and agreed.  Thanks for clarifying -- I appreciate it.  :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-22 Thread Michael Neumann
On Sat, 21 Feb 2009 19:17:11 -0800, Jeremy Chadwick wrote:

> On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:
> > On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker 
> > wrote:
> > > Hopefully more 'good stuff' will be ported out of Solaris before
> > > it hits the 'too costly vs the alternatives' wall and is orphaned.
> > 
> > Btrfs has been merged into mainline Linux now, and although it's
> > pretty far behind ZFS in completeness at the moment, it represents a
> > far greater degree of flexibility and power. In a couple of years
> > when it's stable and user friendly, high-end storage solutions will
> > move back to Linux, after having given Sun a lot of contracts due
> > specifically to ZFS.
> 
> The fact that btrfs offers grow/shrink capability puts it ahead of ZFS
> with regards to home users who desire a NAS.  I can't stress this
> point enough.  ZFS's lack of this capability limits its scope.  As it
> stands now, if you replace a disk with a larger one, you have to go
> through this extremely fun process to make use of the new space
> available:
> 
> - Offload all of your data somewhere (read: not "zfs export"); rsync
>   is usually what people end up using -- if you have multiple ZFS
>   filesystems, this can take some time
> - zpool destroy
> - zpool create
> - zfs create
> 
> And if you add a new disk to the system, it's impossible to add that
> disk to the existing pool -- you can, of course, create an entirely
> new zpool which uses that disk, but that has nothing to do with the
> existing zpool.  So you get to do the above dance.

Hm, I thought that would work easily with ZFS, and at least in theory I
think that should work well with ZFS. Or what is wrong with:

  zpool add tank /dev/ad8s1

Okay "zpool remove" doesn't seem to work as expected, but it should
work well at least for RAID-1 (which probably no one uses for large
storage systems ;-). Maybe "zfs replace" works, if you replace an old
disk, with a larger disk, and split it into two partitions, the one
equally sized to the old, and the other containing the remainder of the
space. Then do:

  zfs replace tank old_device new_device_equally_sized
  zfs add tank new_device_remainder

But you probably know more about ZFS than me ;-)

As for Hammer, I worked on some patches that will allow it to expand a
Hammer FS while mounted. It's actually very easy to implement (~100
LoC). And the shrink case should be at least in theory pretty easy to
implement, thanks to reblocking. So with very little work, we can make
Hammer grow/shrink natively (maybe it's in the next release). 

Regards,

  Michael


Re: Hammer or ZFS based backup, encryption

2009-02-22 Thread Csaba Henk
On 2009-02-21, inva...@domain.invalid  wrote:
> On 2009-02-21, Csaba Henk  wrote:
>
> snip
>
>> I don't feel like delving into Solaris
>
> This is a little off topic...but
>
> I don't blame you as I have gone through installing and then ditching
> Solaris more than a few times myself. This last time with Solaris 10
> 10/08 I think it's going to be a keeper.
>
> I think it's worth running Solaris and the native ZFS. It won't cost
> you anything but time and some grey hairs. But may I suggest you
> download the Solaris installation check tool and burn a disc, boot it
> and see what it says about your drivers, before downloading the
> Solaris DVD...

I can imagine. Anyway, my attitude leans much more toward "quick and
dirty". So much the better if the setup I end up with turns out not to
deserve the latter attribute. But time and grey hairs -- I'm tight on
those resources...

Csaba


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-21 Thread Justin C. Sherrill
On Sat, February 21, 2009 10:17 pm, Jeremy Chadwick wrote:

> I'm a little surprised to see BSD put alongside Linux when it comes to
> "development pace".  Fact of the matter is, Linux wins when it comes to
> pace/speed of development -- they have more kernel folks, more eyes,
> a significantly larger user-base for testing, and absurd amounts of
> commercial backing/support from actual hardware vendors.  Sun has
> some of this, but it's gotten worse (IMHO) since they started going
> the x86 route.  BSD often trails behind Linux by 3-5 years.

I'll drag this even more off-topic and point out that thinking of the
progress of BSD systems compared to Linux (or Solaris or Windows etc.) as
a linear process, while valid, will never work out to anyone's
satisfaction.  Nobody likes playing permanent catch-up.

I agree with what Jeremy said, and think we need to work on what features
make BSD, and specifically DragonFly BSD, special.  Hammer is a big effort
in that direction.



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-21 Thread Dmitri Nikulin
On Sun, Feb 22, 2009 at 2:17 PM, Jeremy Chadwick  wrote:
> I'm a little surprised to see BSD put alongside Linux when it comes to
> "development pace".

Certainly, I don't mean that BSD is at all keeping pace with Linux in
the grand scheme, but for small to medium spec server installations,
FreeBSD in particular is very competitive. Having a stable btrfs will
push Linux very far ahead again, in my opinion.

-- 
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-21 Thread Jeremy Chadwick
On Sun, Feb 22, 2009 at 11:59:57AM +1100, Dmitri Nikulin wrote:
> On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker  wrote:
> > Hopefully more 'good stuff' will be ported out of Solaris before it hits the
> > 'too costly vs the alternatives' wall and is orphaned.
> 
> Btrfs has been merged into mainline Linux now, and although it's
> pretty far behind ZFS in completeness at the moment, it represents a
> far greater degree of flexibility and power. In a couple of years when
> it's stable and user friendly, high-end storage solutions will move
> back to Linux, after having given Sun a lot of contracts due
> specifically to ZFS.

The fact that btrfs offers grow/shrink capability puts it ahead of ZFS
with regards to home users who desire a NAS.  I can't stress this point
enough.  ZFS's lack of this capability limits its scope.  As it stands
now, if you replace a disk with a larger one, you have to go through
this extremely fun process to make use of the new space available:

- Offload all of your data somewhere (read: not "zfs export"); rsync
  is usually what people end up using -- if you have multiple ZFS
  filesystems, this can take some time
- zpool destroy
- zpool create
- zfs create

And if you add a new disk to the system, it's impossible to add that
disk to the existing pool -- you can, of course, create an entirely
new zpool which uses that disk, but that has nothing to do with the
existing zpool.  So you get to do the above dance.
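
Spelled out, that dance looks roughly like this (a sketch only, with
hypothetical pool, dataset, and disk names, and a scratch area large
enough to hold everything):

  rsync -aH /tank/ /scratch/tank/         # offload the data elsewhere
  zpool destroy tank
  zpool create tank raidz1 da1 da2 da3    # recreate on the larger disks
  zfs create tank/data                    # recreate any child filesystems
  rsync -aH /scratch/tank/ /tank/         # copy everything back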

I'll also point out that ZFS on FreeBSD (at least 7.x) performs very
differently than on Solaris 10.  We use Solaris 10 x86 + ZFS at my
workplace, and the overall usability of the system during heavy disk I/O
is much more refined (read: smooth) than on FreeBSD.  It's interesting
to do something like "zpool iostat 1" on FreeBSD compared to Solaris 10;
FreeBSD will show massive write bursts (e.g. 0MB, 0MB, 0MB, 70MB, 0MB,
0MB, 0MB, 67MB, etc.), while Solaris behaves more appropriately (50MB,
60MB, 70MB, 40MB, etc.).  "zpool scrub" is a great way to test this.
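
If you want to reproduce that comparison, something like the following
(assuming a pool named "tank") makes the pattern easy to see:

  zpool scrub tank          # generate sustained I/O across the pool
  zpool iostat -v tank 1    # per-vdev throughput, sampled every second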

> So Sun's options are to GPL/BSD ZFS and hope it gets used more so they
> have some relevance left, or somehow attempt to be able to compete
> with Linux and BSD's development pace and massive head start. I think
> it's pretty clear where that'll go.

I'm a little surprised to see BSD put alongside Linux when it comes to
"development pace".  Fact of the matter is, Linux wins when it comes to
pace/speed of development -- they have more kernel folks, more eyes,
a significantly larger user-base for testing, and absurd amounts of
commercial backing/support from actual hardware vendors.  Sun has
some of this, but it's gotten worse (IMHO) since they started going
the x86 route.  BSD often trails behind Linux by 3-5 years.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: OT - was Hammer or ZFS based backup, encryption

2009-02-21 Thread Dmitri Nikulin
On Sun, Feb 22, 2009 at 10:34 AM, Bill Hacker  wrote:
> Hopefully more 'good stuff' will be ported out of Solaris before it hits the
> 'too costly vs the alternatives' wall and is orphaned.

Btrfs has been merged into mainline Linux now, and although it's
pretty far behind ZFS in completeness at the moment, it represents a
far greater degree of flexibility and power. In a couple of years when
it's stable and user friendly, high-end storage solutions will move
back to Linux, after having given Sun a lot of contracts due
specifically to ZFS.

So Sun's options are to GPL/BSD ZFS and hope it gets used more so they
have some relevance left, or somehow attempt to be able to compete
with Linux and BSD's development pace and massive head start. I think
it's pretty clear where that'll go.

-- 
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia


Re: OT - was Hammer or ZFS based backup, encryption

2009-02-21 Thread Bill Hacker

inva...@domain.invalid wrote:

On 2009-02-21, Csaba Henk  wrote:

snip


I don't feel like delving into Solaris


This is a little off topic...but

I don't blame you as I have gone through installing and then ditching
Solaris more than a few times myself. This last time with Solaris 10
10/08 I think it's going to be a keeper.

I think it's worth running Solaris and the native ZFS. It won't cost
you anything but time and some grey hairs. But may I suggest you
download the Solaris installation check tool and burn a disc, boot it
and see what it says about your drivers, before downloading the
Solaris DVD...


Oh I *have* got it running. But on Core-2 Quad, not AMD-64.

Try as they might, by the time Sun get Solaris back-tailored to, and
re-optimized for, 32-bit++, vs massively threaded 64-bit T1 and
successors, the 'competition' will have gained another parsec.


Think it through:

IBM AIX, zOS, and/or Linux on Power5&6 have taken the high(est) ground 
away from Sun.


HP-UX on Itanic revanche (pa-risc was a better chip..but never mind..)
are close behind, and have taken the upper-middle - and downwards...

Worse for Sun - HP have woken up to the IBM 'service not hardware'
model and are chasing Sun out of *that* business.


AMD-64 - running 4 zillion different Linux distros, and 3 of the 4
major *BSDs - has the 'middle', under Dell, any other hardware brand,
or no brand at all.


If 2007-8 was DragonFly's year-of-the-HAMMER, 2009-10 may well be the
year DFLY gets serious about 64-bit SMP, making it 4 for 4.


(I have no 'inside' information - but HAMMER is stable enough to allow 
the focus to shift, and Matt has just got to be bone-weary of it and in 
need of a mental break...)


Intel Core-2 and successors have the desktops, gamers and other 
WinMadness, and low-end servers upwards as far as tight monetary budgets 
can reach, (and there is never a shortage of tight budgets).


ARM and friends have the low-end.

The walls have been closing in on Sun for quite some time now.

Hopefully more 'good stuff' will be ported out of Solaris before it hits 
the 'too costly vs the alternatives' wall and is orphaned.


As it will be. 'wall' that.

Yesss. My old friend HP MPE-3000 refuses to die .. but mainstream it no 
longer is. Nor Irix... So too Solaris, and in not a lot more time.


http://finance.yahoo.com/q/ks?s=JAVA

How long can Sun show a gross profit of half their gross revenue ... yet 
post that sort of per-share loss?


Bill


Re: Hammer or ZFS based backup, encryption

2009-02-21 Thread Bill Hacker

Csaba Henk wrote:

Hi,

I need to set up a backup machine, and I intend to utilize today's
snapshotty filesystems (which boils down to Dfly+Hammer or FBSD+ZFS --
btrfs is not there yet, and I don't feel like delving into Solaris).
Set up such an OS with such an fs, and back up by syncing to the
snapshotty fs and creating a snapshot.

I wonder about the following things:

1) Any idea how this approach scales relative to more conventional
solutions, like rdiff-backup or dump(8)? I see the pros, but are there
any cons? How effective is taking regular snapshots, space-wise?


The advantage to snapshot-as-you-go is that it 'mostly' has just the 
rate-of-change to deal with - not the scanning or (external) DB'ifying 
of the whole mass of data. Easier to be near current if you don't have 
to keep digging through the big chunks to cope with a few bytes of change.
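
In practice the per-cycle work is just the delta plus one cheap
snapshot; e.g. with ZFS (a sketch, with hypothetical host and dataset
names -- the HAMMER version is analogous):

  rsync -aH --delete hostA:/data/ /tank/backups/hostA/
  zfs snapshot tank/backups/hostA@`date +%Y%m%d`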


HAMMER has the advantage there in that its basic structure is
TID-retentive. Downside is that reblock/prune are out-of-band
operations.
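
Those out-of-band passes are ordinary hammer(8) directives, normally
driven from cron -- a sketch, assuming a HAMMER filesystem mounted at
/home with a /home/snaps directory for the snapshot softlinks:

  hammer snapshot /home /home/snaps/daily-`date +%Y%m%d`
  hammer prune /home/snaps     # trim history per the snapshot softlinks
  hammer reblock /home         # repack/defragment to recover space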


ZFS, AFAICS, wants a great deal of RAM to work effectively with very 
large file systems. HAMMER seems *way* less greedy in that regard - 
OTOH, HAMMER really needs large *disks*.




2) Is there any practical argument for choosing between Dfly+Hammer and
FBSD+ZFS? (Feel free to give biased answers :) )



Biased?  51 years into this game, I have no other kind.

On AMD-64, UltraSPARC, or Itanium, you can handle really large memory.
On Intel lets-pretend-we-are-64-bit it is more challenging. Not that it
is easy to find MBs that support both anyway...


So ... IF I were to run ZFS, I'd probably bite the bullet and learn to
put up with Solaris-on-SPARC, AND NOT Solaris-on-Intel.


AFAICS, the FreeBSD port of ZFS has gotten quite good. But Solaris is
where ZFS was grown, and where it has the best 'fit' and integration,
and the shortest list of out-of-step-with-the-rest-of-the-environment
items. ACLs, for example.



3) I'd like to encrypt stuff, either at device or fs level. For
FreeBSD there is geli(8). I haven't found anything for DragonFly.
Is there any way to get at it on DragonFly?

Thanks,
Csaba


Any fan-in/fan-out fs environment should be able to interpose an 
encryption layer between VFS and media by use of a loopback or nullfs 
method.


'Should', 'in the ports', and 'proven not to break' are not synonymous.
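
On the FreeBSD side the usual recipe is geli(8) underneath the pool or
filesystem -- a minimal sketch, with a hypothetical disk da1:

  geli init -s 4096 /dev/da1       # set up the provider (asks for a passphrase)
  geli attach /dev/da1             # creates /dev/da1.eli
  zpool create tank /dev/da1.eli   # build the pool on the encrypted provider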

Soon having to make a similar choice (and still looking at Gfarm,
Gluster, DFarm, Ceph, Chiron et al.)...


- ZFS doesn't offer anything I actually need - or at least, not at the 
'price' it entails.


- HAMMER does. And resource-cheaply. Large HDDs are way cheaper than
large RAM once the switch to a powerful 64-bit CPU, and the associated
MB to hold and use it effectively, is factored in.


HAMMER, OTOH, seems quite happy with a VIA C7, and very modest RAM.

- I expect to 'settle' on DragonFlyBSD for the next set of production
'twins', and accept (for now) roughly half the overall throughput
FreeBSD could deliver, as we are network-bandwidth and UPS-budget
constrained anyway - not CPU or I/O bound.


But a DFLY choice is not *just* to have the hammerfs - but also because 
DFLY has more modern disk slicing and partitioning capability.


YMMV,

Bill


Re: Hammer or ZFS based backup, encryption

2009-02-21 Thread invalid
On 2009-02-21, Csaba Henk  wrote:

snip

> I don't feel like delving into Solaris

This is a little off topic...but

I don't blame you as I have gone through installing and then ditching
Solaris more than a few times myself. This last time with Solaris 10
10/08 I think it's going to be a keeper.

I think it's worth running Solaris and the native ZFS. It won't cost
you anything but time and some grey hairs. But may I suggest you
download the Solaris installation check tool and burn a disc, boot it
and see what it says about your drivers, before downloading the
Solaris DVD...


Hammer or ZFS based backup, encryption

2009-02-21 Thread Csaba Henk
Hi,

I need to set up a backup machine, and I intend to utilize today's
snapshotty filesystems (which boils down to Dfly+Hammer or FBSD+ZFS --
btrfs is not there yet, and I don't feel like delving into Solaris).
Set up such an OS with such an fs, and back up by syncing to the
snapshotty fs and creating a snapshot.

I wonder about the following things:

1) Any idea how this approach scales relative to more conventional
solutions, like rdiff-backup or dump(8)? I see the pros, but are there
any cons? How effective is taking regular snapshots, space-wise?

2) Is there any practical argument for choosing between Dfly+Hammer and
FBSD+ZFS? (Feel free to give biased answers :) )

3) I'd like to encrypt stuff, either at device or fs level. For
FreeBSD there is geli(8). I haven't found anything for DragonFly.
Is there any way to get at it on DragonFly?

Thanks,
Csaba