[zfs-discuss] zfs send/receive of an entire pool

2008-01-17 Thread James Andrewartha
Hi,

I have a zfs filesystem that I'd like to move to another host. It's part
of a pool called space, which is mounted at /space and has several child
filesystems. The first hurdle I came across was that zfs send only works
on snapshots, so I created one:
# zfs snapshot -r [EMAIL PROTECTED]
# zfs list -t snapshot  
NAME USED  AVAIL  REFER MOUNTPOINT
[EMAIL PROTECTED]   0  -  25.9G  -
space/[EMAIL PROTECTED]0  -31K  -
space/[EMAIL PROTECTED]   924K  -  52.4G  -
space/[EMAIL PROTECTED]   0  -38K  -
space/freebsd/[EMAIL PROTECTED]  0  -36K  -
space/freebsd/[EMAIL PROTECTED]   0  -  4.11G  -
space/[EMAIL PROTECTED]  0  -  47.6G  -
space/[EMAIL PROTECTED]352K  -  14.7G  -
space/netboot/[EMAIL PROTECTED]   0  -  95.5M  -
space/netboot/manduba-freebsd/[EMAIL PROTECTED]  0  -36K  -
space/netboot/manduba-freebsd/[EMAIL PROTECTED]  0  -   327M  -
space/netboot/manduba-freebsd/[EMAIL PROTECTED]  0  -36K  -
space/[EMAIL PROTECTED]   234K  -   167G  -

On the destination, I have created a zpool, again called space and
mounted at /space. However, I can't work out how to send [EMAIL PROTECTED]
to the new machine:
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn -d space
cannot receive: destination 'space' exists
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn space
cannot receive: destination 'space' exists
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn space2
cannot receive: destination does not exist
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn space/space2
would receive full stream of [EMAIL PROTECTED] into space/[EMAIL PROTECTED]
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn [EMAIL PROTECTED]
cannot receive: destination 'space' exists
# zfs send [EMAIL PROTECTED] | ssh musundo zfs recv -vn [EMAIL PROTECTED]
cannot receive: destination does not exist

What am I missing here? I can't recv to space, because it exists, but I
can't make it not exist since it's the root filesystem of the pool. Do I
have to send each filesystem individually and rsync up the root fs?

Thanks,

James Andrewartha




Re: [zfs-discuss] zfs send/receive of an entire pool

2008-01-17 Thread James Andrewartha
On Thu, 2008-01-17 at 09:29 -0800, Richard Elling wrote:
 You don't say which version of ZFS you are running, but what you
 want is the -R option for zfs send.  See also the example of send
 usage in the zfs(1m) man page.

Sorry, I'm running SXCE nv75. I can't see any mention of send -R in the
man page. Ah, it's PSARC/2007/574 and nv77. I'm not convinced it'll
solve my problem (sending the root filesystem of a pool), but I'll
upgrade and give it a shot.

Thanks,

James Andrewartha


Re: [zfs-discuss] ? Removing a disk from a ZFS Storage Pool

2008-02-07 Thread James Andrewartha
Dave Lowenstein wrote:
 Couldn't we move fixing "panic the system if it can't find a LUN" up to 
 the front of the line? That one really sucks.

That's controlled by the failmode property of the zpool, added in PSARC 
2007/567 which was integrated in b77.
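
For anyone on b77 or later, the property accepts wait (the default), continue 
or panic, and is set like any other pool property (the pool name here is just 
an example):

zpool set failmode=continue tank
zpool get failmode tank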

-- 
James Andrewartha


Re: [zfs-discuss] zfs send/receive of an entire pool

2008-02-14 Thread James Andrewartha
James Andrewartha wrote:
 On Thu, 2008-01-17 at 09:29 -0800, Richard Elling wrote:
 You don't say which version of ZFS you are running, but what you
 want is the -R option for zfs send.  See also the example of send
 usage in the zfs(1m) man page.
 
 Sorry, I'm running SXCE nv75. I can't see any mention of send -R in the
 man page. Ah, it's PSARC/2007/574 and nv77. I'm not convinced it'll
 solve my problem (sending the root filesystem of a pool), but I'll
 upgrade and give it a shot.

It did in fact do exactly what I wanted. For the record, here are the 
commands I used:
zfs snapshot -r [EMAIL PROTECTED]
zfs send -R [EMAIL PROTECTED] | ssh musundo zfs recv -vFd space

And later, to catch up further changes:
zfs snapshot -r [EMAIL PROTECTED]
zfs send -Ri @musundo [EMAIL PROTECTED] | ssh musundo zfs recv -vFd space

In both cases the -F was necessary.
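
(The list archive munges anything containing an @, so the snapshot names 
above have been eaten. Spelled out with placeholder names - transfer1 and 
transfer2 stand in for whatever you call your snapshots - the two steps are:)

zfs snapshot -r space@transfer1
zfs send -R space@transfer1 | ssh musundo zfs recv -vFd space

zfs snapshot -r space@transfer2
zfs send -R -i space@transfer1 space@transfer2 | ssh musundo zfs recv -vFd space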

-- 
James Andrewartha


Re: [zfs-discuss] ZFS Project Hardware

2008-05-28 Thread James Andrewartha
Erik Trimble wrote:
 On a related note - does anyone know of a good Solaris-supported 4+ port 
 SATA card for PCI-Express?  Preferably 1x or 4x slots...

From what I can tell, all the vendors are only making SAS controllers for 
PCIe with more than 4 ports. Since SAS supports SATA, I guess they don't see 
much point in doing SATA-only controllers.

For example, the LSI SAS3081E-R is $260 for 8 SAS ports on 8x PCIe, which is 
somewhat more expensive than the almost equivalent PCI-X LSI SAS3080X-R 
which is as low as $180.

For those downthread looking for full RAID controllers with battery-backed 
RAM, Areca (who formerly specialised in SATA controllers) now do SAS RAID at 
reasonable prices, and have Solaris drivers.

-- 
James Andrewartha


Re: [zfs-discuss] ZFS Project Hardware

2008-05-28 Thread James Andrewartha
On Wed, 2008-05-28 at 10:34 -0600, Keith Bierman wrote:
 On May 28, 2008, at 10:27 AM, Richard Elling wrote:
 
  Since the mechanics are the same, the difference is in the electronics
 
 
 In my very distant past, I did QA work for an electronic component  
 manufacturer. Even parts which were identical were expected to  
 behave quite differently ... based on population statistics. That is,  
 the HighRel MilSpec parts were from batches with no failures (even  
 under very harsh conditions beyond the normal operating mode, and all  
 tests to destruction showed only the expected failure modes) and the  
 hobbyist grade components were those whose cohort *failed* all the  
 testing (and destructive testing could highlight abnormal failure  
 modes).
 
 I don't know that drive builders do the same thing, but I'd kinda  
 expect it.

Seagate's ES.2 has a higher MTBF than the equivalent consumer drive, so
you're probably right. Western Digital's RE2 series (which my work uses)
comes with a 5-year warranty, compared to 3 years for the consumer
versions. The RE2 drives also have firmware with Time-Limited Error Recovery,
which reports errors promptly, letting the higher-level RAID do data
recovery. Both have improved vibration tolerance through firmware
tweaks. And if you want 10krpm, I think WD's VelociRaptor counts.
http://www.techreport.com/articles.x/13732
http://www.techreport.com/articles.x/13253
http://www.techreport.com/articles.x/14583
http://www.storagereview.com/ is promising some SSD benchmarks soon.

James Andrewartha


Re: [zfs-discuss] zfs send and recordsize

2008-06-25 Thread James Andrewartha
Peter Boros wrote:
 I perform a snapshot and a zfs send on a filesystem with a recordsize  
 of 16k, and redirect the output to a plain file. Later, I use cat  
 sentfs | zfs receive otherpool/filesystem. In this case the new  
 filesystem's recordsize will be the default 128k again. The other  
 filesystem attributes (for example atime) are reverted to defaults  
 too. Okay, I can set these later, but I can't set the recordsize for  
 existing files. Are there any solutions for this problem? This is the  
 case on Solaris 10u5 and on Nevada b91 too.

My impression is that you should change the recordsize on the first filesystem 
before performing the zfs send; it will then be used for all files when you 
receive the filesystem. I haven't tested this with recordsize, but I did with 
compression, and I imagine recordsize (and other properties) will behave the 
same way.
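
A quick way to check would be to set the property, copy a few files in, then 
send and compare (the filesystem and snapshot names here are only examples):

zfs set recordsize=16k pool/src
zfs snapshot pool/src@test
zfs send pool/src@test | zfs receive otherpool/dst
zfs get recordsize otherpool/dst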

-- 
James Andrewartha


Re: [zfs-discuss] zfs send and recordsize

2008-06-25 Thread James Andrewartha
Peter Boros wrote:
 Hi James,
 
 Of course, changing the recordsize was the first thing I did, after I 
 created the original filesystem. I copied some files on it, made a 
 snapshot, and then performed the zfs send (with the decreased 
 recordsize). After I performed a zfs receive, the recordsize was the 
 default (128k) on the new filesystem.

Ah, I was using the -R option to zfs send, which does what you want. It's 
been in since nv77, PSARC/2007/574. To quote the manpage:
  -R   Generate a replication stream package, which will replicate
       the specified filesystem, and all descendant file systems, up
       to the named snapshot. When received, all properties,
       snapshots, descendent file systems, and clones are preserved.

       If the -i or -I flags are used in conjunction with the -R
       flag, an incremental replication stream is generated. The
       current values of properties, and current snapshot and file
       system names are set when the stream is received. If the -F
       flag is specified when this stream is received, snapshots and
       file systems that do not exist on the sending side are
       destroyed.
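
With your write-to-a-file workflow that would look something like this (the 
filesystem and snapshot names are only examples):

zfs snapshot pool/fs@copy
zfs send -R pool/fs@copy > sentfs
cat sentfs | zfs receive otherpool/fs
zfs get recordsize,atime otherpool/fs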

-- 
James Andrewartha


Re: [zfs-discuss] ZFS over multiple iSCSI targets

2008-09-10 Thread James Andrewartha
Tuomas Leikola wrote:
 On Mon, Sep 8, 2008 at 8:35 PM, Miles Nordin [EMAIL PROTECTED] wrote:
ps iSCSI with respect to write barriers?

 +1.

 Does anyone even know of a good way to actually test it?  So far it
 seems the only way to know if your OS is breaking write barriers is to
 trade gossip and guess.
 
 Write a program that writes backwards (every other block to avoid
 write merges) with and without O_DSYNC, measure speed.
 
 I think you can also deduce driver and drive cache flush correctness
 by calculating the best theoretical correct speed (which should be
 really slow, one write per disc spin)
 
 this has been on my TODO list for ages.. :(

Does the perl script at http://brad.livejournal.com/2116715.html do what you
want?

-- 
James Andrewartha


Re: [zfs-discuss] web interface not showing up

2008-09-24 Thread James Andrewartha
mike wrote:
 On Sun, Sep 21, 2008 at 11:49 PM, Volker A. Brandt [EMAIL PROTECTED] wrote:
 
 Hmmm... I run Solaris 10/sparc U4.  My /usr/java points to
 jdk/jdk1.5.0_16.  I am using Firefox 2.0.0.16.  Works For Me(TM)  ;-)
 Sorry, can't help you any further.  Maybe a question for desktop-discuss?
 
 it's a java error on the server side, not client side (although there
 is a javascript error in every browser i tried it in, but probably
 unrelated or an error due to the java not executing properly)
 
 anyway - you did help me at least get the webconsole running. the zfs
 admin piece of it though is throwing the java error...

Can you post the Java error to the list? Do you have gzip compression or
aclinherit properties set on your filesystems, hitting bug 6715550?
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-June/048457.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-June/048550.html
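
A quick way to check is to ask for both properties across the pool (replace 
tank with your pool name):

zfs get -r compression,aclinherit tank

Anything showing gzip compression or a non-default aclinherit would point at 
that bug.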

-- 
James Andrewartha


[zfs-discuss] rename(2), atomicity, crashes and fsync()

2009-03-17 Thread James Andrewartha
Hi all,

Recently there's been discussion [1] in the Linux community about how
filesystems should deal with rename(2), particularly in the case of a crash.
ext4 was found to truncate files after a crash if they had been written with
open(foo.tmp), write(), close() and then rename(foo.tmp, foo). This is
because ext4 uses delayed allocation and may not write the contents to disk
immediately, but commits metadata changes quite frequently. So when
rename(foo.tmp, foo) is committed to disk, foo has a length of zero, which is
only later updated when the data is written to disk. This means that after a
crash, foo is zero-length, and both the new and the old data have been lost,
which is undesirable. This doesn't happen with ext3's default settings,
because ext3 writes data to disk before metadata (which has its own
performance problems; see Firefox 3 and fsync [2]).

Ted Ts'o's (the maintainer of ext3 and ext4) response is that applications
which perform open(),write(),close(),rename() in the expectation that they
will either get the old data or the new data, but not no data at all, are
broken, and instead should call open(),write(),fsync(),close(),rename().
Most other people are arguing that POSIX says rename(2) is atomic, and while
POSIX doesn't specify crash recovery, returning no data at all after a crash
is clearly wrong, and excessive use of fsync is overkill and
counter-productive (Ted later proposes a yes-I-really-mean-it flag for
fsync). I've omitted a lot of detail, but I think this is the core of the
argument.

Now the question I have is: how does ZFS deal with
open(),write(),close(),rename() in the case of a crash? Will it always
return the new data or the old data, or will it sometimes return no data? Is
returning no data defensible, either under POSIX or common sense? Comments
about other filesystems, eg UFS are also welcome. As a counter-point, XFS
(written by SGI) is notorious for data-loss after a crash, but its authors
defend the behaviour as POSIX-compliant.

Note this is purely a technical discussion - I'm not interested in replies
saying ?FS is a better filesystem in general, or on GPL vs CDDL licensing.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781?comments=all
http://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/
http://lwn.net/Articles/323169/
http://mjg59.livejournal.com/108257.html http://lwn.net/Articles/323464/
http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
http://lwn.net/Articles/323752/ *
http://lwn.net/Articles/322823/ *
* are currently subscriber-only, email me for a free link if you'd like to
read them
[2] http://lwn.net/Articles/283745/

-- 
James Andrewartha | Sysadmin
Data Analysis Australia Pty Ltd | STRATEGIC INFORMATION CONSULTANTS
97 Broadway, Nedlands, Western Australia, 6009
PO Box 3258, Broadway Nedlands, WA, 6009
T: +61 8 9386 3304 | F: +61 8 9386 3202 | I: http://www.daa.com.au


Re: [zfs-discuss] [storage-discuss] Supermicro AOC-SASLP-MV8

2009-04-22 Thread James Andrewartha
myxi...@googlemail.com wrote:
 Bouncing a thread from the device drivers list:
 http://opensolaris.org/jive/thread.jspa?messageID=357176
 
 Does anybody know if OpenSolaris will support this new Supermicro card,
 based on the Marvell 88SE6480 chipset? It's a true PCI Express 8 port
 JBOD SAS/SATA controller with pricing apparently around $125.
 
 If it works with OpenSolaris it sounds pretty much perfect.

The Linux support for the 6480 builds on the 6440 mvsas support, so I don't
think marvell88sx would work, and there doesn't seem to be a Marvell SAS
driver for Solaris at all, so I'd say it's not supported.
http://www.hardforum.com/showthread.php?t=1397855 has a fair few people
testing it out, but mostly under Windows.

-- 
James Andrewartha


Re: [zfs-discuss] Why is Solaris 10 ZFS performance so terrible?

2009-07-07 Thread James Andrewartha
Joerg Schilling wrote:
 I would be interested to see an open(2) flag that tells the system that I will
 read a file that I opened exactly once in native order. This could tell the 
 system to do read ahead and to later mark the pages as immediately reusable. 
 This would make star even faster than it is now.

Are you aware of posix_fadvise(2) and madvise(2)?

-- 
James Andrewartha


Re: [zfs-discuss] surprisingly poor performance

2009-07-07 Thread James Andrewartha
James Lever wrote:
 We also have a PERC 6/E w/512MB BBWC to test with or fall back to if we
 go with a Linux solution.

Have you tried putting the slog on this controller, either as an SSD or
regular disk? It's supported by the mega_sas driver, x86 and amd64 only.

-- 
James Andrewartha | Sysadmin
Data Analysis Australia Pty Ltd


Re: [zfs-discuss] surprisingly poor performance

2009-07-08 Thread James Andrewartha
James Lever wrote:
 
 On 07/07/2009, at 8:20 PM, James Andrewartha wrote:
 
 Have you tried putting the slog on this controller, either as an SSD or
 regular disk? It's supported by the mega_sas driver, x86 and amd64 only.
 
 What exactly are you suggesting here?  Configure one disk on this array
 as a dedicated ZIL?  Would that improve performance any over using all
 disks with an internal ZIL?

I was mainly thinking about using the battery-backed write cache to
eliminate the NFS latency. There's not much difference between an internal
and a dedicated ZIL if the disks are the same and on the same controller -
dedicated ZIL wins come from using SSDs and battery-backed cache.
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices
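
If you do want to test a dedicated slog on the PERC, adding one is a 
one-liner - the pool and device names below are only examples - but note that 
as far as I know a log device can't currently be removed once added, so try 
it on a scratch pool first:

zpool add tank log c1t2d0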

 Is there a way to disable the write barrier in ZFS in the way you can
 with Linux filesystems (-o barrier=0)?  Would this make any difference?

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
might help if the RAID card is still flushing to disk when ZFS asks it to,
even though the data is already safe in the battery-backed cache.
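
If I'm remembering the Evil Tuning Guide correctly, the relevant tunable is 
zfs_nocacheflush, set in /etc/system and picked up at the next reboot - but 
only do this when every device behind the pool has non-volatile cache:

set zfs:zfs_nocacheflush = 1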

-- 
James Andrewartha | Sysadmin
Data Analysis Australia Pty Ltd


Re: [zfs-discuss] Finding SATA cards for ZFS; was Lundman home NAS

2009-09-01 Thread James Andrewartha

Jorgen Lundman wrote:
The mv8 is a marvell based chipset, and it appears there are no 
Solaris drivers for it.  There doesn't appear to be any movement from 
Sun or marvell to provide any either.


Do you mean specifically Marvell 6480 drivers? I use both DAC-SATA-MV8 
and AOC-SAT2-MV8, which use Marvell MV88SX chips and work very well in 
Solaris. (Package SUNWmv88sx).


They're PCI-X SATA cards; the AOC-SASLP-MV8 is a PCIe SAS card and has no 
(Open)Solaris driver.


--
James Andrewartha


[zfs-discuss] Sun Flash Accelerator F20

2009-09-24 Thread James Andrewartha
I'm surprised no-one else has posted about this - part of the Sun Oracle 
Exadata v2 is the Sun Flash Accelerator F20 PCIe card, with 48 or 96 GB of 
SLC, a built-in SAS controller and a super-capacitor for cache protection. 
http://www.sun.com/storage/disk_systems/sss/f20/specs.xml


There's no pricing on the webpage though - does anyone know how it compares 
in price to a logzilla?


--
James Andrewartha


Re: [zfs-discuss] zfs inotify?

2009-11-11 Thread James Andrewartha

Carson Gaspar wrote:

On 10/26/09 5:33 PM, p...@paularcher.org wrote:
I can't find much on gam_server on Solaris (couldn't find too much on it
at all, really), and port_create is apparently a system call. (I'm not a
developer--if I can't write it in BASH, Perl, or Ruby, I can't write it.)
I appreciate the suggestions, but I need something a little more
pret-a-porter.


Your Google-fu needs work ;-)

Main Gamin page: http://www.gnome.org/~veillard/gamin/index.html


Actually, I found this page, which has this gem: "At this point Gamin is
fairly tied to Linux, portability is not a primary goal at this stage but
if you have portability patches they are welcome."


Much has changed since that text was written, including support for the 
event completion framework (port_create() and friends, introduced with 
Sol 10) on Solaris, thus the recommendation for gam_server / gamin.


$ nm /usr/lib/gam_server | grep port_create
[458]   | 134589544| 0|FUNC |GLOB |0|UNDEF  |port_create


The patch for port_create has never gone upstream however, while gvfs uses 
glib's gio, which has backends for inotify, solaris, fam and win32.


--
James Andrewartha


Re: [zfs-discuss] freeNAS moves to Linux from FreeBSD

2009-12-09 Thread James Andrewartha

Bob Friesenhahn wrote:

On Mon, 7 Dec 2009, Michael DeMan (OA) wrote:


Args for FreeBSD + ZFS:

- Limited budget
- We are familiar with managing FreeBSD.
- We are familiar with tuning FreeBSD.
- Licensing model

Args against OpenSolaris + ZFS:
- Hardware compatibility
- Lack of knowledge for tuning and associated costs for training staff 
to learn 'yet one more operating system' they need to support.

- Licensing model


If you think about it a little bit, you will see that there is no 
significant difference in the licensing model between FreeBSD+ZFS and 
OpenSolaris+ZFS.  It is not possible to be a little bit pregnant. 
Either one is pregnant, or one is not.


There is a huge difference practically - OpenSolaris has no free security 
updates for stable releases, unlike FreeBSD. And I'm sure you don't 
recommend running /dev in production.


This is off-topic, and isn't specifically related to CDDL vs BSD, just how 
Sun chooses to do things. Sure, there have been claims (since before 
2008.05) that it might happen some day, but until 2009.06 users can freely 
get a non-vulnerable Firefox or Samba, or fixes for various network kernel 
panics, the claims are meaningless.


http://mail.opensolaris.org/pipermail/opensolaris-help/2009-November/015824.html

--
James Andrewartha