Re: [zfs-discuss] SVM ZFS

2013-02-27 Thread Darren J Moffat



On 02/26/13 20:30, Morris Hooten wrote:

Besides copying data from /dev/md/dsk/x volume manager filesystems to
new zfs filesystems
does anyone know of any zfs conversion tools to make the
conversion/migration from svm to zfs
easier?


With Solaris 11 you can use shadow migration. It is really a VFS layer 
feature, but it is integrated into the ZFS CLI tools for ease of use:


# zfs create -o shadow=file:///path/to/old  mypool/new

The new filesystem will appear to instantly have all the data, and it 
will be copied over as it is accessed, with shadowd also pulling it over 
in advance.


You can use shadowstat(1M) to show progress.
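
For example, to keep an eye on a migration kicked off as above (a 
sketch; the exact output format varies by release):

# shadowstat
(it reports, for each migration in progress, bytes transferred and an 
estimate of bytes remaining)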

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Bp rewrite

2013-02-15 Thread Darren J Moffat



On 02/15/13 14:39, Tyler Walter wrote:

As someone who has zero insider information and feels that there isn't
much push at oracle to develop or release new zfs features, I have to
assume it's not coming. The only way I see it becoming a reality is if
someone in the illumos community decides to do the work required to put
it in.


You obviously missed the thread we had recently about the new ZFS 
features that Solaris 11 and 11.1 have.  ZFS is very much under active 
feature, bugfix and performance development at Oracle for current and 
future versions of Solaris and the ZFS Storage Appliance.


BP rewrite is actually very complex to do correctly and safely - if it 
wasn't I'm sure it would have been done by now by multiple people!


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Freeing unused space in thin provisioned zvols

2013-02-12 Thread Darren J Moffat



On 02/10/13 12:01, Koopmann, Jan-Peter wrote:

Why should it?

Unless you do a shrink on the vmdk and use a zfs variant with scsi unmap 
support (I believe currently only Nexenta but correct me if I am wrong) the 
blocks will not be freed, will they?


Solaris 11.1 has ZFS with SCSI UNMAP support.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Freeing unused space in thin provisioned zvols

2013-02-12 Thread Darren J Moffat



On 02/12/13 15:07, Thomas Nau wrote:

Darren

On 02/12/2013 11:25 AM, Darren J Moffat wrote:



On 02/10/13 12:01, Koopmann, Jan-Peter wrote:

Why should it?

Unless you do a shrink on the vmdk and use a zfs variant with scsi
unmap support (I believe currently only Nexenta but correct me if I am
wrong) the blocks will not be freed, will they?


Solaris 11.1 has ZFS with SCSI UNMAP support.



Seem to have skipped that one... Are there any related tools e.g. to
release all zero blocks or the like? Of course it's up to the admin
then to know what all this is about or to wreck the data


No tools; ZFS does it automatically when freeing blocks, provided the 
underlying device advertises the functionality.


ZFS ZVOLs shared over COMSTAR advertise SCSI UNMAP as well.
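
As a sketch of that path (pool, volume and LU names here are 
illustrative, not from this thread): create a thin (sparse) ZVOL and 
export it as a SCSI logical unit, and initiators that issue UNMAP will 
free blocks in the volume:

# zfs create -s -V 100g tank/lun0
# stmfadm create-lu /dev/zvol/rdsk/tank/lun0
# stmfadm add-view 600144F0...
(add-view takes the LU GUID printed by create-lu)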

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-24 Thread Darren J Moffat



On 01/24/13 00:04, Matthew Ahrens wrote:

On Tue, Jan 22, 2013 at 5:29 AM, Darren J Moffat
darr...@opensolaris.org mailto:darr...@opensolaris.org wrote:

Preallocated ZVOLs - for swap/dump.


Darren, good to hear about the cool stuff in S11.

Just to clarify, is this preallocated ZVOL different than the
preallocated dump which has been there for quite some time (and is in
Illumos)?  Can you use it for other zvols besides swap and dump?


It is the same but we are using it for swap now too.  It isn't available 
for general use.



Some background:  the zfs dump device has always been preallocated
(thick provisioned), so that we can reliably dump.  By definition,
something has gone horribly wrong when we are dumping, so this code path
needs to be as small as possible to have any hope of getting a dump.  So
we preallocate the space for dump, and store a simple linked list of
disk segments where it will be stored.  The dump device is not COW,
checksummed, deduped, compressed, etc. by ZFS.


For the sake of others (I know you know this Matt), the dump system does 
the compression so ZFS doesn't need to.



In Illumos (and S10), swap was treated more or less like a regular zvol.
  This leads to some tricky code paths because ZFS allocates memory from
many points in the code as it is writing out changes.  I could see
advantages to the simplicity of a preallocated swap volume, using the
same code that already existed for preallocated dump.  Of course, the
loss of checksumming and encryption is much more of a concern with swap
(which is critical for correct behavior) than with dump (which is nice
to have for debugging).


We have encryption for dump because it is hooked in to the zvol code.

For encrypting swap Illumos could do the same as Solaris 11 does and use 
lofi.  I changed swapadd so that if encryption is specified in the 
options field of the vfstab entry it creates a lofi shim over the swap 
device using 'lofiadm -e'.  This provides you encrypted swap regardless 
of what the underlying disk is (normal ZVOL, prealloc ZVOL, real disk 
slice, SVM mirror etc).
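
A sketch of the moving parts (device paths are illustrative, and the 
exact vfstab option keyword is an assumption here): a vfstab entry like

/dev/zvol/dsk/rpool/swap  -  -  swap  -  no  encrypted

has swapadd do roughly the equivalent of:

# lofiadm -e -a /dev/zvol/dsk/rpool/swap
/dev/lofi/1
# swap -a /dev/lofi/1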


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat

On 01/21/13 17:03, Sašo Kiselkov wrote:

Again, what significant features did they add besides encryption? I'm
not saying they didn't, I'm just not aware of that many.


Just a few examples:

Solaris ZFS already has support for 1MB block size.

Support for SCSI UNMAP - both issuing it and honoring it when it is the 
backing store of an iSCSI target.


It also has a lot of performance improvements and general bug fixes in 
the Solaris 11.1 release.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat



On 01/22/13 11:57, Tomas Forsman wrote:

On 22 January, 2013 - Darren J Moffat sent me these 0,6K bytes:


On 01/21/13 17:03, Sašo Kiselkov wrote:

Again, what significant features did they add besides encryption? I'm
not saying they didn't, I'm just not aware of that many.


Just a few examples:

Solaris ZFS already has support for 1MB block size.

Support for SCSI UNMAP - both issuing it and honoring it when it is the
backing store of an iSCSI target.


Would this apply to say a SATA SSD used as ZIL? (which we have, a
vertex2ex with supercap)


If the device advertises the UNMAP feature and you are running Solaris 
11.1 it should attempt to use it.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat



On 01/22/13 13:20, Michel Jansens wrote:


Maybe 'shadow migration' ? (eg: zfs create -o shadow=nfs://server/dir
pool/newfs)


That isn't really a ZFS feature, since it happens at the VFS layer.  The 
ZFS support there is really about getting the options passed through and 
checking status; the core of the work happens at the VFS layer.


Shadow migration works with UFS as well!

Since I'm replying, here are a few others that have been introduced in 
Solaris 11 or 11.1.


There is also the new improved ZFS share syntax for NFS and CIFS in 
Solaris 11.1 where you can much more easily inherit and also override 
individual share properties.


There are improved diagnostic rules.

ZFS support for Immutable Zones (mostly a VFS feature), Extended 
(privilege) Policy, and aliasing of datasets in Zones (so you don't see 
the part of the dataset hierarchy above the bit delegated to the zone).


UEFI GPT label support for root pools with GRUB2 and on SPARC with OBP.

New 'sensitive' per-file flag.

Various ZIL and ARC performance improvements.

Preallocated ZVOLs - for swap/dump.


Michel


On 01/21/13 17:03, Sašo Kiselkov wrote:

Again, what significant features did they add besides encryption? I'm
not saying they didn't, I'm just not aware of that many.


Just a few examples:

Solaris ZFS already has support for 1MB block size.

Support for SCSI UNMAP - both issuing it and honoring it when it is
the backing store of an iSCSI target.

It also has a lot of performance improvements and general bug fixes in
the Solaris 11.1 release.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Michel Jansens
mjans...@ulb.ac.be



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat



On 01/22/13 13:29, Sašo Kiselkov wrote:

On 01/22/2013 02:20 PM, Michel Jansens wrote:


Maybe 'shadow migration' ?  (eg: zfs create -o shadow=nfs://server/dir
pool/newfs)


Hm, interesting, so it works as a sort of replication system, except
that the data needs to be read-only and you can start accessing it on
the target before the initial sync. Did I get that right?


The source filesystem needs to be read-only.  It works at the VFS layer 
so it doesn't copy snapshots or clones over.  Once mounted it appears 
like all the original data is instantly there.


There is an (optional) shadowd that pushes the migration along, but it 
will complete on its own anyway.


shadowstat(1M) gives information on the status of the migrations.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat


On 01/22/13 13:29, Darren J Moffat wrote:

Since I'm replying, here are a few others that have been introduced in
Solaris 11 or 11.1.


and another one I can't believe I missed, since I was one of the people 
that helped design it and I did the code review...


Per-file sensitivity labels for TX (Trusted Extensions) configurations.

and I'm sure I'm still missing stuff that is in Solaris 11 and 11.1.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat



On 01/22/13 15:32, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) wrote:

From: Darren J Moffat [mailto:darr...@opensolaris.org]

Support for SCSI UNMAP - both issuing it and honoring it when it is the
backing store of an iSCSI target.


When I search for scsi unmap, I come up with all sorts of documentation that 
... is ... like reading a medical journal when all you want to know is the 
conversion from 98.6F to C.

Would you mind momentarily, describing what SCSI UNMAP is used for?  If I were 
describing to a customer (CEO, CFO) I'm not going to tell them about SCSI 
UNMAP, I'm going to say the new system has a new feature that enables ... or 
solves the ___ problem...

Customer doesn't *necessarily* have to be as clueless as CEO/CFO.  Perhaps just 
another IT person, or whatever.


It is a mechanism for part of the storage system above the disk (eg 
ZFS) to inform the disk that it is no longer using a given set of blocks.


This is useful when using an SSD - see Saso's excellent response on that.

However it can also be very useful when your disk is an iSCSI LUN.  It 
allows the filesystem layer (eg ZFS or NTFS, etc), when on an iSCSI LUN 
that advertises SCSI UNMAP, to tell the target that there are blocks in 
that LUN it isn't using any more (eg it just deleted some blocks).


This means you can get more accurate space usage when using things like 
iSCSI.


ZFS in Solaris 11.1 issues SCSI UNMAP to devices that support it, and 
ZVOLs exported over COMSTAR advertise it too.


In the iSCSI case it is mostly about improved space accounting and 
utilisation.  This is particularly interesting with ZFS when snapshots 
and clones of ZVOLs come into play.


Some vendors call this (and things like it) Thin Provisioning; I'd say 
it is more accurate communication between 'disk' and filesystem about 
in-use blocks.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Darren J Moffat



On 01/22/13 16:02, Sašo Kiselkov wrote:

On 01/22/2013 05:00 PM, casper@oracle.com wrote:

Some vendors call this (and things like it) Thin Provisioning; I'd say
it is more accurate communication between 'disk' and filesystem about
in-use blocks.


In some cases, users of disks are charged by bytes in use; when not using
SCSI UNMAP, a set of disks used for a zpool will in the end be charged for
the whole reservation; this becomes costly when your standard usage is
much less than your peak usage.

Thin provisioning can now be used for zpools as long as the underlying
LUNs have support for SCSI UNMAP


Looks like an interesting technical solution to a political problem :D


There is a technical problem too: if you can't inform the backing store 
that you no longer need the blocks, it can't free them either, so they 
get stuck in snapshots unnecessarily.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dm-crypt + ZFS on Linux

2012-11-30 Thread Darren J Moffat



On 11/23/12 15:49, John Baxter wrote:

After searching for dm-crypt and ZFS on Linux and finding too little
information, I shall ask here. Please keep in mind this is in the context
of running this in a production environment.

We have the need to encrypt our data, approximately 30TB on three ZFS
volumes under Solaris 10. The volumes currently reside on iscsi sans
connected via 10Gb/s ethernet. We have tested Solaris 11 with ZFS
encrypted volumes and found the performance to be very poor and have an
open bug report with Oracle.


This bug report hasn't reached me yet and I'd really like to see it; if 
there is a performance bug with ZFS that is unique to encryption I can 
attempt to resolve it.


Can you please provide the bug and/or SR number that Oracle Support gave 
to you.



We are a Linux shop and since performance is so poor and still no
resolution, we are considering ZFS on Linux with dm-crypt.
I have read once or twice that if we implemented ZFS + dm-crypt we would
lose features, however which features are not specified.
We currently mirror the volumes across identical iscsi sans with ZFS and
we use hourly ZFS snapshots to update our DR site.

Which features of ZFS are lost if we use dm-crypt? My guess would be
they are related to raidz but unsure.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dm-crypt + ZFS on Linux

2012-11-30 Thread Darren J Moffat



On 11/30/12 11:41, Darren J Moffat wrote:



On 11/23/12 15:49, John Baxter wrote:

After searching for dm-crypt and ZFS on Linux and finding too little
information, I shall ask here. Please keep in mind this is in the context
of running this in a production environment.

We have the need to encrypt our data, approximately 30TB on three ZFS
volumes under Solaris 10. The volumes currently reside on iscsi sans
connected via 10Gb/s ethernet. We have tested Solaris 11 with ZFS
encrypted volumes and found the performance to be very poor and have an
open bug report with Oracle.


This bug report hasn't reached me yet and I'd really like to be sure
if there is a performance bug with ZFS that is unique to encryption I
can attempt to resolve it.

Can you please provide the bug and/or SR number that Oracle Support gave
to you.


For the sake of those on the list, I've got these references now.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Appliance as a general-purpose server question

2012-11-22 Thread Darren J Moffat



On 11/22/12 16:24, Jim Klimov wrote:

A customer is looking to replace or augment their Sun Thumper
with a ZFS appliance like 7320. However, the Thumper was used
not only as a protocol storage server (home dirs, files, backups
over NFS/CIFS/Rsync), but also as a general-purpose server with
unpredictably-big-data programs running directly on it (such as
corporate databases, Alfresco for intellectual document storage,
etc.) in order to avoid the networking transfer of such data
between pure-storage and compute nodes - this networking was
seen as both a bottleneck and a possible point of failure.

Is it possible to use the ZFS Storage appliances in a similar
way, and fire up a Solaris zone (or a few) directly on the box
for general-purpose software; or to shell-script administrative
tasks such as the backup archive management in the global zone
(if that concept still applies) as is done on their current
Solaris-based box?


No, it is a true appliance; it might look like it has Solaris underneath 
but it is just based on Solaris, not a general-purpose Solaris system.


You can script administrative tasks, but not using bash/ksh style 
scripting; you use the ZFSSA's own scripting language.



Is it possible to run VirtualBoxes in the ZFS-SA OS, dare I ask? ;)


No.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send to older version

2012-10-24 Thread Darren J Moffat



On 10/24/12 03:16, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Karl Wagner

The only thing I think Oracle should have done differently is to allow
either a downgrade or creating a send stream in a lower version
(reformatting the data where necessary, and disabling features which
weren't present). However, this would not be a simple addition, and it
is probably not worth it for Oracle's intended customers.


So you have a backup server in production, that has storage and does a zfs send 
to removable media, on a periodic basis.  (I know I do.)

So you buy a new server, and it comes with a new version of zfs.  Now you can't 
backup your new server.


So in this case you should have a) created the pool with a version that 
matches the pool version of the backup server and b) made sure you 
created the ZFS file systems with a version that is supported by the 
backup server.


zpool create -o version=N ...

zfs create -o version=N ...

ZFS has the functionality but it can't guess what the intended usage is, 
so the default behaviour is to create pools and file systems using the 
highest version supported by the running software.
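
For example, to create a pool and file system that an older backup 
server can receive from (the version numbers here are illustrative):

# zpool create -o version=28 tank c0t0d0
# zfs create -o version=4 tank/backup
# zpool get version tank
# zfs get version tank/backup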


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send to older version

2012-10-24 Thread Darren J Moffat

On 10/24/12 17:44, Carson Gaspar wrote:

On 10/24/12 3:59 AM, Darren J Moffat wrote:


So in this case you should have a) created the pool with a version that
matches the pool version of the backup server and b) make sure you
create the ZFS file systems with a version that is supposed by the
backup server.


And AI allows you to set the rpool version how, exactly?


I haven't personally tried this but I believe it should be possible 
since you can set other pool options at install time eg:


<pool_options>
  <option name="version" value="28"/>
</pool_options>

similarly for datasets that your AI manifest creates for you:

<dataset_options>
  <option name="version" value="4"/>
</dataset_options>

See /usr/share/install/target.dtd.1

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] openindiana-1 filesystem, time-slider, and snapshots

2012-10-16 Thread Darren J Moffat



On 10/16/12 14:54, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) wrote:

Can anyone explain to me what the openindiana-1 filesystem is all
about?  I thought it was the backup copy of the openindiana filesystem,
when you apply OS updates, but that doesn't seem to be the case...

I have time-slider enabled for rpool/ROOT/openindiana.  It has a daily
snapshot (amongst others).  But every day when the new daily snap is
taken, the old daily snap rotates into the rpool/ROOT/openindiana-1
filesystem.  This is messing up my cron-scheduled zfs send script -
which detects that the rpool/ROOT/openindiana filesystem no longer has
the old daily snapshot, and therefore has no snapshot in common with the
receiving system, and therefore sends a new full backup every night.

To make matters more confusing, when I run 'mount' and when I run 'zfs
get all | grep -i mount', I see / on rpool/ROOT/openindiana-1


It is a new boot environment, see beadm(1M) - you must have done some 
'pkg update' or 'pkg install' operation that created a new BE.




It would seem, I shouldn't be backing up openindiana, but instead,
backup openindiana-1?  I would have sworn, out-of-the-box, there was no
openindiana-1.  Am I simply wrong?


Initially there wouldn't have been.

Are you doing the zfs send on your own or letting time-slider do it for 
you ?


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS ok for single disk dev box?

2012-08-30 Thread Darren J Moffat

On 08/30/12 11:07, Anonymous wrote:

Hi. I have a spare off the shelf consumer PC and was thinking about loading
Solaris on it for a development box since I use Studio @work and like it
better than gcc. I was thinking maybe it isn't so smart to use ZFS since it
has only one drive. If ZFS detects something bad it might kernel panic and
lose the whole system right? I realize UFS /might/ be ignorant of any
corruption but it might be more usable and go happily on its way without
noticing? Except then I have to size all the partitions and lose out on
compression etc. Any suggestions thankfully received.


If you are using Solaris 11 or any of the Illumos based distributions 
you have no choice: you must use ZFS as your root/boot filesystem.


I would recommend that, if physically possible, you attach a second 
drive to make it a mirror.


Personally I've run many, many builds of Solaris on single disk laptop 
systems and never has it lost me access to my data.  The only time I 
lost access to data on a single disk system was because of total hard 
drive failure.  I run with copies=2 set on my home directory and any 
datasets I store data in when on a single disk system.
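
A sketch of both suggestions (device and dataset names illustrative):

# zpool attach rpool c0t0d0s0 c0t1d0s0
# zfs set copies=2 rpool/export/home

zpool attach mirrors the existing root disk onto the second one; 
copies=2 makes ZFS keep two copies of each block even on a single disk.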


However, much more importantly, ZFS does not preclude the need for 
off-system backups.  Even with mirroring and snapshots you still have to 
have a backup of important data elsewhere.  No file system, and more 
importantly no hardware, is that good.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Darren J Moffat

On 07/11/12 00:56, Sašo Kiselkov wrote:

  * SHA-512: simplest to implement (since the code is already in the
kernel) and provides a modest performance boost of around 60%.


FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256.

http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf

Note this is NOT a simple truncation of SHA-512 since when using 
SHA-512/t the initial value H(0) is different.


See sections 5.3.6.2 and 6.7.

I recommend the checksum value for this be
checksum=sha512/256

A / in the value doesn't cause any problems and it is the official NIST 
name of that hash.


With the internal enum being: ZIO_CHECKSUM_SHA512_256

CR 7020616 already exists for adding this in Oracle Solaris.
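
Once implemented, usage would presumably look like this (a sketch, not 
a shipping feature at the time of writing; dataset name illustrative):

# zfs set checksum=sha512/256 tank/fs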

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Benefits of enabling compression in ZFS for the zones

2012-07-10 Thread Darren J Moffat

On 07/10/12 12:45, Ferenc-Levente Juhos wrote:

Of course you don't see any difference, this is how it should work.
'ls' will never report the compressed size, because it's not aware of
it. Nothing is aware of the compression and decompression that takes
place on-the-fly, except of course zfs.
That's the reason why you could gain in write and read speed if you use
compression, because the actual amount of compressed data that is being
written and read from the pool is smaller than the original data. And I
think with the checksum test you proved that zfs checksums the
uncompressed data.


No, ZFS checksums are over the data as it is stored on disk, so the 
compressed data.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] current status of SAM-QFS?

2012-05-03 Thread Darren J Moffat

On 05/02/12 23:34, Fred Liu wrote:

If you want to know Oracle's roadmap for SAM-QFS then I recommend
contacting your Oracle account rep rather than asking on a ZFS discussion list.
You won't get SAM-QFS or Oracle roadmap answers from this alias.



My original purpose is to ask if there is an effort to integrate open-sourced 
SAM-QFS into illumos
or smartos/oi/illumian.


Okay, then it would have been clearer if you had asked that question, 
but you asked about SAM-QFS on a zfs-discuss alias.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] current status of SAM-QFS?

2012-05-02 Thread Darren J Moffat

On 05/02/12 10:40, Fred Liu wrote:

Still a fully supported product from Oracle:

http://www.oracle.com/us/products/servers-storage/storage/storage-
software/qfs-software/overview/index.html



Yeah. But it seems there have been no more updates since the Sun 
acquisition. Don't know Oracle's roadmap in the aspect of data tiering.


If you want to know Oracle's roadmap for SAM-QFS then I recommend 
contacting your Oracle account rep rather than asking on a ZFS 
discussion list.  You won't get SAM-QFS or Oracle roadmap answers from 
this alias.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] current status of SAM-QFS?

2012-05-01 Thread Darren J Moffat

On 04/30/12 04:00, Fred Liu wrote:

The subject says it all.


Still a fully supported product from Oracle:

http://www.oracle.com/us/products/servers-storage/storage/storage-software/qfs-software/overview/index.html

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Aaron Toponce: Install ZFS on Debian GNU/Linux

2012-04-19 Thread Darren J Moffat

On 04/18/12 17:28, Jim Klimov wrote:

In the beginning it was my wishful thinking that encryption
code and maybe some other newbies got legally leaked into
Linux, and if they were there, then they might be legally
included into other ZFS source code projects.


Not Linux per se, but there is another (read-only) implementation of ZFS 
encryption:


http://bazaar.launchpad.net/~vcs-imports/grub/grub2-bzr/view/head:/grub-core/fs/zfs/zfscrypt.c

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solaris 11/ZFS historical reporting

2012-04-17 Thread Darren J Moffat

On 04/16/12 20:18, Anh Quach wrote:

Are there any tools that ship w/ Solaris 11 for historical reporting on things 
like network activity, zpool iops/bandwidth, etc., or is it pretty much 
roll-your-own scripts and whatnot?


For network activity look at flowstat; it can read exacct format files.

For I/O it depends what level you want to look at: if it is the device 
level, iostat; if it is how ZFS is using the devices, look at 'zpool 
iostat'; if it is the filesystem level, look at fsstat.


Also look at acctadm(1M).
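
For example (a sketch; check each tool's man page for the exact options 
on your release):

# zpool iostat -v tank 5
# fsstat zfs 5
# acctadm -e extended -f /var/adm/exacct/net net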



--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs-discuss Digest, Vol 76, Issue 20

2012-02-22 Thread Darren J Moffat

On 02/21/12 15:32, zfs-dev wrote:

You might want to try a reboot of the system. There is some low level
caching of the encryption key in the kernel. I noticed that you can
remove the key and continue to mount and umount it without a key so long
as you do not reboot. Maybe this will clear it up. I never recommend
just reboot however, in this case it may actually work.


That behaviour is by design and is documented in zfs(1M) in the 'zfs 
umount' section as follows:


 For an encrypted dataset, the key is not  unloaded  when
 the file system is unmounted. To unload the key, see zfs
 key.
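
For example (dataset name illustrative):

# zfs umount tank/secret
# zfs key -u tank/secret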



--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot mount encrypted filesystems.

2012-02-22 Thread Darren J Moffat

On 02/22/12 06:10, Roberto Waltman wrote:

2011-08-23.23:48:35 zfs set
keysource=passphrase,file:///root/passphrases/slice_2_passphrase
slice_2/base/bitsavers


That should have failed because the keysource property is inherited from 
slice_2/base. So you have found a bug and I can reproduce it.


The reason that should have failed is that the source of the keysource 
property is used to determine which dataset to look at for the hidden 
salt property.  We know what that salt property should actually be in 
your case because it is set on slice_2/base.


Unfortunately 'zfs set salt' won't work because salt is read-only from 
userland (so it doesn't accidentally get overridden and cause the very 
same symptoms you have!).


In theory you would assume that you could go back to having the 
keysource inherited by running:

 'zfs inherit keysource slice_2/base/bitsavers'

However that won't work because of a protection we have in place to 
again avoid yet another route into these same symptoms. It will fail 
with an error message something like this:


cannot inherit keysource for 'slice_2/base/bitsavers': use 'zfs key -c 
-o keysource=...'


Using a hacked-up libzfs that removes the check that 'zfs inherit' does, 
I can get out of the situation and make the datasets accessible again.  
So this is fixable; don't abandon hope yet.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot mount encrypted filesystems.

2012-02-21 Thread Darren J Moffat

On 02/21/12 01:58, Roberto Waltman wrote:

First, I did the 2nd. (Change location only)
I believe I tried the first form also *after*
things were already broken, but I'm sure the
passphrases were identical: slice_08, slice_18
and slice_28 for each pools 0/1/2. - The '8'
to bring the length to the minimal
requirement of 8 characters.


A 'zfs key -c' won't work unless a 'zfs key -l' or 'zfs mount' has 
successfully loaded the key first.


Can you send the 'zpool history slice_2' output so I can see what 
commands have been run.



( My goal for using encryption was just to
obfuscate the contents if, for example, I
send a disk out for repair; not to hide
anything from the NSA )

Question: I believed the keys generated from a
passphrase depend only on the passphrase, and
not on how it is provided or where it is stored.
Is this a true statement?


Almost: the passphrase case also depends on a hidden property called 
salt that is updated only when you do 'zfs key -c' and was set to a 
random value at the time the dataset was created.


Did you ever do a send|recv of these filesystems ?  There was a bug with 
send|recv in 151a that has since been fixed that could cause the salt to 
be zero'd out in some cases.



slice_2/base/bitsavers keysource
passphrase,file:///export/home/trouser/passphrases/slice_2_passphrase local


This is the interesting part: you have set the keysource explicitly on 
every leaf dataset - you didn't need to do that, it would have been 
inherited.


What this means is that even though you have the same passphrase for 
each dataset, the actual data encryption key is different, because the 
passphrase value plus the hidden salt property are used together to 
generate the wrapping key.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] encryption

2012-02-21 Thread Darren J Moffat



On 02/21/12 13:27, Edward Ned Harvey wrote:

From: Darren J Moffat [mailto:darr...@opensolaris.org]
Sent: Monday, February 20, 2012 12:46 PM

GRUB2 has support
for encrypted ZFS file systems already.


I assume this requires a pre-boot password, right?  Then I have two
questions...


The ZFS encryption support in GRUB2 was written by the main GRUB2 
developer and doesn't use any Solaris ZFS encryption code.  The GRUB2 
code has support for interactive prompting for the passphrase or for 
reading the passphrase or raw wrapping key from a file in some other 
filesystem that GRUB2 can see.


Solaris 11 doesn't have GRUB2 at this time it uses GRUB 0.97 which does 
not have encryption support.  You can't put the two parts together 
because the Solaris 11 kernel doesn't know how to mount an encrypted 
root filesystem even though GRUB2 could have loaded the kernel and 
boot_archive from one if you managed to craft together a GRUB2 and 
Solaris 11 system on your own.



I noticed in solaris 11, when you init 6 it doesn't reboot the way other
OSes reboot.


What you are seeing is Fast Reboot, where on x86 we completely avoid 
the trip back through the BIOS and the boot loader: it just loads and 
re-executes the kernel directly.  The situation on SPARC is similar but 
not identical.


 So maybe init 6 will not need you to type in a password

again?  Maybe you just need a password one time when you power on?


Solaris 11 doesn't have support for encrypted root at all at this time. 
It doesn't matter if Fast Reboot is in use or not.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] encryption

2012-02-20 Thread Darren J Moffat

On 02/16/12 15:35, David Magda wrote:

On Thu, February 16, 2012 09:55, Edward Ned Harvey wrote:

I've never used ZFS encryption.  How does it work?  Do you need to type in
a pre-boot password?  And if so, how do you do that with a server?  Or does
it use TPM or something similar, to avoid the need for a pre-boot password?


Darren Moffat put up some good posts when the code was initially introduced:

 https://blogs.oracle.com/darren/en_GB/tags/zfs
 https://blogs.oracle.com/darren/en_GB/tags/crypto

I don't believe encrypting the root volume is currently supported, so
pre-boot stuff doesn't apply. (Please correct if I'm wrong here.)


That is correct, you can't currently encrypt the root/boot file system. 
This is because neither OBP nor GRUB 0.97 has any knowledge of ZFS 
encrypted file systems and how to get keys for them.  GRUB2 has support 
for encrypted ZFS file systems already.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot mount encrypted filesystems.

2012-02-20 Thread Darren J Moffat

On 02/18/12 05:12, Roberto Waltman wrote:

Solaris 11 Express 1010.11/snv_151a


I strongly suggest upgrading to Solaris 11; there have been some 
important ZFS, and specifically ZFS encryption related, bug fixes.



They were created with encryption
on, forcing all others to be encrypted.

The keysource for slice_?/base
was set to
passphrase,prompt
while creating the file systems.

Then I stored the keys (one key per
pool) in files in a subdirectory
of home/user1, and set keysource for
slice_0/base to
passphrase,file:///export/home/user1/keys/key_0
(Similarly for the other two pools)


Did you ever export the slice_0 pool and reimport it or reboot the 
server ?  Basically are you and ZFS both 100% sure you had the correct 
passphrases stored in those files ?



So far so good.
Several weeks and several terabytes
of data later, I decided to relocate
the files with the encryption keys
from a subdir of user1 to a subdir
of root. Copied the files and set
slice_0/base keysource to
passphrase,file:///root/keys/key_0, etc.


Exactly how did you do that ?

zfs key -c -o keysource=passphrase,file:///root/keys/key_0

or

zfs set keysource=passphrase,file:///root/keys/key_0

The first does a key change and actually re-encrypts the on-disk data 
encryption keys using the newly generated AES wrapping key that is 
derived from the passphrase.  The second only changes where to find the 
passphrase.



That broke it. After doing that, the base
file systems (that contain no data files)
can be mounted, but trying to mount any
other fs fails with the message:
cannot load key for 'slice_?/base/fsys_?_?': incorrect key.


Can you post some sample output of:

zfs get -r encryption,keysource slice_0

In particular include a few examples of the filesystems you call 'base' 
and the fsys ones.


What is important here is understanding where the encryption and 
keysource properties are set and where they are inherited.



--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: add an option/attribute to import ZFS pool without automounting/sharing ZFS datasets

2012-01-11 Thread Darren J Moffat



On 01/11/12 11:48, Jim Klimov wrote:

I think about adding the following RFE to illumos bugtracker:
add an option/attribute to import ZFS pool without
automounting/sharing ZFS datasets

I wonder if something like this (like a tricky workaround)
is already in place?



 -N

 Import the pool without mounting any file systems.


If it isn't mounted it can't be shared.
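
For example (pool and dataset names illustrative):

# zpool import -N tank
# zfs mount tank/just-the-one-you-need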

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL on a dedicated HDD slice (1-2 disk systems)

2012-01-09 Thread Darren J Moffat

On 01/08/12 18:21, Bob Friesenhahn wrote:

Something else to be aware of is that even if you don't have a dedicated
ZIL device, zfs will create a ZIL using devices in the main pool so


Terminology nit:  The log device is a SLOG.  Every ZFS dataset has a 
ZIL.  Where the ZIL writes (slog or main pool devices) go for a given 
dataset is determined by a combination of things including (but not 
limited to) the presence of a SLOG device, the logbias property and the 
size of the data.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2012-01-03 Thread Darren J Moffat

On 12/28/11 06:27, Richard Elling wrote:

On Dec 27, 2011, at 7:46 PM, Tim Cook wrote:

On Tue, Dec 27, 2011 at 9:34 PM, Nico Williamsn...@cryptonector.com  wrote:
On Tue, Dec 27, 2011 at 8:44 PM, Frank Cusackfr...@linetwo.net  wrote:

So with a de facto fork (illumos) now in place, is it possible that two
zpools will report the same version yet be incompatible across
implementations?


This was already broken by Sun/Oracle when the deduplication feature was not
backported to Solaris 10. If you are running Solaris 10, then zpool version 29 
features
are not implemented.


Solaris 10 does have some deduplication support: it can import and read 
datasets in a deduped pool just fine.  You can't enable dedup on a 
dataset, and any writes won't dedup; they will rehydrate.


So it is more like partial dedup support rather than it not being there 
at all.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can I create a mirror for a root rpool?

2011-12-19 Thread Darren J Moffat

On 12/18/11 11:52, Pawel Jakub Dawidek wrote:

On Thu, Dec 15, 2011 at 04:39:07PM -0700, Cindy Swearingen wrote:

Hi Anon,

The disk that you attach to the root pool will need an SMI label
and a slice 0.

The syntax to attach a disk to create a mirrored root pool
is like this, for example:

# zpool attach rpool c1t0d0s0 c1t1d0s0


BTW. Can you, Cindy, or someone else reveal why one cannot boot from
RAIDZ on Solaris? Is this because Solaris is using GRUB and RAIDZ code
would have to be licensed under GPL as the rest of the boot code?

I'm asking, because I see no technical problems with this functionality.
Booting off of RAIDZ (even RAIDZ3) and also from multi-top-level-vdev
pools works just fine on FreeBSD for a long time now. Not being forced
to have dedicated pool just for the root if you happen to have more than
two disks in your box is very convenient.


For those of us not familiar with how FreeBSD is installed and boots, 
can you explain how boot works (ie do you use GRUB at all, and if so 
which version, and where the early boot ZFS code is).


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Darren J Moffat

On 12/07/11 20:48, Mertol Ozyoney wrote:

Unfortunetly the answer is no. Neither l1 nor l2 cache is dedup aware.

The only vendor i know that can do this is Netapp

In fact, most of our functions, like replication, are not dedup aware.



For example, technically it's possible to optimize our replication so
that it does not send data chunks if a chunk with the same checksum
exists in the target, without enabling dedup on target and source.


We already do that with 'zfs send -D':

 -D

 Perform dedup processing on the stream. Deduplicated
 streams  cannot  be  received on systems that do not
 support the stream deduplication feature.
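
For example (host and dataset names illustrative):

# zfs send -D tank/data@today | ssh backuphost zfs receive backuppool/data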




--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Dell with FreeBSD

2011-10-19 Thread Darren J Moffat

On 10/19/11 15:30, Fajar A. Nugraha wrote:

On Wed, Oct 19, 2011 at 9:14 PM, Albert Shihalbert.s...@obspm.fr  wrote:

Hi

Sorry to cross-posting. I don't knwon which mailing-list I should post this
message.

I'll would like to use FreeBSD with ZFS on some Dell server with some
MD1200 (classique DAS).

When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
two options :

1/ create a LV on the PERC H800 so the server see one volume and put
the zpool on this unique volume and let the hardware manage the
raid.

2/ create 12 LV on the perc H800 (so without raid) and let FreeBSD
and ZFS manage the raid.

which one is the best solution ?


Neither.

The best solution is to find a controller which can pass the disk as
JBOD (not encapsulated as virtual disk). Failing that, I'd go with (1)
(though others might disagree).


No, go with 2.  ALWAYS let ZFS manage the redundancy, otherwise it 
can't self-heal.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Darren J Moffat

On 10/18/11 13:18, Edward Ned Harvey wrote:

* btrfs is able to balance.  (after adding new blank devices, rebalance, so
the data  workload are distributed across all the devices.)  zfs is not
able to do this yet.


ZFS does slightly bias new vdevs for new writes so that we will get to 
a more even spread.  It doesn't go and move already written blocks onto 
the new vdevs though.  So while there isn't an admin interface to 
rebalancing, ZFS does do something in this area.


This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c

See lines 1356-1378

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Darren J Moffat

On 10/18/11 14:04, Jim Klimov wrote:

2011-10-18 16:26, Darren J Moffat wrote:

On 10/18/11 13:18, Edward Ned Harvey wrote:

* btrfs is able to balance. (after adding new blank devices,
rebalance, so
the data workload are distributed across all the devices.) zfs is not
able to do this yet.


ZFS does slightly bias new vdevs for new writes so that we will get
to a more even spread. It doesn't go and move already written blocks
onto the new vdevs though. So while there isn't an admin interface to
rebalancing, ZFS does do something in this area.

This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c


See lines 1356-1378



And the admin interface would be what exactly?..


As I said, there isn't one, because that isn't how it works today: it is 
all automatic and only for new writes.


I was pointing out that ZFS does do 'something', not that it has an 
exactly matching feature.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thumper (X4500), and CF SSD for L2ARC = ?

2011-10-14 Thread Darren J Moffat

On 10/14/11 13:39, Jim Klimov wrote:

Hello, I was asked if the CF port in Thumpers can be accessed by the OS?
In particular, would it be a good idea to use a modern 600x CF card
(some reliable one intended for professional photography) as an L2ARC
device using this port?


I don't know about the Thumpers internal CF slot.

I can say I have tried using a fast (at the time; this was about 3 years 
ago) CF card via a CF-to-IDE adaptor before, and it turned out to be a 
really bad idea because the spinning rust disk (which was SATA) was 
actually faster to access.  The same went for USB-to-CF adaptors at the 
time too.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] commercial zfs-based storage replication software?

2011-10-13 Thread Darren J Moffat

On 10/13/11 09:27, Fajar A. Nugraha wrote:

On Tue, Oct 11, 2011 at 5:26 PM, Darren J Moffat
darr...@opensolaris.org  wrote:

Have you looked at the time-slider functionality that is already in Solaris
?


Hi Darren. Is it available for Solaris 10? I just installed Solaris 10
u10 and couldn't find it.


No it is not.


There is a GUI for configuration of the snapshots


the screenshots that I can find all refer to opensolaris


and time-slider can be
configured to do a 'zfs send' or 'rsync'.  The GUI doesn't have the ability
to set the 'zfs recv' command but that is set one-time in the SMF service
properties.


Is there a reference on how to get/install this functionality on Solaris 10?


No because it doesn't exist on Solaris 10.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] commercial zfs-based storage replication software?

2011-10-11 Thread Darren J Moffat
Have you looked at the time-slider functionality that is already in 
Solaris ?


There is a GUI for configuration of the snapshots and time-slider can be 
configured to do a 'zfs send' or 'rsync'.  The GUI doesn't have the 
ability to set the 'zfs recv' command but that is set one-time in the 
SMF service properties.
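
A sketch of inspecting those properties (the plugin service name below 
is an assumption and may differ by release, so check with svcs first):

# svcs -a | grep time-slider
# svccfg -s svc:/application/time-slider/plugin:zfs-send listprop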


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Any info about System attributes

2011-10-11 Thread Darren J Moffat

On 09/26/11 20:03, Jesus Cea wrote:

# zpool upgrade -v
[...]
24  System attributes
[...]


This is really an on disk format issue rather than something that the 
end user or admin can use directly.


These are special on disk blocks for storing file system metadata 
attributes when there isn't enough space in the bonus buffer area of the 
on disk version of the dnode.


This can be necessary in some cases if a file has a very large and 
complex ACL and also has other attributes set such as the ones for CIFS 
compatibility.


They are also always used if the filesystem is encrypted, so that all 
metadata is in the system attribute (also known as spill) block rather 
than in the dnode - this is required because we need the dnode in the 
clear because it contains block pointers and other information needed to 
navigate the pool.  However we never want file system metadata to be in 
the clear.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Advice with SSD, ZIL and L2ARC

2011-09-20 Thread Darren J Moffat

On 09/19/11 18:45, Jesus Cea wrote:


I have a new question: interaction between dataset encryption and L2ARC
and ZIL.

1. I am pretty sure (but not completely sure) that data stored in the
ZIL is encrypted, if the destination dataset uses encryption. Can
anybody confirm?.


Of course; if we didn't do that we would be leaking user data.


2. What happens with L2ARC?. Since ARC is not encrypted (in RAM), is
it encrypted when evicted to L2ARC?.


Use of the L2ARC is disabled for data from encrypted datasets at this time.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Advice with SSD, ZIL and L2ARC

2011-08-30 Thread Darren J Moffat

On 08/30/11 15:31, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Jesus Cea

1. Is the L2ARC data stored in the SSD checksummed?. If so, can I
expect that ZFS goes directly to the disk if the checksum is wrong?.


Yup.


Note the following is an implementation detail subject to change:

It is NOT checksummed on disk, only in memory; but the L2ARC data on 
disk is not used after reboot anyway just now.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RBAC and zfs

2011-08-26 Thread Darren J Moffat

On 08/26/11 13:29, cephas maposah wrote:

i would like to create a role which can take snapshots, run zfs send and
zfs receive. the user switches to that role and has permissions to run
those commands on a pool


See the zfs(1M) man page for the section on the 'allow' subcommand.

Assuming a role name of 'myrole' and a ZFS pool called 'tank' it would 
be something like this:


# roleadd myrole
# passwd myrole
...
# usermod -R myrole cephas

# zfs allow -u myrole send,receive,snapshot,mount tank
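
The user then switches to the role to use the delegated commands, and 
'zfs allow' with just the filesystem argument shows what has been 
delegated:

$ su myrole
$ zfs snapshot tank@backup1
$ exit
$ zfs allow tank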

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disable ZIL - persistent

2011-08-05 Thread Darren J Moffat

On 08/05/11 13:11, Edward Ned Harvey wrote:

After a certain rev, I know you can set the sync property, and it
takes effect immediately, and it's persistent across reboots. But that
doesn't apply to Solaris 10.

My question: Is there any way to make Disabled ZIL a normal mode of
operations in solaris 10? Particularly:

If I do this echo zil_disable/W0t1 | mdb -kw then I have to remount
the filesystem. It's kind of difficult to do this automatically at boot
time, and impossible (as far as I know) for rpool. The only solution I
see is to write some startup script which applies it to filesystems
other than rpool. Which feels kludgy. Is there a better way?


echo "set zfs:zil_disable = 1" >> /etc/system

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-27 Thread Darren J Moffat

On 07/27/11 00:00, Peter Jeremy wrote:

On 2011-Jul-26 17:24:05 +0800, Fajar A. Nugrahaw...@fajar.net  wrote:

Shouldn't modern SSD controllers be smart enough already that they know:
- if there's a request to overwrite a sector, then the old data on
that sector is no longer needed


ZFS never does update-in-place and UFS only does update-in-place for


Not quite never; there are some very special cases where blocks are 
allocated ahead of time and could be written to in place more than 
once.  In particular the special type of ZVOLs used for dump devices.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-27 Thread Darren J Moffat

On 07/27/11 10:24, Fred Liu wrote:

The alternative is to have the node in your NDMP network that does the
writing to the tape to do the compression and encryption of the data
stream before putting it on the tape.



I see. T1C is a monster to have if possible ;-).
And doing the job on the NDMP node (Solaris) needs extra software, is that correct?


I believe so; also it is more than just the T1C drive you need: it 
needs to be in a library and you also need the Oracle Key Management 
system to be able to do the key management for it.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-27 Thread Darren J Moffat

On 07/27/11 12:51, Pawel Jakub Dawidek wrote:

On Tue, Jul 26, 2011 at 03:28:10AM -0700, Fred Liu wrote:




The ZFS send stream is at the DMU layer; at this layer the data is
uncompressed and decrypted - ie exactly how the application wants it.



Even the data compressed/encrypted by ZFS will be decrypted? If it is true, 
will there be any CPU overhead?
And ZFS send/receive tunneled by ssh becomes the only way to encrypt the data 
transmission?


Even if zfs send/recv will work with encrypted and compressed data you
still need some secure tunneling. Storage encryption is not the same as
network traffic encryption.


Indeed, plus you don't necessarily want to always have your backups 
encrypted by the same keys as the live data (ie the policy for key 
management and retention could be different on purpose).


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 10:14, Andrew Gabriel wrote:

Does anyone know if it's OK to do zfs send/receive between zpools with
different ashift values?


The ZFS send stream is at the DMU layer; at this layer the data is 
uncompressed and decrypted - ie exactly how the application wants it.


The ashift is a vdev layer concept - ie below the DMU layer.

There is nothing in the send stream format that knows what an ashift 
actually is.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 11:28, Fred Liu wrote:

The ZFS send stream is at the DMU layer; at this layer the data is
uncompressed and decrypted - ie exactly how the application wants it.



Even the data compressed/encrypted by ZFS will be decrypted?


Yes, which is exactly what I said.

All data as seen by the DMU is decrypted and decompressed; the DMU layer 
is what the ZPL layer is built on top of, so it has to be that way.


If it is true, will there be any CPU overhead?

There is always some overhead for doing decryption and decompression; 
the question is really can you detect it, and if you can, does it 
matter.  If you are running Solaris on processors with built-in support 
for AES (eg SPARC T2, T3 or Intel with AES-NI) the overhead is reduced 
significantly in many cases.


For many people getting the stuff from disk takes more time than doing 
the transform to get back your plaintext.


In some of the testing I did I found that gzip decompression can be more 
significant to a workload than doing the AES decryption.


So basically yes, of course, but does it actually matter ?


And ZFS send/receive tunneled by ssh becomes the only way to encrypt the data 
transmission?


That isn't the only way.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 11:56, Fred Liu wrote:

It is up to how big the delta is. It does matter if the data backup can not
be finished within the required backup window when people use zfs  send/receive
to do the mass data backup.


The only way you will know if decrypting and decompressing causes a 
problem in that case is to try it on your systems.  I seriously 
doubt it will be unless the system is already heavily CPU bound and your 
backup window is already very tight.



BTW adding a sort of off-topic question -- will the NDMP protocol in Solaris do
decompression and decryption? Thanks.


My understanding of the NDMP protocol is that it would be a translator 
that did that; it isn't part of the core protocol.


The way I would do it is to use a T1C tape drive and have it do the 
compression and encryption of the data.


http://www.oracle.com/us/products/servers-storage/storage/tape-storage/t1c-tape-drive-292151.html

The alternative is to have the node in your NDMP network that does the 
writing to the tape to do the compression and encryption of the data 
stream before putting it on the tape.



And ZFS send/receive tunneled by ssh becomes the only way to encrypt

the data transmission?

That isn't the only way.


--


Any alternatives, if you don't mind? ;-)


For starters SSL/TLS (which is what the Oracle ZFSSA provides for 
replication) or IPsec are possibilities as well; it depends on what risk 
you are trying to protect against and what the transport layer is.


But basically it is not provided by ZFS itself it is up to the person 
building the system to secure the transport layer used for ZFS send.


You could also write directly to a T10k encrypting tape drive.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zil on multiple usb keys

2011-07-25 Thread Darren J Moffat

On 07/23/11 04:57, Michael DeMan wrote:

Generally performance is going to be pretty bad as well - USB sticks are
not made to be written to rapidly. They are entirely different animals
than SSDs. I would not be surprised (but would be curious to know if you
still move forward on this) if you find performance even worse
trying to do this.


Back in the snv_120 ish era I tried this experiment on both my pool and 
on a friend's.  In both cases we were serving NFS (he was also doing 
CIFS) which was mostly read but also had periods where 1-2 G of data was 
rapidly added (uploading photos or videos) over the network.


In both the case of the USB flash drive and the case of a SanDisk 
Extreme IV CF card in a CF-IDE enclosure the performance did not 
improve, and in fact in the case of the CF card the enclosure was buggy 
such that the changes we had to make to the ata config actually made it slower.


I removed the separate log device from both of those pools (by manual 
hacking with specially built zfs kernel modules, because slog removal 
didn't exist back then).


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is oi_151 zpool version on par with sol11ex?

2011-07-19 Thread Darren J Moffat

On 07/19/11 12:03, Jim Klimov wrote:

Hello, some time ago I've seen the existence of development ISOs
of OpenIndiana dubbed build 151. How close or far is it from the
sol11ex 151a? In particular, regarding ZFS/ZPOOL version and
functionality?


Solaris 11 Express (snv_151a) has the following pool versions beyond 28:

 29  RAID-Z/mirror hybrid allocator
 30  Encryption
 31  Improved 'zfs list' performance

http://hub.opensolaris.org/bin/view/Community+Group+zfs/29
http://hub.opensolaris.org/bin/view/Community+Group+zfs/30
http://hub.opensolaris.org/bin/view/Community+Group+zfs/31

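You can check which pool and filesystem versions a given install 
supports from the command line (read-only, it changes nothing):

 $ zpool upgrade -v
 $ zfs upgrade -v
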
--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] latest zpool version in solaris 11 express

2011-07-18 Thread Darren J Moffat

On 07/18/11 02:29 PM, Edward Ned Harvey wrote:

From: Edward Ned Harvey
[mailto:opensolarisisdeadlongliveopensola...@nedharvey.com]

It says zpool version 31 and zfs version 5.  Can anybody please confirm or
deny that this is the absolute latest version available to the public in

any

way?


After applying all updates, it's still zpool 31 and zfs 5.  So unless anyone
has anything else to suggest...  I'm not going to repeat any of the dedup
tests.  It doesn't look like any zfs/zpool/dedup code has changed since
solaris 11 express was released in 2010.


Note that in general code can change without either the pool or 
filesystem versions changing.  The filesystem and pool version numbers 
usually only need to change if there is an on disk format change or some 
other compatibility issue.


Some performance fixes need an on disk layout change and some don't.

Note I'm not commenting about any specific issue here but about the way 
your conclusion was written: it doesn't follow that, because the pool and 
filesystem version numbers are the same, no zfs/zpool/dedup code was changed.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] question about COW and snapshots

2011-06-15 Thread Darren J Moffat

On 06/15/11 12:29, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Richard Elling

That would suck worse.


Don't mind Richard.  He is of the mind that ZFS is perfect for everything
just the way it is, and anybody who wants anything different should adjust
their thought process.


I suspect rather that Richard equated write with write(2) / 
dmu_write() calls, and that would suck performance wise.


I also suspect that what Simon wants isn't a snapshot on every little 
write(2) level call but one when the file has finished being updated, maybe 
on close(2) [ but that assumes the app does actually call close() ].



I know I've certainly had many situations where people wanted to snapshot or
rev individual files everytime they're modified.  As I said - perfect
example is Google Docs.  Yes it is useful.  But no, it's not what ZFS does.


Exactly versions of a whole file, but that is different to a snapshot on 
every write.


How you interpret on every write depends on where in the stack you are 
coming from.  If you think about an application, a write is when you 
save the document, but at the ZPL layer that is multiple write(2) calls 
and maybe even some rename(2)/unlink(2)/close(2) calls as well.
If you move further down then doing a snapshot on every dmu_write() call 
is fundamentally at odds with how ZFS works.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS receive checksum mismatch

2011-06-10 Thread Darren J Moffat

On 06/10/11 12:47, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Jonathan Walker

New to ZFS, I made a critical error when migrating data and
configuring zpools according to needs - I stored a snapshot stream to
a file using zfs send -R [filesystem]@[snapshot] > [stream_file].


There are precisely two reasons why it's not recommended to store a zfs send
datastream for later use.  As long as you can acknowledge and accept these
limitations, then sure, go right ahead and store it.  ;-)  A lot of people
do, and it's good.


Not recommended by whom ?  Which documentation says this ?

As I pointed out last time this came up the NDMP service on Solaris 11 
Express and on the Oracle ZFS Storage Appliance uses the 'zfs send' 
stream as what is to be stored on the tape.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SE11 express Encryption on - errors in the pool after Scrub

2011-06-06 Thread Darren J Moffat

On 06/04/11 13:52, Thomas Hobbes wrote:

I am testing Solaris Express 11 with napp-it on two machines. In both
cases the same problem: Enabling encryption on a folder, filling it with
data will result in errors indicated by a subsequent scrub. I did not
find the topic on the web, but also not experiences shared by people
using encryption on SE11 express. Advice would be highly appreciated.


If you are doing the scrub when the encryption keys are not present it 
is possible you are hitting a known (and very recently fixed in the 
Solaris 11 development gates) bug.


If you have an operating systems support contract with Oracle you should 
be able to log a support ticket and request a backport of the fix for CR 
6989185.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ndmp?

2011-05-24 Thread Darren J Moffat

On 05/24/11 14:37, Edward Ned Harvey wrote:

When I search around, I see that nexenta has ndmp, and solaris 10 does
not, and there was at least some talk about supporting ndmp in
opensolaris ... So ...

Is ndmp present in solaris 11 express? Is it an installable 3rd party
package? How would you go about supporting ndmp if you wanted to?


It is present, it is not 3rd party.

Click here to install it:

http://pkg.oracle.com/solaris/release/p5i/0/service%2Fstorage%2Fndmp.p5i

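If you prefer the command line to the p5i file, something like the 
following should be equivalent (assuming the IPS package name matches 
the p5i, and that the service FMRI is system/ndmpd):

 # pkg install service/storage/ndmp
 # svcadm enable ndmpd
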
Man pages are here:

http://download.oracle.com/docs/cd/E19963-01/html/821-1462/ndmpadm-1m.html
http://download.oracle.com/docs/cd/E19963-01/html/821-1462/ndmpd-1m.html
http://download.oracle.com/docs/cd/E19963-01/html/821-1462/ndmpstat-1m.html

What do you mean by supporting it ?

I believe (though I haven't tested it) it works with Oracle Secure 
Backup as well as NetBackup and Networker.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: bug? ZFS crypto vs. scrub

2011-05-11 Thread Darren J Moffat

On 11/05/2011 01:07, Daniel Carosone wrote:

Sorry for abusing the mailing list, but I don't know how to report
bugs anymore and have no visibility of whether this is a
known/resolved issue.  So, just in case it is not...


Log a support call with Oracle if you have a support contract.


With Solaris 11 Express, scrubbing a pool with encrypted datasets for
which no key is currently loaded, unrecoverable read errors are
reported. The error count applies to the pool, and not to any specific
device, which is also somewhat at odds with the helpful message text
for diagnostic status and suggested action:


Known issue:

6989185 scrubbing a pool with encrypted filesystems and snapshots can 
report false positive errors.


If you have a support contract you may be able to request that fix be 
back ported into an SRU (note I'm not guaranteeing it will be, just 
saying that it is technically possible)



When this has happened previously (on this and other pools) mounting
the dataset by supplying the key, and rerunning the scrub, removes the
errors.

For some reason, I can't in this case (keeps complaining that
the key is wrong). That may be a different issue that has also
happened before, and I will post about separately, once I'm sure I
didn't just made a typo (twice) when first setting the key.


Since you are saying typo I'm assuming you have 
keysource=passphrase,prompt (ie the default).  Have you ever done a 
send|recv of the encrypted datasets ? and if so were there multiple 
snapshots recv'd ?


--
Darren J Moffat
___
zfs-crypto-discuss mailing list
zfs-crypto-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-crypto-discuss


Re: [zfs-discuss] ls reports incorrect file size

2011-05-02 Thread Darren J Moffat

On 05/ 2/11 08:41 PM, Eric D. Mudama wrote:

On Mon, May 2 at 14:01, Bob Friesenhahn wrote:

On Mon, 2 May 2011, Eric D. Mudama wrote:



Hi. While doing a scan of disk usage, I noticed the following oddity.
I have a directory of files (named file.dat for this example) that all
appear as ~1.5GB when using 'ls -l', but that (correctly) appear as
~250KB
files when using 'ls -s' or du commands:


These are probably just sparse files. Nothing to be alarmed about.


They were created via CIFS. I thought sparse files were an iSCSI
concept, no?


iSCSI is a block level protocol.  Sparse files are a filesystem level 
concept that is understood by many filesystems including CIFS and ZFS 
and many others.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: How to mount encrypted file system at boot? Why no pass phrase requesed

2011-04-21 Thread Darren J Moffat

On 21/04/2011 11:05, Dr. David Kirkby wrote:

I went to a talk last night at the London Open Solaris User Group (LOSUG) by 
Darren Moffat - an Oracle engineer who had a major role in the ZFS encryption 
implementation in Solaris. I was particularly interested in this, as for a long 
time I've been concerned about security of data on my laptop.

I decided to try to secure my laptop, which is running Solaris 11 Express. I 
want to set the machine up so that during the boot process I get asked to enter 
the pass phrase to mount file system with my home directory on.

But I am having problems.

First I create the file system. As expected, Solaris asks for a pass phrase:

drkirkby@laptop:~# zfs create -o compression=on -o encryption=on -o
mountpoint=/export/home/davek rpool/export/home/davek
Enter passphrase for 'rpool/export/home/davek': ***
Enter again: **

Next I create a file on the file system and check it exists.

drkirkby@laptop:~# touch /export/home/davek/foo
drkirkby@laptop:~# ls /export/home/davek/foo
/export/home/davek/foo

Unmount the encrypted file system

drkirkby@laptop:~# zfs umount rpool/export/home/davek

Check  the file I created is no longer available

drkirkby@laptop:~# ls /export/home/davek/foo
/export/home/davek/foo: No such file or directory



Now I get a problem. I was expecting to have to enter the pass
phrase  again when attempting to mount the file system, but this is not being
requested. As you can see, I can mount the file system without the pass
phrase and read the data on the file system.


I covered that in the talk last night - in fact we had about a 5 minute 
discussion about why it is this way.


If you want the key to go away you need to run:

# zfs key -u rpool/export/home/davek

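After that the key is no longer in memory, so mounting again means 
reloading the key first.  A minimal sketch of the round trip, using 
your dataset:

 # zfs key -u rpool/export/home/davek
 # zfs key -l rpool/export/home/davek   (prompts for the passphrase)
 # zfs mount rpool/export/home/davek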

drkirkby@laptop:~# zfs mount rpool/export/home/davek
drkirkby@laptop:~# ls /export/home/davek/foo
/export/home/davek/foo
drkirkby@laptop:~#

This looks wrong to me, but I've no idea how to solve it.


No it is correct by design.

As I mentioned last night the reason for this is so that delegated 
administration of certain properties can work for users that don't have 
the 'key' delegation and don't have access to the wrapping keys.


For example changing a mountpoint causes an umount followed by a mount. 
 There are other changes that under the covers can cause a filesystem 
to be temporarily unmounted and remounted.



The next issue is how do I get the file system to mount when the

 machine is booted? I want to supply the pass phrase by typing it in,
 rather than from storing it in USB stick or other similar method.

Since this is your user home directory the ideal way would be a PAM 
module that ran during user login and requested the passphrase for the 
ZFS encrypted home dir.


There isn't one in Solaris 11 Express (snv_151a) at this time.


Any ideas what I need to do to get this file system to request the
pass phrase before mounting the file system?


There is source for a prototype PAM module in the old opensolaris.org 
zfs-crypto repository:


http://src.opensolaris.org/source/history/zfs-crypto/phase2/usr/src/lib/pam_modules/

You would need to take a clone of that repository and check out 
changeset  6749:6dded109490e  and see if that old PAM module could be 
hacked into submission.  Note that it uses private interfaces and doing 
so is not supported by any Oracle support contract you have.


--
Darren J Moffat
___
zfs-crypto-discuss mailing list
zfs-crypto-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-crypto-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Darren J Moffat

On 08/04/2011 14:59, Bob Friesenhahn wrote:

On Fri, 8 Apr 2011, Erik Trimble wrote:


Sorry, I read the question differently, as in I have X4500/X4540 now,
and want more of them, but Oracle doesn't sell them anymore, what can
I buy?. The 7000-series (now: Unified Storage) *are* storage appliances.


They may be storage appliances, but the user can not put their own
software on them. This limits the appliance to only the features that
Oracle decides to put on it.


Isn't that the very definition of an Appliance ?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Darren J Moffat

On 08/04/2011 17:47, Sašo Kiselkov wrote:

In short, I think the X4540 was an elegant and powerful system that
definitely had its market, especially in my area of work (digital video
processing - heavy on latency, throughput and IOPS - an area, where the
7000-series with its over-the-network access would just be a totally
useless brick).


As an engineer I'm curious: have you actually tried a suitably sized 
S7000, or are you assuming it won't perform suitably for you ?


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] disable zfs/zpool destroy for root user

2011-02-18 Thread Darren J Moffat

On 17/02/2011 20:44, Stefan Dormayer wrote:

is there a way to disable the subcommand destroy of zpool/zfs for the
root user?


ZFS doesn't actually require root for those; it actually checks for 
individual privileges.  Mostly that amounts to sys_mount and 
sys_config (for pool operations) - though those aren't documented 
requirements.


By default the root user ends up being able to do anything to any pool 
or dataset and all other users need to be granted access via 'zfs allow'.


Would it be useful if you could remove the ability for a root user in a 
zone to do zfs operations on delegated datasets ?  Doing this for the 
global zone is a little harder but for a local zone it can be done by 
extending the 'zfs allow' mechanism.


See:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=7011365

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] vscand + quarantine

2011-02-09 Thread Darren J Moffat

On 02/ 9/11 09:57 PM, Zoltan Gyula Beck wrote:

   I would like to ask if it's possible to check the content of
quarantine in the case where zfs uses vscand + antivirus. So is there any
command to list all the infected files in a dataset?


Any file which has been quarantined will have the av_quarantine bit set.

The easiest way to see that is with /usr/bin/ls  for example:

ls -/ v foo
-rw-r--r--   1 darrenm  staff 176411 Nov  4 14:56 foo
{archive,nohidden,noreadonly,nosystem,noappendonly,nonodump,noimmutable,av_modified,noav_quarantined,nonounlink,nooffline,nosparse}


In the above case the file has noav_quarantined; if it had been one that 
vscand had marked as quarantined it would say av_quarantined instead.


There is also a compact mode see ls(1) man page.

-rw-r--r--   1 darrenm  staff 176411 Nov  4 14:56 foo
{A---q---}

That is what it would look like if 'foo' was quarantined.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] vscand + quarantine

2011-02-09 Thread Darren J Moffat

On 02/ 9/11 11:50 PM, Zoltan Gyula Beck wrote:

   Yes, I know that way with ls, but how can I check all the infected
files on a dataset which is used by a file server with millions of
files?! I mean there is no official way to check infections, but I
have to use some custom scripts? (find, ls, grep)


The quarantine bit is just an attribute of the file.  ZFS is not a 
database so you can't do


select name from files where files.quarantine = true;

There is no way to do this other than getting the system attributes from 
each file directly.  The only way to do that from shell script is 
find/ls/grep.  You could write a C program that uses the same method 
that ls does to get the attributes but you will still have to visit 
every file in the file system.

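For example, a rough sketch of such a scan (the path is illustrative, 
and it will be slow on millions of files since every one is visited):

 $ find /export/share -type f -exec ls -/ c {} + | \
     awk '/{/ { if ($0 ~ /q/) print prev } { prev = $0 }'

This relies on the compact attribute line (the {A---q---} style line) 
following each file's entry, and prints the entries whose attribute 
line contains the q (av_quarantined) flag.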

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best choice - file system for system

2011-01-28 Thread Darren J Moffat

On 28/01/2011 13:37, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Tristram Scott

When it comes to dumping and restoring filesystems, there is still no

official

replacement for the ufsdump and ufsrestore.


Let's go into that a little bit.  If you're piping zfs send directly into
zfs receive, then it is an ideal backup method.  But not everybody can
afford the disk necessary to do that, so people are tempted to zfs send to
a file or tape.  There are precisely two reasons why that's not officially
recommended:


Officially - yes, you have it in quotes - but where is the official 
reference for this ?


In fact I'd say the opposite.  In Solaris 11 Express the NDMP daemon can 
back up using dump, tar or a zfs send stream.


This is also what the 'Sun ZFS Storage Appliance' does; see here:

http://www.oracle.com/technetwork/articles/systems-hardware-architecture/ndmp-whitepaper-192164.pdf

On page 8 of the PDF titled: About ZFS-NDMP Backup Support

It does point out though that it is full ZFS datasets only, but 
incremental backup and incremental restore are supported.


This has been tested and is known to work with at least the following 
backup applications:


• Oracle Secure Backup 10.3.0.2 and above
• Enterprise Backup Software (EBS) / Legato Networker 7.5 and above
• Symantec NetBackup 6.5.3 and above


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Darren J Moffat

On 06/01/2011 23:07, David Magda wrote:

On Jan 6, 2011, at 15:57, Nicolas Williams wrote:


Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification?  Or do you have some other goal?


Would running on recent T-series servers, which have have on-die crypto units, 
help any in this regard?


The on chip SHA-256 implementation is not yet used; see:

http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

Note that the fix I integrated only uses a software implementation of 
SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU 
hardware implementation of SHA256.  The reason for this is to do with 
boot time availability of the Solaris Cryptographic Framework and the 
need to have ZFS as the root filesystem.


Not yet changed; it turns out to be quite complicated to fix due to very 
early boot issues.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Darren J Moffat

On 07/01/2011 11:56, Sašo Kiselkov wrote:

On 01/07/2011 10:26 AM, Darren J Moffat wrote:

On 06/01/2011 23:07, David Magda wrote:

On Jan 6, 2011, at 15:57, Nicolas Williams wrote:


Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification?  Or do you have some other goal?


Would running on recent T-series servers, which have have on-die
crypto units, help any in this regard?


The on chip SHA-256 implementation is not yet used see:

http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

Note that the fix I integrated only uses a software implementation of
SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU
hardware implementation of SHA256.  The reason for this is to do with
boot time availability of the Solaris Cryptographic Framework and the
need to have ZFS as the root filesystem.

Not yet changed it turns out to be quite complicated to fix due to
very early boot issues.


Would it be difficult to implement both methods and allow ZFS to switch
to the hardware-accelerated crypto backend at runtime after it has been
brought up and initialized? It seems like one heck of a feature


Whether it is difficult or not depends on your level of familiarity with 
ZFS, boot and the cryptographic framework ;-)


For me no it wouldn't be difficult but it still isn't completely trivial.


(essentially removing most of the computational complexity of dedup).


Most of the data I've seen on the performance impact of dedup is not 
coming from the SHA256 computation; it is mostly about the additional IO 
to deal with the DDT.   Though lowering the overhead that SHA256 does 
add is always a good thing.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A few questions

2011-01-06 Thread Darren J Moffat

On 06/01/2011 00:14, Edward Ned Harvey wrote:

solaris engineers don't use?  Non-sun hardware.  Pretty safe bet you won't
find any Dell servers in the server room where solaris developers do their
thing.


You would lose that bet; not only would you find Dell, you would find 
many other big names as well as white box hand-built systems too.


Solaris developers use a lot of different hardware - Sun never made 
laptops so many of us have Apple (running Solaris on the metal and/or 
under virtualisation) or Toshiba or Fujitsu etc laptops.  There are also 
many workstations around the company that aren't Sun hardware as well as 
servers.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] stupid ZFS question - floating point operations

2010-12-23 Thread Darren J Moffat

On 22/12/2010 20:27, Garrett D'Amore wrote:

That said, some operations -- and cryptographic ones in particular --
may use floating point registers and operations because for some
architectures (sun4u rings a bell) this can make certain expensive


Well remembered!  There are sun4u optimisations that use the floating 
point unit but those only apply to the bignum code which in kernel is 
only used by RSA.



operations go faster. I don't think this is the case for secure
hash/message digest algorithms, but if you use ZFS encryption as found
in Solaris 11 Express you might find that on certain systems these
registers are used for performance reasons, either on the bulk crypto or
on the keying operations. (More likely the latter, but my memory of
these optimizations is still hazy.)


RSA isn't used at all by ZFS encryption, everything is AES (including 
key wrapping) and SHA256.


So those optimisations for floating point don't come into play for ZFS 
encryption.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] stupid ZFS question - floating point operations

2010-12-23 Thread Darren J Moffat

On 23/12/2010 15:18, Garrett D'Amore wrote:

Thanks for the clarification. I guess I need to go back and figure out
how ZFS crypto keying is performed. I guess most likely the key is
generated from some sort of one-way hash from a passphrase?


See http://blogs.sun.com/darren/entry/zfs_encryption_what_is_on where I 
explain all the type of keys used and how they are generated as well as 
how passphrases are turned into AES wrapping keys (using PKCS#5 PBE).

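As a concrete sketch of a non-passphrase wrapping key, you can generate 
a raw AES key with pktool(1) and point keysource at it (paths and 
dataset name are illustrative):

 # pktool genkey keystore=file outkey=/media/stick/mykey \
       keytype=aes keylen=128
 # zfs create -o encryption=on \
       -o keysource=raw,file:///media/stick/mykey tank/secret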

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] stupid ZFS question - floating point operations

2010-12-23 Thread Darren J Moffat

On 23/12/2010 17:09, joerg.schill...@fokus.fraunhofer.de wrote:

Darren J Moffat darren.mof...@oracle.com wrote:


On 22/12/2010 20:27, Garrett D'Amore wrote:

That said, some operations -- and cryptographic ones in particular --
may use floating point registers and operations because for some
architectures (sun4u rings a bell) this can make certain expensive


Well remembered!  There are sun4u optimisations that use the floating
point unit but those only apply to the bignum code which in kernel is
only used by RSA.


It may be a guess caused by the fact that integer division and multiplication
is inside the FPU on SPARC processors.


Not a guess: it is code to do big number integer arithmetic that is 
optimised for sun4u by explicitly (ab)using the FPU.  This isn't 
guessing; it was a deliberate design choice.


Specifically this code here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/bignum/sun4u/

Note that there are separate kernel and user land variants of that.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] a single nfs file system shared out twice with different permissions

2010-12-21 Thread Darren J Moffat

On 20/12/2010 19:26, Geoff Nordli wrote:

I guess he has some application he can imprison into a specific read-only
subdirectory, while some other application should be able to read/write or
something like that, using the same username, on the same machine.


It is the same application, but for some functions it needs to use read-only
access or it will modify the files when I don't want it to.


Another alternative, if the application is running on Solaris, is to 
run it with the basic file_write privilege removed.  This basic 
privilege was added for exactly this type of use case.


$ ppriv -e -s EPIL=basic,!file_write myapp

If it is being started by an SMF service you can remove file_write in 
the method_credential section - see smf_method(5).

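For example, a sketch of such a method_credential entry in the service 
manifest (the user and group are whatever the service already runs as):

 <method_credential user='myapp' group='myapp'
     privileges='basic,!file_write' />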

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] relationship between ARC and page cache

2010-12-21 Thread Darren J Moffat

On 21/12/2010 14:25, Phil Harman wrote:

Hi Jeff,

ZFS support for mmap() was something of an afterthought. The current
Solaris virtual memory infrastructure didn't have the features or
performance required, which is why ZFS ended up with the ARC.

Yes, you've got it. When we mmap() a ZFS file, there are two main caches
involved: the ZFS ARC and the good old Solaris page cache. The reason
for poor performance is the overhead of keeping the two caches in sync,
but contention for RAM is also an issue.



Clamping the ARC is probably a good thing in your case, but it only
addresses part of the problem.


Another alternative to try would be setting primarycache=metadata on the 
ZFS dataset that contains the mmap files.  That way you are only turning 
off the ZFS ARC cache of the file content for that one dataset rather 
than clamping the ARC.

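For example (dataset name assumed):

 # zfs set primarycache=metadata tank/mmapdata

That leaves the page cache to hold the file contents for the mmap'd 
files while the ARC still caches the dataset's metadata.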

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] a single nfs file system shared out twice with different permissions

2010-12-20 Thread Darren J Moffat

On 18/12/2010 07:09, Geoff Nordli wrote:

I am trying to configure a system where I have two different NFS shares
which point to the same directory.  The idea is if you come in via one path,
you will have read-only access and can't delete any files, if you come in
the 2nd path, then you will have read/write access.


That sounds very similar to what you would do with Trusted Extensions. 
The read/write label would be a higher classification than the read-only 
one - since you can read down, can't see higher and need to be equal to 
modify.


For more information on Trusted Extensions start with these resources:


Oracle Solaris 11 Express Trusted Extensions Collection

http://docs.sun.com/app/docs/coll/2580.1?l=en

OpenSolaris Security Community pages on TX:

http://hub.opensolaris.org/bin/view/Community+Group+security/tx

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS ... open source moving forward?

2010-12-13 Thread Darren J Moffat

On 12/13/10 05:55 PM, Miles Nordin wrote:

+ Oracle publishes the promised yet-to-be-delivered zfs-crypto
  paper that's thorough enough to write a compatible implementation


It isn't yet the full paper but a lot of the on disk details are in my 
latest blog entry and all of the structs necessary for the on disk 
format are in the CTF data of the binaries.


http://blogs.sun.com/darren/entry/zfs_encryption_what_is_on

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-12-02 Thread Darren J Moffat

On 17/11/2010 21:58, Bill Sommerfeld wrote:

In particular, the mechanism by which dedup-friendly block IV's are
chosen based on the plaintext needs public scrutiny. Knowing Darren,
it's very likely that he got it right, but in crypto, all the details
matter and if a spec detailed enough to allow for interoperability isn't
available, it's safest to assume that some of the details are wrong.


That is described here:

http://blogs.sun.com/darren/entry/zfs_encryption_what_is_on

If dedup=on for the dataset the per block IVs are generated 
differently.  They are generated by taking an HMAC-SHA256 of the 
plaintext and using the left most 96 bits of that as the IV.  The key 
used for the HMAC-SHA256 is different to the one used by AES for the 
data encryption, but is stored (wrapped) in the same keychain entry; 
just like the data encryption key, a new one is generated when doing a 
'zfs key -K dataset'.  Obviously we couldn't calculate this IV when 
doing a read so it has to be stored.

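You can illustrate the derivation outside ZFS with openssl(1) - a 
sketch only, not the actual code path or a real key:

 $ printf 'block plaintext' | \
     openssl dgst -sha256 -hmac 'iv-key' -binary | \
     xxd -p -c 32 | cut -c1-24

The 24 hex characters are the left most 96 bits of the HMAC-SHA256, 
which is the size of IV that gets stored.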

This was also suggested independently by other well known people 
involved in encrypted filesystems while it was discussed on a public 
forum (most of that thread was cross posted to zfs-crypto-discuss).


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS snapshot limit?

2010-12-01 Thread Darren J Moffat

On 01/12/2010 13:36, f...@ll wrote:

I must send a zfs snapshot from one server to another. The snapshot is 
130GB in size. Now I have a question: does zfs have any limit on sending 
a file of this size?


No.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on a removable device

2010-11-26 Thread Darren J Moffat

On 26/11/2010 13:16, Pavel Heimlich wrote:

I tried to transfer some data between two S11 machines via a usb harddrive with 
zfs on it, but importing the zpool failed (with some assertion error I did not 
write down) because I did not export it first (on the first machine). I had to 
go back to the first machine, plug the drive in again and export the fs.

Are there some zfs / OS parameters I could set so that my usb drive with zfs on 
it would meet the expectations one has from a removable drive? (i.e. safe to 
remove +-anytime)


No, you run zpool export first - that is the OS parameter - and this is 
no different to any other filesystem on any other operating system.  If 
you don't export it first, how is Solaris or ZFS supposed to know the 
difference between you yanking it out because you are purposely moving 
it and the drive accidentally falling out or some other error that 
causes it to become unavailable?  Hint: the answer is you can't, unless 
you administratively tell ZFS that the pool is supposed to be going 
away, and the way you do that is by 'zpool export'.


Unlike other filesystems though ZFS will be consistent on disk.

You didn't have to plug it back in to the original system; you could 
have just forced the import.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-23 Thread Darren J Moffat

On 23/11/2010 21:01, StorageConcepts wrote:

r...@solaris11:~# zfs list mypool/secret_received
cannot open 'mypool/secret_received': dataset does not exist
r...@solaris11:~# zfs send mypool/plaint...@test | zfs receive -o encryption=on 
mypool/secret_received
cannot receive: cannot override received encryption
---

Is there a implementation/technical  reason for not allowing this ?


Yes there is: it is because of how the ZPL metadata is written to disk 
- it is slightly different between the encrypted and non-encrypted cases, 
and unfortunately that difference shows up even in the ZFS send stream.


It is a known (and documented in the Admin guide) restriction.

If we allowed the receive to proceed the result would be that some ZPL 
metadata (including filenames) for some files may end up on disk in the 
clear.  There are various cases where this could happen but it is most 
likely to happen when the filesystem is being used by Windows clients, 
because of the combination of things that happen - but it can equally 
well happen with only local ZPL usage too, particularly if there are 
large ACLs in use.


In the meantime the best workaround I can offer is to use 
tar/cpio/rsync, but obviously you lose your snapshot history that way.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-19 Thread Darren J Moffat

On 19/11/2010 00:39, David Magda wrote:

On Nov 16, 2010, at 05:09, Darren J Moffat wrote:


Both CCM[1] and GCM[2] are provided so that if one turns out to have
flaws hopefully the other will still be available for use safely even
though they are roughly similar styles of modes.

On systems without hardware/cpu support for Galios multiplication
(Intel Westmere and later and SPARC T3 and later) GCM will be slower
because the Galios field multiplication has to happen in software
without any hardware/cpu assist. However depending on your workload
you might not even notice the difference.


Both modes of operation are authenticating. At one point the design of
ZFS crypto had the checksum automatically go to SHA-256 when it was
enabled. [1] Is SHA activation still the case, or are the two modes of
operations simply used in themselves to verify data integrity?


That is still the case; the blockpointer contains the IV, the SHA256 
checksum (truncated) and the MAC from CCM or GCM.



Also, are slog and cache devices encrypted at this time? Given a pool,
and the fact that only particular data sets on it could be encrypted,
would these special devices be entirely encrypted, or only data from the
particular encrypted data set/s? I would also assume the in-memory ARC
would be clear-text.


The ZIL, whether it is in the pool or on a slog, is always encrypted for 
an encrypted dataset; it is encrypted in exactly the same way.


Data from encrypted datasets does not currently go to the L2ARC cache 
devices.


The in memory ARC is in the clear and it has to be because those buffers 
can be shared via zero copy means to other parts of the system including 
other filesystems like NFS and CIFS.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-19 Thread Darren J Moffat
The design for ZFS crypto was done in the open via opensolaris.org and 
versions of the source (though not the final version at this time) are 
available on opensolaris.org.


It was reviewed by people internal and external to Sun/Oracle who have 
considerable crypto experience.  Important parts of the cryptography 
design were also discussed on other archived public forums as well as 
zfs-crypto-discuss.


The design was also presented at IEEE 1619 SISWG and at SNIA.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-18 Thread Darren J Moffat

On 17/11/2010 20:04, Miles Nordin wrote:

djm == Darren J Moffat darr...@opensolaris.org writes:


djm  http://blogs.sun.com/darren/entry/introducing_zfs_crypto_in_oracle
djm  http://blogs.sun.com/darren/entry/assued_delete_with_zfs_dataset
djm  
http://blogs.sun.com/darren/entry/compress_encrypt_checksum_deduplicate_with

Is there a URL describing the on-disk format and implementation details?


It is a work in progress.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-18 Thread Darren J Moffat

On 18/11/2010 03:55, grarpamp wrote:

One reason you may want to select aes-128-gcm rather than aes-128-ccm is
that GCM is one of the modes for AES in NSA Suite B[3], but CCM is not.



Are there symmetric algorithms other than AES that are of interest ?


How might AES-XTS [1] be able to fit into the the ZFS picture?


It doesn't.  We don't need it because we don't need to have the 
ciphertext the same size as the plaintext because we have space to store 
a sufficiently large MAC (and store an IV as well).  This is why CCM and 
GCM were chosen rather than XTS or EME2.



Additionally given the user may wish to trade off compression, dedup,
the number of encryptable blocks [2], etc for any particular selectable
algorithm.


We don't need to make those compromises in ZFS, you can compress and 
encrypt and dedup (it happens in that order).


http://blogs.sun.com/darren/entry/compress_encrypt_checksum_deduplicate_with

For changing the encryption key see the discussion of 'zfs key -K' in 
the zfs(1M) man page:


http://docs.sun.com/app/docs/doc/821-1462/zfs-1m?l=ena=view

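For example, rekeying so that newly written blocks use a fresh data 
encryption key (dataset name from earlier in the thread):

 # zfs key -K tank/darren
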
--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-17 Thread Darren J Moffat

On 17/11/2010 10:17, Richard Elling wrote:

I know there are far more apps without support for encryption than
with it. And given the ever more stringent government regulations in
the US, there are plenty of customers chomping at the bit for
encryption at the storage array.


I do not disagree. There are many products in the market that
seamlessly encrypt data. But, vi has had encryption for almost
30 years, so there is clearly no barrier to app writers. As more
development moves to the cloud, encryption comes almost free
at the app layer. The only thing left is the legacy apps...


Encryption at the application layer solves a different set of problems 
to encryption at the storage layer.  Just like the encryption in ZFS 
solves a different set of problems to full disk encryption in the drive 
firmware.


These sets have overlapping regions and depending on security policies 
one or more may be the best solution.


As always, encryption is the easy part; it is key management that is 
hard, because key management enters the realm of policy and can be hard 
to scale out to large numbers of apps.


There is on one correct solution for where to do encryption just like 
there is on one correct way to write files onto persistent media. 
Choice is important and sometimes choosing more than one is the correct 
thing to do.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-17 Thread Darren J Moffat

On 17/11/2010 11:41, Erik Trimble wrote:

There is on one correct solution for where to do encryption just
like there is on one correct way to write files onto persistent media.
Choice is important and sometimes choosing more than one is the
correct thing to do.


I'm assuming you meant no the two times you wrote on in that
second-to-last sentence. :-)


Yes thanks, it should have read:

There is no one correct solution for where to do encryption just
like there is no one correct way to write files onto persistent media.
Choice is important and sometimes choosing more than one is the
correct thing to do.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-17 Thread Darren J Moffat

On 17/11/2010 14:18, Bob Friesenhahn wrote:

On Wed, 17 Nov 2010, Markus Kovero wrote:


Does Oracle support Solaris 11 Express in production systems?
-- richard


Yes, You need Premier support plan from Oracle for that.
Afaik, sol11 express is production ready, and is going to be updated
to real Solaris 11, and is supported even with non-oracle hardware if
you have the money (and certified system).


Solaris 11 Express may be production ready but is Oracle Premier
Support prepared to support it in production? That seems like the vital
question to me. As for myself, I will wait a while and observe before
assigning my trust.


From the FAQ[1] linked from here:

http://www.oracle.com/technetwork/server-storage/solaris11/overview/index.html


Licensing and Support for Oracle Solaris 11
Express

11-Can I get support for Oracle Solaris 11 Express?

Yes. Oracle Solaris 11 Express is covered under the Oracle
Premier Support for Operating Systems or Oracle Premier
Support for Systems support option for Oracle hardware, and
Oracle Solaris Premier Subscription for non-Oracle
hardware. Customers must choose either of these support
options should they wish to deploy Oracle Solaris 11 Express
into a production environment.

[1] 
http://www.oracle.com/technetwork/server-storage/solaris11/overview/faqs-oraclesolaris11express-185609.pdf


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-16 Thread Darren J Moffat

On 11/15/10 19:36, David Magda wrote:

On Mon, November 15, 2010 14:14, Darren J Moffat wrote:

Today Oracle Solaris 11 Express was released and is available for
download[1], this release includes on disk encryption support for ZFS.

Using ZFS encryption support can be as easy as this:

  # zfs create -o encryption=on tank/darren
  Enter passphrase for 'tank/darren':
  Enter again:


Looking forwarding to playing with it. Some questions:
  1. Is it possible to do a 'zfs create -o encryption=off
tank/darren/music' after the above command? I don't much care if my MP3s
are encrypted. :)


No, all child filesystems must be encrypted as well.  This is to avoid 
problems with mounting during boot / pool import.  It is possible this 
could be relaxed in the future but it is highly dependent on some other 
things that may not work out.

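So, for example, this is rejected rather than creating an unencrypted 
child below an encrypted parent:

 # zfs create -o encryption=off tank/darren/music   (fails)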


  2. Both CCM and GCM modes of operation are supported: can you recommended
which mode should be used when? I'm guessing it's best to accept the
default if you're not sure, but what if we want to expand our knowledge?


You've preempted my next planned posting ;-)  But I'll attempt to give 
an answer here:


'on' maps to aes-128-ccm, because it is the fastest of the 6 available
modes of encryption currently provided.  Also I believe it is the 
current wisdom of cryptographers (which I do not claim to be) that AES 
128 is the preferred key length due to recent discoveries about AES 256 
that are not known to impact AES 128.


Both CCM[1] and GCM[2] are provided so that if one turns out to have 
flaws hopefully the other will still be available for use safely even 
though they are roughly similar styles of modes.


On systems without hardware/cpu support for Galios multiplication (Intel 
Westmere and later and SPARC T3 and later) GCM will be slower because 
the Galios field multiplication has to happen in software without any 
hardware/cpu assist.  However depending on your workload you might not 
even notice the difference.


One reason you may want to select aes-128-gcm rather than aes-128-ccm is 
that GCM is one of the modes for AES in NSA Suite B[3], but CCM is not.

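Selecting a specific mode is just a property value at create time, for 
example (dataset name illustrative; the encryption property can only be 
set at creation):

 # zfs create -o encryption=aes-128-gcm tank/suiteb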

Are there symmetric algorithms other than AES that are of interest ?
The wrapping key algorithm currently matches the data encryption key 
algorithm, is there interest in providing different wrapping key 
algorithms and configuration properties for selecting which one ?  For 
example doing key wrapping with an RSA keypair/certificate ?


[1] http://en.wikipedia.org/wiki/CCM_mode
[2] http://en.wikipedia.org/wiki/Galois/Counter_Mode
[3] http://en.wikipedia.org/wiki/NSA_Suite_B_Cryptography

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-15 Thread Darren J Moffat
Today Oracle Solaris 11 Express was released and is available for 
download[1], this release includes on disk encryption support for ZFS.


Using ZFS encryption support can be as easy as this:

# zfs create -o encryption=on tank/darren
Enter passphrase for 'tank/darren':
Enter again:
#

Continued at:

http://blogs.sun.com/darren/entry/introducing_zfs_crypto_in_oracle
http://blogs.sun.com/darren/entry/assued_delete_with_zfs_dataset
http://blogs.sun.com/darren/entry/compress_encrypt_checksum_deduplicate_with

[1] 
http://www.oracle.com/technetwork/server-storage/solaris11/downloads/index.html


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how to quiesce and unquiesc zfs and zpool for array/hardware snapshots ?

2010-11-12 Thread Darren J Moffat

On 12/11/2010 13:01, sridhar surampudi wrote:

How can I quiesce / freeze all writes to zfs and zpool if I want to take 
hardware level snapshots or an array snapshot of all devices under a pool ?
Are there any commands or ioctls or APIs available ?


zpool export pool
zpool import pool

That is the only documented and supported way to do it that I'm aware 
of, and yes that does take the pool offline, but that way you can be 
sure it isn't changing.

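So a sketch of the whole sequence around an array snapshot would be:

 # zpool export pool
   (take the array/hardware snapshot of all LUNs in the pool here)
 # zpool import pool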

The only other way I know of to freeze a pool is for testing purposes 
only and if you want to learn about that you need to read the code 
because I'm not going to disclose it here in case it is miss used.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool split how it works?

2010-11-10 Thread Darren J Moffat

On 10/11/2010 11:18, sridhar surampudi wrote:

I was wondering how zpool split works or implemented.

If a pool pool1 is on a mirror having two devices dev1 and dev2 then using 
zpool split I can split with the new pool name say pool-mirror on dev2.

How can split change the metadata on dev2 and rename/replace and associate it 
with the new name, i.e. pool-mirror ??


Exactly what isn't clear from the description in the man page ?

 zpool split [-R altroot] [-n] [-o mntopts] [-o
 property=value] pool newpool [device ...]

 Splits off one disk from each mirrored top-level vdev in
 a  pool and creates a new pool from the split-off disks.
 The original pool must be made up of one or more mirrors
 and must not be in the process of resilvering. The split
 subcommand chooses the last device in each  mirror  vdev
 unless  overridden by a device specification on the com-
 mand line.

 When using a device argument, split includes the  speci-
 fied  device(s)  in  a  new pool and, should any devices
 remain unspecified, assigns the last device in each mir-
 ror  vdev  to that pool, as it does normally. If you are
 uncertain about the outcome of a split command, use  the
 -n  (dry-run)  option to ensure your command will have
 the effect you intend.

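So for your example, something like this (dry run first, then for real):

 # zpool split -n pool1 pool-mirror dev2
 # zpool split pool1 pool-mirror dev2
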
Or are you really asking about the implementation details ?  If you want 
to know how it is implemented then you need to read the source code.


Here would be a good starting point:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_pool.c#zpool_vdev_split

Which ends up in kernel here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/zfs_ioctl.c#zfs_ioc_vdev_split


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does a zvol use the zil?

2010-10-21 Thread Darren J Moffat

Yes, ZVOLs do use the ZIL - if the write cache has been disabled on the 
zvol by the DKIOCSETWCE ioctl, or the sync property is set to always.

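For example, to force every write to a zvol through the ZIL regardless 
of the write cache setting (zvol name assumed):

 # zfs set sync=always tank/myvol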

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does a zvol use the zil?

2010-10-21 Thread Darren J Moffat

On 21/10/2010 18:59, Maurice Volaski wrote:

Does the write cache referred to above refer to the Writeback Cache

 property listed by stmfadm list-lu -v (when a zvol is a target) or
 is that some other cache and if it is, how does it interact with the
 first one?

Yes it does; that basically results in the DKIOCGETWCE ioctl being called 
on the ZVOL (though you won't see that in truss because it is called from 
the comstar kernel modules, not directly from stmfadm in userland).


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Darren J Moffat

On 20/10/2010 12:20, Edward Ned Harvey wrote:

It's one of the big selling points, reasons for ZFS to exist.  You should
always give ZFS JBOD devices to work on, so ZFS is able to scrub both of the
redundant sides of the data, and when a checksum error occurs, ZFS is able
to detect *and* correct it.  Don't use hardware raid.


That isn't the recommended best practice, you are stating it far too 
strongly.


The recommended best practice is to always create ZFS pools with 
redundancy in the control of ZFS.  That doesn't require that the back 
end storage be JBOD or full disks nor does it require you not to use 
hardware raid. Some or all of which are impossible if you are using SAN 
or other remote block storage devices in many cases - and certainly the 
case if the SAN is provided by a Sun ZFS Storage appliance.

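For example, even when both devices are SAN LUNs, keeping the 
redundancy in the control of ZFS is just (device names illustrative):

 # zpool create tank mirror c0t0d0 c0t1d0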

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Myth? 21 disk raidz3: Don't put more than ___ disks in a vdev

2010-10-20 Thread Darren J Moffat

On 20/10/2010 14:03, Edward Ned Harvey wrote:

In a discussion a few weeks back, it was mentioned that the Best Practices
Guide says something like Don't put more than ___ disks into a single
vdev.  At first, I challenged this idea, because I see no reason why a
21-disk raidz3 would be bad.  It seems like a good thing.


If you have those 21 disks spread across 3 top level vdevs, each a 
raidz3 of 7 disks, then ZFS will stripe across 3 vdevs rather than 
1.

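A sketch of that layout (device names illustrative):

 # zpool create tank \
     raidz3 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 \
     raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
     raidz3 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0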

Here is an example from the Sun ZFS Storage Appliance GUI:

Each O is a score out of 5
--
AVAIL   PERF    CAPACITY
Double parity RAID  _   OOO__   _   1.45T
Mirrored_   O   O   808G
Single parity RAID, narrow stripes  OOO__   _   OO___   1.18T
Striped _   O   O   1.84T
Triple mirrored _   O   _   538G
Triple parity RAID, wide stripes_   OO___   O   1.31T

--

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Darren J Moffat

On 18/10/2010 07:44, Habony, Zsolt wrote:

I have seen a similar question on this list in the archive but haven’t
seen the answer.

Can I avoid striping across top level vdevs ?

If I use a zpool which is one LUN from the SAN, and when it becomes full
I add a new LUN to it.

But I cannot guarantee that the LUN will not come from the same spindles
on the SAN.


That sounds like a problem with your SAN config if that matters to you.


Can I force zpool to not to stripe the data ?


You can't, but why do you care ?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

