Re: [zfs-discuss] zfs backup and restore

2007-05-25 Thread Louwtjie Burger

A good place to start is: http://www.opensolaris.org/os/community/zfs/

Have a look at:

http://www.opensolaris.org/os/community/zfs/docs/

as well as

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#

Create some files, which you can use as disks within ZFS, and demo to
your customer precisely what happens on a small scale using snapshots,
clones and promotions.
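A minimal sketch of such a demo, using file-backed vdevs (file names, sizes
and dataset names are just illustrative):

  # mkfile 128m /var/tmp/disk1 /var/tmp/disk2
  # zpool create demopool mirror /var/tmp/disk1 /var/tmp/disk2
  # zfs create demopool/fs
  # zfs snapshot demopool/fs@demo
  # zfs clone demopool/fs@demo demopool/fsclone
  # zfs promote demopool/fsclone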

Cheers

On 5/25/07, Roshan Perera [EMAIL PROTECTED] wrote:

Hi,

I believe Solaris 10 update 3 supports ZFS backup and restore. How can I
upgrade previous versions of Solaris to run ZFS backup/restore, and where can I
download the relevant versions?

Also, I have a customer wanting to know (now I am interested too) the detailed
information on how ZFS snapshots and cloning work, mostly to justify/explain
the speed of cloning.

Thanks

Roshan


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Preparing to compare Solaris/ZFS and FreeBSD/ZFS performance.

2007-05-25 Thread Claus Guttesen

 I have just (re)installed FreeBSD amd64 current with gcc 4.2 with src
 from May. 21'st on a dual Dell PE 2850.  Does the post-gcc-4-2 current
 include all your zfs-optimizations?

 I have commented out INVARIANTS, INVARIANTS_SUPPORT, WITNESS and
 WITNESS_SKIPSPIN in my kernel and recompiled with CPUTYPE=nocona.

 A default Solaris install fares better I/O-wise than a default
 FreeBSD install: Solaris writes could pass 100 MB/s (zpool iostat 1) while
 FreeBSD would write 30-40 MB/s. After adding the following to
 /boot/loader.conf, writes peak at 90-95 MB/s:

 vm.kmem_size_max=2147483648
 vfs.zfs.arc_max=1610612736

 Now FreeBSD seems to perform almost as well as Solaris I/O-wise, although
 I don't have any numbers to justify my statement. One difference: I did not
 import the PostgreSQL data on Solaris.

This patch also improves concurrency in VFS:

http://people.freebsd.org/~pjd/patches/vfs_shared.patch


I applied the patch and it seems to speed up my reads and writes.
Watching zpool iostat I saw reads at 155 MB/s and writes at 111 MB/s.
But it also seems to introduce some minor complete stops when accessing the
zpool, lasting for 10-20 secs. I apologize that I'm not very specific, but I
only had time to test disk I/O, not to dig into the issues.


When you want to operate on mmap(2)ed files, you should disable the ZIL and
remount the file systems:

# sysctl vfs.zfs.zil_disable=1
# zpool export name
# zpool import name


Won't disabling the ZIL reduce the chance of a consistent ZFS filesystem
if - for some reason - the server did an unplanned reboot?

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Casper . Dik


Depends on the guarantees. Some RAID systems have built-in block
checksumming.


But we all know that block checksums stored with the blocks do
not catch a number of common errors.

(Ghost writes, misdirected writes, misdirected reads)

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Need guidance on RAID 5, ZFS, and RAIDZ on home file server

2007-05-25 Thread Joerg Schilling
[EMAIL PROTECTED] wrote:

  IRIX was much earlier than Solaris; Solaris was pretty late in the 64 bit
  game with Solaris 7.
 
 And Alpha did not have a real 64 bit port as they did implement ILP64.
 With ILP64 your application does not really notice that it runs in 64 bits
 if you only use sizeof().

 ILP64?

 AFAIK, Alpha had int as a 32 bit type and L and P as 64 bit types;
 even ILP64 would be a proper 64 bit OS if a tad difficult to port
 some code to.

 That's why time_t was a 32 bit value (oops).

Oops, you are right :-)

Is it possible that I confused this with Linux on Alpha?
GCC was not 64-bit clean until GCC 3.x. If you compiled with GCC 2.x you got
more than one warning for bad printf format strings, and people were very
upset about not being able to use gcc to compile 64-bit SPARC binaries.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
   [EMAIL PROTECTED](uni)  
   [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Constantin Gonzalez
Hi,

I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is
more important for me. And yes, I'm looking forward to both being integrated
with each other.

Meanwhile, what is the best way to upgrade a post-b61 system that is booted
from ZFS?


I'm thinking:

1. Boot from ZFS
2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS
   boot environment:
   http://blogs.sun.com/timf/entry/an_easy_way_to_manage
3. Loopback mount the new OS ISO image
4. Run the installer from the loopbacked ISO image in upgrade mode on the clone
5. Mark the clone to be booted the next time
6. Reboot into the upgraded OS.
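For step 3, a minimal sketch of loopback-mounting the ISO (path, lofi device
number and mount point are illustrative):

   # lofiadm -a /var/tmp/sol-nv-b65.iso
   /dev/lofi/1
   # mount -F hsfs -o ro /dev/lofi/1 /mnt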


Questions:

- How exactly do I do step 4? Before, luupgrade did everything for me, now
  what manpage do I need to do this?

- Did I forget something above? I'm ok with losing some logfiles and stuff that
  maybe changed between the clone and the reboot, but is there anything else?

- Did someone already blog about this and I haven't noticed yet?


Cheers,
   Constantin

-- 
Constantin GonzalezSun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Error zfs creating new zone

2007-05-25 Thread Wesley Naves de faira

Hello,
  I got an error when I was creating a new zone on ZFS. I'm using Solaris
11/06 with 118855-36. I have several other machines with identical hardware,
but only this one shows this behaviour.


[EMAIL PROTECTED]:/] # fmdump -V

TIME                 UUID                                 SUNW-MSG-ID
May 23 17:39:43.4886 b3c6c2b6-f41a-eede-dce4-dd42e5c5424a ZFS-8000-CS

nvlist version: 0
   version = 0x0
   class = list.suspect
   uuid = b3c6c2b6-f41a-eede-dce4-dd42e5c5424a
   code = ZFS-8000-CS
   diag-time = 1179952783 488616
   de = (embedded nvlist)
   nvlist version: 0
      version = 0x0
      scheme = fmd
      authority = (embedded nvlist)
      nvlist version: 0
         version = 0x0
         product-id = LX200
         chassis-id = COWYRR10RR0007
         server-id = server
      (end authority)
      mod-name = zfs-diagnosis
      mod-version = 1.0
   (end de)

   fault-list-sz = 0x1
   fault-list = (array of embedded nvlists)
   (start fault-list[0])
   nvlist version: 0
      version = 0x0
      class = fault.fs.zfs.pool
      certainty = 0x64
      asru = (embedded nvlist)
      nvlist version: 0
         version = 0x0
         scheme = zfs
         pool = 0x9c85a50f25483bc6
      (end asru)
      resource = (embedded nvlist)
      nvlist version: 0
         version = 0x0
         scheme = zfs
         pool = 0x9c85a50f25483bc6
      (end resource)
   (end fault-list[0])

   fault-status = 0x1
   __ttl = 0x1
   __tod = 0x4654a68f 0x1d20ae28
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Toby Thain


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how  
to (re)configure the controller or restore the config without  
destroying your data? Do you know for sure that a spare-part  
and firmware will be identical, or at least compatible? How good  
is your service subscription? Maybe only scrapyards and museums  
will have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees  
that RAID subsystems cannot.


Depends on the guarantees. Some RAID systems have built-in block
checksumming.




Which still isn't the same. Sigh.

--T___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Torrey McMahon

Toby Thain wrote:


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is 
your service subscription? Maybe only scrapyards and museums will 
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees 
that RAID subsystems cannot.


Depends on the guarantees. Some RAID systems have built-in block
checksumming.




Which still isn't the same. Sigh. 


Yep. You get what you pay for. Funny how ZFS is free to purchase,
isn't it?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Malachi de Ælfweald

I'm actually wondering the same thing, because I have b62 with the ZFS bits
but need the snapshot -r functionality.

Malachi

On 5/25/07, Constantin Gonzalez [EMAIL PROTECTED] wrote:


Hi,

I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is
more important for me. And yes, I'm looking forward to both being integrated
with each other.

Meanwhile, what is the best way to upgrade a post-b61 system that is booted
from ZFS?


I'm thinking:

1. Boot from ZFS
2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS
   boot environment:
   http://blogs.sun.com/timf/entry/an_easy_way_to_manage
3. Loopback mount the new OS ISO image
4. Run the installer from the loopbacked ISO image in upgrade mode on the clone
5. Mark the clone to be booted the next time
6. Reboot into the upgraded OS.


Questions:

- How exactly do I do step 4? Before, luupgrade did everything for me, now
  what manpage do I need to do this?

- Did I forget something above? I'm ok with losing some logfiles and stuff that
  maybe changed between the clone and the reboot, but is there anything else?

- Did someone already blog about this and I haven't noticed yet?


Cheers,
   Constantin

--
Constantin Gonzalez                          Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Constantin Gonzalez
Hi,

 Our upgrade story isn't great right now.  In the meantime,
 you might check out Tim Haley's blog entry on using
 bfu with zfs root.

thanks.

But doesn't live upgrade just start the installer from the new OS
DVD with the right options? Can't I just do that too?

Cheers,
   Constantin

 
 http://blogs.sun.com/timh/entry/friday_fun_with_bfu_and
 
 lori
 
 Constantin Gonzalez wrote:
 Hi,

 I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The
 latter is
 more important for me. And yes, I'm looking forward to both being
 integrated
 with each other.

 Meanwhile, what is the best way to upgrade a post-b61 system that is
 booted
 from ZFS?


 I'm thinking:

 1. Boot from ZFS
 2. Use Tim's excellent multiple boot datasets script to create a new
 cloned ZFS
boot environment:
http://blogs.sun.com/timf/entry/an_easy_way_to_manage
 3. Loopback mount the new OS ISO image
 4. Run the installer from the loopbacked ISO image in upgrade mode on
 the clone
 5. Mark the clone to be booted the next time
 6. Reboot into the upgraded OS.


 Questions:

 - How exactly do I do step 4? Before, luupgrade did everything for me,
 now
   what manpage do I need to do this?

 - Did I forget something above? I'm ok with losing some logfiles and
 stuff that
   maybe changed between the clone and the reboot, but is there
 anything else?

 - Did someone already blog about this and I haven't noticed yet?


 Cheers,
Constantin

   
 

-- 
Constantin GonzalezSun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Lori Alt

Constantin Gonzalez wrote:

Hi,

  

Our upgrade story isn't great right now.  In the meantime,
you might check out Tim Haley's blog entry on using
bfu with zfs root.



thanks.

But doesn't live upgrade just start the installer from the new OS
DVD with the right options? Can't I just do that too?
  


I'll look at it and see if I can give you a better recommendation.
I don't want to give you bad advice.  I might have more
information later today.

Lori



Cheers,
   Constantin

  

http://blogs.sun.com/timh/entry/friday_fun_with_bfu_and

lori

Constantin Gonzalez wrote:


Hi,

I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The
latter is
more important for me. And yes, I'm looking forward to both being
integrated
with each other.

Meanwhile, what is the best way to upgrade a post-b61 system that is
booted
from ZFS?


I'm thinking:

1. Boot from ZFS
2. Use Tim's excellent multiple boot datasets script to create a new
cloned ZFS
   boot environment:
   http://blogs.sun.com/timf/entry/an_easy_way_to_manage
3. Loopback mount the new OS ISO image
4. Run the installer from the loopbacked ISO image in upgrade mode on
the clone
5. Mark the clone to be booted the next time
6. Reboot into the upgraded OS.


Questions:

- How exactly do I do step 4? Before, luupgrade did everything for me,
now
  what manpage do I need to do this?

- Did I forget something above? I'm ok with losing some logfiles and
stuff that
  maybe changed between the clone and the reboot, but is there
anything else?

- Did someone already blog about this and I haven't noticed yet?


Cheers,
   Constantin

  
  


  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Constantin Gonzalez
Hi Malachi,

Malachi de Ælfweald wrote:
 I'm actually wondering the same thing because I have b62 w/ the ZFS
 bits; but need the snapshot's -r functionality.

you're lucky, it's already there. From my b62 machine's man zfs:

 zfs snapshot [-r] filesystem@snapname|volume@snapname

 Creates  a  snapshot  with  the  given  name.  See   the
 Snapshots section for details.

 -rRecursively create  snapshots  of  all  descendant
   datasets.  Snapshots are taken atomically, so that
   all recursive snapshots  correspond  to  the  same
   moment in time.
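
For example, a recursive, atomic snapshot of a boot environment and everything
under it (dataset and snapshot names are illustrative):

 # zfs snapshot -r rootpool/be@pre-upgrade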

Or did you mean send -r?

Best regards,
   Constantin


-- 
Constantin GonzalezSun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS over a layered driver interface

2007-05-25 Thread Roch Bourbonnais


hi Shweta;

First thing is to look for all kernel functions that return that errno (25,
I think) during your test.


dtrace -n 'fbt:::return/arg1 == 25/[EMAIL PROTECTED]()}'

More verbose but also useful :

dtrace -n 'fbt:::return/arg1 == 25/[EMAIL PROTECTED](20)]=count()}'

It's a catch all, but often points me in the right direction.
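
(The aggregation bodies of the two one-liners above were mangled by the list's
address obfuscation; a plausible reconstruction - an assumption, not
necessarily the exact original text - is:

 # dtrace -n 'fbt:::return /arg1 == 25/ { @[probefunc] = count(); }'
 # dtrace -n 'fbt:::return /arg1 == 25/ { @[stack(20)] = count(); }'

The first counts per returning kernel function; the second keys the counts on a
20-frame kernel stack.)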


-r

Le 19 mai 07 à 00:24, Shweta Krishnan a écrit :

I explored this a bit and found that the ldi_ioctl in my layered
driver does fail, but fails because of an inappropriate ioctl for
device error, which the underlying ramdisk driver's ioctl
returns. So that doesn't seem to be an issue at all (since I know
the storage pool creation is successful when I give the ramdisk
directly as the target device).


However, as I mentioned, even though reads and writes are getting
invoked on the ramdisk through my layered driver, the storage pool
creation still fails.


Surprisingly, the layered driver's routines show no sign of error -
as in, the layered device gets closed successfully when the pool
creation command returns.


It is unclear to me what would be a good way to go about debugging
this, since I'm not familiar with dtrace. I shall try and
familiarize myself with dtrace, but even then, it seems like there
are a large number of functions returning non-zero values, and it is
confusing to me where to look for the error.


Any pointers would be most welcome!!

Thanks,
Swetha.


This message posted from opensolaris.org


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 12:14:45AM -0400, Torrey McMahon wrote:
 Albert Chin wrote:
 On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
   
 
 I'm getting really poor write performance with ZFS on a RAID5 volume
 (5 disks) from a storagetek 6140 array. I've searched the web and
 these forums and it seems that this zfs_nocacheflush option is the
 solution, but I'm open to others as well.
 
 
 What type of poor performance? Is it because of ZFS? You can test this
 by creating a RAID-5 volume on the 6140, creating a UFS file system on
 it, and then comparing performance with what you get against ZFS.
 
 If it's ZFS then you might want to check into modifying the 6540 NVRAM 
 as mentioned in this thread
 
 http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
 
 there is a fix that doesn't involve modifying the NVRAM in the works. (I 
 don't have an estimate.)

The above URL helps only if you have Santricity.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Andy Lubel
I'm using:

  zfs set:zil_disable 1

On my SE6130 with ZFS, accessed by NFS, write performance almost
doubled.  Since you have BBC (battery-backed cache), why not just set that?
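
(For reference - and this is an assumption about what is meant above - the
usual way to disable the ZIL is the /etc/system tunable, which takes effect at
the next reboot; it can also be changed on a live kernel with mdb -kw:

  set zfs:zil_disable = 1
)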

-Andy



On 5/24/07 4:16 PM, Albert Chin
[EMAIL PROTECTED] wrote:

 On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
 I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
 and in [b]/etc/system[/b] I put:
 
 [b]set zfs:zfs_nocacheflush = 1[/b]
 
 And after rebooting, I get the message:
 
 [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module[/b]
 
 So is this variable not available in the Solaris kernel?
 
 I think zfs:zfs_nocacheflush is only available in Nevada.
 
 I'm getting really poor write performance with ZFS on a RAID5 volume
 (5 disks) from a storagetek 6140 array. I've searched the web and
 these forums and it seems that this zfs_nocacheflush option is the
 solution, but I'm open to others as well.
 
 What type of poor performance? Is it because of ZFS? You can test this
 by creating a RAID-5 volume on the 6140, creating a UFS file system on
 it, and then comparing performance with what you get against ZFS.
 
 It would also be worthwhile doing something like the following to
 determine the max throughput the H/W RAID is giving you:
   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
 single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
 stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS over a layered driver interface

2007-05-25 Thread Shweta Krishnan
Thanks to everyone for their help! Yes, dtrace did help, and I found that in my
layered driver the prop_op entry point had an error in setting the [Ss]ize
dynamic property; apparently that's what ZFS looks for, not just Nblocks.
What took me so long in getting to this error was that the driver was faulting
not in the beginning but after some reads and writes (basically when the offset
exceeded the size, it gave rise to the EINVAL), and that too within zio_wait(),
which made me confuse it with a synchronization problem.
With that fixed, the layered driver works fine when I try to create a storage
pool with it.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-25 Thread Roch Bourbonnais


Le 22 mai 07 à 01:11, Nicolas Williams a écrit :


On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:

But still, how is tar/SSH any more multi-threaded than tar/NFS?


It's not that it is, but that NFS sync semantics and ZFS sync semantics
conspire against single-threaded performance.



Hi Nic, I don't agree with the blanket statement. So to clarify.

There are 2 independent things at play here.

a) NFS sync semantics conspire against single-thread performance with
any backend filesystem.

 However, NVRAM normally offers some relief of the issue.

b) ZFS sync semantics, along with the storage software and the imprecise
protocol in between, conspire against ZFS performance of some workloads
on NVRAM-backed storage. NFS is one of the affected workloads.


The conjunction of the 2 causes worse than expected NFS performance
over a ZFS backend running __on NVRAM-backed storage__.
If you are not considering NVRAM storage, then I know of no ZFS/NFS
specific problems.


Issue b) is being dealt with, by both Solaris and storage vendors (we
need a refined protocol);


Issue a) is not related to ZFS and is rather a fundamental NFS issue.
Maybe a future NFS protocol will help.



Net net: if one finds a way to 'disable cache flushing' on the
storage side, then one reaches the state we'll be in, out of the box,
when b) is implemented by Solaris _and_ the storage vendor. At that
point, ZFS becomes a fine NFS server not only on JBOD, as it is today,
but also on NVRAM-backed storage.


It's complex enough, I thought it was worth repeating.

-r


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




[zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Lori Alt

We've been kicking around the question of whether or
not zfs root mounts should appear in /etc/vfstab (i.e., be
legacy mount) or use the new zfs approach to mounts.
Instead of writing up the issues again, here's a blog
entry that I just posted on the subject:

http://blogs.sun.com/lalt/date/20070525

Weigh in if you care.

Lori
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-25 Thread Roch Bourbonnais


Le 22 mai 07 à 01:21, Albert Chin a écrit :


On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote:

On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:

But still, how is tar/SSH any more multi-threaded than tar/NFS?


It's not that it is, but that NFS sync semantics and ZFS sync
semantics conspire against single-threaded performance.


That's why we have set zfs:zfs_nocacheflush = 1 in /etc/system. But
that only helps ZFS. Is there something similar for NFS?



With this set, we also reach a state where the NFS/ZFS/NVRAM combination works
as it should.

So it should speed things up.

The problem is:

	Once it starts to go into /etc/system it will spread. Customers with
no NVRAM storage will use it, and some will experience pool corruption.

-r



--
albert chin ([EMAIL PROTECTED])


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 12:01:45PM -0400, Andy Lubel wrote:
 Im using: 
  
   zfs set:zil_disable 1
 
 On my se6130 with zfs, accessed by NFS and writing performance almost
 doubled.  Since you have BBC, why not just set that?

I don't think it's enough to have BBC to justify zil_disable=1.
Besides, I don't know anyone from Sun recommending zil_disable=1. If
your storage array has BBC, it doesn't matter. What matters is what
happens when ZIL isn't flushed and your file server crashes (ZFS file
system is still consistent but you'll lose some info that hasn't been
flushed by ZIL). Even having your file server on a UPS won't help
here.

http://blogs.sun.com/erickustarz/entry/zil_disable discusses some of
the issues affecting zil_disable=1.

We know we get better performance with zil_disable=1 but we're not
taking any chances.

 -Andy
 
 
 
 On 5/24/07 4:16 PM, Albert Chin
 [EMAIL PROTECTED] wrote:
 
  On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
  I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
  and in [b]/etc/system[/b] I put:
  
  [b]set zfs:zfs_nocacheflush = 1[/b]
  
  And after rebooting, I get the message:
  
  [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' 
  module[/b]
  
  So is this variable not available in the Solaris kernel?
  
  I think zfs:zfs_nocacheflush is only available in Nevada.
  
  I'm getting really poor write performance with ZFS on a RAID5 volume
  (5 disks) from a storagetek 6140 array. I've searched the web and
  these forums and it seems that this zfs_nocacheflush option is the
  solution, but I'm open to others as well.
  
  What type of poor performance? Is it because of ZFS? You can test this
  by creating a RAID-5 volume on the 6140, creating a UFS file system on
  it, and then comparing performance with what you get against ZFS.
  
  It would also be worthwhile doing something like the following to
  determine the max throughput the H/W RAID is giving you:
    # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
  For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
  single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
  stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.
 -- 
 
 
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-25 Thread Roch Bourbonnais


Le 22 mai 07 à 03:18, Frank Cusack a écrit :

On May 21, 2007 6:30:42 PM -0500 Nicolas Williams  
[EMAIL PROTECTED] wrote:

On Mon, May 21, 2007 at 06:21:40PM -0500, Albert Chin wrote:

On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote:
 On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:
  But still, how is tar/SSH any more multi-threaded than tar/NFS?

 It's not that it is, but that NFS sync semantics and ZFS sync
 semantics conspire against single-threaded performance.

What's why we have set zfs:zfs_nocacheflush = 1 in /etc/system.  
But,

that's only helps ZFS. Is there something similar for NFS?


NFS's semantics for open() and friends is that they are synchronous,
whereas POSIX's semantics are that they are not.  You're paying for a
sync() after every open.


nocto?


I think it's after every client close. But on the server side, there
are lots of operations that also require a commit. So nocto is not the
silver bullet.

-r




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Preparing to compare Solaris/ZFS and FreeBSD/ZFS performance.

2007-05-25 Thread Claus Guttesen

 Won't disabling ZIL minimize the chance of a consistent zfs-
 filesystem
 if - for some reason - the server did an unplanned reboot?

 ZIL in ZFS is only used to speed-up various workloads, it has
 nothing to
 do with file system consistency. ZFS is always consistent on disk no
 matter if you use ZIL or not.

But it can cause NFS client corruption, and you no longer get
synchronous write semantics (check whether your app depends on that):
http://blogs.sun.com/erickustarz/entry/zil_disable

I highly recommend *against* setting zil_disable.


Disabling the ZIL did improve the postgresql import from 1 h 45 min to
1 h 35 min. I get a 10 min speedup, but as the link points out,
disabling has its disadvantages. So I'll revert to the old setting.

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-25 Thread Roch Bourbonnais


Le 22 mai 07 à 16:23, Dick Davies a écrit :


<allyourbase>
Take off every ZIL!

 http://number9.hellooperator.net/articles/2007/02/12/zil-communication


</allyourbase>



It causes not only NFS client corruption but also database corruption and
corruption of just about anything that carefully manages data.

Yes, the zpool will survive, but it may be the only thing that does.

So please don't do this.

-r


On 22/05/07, Albert Chin
[EMAIL PROTECTED] wrote:

On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote:
 On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:
  But still, how is tar/SSH any more multi-threaded than tar/NFS?

 It's not that it is, but that NFS sync semantics and ZFS sync
 semantics conspire against single-threaded performance.

What's why we have set zfs:zfs_nocacheflush = 1 in /etc/system.  
But,

that's only helps ZFS. Is there something similar for NFS?

--
albert chin ([EMAIL PROTECTED])




--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Grant Kelly
 It would also be worthwhile doing something like the following to
 determine the max throughput the H/W RAID is giving you:
   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
 single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
 stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.

 --
 albert chin ([EMAIL PROTECTED])
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Well the Solaris kernel is telling me that it doesn't understand 
zfs_nocacheflush, but the array sure is acting like it!
I ran the dd example, but increased the count for a longer running time.

5-disk RAID5 with UFS: ~79 MB/s
5-disk RAID5 with ZFS: ~470 MB/s
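
(The commands were presumably along these lines; mount points and count are
illustrative, not the exact ones used:

 # time dd if=/dev/zero of=/ufsmnt/testfile bs=1048576 count=4000
 # time dd if=/dev/zero of=/tank/testfile bs=1048576 count=4000

i.e. writing through the UFS mount and through a ZFS file system built on the
same kind of 5-disk RAID-5 volume.)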

I'm assuming there's some caching going on with ZFS that's really helping out?

Also, no Santricity, just Sun's Common Array Manager. Is it possible to use 
both without completely confusing the array?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 09:54:04AM -0700, Grant Kelly wrote:
  It would also be worthwhile doing something like the following to
  determine the max throughput the H/W RAID is giving you:
  # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
  For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
  single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
  stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.
 
 Well the Solaris kernel is telling me that it doesn't understand
 zfs_nocacheflush, but the array sure is acting like it!
 I ran the dd example, but increased the count for a longer running time.

I don't think a longer running time is going to give you a more
accurate measurement.

 5-disk RAID5 with UFS: ~79 MB/s

What about against a raw RAID-5 device?

 5-disk RAID5 with ZFS: ~470 MB/s

I don't think you want to use if=/dev/zero on ZFS. There's probably some
optimization going on. Better to use /dev/urandom or to concatenate n-many
files comprised of random bits.
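
(One way to do that, since /dev/urandom is slow as a direct dd source, is to
pre-generate a random file once and reuse it; paths are illustrative:

 # dd if=/dev/urandom of=/var/tmp/random.dat bs=1048576 count=1000
 # time dd if=/var/tmp/random.dat of=/tank/testfile bs=1048576
)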

 I'm assuming there's some caching going on with ZFS that's really
 helping out?

Yes.

 Also, no Santricity, just Sun's Common Array Manager. Is it possible
 to use both without completely confusing the array?

I think both are ok. CAM is free. Dunno about Santricity.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?

2007-05-25 Thread Lin Ling


Malachi de Ælfweald wrote:
No, I did mean 'snapshot -r' but I thought someone on the list said 
that the '-r' wouldn't work until b63... hmmm...




'snapshot -r' is available before b62, however, '-r' may run into a 
stack overflow (bug 6533813) which is fixed in b63.


Lin
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-25 Thread Spencer Shepler


On May 25, 2007, at 11:22 AM, Roch Bourbonnais wrote:



Le 22 mai 07 à 01:11, Nicolas Williams a écrit :


On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:

But still, how is tar/SSH any more multi-threaded than tar/NFS?


It's not that it is, but that NFS sync semantics and ZFS sync  
semantics

conspire against single-threaded performance.



Hi Nic,  I don't agree with the blanket statement. So to clarify.

There are 2 independant things at play here.

a) NFS sync semantics conspire againts single thread performance  
with any backend filesystem.

 However NVRAM normally offers some releaf of the issue.

b) ZFS sync semantics along with the Storage Software + imprecise  
protocol in between, conspire againts ZFS performance
of some workloads on NVRAM backed storage. NFS being one of the  
affected workloads.


The conjunction of the 2 causes worst than expected NFS perfomance  
over ZFS backend running __on NVRAM back storage__.
If you are not considering NVRAM storage, then I know of no ZFS/NFS  
specific problems.


Issue b) is being delt with, by both Solaris and Storage Vendors  
(we need a refined protocol);


Issue a) is not related to ZFS and rather fundamental NFS issue.  
Maybe future NFS protocol will help.



Net net; if one finds a way to 'disable cache flushing' on the  
storage side, then one reaches the state
we'll be, out of the box, when b) is implemented by Solaris _and_  
Storage vendor. At that point,  ZFS becomes a fine NFS
server not only on JBOD as it is today , both also on NVRAM backed  
storage.


I will add a third category, response time of individual requests.

One can think of the ssh stream of filesystem data as one large remote
procedure call that says put this directory tree and contents on
the server.  The time it takes is essentially the time it takes
to transfer the filesystem data.  The latency on the very last of the
request, amortized across the entire stream is zero.

For the NFS client, there is response time injected at each request
and the best way to amortize this is through parallelism and that is
very difficult for some applications.  Add the items in a) and b) and
there is a lot to deal with.  Not insurmountable but it takes a little
more effort to build an effective solution.

Spencer

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Matthew Ahrens

Albert Chin wrote:

I don't think you want to if=/dev/zero on ZFS. There's probably some
optimization going on. Better to use /dev/urandom or concat n-many
files comprised of random bits.


Unless you have turned on compression, that is not the case.  By default 
there is no optimization for all zeros.


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: zfs root: legacy mount or not?

2007-05-25 Thread Bob Palowoda
 We've been kicking around the question of whether or
 not zfs root mounts should appear in /etc/vfstab
 (i.e., be
 legacy mount) or use the new zfs approach to mounts.
 Instead of writing up the issues again, here's a blog
 entry that I just posted on the subject:
 
 http://blogs.sun.com/lalt/date/20070525
 
 Weigh in if you care.

  Interesting.  Is there an ARC case that is related to some of these issues?  

---Bob
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: zfs root: legacy mount or not?

2007-05-25 Thread Lori Alt

Bob Palowoda wrote:

We've been kicking around the question of whether or
not zfs root mounts should appear in /etc/vfstab
(i.e., be
legacy mount) or use the new zfs approach to mounts.
Instead of writing up the issues again, here's a blog
entry that I just posted on the subject:

http://blogs.sun.com/lalt/date/20070525

Weigh in if you care.



  Interesting.  Is there an ARC case that is related to some of these issues?  
  

The ARC case for using zfs as a root file system is PSARC/2006/370,
but there isn't much there yet.  I'm preparing the documents for
the case and this is one of the issues I wanted to get some feedback
on from the external community before I make a proposal for
what to do.

I don't know of any other ARC cases that would be relevant.
I'm not sure how old the getvfsent interface is.  If that interface
got ARC'd, some of the documents for it might be relevant.
I'll check it out.

Lori
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS boot and SXCE build 64a

2007-05-25 Thread Al Hopper


Hi Lori,

Are there any changes to build 64a that will affect ZFS bootability?
Will the conversion script for build 62 still do its magic?

Thanks,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Brian Hechinger
On Fri, May 25, 2007 at 02:50:15PM -0500, Al Hopper wrote:
 On Fri, 25 May 2007, Lori Alt wrote:
 
 We've been kicking around the question of whether or
 not zfs root mounts should appear in /etc/vfstab (i.e., be
 legacy mount) or use the new zfs approach to mounts.
 Instead of writing up the issues again, here's a blog
 entry that I just posted on the subject:
 
 http://blogs.sun.com/lalt/date/20070525
 
 Weigh in if you care.
 
 ZFS is a paradigm shift and Nevada has not been released.  Therefore I 
 vote for implementing it the ZFS way - going forward.  Place the burden 
 on the other developers to fix their bugs.

I second Al's point.  In fact, I couldn't have said it better myself. :)

-brian
-- 
Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly.  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Bill Sommerfeld
On Fri, 2007-05-25 at 10:20 -0600, Lori Alt wrote:
 We've been kicking around the question of whether or
 not zfs root mounts should appear in /etc/vfstab (i.e., be
 legacy mount) or use the new zfs approach to mounts.
 Instead of writing up the issues again, here's a blog
 entry that I just posted on the subject:
 
 http://blogs.sun.com/lalt/date/20070525
 
 Weigh in if you care.

IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab,
but (this is something of a digression based on discussion kicked up by
PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be
mounted by mountall and mount -a rather than via a special-case
invocation of zfs mount at the end of the fs-local method script.

in other words: teach mount how to find the list of filesystems in
attached pools and mix them in to the dependency graph it builds to
mount filesystems in the right order, rather than mounting
everything-but-zfs first and then zfs later.

- Bill








___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS boot and SXCE build 64a

2007-05-25 Thread Lori Alt

Build 64a has bug 6553537 (zfs root fails to boot from a
snv_63+zfsboot-pfinstall netinstall image), for which I
don't have a ready workaround.  So I recommend waiting
for build 65 (which should be out soon, I think).

Lori

Al Hopper wrote:


Hi Lori,

Are there any changes to build 64a that will affect ZFS bootability?
Will the conversion script for build 62 still do its magic?

Thanks,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Strange behaviour with sharenfs

2007-05-25 Thread Darren . Reed

Prior to rebooting my system (S10U2) yesterday, I had half
a dozen ZFS shares active...

Today, now that I look at this, I find that only 1 of them is
being exported through NFS.

# zfs list -o name,sharenfs
NAME                  SHARENFS
biscuit   off
biscuit/crashes   off
biscuit/data  off
biscuit/foo   off
biscuit/home  off
biscuit/on10u3off
biscuit/on10u4on
biscuit/onnv  yes
biscuit/[EMAIL PROTECTED]  -
biscuit/[EMAIL PROTECTED]  -
biscuit/[EMAIL PROTECTED]  -
biscuit/onnv_6538379  on
biscuit/onnv_6544307  on
biscuit/pfh-clone off
biscuit/pfhs10u4  off
biscuit/queue_t   off
biscuit/refactor  off
biscuit/[EMAIL PROTECTED]  -
biscuit/s10u4fix  on
biscuit/stc2-hookson
mintslice ~# showmount -e
export list for mintslice:
/biscuit/on10u4 (everyone)
mintslice ~#

Is this a known problem, fixed already?

Darren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Lori Alt

Bill Sommerfeld wrote:

On Fri, 2007-05-25 at 10:20 -0600, Lori Alt wrote:
  

We've been kicking around the question of whether or
not zfs root mounts should appear in /etc/vfstab (i.e., be
legacy mount) or use the new zfs approach to mounts.
Instead of writing up the issues again, here's a blog
entry that I just posted on the subject:

http://blogs.sun.com/lalt/date/20070525

Weigh in if you care.



IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab,
but (this is something of a digression based on discussion kicked up by
PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be
mounted by mountall and mount -a rather than via a special-case
invocation of zfs mount at the end of the fs-local method script.

in other words: teach mount how to find the list of filesystems in
attached pools and mix them in to the dependency graph it builds to
mount filesystems in the right order, rather than mounting
everything-but-zfs first and then zfs later.


  

I agree with this.  This seems like a necessary response to
both PSARC/2007/297 and also necessary for eliminating
legacy mounts for zfs root file systems.  The problem of
the interaction between legacy and non-legacy mounts will just
get worse once we are using non-legacy mounts for the
file systems in the BE.

Lori
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
 Bill Sommerfeld wrote:
  IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab,
  but (this is something of a digression based on discussion kicked up by
  PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be
  mounted by mountall and mount -a rather than via a special-case
  invocation of zfs mount at the end of the fs-local method script.
 
  in other words: teach mount how to find the list of filesystems in
  attached pools and mix them in to the dependency graph it builds to
  mount filesystems in the right order, rather than mounting
  everything-but-zfs first and then zfs later.
 
 

 I agree with this.  This seems like a necessary response to
 both PSARC/2007/297 and also necessary for eliminating
 legacy mounts for zfs root file systems.  The problem of
 the interaction between legacy and non-legacy mounts will just
 get worse once we are using non-legacy mounts for the
 file systems in the BE.

Could we also look into why system-console insists on waiting for ALL
the zfs mounts to be available?  Shouldn't the main file system food
groups be mounted and then allow console-login (much like single user or
safe-mode)?

Would help in many cases where an admin needs to work on a system but
doesn't need, say 20k users home directories mounted, to do this work.


 
 Lori
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Need guidance on RAID 5, ZFS, and RAIDZ on home file server

2007-05-25 Thread Tomasz Torcz

On 5/24/07, Tom Buskey [EMAIL PROTECTED] wrote:

  Linux and Windows
 as well as the BSDs) are all relative newcomers to
 the 64-bit arena.

The 2nd non-x86 port of Linux was to the Alpha in 1999 (98?) by Linus no less.


In 1994, to be precise. In 1999 Linux 2.2 was released, which supported
a few more 64-bit platforms.

--
Tomasz Torcz
[EMAIL PROTECTED]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Lori Alt

Mike Dotson wrote:

On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
  

Bill Sommerfeld wrote:


IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab,
but (this is something of a digression based on discussion kicked up by
PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be
mounted by mountall and mount -a rather than via a special-case
invocation of zfs mount at the end of the fs-local method script.

in other words: teach mount how to find the list of filesystems in
attached pools and mix them in to the dependency graph it builds to
mount filesystems in the right order, rather than mounting
everything-but-zfs first and then zfs later.


  
  

I agree with this.  This seems like a necessary response to
both PSARC/2007/297 and also necessary for eliminating
legacy mounts for zfs root file systems.  The problem of
the interaction between legacy and non-legacy mounts will just
get worse once we are using non-legacy mounts for the
file systems in the BE.



Could we also look into why system-console insists on waiting for ALL
the zfs mounts to be available?  Shouldn't the main file system food
groups be mounted and then allow console-login (much like single user or
safe-mode)?
  
Would help in many cases where an admin needs to work on a system but

doesn't need, say 20k users home directories mounted, to do this work.
  

So single-user mode is not sufficient for this?


Lori



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote:
 Mike Dotson wrote:
  On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
   
  Would help in many cases where an admin needs to work on a system but
  doesn't need, say 20k users home directories mounted, to do this work.

 So single-user mode is not sufficient for this?
 

Not all work needs to be done in single user. :) And I wouldn't consider a
4+ hour boot time just for mounting file systems a good use of cpu time
when an admin could be doing other things - preparing for the next round of
patching, making configuration changes to a webserver, etc. Or just monitoring
the status of the file system mounts to give an update to management on how
many file systems are mounted and how many are left.

Point is, why is console-login dependent on *all* the file systems being
mounted in *multiboot*? Does it really need to depend on *all* the file
systems being mounted?

 
 Lori
 
 
 
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Casper . Dik

On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote:
 Mike Dotson wrote:
  On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
   
  Would help in many cases where an admin needs to work on a system but
  doesn't need, say 20k users home directories mounted, to do this work.

 So single-user mode is not sufficient for this?
 

Not all work needs to be done in single user:) And I wouldn't consider a
4+ hour boot time just for mounting file systems a good use of cpu time
when an admin could be doing other things - preparation for next
patching, configuring changes to webserver, etc.  Or just monitoring the
status of the file system mounts to give an update to management on how
many file systems are mounted and how many are left.

Point is, why is console-login dependent on *all* the file systems being
mounted in *multiboot*.  Does it really need to depend on *all* the file
systems being mounted?  

Why do we need the filesystems mounted at all, ever, if they
are not used?

Mounts could be more magic than that.

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Eric Schrock
On Fri, May 25, 2007 at 03:01:20PM -0700, Mike Dotson wrote:
 On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote:
  Mike Dotson wrote:
   On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:

   Would help in many cases where an admin needs to work on a system but
   doesn't need, say 20k users home directories mounted, to do this work.
 
  So single-user mode is not sufficient for this?
  
 
 Not all work needs to be done in single user:) And I wouldn't consider a
 4+ hour boot time just for mounting file systems a good use of cpu time
 when an admin could be doing other things - preparation for next
 patching, configuring changes to webserver, etc.  Or just monitoring the
 status of the file system mounts to give an update to management on how
 many file systems are mounted and how many are left.
 
 Point is, why is console-login dependent on *all* the file systems being
 mounted in *multiboot*.  Does it really need to depend on *all* the file
 systems being mounted?  
 

This has been discussed many times in smf-discuss, for all types of
login.  Basically, there is no way to say console login for root
only.  As long as any user can log in, we need to have all the
filesystems mounted because we don't know what dependencies there may
be.  Simply changing the definition of console-login isn't a
solution because it breaks existing assumptions and software.

A much better option is the 'trigger mount' RFE that would allow ZFS to
quickly 'mount' a filesystem but not pull all the necessary data off
disk until it's first accessed.

- Eric


--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?

2007-05-25 Thread John Plocher

Thru a sequence of good intentions, I find myself with a raidz'd
pool that has a failed drive that I can't replace.

We had a generous department donate a fully configured V440 for
use as our departmental server.  Of course, I installed SX/b56
on it, created a pool with 3x 148Gb drives and made a dozen
filesystems on it.  Life was good.  ZFS is great!

One of the raidz pool drives failed.  When I went to replace it,
I found that the V440's original 72Gb drives had been upgraded
to Dell 148Gb Fujitsu drives, and the Sun versions of those drives
(same model number...) had different firmware, and more importantly,
FEWER sectors!  They were only 147.8 Gb!  You know what they say
about a free lunch and too good to be true...

This meant that "zpool replace" of the drive failed because the
replacement drive is too small.

The question of the moment is: what to do?

All I can think of is to (roughly sketched as commands below):

Attach/create a new pool that has enough space to
hold the existing content,

Copy the content from the old to new pools,

Destroy the old pool,

Recreate the old pool with the (slightly) smaller
size, and

copy the data back onto the pool.
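
In command terms, the plan is roughly the following; the pool name
'tank' and the device names are placeholders, and steps 2 and 5 are the
per-filesystem copy I'm asking about below:

# 1. Create a scratch pool somewhere with enough space
#    (device names here are examples only).
zpool create scratch c2t0d0 c2t1d0

# 2. Copy every filesystem (data and properties) from the old pool
#    to scratch; this is the copy step asked about below.

# 3. Destroy the old pool.
zpool destroy tank

# 4. Recreate it as raidz; the pool sizes itself to the smallest disk.
zpool create tank raidz c1t1d0 c1t2d0 c1t3d0

# 5. Copy everything back from scratch, then retire it.
zpool destroy scratch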

Given that there are a bunch of filesystems in the pool, each
with some set of properties ..., what is the easiest way to
move the data and metadata back and forth without losing
anything, and without having to manually recreate the
metainfo/properties?

(Adding to the 'shrink' RFE: if I replace a pool drive with
a smaller one, and the existing content is small enough
to fit on a shrunk/resized pool, the zpool replace command
should, after prompting, simply do the work.  In this situation,
losing less than 10 MB of pool space to get a healthy raidz
configuration seems like an easy tradeoff. :-)

TIA,

  -John



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread John Plocher

Why not simply have a SMF sequence that does

early in boot, after / and /usr are mounted:
create /etc/nologin (contents=coming up, not ready yet)
enable login
later in boot, when user filesystems are all mounted:
delete /etc/nologin

Wouldn't this give the desired behavior?
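
Something like the following pair of method scripts would be the core
of it.  This is a sketch only: the service split (one piece ordered
after filesystem/minimal, the other after filesystem/local) is
hypothetical wiring, not an existing SMF service.

#!/sbin/sh
# Sketch: runs once / and /usr are available (after filesystem/minimal).
# Blocks non-root logins while the remaining mounts proceed.
echo "System is still mounting file systems; try again shortly." > /etc/nologin
exit 0

#!/sbin/sh
# Sketch: runs once all user file systems are mounted (after
# filesystem/local), clearing the block.
rm -f /etc/nologin
exit 0
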
  -John


Eric Schrock wrote:

This has been discussed many times in smf-discuss, for all types of
login.  Basically, there is no way to say console login for root
only. 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:19 -0700, Eric Schrock wrote:
 This has been discussed many times in smf-discuss, for all types of
 login.  Basically, there is no way to say console login for root
 only.  As long as any user can log in, we need to have all the
 filesystems mounted because we don't know what dependencies there may
 be.  Simply changing the definition of console-login isn't a
 solution because it breaks existing assumptions and software.

<devils_advocate>
So how are you guaranteeing that the NFS server and automount with
autofs are up, running and working for the user at console-login?
</devils_advocate>

I don't buy this argument.  You don't have to say "console-login for
root only"; you just have to have console-login up with a minimal set
of services available, which may not include *all* services, much like
when an NFS server is down, etc.

If software depends on a particular file system, or on all file systems,
being mounted, it adds that as a dependency (filesystem/local).
console-login does not require this - only non-root users do.  (I
remember an SMF config bug where apache didn't require filesystem/local
and failed to start.)

What software is dependent on console-login?
helios(3): svcs -D console-login
STATE  STIMEFMRI

In fact console-login depends on filesystem/minimal, which to me means
minimal file systems, not all file systems, and there is no software
dependent on console-login - so where's the disconnect?
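
For anyone following along, both directions are easy to check from the
same box: svcs -d lists what console-login requires, svcs -D lists what
requires console-login, and svcs -x explains why a given service is
wedged (output omitted here):

# svcs -d console-login
# svcs -D console-login
# svcs -x console-login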

From what I see, the problem is that auditd depends on filesystem/local,
which is possibly where the hangup is.

 
 A much better option is the 'trigger mount' RFE that would allow ZFS to
 quickly 'mount' a filesystem but not pull all the necessary data off
 disk until it's first accessed.

Agreed, but there's still the issue of console-login being dependent on
all file systems instead of minimal file systems.

 
 - Eric
 
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Eric Schrock
I didn't mean to imply that it wasn't technically possible, only that
there is no one-size-fits-all solution for OpenSolaris as a whole.
Even getting this to work in an easily tunable form is quite tricky,
since you must dynamically determine dependencies in the process
(filesystem/minimal vs. filesystem/user).

If someone wants to pursue this, I would suggest moving the discussion
to smf-discuss.

- Eric

On Fri, May 25, 2007 at 03:32:52PM -0700, John Plocher wrote:
 Why not simply have a SMF sequence that does
 
 early in boot, after / and /usr are mounted:
   create /etc/nologin (contents=coming up, not ready yet)
   enable login
 later in boot, when user filesystems are all mounted:
   delete /etc/nologin
 
 Wouldn't this would give the desired behavior?
   -John
 
 
 Eric Schrock wrote:
 This has been discussed many times in smf-discuss, for all types of
 login.  Basically, there is no way to say console login for root
 only. 

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?

2007-05-25 Thread Matthew Ahrens

Given that there are a bunch of filesystems in the pool, each
with some set of properties ..., what is the easiest way to
move the data and metadata back and forth without losing
anything, and without having to manually recreate the
metainfo/properties?


AFAIK, your only choices are:

A. Write/find a script to do the appropriate 'zfs send|recv' and 'zfs set'
commands (a rough sketch follows below).


B. Wait for us to implement 6421959 & 6421958 (zfs send -r / -p).  I'm
currently working on this, ETA at least a few months.
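
For what it's worth, a rough sketch of option A (pool names 'oldpool'
and 'newpool' and the snapshot name are placeholders; error handling is
omitted, and values with embedded spaces or a conflicting mountpoint
need extra care):

#!/bin/ksh
# Sketch only: copy each filesystem in oldpool to newpool and
# re-apply its locally set properties.
SNAP=migrate
zfs list -H -o name -t filesystem -r oldpool | while read fs; do
        [ "$fs" = "oldpool" ] && continue       # skip the pool root dataset
        newfs="newpool/${fs#oldpool/}"

        # Copy the data via a snapshot.
        zfs snapshot "$fs@$SNAP"
        zfs send "$fs@$SNAP" | zfs receive "$newfs"

        # Re-apply properties whose source is "local" (explicitly set).
        zfs get -H -s local -o property,value all "$fs" | \
            while read prop value; do
                zfs set "$prop=$value" "$newfs"
        done
done

Recursive snapshots ('zfs snapshot -r') can replace the per-filesystem
snapshot step, but the send/recv and property copy still have to be
done one filesystem at a time until -r/-p exist.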


Sorry,

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Eric Schrock
On Fri, May 25, 2007 at 03:39:11PM -0700, Mike Dotson wrote:
 
 In fact the console-login depends on filesystem/minimal which to me
 means minimal file systems not all file systems and there is no software
 dependent on console-login - where's the disconnect?
 

You're correct - I thought console-login depended on filesystem/local,
not filesystem/minimal.  ZFS filesystems are not mounted as part of
filesystem/minimal, so remind me what the problem is?

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:46 -0700, Eric Schrock wrote:
 On Fri, May 25, 2007 at 03:39:11PM -0700, Mike Dotson wrote:
  
  In fact the console-login depends on filesystem/minimal which to me
  means minimal file systems not all file systems and there is no software
  dependent on console-login - where's the disconnect?
  
 
 You're correct - I thought console-login depended in filesystem/local,
 not
 filesystem/minimal.  ZFS filesystems are not mounted as part of
 filesystem/minimal, so remind me what the promlem is?

Create 20k ZFS file systems and reboot.  Console login waits for all the
ZFS file systems to be mounted (on a fully loaded 880, you're looking at
about 4 hours, so have some coffee ready).
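
For anyone who wants to reproduce this on a test box, a loop along
these lines will do it ('tank/home' is a placeholder dataset that must
already exist; adjust the count to taste):

#!/bin/ksh
# Create a large number of ZFS file systems under a test pool.
i=0
while [ $i -lt 20000 ]; do
        zfs create tank/home/user$i
        i=$((i + 1))
done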

The *only* place I can see the filesystem/local dependency is in
svc:/system/auditd:default, however, on my systems it's disabled.

I haven't had a chance to prune through the dependency tree to find the
disconnect, but once /, /var, /tmp and /usr are mounted, the conditions
for console-login should be met.

As you mentioned, the best solution for this number of filesystems in
ZFS land is an *automount*-style option, where filesystems are mounted
only as needed, to reduce the *boot time*.



 
 - Eric
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?

2007-05-25 Thread Will Murnane

On 5/25/07, John Plocher [EMAIL PROTECTED] wrote:

One of the raidz pool drives failed.  When I went to replace it,
I found that the V440's original 72Gb drives had been upgraded
to Dell 148Gb Fujitsu drives, and the Sun versions of those drives
(same model number...) had different firmware, and more importantly,
FEWER sectors!  They were only 147.8 Gb!  You know what they say
about a free lunch and too good to be true...

What about buying a single larger drive?  A 300 GB disk had better
have at least 148 GB on it...  It's a few hundred bucks extra,
granted, but if you have to rent or buy enough space to back
everything up it might be a tossup.
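
The replacement itself is then a one-liner; the pool and device names
below are placeholders for the failed disk and its larger replacement:

# zpool replace tank c1t3d0 c1t4d0

The extra capacity of the larger disk simply goes unused by the raidz
group until the other members are upgraded to match.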

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZVol Panic on 62

2007-05-25 Thread Ben Rockwood
May 25 23:32:59 summer unix: [ID 836849 kern.notice]
May 25 23:32:59 summer ^Mpanic[cpu1]/thread=1bf2e740:
May 25 23:32:59 summer genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf 
Page fault) rp=ff00232c3a80 addr=490 occurred in module unix due to a 
NULL pointer dereference
May 25 23:32:59 summer unix: [ID 10 kern.notice]
May 25 23:32:59 summer unix: [ID 839527 kern.notice] grep:
May 25 23:32:59 summer unix: [ID 753105 kern.notice] #pf Page fault
May 25 23:32:59 summer unix: [ID 532287 kern.notice] Bad kernel fault at 
addr=0x490
May 25 23:32:59 summer unix: [ID 243837 kern.notice] pid=18425, 
pc=0xfb83b6bb, sp=0xff00232c3b78, eflags=0x10246
May 25 23:32:59 summer unix: [ID 211416 kern.notice] cr0: 
8005003bpg,wp,ne,et,ts,mp,pe cr4: 6f8xmme,fxsr,pge,mce,pae,pse,de
May 25 23:32:59 summer unix: [ID 354241 kern.notice] cr2: 490 cr3: 1fce52000 
cr8: c
May 25 23:32:59 summer unix: [ID 592667 kern.notice]rdi:  490 
rsi:0 rdx: 1bf2e740
May 25 23:32:59 summer unix: [ID 592667 kern.notice]rcx:0  
r8:d  r9: 62ccc700
May 25 23:32:59 summer unix: [ID 592667 kern.notice]rax:0 
rbx:0 rbp: ff00232c3bd0
May 25 23:32:59 summer unix: [ID 592667 kern.notice]r10: fc18 
r11:0 r12:  490
May 25 23:32:59 summer unix: [ID 592667 kern.notice]r13:  450 
r14: 52e3aac0 r15:0
May 25 23:32:59 summer unix: [ID 592667 kern.notice]fsb:0 
gsb: fffec3731800  ds:   4b
May 25 23:32:59 summer unix: [ID 592667 kern.notice] es:   4b  
fs:0  gs:  1c3
May 25 23:33:00 summer unix: [ID 592667 kern.notice]trp:e 
err:2 rip: fb83b6bb
May 25 23:33:00 summer unix: [ID 592667 kern.notice] cs:   30 
rfl:10246 rsp: ff00232c3b78
May 25 23:33:00 summer unix: [ID 266532 kern.notice] ss:   38
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3960 
unix:die+c8 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a70 
unix:trap+135b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a80 
unix:cmntrap+e9 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3bd0 
unix:mutex_enter+b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c20 
zfs:zvol_read+51 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c50 
genunix:cdev_read+3c ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3cd0 
specfs:spec_read+276 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3d40 
genunix:fop_read+3f ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3e90 
genunix:read+288 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3ec0 
genunix:read32+1e ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3f10 
unix:brand_sys_syscall32+1a3 ()
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 672855 kern.notice] syncing file systems...


Does anyone have an idea of what bug this might be?  Occurred on X86 B62.  I'm 
not seeing any putbacks into 63 or bugs that seem to match.

Any insight is appreciated.  Cores are available.
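
For anyone who wants to dig in, a first pass over the saved dump would
be along these lines (::status for the panic summary, ::stack for the
panicking thread, ::msgbuf for the console output leading up to it; the
path assumes savecore's default location and dump instance 0):

# cd /var/crash/summer
# mdb -k unix.0 vmcore.0
> ::status
> ::stack
> ::msgbuf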

benr.
 
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?

2007-05-25 Thread Toby Thain


On 25-May-07, at 7:28 PM, John Plocher wrote:


...
I found that the V440's original 72Gb drives had been upgraded
to Dell 148Gb Fujitsu drives, and the Sun versions of those drives
(same model number...) had different firmware


You can't get hold of another one of the same drive?

--Toby

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss