Re: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread przemolicc
On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
 Hello Shawn,
 
 Thursday, December 21, 2006, 4:28:39 PM, you wrote:
 
 SJ All,
 
 SJ I understand that ZFS gives you more error correction when using
 SJ two LUNS from a SAN. But, does it provide you with less features
 SJ than UFS does on one LUN from a SAN (i.e is it less stable).
 
 With only one LUN you still get error detection, which UFS doesn't give
 you. You can still use snapshots, clones, quotas, etc., so in general
 you still have more features than UFS.
 
 Now, when it comes to stability - it depends. UFS has been in use for years,
 while ZFS is much younger.
 
 More and more people are using ZFS in production, and while there are
 some corner cases, mostly performance related, it works really well.
 And I haven't heard of verified data loss due to ZFS. I've been using
 ZFS for quite some time (since well before it was available in SX) and
 I haven't lost any data either.
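
As a concrete illustration of the features mentioned above on a single-LUN
pool (the device and dataset names below are made up):

# one SAN LUN as the whole pool: no self-healing, but checksums still
# detect corruption and all dataset features remain available
zpool create tank c4t600A0B800011652Ed0

zfs create tank/home
zfs set quota=10g tank/home          # per-filesystem quota
zfs snapshot tank/home@today         # point-in-time snapshot
zfs clone tank/home@today tank/test  # writable clone of that snapshot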

Robert,

I don't understand why not losing any data is an advantage of ZFS.
No filesystem should lose any data. It is like saying that an advantage
of a football player is that he/she plays football (he/she should do that!)
or an advantage of a chef is that he/she cooks (he/she should do that!).
Every filesystem should _save_ our data, not lose it.

Regards
przemol

--
Are you a driver? Then read this!  http://link.interia.pl/f199e

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] !

2006-12-22 Thread Ulrich Graef

[EMAIL PROTECTED] wrote:


Robert,

I don't understand why not losing any data is an advantage of ZFS.
No filesystem should lose any data. It is like saying that an advantage
of a football player is that he/she plays football (he/she should do that!)
or an advantage of a chef is that he/she cooks (he/she should do that!).
Every filesystem should _save_ our data, not lose it.
 


yes, you are right: every filesystem should save the data.
(... and every program should have no errors! ;-)

Unfortunately there are some cases where the disks lose data;
these cannot be detected by traditional filesystems, but they can be with ZFS:

   * bit rot: some bits on the disk get flipped (~ 1 in 10^11)
     (cosmic rays, static particles in airflow, random thermodynamics)
   * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
     (positioning errors, disk firmware errors, ...)
   * misdirected reads/writes: the disk writes to the wrong position (~ 1 in 10^8)
     (disks use very small structures, and the head can move after positioning)
   * errors on the data transfer connection

You can look up the probabilities at several disk vendors; they are published.
Traditional filesystems do not check the data they read. You get strange effects
when the filesystem code runs with wrong metadata (worst case: a panic).
If you use the wrong data in your application, you 'only' get wrong results...

ZFS, on the contrary, checks every block it reads and is able to find the good
mirror copy or reconstruct the data in a raidz config.
Therefore ZFS uses only valid data and is able to repair bad data blocks
automatically.
This is not possible in a traditional filesystem/volume manager configuration.
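
As an illustration of the self-healing described above, a minimal sketch
(the pool and device names are made up) of how a redundant ZFS config
surfaces and repairs these errors:

# a mirrored pool gives ZFS a second copy to heal from
zpool create tank mirror c1t0d0 c1t1d0

# read every allocated block and verify it against its checksum
zpool scrub tank

# the CKSUM column counts blocks whose checksum did not match;
# on a mirror or raidz they are rewritten from a good copy
zpool status -v tank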


You may say you have never heard of a disk losing data; but you have heard of
systems which behaved strangely until a re-installation fixed everything,
or of data that went bad so that you had to recover from backup.

It may be that such a case was one of these disk errors.
Our service encounters a number of these cases every year,
where the customer was not able to re-install or did not want to restore his data,
and the problem could be traced back to such a disk error.
These are always nasty problems, and they get nastier because customers
have more and more data and there is a trend to save money on backup/restore
infrastructure, which makes it painful to restore data.

Regards,

   Ulrich

--
| Ulrich Graef, Senior Consultant, OS Ambassador \
|  Operating Systems, Performance \ Platform Technology   \
|   Mail: [EMAIL PROTECTED] \ Global Systems Engineering \
|Phone: +49 6103 752 359\ Sun Microsystems Inc  \

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Robert Milkowski
Hello przemolicc,

Friday, December 22, 2006, 10:02:44 AM, you wrote:

ppf On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
 Hello Shawn,
 
 Thursday, December 21, 2006, 4:28:39 PM, you wrote:
 
 SJ All,
 
 SJ I understand that ZFS gives you more error correction when using
 SJ two LUNS from a SAN. But, does it provide you with less features
 SJ than UFS does on one LUN from a SAN (i.e is it less stable).
 
 With only one LUN you still get error detection, which UFS doesn't give
 you. You can still use snapshots, clones, quotas, etc., so in general
 you still have more features than UFS.
 
 Now, when it comes to stability - it depends. UFS has been in use for years,
 while ZFS is much younger.
 
 More and more people are using ZFS in production, and while there are
 some corner cases, mostly performance related, it works really well.
 And I haven't heard of verified data loss due to ZFS. I've been using
 ZFS for quite some time (since well before it was available in SX) and
 I haven't lost any data either.

ppf Robert,

ppf I don't understand why not losing any data is an advantage of ZFS.
ppf No filesystem should lose any data. It is like saying that an advantage

I wasn't saying this is an advantage. Of course no file system should
lose your data - it's just that when new file systems show up on the
market, people do not trust them at first, which is an
expected precaution.

Part of that perception comes from Linux - due to a different development style
you often get software that is badly written and poorly tested - search
Google for how many people lost their data with ReiserFS, for example.
The same happened to many people with XFS on Linux.

That's why I thought emphasizing that ZFS hasn't lost my data, even though
it's a new-born file system and I've been using it for years (as have other
users), is important, especially for people coming from the Linux world.

ps. I really believe the development style in OpenSolaris is better than
in Linux (the kernel).

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: zfs list and snapshots..

2006-12-22 Thread Robert Milkowski
Hello Wade,

Thursday, December 21, 2006, 10:15:56 PM, you wrote:





WSfc Hola folks,

WSfc   I am new to the list, please redirect me if I am posting to the 
wrong
WSfc location.  I am starting to use ZFS in production (Solaris x86 10U3 --
WSfc 11/06) and I seem to be seeing unexpected behavior for zfs list and
WSfc snapshots.  I create a filesystem (lets call it a/b where a is the pool).
WSfc Now, if I store 100 gb of files on a/b and then snapshot a/[EMAIL 
PROTECTED] then
WSfc delete about 50 gb of files from a/b -- I expect to see ~50 gb USED on
WSfc both a/b and a/[EMAIL PROTECTED] via zfs list output -- instead I only 
seem to see the
WSfc delta block adds as USED (~20mb) on a/[EMAIL PROTECTED]  Is this 
correct behavior?
WSfc how do you track the total delta blocks the snap is using vs other snaps
WSfc and live fs?

This is almost[1] OK. When you delete a file from a file system, you
definitely expect to see the file system's allocated space reduced
by about the same amount.

[1] The problem is that the space consumed by a snapshot isn't reported
entirely correctly: once you delete a snapshot you'll actually get back more
space than zfs list reported for that snapshot as used space. It's not
a big deal, but it still makes it harder to determine exactly how much
space is allocated to snapshots for a given file system.
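
For anyone who wants to reproduce the experiment, a minimal sketch using
Wade's pool/filesystem names (the snapshot name and sizes are placeholders):

zfs create a/b
# ... copy ~100 GB of files into /a/b ...
zfs snapshot a/b@snap1
# ... delete ~50 GB of those files from /a/b ...

# USED on a/b drops by roughly the deleted amount; the freed blocks are
# now held only by the snapshot, and how much of that is charged to the
# snapshot's USED is exactly what this thread is about
zfs list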



-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: !

2006-12-22 Thread przemolicc
Ulrich,

in his e-mail Robert mentioned _two_ things regarding ZFS:
[1] the ability to detect errors (checksums)
[2] that using ZFS hasn't caused data loss so far
I completely agree that [1] is wonderful and a huge advantage. And you
also underlined [1] in your e-mail!
The _only_ thing I commented on was [2]. And I guess Robert wrote about
it only because ZFS is relatively young. When you talk
about VxFS/UFS you don't underline that they don't lose data - it
would be ridiculous. 

Regards
przemol

On Fri, Dec 22, 2006 at 11:39:44AM +0100, Ulrich Graef wrote:
 [EMAIL PROTECTED] wrote:
 
 Robert,
 
 I don't understand why not loosing any data is an advantage of ZFS.
 No filesystem should lose any data. It is like saying that an advantage
 of football player is that he/she plays football (he/she should do that !)
 or an advantage of chef is that he/she cooks (he/she should do that !).
 Every filesystem should _save_ our data, not lose it.
  
 
 yes, you are right: every filesystem should save the data.
 (... and every program should have no error! ;-)
 
 Unfortunately there are some cases, where the disks lose data,
 these cannot be detected by traditional filesystems but with ZFS:
 
* bit rot: some bits on the disk gets flipped (~ 1 in 10^11)
  (cosmic rays, static particles in airflow, random thermodynamics)
* phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
  (positioning errors, disk firmware errors, ...)
* misdirected reads/writes: disk writes to the wrong position (~ 1
  in 10^8)
  (disks use very small structures, head can move after positioning)
* errors on the data transfer connection
 
 You can look up the probabilities at several disk vendors, the are 
 published.
 Traditional filesystems do not check the data they read. You get strange 
 effects
 when the filesystem code runs with wrong metadata (worst case: panic).
 If you use the wrong data in your applicaton, you 'only' have the wrong 
 results...
 
 ZFS on the contrary checks every block it reads and is able to find the 
 mirror
 or reconstruct the data in a raidz config.
 Therefore ZFS uses only valid data and is able to repair the data blocks 
 automatically.
 This is not possible in a traditional filesystem/volume manager 
 configuration.
 
 You may say, you never heard of a disk losing data; but you have heard 
 of systems,
 which behave strange and a re-installation fixed everything.
 Or some data have gone bad and you have to recover from backup.
 
 It may be, that this was one of these cases.
 Our service encounters a number of these cases every year,
 where the customer was not able to re-install or did not want to restore 
 his data,
 which can be traced back to such a disk error.
 These are always nasty problems and it gets nastier, because customers
 have more and more data and there is a trend to save money on backup/restore
 infrastructures which make it hurt to restore data.
 
 Regards,
 
Ulrich
 
 -- 
 | Ulrich Graef, Senior Consultant, OS Ambassador \
 |  Operating Systems, Performance \ Platform Technology   \
 |   Mail: [EMAIL PROTECTED] \ Global Systems Enginering \
 |Phone: +49 6103 752 359\ Sun Microsystems Inc  \
 

--
Are you a driver? Then read this!  http://link.interia.pl/f199e

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: Snapshots impact on performance

2006-12-22 Thread Robert Milkowski
Hi.

  The problem is getting worse... now, even after I destroy all snapshots in a pool,
I get performance problems, even with zil_disable set to 1.

Despite having the limit for maximum nfs threads set to 2048, I get only about 1700.
If I want to kill the nfsd server it takes 1-4 minutes until all threads are
finished (and most of the time over 1000 threads are still there). While nfsd
is stopping I can see IOs (99% reads) to the pool.

Also, simple zfs commands (like changing the quota for a file system, etc.) take too
much time to complete (like 3-5 minutes to set a quota for a file system).

There's still over 700 GB of free storage in the pool, and setting the quota to none
doesn't help.

Using iostat it looks like the disks aren't saturated. It looks almost as if a lot of
nfsd threads are spinning (probably in zfs) - I don't recall nfsd taking this long
to stop on UFS file systems.
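
For reference, a minimal sketch of how one could confirm the two settings
mentioned above on a Solaris 10 box (the variable and file names are the
usual ones for this kind of setup, but treat them as assumptions about this
particular system):

# current value of the zil_disable kernel tunable (1 = ZIL disabled)
echo "zil_disable/D" | mdb -k

# the nfsd thread ceiling is NFSD_SERVERS in /etc/default/nfs
grep NFSD_SERVERS /etc/default/nfs

# number of threads (LWPs) the running nfsd actually has
ps -o nlwp= -p `pgrep nfsd`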


bash-3.00# iostat -xnzC 1
[... first output]
                    extended device statistics
    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 1782.3    0.0 12700.7    0.0  0.0 11.1    0.0    6.2   0 293 c5
  584.3    0.0  4102.2    0.0  0.0  3.6    0.0    6.2   0  96 c5t600C0FF0098FD57F9DA83C00d0
  572.1    0.0  4144.8    0.0  0.0  3.7    0.0    6.5   0  99 c5t600C0FF0098FD55DBA4EA000d0
  625.8    0.0  4453.7    0.0  0.0  3.7    0.0    5.9   0  98 c5t600C0FF0098FD516E4403200d0
                    extended device statistics
    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 2381.9    0.0 17242.1    0.0  0.0 16.7    0.0    7.0   0 291 c5
  800.0    0.0  5827.9    0.0  0.0  5.5    0.0    6.9   0  96 c5t600C0FF0098FD57F9DA83C00d0
  690.0    0.0  4991.9    0.0  0.0  4.9    0.0    7.1   0  97 c5t600C0FF0098FD55DBA4EA000d0
  892.0    0.0  6422.3    0.0  0.0  6.3    0.0    7.1   0  98 c5t600C0FF0098FD516E4403200d0
                    extended device statistics
    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 1890.2    0.0 12826.0    0.0  0.0 11.4    0.0    6.0   0 292 c5
  611.1    0.0  3998.5    0.0  0.0  3.4    0.0    5.6   0  96 c5t600C0FF0098FD57F9DA83C00d0
  604.1    0.0  4259.5    0.0  0.0  3.9    0.0    6.4   0  98 c5t600C0FF0098FD55DBA4EA000d0
  675.1    0.0  4568.0    0.0  0.0  4.1    0.0    6.0   0  98 c5t600C0FF0098FD516E4403200d0
^C



Now, the dtrace output you requested:

dtrace -n '[EMAIL PROTECTED](20)] = count()}' -c 'sleep 5'
[...]

  unix`i_ddi_splhigh
  unix`disp_getwork+0x38
  unix`idle+0xd4
  unix`thread_start+0x4
   35

  unix`i_ddi_splx
  unix`disp_getwork+0x160
  unix`idle+0xd4
  unix`thread_start+0x4
   38

  unix`disp_getwork+0x70
  unix`idle+0xd4
  unix`thread_start+0x4
   38

  unix`disp_getwork+0x158
  unix`idle+0xd4
  unix`thread_start+0x4
   38

  unix`i_ddi_splhigh+0x14
  unix`disp_getwork+0x38
  unix`idle+0xd4
  unix`thread_start+0x4
   39

  unix`disp_getwork+0x8c
  unix`idle+0xd4
  unix`thread_start+0x4
   39

  unix`idle+0x12c
  unix`thread_start+0x4
   44

  unix`disp_getwork+0x1a8
  unix`idle+0xd4
  unix`thread_start+0x4
   47

  unix`disp_getwork+0x7c
  unix`idle+0xd4
  unix`thread_start+0x4
   49

  unix`disp_getwork+0x90
  unix`idle+0xd4
  unix`thread_start+0x4
   56

  unix`disp_getwork+0x10c
  unix`idle+0xd4
  unix`thread_start+0x4
   62

  unix`i_ddi_splx+0x1c
  unix`disp_getwork+0x160
  unix`idle+0xd4
  unix`thread_start+0x4
  117
bash-3.00#
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: Snapshots impact on performance

2006-12-22 Thread Robert Milkowski
bash-3.00# lockstat -kgIW sleep 100 | head -30

Profiling interrupt: 38844 events in 100.098 seconds (388 events/sec)

Count genr cuml rcnt     nsec Hottest CPU+PIL        Caller
---
32081  83%  0.00 2432 cpu[1] thread_start
24347  63%  0.00 2116 cpu[1]+11  idle
20724  53%  0.00 2125 cpu[1]+11  disp_getwork
 5525  14%  0.00 3571 cpu0   syscall_trap32
 5470  14%  0.00 3570 cpu0   nfssys
 5470  14%  0.00 3570 cpu0   svc_run
 5225  13%  0.00 3595 cpu0   svc_getreq
 5142  13%  0.00 3600 cpu0   common_dispatch
 4488  12%  0.00 3040 cpu[2] taskq_thread
 3658   9%  0.00 2946 cpu[2] zio_vdev_io_assess
 2943   8%  0.00 2874 cpu[2] zio_read_decompress
 2846   7%  0.00 2120 cpu[1] splx
 2654   7%  0.00 4105 cpu[1] putnext
 2541   7%  0.00 2785 cpu[2] lzjb_decompress
 2382   6%  0.00 2138 cpu[2] i_ddi_splhigh
 2056   5%  0.00 3481 cpu0   fop_lookup
 2031   5%  0.00 3485 cpu0   zfs_lookup
 1979   5%  0.00 3482 cpu0   zfs_dirlook
 1935   5%  0.00 3492 cpu0   zfs_dirent_lock
 1910   5%  0.00 3835 cpu[1] txg_sync_thread
 1910   5%  0.00 3835 cpu[1] spa_sync
 1896   5%  0.00 3842 cpu[1] dsl_pool_sync
 1896   5%  0.00 3842 cpu[1] dmu_objset_sync
 1895   5%  0.00 3843 cpu[1] dmu_objset_sync_dnodes
 1894   5%  0.00 3843 cpu[1] dnode_sync
bash-3.00#
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Shawn Joy
OK,

But let's get back to the original question:

Does ZFS provide you with fewer features than UFS does on one LUN from a SAN
(i.e., is it less stable)?

 ZFS, on the contrary, checks every block it reads and is able to find the good
 mirror copy or reconstruct the data in a raidz config.
 Therefore ZFS uses only valid data and is able to repair bad data blocks
 automatically.
 This is not possible in a traditional filesystem/volume manager configuration.

The above is fine if I have two LUNs. But my original question was about having
only one LUN.

What about kernel panics from ZFS if, for instance, access to one controller goes
away for a few seconds or minutes? Normally UFS would just sit there and warn that
I have lost access to the controller. Then, when the controller returns after a
short period, the warnings go away and the LUN continues to operate. The admin
can then research further into why the controller went away. With ZFS, the
above will panic the system and possibly cause other corruption on other LUNs
due to this panic? I believe this was discussed in other threads? I also
believe there is a bug filed against this? If so, when should we expect this bug
to be fixed?


My understanding of ZFS is that it functions better in an environment where we
have JBODs attached to the hosts; that way ZFS takes care of all of the
redundancy. But what about SAN environments where customers have spent big money
to invest in storage? I know of one instance where a customer has a growing
need for more storage space. Their environment uses many inodes. Due to the UFS
inode limitation, when creating LUNs over one TB, they would have to quadruple
the amount of storage used in their SAN in order to hold all of the files. A
possible solution to this inode issue would be ZFS. However, they have
experienced kernel panics in their environment when a controller dropped
offline.

Anybody have a solution to this?

Shawn
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Re[2]: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Roch - PAE


Robert Milkowski writes:
  Hello przemolicc,
  
  Friday, December 22, 2006, 10:02:44 AM, you wrote:
  
  ppf On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
   Hello Shawn,
   
   Thursday, December 21, 2006, 4:28:39 PM, you wrote:
   
   SJ All,
   
   SJ I understand that ZFS gives you more error correction when using
   SJ two LUNS from a SAN. But, does it provide you with less features
   SJ than UFS does on one LUN from a SAN (i.e is it less stable).
   
   With only one LUN you still get error detection which UFS doesn't give
   you. You still can use snapshots, clones, quotas, etc. so in general
   you still have more features than UFS.
   
   Now when in comes to stability - depends. UFS is for years in use
   while ZFS much younger.
   
   More and more people are using ZFS in production and while there're
   some corner cases mostly performance related, it works really good.
   And I haven't heard of verified data lost due to ZFS. I've been using
   ZFS for quite some time (much sooner than it was available in SX) and
   I haven't also lost any data.
  
  ppf Robert,
  
  ppf I don't understand why not loosing any data is an advantage of ZFS.
  ppf No filesystem should lose any data. It is like saying that an advantage
  
  I wasn't saying this is advantage. Of course no file system should
  lose your data - it's just that when new file systems show up on
  market people do not trust them in general at first - which is
  expected precaution.
  
  Part of such perception is Linux - due to different development type
  you often get software badly written and tested - try to look at
  google how many people lost their data with RaiserFS for example.
  The same happened for many people with XFS on Linux.
  
  That's why I thought emphasis on ZFS that it hasn't lost my data even if
  it's new-born file system and I've been using it for years (as other
  users) is important, especially for people mostly from Linux world.
  
  ps. I really belive development style in Open Solaris is better than
  in Linux (kernel).
  

The fact that most FSes do not manage the disk write caches
does mean you're at risk of data loss with those FSes.

-r


  -- 
  Best regards,
   Robertmailto:[EMAIL PROTECTED]
 http://milek.blogspot.com
  
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Tim Cook
This may not be the answer you're looking for, but I don't know if it's
something you've thought of.  If you're pulling a LUN from an expensive
array, with multiple HBA's in the system, why not run mpxio?  If you ARE
running mpxio, there shouldn't be an issue with a path dropping.  I have
the setup above in my test lab and pull cables all the time and have yet
to see a zfs kernel panic.  Is this something you've considered?  I
haven't seen the bug in question, but I definitely have not run into it
when running mpxio.
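
For anyone who hasn't set this up before, a rough sketch of what enabling
MPxIO looks like on a Solaris 10 host with supported FC HBAs (this
illustrates the suggestion above and makes no assumption about the original
poster's actual hardware):

# enable MPxIO for fibre-channel devices system-wide (requires a reboot)
stmsboot -e

# equivalently, set mpxio-disable="no" in /kernel/drv/fp.conf and reboot

# after the reboot each multipathed LUN appears once, as a single
# scsi_vhci device, rather than once per physical path
format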

--Tim

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Shawn Joy
Sent: Friday, December 22, 2006 7:35 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN
froma SAN

OK,

But lets get back to the original question.

Does ZFS provide you with less features than UFS does on one LUN from a
SAN (i.e is it less stable).

ZFS on the contrary checks every block it reads and is able to find the
mirror
or reconstruct the data in a raidz config.
Therefore ZFS uses only valid data and is able to repair the data
blocks
automatically.
This is not possible in a traditional filesystem/volume manager
configuration.

The above is fine. If I have two LUNs. But my original question was if I
only have one LUN. 

What about kernel panics from ZFS if for instance access to one
controller goes away for a few seconds or minutes. Normally UFS would
just sit there and warn I have lost access to the controller. Then when
the controller returns, after a short period, the warnings go away and
the LUN continues to operate. The admin can then research further into
why the controller went away. With ZFS, the above will panic the system
and possibly cause other coruption  on other LUNs due to this panic? I
believe this was discussed in other threads? I also believe there is a
bug filed against this? If so when should we expect this bug to be
fixed?


My understanding of ZFS is that it functions better in an environment
where we have JBODs attached to the hosts. This way ZFS takes care of
all of the redundancy? But what about SAN enviroments where customers
have spend big money to invest in storage. I know of one instance where
a customer has a growing need for more storage space. There environemt
uses many inodes. Due to the UFS inode limitation, when creating LUNs
over one TB, they would have to quadrulpe the about of storage usesd in
there SAN in order to hold all of the files. A possible solution to this
inode issue would be ZFS. However they have experienced kernel panics in
there environment when a controller dropped of line.

Any body have a solution to this?

Shawn
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Shawn Joy

No,

I have not played with this, as I do not have access to my customer
site. They have tested this themselves. It is unclear if they
implemented this on an MPxIO/SSTM device. I will ask this question.


Thanks,
Shawn

Tim Cook wrote:

This may not be the answer you're looking for, but I don't know if it's
something you've thought of.  If you're pulling a LUN from an expensive
array, with multiple HBA's in the system, why not run mpxio?  If you ARE
running mpxio, there shouldn't be an issue with a path dropping.  I have
the setup above in my test lab and pull cables all the time and have yet
to see a zfs kernel panic.  Is this something you've considered?  I
haven't seen the bug in question, but I definitely have not run into it
when running mpxio.

--Tim

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Shawn Joy
Sent: Friday, December 22, 2006 7:35 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN
froma SAN

OK,

But lets get back to the original question.

Does ZFS provide you with less features than UFS does on one LUN from a
SAN (i.e is it less stable).


ZFS on the contrary checks every block it reads and is able to find the
mirror
or reconstruct the data in a raidz config.
Therefore ZFS uses only valid data and is able to repair the data

blocks

automatically.
This is not possible in a traditional filesystem/volume manager
configuration.


The above is fine. If I have two LUNs. But my original question was if I
only have one LUN. 


What about kernel panics from ZFS if for instance access to one
controller goes away for a few seconds or minutes. Normally UFS would
just sit there and warn I have lost access to the controller. Then when
the controller returns, after a short period, the warnings go away and
the LUN continues to operate. The admin can then research further into
why the controller went away. With ZFS, the above will panic the system
and possibly cause other coruption  on other LUNs due to this panic? I
believe this was discussed in other threads? I also believe there is a
bug filed against this? If so when should we expect this bug to be
fixed?


My understanding of ZFS is that it functions better in an environment
where we have JBODs attached to the hosts. This way ZFS takes care of
all of the redundancy? But what about SAN enviroments where customers
have spend big money to invest in storage. I know of one instance where
a customer has a growing need for more storage space. There environemt
uses many inodes. Due to the UFS inode limitation, when creating LUNs
over one TB, they would have to quadrulpe the about of storage usesd in
there SAN in order to hold all of the files. A possible solution to this
inode issue would be ZFS. However they have experienced kernel panics in
there environment when a controller dropped of line.

Any body have a solution to this?

Shawn
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Shawn Joy
Systems Support Specialist

Sun Microsystems, Inc.
1550 Bedford Highway, Suite 302
Bedford, Nova Scotia B4A 1E6 CA
Phone 902-832-6213
Fax 902-835-6321
Email [EMAIL PROTECTED]

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: RE: What SATA controllers are people using for ZFS?

2006-12-22 Thread Lida Horn
 And yes, I would feel better if this driver was open sourced, but that
 is Sun's decision to make.

Well, no.  That is Marvell's decision to make.  Marvell is
the one who made the determination that the driver
could not be open sourced, not Sun.  Since Sun
needed information received under NDA from
Marvell in order to write this driver, Marvell can
dictate whether or not the source can be made
available.

Regards,
Lida
 
 Regards,
 
 Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
 Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
 OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006

 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: RE: What SATA controllers are people using for ZFS?

2006-12-22 Thread Al Hopper
On Fri, 22 Dec 2006, Lida Horn wrote:

  And yes, I would feel better if this driver was open
  sourced but
  that
  is Suns' decision to make.

 Well, no.  That is Marvell's decision to make.  Marvell is
 the one who make the determination that the driver
 could not be open sourced, not Sun.  Since Sun
 needed information received under NDA from
 Marvell in order to write this driver, Marvell can
 dictate whether or not the source can be made
 available.

Thanks Lida, for the clarification.  Do you have a contact at Marvell
where we can send email requests to have the necessary info made public?

Thanks,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Jason J. W. Williams

Just for what it's worth, when we rebooted a controller in our array
(we pre-moved all the LUNs to the other controller), ZFS kernel panicked
despite using MPxIO. We verified that all the LUNs were on the
correct controller when this occurred. It's not clear why ZFS thought
it lost a LUN, but it did. We have done cable pulling using ZFS/MPxIO
before and that works very well. It may well be array-related in our
case, but I'd hate for anyone to have a false sense of security.

-J

On 12/22/06, Tim Cook [EMAIL PROTECTED] wrote:

This may not be the answer you're looking for, but I don't know if it's
something you've thought of.  If you're pulling a LUN from an expensive
array, with multiple HBA's in the system, why not run mpxio?  If you ARE
running mpxio, there shouldn't be an issue with a path dropping.  I have
the setup above in my test lab and pull cables all the time and have yet
to see a zfs kernel panic.  Is this something you've considered?  I
haven't seen the bug in question, but I definitely have not run into it
when running mpxio.

--Tim

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Shawn Joy
Sent: Friday, December 22, 2006 7:35 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN
froma SAN

OK,

But lets get back to the original question.

Does ZFS provide you with less features than UFS does on one LUN from a
SAN (i.e is it less stable).

ZFS on the contrary checks every block it reads and is able to find the
mirror
or reconstruct the data in a raidz config.
Therefore ZFS uses only valid data and is able to repair the data
blocks
automatically.
This is not possible in a traditional filesystem/volume manager
configuration.

The above is fine. If I have two LUNs. But my original question was if I
only have one LUN.

What about kernel panics from ZFS if for instance access to one
controller goes away for a few seconds or minutes. Normally UFS would
just sit there and warn I have lost access to the controller. Then when
the controller returns, after a short period, the warnings go away and
the LUN continues to operate. The admin can then research further into
why the controller went away. With ZFS, the above will panic the system
and possibly cause other coruption  on other LUNs due to this panic? I
believe this was discussed in other threads? I also believe there is a
bug filed against this? If so when should we expect this bug to be
fixed?


My understanding of ZFS is that it functions better in an environment
where we have JBODs attached to the hosts. This way ZFS takes care of
all of the redundancy? But what about SAN enviroments where customers
have spend big money to invest in storage. I know of one instance where
a customer has a growing need for more storage space. There environemt
uses many inodes. Due to the UFS inode limitation, when creating LUNs
over one TB, they would have to quadrulpe the about of storage usesd in
there SAN in order to hold all of the files. A possible solution to this
inode issue would be ZFS. However they have experienced kernel panics in
there environment when a controller dropped of line.

Any body have a solution to this?

Shawn


This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Tim Cook
Always good to hear others' experiences, J.  Maybe I'll try firing up the
Nexan today and downing a controller to see how that affects it vs.
downing a switch port/pulling a cable.  My first intuition is time-out
values.  A cable pull will register differently than a blatant time-out
depending on where it occurs.  IE: pulling the cable from the back of
the server will register instantly, vs. the storage timing out 3
switches away.  I'm sure you're aware of that, but just an FYI for
others following the thread who are less familiar with SAN technology.

To get a little more background:

What kind of an array is it?

How do you have the controllers set up?  Active/active?  Active/passive?
In other words, do you have array-side failover occurring as well, or is
it in *dummy mode*?

Do you have multiple physical paths?  IE: each controller port and each
server port hitting different switches?

What HBAs are you using?  What switches?

What version of snv are you running, and which driver?

Yay for slow Fridays before x-mas, I have a bit of time to play in the
lab today.

--Tim

-Original Message-
From: Jason J. W. Williams [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 22, 2006 10:56 AM
To: Tim Cook
Cc: Shawn Joy; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: Difference between ZFS and UFS with one
LUN froma SAN

Just for what its worth, when we rebooted a controller in our array
(we pre-moved all the LUNs to the other controller), despite using
MPXIO ZFS kernel panicked. Verified that all the LUNs were on the
correct controller when this occurred. Its not clear why ZFS thought
it lost a LUN but it did. We have done cable pulling using ZFS/MPXIO
before and that works very well. It may well be array-related in our
case, but I hate anyone to have a false sense of security.

-J

On 12/22/06, Tim Cook [EMAIL PROTECTED] wrote:
 This may not be the answer you're looking for, but I don't know if
it's
 something you've thought of.  If you're pulling a LUN from an
expensive
 array, with multiple HBA's in the system, why not run mpxio?  If you
ARE
 running mpxio, there shouldn't be an issue with a path dropping.  I
have
 the setup above in my test lab and pull cables all the time and have
yet
 to see a zfs kernel panic.  Is this something you've considered?  I
 haven't seen the bug in question, but I definitely have not run into
it
 when running mpxio.

 --Tim

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Joy
 Sent: Friday, December 22, 2006 7:35 AM
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN
 froma SAN

 OK,

 But lets get back to the original question.

 Does ZFS provide you with less features than UFS does on one LUN from
a
 SAN (i.e is it less stable).

 ZFS on the contrary checks every block it reads and is able to find
the
 mirror
 or reconstruct the data in a raidz config.
 Therefore ZFS uses only valid data and is able to repair the data
 blocks
 automatically.
 This is not possible in a traditional filesystem/volume manager
 configuration.

 The above is fine. If I have two LUNs. But my original question was if
I
 only have one LUN.

 What about kernel panics from ZFS if for instance access to one
 controller goes away for a few seconds or minutes. Normally UFS would
 just sit there and warn I have lost access to the controller. Then
when
 the controller returns, after a short period, the warnings go away and
 the LUN continues to operate. The admin can then research further into
 why the controller went away. With ZFS, the above will panic the
system
 and possibly cause other coruption  on other LUNs due to this panic? I
 believe this was discussed in other threads? I also believe there is a
 bug filed against this? If so when should we expect this bug to be
 fixed?


 My understanding of ZFS is that it functions better in an environment
 where we have JBODs attached to the hosts. This way ZFS takes care of
 all of the redundancy? But what about SAN enviroments where customers
 have spend big money to invest in storage. I know of one instance
where
 a customer has a growing need for more storage space. There environemt
 uses many inodes. Due to the UFS inode limitation, when creating LUNs
 over one TB, they would have to quadrulpe the about of storage usesd
in
 there SAN in order to hold all of the files. A possible solution to
this
 inode issue would be ZFS. However they have experienced kernel panics
in
 there environment when a controller dropped of line.

 Any body have a solution to this?

 Shawn


 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Jason J. W. Williams

Hi Tim,

One-switch environment: two ports going to the host, four ports going to
the storage. The switch is a Brocade SilkWorm 3850 and the HBA is a
dual-port QLA2342. The Solaris rev is S10 Update 3. The array is a StorageTek
FLX210 (Engenio 2884).

The LUNs had moved to the other controller and MPxIO had shown the
paths change as a result, so it was a bit bizarre. Rebooting the other
controller shouldn't have done anything, but it did. It could have been
the array.

-J

On 12/22/06, Tim Cook [EMAIL PROTECTED] wrote:

Always good to hear others experiences J.  Maybe I'll try firing up the
Nexan today and downing a controller to see how that affects it vs.
downing a switch port/pulling cable.  My first intuition is time-out
values.  A cable pull will register differently than a blatant time-out
depending on where it occurs.  IE: Pulling the cable from the back of
the server will register instantly, vs. the storage timing out 3
switches away.  I'm sure you're aware of that, but just an FYI for
others following the thread less familiar with SAN technology.

To get a little more background:

What kind of an array is it?

How do you have the controllers setup?  Active/active?  Active/passive?
In other words do you have array side failover occurring as well or is
it in *dummy mode*?

Do you have multiple physical paths?  IE: each controller port and each
server port hitting different switches?

What HBA's are you using?  What switches?

What version of snv are you running, and which driver?

Yey for slow Friday's before x-mas, I have a bit of time to play in the
lab today.

--Tim

-Original Message-
From: Jason J. W. Williams [mailto:[EMAIL PROTECTED]
Sent: Friday, December 22, 2006 10:56 AM
To: Tim Cook
Cc: Shawn Joy; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: Difference between ZFS and UFS with one
LUN froma SAN

Just for what its worth, when we rebooted a controller in our array
(we pre-moved all the LUNs to the other controller), despite using
MPXIO ZFS kernel panicked. Verified that all the LUNs were on the
correct controller when this occurred. Its not clear why ZFS thought
it lost a LUN but it did. We have done cable pulling using ZFS/MPXIO
before and that works very well. It may well be array-related in our
case, but I hate anyone to have a false sense of security.

-J

On 12/22/06, Tim Cook [EMAIL PROTECTED] wrote:
 This may not be the answer you're looking for, but I don't know if
it's
 something you've thought of.  If you're pulling a LUN from an
expensive
 array, with multiple HBA's in the system, why not run mpxio?  If you
ARE
 running mpxio, there shouldn't be an issue with a path dropping.  I
have
 the setup above in my test lab and pull cables all the time and have
yet
 to see a zfs kernel panic.  Is this something you've considered?  I
 haven't seen the bug in question, but I definitely have not run into
it
 when running mpxio.

 --Tim

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Joy
 Sent: Friday, December 22, 2006 7:35 AM
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN
 froma SAN

 OK,

 But lets get back to the original question.

 Does ZFS provide you with less features than UFS does on one LUN from
a
 SAN (i.e is it less stable).

 ZFS on the contrary checks every block it reads and is able to find
the
 mirror
 or reconstruct the data in a raidz config.
 Therefore ZFS uses only valid data and is able to repair the data
 blocks
 automatically.
 This is not possible in a traditional filesystem/volume manager
 configuration.

 The above is fine. If I have two LUNs. But my original question was if
I
 only have one LUN.

 What about kernel panics from ZFS if for instance access to one
 controller goes away for a few seconds or minutes. Normally UFS would
 just sit there and warn I have lost access to the controller. Then
when
 the controller returns, after a short period, the warnings go away and
 the LUN continues to operate. The admin can then research further into
 why the controller went away. With ZFS, the above will panic the
system
 and possibly cause other coruption  on other LUNs due to this panic? I
 believe this was discussed in other threads? I also believe there is a
 bug filed against this? If so when should we expect this bug to be
 fixed?


 My understanding of ZFS is that it functions better in an environment
 where we have JBODs attached to the hosts. This way ZFS takes care of
 all of the redundancy? But what about SAN enviroments where customers
 have spend big money to invest in storage. I know of one instance
where
 a customer has a growing need for more storage space. There environemt
 uses many inodes. Due to the UFS inode limitation, when creating LUNs
 over one TB, they would have to quadrulpe the about of storage usesd
in
 there SAN in order to hold all of the files. A possible solution to
this
 inode issue would be ZFS. However they have experienced 

Re: [zfs-discuss] Re: zfs list and snapshots..

2006-12-22 Thread Wade . Stuart





[EMAIL PROTECTED] wrote on 12/22/2006 04:50:25 AM:

 Hello Wade,

 Thursday, December 21, 2006, 10:15:56 PM, you wrote:





 WSfc Hola folks,

 WSfc   I am new to the list, please redirect me if I am posting
 to the wrong
 WSfc location.  I am starting to use ZFS in production (Solaris x86 10U3
--
 WSfc 11/06) and I seem to be seeing unexpected behavior for zfs list and
 WSfc snapshots.  I create a filesystem (lets call it a/b where a isthe
pool).
 WSfc Now, if I store 100 gb of files on a/b and then snapshot a/[EMAIL 
 PROTECTED]
then
 WSfc delete about 50 gb of files from a/b -- I expect to see ~50
gbUSED on
 WSfc both a/b and a/[EMAIL PROTECTED] via zfs list output -- instead I only
 seem to see the
 WSfc delta block adds as USED (~20mb) on a/[EMAIL PROTECTED]  Is this
 correct behavior?
 WSfc how do you track the total delta blocks the snap is using vs other
snaps
 WSfc and live fs?

 This is almost[1] ok. When you delete a file from a file system you
 definitely expect to see that the file system allocated space reduced
 by about the same size.

 [1] the problem is that space consumed by snapshot isn't entirely
 correct and once you delete snapshot you'll actually get some more
 space than zfs list reported for that snapshot as used space. It's not
 a big deal but still it makes it harder to determine exactly how much
 space is allocated for snapshots for a given file system.


Well, this is a problem for me. In the case I showed above, the snapshot
USED in zfs list is not only a little off on how much space it is actually
reserving for the delta blocks -- it is 50 GB off out of a 52.002 GB
delta. Now, this is a test case where I actually know the delta. When
this goes into production and I need to snap 6+ times a day on dynamic
filesystems, how am I to programmatically determine how many snaps need to
fall off over time to keep the maximum number of snapshots while
retaining enough free pool space for new live updates?  I find it hard to
believe that, with all of the magic of ZFS (it is a truly great leap in
filesystems), I am expected to blindly remove tail snaps until I free enough
space on the pool.


I have to assume there is a more valid metric somewhere for how much pool
space is reserved for a snapshot in time, or this zfs list output is buggy...




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: !

2006-12-22 Thread Anton B. Rang
 Unfortunately there are some cases, where the disks lose data,
 these cannot be detected by traditional filesystems but with ZFS:
 
 * bit rot: some bits on the disk gets flipped (~  1 in 10^11)
 * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
 * misdirected reads/writes: disk writes to the wrong position (~ 1 in 10^8)

 You can look up the probabilities at several disk
 vendors; they are published.

I'm puzzled where you got those numbers from.  They seem to be several orders 
of magnitude too low.

Bit errors:

For SATA disks, the probability of an *uncorrected* error is roughly 1 in 10^14 
bits read (12 terabytes or so).  [Seagate WinHEC].  These should be handled 
identically by ZFS and a traditional file system over RAID.

The probability of either an *undetected* or *miscorrected* error is not, so 
far as I know, published for disks.  For high-end tape, where the uncorrected 
error rate is roughly 1 in 10^17 bits read, the miscorrected error rate is 1 in 
10^33 bits.  Modern disks may use a two-level ECC [IBM ECC] which reduces even 
further the miscorrected error rate. These are one class of errors which ZFS 
will catch and a traditional file system will not.

Phantom writes and/or misdirected reads/writes:

I haven't seen probabilities published on this; obviously the disk vendors 
would claim zero, but we believe they're slightly wrong.  ;-)  That said, 1 in 
10^8 bits would mean we’d have an error in every 12 megabytes written!  That’s 
clearly far too low.  1 in 10^8 blocks would be an error in every 46 gigabytes 
written; that is also clearly far too low. (At 1 GB/second that would be a 
phantom write every minute.)


References:

[Seagate WINHEC] SATA in the Enterprise. Can be found at 
http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWST05005_WinHEC05.ppt.

[IBM ECC] Two-level coding for error control in magnetic disk storage 
products. Can be found at 
http://www.research.ibm.com/journal/rd/334/ibmrd3304G.pdf.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: !

2006-12-22 Thread Ed Gould

On  Dec 22, 2006, at 09:50, Anton B. Rang wrote:

Phantom writes and/or misdirected reads/writes:

I haven't seen probabilities published on this; obviously the disk  
vendors would claim zero, but we believe they're slightly  
wrong.  ;-)  That said, 1 in 10^8 bits would mean we’d have an  
error in every 12 megabytes written!  That’s clearly far too low.   
1 in 10^8 blocks would be an error in every 46 gigabytes written;  
that is also clearly far too low. (At 1 GB/second that would be a  
phantom write every minute.)


Jim Gray (a well-known and respected database expert, currently at
Microsoft) claims that the drive/controller combination will write
data to the wrong place on the drive at a rate of about one incident/
drive/year.  In a 400-drive array (JBOD or RAID, doesn't matter),
that would be about once a day.  This is a kind of error that (so
far, at least) can only be detected (and potentially corrected, given
redundancy) by ZFS.


--Ed



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: B54 and marvell cards

2006-12-22 Thread Lida Horn
 We just put together a new system for ZFS use at a company, and twice
 in one week we've had the system wedge. You can log on, but the zpools
 are hosed, and a reboot never occurs if requested since it can't
 unmount the zfs volumes. So, only a power cycle works.

I've tried to reproduce this problem here at Sun with
no luck.  No errors, no wedges, nothing but
normal behavior.  If there is to be any progress on
this, more details on how to reproduce this problem
will have to be provided.

Regards,
Lida
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Torrey McMahon

Roch - PAE wrote:


The fact that most FSes do not manage the disk write caches
does mean you're at risk of data loss with those FSes.



Does ZFS? I thought it just turned it on in the places where we had
previously turned it off.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Robert Milkowski
Hello Torrey,

Friday, December 22, 2006, 9:17:46 PM, you wrote:

TM Roch - PAE wrote:

 The fact that most FSes do not manage the disk write caches
 does mean you're at risk of data loss with those FSes.


TM Does ZFS? I thought it just turned it on in the places where we had
TM previously turned it off.

ZFS sends a flush-cache command after each transaction group, so it's sure
the transaction is on stable storage.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: zfs list and snapshots..

2006-12-22 Thread Anton B. Rang
Do you have more than one snapshot?

If you have a file system a, and create two snapshots [EMAIL PROTECTED] and 
[EMAIL PROTECTED], then any space shared between the two snapshots does not 
get accounted for anywhere visible.  Only once one of those two is deleted, so 
that all the space is private to only one snapshot, does it become visible as 
usage.
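
A small sketch that makes this visible (the pool, filesystem and snapshot
names are made up; mkfile is just a convenient way to create a large file on
Solaris):

zfs create tank/demo
mkfile 1g /tank/demo/bigfile
zfs snapshot tank/demo@s1
zfs snapshot tank/demo@s2
rm /tank/demo/bigfile

# the ~1 GB is now referenced only by the two snapshots, but because it
# is shared between them it is not charged to either snapshot's USED
zfs list

zfs destroy tank/demo@s1

# with a single snapshot left holding those blocks, the ~1 GB now
# shows up as USED on tank/demo@s2
zfs list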
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN

2006-12-22 Thread Neil Perrin



Robert Milkowski wrote On 12/22/06 13:40,:

Hello Torrey,

Friday, December 22, 2006, 9:17:46 PM, you wrote:

TM Roch - PAE wrote:


The fact that most FSes do not manage the disk write caches
does mean you're at risk of data loss with those FSes.




TM Does ZFS? I thought it just turned it on in the places where we had
TM previously turned it off.


ZFS sends a flush-cache command after each transaction group, so it's sure
the transaction is on stable storage.


... and after every fsync, O_DSYNC write, etc., that writes out intent log blocks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Robert Milkowski
Hello Jason,

Friday, December 22, 2006, 5:55:38 PM, you wrote:

JJWW Just for what its worth, when we rebooted a controller in our array
JJWW (we pre-moved all the LUNs to the other controller), despite using
JJWW MPXIO ZFS kernel panicked. Verified that all the LUNs were on the
JJWW correct controller when this occurred. Its not clear why ZFS thought
JJWW it lost a LUN but it did. We have done cable pulling using ZFS/MPXIO
JJWW before and that works very well. It may well be array-related in our
JJWW case, but I hate anyone to have a false sense of security.

Did you first check (with format, for example) whether the LUNs were really
accessible? If MPxIO worked OK and at least one path was OK, then ZFS
won't panic.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: Re[2]: [zfs-discuss] Re: Difference between ZFS and UFS with one LUN froma SAN

2006-12-22 Thread Tim Cook
More specifically, if you have the controllers in your array set up
active/passive, and they have a failover timeout of 30 seconds,
and the HBAs have a failover timeout of 20 seconds, then when it goes to
fail over and cannot write to the disks... I'm sure *bad things* will
happen.  Again, I haven't tested this scenario, but I can only imagine
it's not something that can be/should be/is recovered from gracefully.

--Tim

-Original Message-
From: Robert Milkowski [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 22, 2006 3:18 PM
To: Jason J. W. Williams
Cc: Tim Cook; zfs-discuss@opensolaris.org; Shawn Joy
Subject: Re[2]: [zfs-discuss] Re: Difference between ZFS and UFS with
one LUN froma SAN

Hello Jason,

Friday, December 22, 2006, 5:55:38 PM, you wrote:

JJWW Just for what its worth, when we rebooted a controller in our
array
JJWW (we pre-moved all the LUNs to the other controller), despite using
JJWW MPXIO ZFS kernel panicked. Verified that all the LUNs were on the
JJWW correct controller when this occurred. Its not clear why ZFS
thought
JJWW it lost a LUN but it did. We have done cable pulling using
ZFS/MPXIO
JJWW before and that works very well. It may well be array-related in
our
JJWW case, but I hate anyone to have a false sense of security.

Did you first check (with format for example) if LUNs were really
accessible? If MPxIO worked ok and at least one path is ok then ZFS
won't panic.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: zfs list and snapshots..

2006-12-22 Thread Robert Milkowski
Hello Anton,

Friday, December 22, 2006, 10:55:45 PM, you wrote:

ABR Do you have more than one snapshot?

ABR If you have a file system a, and create two snapshots [EMAIL PROTECTED]
ABR and [EMAIL PROTECTED], then any space shared between the two snapshots 
does
ABR not get accounted for anywhere visible.  Only once one of those
ABR two is deleted, so that all the space is private to only one
ABR snapshot, does it become visible as usage.
ABR  

7-15 snapshots, and all I can do is guess how much space I'll get
after deleting one of them (I can only say the minimum space I'll get).

Not a big problem, but still a little bit annoying.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Remote Replication

2006-12-22 Thread Eric Enright

Hi all,

I'm currently investigating solutions for disaster recovery, and would
like to go with a ZFS-based solution.  From what I understand, there
are two possible methods of achieving this: an iSCSI mirror over a WAN
link, and remote replication with incremental zfs send/recv.  Due to
performance considerations with the former, I'm looking mostly at
incremental replication over set intervals.
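
For what it's worth, the incremental send/recv cycle being described looks
roughly like the sketch below (the host, pool and snapshot names are made
up, and a real setup needs error handling and snapshot rotation around it):

# one-time full replication to the DR host
zfs snapshot tank/data@rep1
zfs send tank/data@rep1 | ssh drhost zfs recv backup/data

# on each interval: take a new snapshot and ship only the delta
zfs snapshot tank/data@rep2
zfs send -i tank/data@rep1 tank/data@rep2 | ssh drhost zfs recv backup/data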

My main question is: does anyone have experience doing this in
production?  It looks good on html and man pages, but I would like to
know if there are any caveats I should be aware of.  Various threads
I've read in the alias archives do not really seem to talk about
people's experiences with implementing it.

Additionally, are there any plans for tools to facilitate such a
system?  Something along the lines of a zfsreplicated service, which
could be more robust than a cron job?

Regards,
Eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Lots of snapshots make scrubbing extremely slow

2006-12-22 Thread Josip Gracin

Hello!

I'm generating two snapshots per day on my ZFS pool.  I've noticed that
after a while, scrubbing gets very slow, e.g. taking 12 hours or more
on a system with about 400 snapshots.  I think the slowdown is progressive.
When I delete most of the snapshots, things get back to normal, i.e.
scrubbing takes an hour or so.
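
If it helps, a rough pruning sketch that keeps the snapshot count down (it
assumes snapshot names that sort chronologically, e.g. date-stamped ones,
and the filesystem name is made up):

# destroy the 10 oldest snapshots of tank/data
for snap in $(zfs list -H -t snapshot -o name | grep '^tank/data@' | head -10); do
    zfs destroy "$snap"
done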


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss