Re: [zfs-discuss] [zfs] portable zfs send streams (preview webrev)

2012-10-22 Thread Arne Jansen
On 20.10.2012 22:24, Tim Cook wrote:
 
 
 On Sat, Oct 20, 2012 at 2:54 AM, Arne Jansen sensi...@gmx.net wrote:
 
 On 10/20/2012 01:10 AM, Tim Cook wrote:
 
 
  On Fri, Oct 19, 2012 at 3:46 PM, Arne Jansen sensi...@gmx.net wrote:
 
  On 10/19/2012 09:58 PM, Matthew Ahrens wrote:
   On Wed, Oct 17, 2012 at 5:29 AM, Arne Jansen sensi...@gmx.net wrote:
  
   We have finished a beta version of the feature. A webrev for it
   can be found here:
  
   http://cr.illumos.org/~webrev/sensille/fits-send/
  
   It adds a command 'zfs fits-send'. The resulting streams can
   currently only be received on btrfs, but more receivers will follow.
   It would be great if anyone interested could give it some testing
   and/or review. If there are no objections, I'll send a formal
   webrev soon.
  
  
  
   Please don't bother changing libzfs (and proliferating the copypasta
   there) -- do it like lzc_send().
  
 
  ok. It would be easier though if zfs_send would also already use the
  new style. Is it in the pipeline already?
 
   Likewise, zfs_ioc_fits_send should use the new-style API.  See the
   comment at the beginning of zfs_ioctl.c.
  
   I'm not a fan of the name FITS but I suppose somebody else already
   named the format.  If we are going to follow someone else's format
   though, it at least needs to be well-documented.  Where can we find
   the documentation?
  
   FYI, #1 google hit for FITS:  http://en.wikipedia.org/wiki/FITS
   #3 hit:  http://code.google.com/p/fits/
  
   Both have to do with file formats.  The entire first page of google
   results for "FITS format" and "FITS file format" is related to these
   two formats.  "FITS btrfs" didn't return anything specific to the file
   format, either.
 
  It's not too late to change it, but I have a hard time coming up with
  a better name. Also, the format is still very new and I'm sure it'll
  need some adjustments.
 
  -arne
 
  
   --matt
 
 
 
  I'm sure we can come up with something.  Are you planning on this being
  solely for ZFS, or a larger architecture for replication both directions
  in the future?
 
 We have senders for zfs and btrfs. The planned receiver will be mostly
 filesystem agnostic and can work on a much broader range. It basically
 only needs to know how to create snapshots and where to store a bit of
 metadata.
 It would be great if more filesystems would join on the sending side,
 but I have no involvement there.
 
 I see no basic problem in choosing a name that's already in use.
 Especially with file extensions, most will already be taken. How about
 something with 'portable' and 'backup', like pib or pibs? 'i' for
 incremental.
 
 -Arne
 
 
 Re-using names generally isn't a big deal, but in this case the existing name
 is a technology that's extremely similar to what you're doing - which WILL
 cause a ton of confusion in the userbase, and make troubleshooting far more
 difficult when searching google/etc looking for links to documents that are
 applicable.
 
 Maybe something like far - filesystem agnostic replication?   

I like that one. It has a nice connotation of 'remote'. So 'far' it be.
Thanks!

-Arne

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zfs] portable zfs send streams (preview webrev)

2012-10-22 Thread Arne Jansen
On 22.10.2012 06:32, Matthew Ahrens wrote:
 On Sat, Oct 20, 2012 at 1:24 PM, Tim Cook t...@cook.ms wrote:
 
 
 
 Re-using names generally isn't a big deal, but in this case the existing
 name is a technology that's extremely similar to what you're doing - which
 WILL cause a ton of confusion in the userbase, and make troubleshooting far
 more difficult when searching google/etc looking for links to documents that
 are applicable.
 
 Maybe something like far - filesystem agnostic replication?   
 
 
 All else being equal, I like this name (FAR).  It ends in AR like several
 other archive formats (TAR, WAR, JAR).  Plus not a lot of false positives when
 googling around for it.   
 
 However, if compatibility with the existing format is an explicit goal, we
 should use the same name, and the btrfs authors may be averse to changing the 
 name.

There's really nothing to keep. In the btrfs world, as in the zfs world, the
stream has no special name; it's just a 'btrfs send stream', like the 'zfs send
stream'. The necessity for a name only arises from the wish to build a bridge
between the worlds.
The author of 

Re: [zfs-discuss] ARC de-allocation with large ram

2012-10-22 Thread Robert Milkowski
Hi,

If after it decreases in size it stays there it might be similar to:

7111576 arc shrinks in the absence of memory pressure

Also, see document:

ZFS ARC can shrink down without memory pressure result in slow
performance [ID 1404581.1]

Specifically, check if arc_no_grow is set to 1 after the cache size is
decreased, and if it stays that way.
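
For example, something along these lines should show it on illumos/Solaris,
assuming you have kernel mdb access (a value of 1 means the ARC will not grow):

  # echo arc_no_grow/D | mdb -k      (prints the variable as a decimal)
  # echo ::arc | mdb -k              (dumps the overall ARC state for comparison)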

The fix is in one of the SRUs and I think it should be in 11.1
I don't know if it was fixed in Illumos or even if Illumos was affected by
this at all.


-- 
Robert Milkowski
http://milek.blogspot.com


 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Chris Nagele
 Sent: 20 October 2012 18:47
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] ARC de-allocation with large ram
 
 Hi. We're running OmniOS as a ZFS storage server. For some reason, our
 arc cache will grow to a certain point, then suddenly drops. I used
 arcstat to catch it in action, but I was not able to capture what else
 was going on in the system at the time. I'll do that next.
 
 read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
   166   166     0   100       0       0       0       0    85G    225G
  5.9K  5.9K     0   100       0       0       0       0    85G    225G
   755   715    40    94      40       0      40       0    84G    225G
   17K   17K     0   100       0       0       0       0    67G    225G
   409   395    14    96      14       0      14       0    49G    225G
   388   364    24    93      24       0      24       0    41G    225G
   37K   37K    20    99      20       6      14      30    40G    225G
 
 For reference, it's a 12TB pool with 512GB SSD L2 ARC and 198GB RAM.
 We have nothing else running on the system except NFS. We are also not
 using dedupe. Here is the output of memstat at one point:
 
 # echo ::memstat | mdb -k
  Page Summary                Pages        MB   %Tot
  ------------            ---------   -------   ----
  Kernel                   19061902     74460    38%
  ZFS File Data            28237282    110301    56%
  Anon                        43112       168     0%
  Exec and libs                1522         5     0%
  Page cache                  13509        52     0%
  Free (cachelist)             6366        24     0%
  Free (freelist)           2958527     11556     6%
  
  Total                    50322220    196571
  Physical                 50322219    196571
 
 According to prstat -s rss nothing else is consuming the memory.
 
   592 root       33M   26M sleep   59    0   0:00:33 0.0% fmd/27
    12 root       13M   11M sleep   59    0   0:00:08 0.0% svc.configd/21
   641 root       12M   11M sleep   59    0   0:04:48 0.0% snmpd/1
    10 root       14M   10M sleep   59    0   0:00:03 0.0% svc.startd/16
   342 root       12M 9084K sleep   59    0   0:00:15 0.0% hald/5
   321 root       14M 8652K sleep   59    0   0:03:00 0.0% nscd/52
 
 So far I can't figure out what could be causing this. The only other
 thing I can think of is that we have a bunch of zfs send/receive
 operations going on as backups across 10 datasets in the pool. I  am
 not sure how snapshots and send/receive affect the arc. Does anyone
 else have any ideas?
 
 Thanks,
 Chris
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ARC de-allocation with large ram

2012-10-22 Thread Tomas Forsman
On 22 October, 2012 - Robert Milkowski sent me these 3,6K bytes:

 Hi,
 
 If after it decreases in size it stays there it might be similar to:
 
   7111576 arc shrinks in the absence of memory pressure
 
 Also, see document:
 
   ZFS ARC can shrink down without memory pressure result in slow
 performance [ID 1404581.1]
 
 Specifically, check if arc_no_grow is set to 1 after the cache size is
 decreased, and if it stays that way.
 
 The fix is in one of the SRUs and I think it should be in 11.1
 I don't know if it was fixed in Illumos or even if Illumos was affected by
 this at all.

The code that affects bug 7111576 was introduced between s10 and s11.

/Tomas
-- 
Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when you rm zpool.cache?

2012-10-22 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Gary Mills
 
 On Sun, Oct 21, 2012 at 11:40:31AM +0200, Bogdan Ćulibrk wrote:
 Follow up question regarding this: is there any way to disable
 automatic import of any non-rpool on boot without any hacks of
 removing zpool.cache?
 
 Certainly.  Import it with an alternate cache file.  You do this by
 specifying the `cachefile' property on the command line.  The `zpool'
 man page describes how to do this.

You can also specify cachefile=none
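
For example (the pool name is just a placeholder):

  # zpool import -o cachefile=none tank     (import without recording it anywhere)
  # zpool set cachefile=none tank           (or clear it on an already-imported pool)

Either way the pool never lands in /etc/zfs/zpool.cache, so it shouldn't be
auto-imported on the next boot.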
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when you rm zpool.cache?

2012-10-22 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 If you rm /etc/zfs/zpool.cache and reboot...  The system is smart enough (at
 least in my case) to re-import rpool, and another pool, but it didn't figure
 out to re-import some other pool.
 
 How does the system decide, in the absence of zpool.cache, which pools it's
 going to import at boot?

So far in this thread I haven't gotten an answer that I expect or believe,
because the behavior I observed was:

I did a zfs send from one system to another, received onto 
/localpool/backups.  Side note, the receiving system has three pools: rpool, 
localpool, and iscsipool.  Unfortunately, I sent the zfs properties with it, 
including the mountpoint.  Naturally, there was already something mounted on / 
and /exports and /exports/home, so the zfs receive failed to mount on the 
receiving system, but I didn't notice that.  Later, I rebooted.
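
(In hindsight, receiving with -u so nothing gets mounted, or sending without
-p/-R so the mountpoint property doesn't come along, would probably have
avoided the clash.  Something like the following, with made-up names:

  zfs send somepool/export/home@snap | ssh recvhost zfs receive -u localpool/backups/home

would have landed the data without trying to mount it.)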

During reboot, of course, rpool mounted correctly on /, but then the system 
found the localpool/backups filesystems, and mounted /exports, /exports/home 
and so forth.  So when it tried to mount rpool/exports, it failed.  Then, 
iscsipool was unavailable, so the system failed to bootup completely.  I was 
able to login to console as myself, but I had no home directory, so I su'd to 
root.

I tried to change the mountpoints of localpool/backups/exports and so forth - 
but it failed.  Filesystem is in use, or filesystem busy or something like 
that.  (Because I logged in, obviously.)  I tried to export localpool, and 
again failed.  So I wanted some way to prevent localpool from importing or 
mounting next time, although I can't make it unmount or change mountpoints this 
time.

rm /etc/zfs/zpool.cache ; init 6

This time, the system came up, and iscsipool was not imported (as expected.)  
But I was surprised - localpool was imported.

Fortunately, this time the system mounted filesystems in the right order - 
rpool/exports was mounted under /exports, and I was able to login as myself, 
and export/import / change mountpoints of the localpool filesystems.  One more 
reboot just to be sure, and voila, no problem.

Point in question is - After I removed the zpool.cache file, I expected rpool 
to be the only pool imported upon reboot.  That's not what I observed, and I 
was wondering how the system knew to import localpool?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when you rm zpool.cache?

2012-10-22 Thread Jim Klimov

Are you sure that the system with failed mounts came up NOT in a
read-only root moment, and that your removal of /etc/zfs/zpool.cache
did in fact happen (and that you did not then boot into an earlier
BE with the file still in it)?

On a side note, repairs of ZFS mount order are best done with a
single-user mode boot (-s as a kernel parameter in GRUB), which
among other things spawns very few programs and keeps your FSes
from being busy. Also, you (should) get to log in as root directly even
if it is normally a role account and not a user on your box.
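
For example, with a typical illumos-style GRUB menu.lst entry (the exact lines
vary by distro, this is only to show where the flag goes), you would append -s
to the kernel$ line, either in the file or by pressing 'e' at the GRUB menu:

  kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -s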

On 2012-10-22 15:06, Edward Ned Harvey
(opensolarisisdeadlongliveopensolaris) wrote:

Point in question is - After I removed the zpool.cache file, I expected rpool
to be the only pool imported upon reboot.  That's not what I observed, and I
was wondering how the system knew to import localpool?



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zfs] portable zfs send streams (preview webrev)

2012-10-22 Thread Joerg Schilling
Alexander Block abloc...@googlemail.com wrote:

 tar/pax was the initial format that was chosen for btrfs send/receive
 as it looked like the best and most compatible way. In the middle of
  development, however, I realized that we need more than storing whole
 and incremental files/dirs in the format. We needed to store
 information about moved, renamed, deleted, reflinked and even partial
 clones where only some bits of a file are shared with another. This
 can for sure all be implemented in pax, but then the next problem is
 that in some situations renamed/moved files need multiple entries to
 get to the desired result. For example, file a may be renamed to b
 while at the same time file b got renamed to a. In such cases we need
  3 entries that use a temporary name so that we don't lose one of the
 files while receiving. There are much more complex examples where it
 gets quite complicated.

The problem of complex renames was solved in star with the incremental
backup/restore concept 8 years ago already. Renames are done based on inode
numbers.

 Also, it needed support for metadata (mode, size, uid/gid, ...)
 changes on already existing files/dirs. Reusing already existent
 tar/pax entry types was not possible for this as standard tar would
 overwrite the original files with empty files.

This is not true. Star has implemented this for more than 8 years.


 I had all that implemented with pax, using a lot of custom pax
 entries. A lot...so many that it didn't look like tar/pax anymore. It
 actually mutated from a list of file/dir/link entries (which tar/pax
 is meant to be) to a list of filesystem instructions (rename, link,
 unlink, rmdir, write parts of a file, clone parts of a file, chmod,
 ...).

If you end up with something that does not look like an enhanced tar archive,
you probably did not follow the rules.

 My thought was, that this was already a big misuse of tar/pax, so I
 decided to implement a simple format for this purpose only. Using pax
 gave no advantages anymore. In tar/pax every entry must have a file
 name, even the pax header entries need a file name. The problem now
 is, that plain tar will treat every unknown entry type as regular file
 and blindly overwrite existing ones which may result in data loss. To
 prevent this, I always added something to the file name so that
 unpacking with tar would not hurt the user. The unavoidable side
 effect however is that the result of a plain untar is unusable without
 further interpretation, which will be hard because tar by default does
 not dump pax headers but instead ignores unknown entries.

The problem of possible overwrites by too-dumb archivers was solved in star
long ago. Star lets old software believe that there is an EOF when a
metadata-only entry is found.

 Also, using tar/pax as the format for send/receive may give a user the
 wrong impression that he can later use his good old standard tar to
 restore his backups...this could be fatal for him.

This works fine for the incremental backup/restore system used by star.

Enhancing existing data structures without breaking the philosophy is not 
trivial and may take some time, but it is usually better than reinventing the 
wheel.

Star has now existed for more than 30 years (since August 1982). Star is where
other tar implementations take their ideas from ;-)

Jörg

-- 
 EMail: jo...@schily.isdn.cs.tu-berlin.de   (home)  Jörg Schilling  D-13353 Berlin
        j...@cs.tu-berlin.de                (uni)
        joerg.schill...@fokus.fraunhofer.de (work)  Blog: http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/private/  ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ARC de-allocation with large ram

2012-10-22 Thread Chris Nagele
 If after it decreases in size it stays there it might be similar to:

 7111576 arc shrinks in the absence of memory pressure

After it dropped, it did build back up. Today is the first day that
these servers are working under real production load and it is looking
much better. arcstat is showing some nice numbers for arc, but l2 is
still building.

read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
 19K   17K  2.5K    87    2.5K     490    2.0K      19   148G    371G
 41K   39K  2.3K    94    2.3K     184    2.1K       7   148G    371G
 34K   34K   694    98     694      17     677       2   148G    371G
 16K   15K  1.0K    93    1.0K      16    1.0K       1   148G    371G
 39K   36K  2.3K    94    2.3K      20    2.3K       0   148G    371G
 23K   22K   746    96     746      76     670      10   148G    371G
 49K   47K  1.7K    96    1.7K     249    1.5K      14   148G    371G
 23K   21K  1.4K    93    1.4K      38    1.4K       2   148G    371G

My only guess is that the large zfs send / recv streams were affecting
the cache when they started and finished.

Thanks for the responses and help.

Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What is L2ARC write pattern?

2012-10-22 Thread Jim Klimov

Hello all,

  A few months ago I saw a statement that L2ARC writes are simplistic
in nature, and I got the (mis?)understanding that some sort of ring
buffer may be in use, like for the ZIL. Is this true, and is the only
write-performance metric that matters for an L2ARC SSD device the sequential
write bandwidth (and IOPS)? In particular, there are some SD/MMC/CF
cards for professional photography and such, that can do pretty good
bursts at writes, and may be okay (better than HDD) at random reads
of the cached blocks.

  Are such cards fit for L2ARC uses?

  One idea I have is that a laptop which only has a single HDD slot,
often has SD/MMC cardreader slots. If populated with a card for L2ARC,
can it be expected to boost the laptop's ZFS performance?
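
Mechanically it would just be the usual cache-vdev handling, something like
this (the device name is purely illustrative for a cardreader slot):

  # zpool add rpool cache c7t0d0
  # zpool remove rpool c7t0d0        (cache devices can be dropped again at any time)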

  Also, what's the worst that is expected to happen if the card pops
out of its cardreader slot while it is used as L2ARC - should the
ZFS filesystem panic or fall back to HDD data seamlessly? Or does it,
as always, depend on firmware and drivers involved in the stack?

Thanks,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What is L2ARC write pattern?

2012-10-22 Thread Jim Klimov

2012-10-22 20:58, Brian wrote:

hi jim,

writes are sequential and to a ring buffer.  reads of course would not
be sequential, and would be intermixed with writes.


Thanks... Do I get it correctly that if a block from L2ARC is
requested by the readers, then it is fetched from the SSD and
becomes a normal block in the ARC - thus the L2ARC entry won't
be used again for this same block when it expires again from
RAM ARC? Likewise, if the ring buffer is filled up and the
block in L2ARC becomes overwritten, a possible read via the
pointer from ARC to L2ARC fails (checksum mismatch), the pointer
gets invalidated and pool disks get used to read the data from
original source - right? If that all is the case, is there any
sort of proactive expiration of values from L2ARC, or it is
just not needed (useful data will travel from L2ARC to ARC and
back again, and useless data will expire by being overwritten
as the ring buffer completes its full cycle)?
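
(For watching this in practice, the arcstats kstat should expose the relevant
L2ARC counters -- assuming the usual illumos names, something like

  # kstat -p zfs:0:arcstats | egrep 'l2_(hits|misses|size|evict)'

would show whether reads are served from the device and how much gets evicted
or overwritten.)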



the digital photography cards probably aren't the fastest flash around, and
may not be designed for a high number of (over)write cycles, but anything
that can be configured as a vdev ought to work as l2arc.


Well, I do photograph a lot and the consumer-grade cards begin
to show deterioration in a couple of years of use. AFAIK the
professional expensive ones also employ redundancy chips and
thus improve speed and reliability - being more of an SSD in
a non-SATA package ;)



can't say how the system will behave if the flash is yanked during i/o.
all the modules involved should handle that condition cleanly but it
doesn't mean the use case will work.


:)

Thanks,
//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ARC de-allocation with large ram

2012-10-22 Thread Richard Elling
On Oct 22, 2012, at 6:52 AM, Chris Nagele nag...@wildbit.com wrote:

 If after it decreases in size it stays there it might be similar to:
 
7111576 arc shrinks in the absence of memory pressure
 
 After it dropped, it did build back up. Today is the first day that
 these servers are working under real production load and it is looking
 much better. arcstat is showing some nice numbers for arc, but l2 is
 still building.
 
 read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
  19K   17K  2.5K    87    2.5K     490    2.0K      19   148G    371G
  41K   39K  2.3K    94    2.3K     184    2.1K       7   148G    371G
  34K   34K   694    98     694      17     677       2   148G    371G
  16K   15K  1.0K    93    1.0K      16    1.0K       1   148G    371G
  39K   36K  2.3K    94    2.3K      20    2.3K       0   148G    371G
  23K   22K   746    96     746      76     670      10   148G    371G
  49K   47K  1.7K    96    1.7K     249    1.5K      14   148G    371G
  23K   21K  1.4K    93    1.4K      38    1.4K       2   148G    371G
 
 My only guess is that the large zfs send / recv streams were affecting
 the cache when they started and finished.

There are other cases where data is evicted from the ARC, though I don't
have a complete list at my fingertips. For example, if a zvol is closed, then
the data for the zvol is evicted.
 -- richard

 
 Thanks for the responses and help.
 
 Chris
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--

richard.ell...@richardelling.com
+1-760-896-4422









___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send to older version

2012-10-22 Thread Richard Elling
On Oct 19, 2012, at 4:59 PM, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Richard Elling
 
 At some point, people will bitterly regret some zpool upgrade with no way
 back.
 
 uhm... and how is that different than anything else in the software world?
 
 No attempt at backward compatibility, and no downgrade path, not even by 
 going back to an older snapshot before the upgrade.

ZFS has a stellar record of backwards compatibility. The only break with
backwards compatibility I can recall was a bug fix in the send stream somewhere
around opensolaris b34.

Perhaps you are confusing backwards compatibility with forwards compatibility?
 -- richard

--

richard.ell...@richardelling.com
+1-760-896-4422









___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when you rm zpool.cache?

2012-10-22 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
 From: Jim Klimov [mailto:jimkli...@cos.ru]
 Sent: Monday, October 22, 2012 7:26 AM
 
 Are you sure that the system with failed mounts came up NOT in a
 read-only root moment, and that your removal of /etc/zfs/zpool.cache
 did in fact happen (and that you did not then boot into an earlier
 BE with the file still in it)?

I'm going to take your confusion and disbelief in support of my confusion and 
disbelief.  So it's not that I didn't understand what to expect ... it's that I 
somehow made a mistake, but I don't know what (and I don't care enough to try 
reproducing the same circumstance.)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss