Re: [zfs-discuss] Freeing unused space in thin provisioned zvols

2013-02-12 Thread Stefan Ring
 Unless you do a shrink on the vmdk and use a ZFS variant with SCSI UNMAP
 support (I believe currently only Nexenta, but correct me if I am wrong),
 the blocks will not be freed, will they?


 Solaris 11.1 has ZFS with SCSI UNMAP support.

Freeing unused blocks works perfectly well with fstrim on a Linux
initiator consuming an iSCSI zvol served up by oi151a6.
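
For reference, the rough shape of my setup (pool, zvol and mount point
names here are made up for illustration; the iSCSI target/initiator
configuration is omitted):

# on the OpenIndiana side: a sparse (thin provisioned) zvol
$ zfs create -s -V 100G tank/vol0

# on the Linux initiator, with the LUN formatted and mounted at /mnt/lun0
$ sudo fstrim -v /mnt/lun0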


Re: [zfs-discuss] Question about ZFS snapshots

2012-09-20 Thread Stefan Ring
On Fri, Sep 21, 2012 at 6:31 AM, andy thomas a...@time-domain.co.uk wrote:
 I have a ZFS filesystem and create weekly snapshots over a period of 5 weeks
 called week01, week02, week03, week04 and week05 respectively. My question
 is: how do the snapshots relate to each other - does week03 contain the
 changes made since week02, or does it contain all the changes made since the
 first snapshot, week01, and therefore include those in week02?

Every snapshot is based on the previous one and stores only what is
needed to capture the differences.

 To roll back to week03, it's necessary to delete snapshots week04 and week05
 first, but what if week01 and week02 have also been deleted - will the
 rollback still work, or is it necessary to keep earlier snapshots?

No, it's not necessary. You can roll back to any snapshot.

In normal use, I almost never use rollback, though. If I've
accidentally deleted or overwritten something, I just rsync it over
from the corresponding .zfs/snapshot directory. Only if what I want
to restore is huge might rollback be the better option.
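
As an illustration (dataset and path names are hypothetical), restoring
an accidentally deleted directory from a weekly snapshot looks roughly
like this:

# copy the directory back out of the read-only snapshot view
$ rsync -a /tank/home/.zfs/snapshot/week03/projects/ /tank/home/projects/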


Re: [zfs-discuss] ZFS ok for single disk dev box?

2012-08-30 Thread Stefan Ring
 I asked what I thought was a simple question but most of the answers don't
 have too much to do with the question.

Hehe, welcome to mailing lists ;).

 What I'd really like is an option (maybe it exists) in ZFS to say: when a
 block fails a checksum, tell me which file it affects.

It does exactly that.
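
Run zpool status -v on the pool; if any blocks have unrecoverable
errors, it lists the affected file names at the end of the output
(pool name below is just an example):

$ zpool status -v tank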

 I have read reports on this list that show ZFS does panic the system by
 default in some cases. It may not have been for checksum failures, I have no
 idea why it did, but enough people wrote about crashed boxes to make me ask
 the question I asked.

I've never heard of or experienced anything like that.


Re: [zfs-discuss] ZFS snapshot used space question

2012-08-29 Thread Stefan Ring
On Wed, Aug 29, 2012 at 8:58 PM, Timothy Coalson tsc...@mst.edu wrote:
 As I understand it, the used space of a snapshot does not include anything
 that is in more than one snapshot.

True. It shows the amount that would be freed if you destroyed the
snapshot right away. Data held onto by more than one snapshot cannot
be removed when you destroy just one of them, obviously. The act of
destroying a snapshot will likely change the USED value of the
neighbouring snapshots though.
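
A quick way to see the difference (dataset name is made up): compare the
per-snapshot USED values with the total space charged to snapshots:

$ zfs list -r -t snapshot -o name,used,refer tank/home
$ zfs get usedbysnapshots tank/home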


Re: [zfs-discuss] Missing Disk Space

2012-08-06 Thread Stefan Ring
Have you not seen my answer?

http://mail.opensolaris.org/pipermail/zfs-discuss/2012-August/052170.html


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Stefan Ring
 Unfortunately, the Intel 520 does *not* power protect its
 on-board volatile cache (unlike the Intel 320/710 SSD).

 Intel has an eye-opening technology brief, describing the
 benefits of power-loss data protection at:

 http://www.intel.com/content/www/us/en/solid-state-drives/ssd-320-series-power-loss-data-protection-brief.html

 Intel's brief also clears up a prior controversy about what types of
 data are actually cached: per the brief, it's both user and system
 data!

So you're saying that SSDs don't generally flush data to stable storage
when instructed to? So data written before an fsync is not guaranteed
to survive a power-down?

If that -- ignoring cache flush requests -- is the whole reason why
SSDs are so fast, I'm glad I haven't got one yet.
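
For what it's worth, on Linux you can at least check whether a drive's
volatile write cache is enabled, and turn it off if you don't trust the
drive to honour flushes (device name is just an example):

# query the write-cache setting
$ sudo hdparm -W /dev/sda

# disable it, at a considerable cost in write performance
$ sudo hdparm -W 0 /dev/sda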


Re: [zfs-discuss] Missing disk space

2012-08-04 Thread Stefan Ring
On Sat, Aug 4, 2012 at 12:00 AM, Burt Hailey bhai...@triunesystems.com wrote:
 We do hourly snapshots.   Two days ago I deleted 100GB of
 data and did not see a corresponding increase in snapshot sizes.  I’m new to
 zfs and am reading the zfs admin handbook but I wanted to post this to get
 some suggestions on what to look at.

Use

$ zfs get usedbysnapshots file_system

and you will see where the space went. Listing the snapshots and
looking at the USED column does not give you this information, because
it only shows what would be freed if _only_ that one snapshot were
destroyed. By destroying all of them, a lot more might become
available.
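
A made-up example of what that looks like:

$ zfs get usedbysnapshots tank/data
NAME       PROPERTY         VALUE  SOURCE
tank/data  usedbysnapshots  104G   -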


Re: [zfs-discuss] zfs sata mirror slower than single disk

2012-07-16 Thread Stefan Ring
 2) in the mirror case the write speed is cut by half, and the read
 speed is the same as a single disk. I'd expect about twice the
 performance for both reading and writing, maybe a bit less, but
 definitely more than measured.

I wouldn't expect a mirrored read to be faster than a single-disk read,
because each individual disk would need to read small chunks of data
with holes in-between. Whether the holes are read or not, the disk
spins at the same speed.


Re: [zfs-discuss] zfs sata mirror slower than single disk

2012-07-16 Thread Stefan Ring
 It is normal for reads from mirrors to be faster than for a single disk
 because reads can be scheduled from either disk, with different I/Os being
 handled in parallel.

That assumes that there *are* outstanding requests to be scheduled in
parallel, which would only happen with multiple readers or a large
read-ahead buffer.
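
One way to see this effect (file names are hypothetical, and the ARC
should be cold for the numbers to mean anything): compare one sequential
reader against two running concurrently. On a mirror, the second case
should scale noticeably better than on a single disk.

# single sequential reader
$ dd if=/tank/big1 of=/dev/null bs=1024k

# two concurrent readers
$ dd if=/tank/big1 of=/dev/null bs=1024k &
$ dd if=/tank/big2 of=/dev/null bs=1024k &
$ wait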


Re: [zfs-discuss] Interaction between ZFS intent log and mmap'd files

2012-07-05 Thread Stefan Ring
 Actually, a write to memory for a memory mapped file is more similar to
 write(2).  If two programs have the same file mapped then the effect on the
 memory they share is instantaneous because it is the same physical memory.
 A mmapped file becomes shared memory as soon as it is mapped at least twice.

True, for some interpretation of "instantaneous". It does not
establish a happens-before relationship though, the way a
store-munmap/mmap-load sequence does.


Re: [zfs-discuss] Interaction between ZFS intent log and mmap'd files

2012-07-04 Thread Stefan Ring
 It really makes no sense at all to
 have munmap(2) not imply msync(3C).

Why not? munmap(2) does basically the equivalent of write(2). In the
case of write, that is: a later read from the same location will see
the written data, unless another write happens in-between. If power
goes down following the write, all bets are off. And translated to
munmap: a subsequent call to mmap(2) that makes the previously
munmap-ped region available will make visible everything stored to the
region prior to the munmap call. If power goes down following the
munmap, all bets are off. In both cases, if you want your data to
persist across power losses, use sync -- fsync or msync.

If only the syncing variants were available, disk accesses would be
significantly slower, and disks would thrash rather audibly all the
time.


Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-16 Thread Stefan Ring
 when you say remove the device, I assume you mean simply make it unavailable
 for import (I can't remove it from the vdev).

Yes, that's what I meant.

 root@openindiana-01:/mnt# zpool import -d /dev/lofi
  pool: ZP-8T-RZ1-01
    id: 9952605666247778346
  state: FAULTED
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-3C
 config:

        ZP-8T-RZ1-01              FAULTED  corrupted data
          raidz1-0                DEGRADED
            12339070507640025002  UNAVAIL  cannot open
            /dev/lofi/5           ONLINE
            /dev/lofi/4           ONLINE
            /dev/lofi/3           ONLINE
            /dev/lofi/1           ONLINE

 It's interesting that even though 4 of the 5 disks are available, it still
 can't import it as DEGRADED.

I agree that it's interesting. Now someone really knowledgeable will
need to have a look at this. I can only imagine that somehow the
devices contain data from different points in time, and that they are
too far apart for the aggressive txg rollback that was added in PSARC
2009/479. Btw, did you try that? Try:

$ zpool import -d /dev/lofi -FVX ZP-8T-RZ1-01
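
To see whether the rewind would succeed before actually attempting it,
there is also a dry-run variant:

# -n combined with -F only reports whether recovery is possible,
# without modifying the pool
$ zpool import -d /dev/lofi -Fn ZP-8T-RZ1-01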


Re: [zfs-discuss] snapshot size

2012-06-05 Thread Stefan Ring
 Two questions from a newbie.

        1/ What does REFER mean in zfs list?

The amount of data that is reachable from the file system root. It's
just what I would call the contents of the file system.

        2/ How can I know the total size of all snapshots for a partition?
        (OK, I can add up zfs list -t snapshot)

$ zfs get usedbysnapshots zfs-name

Or if you have a recent enough system, have a look at the written
property: http://blog.delphix.com/matt/files/2011/11/oss.pdf (pg 8).
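
Roughly, that looks like this (dataset and snapshot names are made up):

# total space charged to all snapshots of the dataset
$ zfs get usedbysnapshots tank/home

# space written since a particular snapshot (recent zfs versions only)
$ zfs get written@week03 tank/home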


Re: [zfs-discuss] snapshot size

2012-06-05 Thread Stefan Ring
 Can I say

        USED - REFER = snapshot size?

No. USED is the space that would be freed if you destroyed the
snapshot _right now_. This can change (and usually does) if you
destroy previous snapshots.
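
A made-up illustration of that effect (names and numbers are invented):

$ zfs list -t snapshot -o name,used,refer -r tank/home
NAME              USED  REFER
tank/home@week02  1.2M   150G
tank/home@week03  3.4M   148G

Say 10G of since-deleted data is referenced by week02 and week03, and by
nothing else. It shows up in neither snapshot's USED, because destroying
either one alone frees nothing. Destroy week02, though, and week03's
USED jumps to roughly 10G.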


Re: [zfs-discuss] ZFS on Linux vs FreeBSD

2012-04-25 Thread Stefan Ring
 I saw one team revert from ZoL (CentOS 6) back to ext on some backup servers
 for an application project; the killer was stat times (find running slow,
 etc.). Perhaps more layer-2 cache could have solved the problem, but it was
 easier to deploy ext/lvm2.

But stat times (think directory traversal) are horrible on ZFS/Solaris
as well, at least on a workstation-class machine that doesn't run
24/7. Maybe on an always-on server with 256GB RAM or more, things
would be different. For me, that's really the only pain point of using
ZFS.

Sorry for not being able to contribute any ZoL experience. I've been
pondering for a few months now whether it's worth trying myself. Last
time I checked, it didn't support the .zfs directory (for snapshot
access), which you really don't want to miss once you've gotten used
to it.


[zfs-discuss] What is your data error rate?

2012-01-24 Thread Stefan Ring
After having read this mailing list for a little while, I get the
impression that there are at least some people who regularly
experience on-disk corruption that ZFS should be able to report and
handle. I’ve been running a raidz1 on three 1TB consumer disks for
approx. 2 years now (about 90% full), and I scrub the pool every 3-4
weeks and have never had a single error. Given the oft-quoted 1-in-10^14
bit error rate that consumer disks are rated at, I should have seen an
error by now -- the scrubbing process is not the only activity on the
disks, after all, and the read volume from that alone clocks in at
almost exactly 10^14 bits by now.
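
For the record, a rough conversion (the ~0.9 TB of allocated data per
disk is my own estimate):

# 10^14 bits is 10^14 / 8 bytes, i.e. about 12.5 TB
$ echo '10^14 / 8 / 10^12' | bc -l

# at ~0.9 TB of allocated data per disk, that corresponds to roughly
# 14 scrubs' worth of reads per disk
$ echo '12.5 / 0.9' | bc -l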

Not that I’m worried, of course, but it comes as a slight surprise to
me. Or does the 10^14 rating just reflect the strength of the on-disk
ECC algorithm?


Re: [zfs-discuss] Data loss by memory corruption?

2012-01-17 Thread Stefan Ring
 The issue is definitely not specific to ZFS.  For example, the whole OS
 depends on reliable memory content in order to function.  Likewise, no one
 likes it if characters mysteriously change in their word processing
 documents.

I don’t care too much if a single document gets corrupted – there’ll
always be a good copy in a snapshot. I do care, however, if a whole
directory branch or old snapshots were to disappear.

 Most of the blame seems to focus on Intel, with its objective to spew CPUs
 with the highest-clocking performance at the lowest possible price point for
 the desktop market.  AMD CPUs seem to usually be slower but include ECC as
 standard in the CPU or AMD-supplied chipset.

Agreed. I originally bought an AMD-based system for that reason alone,
with the intention of running OpenSolaris on it. Alas, it performed
abysmally, so it was quickly swapped for an Intel-based one (without
ECC).

Additionally, consider that Joyent’s port of KVM supports only Intel
systems, AFAIK.


[zfs-discuss] Data loss by memory corruption?

2012-01-14 Thread Stefan Ring
Inspired by the paper "End-to-end Data Integrity for File Systems: A
ZFS Case Study" [1], I've been wondering whether it is possible to devise
a way in which a minimal in-memory data corruption would cause massive
data loss. I could imagine a scenario where an entire directory branch
drops off the tree structure, for example. Since I know too little
about ZFS's structure, I'm also asking myself whether it is possible to
make old snapshots disappear via memory corruption, or to lose data
blocks to leakage (not containing data, but not marked as available).

I'd appreciate it if someone with a good understanding of ZFS's
internals and principles could comment on the possibility of such
scenarios.

[1] http://www.usenix.org/event/fast10/tech/full_papers/zhang.pdf