Re: [OmniOS-discuss] ZFS Dedup

2018-05-04 Thread Chris Siebenmann
> - while you can enable it at the filesystem level, it operates at the pool > level. Once activated you cannot disable it short of a pool destroy. As far as I can tell from having looked at the code in the past, dedup is effectively inactive as soon as all data written while dedup was enabled has been
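A quick way to see whether any dedup'd data is still live in a pool is the DDT entry count that `zpool status -D` prints; once it reaches zero, dedup is effectively inactive again. A minimal parsing sketch (the sample status line and its numbers are illustrative, not from a real pool):

```shell
# Extract the DDT entry count from 'zpool status -D <pool>' output.
# When every block written while dedup was enabled has been freed or
# rewritten, this count drops to zero and dedup is effectively off.
ddt_entries() {
    awk '/DDT entries/ { gsub(",", "", $4); print $4 }'
}

# Illustrative sample of the status line (numbers made up):
sample='dedup: DDT entries 2945, size 552B on disk, 330B in core'
entries=$(printf '%s\n' "$sample" | ddt_entries)
```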

Re: [OmniOS-discuss] Upgrade from 151022 to CE bootadm error

2017-09-15 Thread Chris Siebenmann
> Yes I saw that too, but I'm not sure why 12GB is being taken up by > rpool/dump. And there's really not that much used by BEs. Is there a > reason that dumps has to be so large or should I allocate more space > to rpool? Thanks! If I remember correctly, by default rpool/dump is sized during
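For context, the dump zvol's size can be inspected and changed; the usual illumos commands are sketched in comments (not executed here), plus a small helper to turn a `zfs get`-style size such as `12G` into bytes, to see how much space the dump device actually pins down:

```shell
# Live-system commands (illustrative, not run here):
#   zfs get volsize rpool/dump            # current dump zvol size
#   zfs set volsize=4G rpool/dump         # shrink/grow the zvol
#   dumpadm -d /dev/zvol/dsk/rpool/dump   # re-point dumpadm at it

# Convert a human-readable size like 12G or 512M into bytes:
size_to_bytes() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}
dump_bytes=$(size_to_bytes 12G)
```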

Re: [OmniOS-discuss] OmniOS CE question: will there be a LTS stable release?

2017-07-26 Thread Chris Siebenmann
> Which does open the question: how many people need LTS? (And what does > LTS mean for them - in terms of how long support would be needed, and > what level of support/backports they expect.) Perhaps Chris could chip > in here, but I know that with my $DAYJOB hat on the idea of dropping >

[OmniOS-discuss] OmniOS CE question: will there be a LTS stable release?

2017-07-25 Thread Chris Siebenmann
The OmniOS CE release announcement here on the mailing list covered the broad plans for the (non-Bloody) release schedule: The intention is for new stable releases to continue to come out every 26 weeks. Interim, "weekly" updates to stable follow a fixed schedule denoted

Re: [OmniOS-discuss] nfsv3rwsnoop.d lists NFS writes to files

2017-03-21 Thread Chris Siebenmann
> Hi, > I am using the dtrace script nfsv3rwsnoop.d to find files that are > accessed from my OmniOS r151020 filer and some file names are listed as > unknown :-( > I guess they are files that have been open for a long time and have > dropped out of some data structure. > Is there any way to

Re: [OmniOS-discuss] DTrace Scripts

2017-03-09 Thread Chris Siebenmann
> I'm looking for some general DTrace scripts for debugging ZFS on OmniOS > (like an updated DTrace toolkit)... didn't want to reinvent the wheel if > some folks are willing to share. Also willing to purchase if needed. We have a collection of local DTrace scripts at:

Re: [OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-10 Thread Chris Siebenmann
For the interest of bystanders, here's the resolution to our puzzle. I wrote: > I wrote, about our mysteriously disappearing snapshots: > > [...] For example, is there some way where snapshots can be removed > > without that being logged in 'zpool history'? > > The answer to my question

Re: [OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-09 Thread Chris Siebenmann
> > On Jan 9, 2017, at 11:13 AM, Chris Siebenmann <c...@cs.toronto.edu> wrote: > > > > The answer to my question turns out to be 'yes'. If you do: > > rmdir /.zfs/snapshot/ > > > > and ZFS accepts this, there is no entry made in 'zpool history'; the

Re: [OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-09 Thread Chris Siebenmann
I wrote, about our mysteriously disappearing snapshots: > [...] For example, is there some way where snapshots can be removed > without that being logged in 'zpool history'? The answer to my question turns out to be 'yes'. If you do: rmdir /.zfs/snapshot/ and ZFS accepts this, there
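One practical consequence: auditing `zpool history` for `zfs destroy` entries cannot account for snapshots removed via rmdir in /.zfs/snapshot, because that path logs nothing at all. A small illustrative check (the sample history lines are made up):

```shell
# 'zpool history' logs 'zfs snapshot' and 'zfs destroy', but snapshots
# removed with rmdir in /.zfs/snapshot leave no entry, so a destroy
# count from the history is only a lower bound on actual removals.
history='2017-01-06.15:10:01 zfs snapshot tank/fs@Fri-15
2017-01-06.16:10:01 zfs snapshot tank/fs@Fri-16
2017-01-06.17:10:01 zfs snapshot tank/fs@Fri-17'
creates=$(printf '%s\n' "$history" | awk '/zfs snapshot/ {n++} END {print n+0}')
destroys=$(printf '%s\n' "$history" | awk '/zfs destroy/ {n++} END {print n+0}')
```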

Re: [OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-09 Thread Chris Siebenmann
> I wonder if the scrub has something to do with it? Are you in the > middle of the scrub when noticing missing snapshots? Do they reappear > after the scrub is done? The scrub seems like it's relevant. I take > it you're doing a cron-driven scrub? If so, with what frequency? We scrub

Re: [OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-06 Thread Chris Siebenmann
I wrote, quoting first 'zpool history' and then 'zfs list' output: [...] > 2017-01-06.15:10:01 zfs snapshot fs0-admin-02/h/105@Fri-15 > 2017-01-06.16:10:01 zfs snapshot fs0-admin-02/h/105@Fri-16 > 2017-01-06.16:45:55 zfs snapshot fs0-admin-02/h/105@Fri-16 >
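The giveaway in history output like the above is the same snapshot name being created twice with no destroy logged in between, which can only happen if the first copy vanished through an unlogged path. A sketch of a detector for that pattern (sample lines modeled on the quote, slightly shortened):

```shell
# Flag snapshot names that appear in more than one 'zfs snapshot'
# history entry -- a sign the earlier one was removed without a
# logged 'zfs destroy'.
history='2017-01-06.15:10:01 zfs snapshot fs0/h/105@Fri-15
2017-01-06.16:10:01 zfs snapshot fs0/h/105@Fri-16
2017-01-06.16:45:55 zfs snapshot fs0/h/105@Fri-16'
dups=$(printf '%s\n' "$history" | \
    awk '$2 == "zfs" && $3 == "snapshot" { if (seen[$4]++) n++ } END { print n+0 }')
```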

[OmniOS-discuss] A problem and puzzle with disappearing ZFS snapshots

2017-01-06 Thread Chris Siebenmann
We have an automated system for making regular (roughly hourly) snapshots of some especially important filesystems where we want fast restores. This has been running smoothly for some time and without problems. However, starting this week we have twice gone to do from-snapshot restores on one of

[OmniOS-discuss] Understanding OmniOS disk IO timeouts and options to control them

2017-01-04 Thread Chris Siebenmann
We recently had a server reboot due to the ZFS vdev_deadman/spa_deadman timeout timer activating and panicking the system. If you haven't heard of this timer before, that's not surprising; triggering it requires an IO to a vdev to take more than 1000 seconds (by default; it's controlled by the
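For reference, the timer compares outstanding vdev IO age against a millisecond tunable (named `zfs_deadman_synctime_ms` in illumos of this era; verify the name against your build). The quoted 1000-second default works out as:

```shell
# 1000 seconds, expressed in the milliseconds the tunable uses:
deadman_ms=$(( 1000 * 1000 ))

# On a live system you would inspect or adjust it with mdb -k,
# roughly as follows (tunable name per contemporary illumos --
# check your kernel before poking it):
#   echo 'zfs_deadman_synctime_ms::print' | mdb -k
#   echo 'zfs_deadman_synctime_ms/W 0t300000' | mdb -kw   # 300 s
```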

Re: [OmniOS-discuss] Increase default maximum NFS server threads?

2016-12-06 Thread Chris Siebenmann
> I got a link to this commit from the Delphix illumos repo a while back: > > https://github.com/openzfs/openzfs/pull/186/ > > I was curious if NFS-using people in the audience here would like to > see this one Just Land (TM) in illumos-omnios or not? I think that modernizing these NFS

Re: [OmniOS-discuss] Host is not being rebooted if it uses ZFS over iSCSI

2016-11-30 Thread Chris Siebenmann
> Does anybody use ZFS over iSCSI? > > There is a problem with reboots, as the iscsi-initiator service does not > take care of ZFS while shutting down. It leads to the zpool going into > UNAVAIL state and then the first sync() issued gets blocked with the following > stack: We have a significant ZFS-over-iSCSI

Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-21 Thread Chris Siebenmann
[About ZFS scrub tunables:] > Interesting read - and it surely works. If you set the tunable before > you start the scrub you can immediately see the thoughput being much > higher than with the standard setting. [...] It's perhaps worth noting here that the scrub rate shown in 'zpool status' is
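Since the displayed rate is a cumulative average over the whole scrub, the honest way to see the effect of a mid-scrub tunable change is to difference the scanned-bytes counter across an interval yourself. An arithmetic sketch (the byte counts are illustrative):

```shell
# Instantaneous scrub rate from two samples of the bytes-scanned
# counter, rather than the since-start average 'zpool status' shows.
t0_bytes=107374182400     # bytes scanned at first sample (example)
t1_bytes=112742891520     # bytes scanned 60 s later (example)
interval=60               # seconds between samples
rate_mbs=$(( (t1_bytes - t0_bytes) / interval / 1024 / 1024 ))
```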

Re: [OmniOS-discuss] Caiman issues with certain timezones

2016-04-08 Thread Chris Siebenmann
> So my quick question to you all: for r151018 (and backporting), do you > prefer a nicer-looking installer that craps out if you select Africa, > Europe, or Asia? Or do you prefer a sketchier-looking one that works > for all timezones out of the box? I'm cutting the last 017 bloody > with this,

[OmniOS-discuss] Good way to debug DTrace invalid address errors?

2016-03-23 Thread Chris Siebenmann
I have a relatively complicated chunk of dtrace code that reads kernel data structures and chases pointers through them. Some of the time it spits out 'invalid address' errors during execution, for example: dtrace: error on enabled probe ID 8 (ID 75313: fbt:nfssrv:nfs3_fhtovp:return):

Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-23 Thread Chris Siebenmann
> > The sd.conf whitelist also requires a reboot to activate if you need > > to add a new entry, as far as I know. > > > > (Nor do I know what happens if you have some 512n disks and > > some 512e disks, both correctly recognized and in different > > pools, and now you need to

Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-22 Thread Chris Siebenmann
> > This implicitly assumes that the only reason to set ashift=12 is > > if you are currently using one or more drives that require it. I > > strongly disagree with this view. Since ZFS cannot currently replace > > a 512n drive with a 512e one, I feel [...] > > *In theory* this replacement
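For readers keeping the sector sizes straight, ashift is the base-2 logarithm of the vdev's minimum write size, which is why the 512n-to-512e replacement question hinges on it:

```shell
# ashift = log2(sector size) for the vdev; a replacement disk whose
# physical sector is larger than the vdev's ashift allows is rejected.
sector_512=$(( 1 << 9 ))    # ashift=9  -> 512-byte sectors (512n)
sector_4k=$(( 1 << 12 ))    # ashift=12 -> 4096-byte sectors (4Kn, or 512e physical)
```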

Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-21 Thread Chris Siebenmann
> > Adding the ashift argument to zpool was discussed every few years > > and so far was always deemed not enterprisey enough for the Solaris > > heritage, so the setup to tweak sd driver reports and properly rely > > on that layer was pushed instead. > > The issue is that once a drive model lies,

Re: [OmniOS-discuss] Configuration sanity checking

2016-02-19 Thread Chris Siebenmann
> > > Motherboard - Supermicro X10DRi family > > > CPUs - E5-2620 v3 > > > HBA - LSI 9300-8i > > > Network - Intel i350 or 540, depending on precise motherboard variant > > > Disks (SSD) - Intel S3510 (boot), S3610 (application + data) > > > > The drives are SATA and there are quite a few SATA

Re: [OmniOS-discuss] mixing 512B and 4K disks in vdev

2015-11-19 Thread Chris Siebenmann
> I have to replace a HDD disk in a mirrored vdev which is using > ashift=9. The problem is that all new HDDs are 4K disks, so I wonder > if anyone has made some performance measurements of using a 4K disk > with ashift=9 in contrast to the obvious ashift=12? > > As I understand it you cannot

[OmniOS-discuss] A small gotcha with switching the ssh & openssh packages in OmniOS

2015-11-16 Thread Chris Siebenmann
It's great that Sun SSH can now be more or less transparently swapped for OpenSSH. As part of that both of them provide the SMF service svc:/network/ssh:default. However, there is a difference in the SMF properties that each of them uses for boot time ordering: Sun SSH uses fs-local/*, and

Re: [OmniOS-discuss] OmniOS r151016 is now out!

2015-11-09 Thread Chris Siebenmann
> Technically, I believe OmniOS "owns" /opt/omni. Feel free to define > other subdirs as datasets hosted elsewhere (out of tree of rootfs > children). My own admin-scripts go under /opt but these are okay as > part of rootfs/BE so I don't dataset them ;) OmniOS also puts stuff in eg

Re: [OmniOS-discuss] OmniOS r151016 is now out!

2015-11-08 Thread Chris Siebenmann
> My first try at an update failed to boot properly. Being a control > freak, I had made /opt its own zfs filesystem. Previously it must > have been part of the boot environment. The contents of /opt was not > as expected and so the fs.local service failed. We ran into this in our ongoing

Re: [OmniOS-discuss] Bloody // mailwrapper & mta mediator

2015-11-06 Thread Chris Siebenmann
> Sure, generally speaking. In this particular context I believe users > should ship their own if they want to deploy a mail server, but > all nodes should be able to deliver mail locally. It would also be > great if the default install lent itself to mail submission (eg. > a satellite mailer

[OmniOS-discuss] Installing non-current kernels on OmniOS r151014?

2015-10-13 Thread Chris Siebenmann
We have a situation where we would like to be able to install new r151014 machines with something other than the current r151014 kernel. (In the extreme case we'd like to be able to specify the exact package version for all packages, but kernels are the most important for us.) I *think* that

Re: [OmniOS-discuss] big zfs storage?

2015-10-07 Thread Chris Siebenmann
> I completely concur with Richard on this. Let me give a real example > that emphasizes this point, as it's a critical design decision. [...] > Now I only run one hot spare per pool. Most of my pools are raidz2 or > raidz3. This way any event like this cannot take out more than one > disk and

Re: [OmniOS-discuss] Clues for tracking down why kernel memory isn't being released?

2015-07-16 Thread Chris Siebenmann
I wrote: We have one ZFS-based NFS fileserver that persistently runs at a very high level of non-ARC kernel memory usage that never seems to shrink. On a 128 GB machine, mdb's ::memstat reports 95% memory usage by just 'Kernel' while the ZFS ARC is only at about 21 GB (as reported by 'kstat

Re: [OmniOS-discuss] Clues for tracking down why kernel memory isn't being released?

2015-07-16 Thread Chris Siebenmann
It turns out that the explanation for this is relatively simple, as is the work around. Put simply: the OmniOS kernel does not actually free up these deallocated cache objects until the system is put under relatively strong memory pressure. Crucially, *the ZFS ARC does not create this

[OmniOS-discuss] Clues for tracking down why kernel memory isn't being released?

2015-07-14 Thread Chris Siebenmann
We have one ZFS-based NFS fileserver that persistently runs at a very high level of non-ARC kernel memory usage that never seems to shrink. On a 128 GB machine, mdb's ::memstat reports 95% memory usage by just 'Kernel' while the ZFS ARC is only at about 21 GB (as reported by 'kstat -m') although
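To make the symptom concrete, the check involved is: pull the 'Kernel' row out of mdb's `::memstat` summary and compare it with what the ARC kstats account for. A parsing sketch over sample `::memstat`-style output (rows and numbers illustrative, table formatting simplified):

```shell
# Extract the Kernel percentage from '::memstat'-shaped output.
# On a live system this would come from: echo ::memstat | mdb -k
memstat='Kernel 31854592 124432 95%
ZFS File Data 524288 2048 2%
Anon 262144 1024 1%
Free (freelist) 524288 2048 2%'
kernel_pct=$(printf '%s\n' "$memstat" | \
    awk '$1 == "Kernel" { sub("%", "", $NF); print $NF }')
```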

[OmniOS-discuss] Schedule for a new kernel update on OmniOS r151014?

2015-06-29 Thread Chris Siebenmann
According to the OmniOS source repository's changelogs, there's a fix in the r151014 branch that we're quite interested in: 3783 Flow control is needed in rpcmod when the NFS server is unable to keep up with the network As far as I know this fix is not in the initial release

Re: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads

2015-05-05 Thread Chris Siebenmann
On May 4, 2015, at 9:03 PM, Dan McDonald dan...@omniti.com wrote: I swear I've seen someone try to address this before. Maybe it's from my Nexenta days. I will be querying the illumos developer's list (as I suspect this affects the other distros as well if they haven't fixed it in

Re: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads

2015-05-05 Thread Chris Siebenmann
The hard part will be testing this. I'm not sure I have the HW in-house to do it. I may need illumos community help. Since we have a test environment where we can reproduce this and a high interest in seeing it fixed, we can test new kernel packages and so on. (If given specific

[OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads

2015-05-04 Thread Chris Siebenmann
We now have a reproducible setup with OmniOS r151014 where an OmniOS NFS fileserver will experience memory exhaustion and then hang in the kernel if it receives sustained NFS write traffic from multiple clients at a rate faster than its local disks can sustain. The machine will run okay for a

Re: [OmniOS-discuss] Clues for tracking down a drastic ZFS fs space difference?

2015-04-29 Thread Chris Siebenmann
On Apr 29, 2015, at 3:21 PM, Chris Siebenmann c...@cs.toronto.edu wrote: We have a filesystem/dataset with no snapshots, You're sure about no snapshots? zfs list -t snapshot has surprised me once or twice in the past. :-/ Completely sure. 'zfs list -t snapshot' has nothing and all

[OmniOS-discuss] Clues for tracking down a drastic ZFS fs space difference?

2015-04-29 Thread Chris Siebenmann
We have a filesystem/dataset with no snapshots, no subordinate filesystems, nothing complicated (and no compression), that has a drastic difference in space used between what df/zfs list/etc report at the ZFS level and what du reports at the filesystem level. ZFS says NAME PROPERTY
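A concrete way to frame the discrepancy: compare the dataset's byte-accurate accounting with a file-level walk. The live commands are sketched in comments; the arithmetic below uses illustrative numbers of the same shape as the report:

```shell
# Live-system commands (illustrative, not run here):
#   zfs get -Hp -o value used,referenced pool/fs   # bytes, per ZFS
#   du -sk /pool/fs                                # KB, per the files
# Example numbers in the same ballpark as the discrepancy described:
zfs_used=36507222016     # ~34 GB per 'zfs list' (illustrative)
du_kb=12582912           # ~12 GB per 'du -sk' (illustrative)
gap_gb=$(( (zfs_used - du_kb * 1024) / 1024 / 1024 / 1024 ))
```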

Re: [OmniOS-discuss] Clues for tracking down a drastic ZFS fs space difference?

2015-04-29 Thread Chris Siebenmann
From: Chris Siebenmann [...] ZFS level; as you can see, it's added 22 GB of ZFS usage in less than a month while losing 5 GB at the user level. There was a thread late last year on the developer list that described similar symptoms: http://lists.open-zfs.org/pipermail/developer/2014

[OmniOS-discuss] What should we look at in a memory exhaustion situation?

2015-04-28 Thread Chris Siebenmann
We now have a reproducible situation where high NFS load can cause our specific fileserver configuration to lock up with what looks like a memory exhaustion deadlock. So far, attempts to get a crash dump haven't worked (although they may someday). Without a crash dump, what system stats and so on

Re: [OmniOS-discuss] r151014 feedback

2015-04-23 Thread Chris Siebenmann
Oddly, removing the SFP modules makes no difference. Replacing the modules - still doesn't show up. I really do expect to see the interfaces in dladm show-phys now... also add_drv still fails to attach, which is odd. I feel like this has something to do with fault management not clearing

[OmniOS-discuss] What do people use for basic system monitoring?

2015-04-21 Thread Chris Siebenmann
Out of curiosity: I suspect that plenty of people are gathering basic system activity stats for their OmniOS systems and pushing them into modern metrics systems such as graphite (to pick perhaps the most well known package for this). For those that are doing this, what is your preferred

Re: [OmniOS-discuss] Internal pkg error during a test r151010 to r151014 upgrade

2015-04-07 Thread Chris Siebenmann
History lesson: until people could afford to purchase more than one disk and before Sun invented the diskless workstation (with shared /usr), everything was under /. As Richard knows but other people may not, this is ahistorical on Unix. From almost the beginning[*] Unix had a split between

Re: [OmniOS-discuss] Internal pkg error during a test r151010 to r151014 upgrade

2015-04-07 Thread Chris Siebenmann
Short story is that /opt is part of a namespace managed by the Solaris packaging and as such is part of a BE fs tree. If you have privately managed packages under certain subdirs, turn those sub-dirs into separate datasets instead. If this is the case for OmniOS, I believe that it should be

Re: [OmniOS-discuss] Internal pkg error during a test r151010 to r151014 upgrade

2015-04-07 Thread Chris Siebenmann
A better way is to IPS package everything properly, and add proper metadata to your packages, so that those packages that should go into a new BE do ask for one in their manifest. That way, there is no distinction between any system package living in /usr (or wherever) and your package

[OmniOS-discuss] How to check if you have enough NFS server threads?

2015-03-20 Thread Chris Siebenmann
We're running into a situation with one of our NFS ZFS fileservers[*] where we're wondering if we have enough NFS server threads to handle our load. Per 'sharectl get nfs', we have 'servers=512' configured, but we're not sure we know how to check how many are actually in use and active at any
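The configured ceiling is easy to read back; the hard part is the in-use count, which needs kernel-level inspection (e.g. examining the NFS server thread pool with `mdb -k`; the exact method varies by build). A sketch of the easy half, parsing sample `sharectl get nfs` output (property lines illustrative):

```shell
# Pull the configured NFS server thread cap out of 'sharectl get nfs'
# style output.  This is the ceiling, not the live thread count.
nfscfg='servers=512
lockd_listen_backlog=32
lockd_servers=20'
max_threads=$(printf '%s\n' "$nfscfg" | awk -F= '$1 == "servers" { print $2 }')
```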

Re: [OmniOS-discuss] The ixgbe driver, Lindsay Lohan, and the Greek economy

2015-02-20 Thread Chris Siebenmann
After installation and configuration, I observed all kinds of bad behavior in the network traffic between the hosts and the server. All of this bad behavior is traced to the ixgbe driver on the storage server. Without going into the full troubleshooting process, here are my takeaways: [...]

[OmniOS-discuss] Oddity in how much reserved space there is in ZFS pools?

2014-10-23 Thread Chris Siebenmann
If you have a ZFS pool with mirror vdevs and you look at 'zfs list' versus 'zpool list', you can see that the available space is somewhat different between the two. This is a known issue and comes about because the ZFS code reserves some amount of space that can't be consumed in normal use and
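For scale: in illumos the reservation is a power-of-two fraction of pool capacity (`spa_slop_shift`, which defaults to 5, i.e. 1/32, in illumos code of roughly this era; verify against your source tree). For a hypothetical 1 TiB pool:

```shell
# Slop-space reservation as 1/32 of capacity (spa_slop_shift = 5,
# per contemporary illumos -- check your source before relying on it).
pool_bytes=$(( 1024 * 1024 * 1024 * 1024 ))   # hypothetical 1 TiB pool
slop_bytes=$(( pool_bytes >> 5 ))             # 1/32 of capacity
slop_gib=$(( slop_bytes / 1024 / 1024 / 1024 ))
```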

[OmniOS-discuss] Debugging reproducible iSCSI initiator hang problem?

2014-09-29 Thread Chris Siebenmann
We have found a reproducible iSCSI initiator issue in our environment. The short form version of the hang is that if there is a network interruption between one of our OmniOS machines and an iSCSI target and you run 'iscsiadm list target -S' in a roughly 30-second window after the interruption

[OmniOS-discuss] Tips on diagnosing an OmniOS lockup?

2014-09-19 Thread Chris Siebenmann
We have a situation where one of our OmniOS NFS fileservers (running r151010 although not current on updates) is hanging/locking up mysteriously (other identical servers run fine). No messages are logged either to the console or to syslog and the machine becomes almost totally unresponsive; at

[OmniOS-discuss] Getting detailed NFS lock information on an OmniOS NFS server

2014-08-11 Thread Chris Siebenmann
Here's a question: does anyone know of a good way to get detailed NFS lock information on an OmniOS NFS server, especially information like which client has a lock on a particular file? 'mdb -k' can be used to extract basic lock information, like the full paths of all files that have active

Re: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160

2014-04-08 Thread Chris Siebenmann
| On 2014-04-08 15:44, Saso Kiselkov wrote: | Anything below OpenSSL 1.0.0 (inclusive) isn't vulnerable to this. (Most | legacy systems, including OI, still run on the OpenSSL 0.9.8 | release train) | | Thanks, I've read that statement ;) | | I just wanted to make sure that if we have an

Re: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in RaidZ2 or Create new Vol and clone

2014-03-21 Thread Chris Siebenmann
| I am debating the idea of just swapping all my hard drives in my | current 8x2TB RaidZ2 (albeit slowly) and letting the environment | resilver each drive then expand, versus creating a new RaidZ2 on a | different box and cloning the data over. | | Obviously I know of the Pros/Cons/Risks associated

Re: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in RaidZ2 or Create new Vol and clone

2014-03-21 Thread Chris Siebenmann
| I know the drive itself does 512b emulation but I would rather run 4K | if there's a performance increase! What matters for OmniOS is what the drive reports as. If it reports honestly that it has a 4k physical sector size, ZFS will say 'nope!' even if the drive will accept 512b reads and

Re: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS pools on iSCSI

2014-03-10 Thread Chris Siebenmann
| In short: when reproducing the bug, try something like vmstat 1 in a | separate SSH shell, to see if your available memory plummets when you | disconnect the devices and/or the sr (scanrate, search for swapping) | increases substantially. 'vmstat 1' shows no sign of this. sr is flatlined at

Re: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI initiator too early in shutdown

2014-03-08 Thread Chris Siebenmann
| On 2014-03-07 21:49, Chris Siebenmann wrote: |In at least OmniOS r151008j, the iSCSI initiator and thus any iSCSI | disks it has established are shut down relatively early during a shutdown | or reboot. In specific they are terminated before halt et al runs | '/sbin/bootadm -ea update_all

Re: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI initiator too early in shutdown

2014-03-08 Thread Chris Siebenmann
I wrote: | As far as I can tell from simply looking at things right now, even | an orderly shutdown on an OmniOS system will not avoid this. I should clarify that: 'on a normal, stock setup OmniOS system'. You can of course add SMF jobs to import and export ZFS pools and then shim them into the

[OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI initiator too early in shutdown

2014-03-07 Thread Chris Siebenmann
In at least OmniOS r151008j, the iSCSI initiator and thus any iSCSI disks it has established are shut down relatively early during a shutdown or reboot. Specifically, they are terminated before halt et al. runs '/sbin/bootadm -ea update_all' (in halt.c's do_archives_update()). Under some

Re: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?

2014-03-05 Thread Chris Siebenmann
| I think a zpool list can help in your debugging to see if the pools | in question are in fact imported before zfs mount -a, or if some | unexpected magic happens and the zfs command does indeed trigger the | imports. Sorry for not mentioning this before: a 'zpool list' before the 'zfs mount

[OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?

2014-03-04 Thread Chris Siebenmann
I will ask my question to start with and then explain the background. As far as I can tell from running truss on the 'zfs mount -a' in /lib/svc/method/fs-local, this *does not* mount filesystems from pools other than rpool. However the mounts are absent immediately before it runs and present

Re: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?

2014-03-04 Thread Chris Siebenmann
| You mention 'directories' being empty. Does /fs3-test-02 contain empty | directories before being mounted? It doesn't. All of /fs3-test-01, /fs3-test-02, /h/281, and /h/999 are empty before 'zfs mount -a' runs (I've verified this with ls's immediately before the 'zfs mount -a' in

Re: [OmniOS-discuss] How do you configure serial ports on OmniOS?

2014-02-28 Thread Chris Siebenmann
| On Fri, 28 Feb 2014, Chris Siebenmann wrote: | This question makes me feel silly but I'm lost in a confusing maze of | documentation for sacadm, pmadm, and so on and I can't find anything | with web searches. What I would like to do is configure what I believe is | /dev/term/c ('ttyS3

[OmniOS-discuss] Are reboots hanging on 'rebooting...' still of interest?

2014-02-04 Thread Chris Siebenmann
Back in October there was a thread about OmniOS systems hanging on reboot (starting at http://lists.omniti.com/pipermail/omnios-discuss/2013-October/001494.html). I now have a OmniOS r151008j install that does this reliably on a SuperMicro X9SRH-7TF motherboard (and the workaround from

[OmniOS-discuss] What are people using as a source of additional packages currently?

2014-01-14 Thread Chris Siebenmann
When I tested and experimented with OmniOS last year, I used pkgsrc.org as a general source of additional packages. However it currently seems to be fairly non-functional on OmniOS (or in general, eg its install instructions point to things that don't exist right now). Is there some generally

Re: [OmniOS-discuss] NBD on OmniOS

2013-08-27 Thread Chris Siebenmann
| Also, the ZFS pool backing the NFS server may be slow to write (all | NFS I/O is sync, and you may require an SSD log device to speed this | up - seek dtrace scripts that would help you analyze beforehand if you | have any sync IO that may be the culprit, or temporarily disable sync | I/O on the