Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Michael Talbott
A couple things that I've discovered over time that might help:

Don't ever use the root user for zpool queries such as "zpool status". If you 
have a really bad failing disk, a zpool status command can take forever to 
complete when run as root. A "su nobody -c 'zpool status'" will return results 
almost instantly. So if your device discovery script(s) use zpool commands, 
that might be a choking point.
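
For example, a discovery or monitoring script can drop privileges just for the 
query itself (a rough sketch; "nobody" is only an example of an unprivileged 
user):

# query pool health as an unprivileged user so a dying disk can't hang the check
su nobody -c 'zpool status -x' | grep -q 'all pools are healthy' \
    || echo "one or more pools need attention"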

# make sure to prevent SCSI bus/device resets (in /kernel/drv/sd.conf),
# especially in an HA environment
allow-bus-device-reset=0;

Also, depending on the disk model, I've found that some of them wreak havoc on 
the SAS topology itself when they start to fail. Some just handle errors really 
badly and can flood the SAS channel. If you have a SAS switch in between, you 
might be able to work out which device is causing the grief from the switch's 
error counters.
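
Even without a switch in the path, the host-side per-device error counters can 
point the same way (plain old iostat, nothing exotic):

# soft/hard/transport error counters per device; non-zero hard or transport
# errors usually single out the disk that is poisoning the bus
iostat -En | grep -i errors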

In my case I have had horrible experiences with the WD WD4001FYYG. That model 
of drive has caused me an insane amount of headache. The disk scan on boot 
literally takes 13 seconds per disk (when the disks are perfectly good, and 
much, much longer when one is dying). If I replace them with another make/model 
of drive, the disk scan is done in a fraction of a second. Likewise, booting 
the same machine into any Linux OS, the scan completes in a fraction of a 
second. It must be something about that model's firmware that doesn't play 
nicely with the illumos sd driver. Anyway, that's a story for another time ;)

I've reduced the boot-time scan for that accursed drive from 13 seconds per 
disk down to 5 by adding this to /kernel/drv/sd.conf:

sd-config-list= "WD  WD4001FYYG","power-condition:false";

Followed by this command to commit it:
update_drv -vf sd
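
One note in case anyone copies that line: the first field of each 
sd-config-list pair is the SCSI inquiry vendor ID, normally padded with spaces 
to eight characters (mail tends to collapse the whitespace), immediately 
followed by the product ID. A sketch with a second, made-up model added purely 
for illustration:

# /kernel/drv/sd.conf -- vendor ID padded to 8 characters, product ID follows
sd-config-list=
    "WD      WD4001FYYG", "power-condition:false",
    "HGST    HUS726040AL", "power-condition:false";

# reload the sd configuration without a reboot
update_drv -vf sd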

Hope this helps.


Michael


> On Jun 22, 2017, at 1:41 PM, Schweiss, Chip  wrote:
> 
> I'm talking about an offline pool.  I started this thread after rebooting a 
> server that is part of an HA pair. The other server has the pools online.  
> It's been over 4 hours now and it still hasn't completed its disk scan.
> 
> Every tool I have that helps me locate disks suffers from the same insane 
> command timeout, repeated many times, before it can move on. Operations that 
> typically take seconds blow up to hours really fast because of a few dead 
> disks.
> 
> -Chip
> 
> 
> 
> On Thu, Jun 22, 2017 at 3:12 PM, Dale Ghent  > wrote:
> 
> Are you able to, and have you tried, offlining it in the zpool?
> 
> zpool offline thepool 
> 
> I'm assuming the pool has some redundancy which would allow for this.
> 
> /dale
> 
> > On Jun 22, 2017, at 11:54 AM, Schweiss, Chip  > > wrote:
> >
> > When ever a disk goes south, several disk related takes become painfully 
> > slow.  Boot up times can jump into the hours to complete the disk scans.
> >
> > The logs slowly get these type messages:
> >
> > genunix: WARNING /pci@0,0/pci8086,340c@5/pci15d9,400@0 (mpt_sas0):
> > Timeout of 60 seconds expired with 1 commands on target 16 lun 0
> >
> > I thought this /etc/system setting would reduce the timeout to 5 seconds:
> > set sd:sd_io_time = 5
> >
> > But this doesn't seem to change anything.
> >
> > Is there anyway to make this a more reasonable timeout, besides pulling the 
> > disk that's causing it?   Just locating the defective disk is also 
> > painfully slow because of this problem.
> >
> > -Chip
> > ___
> > OmniOS-discuss mailing list
> > OmniOS-discuss@lists.omniti.com 
> > http://lists.omniti.com/mailman/listinfo/omnios-discuss 
> > 
> 
> 
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Bob Friesenhahn

On Thu, 22 Jun 2017, Schweiss, Chip wrote:


I'm talking about an offline pool.  I started this thread after rebooting
a server that is part of an HA pair. The other server has the pools
online.  It's been over 4 hours now and it still hasn't completed its disk
scan.

Every tool I have that helps me locate disks suffers from the same insane
command timeout, repeated many times, before it can move on. Operations that
typically take seconds blow up to hours really fast because of a few dead
disks.


You forgot to describe your storage topology and the type of drives 
(SAS/SATA) involved.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Jeffry Molanus
Hi,

Certain commands (in particular during attach) are sent by mptsas itself;
these have a timeout set in the driver and are not issued by sd, hence they
are not affected by changing those values. See, for example,
mptsas_access_config_page().
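
If you want to see where those driver-internal commands are sitting while the 
box crawls through its scan, one quick (root-only) check is to dump the kernel 
thread stacks that involve the driver module:

# show kernel thread stacks containing frames from the mpt_sas module
echo "::stacks -m mpt_sas" | mdb -k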

 - Jeffry

On Thu, Jun 22, 2017 at 10:12 PM, Dale Ghent  wrote:

>
> Are you able to, and have you tried, offlining it in the zpool?
>
> zpool offline thepool 
>
> I'm assuming the pool has some redundancy which would allow for this.
>
> /dale
>
> > On Jun 22, 2017, at 11:54 AM, Schweiss, Chip  wrote:
> >
> > When ever a disk goes south, several disk related takes become painfully
> slow.  Boot up times can jump into the hours to complete the disk scans.
> >
> > The logs slowly get these type messages:
> >
> > genunix: WARNING /pci@0,0/pci8086,340c@5/pci15d9,400@0 (mpt_sas0):
> > Timeout of 60 seconds expired with 1 commands on target 16 lun 0
> >
> > I thought this /etc/system setting would reduce the timeout to 5 seconds:
> > set sd:sd_io_time = 5
> >
> > But this doesn't seem to change anything.
> >
> > Is there anyway to make this a more reasonable timeout, besides pulling
> the disk that's causing it?   Just locating the defective disk is also
> painfully slow because of this problem.
> >
> > -Chip
> > ___
> > OmniOS-discuss mailing list
> > OmniOS-discuss@lists.omniti.com
> > http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Schweiss, Chip
I'm talking about an offline pool.  I started this thread after rebooting
a server that is part of an HA pair. The other server has the pools
online.  It's been over 4 hours now and it still hasn't completed its disk
scan.

Every tool I have that helps me locate disks suffers from the same insane
command timeout, repeated many times, before it can move on. Operations that
typically take seconds blow up to hours really fast because of a few dead
disks.

-Chip



On Thu, Jun 22, 2017 at 3:12 PM, Dale Ghent  wrote:

>
> Are you able to, and have you tried, offlining it in the zpool?
>
> zpool offline thepool 
>
> I'm assuming the pool has some redundancy which would allow for this.
>
> /dale
>
> > On Jun 22, 2017, at 11:54 AM, Schweiss, Chip  wrote:
> >
> > When ever a disk goes south, several disk related takes become painfully
> slow.  Boot up times can jump into the hours to complete the disk scans.
> >
> > The logs slowly get these type messages:
> >
> > genunix: WARNING /pci@0,0/pci8086,340c@5/pci15d9,400@0 (mpt_sas0):
> > Timeout of 60 seconds expired with 1 commands on target 16 lun 0
> >
> > I thought this /etc/system setting would reduce the timeout to 5 seconds:
> > set sd:sd_io_time = 5
> >
> > But this doesn't seem to change anything.
> >
> > Is there anyway to make this a more reasonable timeout, besides pulling
> the disk that's causing it?   Just locating the defective disk is also
> painfully slow because of this problem.
> >
> > -Chip
> > ___
> > OmniOS-discuss mailing list
> > OmniOS-discuss@lists.omniti.com
> > http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Dale Ghent

Are you able to, and have you tried, offlining it in the zpool?

zpool offline thepool 

I'm assuming the pool has some redundancy which would allow for this.
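
Roughly like this, with made-up pool and device names:

# take the suspect disk out of service, then bring it back after replacement
zpool offline tank c0t5000C500ABCDEF12d0
zpool online tank c0t5000C500ABCDEF12d0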

/dale

> On Jun 22, 2017, at 11:54 AM, Schweiss, Chip  wrote:
> 
> Whenever a disk goes south, several disk-related tasks become painfully 
> slow.  Boot-up times can jump into the hours to complete the disk scans.
> 
> The logs slowly fill with messages like these:
> 
> genunix: WARNING /pci@0,0/pci8086,340c@5/pci15d9,400@0 (mpt_sas0):
> Timeout of 60 seconds expired with 1 commands on target 16 lun 0
> 
> I thought this /etc/system setting would reduce the timeout to 5 seconds:
> set sd:sd_io_time = 5
> 
> But this doesn't seem to change anything.
> 
> Is there any way to make this a more reasonable timeout, besides pulling the 
> disk that's causing it?  Just locating the defective disk is also painfully 
> slow because of this problem.
> 
> -Chip
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss



___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Schweiss, Chip
On Thu, Jun 22, 2017 at 11:05 AM, Michael Rasmussen  wrote:

>
> > I thought this /etc/system setting would reduce the timeout to 5 seconds:
> > set sd:sd_io_time = 5
> >
> I think it expects a hex value so try 0x5 instead.
>
>
Unfortunately, no, I've tried that too.

-Chip


> --
> Hilsen/Regards
> Michael Rasmussen
>
> Get my public GnuPG keys:
> michael  rasmussen  cc
> http://pgp.mit.edu:11371/pks/lookup?op=get=0xD3C9A00E
> mir  datanom  net
> http://pgp.mit.edu:11371/pks/lookup?op=get=0xE501F51C
> mir  miras  org
> http://pgp.mit.edu:11371/pks/lookup?op=get=0xE3E80917
> --
> /usr/games/fortune -es says:
> Look, we play the Star Spangled Banner before every game.  You want us
> to pay income taxes, too?
> -- Bill Veeck, Chicago White Sox
>
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Michael Rasmussen
On Thu, 22 Jun 2017 10:54:25 -0500
"Schweiss, Chip"  wrote:

> I thought this /etc/system setting would reduce the timeout to 5 seconds:
> set sd:sd_io_time = 5
> 
I think it expects a hex value, so try 0x5 instead.
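
You can also read (and poke) the value the running kernel is using with mdb. 
A sketch only - and note that, as far as I know, sd copies this into a 
per-device timeout at attach time, so a live change only affects devices 
attached afterwards:

# read the current value as a decimal integer
echo 'sd`sd_io_time/D' | mdb -k
# set it to 5 seconds in the live kernel (0t prefix means decimal)
echo 'sd`sd_io_time/W 0t5' | mdb -kw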

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael  rasmussen  cc
http://pgp.mit.edu:11371/pks/lookup?op=get=0xD3C9A00E
mir  datanom  net
http://pgp.mit.edu:11371/pks/lookup?op=get=0xE501F51C
mir  miras  org
http://pgp.mit.edu:11371/pks/lookup?op=get=0xE3E80917
--
/usr/games/fortune -es says:
Look, we play the Star Spangled Banner before every game.  You want us
to pay income taxes, too?
-- Bill Veeck, Chicago White Sox


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


[OmniOS-discuss] scsi command timeouts

2017-06-22 Thread Schweiss, Chip
Whenever a disk goes south, several disk-related tasks become painfully
slow.  Boot-up times can jump into the hours to complete the disk scans.

The logs slowly fill with messages like these:

genunix: WARNING /pci@0,0/pci8086,340c@5/pci15d9,400@0 (mpt_sas0):
Timeout of 60 seconds expired with 1 commands on target 16 lun 0

I thought this /etc/system setting would reduce the timeout to 5 seconds:
set sd:sd_io_time = 5

But this doesn't seem to change anything.

Is there any way to make this a more reasonable timeout, besides pulling the
disk that's causing it?  Just locating the defective disk is also
painfully slow because of this problem.

-Chip
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
Hi Dan,

Thanks for pointing this out. No, the service is not running:

svcs -a | grep cap



Oliver Weinmann
Senior Unix VMWare, Storage Engineer
Telespazio VEGA Deutschland GmbH
 Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de
Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller

-----Original Message-----
From: Dan McDonald [mailto:dan...@kebe.com]
Sent: Donnerstag, 22. Juni 2017 14:10
To: Oliver Weinmann ; Dan McDonald 

Cc: Tobias Oetiker ; omnios-discuss 

Subject: Re: [OmniOS-discuss] Loosing NFS shares


> On Jun 22, 2017, at 3:13 AM, Oliver Weinmann 
>  wrote:
>
> Hi,
>
> Don’t think so:
>
> svcs -vx rcapd
>
> shows nothing.

You're not looking for the right thing.

neuromancer(~)[0]% pgrep rcapd
340
neuromancer(~)[0]% svcs -a | grep cap
online May_12   svc:/system/rcap:default
neuromancer(~)[0]% svcs -xv rcap
svc:/system/rcap:default (resource capping daemon)
 State: online since Fri May 12 02:12:40 2017
   See: man -M /usr/share/man -s 1M rcapd
   See: man -M /usr/share/man -s 1M rcapstat
   See: man -M /usr/share/man -s 1M rcapadm
   See: /var/svc/log/system-rcap:default.log
Impact: None.
neuromancer(~)[0]% su troot
Password:
OmniOS 5.11 omnios-r151022-f9693432c2   May 2017
(0)# svcadm disable rcap
(0)#


Hope this helps,
Dan

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Dan McDonald

> On Jun 22, 2017, at 3:13 AM, Oliver Weinmann 
>  wrote:
> 
> Hi,
>  
> Don’t think so:
>  
> svcs -vx rcapd 
>  
> shows nothing.

You're not looking for the right thing.

neuromancer(~)[0]% pgrep rcapd
340
neuromancer(~)[0]% svcs -a | grep cap
online May_12   svc:/system/rcap:default
neuromancer(~)[0]% svcs -xv rcap
svc:/system/rcap:default (resource capping daemon)
 State: online since Fri May 12 02:12:40 2017
   See: man -M /usr/share/man -s 1M rcapd
   See: man -M /usr/share/man -s 1M rcapstat
   See: man -M /usr/share/man -s 1M rcapadm
   See: /var/svc/log/system-rcap:default.log
Impact: None.
neuromancer(~)[0]% su troot
Password: 
OmniOS 5.11 omnios-r151022-f9693432c2   May 2017
(0)# svcadm disable rcap
(0)# 


Hope this helps,
Dan

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
Hi,

Running zfs mount -a from / shows the same errors.

I now ran the following commands to correct the mountpoints:

Re-enable inheritance:

zfs inherit -r mountpoint hgst4u60/ReferencePR

Reset mountpoint on the root folder:

zfs set mountpoint=/hgst4u60/ReferencePR hgst4u60/ReferencePR

unmount all subfolders:

for fs in `zfs mount | grep ReferencePR | awk '{print $2}'`; do zfs unmount 
$fs; done

Check that all subfolders are unmounted:

zfs mount | grep ReferencePR

Check that all folders are empty, just to be sure!!!:

du  /hgst4u60/ReferencePR

If they are empty remove the root folder:

rm -rf /hgst4u60/ReferencePR

Finally remount:

zfs mount -a


This solves the mount issues, but I wonder why it happened in the first place, 
and whether it will happen again.
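
For the record, here is a rough sketch I'll keep around to spot this earlier 
next time - it lists datasets that are currently unmounted but whose mountpoint 
directory already contains stray local files:

# list unmounted datasets whose mountpoint directory is not empty
zfs list -H -o name,mountpoint,mounted | while read name mp mounted; do
    if [ "$mounted" = "no" ] && [ -d "$mp" ] && [ -n "$(ls -A "$mp")" ]; then
        echo "$name: $mp is not empty"
    fi
done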





Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
From: Sriram Narayanan [mailto:sriram...@gmail.com]
Sent: Donnerstag, 22. Juni 2017 10:26
To: Oliver Weinmann 
Cc: Stephan Budach ; omnios-discuss 

Subject: Re: [OmniOS-discuss] Loosing NFS shares



On Thu, Jun 22, 2017 at 3:45 PM, Oliver Weinmann wrote:
One more thing I just noticed is that the system seems to be unable to mount 
directories:

root@omnios01:/hgst4u60/ReferenceAC/AGDEMO# /usr/sbin/zfs mount -a
cannot mount '/hgst4u60/ReferenceAC': directory is not empty
cannot mount '/hgst4u60/ReferenceDF': directory is not empty
cannot mount '/hgst4u60/ReferenceGI': directory is not empty
cannot mount '/hgst4u60/ReferenceJL': directory is not empty
cannot mount '/hgst4u60/ReferenceMO': directory is not empty
cannot mount '/hgst4u60/ReferencePR': directory is not empty
cannot mount '/hgst4u60/ReferenceSU': directory is not empty
cannot mount '/hgst4u60/ReferenceVX': directory is not empty
cannot mount '/hgst4u60/ReferenceYZ': directory is not empty

Maybe this is where all problems are coming from?

Please issue the zfs mount -a command from elsewhere rather than from within 
"/hgst4u60/ReferenceAC"

It also seems that those directories may already have local files. If possible, 
rename the Reference* directories and issue zfs mount -a again.


From: Stephan Budach 
[mailto:stephan.bud...@jvm.de]
Sent: Donnerstag, 22. Juni 2017 09:30
To: Oliver Weinmann 
>
Cc: omnios-discuss 
>
Subject: Re: [OmniOS-discuss] Loosing NFS shares

Hi Oliver,


Von: "Oliver Weinmann" 
>
An: "Tobias Oetiker" >
CC: "omnios-discuss" 
>
Gesendet: Donnerstag, 22. Juni 2017 09:13:27
Betreff: Re: [OmniOS-discuss] Loosing NFS shares

Hi,

Don’t think so:

svcs -vx rcapd

shows nothing.





Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 
799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
From: Tobias Oetiker [mailto:t...@oetiker.ch]
Sent: Donnerstag, 22. Juni 2017 09:11
To: Oliver Weinmann 
>
Cc: omnios-discuss 
>
Subject: Re: [OmniOS-discuss] Loosing NFS shares

Oliver,

are you running rcapd? We found that (at least out of the box) this thing wreaks 
havoc on both
NFS and iSCSI sharing ...

cheers
tobi

- On Jun 22, 2017, at 8:45 AM, Oliver Weinmann 
> 
wrote:
Hi,

we are using OmniOS for a few months now and have big trouble with stability. 
We mainly use it for VMware NFS datastores. The last 3 nights we lost all NFS 
datastores and VMs stopped running. I noticed that even though zfs get sharenfs 
shows folders as shared they 

Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Sriram Narayanan
On Thu, Jun 22, 2017 at 3:45 PM, Oliver Weinmann <
oliver.weinm...@telespazio-vega.de> wrote:

> One more thing I just noticed is that the system seems to be unable to
> mount directories:
>
>
>
> root@omnios01:/hgst4u60/ReferenceAC/AGDEMO# /usr/sbin/zfs mount -a
>
> cannot mount '/hgst4u60/ReferenceAC': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceDF': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceGI': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceJL': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceMO': directory is not empty
>
> cannot mount '/hgst4u60/ReferencePR': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceSU': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceVX': directory is not empty
>
> cannot mount '/hgst4u60/ReferenceYZ': directory is not empty
>
>
>
> Maybe this is where all problems are coming from?
>

Please issue the zfs mount -a command from elsewhere rather than from
within "/hgst4u60/ReferenceAC"

It also seems that those directories may already have local files. If
possible, rename the Reference* directories and issue zfs mount -a again.
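
Something along these lines, with one dataset as a hypothetical example
(repeat per dataset as needed):

# move the stray local files aside, then remount the dataset on a clean directory
zfs unmount hgst4u60/ReferenceAC 2>/dev/null
mv /hgst4u60/ReferenceAC /hgst4u60/ReferenceAC.local
zfs mount hgst4u60/ReferenceAC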


>
>
> *From:* Stephan Budach [mailto:stephan.bud...@jvm.de]
> *Sent:* Donnerstag, 22. Juni 2017 09:30
> *To:* Oliver Weinmann 
> *Cc:* omnios-discuss 
> *Subject:* Re: [OmniOS-discuss] Loosing NFS shares
>
>
>
> Hi Oliver,
>
>
> --
>
> *Von: *"Oliver Weinmann" 
> *An: *"Tobias Oetiker" 
> *CC: *"omnios-discuss" 
> *Gesendet: *Donnerstag, 22. Juni 2017 09:13:27
> *Betreff: *Re: [OmniOS-discuss] Loosing NFS shares
>
>
>
> Hi,
>
>
>
> Don’t think so:
>
>
>
> svcs -vx rcapd
>
>
>
> shows nothing.
>
>
>
>
>
>
> *Oliver Weinmann*
> Senior Unix VMWare, Storage Engineer
>
> Telespazio VEGA Deutschland GmbH
> Europaplatz 5 - 64293 Darmstadt - Germany
> Ph: + 49 (0)6151 8257 744 <+49%206151%208257744> | Fax: +49 (0)6151 8257
> 799 <+49%206151%208257799>
> oliver.weinm...@telespazio-vega.de
> http://www.telespazio-vega.de
>
> Registered office/Sitz: Darmstadt, Register court/Registergericht:
> Darmstadt, HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
>
> *From:* Tobias Oetiker [mailto:t...@oetiker.ch ]
> *Sent:* Donnerstag, 22. Juni 2017 09:11
> *To:* Oliver Weinmann 
> *Cc:* omnios-discuss 
> *Subject:* Re: [OmniOS-discuss] Loosing NFS shares
>
>
>
> Oliver,
>
>
>
> are you running rcapd ? we found that (at least of the box) this thing
> wrecks havoc to both
>
> nfs and iscsi sharing ...
>
>
>
> cheers
>
> tobi
>
>
>
> - On Jun 22, 2017, at 8:45 AM, Oliver Weinmann <
> oliver.weinm...@telespazio-vega.de> wrote:
>
> Hi,
>
>
>
> we are using OmniOS for a few months now and have big trouble with
> stability. We mainly use it for VMware NFS datastores. The last 3 nights we
> lost all NFS datastores and VMs stopped running. I noticed that even though
> zfs get sharenfs shows folders as shared they become inaccessible. Setting
> sharenfs to off and sharing again solves the issue. I have no clue where to
> start. I’m fairly new to OmniOS.
>
>
>
> Any help would be highly appreciated.
>
>
>
> Thanks and Best Regards,
>
> Oliver
>
>
>
>
> *Oliver Weinmann*
> Senior Unix VMWare, Storage Engineer
>
> Telespazio VEGA Deutschland GmbH
> Europaplatz 5 - 64293 Darmstadt - Germany
> Ph: + 49 (0)6151 8257 744 <+49%206151%208257744> | Fax: +49 (0)6151 8257
> 799 <+49%206151%208257799>
> oliver.weinm...@telespazio-vega.de
> http://www.telespazio-vega.de
>
> Registered office/Sitz: Darmstadt, Register court/Registergericht:
> Darmstadt, HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
>
>
>
> What is the output from fmdump / fmdump -v? Also, it would be good to have
> a better understanding of your setup. We have been using NFS shares from
> OmniOS since r006 on OVM and also VMWare and at least the NFS part has
> always been very solid for us. So, how did you setup your storage and how
> many NFS clients do you have?
>
>
>
> Cheers,
>
> Stephan
>
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
One more thing I just noticed is that the system seems to be unable to mount 
directories:

 

root@omnios01:/hgst4u60/ReferenceAC/AGDEMO# /usr/sbin/zfs mount -a
cannot mount '/hgst4u60/ReferenceAC': directory is not empty
cannot mount '/hgst4u60/ReferenceDF': directory is not empty
cannot mount '/hgst4u60/ReferenceGI': directory is not empty
cannot mount '/hgst4u60/ReferenceJL': directory is not empty
cannot mount '/hgst4u60/ReferenceMO': directory is not empty
cannot mount '/hgst4u60/ReferencePR': directory is not empty
cannot mount '/hgst4u60/ReferenceSU': directory is not empty
cannot mount '/hgst4u60/ReferenceVX': directory is not empty
cannot mount '/hgst4u60/ReferenceYZ': directory is not empty

 

Maybe this is where all problems are coming from?

 

From: Stephan Budach [mailto:stephan.bud...@jvm.de] 
Sent: Donnerstag, 22. Juni 2017 09:30
To: Oliver Weinmann 
Cc: omnios-discuss 
Subject: Re: [OmniOS-discuss] Loosing NFS shares

 

Hi Oliver,

 

  _  

Von: "Oliver Weinmann" <  
oliver.weinm...@telespazio-vega.de>
An: "Tobias Oetiker" <  t...@oetiker.ch>
CC: "omnios-discuss" <  
omnios-discuss@lists.omniti.com>
Gesendet: Donnerstag, 22. Juni 2017 09:13:27
Betreff: Re: [OmniOS-discuss] Loosing NFS shares

 

Hi,

 

Don’t think so:

 

svcs -vx rcapd 

 

shows nothing.

 

 



Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany 
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
  oliver.weinm...@telespazio-vega.de
  http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller

From: Tobias Oetiker [mailto:t...@oetiker.ch] 
Sent: Donnerstag, 22. Juni 2017 09:11
To: Oliver Weinmann  >
Cc: omnios-discuss  >
Subject: Re: [OmniOS-discuss] Loosing NFS shares

 

Oliver,

 

are you running rcapd? We found that (at least out of the box) this thing wreaks 
havoc on both
NFS and iSCSI sharing ... 

 

cheers

tobi

 

- On Jun 22, 2017, at 8:45 AM, Oliver Weinmann < 
 oliver.weinm...@telespazio-vega.de> 
wrote:

Hi,

 

we are using OmniOS for a few months now and have big trouble with stability. 
We mainly use it for VMware NFS datastores. The last 3 nights we lost all NFS 
datastores and VMs stopped running. I noticed that even though zfs get sharenfs 
shows folders as shared they become inaccessible. Setting sharenfs to off and 
sharing again solves the issue. I have no clue where to start. I’m fairly new 
to OmniOS.

 

Any help would be highly appreciated.

 

Thanks and Best Regards,

Oliver

 



Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany 
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
  oliver.weinm...@telespazio-vega.de
  http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller

 

What is the output from fmdump / fmdump -v? Also, it would be good to have a 
better understanding of your setup. We have been using NFS shares from OmniOS 
since r006 on OVM and also VMware, and at least the NFS part has always been 
very solid for us. So, how did you set up your storage, and how many NFS clients 
do you have?

 

Cheers,

Stephan



___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
Hi Stephan,

 

It seems that the problem is not VMware-related, as we also have non-VMware NFS 
shares that are disappearing. We have joined the OmniOS box to our Win2k8 R2 
domain. Previously we had also set up the LDAP client, but we had several 
problems with it, as you can see from the fmdump output, so we decided to stop 
using the LDAP client.

 

What is the output from fmdump / fmdump -v?

 

>>  TIME UUID SUNW-MSG-ID EVENT

Jan 17 18:38:07.0648 a82e5b3d-217f-ecc6-e2d0-81f52d21798b SMF-8000-YX Diagnosed

Jan 17 18:51:46.6008 a82e5b3d-217f-ecc6-e2d0-81f52d21798b FMD-8000-4M Repaired

Jan 17 18:51:46.6017 a82e5b3d-217f-ecc6-e2d0-81f52d21798b FMD-8000-6U Resolved

Jan 17 19:07:25.0795 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 SMF-8000-YX Diagnosed

Jan 17 12:44:18.0553 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 FMD-8000-4M Repaired

Jan 17 12:44:18.0563 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 FMD-8000-6U Resolved

Jan 19 10:21:58.4380 b55dcffb-d60f-c9a3-e94e-b1c753998ab0 SMF-8000-YX Diagnosed

Jan 19 10:21:59.1993 b55dcffb-d60f-c9a3-e94e-b1c753998ab0 FMD-8000-4M Repaired

Jan 19 10:21:59.2000 b55dcffb-d60f-c9a3-e94e-b1c753998ab0 FMD-8000-6U Resolved

Jan 19 17:09:38.1130 eeb1742a-c5f5-6e28-bf26-8cdf08924432 SMF-8000-YX Diagnosed

Jan 19 17:10:31.6057 eeb1742a-c5f5-6e28-bf26-8cdf08924432 FMD-8000-4M Repaired

Jan 19 17:10:31.6071 eeb1742a-c5f5-6e28-bf26-8cdf08924432 FMD-8000-6U Resolved

May 23 12:47:27.8159 685b232f-638c-cacf-cee9-98430dbdb97c SMF-8000-YX Diagnosed

May 23 12:47:33.2961 685b232f-638c-cacf-cee9-98430dbdb97c FMD-8000-4M Repaired

May 23 12:47:33.2974 685b232f-638c-cacf-cee9-98430dbdb97c FMD-8000-6U Resolved

May 23 12:53:46.2183 7bb2784a-9a50-6fd8-f25a-8043060a883a SMF-8000-YX Diagnosed

May 23 12:53:46.2245 7bb2784a-9a50-6fd8-f25a-8043060a883a FMD-8000-4M Repaired

May 23 12:53:46.2252 7bb2784a-9a50-6fd8-f25a-8043060a883a FMD-8000-6U Resolved

May 23 12:56:14.3915 d2294c05-5f95-e8dd-ea58-9bb31c11d519 SMF-8000-YX Diagnosed

May 23 12:57:21.8437 d2294c05-5f95-e8dd-ea58-9bb31c11d519 FMD-8000-4M Repaired

May 23 12:57:21.8452 d2294c05-5f95-e8dd-ea58-9bb31c11d519 FMD-8000-6U Resolved

May 23 12:59:22.2249 7d95a15a-ce83-4fd6-976c-c67fe15cd9ca SMF-8000-YX Diagnosed

May 23 12:59:22.2266 7d95a15a-ce83-4fd6-976c-c67fe15cd9ca FMD-8000-4M Repaired

May 23 12:59:22.2273 7d95a15a-ce83-4fd6-976c-c67fe15cd9ca FMD-8000-6U Resolved

May 23 16:58:42.2836 af1d96ae-489e-46cb-b4da-b5adf5780018 SMF-8000-YX Diagnosed

May 23 17:01:53.4846 af1d96ae-489e-46cb-b4da-b5adf5780018 FMD-8000-4M Repaired

May 23 17:01:53.4857 af1d96ae-489e-46cb-b4da-b5adf5780018 FMD-8000-6U Resolved

May 24 10:04:18.5145 34f126dd-fd71-4de2-cafd-dd084438d63a SMF-8000-YX Diagnosed

May 24 12:15:47.8823 34f126dd-fd71-4de2-cafd-dd084438d63a FMD-8000-4M Repaired

May 24 12:15:47.8838 34f126dd-fd71-4de2-cafd-dd084438d63a FMD-8000-6U Resolved

root@omnios01:/hgst4u60/ReferenceAC/AGDEMO# fmdump -v

TIME UUID SUNW-MSG-ID EVENT

Jan 17 18:38:07.0648 a82e5b3d-217f-ecc6-e2d0-81f52d21798b SMF-8000-YX Diagnosed

  100%  defect.sunos.smf.svc.maintenance

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 17 18:51:46.6008 a82e5b3d-217f-ecc6-e2d0-81f52d21798b FMD-8000-4M Repaired

  100%  defect.sunos.smf.svc.maintenanceRepair Attempted

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 17 18:51:46.6017 a82e5b3d-217f-ecc6-e2d0-81f52d21798b FMD-8000-6U Resolved

  100%  defect.sunos.smf.svc.maintenanceRepair Attempted

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 17 19:07:25.0795 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 SMF-8000-YX Diagnosed

  100%  defect.sunos.smf.svc.maintenance

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 17 12:44:18.0553 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 FMD-8000-4M Repaired

  100%  defect.sunos.smf.svc.maintenanceRepair Attempted

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 17 12:44:18.0563 4a1ea04f-3b1f-6d80-ae6c-eafb51341197 FMD-8000-6U Resolved

  100%  defect.sunos.smf.svc.maintenanceRepair Attempted

 

Problem in: svc:///network/ntp:default

   Affects: svc:///network/ntp:default

   FRU: -

  Location: -

 

Jan 19 10:21:58.4380 b55dcffb-d60f-c9a3-e94e-b1c753998ab0 SMF-8000-YX Diagnosed

  100%  defect.sunos.smf.svc.maintenance

 

Problem in: svc:///network/ldap/client:default

   Affects: svc:///network/ldap/client:default

   FRU: -

  

Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Stephan Budach
Hi Oliver, 


Von: "Oliver Weinmann"  
An: "Tobias Oetiker"  
CC: "omnios-discuss"  
Gesendet: Donnerstag, 22. Juni 2017 09:13:27 
Betreff: Re: [OmniOS-discuss] Loosing NFS shares 



Hi, 



Don’t think so: 



svcs -vx rcapd 



shows nothing. 







Oliver Weinmann 
Senior Unix VMWare, Storage Engineer 

Telespazio VEGA Deutschland GmbH 
Europaplatz 5 - 64293 Darmstadt - Germany 
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799 
[ mailto:oliver.weinm...@telespazio-vega.de | 
oliver.weinm...@telespazio-vega.de ] 
[ http://www.telespazio-vega.de/ | http://www.telespazio-vega.de ] 


Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller 


From: Tobias Oetiker [mailto:t...@oetiker.ch] 
Sent: Donnerstag, 22. Juni 2017 09:11 
To: Oliver Weinmann  
Cc: omnios-discuss  
Subject: Re: [OmniOS-discuss] Loosing NFS shares 





Oliver, 





are you running rcapd? We found that (at least out of the box) this thing wreaks 
havoc on both 


NFS and iSCSI sharing ... 





cheers 


tobi 





- On Jun 22, 2017, at 8:45 AM, Oliver Weinmann < [ 
mailto:oliver.weinm...@telespazio-vega.de | oliver.weinm...@telespazio-vega.de 
] > wrote: 





Hi, 



we are using OmniOS for a few months now and have big trouble with stability. 
We mainly use it for VMware NFS datastores. The last 3 nights we lost all NFS 
datastores and VMs stopped running. I noticed that even though zfs get sharenfs 
shows folders as shared they become inaccessible. Setting sharenfs to off and 
sharing again solves the issue. I have no clue where to start. I’m fairly new 
to OmniOS. 



Any help would be highly appreciated. 



Thanks and Best Regards, 

Oliver 





Oliver Weinmann 
Senior Unix VMWare, Storage Engineer 

Telespazio VEGA Deutschland GmbH 
Europaplatz 5 - 64293 Darmstadt - Germany 
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799 
[ mailto:oliver.weinm...@telespazio-vega.de | 
oliver.weinm...@telespazio-vega.de ] 
[ http://www.telespazio-vega.de/ | http://www.telespazio-vega.de ] 


Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller 



What is the output from fmdump / fmdump -v? Also, it would be good to have a 
better understanding of your setup. We have been using NFS shares from OmniOS 
since r006 on OVM and also VMware, and at least the NFS part has always been 
very solid for us. So, how did you set up your storage, and how many NFS clients 
do you have? 

Cheers, 
Stephan 


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Stephan Budach
Hi Oliver, 

- Original Message -

> From: "Oliver Weinmann" 
> To: omnios-discuss@lists.omniti.com
> Sent: Thursday, June 22, 2017 08:45:14
> Subject: [OmniOS-discuss] Losing NFS shares

> Hi,

> we are using OmniOS for a few months now and have big trouble with
> stability. We mainly use it for VMware NFS datastores. The last 3
> nights we lost all NFS datastores and VMs stopped running. I noticed
> that even though zfs get sharenfs shows folders as shared they
> become inaccessible. Setting sharenfs to off and sharing again
> solves the issue. I have no clue where to start. I’m fairly new to
> OmniOS.

> Any help would be highly appreciated.

> Thanks and Best Regards,
> Oliver


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
Hi,

Don’t think so:

svcs -vx rcapd

shows nothing.





Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
From: Tobias Oetiker [mailto:t...@oetiker.ch]
Sent: Donnerstag, 22. Juni 2017 09:11
To: Oliver Weinmann 
Cc: omnios-discuss 
Subject: Re: [OmniOS-discuss] Loosing NFS shares

Oliver,

are you running rcapd? We found that (at least out of the box) this thing wreaks 
havoc on both
NFS and iSCSI sharing ...

cheers
tobi

- On Jun 22, 2017, at 8:45 AM, Oliver Weinmann 
> 
wrote:

Hi,

we are using OmniOS for a few months now and have big trouble with stability. 
We mainly use it for VMware NFS datastores. The last 3 nights we lost all NFS 
datastores and VMs stopped running. I noticed that even though zfs get sharenfs 
shows folders as shared they become inaccessible. Setting sharenfs to off and 
sharing again solves the issue. I have no clue where to start. I’m fairly new 
to OmniOS.

Any help would be highly appreciated.

Thanks and Best Regards,
Oliver




Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss

--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch t...@oetiker.ch 
+41 62 775 9902
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Tobias Oetiker
Oliver, 

are you running rcapd? We found that (at least out of the box) this thing wreaks 
havoc on both 
NFS and iSCSI sharing ... 

cheers 
tobi 

- On Jun 22, 2017, at 8:45 AM, Oliver Weinmann 
 wrote: 

> Hi,

> we are using OmniOS for a few months now and have big trouble with stability. 
> We
> mainly use it for VMware NFS datastores. The last 3 nights we lost all NFS
> datastores and VMs stopped running. I noticed that even though zfs get 
> sharenfs
> shows folders as shared they become inaccessible. Setting sharenfs to off and
> sharing again solves the issue. I have no clue where to start. I’m fairly new
> to OmniOS.

> Any help would be highly appreciated.

> Thanks and Best Regards,

> Oliver

> Oliver Weinmann
> Senior Unix VMWare, Storage Engineer

> Telespazio VEGA Deutschland GmbH
> Europaplatz 5 - 64293 Darmstadt - Germany
> Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
> [ mailto:oliver.weinm...@telespazio-vega.de | 
> oliver.weinm...@telespazio-vega.de
> ]
> [ http://www.telespazio-vega.de/ | http://www.telespazio-vega.de ]

> Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt,
> HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland 
www.oetiker.ch t...@oetiker.ch +41 62 775 9902 
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


[OmniOS-discuss] Losing NFS shares

2017-06-22 Thread Oliver Weinmann
Hi,

We have been using OmniOS for a few months now and are having big trouble with 
stability. We mainly use it for VMware NFS datastores. On each of the last 3 
nights we lost all NFS datastores and the VMs stopped running. I noticed that 
even though zfs get sharenfs shows the folders as shared, they become 
inaccessible. Setting sharenfs to off and sharing again solves the issue. I 
have no clue where to start. I'm fairly new to OmniOS.
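
Concretely, the workaround I run per affected dataset looks like this (the 
dataset name is just an example, and the "on" should really be whatever share 
options you normally use):

# un-share and re-share; afterwards the clients can reach the export again
zfs set sharenfs=off hgst4u60/ReferenceAC
zfs set sharenfs=on hgst4u60/ReferenceAC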

Any help would be highly appreciated.

Thanks and Best Regards,
Oliver




Oliver Weinmann
Senior Unix VMWare, Storage Engineer

Telespazio VEGA Deutschland GmbH
Europaplatz 5 - 64293 Darmstadt - Germany
Ph: + 49 (0)6151 8257 744 | Fax: +49 (0)6151 8257 799
oliver.weinm...@telespazio-vega.de
http://www.telespazio-vega.de

Registered office/Sitz: Darmstadt, Register court/Registergericht: Darmstadt, 
HRB 89231; Managing Director/Geschäftsführer: Sigmar Keller
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss