Re: [OmniOS-discuss] Ang: LX Zones question: Do you miss ipadm(1M)?

2017-03-30 Thread Brian Hechinger
I'd like to see a way that network configuration can be disabled from
within the zone so that it's set by the host admin and not the zone admin
(assuming they are different people).

Is this a possibility?

On Mar 30, 2017 5:04 PM, "Dan McDonald"  wrote:

>
> > On Mar 30, 2017, at 5:02 PM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:
> >
> > On Thu, 30 Mar 2017, Dan McDonald wrote:
> >
> >>
> >>> On Mar 30, 2017, at 4:26 PM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:
> >>>
> >>> The only way it could possibly work is if /etc/resolv.conf gets
> >>> updated in the zone.  This is because native user-space apps/libraries
> >>> take care of the DNS lookups rather than kernel code.
> >>
> >> Check out /usr/lib/brand/lx/lx_boot_zone_*.  Those scripts scribble
> >> resolv.conf at zone boot time.
> >
> > Linux DHCP can overwrite files at any time, possibly weeks after boot.
>
> Interesting.
>
> Given "lxinit" does DHCP too, you probably shouldn't be using any Linux
> DHCP client in an LX zone.
>
> Dan
>
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
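
Bob's point above, that a Linux DHCP client may rewrite /etc/resolv.conf long after the brand boot scripts have written it, can be checked empirically. The following is only a minimal Python sketch, not something from the thread: the function names and the idea of keeping a baseline digest are assumptions made for illustration, and nothing here is an lx-brand or OmniOS interface.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(path, baseline_digest):
    """Return True if the file no longer matches a digest recorded earlier."""
    return file_digest(path) != baseline_digest
```

A possible use inside the zone would be to record `file_digest("/etc/resolv.conf")` shortly after boot and call `has_changed()` periodically; a True result weeks later would suggest that something other than the boot scripts touched the file.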


Re: [OmniOS-discuss] OmniOS r151020: Setup won't see my NVMe disk

2017-03-23 Thread Brian Hechinger

> On Mar 23, 2017, at 12:01, Dan McDonald  wrote:
> 
> 
>> On Mar 23, 2017, at 11:53 AM, Brian Hechinger  wrote:
>> 
>> Has the issue with the Samsung drives been fixed? I haven’t tried it lately, 
>> but the 950 Pros were hanging. Those are 1.2a devices though, so I don’t 
>> know if that matters.
>> 
>> Just worried that enabling 1.2 might result in crashy systems.
>> 
> 
> No idea, and "crashy systems" is why it's not turned on by default today.

Fix it, fix it, fix it, fix it, fix it, fix it!

> Thank you for the reality check!

Yay, I’m useful! :-D

-brian


Re: [OmniOS-discuss] OmniOS r151020: Setup won't see my NVMe disk

2017-03-23 Thread Brian Hechinger

>> This suggests that the SSD is indeed an NVMe 1.2 SSD. :-(
>> But I still did not find a proper datasheet or any (Linux, ...) tool to 
>> identify the NVMe level.
>> 
>> What are your plans to support NVMe 1.2?
> 
> We can override the NVMe settings to support 1.2 devices, but we can't 
> support all of the 1.2 improvements (e.g. namespaces).  Colleague Dale Ghent, 
> who's spent time both in ixgbe (he put X550 support into illumos and 
> therefore OmniOS) and NVMe can speak more, but I don't think he'll disagree 
> with anything I've said here.
> 
> The big question to my mind is whether or not we should just support NVMe 1.2 
> devices out of the box (and out of the Installer ISO).

Has the issue with the Samsung drives been fixed? I haven’t tried it lately, 
but the 950 Pros were hanging. Those are 1.2a devices though, so I don’t know 
if that matters.

Just worried that enabling 1.2 might result in crashy systems.

-brian
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
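
On the question of identifying the NVMe level of a device: the version is encoded as a 32-bit value in the controller's VS register (and, for 1.2 and later controllers, in the VER field of the Identify Controller data), with the major number in bits 31:16, the minor number in bits 15:8 and the tertiary number in bits 7:0. The sketch below only decodes such a value. How the raw number is obtained is left open; on Linux, the `ver` field printed by nvme-cli's `id-ctrl` command should be one source. The function names are made up for this example.

```python
def decode_nvme_version(vs):
    """Split a 32-bit NVMe version value into (major, minor, tertiary).

    Layout: bits 31:16 major, bits 15:8 minor, bits 7:0 tertiary.
    """
    if not 0 <= vs <= 0xFFFFFFFF:
        raise ValueError("version value must fit in 32 bits")
    return (vs >> 16) & 0xFFFF, (vs >> 8) & 0xFF, vs & 0xFF


def format_nvme_version(vs):
    """Render the version value as a dotted string such as '1.2.0'."""
    major, minor, tertiary = decode_nvme_version(vs)
    return f"{major}.{minor}.{tertiary}"
```

For instance, a raw value of 0x00010200 decodes to (1, 2, 0), which is the level discussed in this thread.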


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-12 Thread Brian Hechinger
In my case the SATA disks aren’t on the 1068E.

-brian

> On Jan 12, 2016, at 11:19 PM, John Barfield  wrote:
> 
> BTW I left off that it has the same LSI controller chipset
> 
> From: John Barfield 
> Sent: Tuesday, January 12, 2016 10:17 PM
> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
> To: , omnios-discuss 
> 
> 
> 
> My input may or may not be valid, but I'm going to throw it out there anyway :)
> 
> Do you have any mpt disconnect errors in /var/adm/messages?
> 
> Also, do you have smartmontools installed?
> 
> I ran into similar issues recently just booting a Sun Fire X4540 off of the 
> OmniOS live image; I/O would just hang while probing device nodes.
> 
> I found the drive that was acting up and pulled it.
> 
> All of a sudden everything miraculously worked amazingly well.
> 
> I compiled smartmontools after I got it to boot and found 10 drives out of 48 
> with bad sectors in a pre-fail state.
> 
> I don't know if this happens with SAS drives or not, but I'm using SATA and saw 
> that this was a common issue in old OpenSolaris threads.
> 
> -barfield 
> 
> 
> 
> 
> On Tue, Jan 12, 2016 at 8:08 PM -0800, "Brian Hechinger" <wo...@4amlunch.net> wrote:
> 
> In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think 
> that’s it though. At least will have some sort of data point to work with 
> here. :) 
> 
> -brian 
> 
> > On Jan 12, 2016, at 10:55 PM, Brian Hechinger  wrote: 
> > 
> > Ok, it has happened. 
> > 
> > Checking this here, the pool seems to be fine. I can read and write files. 
> > 
> > However, ‘zpool status’ is currently hanging, even though I can still 
> > read/write from the pool.
> > 
> > I can telnet to port 3260, but restarting target services has hung. 
> > 
> > root@basket1:/tank/Share# svcs -a | grep stmf
> > online         Jan_05   svc:/system/stmf:default
> > root@basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online         Jan_05   svc:/network/iscsi/target:default
> > online         Jan_05   svc:/system/ibsrp/target:default
> > root@basket1:/tank/Share# svcadm restart /system/ibsrp/target
> > root@basket1:/tank/Share# svcadm restart /network/iscsi/target
> > root@basket1:/tank/Share# svcadm restart /system/stmf
> > root@basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online*        22:43:03 svc:/system/ibsrp/target:default
> > online*        22:43:13 svc:/network/iscsi/target:default
> > root@basket1:/tank/Share# svcs -a | grep stmf
> > online*        22:43:18 svc:/system/stmf:default
> > root@basket1:/tank/Share#
> > 
> > I’m doing a crash dump reboot. I’ll post the output somewhere. 
> > 
> > The output of echo '$ > 
> >  
> > 
> >> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik  wrote: 
> >> 
> >> Is the pool usable during the COMSTAR hang?
> >> Can you write to and read from the pool? (Test both; in my case, when the 
> >> pool froze, I wasn’t able to write to it, but I could read.)
> >> 
> >> Again, this might not be connected with COMSTAR, but in my case, COMSTAR 
> >> hangs and pool hangs alternated.
> >> 
> >> Matej 
> >> 
> >>> On 08 Jan 2016, at 20:11, Brian Hechinger  wrote: 
> >>> 
> >>> Yeah, I’m using the 1068E to boot from (this has been supported since 
> >>> before Illumos) but that doesn’t have anything accessed by COMSTAR. 
> >>> 
> >>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space 
> >>> from. 
> >>> 
> >>> -brian 
> >>> 
> >>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel  
> >>>> wrote: 
> >>>> 
> >>>> First off, love SuperMicro; good choice IMHO.
> >>>> 
> >>>> This board has two onboard controllers:
> >>>> 
> >>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this 
> >>>> one)
> >>>> 
> >>>> And
> >>>> 
> >>>> Intel ICH10R SATA (so I'm guessing you're using this one).
> >>>> 
> >>>> -Original Message- 
> >>>> From: OmniOS-discuss [ mailto:omnios-discuss-boun...@lists.omniti.com 
> >>>>

Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-12 Thread Brian Hechinger
I will look to do this. The shared storage is on SATA disks, so maybe? Although 
they are new. I hope they are fine. :)

I don’t see anything about mpt in /var/adm/messages, no.

-brian

> On Jan 12, 2016, at 11:16 PM, John Barfield  wrote:
> 
> My input may or may not be valid, but I'm going to throw it out there anyway :)
> 
> Do you have any mpt disconnect errors in /var/adm/messages?
> 
> Also, do you have smartmontools installed?
> 
> I ran into similar issues recently just booting a Sun Fire X4540 off of the 
> OmniOS live image; I/O would just hang while probing device nodes.
> 
> I found the drive that was acting up and pulled it.
> 
> All of a sudden everything miraculously worked amazingly well.
> 
> I compiled smartmontools after I got it to boot and found 10 drives out of 48 
> with bad sectors in a pre-fail state.
> 
> I don't know if this happens with SAS drives or not, but I'm using SATA and saw 
> that this was a common issue in old OpenSolaris threads.
> 
> -barfield
> 
> 
> 
> 
> On Tue, Jan 12, 2016 at 8:08 PM -0800, "Brian Hechinger" <wo...@4amlunch.net> wrote:
> 
> In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think 
> that’s it though. At least will have some sort of data point to work with 
> here. :)
> 
> -brian
> 
> > On Jan 12, 2016, at 10:55 PM, Brian Hechinger  wrote:
> > 
> > Ok, it has happened.
> > 
> > Checking this here, the pool seems to be fine. I can read and write files.
> > 
> > However, ‘zpool status’ is currently hanging, even though I can still 
> > read/write from the pool.
> > 
> > I can telnet to port 3260, but restarting target services has hung.
> > 
> > root@basket1:/tank/Share# svcs -a | grep stmf
> > online         Jan_05   svc:/system/stmf:default
> > root@basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online         Jan_05   svc:/network/iscsi/target:default
> > online         Jan_05   svc:/system/ibsrp/target:default
> > root@basket1:/tank/Share# svcadm restart /system/ibsrp/target
> > root@basket1:/tank/Share# svcadm restart /network/iscsi/target
> > root@basket1:/tank/Share# svcadm restart /system/stmf
> > root@basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online*        22:43:03 svc:/system/ibsrp/target:default
> > online*        22:43:13 svc:/network/iscsi/target:default
> > root@basket1:/tank/Share# svcs -a | grep stmf
> > online*        22:43:18 svc:/system/stmf:default
> > root@basket1:/tank/Share#
> > 
> > I’m doing a crash dump reboot. I’ll post the output somewhere.
> > 
> > The output of echo '$ > 
> > 
> > 
> >> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik  wrote:
> >> 
> >> Is the pool usable during the COMSTAR hang?
> >> Can you write to and read from the pool? (Test both; in my case, when the 
> >> pool froze, I wasn’t able to write to it, but I could read.)
> >> 
> >> Again, this might not be connected with COMSTAR, but in my case, COMSTAR 
> >> hangs and pool hangs alternated.
> >> 
> >> Matej
> >> 
> >>> On 08 Jan 2016, at 20:11, Brian Hechinger  wrote:
> >>> 
> >>> Yeah, I’m using the 1068E to boot from (this has been supported since 
> >>> before Illumos) but that doesn’t have anything accessed by COMSTAR.
> >>> 
> >>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space 
> >>> from.
> >>> 
> >>> -brian
> >>> 
> >>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel  
> >>>> wrote:
> >>>> 
> >>>> First off, love SuperMicro; good choice IMHO.
> >>>> 
> >>>> This board has two onboard controllers:
> >>>> 
> >>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this 
> >>>> one)
> >>>> 
> >>>> And
> >>>> 
> >>>> Intel ICH10R SATA (so I'm guessing you're using this one).
> >>>> 
> >>>> -Original Message-
> >>>> From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On 
> >>>> Behalf Of Brian Hechinger
> >>>> Sent: Friday, January 08, 2016 12:16 PM
> >>>> To: Matej Zerovnik 
> >>>> Cc: omnios-discuss 
> >
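
The smartmontools check suggested in the quoted message (looking for drives with bad sectors in a pre-fail state) can be scripted. What follows is a rough Python sketch that parses the attribute table printed by `smartctl -A` and flags sector-related attributes with a non-zero raw counter. The choice of attribute names and the function names are assumptions for illustration, and the column layout is assumed to be the usual ATA one (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE).

```python
import re

# Attributes commonly associated with failing sectors (an illustrative choice).
SECTOR_ATTRIBUTES = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_smart_attributes(text):
    """Parse the attribute table from `smartctl -A` output into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and numeric ID/VALUE/WORST/THRESH.
        if len(fields) < 10:
            continue
        if not all(f.isdigit() for f in (fields[0], fields[3], fields[4], fields[5])):
            continue
        raw = re.match(r"\d+", fields[9])  # raw may carry trailing annotations
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "type": fields[6],
            "raw": int(raw.group()) if raw else None,
        })
    return rows


def suspect_sector_attributes(rows):
    """Return sector-related attributes whose raw counter is non-zero."""
    return [r for r in rows if r["name"] in SECTOR_ATTRIBUTES and r["raw"]]
```

Feeding it the captured output of `smartctl -A` for each disk and printing any non-empty result would give a quick per-drive summary; a clean result does not prove a drive is healthy.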

Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-12 Thread Brian Hechinger
In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think 
that’s it though. At least will have some sort of data point to work with here. 
:)

-brian

> On Jan 12, 2016, at 10:55 PM, Brian Hechinger  wrote:
> 
> Ok, it has happened.
> 
> Checking this here, the pool seems to be fine. I can read and write files.
> 
> However, ‘zpool status’ is currently hanging, even though I can still 
> read/write from the pool.
> 
> I can telnet to port 3260, but restarting target services has hung.
> 
> root@basket1:/tank/Share# svcs -a | grep stmf
> online         Jan_05   svc:/system/stmf:default
> root@basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online         Jan_05   svc:/network/iscsi/target:default
> online         Jan_05   svc:/system/ibsrp/target:default
> root@basket1:/tank/Share# svcadm restart /system/ibsrp/target
> root@basket1:/tank/Share# svcadm restart /network/iscsi/target
> root@basket1:/tank/Share# svcadm restart /system/stmf
> root@basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online*        22:43:03 svc:/system/ibsrp/target:default
> online*        22:43:13 svc:/network/iscsi/target:default
> root@basket1:/tank/Share# svcs -a | grep stmf
> online*        22:43:18 svc:/system/stmf:default
> root@basket1:/tank/Share#
> 
> I’m doing a crash dump reboot. I’ll post the output somewhere.
> 
> The output of echo '$ 
> 
> 
>> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik  wrote:
>> 
>> Is the pool usable during the COMSTAR hang?
>> Can you write to and read from the pool? (Test both; in my case, when the 
>> pool froze, I wasn’t able to write to it, but I could read.)
>> 
>> Again, this might not be connected with COMSTAR, but in my case, COMSTAR 
>> hangs and pool hangs alternated.
>> 
>> Matej
>> 
>>> On 08 Jan 2016, at 20:11, Brian Hechinger  wrote:
>>> 
>>> Yeah, I’m using the 1068E to boot from (this has been supported since 
>>> before Illumos) but that doesn’t have anything accessed by COMSTAR.
>>> 
>>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space 
>>> from.
>>> 
>>> -brian
>>> 
>>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel  wrote:
>>>> 
>>>> First off, love SuperMicro; good choice IMHO.
>>>> 
>>>> This board has two onboard controllers:
>>>> 
>>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
>>>> 
>>>> And
>>>> 
>>>> Intel ICH10R SATA (so I'm guessing you're using this one).
>>>> 
>>>> -Original Message-
>>>> From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On 
>>>> Behalf Of Brian Hechinger
>>>> Sent: Friday, January 08, 2016 12:16 PM
>>>> To: Matej Zerovnik 
>>>> Cc: omnios-discuss 
>>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
>>>> 
>>>> 
>>>>> Which controller exactly do you have?
>>>> 
>>>> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
>>>> 
>>>>> Do you know firmware version?
>>>> 
>>>> I’m assuming this is linked to the BIOS version?
>>>> 
>>>>> Which hard drives?
>>>> 
>>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
>>>> 
>>>>> It might not tell much, but it’s good to have as much information as 
>>>>> possible.
>>>>> 
>>>>> When COMSTAR hangs, can you telnet to the iSCSI port?
>>>>> What does svcs say? Is the service running?
>>>>> What happens if you try to restart it?
>>>>> How do you restart it?
>>>> 
>>>> I’ll try all these things next time.
>>>> 
>>>>> In my case, svcs reported the service as running, but when I tried to 
>>>>> telnet, there was no connection, and there was no listening port open 
>>>>> when checking with 'netstat -an'. I tried to restart the target and stmf 
>>>>> services, but the stmf service got stuck in the online* state and would 
>>>>> not start. A reboot was the only solution in my case, but as I said, the 
>>>>> latest 014 release is working OK (but then again, the load got reduced).
>>>> 
>>>> All good info. Thanks!
>>>> 
>>>> -brian
>>>> 
>>>>> 
>>

Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-12 Thread Brian Hechinger
Or there won’t be a crash dump. Unless I get one after resetting the box. 
‘reboot -d’ has hung as well. :(

-brian

> On Jan 12, 2016, at 10:55 PM, Brian Hechinger  wrote:
> 
> Ok, it has happened.
> 
> Checking this here, the pool seems to be fine. I can read and write files.
> 
> However, ‘zpool status’ is currently hanging, even though I can still 
> read/write from the pool.
> 
> I can telnet to port 3260, but restarting target services has hung.
> 
> root@basket1:/tank/Share# svcs -a | grep stmf
> online         Jan_05   svc:/system/stmf:default
> root@basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online         Jan_05   svc:/network/iscsi/target:default
> online         Jan_05   svc:/system/ibsrp/target:default
> root@basket1:/tank/Share# svcadm restart /system/ibsrp/target
> root@basket1:/tank/Share# svcadm restart /network/iscsi/target
> root@basket1:/tank/Share# svcadm restart /system/stmf
> root@basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online*        22:43:03 svc:/system/ibsrp/target:default
> online*        22:43:13 svc:/network/iscsi/target:default
> root@basket1:/tank/Share# svcs -a | grep stmf
> online*        22:43:18 svc:/system/stmf:default
> root@basket1:/tank/Share#
> 
> I’m doing a crash dump reboot. I’ll post the output somewhere.
> 
> The output of echo '$ 
> 
> 
>> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik  wrote:
>> 
>> Is the pool usable during the COMSTAR hang?
>> Can you write to and read from the pool? (Test both; in my case, when the 
>> pool froze, I wasn’t able to write to it, but I could read.)
>> 
>> Again, this might not be connected with COMSTAR, but in my case, COMSTAR 
>> hangs and pool hangs alternated.
>> 
>> Matej
>> 
>>> On 08 Jan 2016, at 20:11, Brian Hechinger  wrote:
>>> 
>>> Yeah, I’m using the 1068E to boot from (this has been supported since 
>>> before Illumos) but that doesn’t have anything accessed by COMSTAR.
>>> 
>>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space 
>>> from.
>>> 
>>> -brian
>>> 
>>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel  wrote:
>>>> 
>>>> First off, love SuperMicro; good choice IMHO.
>>>> 
>>>> This board has two onboard controllers:
>>>> 
>>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
>>>> 
>>>> And
>>>> 
>>>> Intel ICH10R SATA (so I'm guessing you're using this one).
>>>> 
>>>> -Original Message-
>>>> From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On 
>>>> Behalf Of Brian Hechinger
>>>> Sent: Friday, January 08, 2016 12:16 PM
>>>> To: Matej Zerovnik 
>>>> Cc: omnios-discuss 
>>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
>>>> 
>>>> 
>>>>> Which controller exactly do you have?
>>>> 
>>>> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
>>>> 
>>>>> Do you know firmware version?
>>>> 
>>>> I’m assuming this is linked to the BIOS version?
>>>> 
>>>>> Which hard drives?
>>>> 
>>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
>>>> 
>>>>> It might not tell much, but it’s good to have as much information as 
>>>>> possible.
>>>>> 
>>>>> When COMSTAR hangs, can you telnet to the iSCSI port?
>>>>> What does svcs say? Is the service running?
>>>>> What happens if you try to restart it?
>>>>> How do you restart it?
>>>> 
>>>> I’ll try all these things next time.
>>>> 
>>>>> In my case, svcs reported the service as running, but when I tried to 
>>>>> telnet, there was no connection, and there was no listening port open 
>>>>> when checking with 'netstat -an'. I tried to restart the target and stmf 
>>>>> services, but the stmf service got stuck in the online* state and would 
>>>>> not start. A reboot was the only solution in my case, but as I said, the 
>>>>> latest 014 release is working OK (but then again, the load got reduced).
>>>> 
>>>> All good info. Thanks!
>>>> 
>>>> -brian
>>>> 
>>>>> 
>>>>> Matej
>
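
The svcs listings quoted in this thread show the stmf and target services with an asterisk after the state (online*), which svcs uses to mark an instance that is in transition rather than settled. A small Python sketch for picking such instances out of captured `svcs` output might look like the following. The function names are invented for this example, and the pattern deliberately tolerates a missing space between the state and the start time, as can happen when whitespace is collapsed in an archived copy.

```python
import re

# STATE (optionally followed by '*'), STIME, FMRI. The gap between STATE and
# STIME is optional so that collapsed lines such as "online*22:43:03 svc:/..."
# still parse.
_SVCS_LINE = re.compile(r"^\s*([a-z_]+)(\*?)\s*(\S+)\s+((?:svc|lrc):/\S+)\s*$")


def parse_svcs(text):
    """Parse `svcs` output lines into dicts; non-matching lines are ignored."""
    services = []
    for line in text.splitlines():
        match = _SVCS_LINE.match(line)
        if match:
            state, star, stime, fmri = match.groups()
            services.append({
                "state": state,
                "in_transition": star == "*",
                "stime": stime,
                "fmri": fmri,
            })
    return services


def in_transition(text):
    """Return the FMRIs of service instances flagged with '*'."""
    return [s["fmri"] for s in parse_svcs(text) if s["in_transition"]]
```

Run against the output of `svcs -a` some minutes after a restart, a non-empty result for the stmf or target services would match the stuck state described above.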

Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-08 Thread Brian Hechinger
Yeah, I’m using the 1068E to boot from (this has been supported since before 
Illumos) but that doesn’t have anything accessed by COMSTAR.

It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space from.

-brian

> On Jan 8, 2016, at 1:31 PM, Richard Jahnel  wrote:
> 
> First off, love SuperMicro; good choice IMHO.
> 
> This board has two onboard controllers:
> 
> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
> 
> And
> 
> Intel ICH10R SATA (so I'm guessing you're using this one).
> 
> -Original Message-
> From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On 
> Behalf Of Brian Hechinger
> Sent: Friday, January 08, 2016 12:16 PM
> To: Matej Zerovnik 
> Cc: omnios-discuss 
> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
> 
> 
>> Which controller exactly do you have?
> 
> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
> 
>> Do you know firmware version?
> 
> I’m assuming this is linked to the BIOS version?
> 
>> Which hard drives?
> 
> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
> 
>> It might not tell much, but it’s good to have as much information as 
>> possible.
>> 
>> When COMSTAR hangs, can you telnet to the iSCSI port?
>> What does svcs say? Is the service running?
>> What happens if you try to restart it?
>> How do you restart it?
> 
> I’ll try all these things next time.
> 
>> In my case, svcs reported the service as running, but when I tried to 
>> telnet, there was no connection, and there was no listening port open when 
>> checking with 'netstat -an'. I tried to restart the target and stmf 
>> services, but the stmf service got stuck in the online* state and would not 
>> start. A reboot was the only solution in my case, but as I said, the latest 
>> 014 release is working OK (but then again, the load got reduced).
> 
> All good info. Thanks!
> 
> -brian
> 
>> 
>> Matej
>> 
>>> On 08 Jan 2016, at 17:50, Dave Pooser  wrote:
>>> 
>>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger  wrote:
>>>>> 
>>>>> No, ZFS raid10
>>>> 
>>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
>>> 
>>> It's a zpool with multiple mirror vdevs.
>>> 
>>> -- 
>>> Dave Pooser
>>> Cat-Herder-in-Chief, Pooserville.com
>>> 
>>> 


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-08 Thread Brian Hechinger

> Which controller exactly do you have?

Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.

> Do you know firmware version?

I’m assuming this is linked to the BIOS version?

> Which hard drives?

Hitachi-HUA723030ALA640-MKAOAA50-2.73TB

> It might not tell much, but it’s good to have as much information as possible.
> 
> When COMSTAR hangs, can you telnet to the iSCSI port?
> What does svcs say? Is the service running?
> What happens if you try to restart it?
> How do you restart it?

I’ll try all these things next time.

> In my case, svcs reported the service as running, but when I tried to 
> telnet, there was no connection, and there was no listening port open when 
> checking with 'netstat -an'. I tried to restart the target and stmf 
> services, but the stmf service got stuck in the online* state and would not 
> start. A reboot was the only solution in my case, but as I said, the latest 
> 014 release is working OK (but then again, the load got reduced).

All good info. Thanks!

-brian

> 
> Matej
> 
>> On 08 Jan 2016, at 17:50, Dave Pooser  wrote:
>> 
>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger  wrote:
>>>> 
>>>> No, ZFS raid10
>>> 
>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
>> 
>> It's a zpool with multiple mirror vdevs.
>> 
>> -- 
>> Dave Pooser
>> Cat-Herder-in-Chief, Pooserville.com
>> 
>> 

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
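
The "can you telnet to the iSCSI port" check from the message above is easy to automate. Here is a minimal Python sketch that attempts a TCP connection and reports whether it succeeded. The function name and default timeout are assumptions for illustration; port 3260 is the iSCSI port mentioned elsewhere in this thread.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as `port_is_open("basket1", 3260)` would mirror the telnet test. As the later reports in this thread show, a successful connect only says that something is listening; the target services can still be hung behind it.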


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-08 Thread Brian Hechinger
Yeah, it doesn’t have a great name, really.

raidz10? :)

-brian

> On Jan 8, 2016, at 11:50 AM, Dave Pooser  wrote:
> 
>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger  wrote:
>>> 
>>> No, ZFS raid10
>> 
>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
> 
> It's a zpool with multiple mirror vdevs.
> 
> -- 
> Dave Pooser
> Cat-Herder-in-Chief, Pooserville.com
> 
> 

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
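
To make the "ZFS raid10" layout concrete: as Dave says, it is simply a pool built from several mirror vdevs, which ZFS stripes across. The Python sketch below only assembles the corresponding `zpool create` argument list from a list of disks and does not run anything. The function name, the pool name and the device names in the usage note are hypothetical.

```python
def striped_mirror_command(pool, disks):
    """Build a `zpool create` argv for a pool made of two-way mirror vdevs."""
    if len(disks) < 2 or len(disks) % 2:
        raise ValueError("need an even number of disks (at least two)")
    argv = ["zpool", "create", pool]
    # Pair the disks up; each pair becomes one mirror vdev.
    for i in range(0, len(disks), 2):
        argv += ["mirror", disks[i], disks[i + 1]]
    return argv
```

With six made-up device names, `" ".join(striped_mirror_command("tank", ["c1t0d0", "c1t1d0", "c1t2d0", "c1t3d0", "c1t4d0", "c1t5d0"]))` yields a command with three `mirror` groups, the six-disk arrangement described in this thread.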


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-08 Thread Brian Hechinger
No, ZFS raid10

-brian

> On Jan 8, 2016, at 11:13 AM, Dan McDonald  wrote:
> 
> 
>> On Jan 8, 2016, at 10:23 AM, Brian Hechinger  wrote:
>> 
>> Hardware is a Xeon box with 6x SATA disks in a raid10 wired straight to the 
>> controller. No expanders.
>> 
> 
> You're using hardware raid?!?  That could, to be honest, be part of the 
> problem.
> 
> Dan
> 
> 
> 


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-08 Thread Brian Hechinger
ZFS lockups don’t seem like a much better option. :)

Hardware is a Xeon box with 6x SATA disks in a raid10 wired straight to the 
controller. No expanders.

This is not a heavily loaded system (yet) so I don’t think it’s a load issue, 
sadly.

I *might* be able to downgrade to 014? I’m not sure I want to. I’d rather help 
get things fixed going forward.

This is for my home VMware stack. I have important services on disks local to 
the ESXi hosts (not ideal, but it makes this less painful when it happens), so 
COMSTAR locking up is mostly just an inconvenience at this point. I’d rather it 
didn’t, though, because I’d like to be using this more heavily than I am now.

Thanks for the data points!

-brian

> On Jan 8, 2016, at 9:36 AM, Matej Zerovnik  wrote:
> 
> I had the same problems… 
> 
> In my case, the COMSTAR hangs went away after downgrading to an early 014 
> version, but then ZFS started to lock up.
> 
> What is your hardware? Any JBODs? SAS or SATA drives? Expanders?
> 
> Currently, I haven’t had a COMSTAR lockup for about a month, and I’m running 
> the latest 014 (but I did reduce the number of users by about 25%, so maybe I 
> removed some of the problematic users).
> 
> Matej
> 
> 
>> On 06 Jan 2016, at 21:29, Brian Hechinger <wo...@4amlunch.net> wrote:
>> 
>> Great, look for one in the future when this happens again. :)
>> 
>> -brian
>> 
>>> On Jan 5, 2016, at 11:45 PM, Garrett D'Amore <garr...@damore.org> wrote:
>>> 
>>> Actually, what is probably the most useful is this command:
>>> 
>>> # echo ‘$>> 
>>> A full crash dump will have that inside it as well, but that first list of 
>>> threads (which will include the COMSTAR threads) and backtraces will 
>>> probably yield the most fruit for the least effort on your part.
>>> 
>>> On Tue, Jan 5, 2016 at 6:59 PM, Brian Hechinger <wo...@4amlunch.net> wrote:
>>> So this is the second time this has happened to me. The COMSTAR layer 
>>> appears to be getting hung. At first I thought it was just the IB/SRP 
>>> target stuff, but the iSCSI target also stops working. So far the only 
>>> solution I’ve found is a reboot.
>>> 
>>> This is very concerning and I’d like to try to get it figured out.
>>> 
>>> The next time it happens, what is the best course of action in order to get 
>>> the information you all need to debug this? I’m assuming force a crashdump, 
>>> but is there anything else I could be doing?
>>> 
>>> Thanks!
>>> 
>>> -brian
>>> 
>>> PS: Latest OmniOS-stable
>>> 


Re: [OmniOS-discuss] [discuss] COMSTAR hanging

2016-01-06 Thread Brian Hechinger
Great, look for one in the future when this happens again. :)

-brian

> On Jan 5, 2016, at 11:45 PM, Garrett D'Amore  wrote:
> 
> Actually, what is probably the most useful is this command:
> 
> # echo ‘$ 
> A full crash dump will have that inside it as well, but that first list of 
> threads (which will include the COMSTAR threads) and backtraces will 
> probably yield the most fruit for the least effort on your part.
> 
> On Tue, Jan 5, 2016 at 6:59 PM, Brian Hechinger <wo...@4amlunch.net> wrote:
> So this is the second time this has happened to me. The COMSTAR layer appears 
> to be getting hung. At first I thought it was just the IB/SRP target stuff, 
> but the iSCSI target also stops working. So far the only solution I’ve found 
> is a reboot.
> 
> This is very concerning and I’d like to try to get it figured out.
> 
> The next time it happens, what is the best course of action in order to get 
> the information you all need to debug this? I’m assuming force a crashdump, 
> but is there anything else I could be doing?
> 
> Thanks!
> 
> -brian
> 
> PS: Latest OmniOS-stable
> 


[OmniOS-discuss] COMSTAR hanging

2016-01-05 Thread Brian Hechinger
So this is the second time this has happened to me. The COMSTAR layer appears 
to be getting hung. At first I thought it was just the IB/SRP target stuff, but 
the iSCSI target also stops working. So far the only solution I’ve found is a 
reboot.

This is very concerning and I’d like to try to get it figured out.

The next time it happens, what is the best course of action to get the 
information you all need to debug this? I’m assuming I should force a crash 
dump, but is there anything else I could be doing?

Thanks!

-brian

PS: Latest OmniOS-stable


Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-17 Thread Brian Hechinger
And……

  pool: zoom
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          DEGRADED     0     0    25
          mirror-0    DEGRADED     0     0   150
            c4t1d0s1  DEGRADED     0     0   150  too many errors
            c5t1d0s1  DEGRADED     0     0   154  too many errors

So those patches didn’t help. :(

-brian

> On Dec 17, 2015, at 2:23 PM, Brian Hechinger  wrote:
> 
> Yeah, I think I already had one of them (the init()-related one that Hans 
> gave me), but I wonder if the other one is somehow related?
> 
> I’ve installed the updates.
> 
> I’ll re-create the pool and re-run iozone
> 
> -brian
> 
>> On Dec 17, 2015, at 2:21 PM, Dan McDonald  wrote:
>> 
>> 
>>> On Dec 17, 2015, at 2:20 PM, Brian Hechinger  wrote:
>>> 
>>> That seems……… unlikely to me?
>>> 
>>> I’ll put one of them into a linux box and see what happens with it.
>>> 
>>> Is there a way to somehow see if the nvme drivers are being wonky? I get 
>>> the feeling NVMe 1.1 cards aren’t completely supported just yet?
>> 
>> OH SHOOT!  I forgot these are NVMe.
>> 
>> Did you see my mail announcing the update?  Did you see it has two NVME 
>> fixes in it?
>> 
>> Dan
>> 
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-17 Thread Brian Hechinger
Yeah, I think I already had one of them (the init()-related one that Hans gave 
me), but I wonder if the other one is somehow related?

I’ve installed the updates.

I’ll re-create the pool and re-run iozone

-brian

> On Dec 17, 2015, at 2:21 PM, Dan McDonald  wrote:
> 
> 
>> On Dec 17, 2015, at 2:20 PM, Brian Hechinger  wrote:
>> 
>> That seems……… unlikely to me?
>> 
>> I’ll put one of them into a linux box and see what happens with it.
>> 
>> Is there a way to somehow see if the nvme drivers are being wonky? I get the 
>> feeling NVMe 1.1 cards aren’t completely supported just yet?
> 
> OH SHOOT!  I forgot these are NVMe.
> 
> Did you see my mail announcing the update?  Did you see it has two NVME fixes 
> in it?
> 
> Dan
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-17 Thread Brian Hechinger
That seems……… unlikely to me?

I’ll put one of them into a linux box and see what happens with it.

Is there a way to somehow see if the nvme drivers are being wonky? I get the 
feeling NVMe 1.1 cards aren’t completely supported just yet?

-brian

> On Dec 17, 2015, at 2:18 PM, Dan McDonald  wrote:
> 
>> 
>> On Dec 17, 2015, at 2:17 PM, Brian Hechinger  wrote:
>> 
>> Boom.
>> 
>> wonko@basket1:/export/home/wonko$ sudo zpool scrub zoom
>> Password:
>> wonko@basket1:/export/home/wonko$ sudo zpool status -v zoom
>> pool: zoom
>> state: DEGRADED
>> status: One or more devices has experienced an unrecoverable error.  An
>>   attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>   using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 226K in 0h0m with 0 errors on Thu Dec 17 14:15:12 2015
>> config:
>> 
>>   NAME          STATE     READ WRITE CKSUM
>>   zoom          DEGRADED     0     0     0
>>     mirror-0    DEGRADED     0     0     0
>>       c4t1d0s1  DEGRADED     0     0    38  too many errors
>>       c5t1d0s1  DEGRADED     0     0    42  too many errors
>> 
>> errors: No known data errors
> 
> Looks like you got bad drives.
> 
> Dan



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-17 Thread Brian Hechinger
Boom.

wonko@basket1:/export/home/wonko$ sudo zpool scrub zoom
Password:
wonko@basket1:/export/home/wonko$ sudo zpool status -v zoom
  pool: zoom
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 226K in 0h0m with 0 errors on Thu Dec 17 14:15:12 2015
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c4t1d0s1  DEGRADED     0     0    38  too many errors
            c5t1d0s1  DEGRADED     0     0    42  too many errors

errors: No known data errors
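
For anyone reading this in the archive: a minimal sketch of pulling the per-device checksum counts out of saved `zpool status` output, so that successive scrubs can be compared. The function name cksum_errors is made up for illustration, and it only understands plain integer counters (not abbreviated ones such as 1.2K).

```shell
#!/bin/sh
# Print "name cksum-count" for every row of the zpool status config table
# whose CKSUM column is non-zero. Reads the status text on stdin, so it can
# be tried on saved output as well as on a live pool.
cksum_errors() {
    awk '
        # Config rows have at least five fields, with numeric READ, WRITE
        # and CKSUM counters in fields 3 to 5. The header row and the
        # status/action/scan lines do not match this shape.
        NF >= 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ {
            if ($5 + 0 > 0) print $1, $5
        }
    '
}
```

Usage would be along the lines of `zpool status zoom | cksum_errors`, which prints one line per pool, vdev or device that has a non-zero checksum count.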

-brian

> On Dec 17, 2015, at 2:15 PM, Dan McDonald  wrote:
> 
> 
>> On Dec 17, 2015, at 2:05 PM, Brian Hechinger  wrote:
>> 
>> I can delete and create files just fine.
>> 
>> G.
> 
> Scrub it now.  Just in case.  A scrub is always a good idea anyway just to 
> make sure bits haven't rotted on the disk.
> 
> Dan
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-17 Thread Brian Hechinger
   =  169551.89 ops/sec
Avg throughput per thread                     =   13453.69 ops/sec
Min xfer                                      =    4281.00 ops

Children see throughput for 16 mixed workload =    8673.89 ops/sec
Parent sees throughput for 16 mixed workload  =    6341.03 ops/sec
Min throughput per thread                     =     442.73 ops/sec
Max throughput per thread                     =     641.36 ops/sec
Avg throughput per thread                     =     542.12 ops/sec
Min xfer                                      =  180991.00 ops

Children see throughput for 16 random writers =    4008.54 ops/sec
Parent sees throughput for 16 random writers  =    3972.48 ops/sec
Min throughput per thread                     =     248.54 ops/sec
Max throughput per thread                     =     252.76 ops/sec
Avg throughput per thread                     =     250.53 ops/sec
Min xfer                                      =  257769.00 ops

Children see throughput for 16 fwriters       =   70222.20 ops/sec
Parent sees throughput for 16 fwriters        =   65632.32 ops/sec
Min throughput per thread                     =    4132.12 ops/sec
Max throughput per thread                     =    4686.85 ops/sec
Avg throughput per thread                     =    4388.89 ops/sec
Min xfer                                      =  262144.00 ops




Error in file: Found ?0? Expecting ?7979797979797979? addr 29f6770
Error in file: Found ?0? Expecting ?7979797979797979? addr 29f6770
Error in file: Position 0
Error in file: Position 0
Record # 0 Record size 8 kb
Record # 0 Record size 8 kb
where 29f6770x loop 0
where 29f6770x loop 0

I can delete and create files just fine.

G.

-brian

> On Dec 9, 2015, at 11:27 AM, Brian Hechinger  wrote:
> 
> 
>> On Dec 9, 2015, at 11:22 AM, Dan McDonald  wrote:
>> 
>> 
>>> On Dec 9, 2015, at 11:18 AM, Brian Hechinger  wrote:
>>> 
>>> It’s brand new!!
>> 
>> Sometimes you get flaky HW that's new.  I've had to return new spinning-rust 
>> disks, for example.
> 
> Bah. :(
> 
>> 
>>> Also, I would expect the other slice to be affected as well?  It’s been 
>>> humming along just fine as SLOG with no errors:
>>> 
>>>  logs
>>>    mirror-3    ONLINE       0     0     0
>>>      c4t1d0s0  ONLINE       0     0     0
>>>      c5t1d0s0  ONLINE       0     0     0
>> 
>> Could just be bad luck your slog hasn't encountered the bad portion of this 
>> drive.
> 
> I suppose. Do you think there is maybe a good way to test this device 
> before I try to get it RMA-ed?
> 
>> Also, what OmniOS revision are you running? If you're not up to the latest 
>> November r151014 update, you may be missing some NVMe fixes.
> 
> Oh right, totally forgot to do that for you:
> 
> wonko@basket1:/var/adm$ head /etc/release ; uname -a
>  OmniOS v11 r151016
>  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
>  Use is subject to license terms.
> SunOS basket1 5.11 omnios-073d8c0 i86pc i386 i86pc
> 



Re: [OmniOS-discuss] Ang: Re: Ang: OpenSM for OmniOS

2015-12-11 Thread Brian Hechinger
That fails to clone, but I did manage to eventually find it on that site.

Got 3.3.19

It needs libibumad so I got that, but that explodes horribly when I try to 
build it. :(

Stuff like this:

./include/infiniband/umad.h:84:62: error: expected expression before 'uint32_t'
 #define IB_USER_MAD_UNREGISTER_AGENT _IOW(IB_IOCTL_MAGIC, 2, uint32_t)
  ^
src/umad.c:979:19: note: in expansion of macro 'IB_USER_MAD_UNREGISTER_AGENT'
  return ioctl(fd, IB_USER_MAD_UNREGISTER_AGENT, &agentid);

-brian

> On Dec 11, 2015, at 12:02 PM, Johan Kragsterman 
>  wrote:
> 
> Hi!
> 
> 
> 
> -Brian Hechinger  wrote: -
> To: Johan Kragsterman
> From: Brian Hechinger
> Date: 2015-12-11 17:51
> Cc: omnios-discuss
> Subject: Re: Ang: [OmniOS-discuss] OpenSM for OmniOS
> 
> Yeah, I’ve found that. The problem is I can’t find the source for 
> this or the patch.
> 
> Eric pointed me at 
> http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked
> 
> I’m trying to get that to clone (it fails, sigh). It’s at least 
> newer and maybe (hopefully) doesn’t need to be patched for 
> Solaris/Illumos?
> 
> -brian
> 
> 
> 
> 
> Did you try it from here:
> 
> 
> http://git.openfabrics.org/~alexnetes/opensm.git/
> 
> 
> Rgrds Johan
> 



Re: [OmniOS-discuss] Ang: OpenSM for OmniOS

2015-12-11 Thread Brian Hechinger
Yeah, I’ve found that. The problem is I can’t find the source for this or the 
patch.

Eric pointed me at 
http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked

I’m trying to get that to clone (it fails, sigh). It’s at least newer and maybe 
(hopefully) doesn’t need to be patched for Solaris/Illumos?

-brian

> On Dec 11, 2015, at 11:49 AM, Johan Kragsterman 
>  wrote:
> 
> Hi!
> 
> 
> -"OmniOS-discuss"  wrote: -
> To: omnios-discuss
> From: Brian Hechinger
> Sent by: "OmniOS-discuss"
> Date: 2015-12-11 17:35
> Subject: [OmniOS-discuss] OpenSM for OmniOS
> 
> I’ve found that supposedly this works. I just need to get a copy and 
> build it.
> 
> Does anyone know where I would get a copy?  I cannot find it for the life of 
> me!
> 
> Thanks,
> 
> -brian
> 
> 
> 
> 
> That software is pretty old, and I never tested it myself, but heard about 
> people successfully running it. Don't know about production, though...
> 
> I can give you some links:
> 
> https://syoyo.wordpress.com/category/infiniband/
> 
> https://github.com/syoyo/solaris-infiniband-tools
> 
> 
> Rgrds Johan
> 
> 
> 



[OmniOS-discuss] OpenSM for OmniOS

2015-12-11 Thread Brian Hechinger
I’ve found that supposedly this works. I just need to get a copy and build it.

Does anyone know where I would get a copy?  I cannot find it for the life of me!

Thanks,

-brian


Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger

> On Dec 9, 2015, at 11:22 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 11:18 AM, Brian Hechinger  wrote:
>> 
>> It’s brand new!!
> 
> Sometimes you get flaky HW that's new.  I've had to return new spinning-rust 
> disks, for example.

Bah. :(

> 
>> Also, I would expect the other slice to be affected as well?  It’s been 
>> humming along just fine as SLOG with no errors:
>> 
>>   logs
>>     mirror-3    ONLINE       0     0     0
>>       c4t1d0s0  ONLINE       0     0     0
>>       c5t1d0s0  ONLINE       0     0     0
> 
> Could just be bad luck your slog hasn't encountered the bad portion of this 
> drive.

I suppose. Do you think there is maybe a good way to test this device before I 
try to get it RMA-ed?

> Also, what OmniOS revision are you running? If you're not up to the latest 
> November r151014 update, you may be missing some NVMe fixes.

Oh right, totally forgot to do that for you:

wonko@basket1:/var/adm$ head /etc/release ; uname -a
  OmniOS v11 r151016
  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
  Use is subject to license terms.
SunOS basket1 5.11 omnios-073d8c0 i86pc i386 i86pc



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
Also, I would expect the other slice to be affected as well?  It’s been humming 
along just fine as SLOG with no errors:

        logs
          mirror-3    ONLINE       0     0     0
            c4t1d0s0  ONLINE       0     0     0
            c5t1d0s0  ONLINE       0     0     0

> On Dec 9, 2015, at 11:17 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 11:13 AM, Brian Hechinger  wrote:
>> 
>> I didn’t know about pgrep, no. :)
> 
> The Solaris/illumos ptools are a huge win.  Learn about 'em.  :)
> 
> Back to the main discussion...
> 
>> So the ‘zpool clear’ has fixed things a bit. The touch processes have all 
>> exited.
>> 
>> I can now touch a file on that pool.
>> 
>> A zpool scrub later and this is the status:
>> 
>> pool: zoom
>> state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>   attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>   using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
>> config:
>> 
>>   NAME          STATE     READ WRITE CKSUM
>>   zoom          ONLINE       0     0     0
>>     mirror-0    ONLINE       0     0     0
>>       c4t1d0s1  ONLINE       0     0     0
>>       c5t1d0s1  ONLINE       0     0     2
>> 
>> errors: No known data errors
>> 
>> I’m going to try to re-run iozone later and see if I can’t get it to happen 
>> again.
>> 
>> This is concerning.
> 
> I see this, and I think "c5t1d0" is broken HW and needs to be replaced.
> 
> Combine that with "unrecoverable IO failures" and you really should be 
> planning to replace that drive.
> 
> Dan
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
It’s brand new!!

-brian

> On Dec 9, 2015, at 11:17 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 11:13 AM, Brian Hechinger  wrote:
>> 
>> I didn’t know about pgrep, no. :)
> 
> The Solaris/illumos ptools are a huge win.  Learn about 'em.  :)
> 
> Back to the main discussion...
> 
>> So the ‘zpool clear’ has fixed things a bit. The touch processes have all 
>> exited.
>> 
>> I can now touch a file on that pool.
>> 
>> A zpool scrub later and this is the status:
>> 
>> pool: zoom
>> state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>   attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>   using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
>> config:
>> 
>>   NAME          STATE     READ WRITE CKSUM
>>   zoom          ONLINE       0     0     0
>>     mirror-0    ONLINE       0     0     0
>>       c4t1d0s1  ONLINE       0     0     0
>>       c5t1d0s1  ONLINE       0     0     2
>> 
>> errors: No known data errors
>> 
>> I’m going to try to re-run iozone later and see if I can’t get it to happen 
>> again.
>> 
>> This is concerning.
> 
> I see this, and I think "c5t1d0" is broken HW and needs to be replaced.
> 
> Combine that with "unrecoverable IO failures" and you really should be 
> planning to replace that drive.
> 
> Dan
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
I didn’t know about pgrep, no. :)

So the ‘zpool clear’ has fixed things a bit. The touch processes have all 
exited.

I can now touch a file on that pool.

A zpool scrub later and this is the status:

  pool: zoom
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t1d0s1  ONLINE       0     0     0
            c5t1d0s1  ONLINE       0     0     2

errors: No known data errors

I’m going to try to re-run iozone later and see if I can’t get it to happen 
again.

This is concerning.

The previous entry in messages is 4 days prior talking about ntpd.
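
The question of what comes right before the FMA complaint can also be checked mechanically. A minimal sketch, assuming the log is a plain text file such as /var/adm/messages and that the fmd entry contains the string SUNW-MSG-ID, as in the excerpt quoted below. lines_before is a hypothetical helper name, and the context count must be at least 1.

```shell
#!/bin/sh
# Print up to N lines that come immediately before the first line matching
# a pattern (default: the fmd "SUNW-MSG-ID" marker). Uses only POSIX awk,
# since grep -B is not available everywhere.
lines_before() {
    # $1 = log file, $2 = number of context lines (>= 1), $3 = optional pattern
    awk -v n="$2" -v pat="${3:-SUNW-MSG-ID}" '
        $0 ~ pat {
            # Replay the ring buffer, oldest line first, then stop.
            start = (NR > n) ? NR - n : 1
            for (i = start; i < NR; i++) print buf[i % n]
            exit
        }
        # Remember the most recent n lines in a ring buffer.
        { buf[NR % n] = $0 }
    ' "$1"
}
```

Something like `lines_before /var/adm/messages 5` would then show the five entries logged just before the first fmd report, which is where a driver complaint would be expected if there was one.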

-brian

> On Dec 9, 2015, at 10:25 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 10:20 AM, Brian Hechinger  wrote:
>> 
>> I cannot ^C out of the touch.
>> 
>> wonko@basket1:/export/home/wonko$ ps -ef | grep touch
> 
> You do know about pgrep(1), right?  :)
> 
>> Also, kill -9 doesn’t touch them.
> 
> Okay!  This means something in-kernel is locking them up.  More reason for a 
> coredump.
> 
>> the only thing in messages is:
>> 
>> Dec  7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: 
>> ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
>> Dec  7 14:31:56 basket1 EVENT-TIME: Mon Dec  7 14:31:56 EST 2015
>> Dec  7 14:31:56 basket1 PLATFORM: X8DTL, CSN: 1234567890, HOSTNAME: basket1
>> Dec  7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
>> Dec  7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
>> Dec  7 14:31:56 basket1 DESC: The ZFS pool has experienced currently 
>> unrecoverable I/O
>> Dec  7 14:31:56 basket1 failures.  Refer to 
>> http://illumos.org/msg/ZFS-8000-HC for more information.
>> Dec  7 14:31:56 basket1 AUTO-RESPONSE: No automated response will be taken.
>> Dec  7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
>> Dec  7 14:31:56 basket1 REC-ACTION: Make sure the affected devices are 
>> connected, then run
>> Dec  7 14:31:56 basket1 'zpool clear'.
> 
> You sure there's nothing before the FMA complaints?  It might be one line, 
> but it may be enough to show something.
> 
>> I can definitely share a kernel coredump, that’s not a problem. Just need to 
>> schedule a time to shut down all the VMs first.
> 
> Take your time, do it on your schedule, that's fine.
> 
> So I know where to put it:  Which OmniOS release are you running?
> 
>   head /etc/release ; uname -a
> 
> Thanks,
> Dan
> 



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
I cannot ^C out of the touch.

wonko@basket1:/export/home/wonko$ ps -ef | grep touch
root  2459  2447   0 08:12:09 ?   0:00 touch /zoom/hi
root  2050  2049   0   Dec 07 ?   0:00 touch hi
root  2049 1   0   Dec 07 ?   0:00 sudo touch hi

Also, kill -9 doesn’t touch them.

the only thing in messages is:

Dec  7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-HC, 
TYPE: Error, VER: 1, SEVERITY: Major
Dec  7 14:31:56 basket1 EVENT-TIME: Mon Dec  7 14:31:56 EST 2015
Dec  7 14:31:56 basket1 PLATFORM: X8DTL, CSN: 1234567890, HOSTNAME: basket1
Dec  7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
Dec  7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
Dec  7 14:31:56 basket1 DESC: The ZFS pool has experienced currently 
unrecoverable I/O
Dec  7 14:31:56 basket1 failures.  Refer to 
http://illumos.org/msg/ZFS-8000-HC for more information.
Dec  7 14:31:56 basket1 AUTO-RESPONSE: No automated response will be taken.
Dec  7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
Dec  7 14:31:56 basket1 REC-ACTION: Make sure the affected devices are 
connected, then run
Dec  7 14:31:56 basket1 'zpool clear'.

I can definitely share a kernel coredump, that’s not a problem. Just need to 
schedule a time to shut down all the VMs first.

Maybe later tonight.

-brian

> On Dec 9, 2015, at 10:16 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 8:14 AM, Brian Hechinger  wrote:
>> 
>> So read access appears to be ok. Writes are totally boned, however.  That 
>> touch just hangs forever.
>> 
>> So what do I need to do to provide you all with the information you need to 
>> diagnose this?
> 
> Do you literally have a touch process hanging right now?  Or is it something 
> you can ^C out of?
> 
> Does anything stand out in /var/adm/messages?  Maybe the kernel is 
> complaining about something there.
> 
> My final inclination is heavy-handed:
> 
>   - Make sure you have at least one process stuck on writing to that 
> filesystem.
> 
>   - "reboot -d" and take a kernel coredump
> 
> Unless you have sensitive information, a kernel coredump you can share would 
> be the best thing to do.
> 
> 
> Dan
> 
> p.s. I'm at the Dr. the rest of the day starting in 90 mins, pardon any 
> latency.



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
Just did a ‘zpool clear’ on that pool and now I see:

errors: Permanent errors have been detected in the following files:

:<0x59>

> On Dec 9, 2015, at 10:16 AM, Dan McDonald  wrote:
> 
> 
>> On Dec 9, 2015, at 8:14 AM, Brian Hechinger  wrote:
>> 
>> So read access appears to be ok. Writes are totally boned, however.  That 
>> touch just hangs forever.
>> 
>> So what do I need to do to provide you all with the information you need to 
>> diagnose this?
> 
> Do you literally have a touch process hanging right now?  Or is it something 
> you can ^C out of?
> 
> Does anything stand out in /var/adm/messages?  Maybe the kernel is 
> complaining about something there.
> 
> My final inclination is heavy-handed:
> 
>   - Make sure you have at least one process stuck on writing to that 
> filesystem.
> 
>   - "reboot -d" and take a kernel coredump
> 
> Unless you have sensitive information, a kernel coredump you can share would 
> be the best thing to do.
> 
> 
> Dan
> 
> p.s. I'm at the Dr. the rest of the day starting in 90 mins, pardon any 
> latency.



Re: [OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
Sorry, typo-ed that.

These are SM951

-brian

> On Dec 9, 2015, at 9:02 AM, Davide Poletto  wrote:
> 
> Hi Brian,
> 
> a side note: are you sure that your Samsung 851 drive (I think you're 
> referring more specifically to the Samsung PM851 SSD Drive) supports the NVMe 
> interface standard?
> 
> I think it doesn't...at least looking at its released interface's 
> specifications: it uses SATA 3 (6.0 Gbps) interface instead of the NVMe 1.1 
> used by "disks" like the Samsung PM/SM951, PM1725, XS/SM1715 or the 
> PM/SM953...just to name some.
> 
> Regards, Davide.
> 
> On Wed, Dec 9, 2015 at 2:14 PM, Brian Hechinger wrote:
> So I decided to do some testing on the pool I have that is made up of a pair 
> of Samsung 851 NVMe drives.
> 
> I’ve got it partitioned as I’m using part of it to test as SLOG against the 
> “spinning rust pool”. Yes I know these aren’t ideal for this, but they will 
> do for now.
> 
> I set up the other slices as a mirror and ran iozone against it.
> 
> It wrote fast. Really fast.
> 
> Then it stopped.
> 
> Now the pool seems to be wedged. At first I thought it might be the drives 
> themselves, but I see them still functioning as SLOG just fine, so I don’t 
> believe it’s that.
> 
> root@basket1:/root# zpool status -v zoom
>   pool: zoom
>  state: ONLINE
> status: One or more devices are faulted in response to IO failures.
> action: Make sure the affected devices are connected, then run 'zpool clear'.
>    see: http://illumos.org/msg/ZFS-8000-HC
>   scan: none requested
> config:
> 
>   NAME          STATE     READ WRITE CKSUM
>   zoom          ONLINE       0     0     1
>     mirror-0    ONLINE       0     0     6
>       c4t1d0s1  ONLINE       0     0     6
>       c5t1d0s1  ONLINE       0     0     6
> 
> errors: List of errors unavailable (insufficient privileges)
> root@basket1:/root# ls /zoom/
> iozone.DUMMY.0  iozone.DUMMY.10  iozone.DUMMY.12  iozone.DUMMY.14  
> iozone.DUMMY.2  iozone.DUMMY.4  iozone.DUMMY.6  iozone.DUMMY.8
> iozone.DUMMY.1  iozone.DUMMY.11  iozone.DUMMY.13  iozone.DUMMY.15  
> iozone.DUMMY.3  iozone.DUMMY.5  iozone.DUMMY.7  iozone.DUMMY.9
> root@basket1:/root# touch /zoom/hi
> 
> So read access appears to be ok. Writes are totally boned, however.  That 
> touch just hangs forever.
> 
> So what do I need to do to provide you all with the information you need to 
> diagnose this?
> 
> Thanks!
> 
> -brian
> 



[OmniOS-discuss] Hung ZFS Pool

2015-12-09 Thread Brian Hechinger
So I decided to do some testing on the pool I have that is made up of a pair of 
Samsung 851 NVMe drives.

I’ve got it partitioned as I’m using part of it to test as SLOG against the 
“spinning rust pool”. Yes I know these aren’t ideal for this, but they will do 
for now.

I set up the other slices as a mirror and ran iozone against it.

It wrote fast. Really fast.

Then it stopped.

Now the pool seems to be wedged. At first I thought it might be the drives 
themselves, but I see them still functioning as SLOG just fine, so I don’t 
believe it’s that.

root@basket1:/root# zpool status -v zoom
  pool: zoom
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     1
          mirror-0    ONLINE       0     0     6
            c4t1d0s1  ONLINE       0     0     6
            c5t1d0s1  ONLINE       0     0     6

errors: List of errors unavailable (insufficient privileges)
root@basket1:/root# ls /zoom/
iozone.DUMMY.0  iozone.DUMMY.10  iozone.DUMMY.12  iozone.DUMMY.14  
iozone.DUMMY.2  iozone.DUMMY.4  iozone.DUMMY.6  iozone.DUMMY.8
iozone.DUMMY.1  iozone.DUMMY.11  iozone.DUMMY.13  iozone.DUMMY.15  
iozone.DUMMY.3  iozone.DUMMY.5  iozone.DUMMY.7  iozone.DUMMY.9
root@basket1:/root# touch /zoom/hi

So read access appears to be ok. Writes are totally boned, however.  That touch 
just hangs forever.

So what do I need to do to provide you all with the information you need to 
diagnose this?

Thanks!

-brian


Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger

> On Nov 17, 2015, at 4:28 PM, Brian Hechinger  wrote:
> 
>> 
>> On Nov 17, 2015, at 3:53 PM, Dan McDonald  wrote:
>> 
>> 
>>> On Nov 17, 2015, at 3:23 PM, Brian Hechinger  wrote:
>>> 
>>> prtconf output:
>>> 
>>> https://gist.github.com/bhechinger/01ba826eb8e0415e4530
>>> 
>>> I trimmed this to just the relevant entry for the Samsung drive
>> 
>> Thank you.  Yes, there's no reference to the single NVMe entry that exists 
>> today:
>> 
>> bloody(~/ws/illumos-omnios)[0]% grep nvme /etc/driver_aliases
>> nvme "pciex8086,953"
>> bloody(~/ws/illumos-omnios)[0]% 
>> 
>> But that PCIe-class is worth a shot.
>> 
>>> I’m willing to try anything. I’ll let you know how it goes. :)
>> 
>> If using pciclass works, it's potentially a worthy addition to illumos-gate 
>> upstream.  It appears the code doesn't check for the specific 8086,953 ID 
>> mentioned above, so there's a better chance it'll work.
>> 
>> You REALLY need to bring this up on the illumos developer's list, and if you 
>> can't, I can.  ESPECIALLY if it works.
> 
> I tried adding the pciclass and that had no effect.

A quick update here. Adding this causes the thing to kernel panic every time it 
tries to load the nvme driver and talk to these drives. :(

-brian


Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
The only thing in that log matching NVMe is a number of these in quick 
succession:

Nov 17 13:00:25 basket1 genunix: [ID 819705 kern.notice] 
/kernel/drv/amd64/nvme: undefined symbol
Nov 17 13:00:25 basket1 genunix: [ID 472681 kern.notice] WARNING: mod_load: 
cannot load module 'nvme'

-brian

> On Nov 17, 2015, at 4:29 PM, Dan McDonald  wrote:
> 
> 
>> On Nov 17, 2015, at 4:28 PM, Brian Hechinger  wrote:
>> 
>> I’ll take this up with the illumos list, thanks!
> 
> Check your /var/adm/messages before doing so.  Maybe nvme-the-driver 
> complained about something?  (I'd guess NVMe version is beyond 1.0...)
> 
> Dan
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger

> On Nov 17, 2015, at 3:53 PM, Dan McDonald  wrote:
> 
> 
>> On Nov 17, 2015, at 3:23 PM, Brian Hechinger  wrote:
>> 
>> prtconf output:
>> 
>> https://gist.github.com/bhechinger/01ba826eb8e0415e4530
>> 
>> I trimmed this to just the relevant entry for the Samsung drive
> 
> Thank you.  Yes, there's no reference to the single NVMe entry that exists 
> today:
> 
> bloody(~/ws/illumos-omnios)[0]% grep nvme /etc/driver_aliases
> nvme "pciex8086,953"
> bloody(~/ws/illumos-omnios)[0]% 
> 
> But that PCIe-class is worth a shot.
> 
>> I’m willing to try anything. I’ll let you know how it goes. :)
> 
> If using pciclass works, it's potentially a worthy addition to illumos-gate 
> upstream.  It appears the code doesn't check for the specific 8086,953 ID 
> mentioned above, so there's a better chance it'll work.
> 
> You REALLY need to bring this up on the illumos developer's list, and if you 
> can't, I can.  ESPECIALLY if it works.

I tried adding the pciclass and that had no effect.
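
For reference, a minimal sketch of checking whether the class-based alias is actually present in a driver_aliases file. has_nvme_class_alias is a hypothetical helper name, and the alias string is the one suggested earlier in this thread, not something confirmed to work with these drives.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given driver_aliases file binds the nvme
# driver to the NVMe PCI class alias "pciclass,010802".
has_nvme_class_alias() {
    # $1 = path to a driver_aliases file (defaults to /etc/driver_aliases)
    grep -q '^nvme[[:space:]]\{1,\}"pciclass,010802"' "${1:-/etc/driver_aliases}"
}
```

Running `has_nvme_class_alias && echo present` before and after the reboot would at least confirm that the hand edit survived and is spelled the way the driver framework expects.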

I’ll take this up with the illumos list, thanks!

-brian


Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
prtconf output:

https://gist.github.com/bhechinger/01ba826eb8e0415e4530

I trimmed this to just the relevant entry for the Samsung drive

The exact card is this:

http://www.amazon.com/gp/product/B00VK0XMPE

I’m willing to try anything. I’ll let you know how it goes. :)

-brian

> On Nov 17, 2015, at 3:14 PM, Dan McDonald  wrote:
> 
> 
>> On Nov 17, 2015, at 2:57 PM, Brian Hechinger  wrote:
>> 
>> Installed r151016, but…
>> 
>> root@basket1:/root# prtconf -d | grep Samsung
>>   pci144d,a801 (pciex144d,a802) [Samsung Electronics Co Ltd unknown 
>> device] (driver not attached)
>>   pci144d,a801 (pciex144d,a802) [Samsung Electronics Co Ltd unknown 
>> device] (driver not attached)
> 
> Please share "prtconf -v"?  Also...
> 
>   https://pci-ids.ucw.cz/read/PC/144d/a800
> 
> I have no idea if this is an NVMe or not.  The "prtconf -v" will show 
> aliases, which may or may not provide more information.
> 
> I wonder if it's beyond NVMe 1.0?  Perhaps...
> 
> Also, I wonder if a PCI Class entry needs to land in /etc/driver_aliases?  I 
> found this:
> 
>   https://lists.debian.org/debian-boot/2015/09/msg00167.html
> 
> Suggesting that MAYBE we need this in /etc/driver_aliases:
> 
>   nvme "pciclass,010802" 
> 
> You should mention this on the illumos developer list.  You also COULD put 
> that entry in by hand, reboot, and see THEN if it pops up as a device?
> 
> Dan
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
Installed r151016, but…

root@basket1:/root# prtconf -d | grep Samsung
pci144d,a801 (pciex144d,a802) [Samsung Electronics Co Ltd unknown 
device] (driver not attached)
pci144d,a801 (pciex144d,a802) [Samsung Electronics Co Ltd unknown 
device] (driver not attached)
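
For reference, a rough way to pull every unattached PCI node out of that kind of output in one go. This is only a sketch: unattached_ids is a made-up helper name, and it assumes each device sits on a single unwrapped line with the parenthesised ID as the second field (the mail client wrapped the lines above).

```shell
#!/bin/sh
# Sketch: read `prtconf -d` style output on stdin and print
# "node-name pci-id" for every PCI node with no driver attached.
# unattached_ids is a hypothetical helper name.
unattached_ids() {
    awk '/\(driver not attached\)/ && $2 ~ /^\(pci/ {
        ident = $2
        gsub(/[()]/, "", ident)   # strip the surrounding parentheses
        print $1, ident
    }'
}

# Example:
#   prtconf -d | unattached_ids
```

The second column is the ID you would look for in /etc/driver_aliases.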

-brian

> On Nov 17, 2015, at 2:45 PM, Dan McDonald  wrote:
> 
> 
>> On Nov 17, 2015, at 2:38 PM, Brian Hechinger  wrote:
>> 
>> Now to get the NVMe drives recognized. :)
> 
> The installer is not going to find 'em until at least this:
> 
>   https://illumos.org/issues/6232
> 
> gets fixed.
> 
> OTOH, Kayak should see them, as should installed 014 and beyond.
> 
> Dan
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
Yeah, the onboard 1068 (and now the PERC 6i) are just a stopgap until I get a 
proper HBA (hopefully a cheap 9201-16i)

Now to get the NVMe drives recognized. :)

-brian

> On Nov 17, 2015, at 2:22 PM, John D Groenveld  
> wrote:
> 
> In message <564acc31.2060...@gmail.com>, Mark writes:
>> I used to work with the Supermicro Servers and used embedded and cards 
>> with OpenSolaris. The older firmware seems to have evaporated from 
>> Supermicro's ftp site.
>> 
>> When Supermicro stopped releasing newer versions, I used the matching 
>> LSI ones without issues.
> 
> Avago has broken many of the LSI links but I found a newer
> version of the old sasflash.exe for DOS.
> 
> But Brian's instincts are probably right and he's best off 
> buying a lightly used LSI SAS 6Gb HBA off eBay and
> flashing to the IT firmware.
> 
> John
> groenv...@acm.org
> 
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
AH HA!

Lame BIOS.

There are only 6 “slots” in the boot device selection drop down.

Pulled the two SATA disks hanging off of the AHCI ports and the PERC showed up 
in the list.

-brian

> On Nov 17, 2015, at 11:59 AM, Brian Hechinger  wrote:
> 
> So I’ve solved this by decommissioning an older server (was planning on that 
> anyway) and stealing the PERC 6i out of it.
> 
> That shows up properly (which isn’t surprising, the machine it was in ran 
> OI for years).
> 
> Now to try to figure out how to get the BIOS to want to boot from it.
> 
> /me sighs
> 
> -brian
> 
>> On Nov 17, 2015, at 1:41 AM, Mark  wrote:
>> 
>> I used to work with the Supermicro Servers and used embedded and cards with 
>> OpenSolaris. The older firmware seems to have evaporated from Supermicro's 
>> ftp site.
>> 
>> When Supermicro stopped releasing newer versions, I used the matching LSI 
>> ones without issues.
>> 
>> I'll dig in my archives and see what info I can find.
>> 
>> 
>> 
>> On 17/11/2015 2:05 a.m., wo...@4amlunch.net wrote:
>>> This is built into the motherboard. This isn't an add on card.
>>> 
>>> As far as I know (and let's be honest about how much I know here) this is a 
>>> pretty vanilla firmware. It's the SuperMicro provided one.
>>> 
>>> -brian
>>> 
>>>> On Nov 16, 2015, at 02:43, Mark  wrote:
>>>> 
>>>> Check the PCI slot's speed setting in setup.
>>>> Many older cards need the slot speed lowered to work reliably.
>>>> 
>>>> I also seem to recall there may be different LSI firmware that supports 
>>>> different SAS bus negotiation upper limits that can also cause issues.
>>>> 
>>>> Mark.
>>>> 
>>>> 
>>>>> On 16/11/2015 12:24 p.m., Brian Hechinger wrote:
>>>>> 
>>>>>> On Nov 15, 2015, at 6:11 PM, John D Groenveld  
>>>>>> wrote:
>>>>>> 
>>>>>> In message , Brian 
>>>>>> Hechinger
>>>>>> writes:
>>>>>>> WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
>>>>>>>   LSI PCI device (1000,59) not supported.
>>>>>> 
>>>>>> Does mega_sas(7D) also fail to attach to 1000,59?
>>>>> 
>>>>> It doesn’t complain but it also doesn’t work.
>>>>> 
>>>>> and now fault manager is yelling about a PCIE device fault.
>>>>> 
>>>>> So I’ll say no. :)
>>>>> 
>>>>> -brian
>>>>> 
>>>> 
>> 
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-17 Thread Brian Hechinger
So I’ve solved this by decommissioning an older server (was planning on that 
anyway) and stealing the PERC 6i out of it.

That shows up properly (which isn’t surprising, the machine it was in ran OI 
for years).

Now to try to figure out how to get the BIOS to want to boot from it.

/me sighs

-brian

> On Nov 17, 2015, at 1:41 AM, Mark  wrote:
> 
> I used to work with the Supermicro Servers and used embedded and cards with 
> OpenSolaris. The older firmware seems to have evaporated from Supermicro's 
> ftp site.
> 
> When Supermicro stopped releasing newer versions, I used the matching LSI 
> ones without issues.
> 
> I'll dig in my archives and see what info I can find.
> 
> 
> 
> On 17/11/2015 2:05 a.m., wo...@4amlunch.net wrote:
>> This is built into the motherboard. This isn't an add on card.
>> 
>> As far as I know (and let's be honest about how much I know here) this is a 
>> pretty vanilla firmware. It's the SuperMicro provided one.
>> 
>> -brian
>> 
>>> On Nov 16, 2015, at 02:43, Mark  wrote:
>>> 
>>> Check the PCI slot's speed setting in setup.
>>> Many older cards need the slot speed lowered to work reliably.
>>> 
>>> I also seem to recall there may be different LSI firmware that supports 
>>> different SAS bus negotiation upper limits that can also cause issues.
>>> 
>>> Mark.
>>> 
>>> 
>>>> On 16/11/2015 12:24 p.m., Brian Hechinger wrote:
>>>> 
>>>>> On Nov 15, 2015, at 6:11 PM, John D Groenveld  
>>>>> wrote:
>>>>> 
>>>>> In message , Brian 
>>>>> Hechinger
>>>>> writes:
>>>>>> WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
>>>>>>LSI PCI device (1000,59) not supported.
>>>>> 
>>>>> Does mega_sas(7D) also fail to attach to 1000,59?
>>>> 
>>>> It doesn’t complain but it also doesn’t work.
>>>> 
>>>> and now fault manager is yelling about a PCIE device fault.
>>>> 
>>>> So I’ll say no. :)
>>>> 
>>>> -brian
>>>> 
>>> 
> 



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-15 Thread Brian Hechinger

> On Nov 15, 2015, at 6:11 PM, John D Groenveld  
> wrote:
> 
> In message , Brian 
> Hechinger
> writes:
>> WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
>>LSI PCI device (1000,59) not supported.
> 
> Does mega_sas(7D) also fail to attach to 1000,59?

It doesn’t complain but it also doesn’t work.

and now fault manager is yelling about a PCIE device fault.

So I’ll say no. :)

-brian



Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-15 Thread Brian Hechinger

> On Nov 15, 2015, at 5:52 PM, Brian Hechinger  wrote:
> 
> I can re-flash this to IR and see what the result is there, but something 
> tells me it’s going to be the same (other than the BIOS showing those drives 
> again)

Or not because the flash util crashes. :(

-brian


Re: [OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-15 Thread Brian Hechinger

> On Nov 15, 2015, at 5:16 PM, John D Groenveld  
> wrote:
> 
> In message <9f7a6807-3818-4668-8caf-d023daabf...@4amlunch.net>, Brian 
> Hechinger
> writes:
>> The installer wouldn't see the 4x SAS disks connected to the onboard LSI 
>> 1068e.
>> prtconf -d showed that the driver wasn't getting attached to the card.
>> 
>> I flashed the controller with the IT firmware to see if that would help (and
>> as I wanted to do that anyway).
> 
> On cold-boot to FreeDOS, does LSI's sasflash still list the HBA?

Just checked, and yes.

1068E(B3)

> And if so, which firmware?

01.30.00.00 (IT)

> 
>> This has led to the following situation:
>> 
>> 1) It still shows the LSI card as (driver not attached)
>> 2) The BIOS no longer sees the disks on the 1068e as bootable
>> 
>> This is really super frustrating. :(
>> 
>> How do I get these drives seen by OmniOS? (and bootable!)
> 
> $ prtconf -pv|egrep 'pci.*1000'
> 
> And confirm it is listed in /etc/driver_aliases

What I see from 'prtconf -d | grep LSI' are two IDs: pci15d9,6 
(pciex1000,59)

Both of those are listed in the output of "prtconf -pv|egrep 'pci.*1000'"

Neither are in /etc/driver_aliases
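
In case it helps anyone checking the same thing, here is a small sketch of that lookup. alias_owner is a made-up helper name, and it assumes the usual driver_aliases layout of one `driver "alias"` pair per line.

```shell
#!/bin/sh
# Sketch: print the driver that claims a given PCI ID in a
# driver_aliases-style file; exit non-zero if nothing claims it.
# alias_owner is a hypothetical helper, not an illumos command.
alias_owner() {
    id=$1
    file=${2:-/etc/driver_aliases}
    # Compare against the whole quoted alias, so pci1000,5 does not
    # accidentally match pci1000,59.
    awk -v id="\"$id\"" '
        $2 == id { print $1; found = 1 }
        END { exit !found }
    ' "$file"
}

# Example:
#   alias_owner pciex1000,59 || echo "no driver claims pciex1000,59"
```

Matching on the full quoted field is the point of using awk here instead of a bare grep, which would also hit longer IDs that merely start with the same digits.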

Adding pciex1000,59 to mpt results in this at boot:

WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
 LSI PCI device (1000,59) not supported.
WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
 mpt_config_space_init failed
WARNING: /pci@0,0/pci8086,340a@3/pci159d,6@0 (mpt0):
 attach failed

I can re-flash this to IR and see what the result is there, but something tells 
me it’s going to be the same (other than the BIOS showing those drives again)

-brian


[OmniOS-discuss] Can't get OmniOS to see LSI 1068e on SuperMicro X8DTL-3F

2015-11-15 Thread Brian Hechinger
I’ve got a X8DTL-3F that I’m attempting to install OmniOS onto.

The installer wouldn’t see the 4x SAS disks connected to the onboard LSI 1068e. 
prtconf -d showed that the driver wasn’t getting attached to the card.

I flashed the controller with the IT firmware to see if that would help (and as 
I wanted to do that anyway).

This has led to the following situation:

1) It still shows the LSI card as (driver not attached)
2) The BIOS no longer sees the disks on the 1068e as bootable

This is really super frustrating. :(

How do I get these drives seen by OmniOS? (and bootable!)

Additionally (while I have your attention) there are a pair of NVMe drives in 
PCIe adapters that are also showing up as (driver not attached) that I would 
really like to get working as those are to be my SLOG devices. Samsung SM951 if 
it matters.

-brian