Greetings to smartos-discuss,

Here is a forward of an email that I sent to [email protected] 
on 9 October 2014, when I was not subscribed to the list.

I don't see my message in the archives, so I have subscribed to the list and 
am trying again to get this message onto the list.

Regards,

Steve

* * *

Steve Petrie, P.Eng.

ITS-ETO Consortium
Oakville, Ontario, Canada
(905) 847-3253
[email protected]

----- Original Message ----- 
From: Steve Petrie, P.Eng. 
To: [email protected] 
Cc: Elastic Hosts - Support 
Sent: Thursday, October 09, 2014 2:50 PM
Subject: SmartOS setup & boot Errors -- Testing SmartOS On An Elastic Hosts VM


Greetings To smartos-discuss,

I am working to get SmartOS operational under a VM provisioned on the Elastic 
Hosts (EH) cloud infrastructure (www.elastichosts.com).

On a trial EH VM, SmartOS setup and boot, running from the ISO CD image on an 
EH cloud disk, currently gives four messages that seem particularly 
noteworthy:
  a.. trap: unknown trap type 8 in user mode 
  b.. Notice: MPO disabled because memory is disabled 
  c.. WARNING: KVM no hardware support 
  d.. WARNING rtls0: failure resetting PHY
I would very much appreciate any comments smart-os list members would care to 
make. The KVM warning is not a big issue for my particular application.

EH support advises me that the "trap" message could be of particular concern. 
They also advise trying to disable ACPI.

* * *
* * *

At the http://wiki.smartos.org/display/DOC/Home web page, a Google custom 
search for the string:
  a.. smartos setup message "trap: unknown trap type 8 in user mode" ?
yields a single result, being the illumos IRC log of [August 18 - 2011], a copy 
of which is attached to this email, in edited form:
  a.. <SmartOS -- Elastic Hosts VM - _Unknown trap type 8 in user mode_ - boot 
message - edited search results - 20141409.txt>
* * *
* * *

As a help to smart-os list participants, here are snippets (emphasis added) 
selected from the attached edited IRC log:
  a.. [01:51:17] <grey> I'm getting "Unknown trap type 8 in user mode" right 
after it starts trying to load the Kernel
  ...
  [01:57:25] <richlowe> I've seen the unknown usermode trap before
    
  b.. [02:06:34] <richlowe> could be ACPI, I suppose.
  [02:06:48] <grey> It's possible I didn't give nexenta enough time, in 
hindsight, that message may not have been fatal?
    
  c.. [02:06:49] <lewellyn> grey: does OI boot?
  ...
  [02:07:11] <richlowe> grey: it is.
  [02:07:13] <grey> Nexenta booted up this time
  [02:07:20] <richlowe> drat.
    
  d.. [02:08:00] <richlowe> last I saw it, we took the "bogus interrupt" early 
in boot, took it to be a trapped, and tried to core a process that didn't exist.
  ...
  [02:09:30] <richlowe> lewellyn: if it involved counter underflow, that was 
different.
  ...
  [02:09:55] <lewellyn> i don't remember details. that's why i have computers. 
but in any case, it was something i didn't want to even get near.
  [02:09:59] <richlowe> if you booted with kmdb loaded, and you can get there, 
::interrupts might be useful.
    
  e.. [02:16:13] <richlowe> got it.
  ...
  [02:16:20] <richlowe> this is the !@#$% intel_nhmex pain.
  ...
  [02:16:25] <richlowe> grey: can you modify your boot image?
  [02:16:33] <grey> can I? you tell me ;)
  [02:16:45] <richlowe> intel_nhmex (and intel_nhm, actually) do actual things 
in _init, presumably mistaking them for _attach
    
  f.. [02:17:00] <richlowe> when they do this on certain CPUs, it traps and the 
whole world blows up
  [02:17:08] <richlowe> if you're looking, you just get a warning from ACPI
  ...
  [02:17:40] <lewellyn> i get messages at boot regarding acpi and then things 
go wonky later
  [02:17:43] <richlowe> I went through hell on this older athlon64 chasing that 
down.
  [02:18:33] <richlowe> it turns out that this used to crash Xen too, and 
rather than fixing it, they just made the module not load under i86xpv
    
  g.. [02:19:03] <richlowe> grey: if you can delete /kernel/**/intel_nhm* you 
may come up.
  [02:19:11] <richlowe> I don't think -Bdisable-intel_nhmex works
    
  h.. [02:19:43] <grey> richlowe: ok, I'll take a crack at that (I don't 
suppose you know off the top of your head the easiest way to modify an ISO 
image in that fashion?, I'll google around for it)
  [02:20:06] <richlowe> I think on the iso, you'd have to unpack it, delete the 
modules, and then rebuild its boot_archive.
    
  i.. [02:22:49] <grey> richlowe: that's the kernel/**/intel_nhm* under 
platform/i86pc/ ?
  [02:23:15] <richlowe> under /kernel/drv, I think the other is the cpu mod
  [02:24:46] <grey> this is on the nexenta distro? I don't see any intel_nhm* 
files anywhere in the entire tree
  ...
  [02:25:43] <grey> http://pastebin.com/Q9Nzb30D  There's the directory 
structure
  [02:26:28] <grey> oh, is it inside the usr.img.zlib?
  [02:27:11] <richlowe> If that's OI media, I have no idea where it'd be on the 
install media
  [02:27:33] <grey> that's from nexenta-core-platform_3.0.1-b134_x86.iso
    
  j.. [02:33:21] <richlowe> sort of interesting would be, boot with -kvd ( hit 
'e' in grub, add -kvd to the kernel$ line), and hit 'b' to boot.  it'll stop 
saying [0]>.  Type "::bp pcplusmp`apic_error_intr -c $C" then :c
  [02:34:07] <richlowe> hopefully it'll hit the breakpoint and dump the stack, 
rather than exploding.
  ...
  [02:34:20] <richlowe> if it doesn't, I think you found a new way to get a 
random interrupt.
  [02:34:42] <richlowe> if it happened to log APIC errors, even when it worked, 
that'd help to place the blame on nhmex again, too.
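
The deletion step richlowe suggests above (removing /kernel/**/intel_nhm* from 
an unpacked image) could start with a dry-run listing such as the sketch below. 
It is Python, it only lists candidate files and deletes nothing, and it assumes 
the ISO or boot_archive has already been unpacked into a local directory. The 
function name and the demo directory layout are hypothetical, and whether these 
modules are present on SmartOS media at all is not confirmed by the log.

```python
import tempfile
from pathlib import Path


def find_intel_nhm_modules(unpacked_root):
    """List files named intel_nhm* that sit below any 'kernel' directory.

    unpacked_root is assumed to be a directory holding an already unpacked
    image (hypothetical layout); nothing is deleted here.
    """
    root = Path(unpacked_root)
    matches = []
    for path in root.rglob("intel_nhm*"):
        relative_parts = path.relative_to(root).parts
        # Keep only regular files with a 'kernel' directory above them.
        if path.is_file() and "kernel" in relative_parts[:-1]:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Demo on a throwaway tree; replace with the real unpacked directory.
    with tempfile.TemporaryDirectory() as tmp:
        drv = Path(tmp) / "kernel" / "drv" / "amd64"
        drv.mkdir(parents=True)
        (drv / "intel_nhm").touch()
        (drv / "intel_nhmex").touch()
        (drv / "rtls").touch()
        for module in find_intel_nhm_modules(tmp):
            print(module.relative_to(tmp))
```

Reviewing the list before deleting anything seems prudent, since the log 
itself leaves open whether the files live under /kernel/drv or under 
platform/i86pc/, and the boot_archive would still need to be rebuilt 
afterwards.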
* * *
* * *

Would smart-os list members have any knowledge of whether this "trap" issue 
was ever resolved?

In the interest of getting SmartOS to work on the EH cloud platform, if this 
requires building a special distribution that simply omits some SmartOS 
functionality, whose absence is not fatal, I would be happy to undertake that.

I'm also happy to undertake any diagnostic efforts that smart-os list 
participants might recommend.

Please be aware that, although I am a software engineer with 30 years of 
experience, I have little Unix and less Linux experience. I do have C++ 
experience, some hardware-level comprehension, and a keen desire to enter 
the SmartOS world.

So I would need advice from smart-os list members and from EH support.

* * * 
* * *

My motivation to climb the SmartOS learning curve is that I really like the 
idea of hosting a new website I have developed in production under SmartOS.

Please see the attached PDF rendition of the home page, for the new website 
(not yet online anywhere):
  a.. <eto_home - 20140812.pdf>
(Unfortunately, the PDF does not provide animation of the horizontal graphic of 
the expressway section. If smartos-discuss list members would like to see the 
animation, just let me know, and I will supply the necessary files.)

* * *
* * *

Hopefully Clicking "Send" Now :)

Steve

* * *

Steve Petrie, P.Eng.

ITS-ETO Consortium
Oakville, Ontario, Canada
(905) 847-3253
[email protected]

Attachment: eto_home - 20140812.pdf
Description: Adobe PDF document

SmartOS -- Elastic Hosts VM - _Unknown trap type 8 in user mode_ - boot message 
- edited search results - 20141409.txt   9 October 2014

This is an edited version of the file:

==> <SmartOS -- Elastic Hosts VM - _Unknown trap type 8 in user mode_ - boot 
message - raw search results - 20141409.txt>

being a portion of the target page of the single result link from a Google 
custom search invoked from the web page:

==> http://wiki.smartos.org/display/DOC/Home

with search parameter <smartos setup message "trap: unknown trap type 8 in user 
mode" ?>.

* * *
* * *

The target page was titled "illumos IRC log of [August 18 - 2011]".

Its contents are provided below, between the opening "@@@ ..." and closing 
"... @@@" lines. They are edited for brevity only: some lines are deleted, 
with ellipses substituted.

* * *

@@@ illumos IRC log of [August 18 - 2011 ...
[01:51:17] <grey> I'm getting "Unknown trap type 8 in user mode" right after it 
starts trying to load the Kernel
...
[01:57:25] <richlowe> I've seen the unknown usermode trap before
...
[01:58:43] <Triskelios> ...I've seen an FP exception before
[01:59:25] <grey> I'm going to try smartos, not sure if they made any changes 
to the kernel/boot options
[01:59:29] <richlowe> not FP, we take an interrupt we don't expect, and land in 
an IDT that wasn't setup
[01:59:50] <richlowe> I remember bitching about it, don't remember what I broke 
to cause it.
[02:00:09] *** McBofh has quit IRC
[02:01:22] *** McBofh has joined #illumos
[02:03:10] <grey> wat.
[02:03:27] <grey> "System detected 256 cpus, but only 1 cpu(s) were enabled 
during boot."
[02:04:07] <Triskelios> vbox?
[02:04:46] <alanc> it's so annoying when you have 255 imaginary cpus going to 
waste
[02:05:32] <grey> Triskelios: qemu+kvm
[02:05:36] <grey> well, it booted anyways
..
[02:05:56] <grey> I thought it was stalled, I'm used to much more feedback from 
a linux bootup
[02:06:12] <lewellyn> grey: you can make it more verbose...
[02:06:15] <richlowe> grey: wait, which booted and which didn't?
[02:06:19] <lewellyn> but all that tends to do is confuse people.
[02:06:26] <grey> SmartOS booted, Nexenta didn't
[02:06:34] <richlowe> could be ACPI, I suppose.
[02:06:48] <grey> It's possible I didn't give nexenta enough time, in 
hindsight, that message may not have been fatal?
[02:06:49] <lewellyn> grey: does OI boot?
...
[02:07:11] <richlowe> grey: it is.
[02:07:13] <grey> Nexenta booted up this time
[02:07:20] <richlowe> drat.
...
[02:07:36] <grey> lewellyn: I don't have an OI iso around to test
..
[02:08:00] <richlowe> last I saw it, we took the "bogus interrupt" early in 
boot, took it to be a trapped, and tried to core a process that didn't exist.
...
[02:09:06] <grey> lewellyn: I'll grab a copy and give it a shot, it'll take a 
little while to download
[02:09:11] <lewellyn> i might have your summary around somewhere.
[02:09:30] <richlowe> lewellyn: if it involved counter underflow, that was 
different.
[02:09:35] <grey> richlowe: is there anything I can do to provide more useful 
debugging info if I see that bug again?
[02:09:45] <richlowe> I don't know.
[02:09:55] <lewellyn> i don't remember details. that's why i have computers. 
but in any case, it was something i didn't want to even get near.
[02:09:59] <richlowe> if you booted with kmdb loaded, and you can get there, 
::interrupts might be useful.
[02:11:49] <grey> Ah there we go
...
[02:12:07] <grey> the installer starts, then fails with that error, I missed it 
before because I didn't start up VNC fast enough
...
[02:16:13] <richlowe> got it.
..
[02:16:20] <richlowe> this is the !@#$% intel_nhmex pain.
...
[02:16:25] <richlowe> grey: can you modify your boot image?
[02:16:33] <grey> can I? you tell me ;)
[02:16:45] <richlowe> intel_nhmex (and intel_nhm, actually) do actual things in 
_init, presumably mistaking them for _attach
[02:16:45] <grey> I'm just booting the iso's of the main distributions
[02:17:00] <richlowe> when they do this on certain CPUs, it traps and the whole 
world blows up
[02:17:08] <richlowe> if you're looking, you just get a warning from ACPI
...
[02:17:40] <lewellyn> i get messages at boot regarding acpi and then things go 
wonky later
[02:17:43] <richlowe> I went through hell on this older athlon64 chasing that 
down.
[02:18:33] <richlowe> it turns out that this used to crash Xen too, and rather 
than fixing it, they just made the module not load under i86xpv
...
[02:19:03] <richlowe> grey: if you can delete /kernel/**/intel_nhm* you may 
come up.
[02:19:11] <richlowe> I don't think -Bdisable-intel_nhmex works
[02:19:39] <lewellyn> richlowe: i may have more datapoints on that for you 
tomorrow :)
[02:19:43] <grey> richlowe: ok, I'll take a crack at that (I don't suppose you 
know off the top of your head the easiest way to modify an ISO image in that 
fashion?, I'll google around for it)
[02:20:06] <richlowe> I think on the iso, you'd have to unpack it, delete the 
modules, and then rebuild its boot_archive.
[02:20:20] <richlowe> ryancnelson: don't suppose you feel like experimenting? :)
[02:20:40] <grey> ok, I should be able to do that
[02:22:49] <grey> richlowe: that's the kernel/**/intel_nhm* under 
platform/i86pc/ ?
[02:23:15] <richlowe> under /kernel/drv, I think the other is the cpu mod
[02:24:46] <grey> this is on the nexenta distro? I don't see any intel_nhm* 
files anywhere in the entire tree
...
[02:25:43] <grey> http://pastebin.com/Q9Nzb30D  There's the directory structure
[02:26:28] <grey> oh, is it inside the usr.img.zlib?
[02:27:11] <richlowe> If that's OI media, I have no idea where it'd be on the 
install media
[02:27:33] <grey> that's from nexenta-core-platform_3.0.1-b134_x86.iso
...
[02:28:22] <grey> the OI media is downloading now, smartOS booted at least
...
[02:33:21] <richlowe> sort of interesting would be, boot with -kvd ( hit 'e' in 
grub, add -kvd to the kernel$ line), and hit 'b' to boot.  it'll stop saying 
[0]>.  Type "::bp pcplusmp`apic_error_intr -c $C" then :c
[02:34:07] <richlowe> hopefully it'll hit the breakpoint and dump the stack, 
rather than exploding.
...
[02:34:20] <richlowe> if it doesn't, I think you found a new way to get a 
random interrupt.
[02:34:42] <richlowe> if it happened to log APIC errors, even when it worked, 
that'd help to place the blame on nhmex again, too.
... @@@

* * *

*** End of Document ***
