Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-10 Thread Paul Irofti
Please send the dump my way.



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-10 Thread Mark Kettenis
 Date: Wed, 5 Sep 2012 04:01:56 +0400
 From: Alexander Polakov p...@sdf.org
 
 * Alexander Polakov p...@sdf.org [120903 21:41]:
  Now back to underlying issues: x86emu executes some code which causes
  parity check NMI (bit 7 set in port 0x61) to be generated, which causes
  drop to the debugger (I mistook it for a panic).
 
 Nobody asked me which code exactly. But I'll tell you anyway.
 
 000C1867 56 pushsi
 000C1868 BE 38 02   mov si, 0x0238
 000C186B 8B 34  mov si, [si]
 000C186D 8B 74 10   mov si, [si+0x10]
 000C1870 83 C6 04   add si, 0x04
 000C1873 FC cld 
 000C1874 B9 07 00   mov cx, 0x0007
 000C1877 66 AD  lodseax, ds:[esi]
 000C1879 E8 D5 F9   call0x000C1251
 000C187C 83 C3 04   add bx, 0x04
 000C187F E2 F6  loop0x000C1877
 000C1881 5E pop si
 000C1882 C3 ret 
 
 NMI ... going to debugger
 ...
 x86emu: SEGMASK: 0x0
 x86emu: R_DS: 0xc000
 x86emu: R_SS: 0x0
 x86emu: R_CS: 0xc000
 x86emu: R_ES: 0xc000
 x86emu: R_FS: 0x0
 x86emu: R_GS: 0x0
 x86emu: R_IP: 0x1877
 x86emu: R_SI: 0xe886
 x86emu: R_ESI: 0xe886
 x86emu: Now at 0xc1877, instruction: 0x66
 
 Sometimes it's 0xc1878 or even 0xc187f.
 
 So, if you feel like debugging this video bios/x86 emulation mess,
 feel free to contact me, I can provide you with memory dump or
 disassembled memory dump.

Hmm 0x66 is used as an instruction prefix.  I believe I've seen
prefix-related fixes flying by on the Xorg mailing lists.  Might be
worth investigating whether there are any x86emu fixes that are
missing from our tree.



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-10 Thread Paul Irofti
On Mon, Sep 10, 2012 at 12:26:09PM +0200, Mark Kettenis wrote:
  Date: Wed, 5 Sep 2012 04:01:56 +0400
  From: Alexander Polakov p...@sdf.org
  
  * Alexander Polakov p...@sdf.org [120903 21:41]:
   Now back to underlying issues: x86emu executes some code which causes
   parity check NMI (bit 7 set in port 0x61) to be generated, which causes
   drop to the debugger (I mistook it for a panic).
  
  Nobody asked me which code exactly. But I'll tell you anyway.
  
  000C1867 56 pushsi
  000C1868 BE 38 02   mov si, 0x0238
  000C186B 8B 34  mov si, [si]
  000C186D 8B 74 10   mov si, [si+0x10]
  000C1870 83 C6 04   add si, 0x04
  000C1873 FC cld 
  000C1874 B9 07 00   mov cx, 0x0007
  000C1877 66 AD  lodseax, ds:[esi]
  000C1879 E8 D5 F9   call0x000C1251
  000C187C 83 C3 04   add bx, 0x04
  000C187F E2 F6  loop0x000C1877
  000C1881 5E pop si
  000C1882 C3 ret 
  
  NMI ... going to debugger
  ...
  x86emu: SEGMASK: 0x0
  x86emu: R_DS: 0xc000
  x86emu: R_SS: 0x0
  x86emu: R_CS: 0xc000
  x86emu: R_ES: 0xc000
  x86emu: R_FS: 0x0
  x86emu: R_GS: 0x0
  x86emu: R_IP: 0x1877
  x86emu: R_SI: 0xe886
  x86emu: R_ESI: 0xe886
  x86emu: Now at 0xc1877, instruction: 0x66
  
  Sometimes it's 0xc1878 or even 0xc187f.
  
  So, if you feel like debugging this video bios/x86 emulation mess,
  feel free to contact me, I can provide you with memory dump or
  disassembled memory dump.
 
 Hmm 0x66 is used as an instruction prefix.  I believe I've seen
 prefix-related fixes flying by on the Xorg mailing lists.  Might be
 worth investigating whether there are any x86emu fixes that are
 missing from our tree.

I looked at the NetBSD tree last week and I haven't spotted any.
Where's the X11 tree for this? 



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-10 Thread Mark Kettenis
 Date: Mon, 10 Sep 2012 13:34:56 +0300
 From: Paul Irofti p...@irofti.net
 
 On Mon, Sep 10, 2012 at 12:26:09PM +0200, Mark Kettenis wrote:
   Date: Wed, 5 Sep 2012 04:01:56 +0400
   From: Alexander Polakov p...@sdf.org
   
   * Alexander Polakov p...@sdf.org [120903 21:41]:
Now back to underlying issues: x86emu executes some code which causes
parity check NMI (bit 7 set in port 0x61) to be generated, which causes
drop to the debugger (I mistook it for a panic).
   
   Nobody asked me which code exactly. But I'll tell you anyway.
   
   000C1867 56 pushsi
   000C1868 BE 38 02   mov si, 0x0238
   000C186B 8B 34  mov si, [si]
   000C186D 8B 74 10   mov si, [si+0x10]
   000C1870 83 C6 04   add si, 0x04
   000C1873 FC cld 
   000C1874 B9 07 00   mov cx, 0x0007
   000C1877 66 AD  lodseax, ds:[esi]
   000C1879 E8 D5 F9   call0x000C1251
   000C187C 83 C3 04   add bx, 0x04
   000C187F E2 F6  loop0x000C1877
   000C1881 5E pop si
   000C1882 C3 ret 
   
   NMI ... going to debugger
   ...
   x86emu: SEGMASK: 0x0
   x86emu: R_DS: 0xc000
   x86emu: R_SS: 0x0
   x86emu: R_CS: 0xc000
   x86emu: R_ES: 0xc000
   x86emu: R_FS: 0x0
   x86emu: R_GS: 0x0
   x86emu: R_IP: 0x1877
   x86emu: R_SI: 0xe886
   x86emu: R_ESI: 0xe886
   x86emu: Now at 0xc1877, instruction: 0x66
   
   Sometimes it's 0xc1878 or even 0xc187f.
   
   So, if you feel like debugging this video bios/x86 emulation mess,
   feel free to contact me, I can provide you with memory dump or
   disassembled memory dump.
  
  Hmm 0x66 is used as an instruction prefix.  I believe I've seen
  prefix-related fixes flying by on the Xorg mailing lists.  Might be
  worth investigating whether there are any x86emu fixes that are
  missing from our tree.
 
 I looked at the NetBSD tree last week and I haven't spotted any.
 Where's the X11 tree for this? 

We have a copy in /usr/xenocara/xserver/hw/xfree86/x86emu, and you can
look at the commit log at:

  http://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/x86emu

The particular change I was thinking about was:

  
http://cgit.freedesktop.org/xorg/xserver/commit/hw/xfree86/x86emu?id=bb18f277156c08be028a6e12d8987fb1593e9168

Cheers,

Mark



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-03 Thread Alexander Polakov
* Theo de Raadt dera...@cvs.openbsd.org [120903 00:13]:
  Index: vga_pci.c
  ===
  RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
  retrieving revision 1.68
  diff -u -r1.68 vga_pci.c
  --- vga_pci.c   22 Aug 2012 20:58:30 -  1.68
  +++ vga_pci.c   2 Sep 2012 17:42:09 -
  @@ -186,7 +186,13 @@
  {   0x, 0x, 0x, 0x }, 1, 0
  },
   
  -   {   /* All ATI video until further notice */
  +   {   
  +   {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
  +   0x, 0x },
  +   {   0x, 0x, 0x, 0x}, 0, 0
  +   },
  +
  +   {   /* Other ATI video until further notice */
  {   PCI_VENDOR_ATI, 0x,
  0x, 0x },
  {   0x, 0x, 0x, 0x}, 1, 0
 
 
 That's a great patch, because it doesn't find or fix the underlying
 issues.

Thanks.

 Then, when the next person -- who's video did need this -- finds out
 it no longer works, they can submit the inverse of your diff -- once
 again not undercovering the real issue.  Of course, now it is their
 problem, not yours, right?

No, he'll add a record above mine with his subproduct/subvendor ids.

 Your diff can be summarized as make it work for me, me, me, me.
 Awesome work.  You know how to fix it just for yourself and submit a
 self-serving patch, hoping we'll commit it without looked deeper, so
 truly you are now an open source wizard.

I don't hope you'll commit it as is. I hope you'll send some useful
comments and suggestions to improve the patch (not insults).

Now back to underlying issues: x86emu executes some code which causes
parity check NMI (bit 7 set in port 0x61) to be generated, which causes
drop to the debugger (I mistook it for a panic).

Btw, is it intended to not include isa.h in i386/trap.c? (removed in r.1.35).
Code wrapped in #if NISA  0 is thrown out by cpp now.



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-03 Thread Theo de Raadt
 * Theo de Raadt dera...@cvs.openbsd.org [120903 00:13]:
   Index: vga_pci.c
   ===
   RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
   retrieving revision 1.68
   diff -u -r1.68 vga_pci.c
   --- vga_pci.c 22 Aug 2012 20:58:30 -  1.68
   +++ vga_pci.c 2 Sep 2012 17:42:09 -
   @@ -186,7 +186,13 @@
 {   0x, 0x, 0x, 0x }, 1, 0
 },

   - {   /* All ATI video until further notice */
   + {   
   + {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
   + 0x, 0x },
   + {   0x, 0x, 0x, 0x}, 0, 0
   + },
   +
   + {   /* Other ATI video until further notice */
 {   PCI_VENDOR_ATI, 0x,
 0x, 0x },
 {   0x, 0x, 0x, 0x}, 1, 0
  
  
  That's a great patch, because it doesn't find or fix the underlying
  issues.
 
 Thanks.
 
  Then, when the next person -- who's video did need this -- finds out
  it no longer works, they can submit the inverse of your diff -- once
  again not undercovering the real issue.  Of course, now it is their
  problem, not yours, right?
 
 No, he'll add a record above mine with his subproduct/subvendor ids.

I do not agree with that process.

Basically in 10 years we'll have a table full of garbage entries,
which would help only a handful of people.  It is the wrong way to
approach this, and the cvs log already makes that clear.

There is code missing to handle all the types of video we need to
cleanup.  This code you are looking at is not the solution.  It is a
stupid quirk table.

  Your diff can be summarized as make it work for me, me, me, me.
  Awesome work.  You know how to fix it just for yourself and submit a
  self-serving patch, hoping we'll commit it without looked deeper, so
  truly you are now an open source wizard.
 
 I don't hope you'll commit it as is. I hope you'll send some useful
 comments and suggestions to improve the patch (not insults).

We won't commit it.



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-03 Thread Ted Unangst
On Mon, Sep 03, 2012 at 21:18, Alexander Polakov wrote:

 Now back to underlying issues: x86emu executes some code which causes
 parity check NMI (bit 7 set in port 0x61) to be generated, which causes
 drop to the debugger (I mistook it for a panic).
 
 Btw, is it intended to not include isa.h in i386/trap.c? (removed in r.1.35).
 Code wrapped in #if NISA  0 is thrown out by cpp now.

Probably not, but after 11 years, it doesn't look like we used that
code much. :)  Putting the include back at the top of the file would
be a good diff to propose. (I guess it may have been deleted because
it looks funny in the middle of the file).



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-02 Thread Alexander Polakov
With the diff below it resumes ok to X, but when I try to switch
to text console or do a shutdown, I get this panic:

/etc/rc.shutdown in progress...
/etc/rc.shutdown complete.
Stopped at  cpu_idle_cycle+0xf: ret
ddb{1} bt
No such command
ddb{1} trace
cpu_idle_cycle(d406ac00) at cpu_idle_cycle+0xf
Bad frame pointer: 0xf546ef5c
ddb{1} machine ddbcpu 0
Stopped at  Debugger+0x4:   popl%ebp
ddb{0} bt
No such command
ddb{0} trace
Debugger(d0b035e0,f54c2f08,0,14,6b) at Debugger+0x4
i386_ipi_handler(0,20,0,f54c0010,d0510010) at i386_ipi_handler+0x5f
Xintripi() at Xintripi+0x49
--- interrupt ---
kputchar(6b,14,0,1,0) at kputchar+0x1a
kprintf(d08f3d1a,14,0,0,f54c2ef8) at kprintf+0x254
db_printf(d08f3d1a,1b88d72e,b,d03db6c2,0) at db_printf+0x3e
kdbprinttrap(109,0,d9ddae24,1000,d0a47d44) at kdbprinttrap+0x1e
kdb_trap(109,0,f54c2fa8,3,ffee3bd) at kdb_trap+0x1fc
trap() at trap+0x2e7
--- trap (number 0) ---
0:
ddb{0} 

Index: vga_pci.c
===
RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
retrieving revision 1.68
diff -u -r1.68 vga_pci.c
--- vga_pci.c   22 Aug 2012 20:58:30 -  1.68
+++ vga_pci.c   2 Sep 2012 17:42:09 -
@@ -186,7 +186,13 @@
{   0x, 0x, 0x, 0x }, 1, 0
},
 
-   {   /* All ATI video until further notice */
+   {   
+   {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
+   0x, 0x },
+   {   0x, 0x, 0x, 0x}, 0, 0
+   },
+
+   {   /* Other ATI video until further notice */
{   PCI_VENDOR_ATI, 0x,
0x, 0x },
{   0x, 0x, 0x, 0x}, 1, 0

* Alexander Polakov p...@sdf.org [120902 21:40]:
 Suspending system...
 # Stopped at  x86emu_halt_sys+0x47f7: movl0x40(%edi),%ecx
 x86emu_halt_sys(d408804c,3,f5470d6c,d0725f1c,d4088000) at 
 x86emu_halt_sys+0x47f
 7
 x86emu_exec(d4088000,c000,3,d05b0887,0) at x86emu_exec+0x41
 vga_post_call(d4088000,c0,f5470dbc,f5470d8c,d4016400) at vga_post_call+0x6c
 vga_pci_activate(d4016400,3,f5470dbc,d05b2208,d3f29180) at 
 vga_pci_activate+0x9
 8
 config_activate_children(d3f29180,3,f5470dec,d05b019c,0) at 
 config_activate_chi
 ldren+0x45
 config_activate_children(d4016600,3,4,100107,f5470e1c) at 
 config_activate_child
 ren+0x45
 ppbactivate(d4016600,3,f5470e5c,d05b2208,d3f29380) at ppbactivate+0x345
 config_activate_children(d3f29380,3,f5470e8c,d0840914,c0) at 
 config_activate_ch
 ildren+0x45
 config_activate_children(d4015fc0,3,1,f5470ebc,0) at 
 config_activate_children+0
 x45
 acpi_resume(d4013c00,3,0,d409a560,d4013c00) at acpi_resume+0x134
 ddb{0} trace
 x86emu_halt_sys(d408804c,3,f5470d6c,d0725f1c,d4088000) at 
 x86emu_halt_sys+0x47f
 7
 x86emu_exec(d4088000,c000,3,d05b0887,0) at x86emu_exec+0x41
 vga_post_call(d4088000,c0,f5470dbc,f5470d8c,d4016400) at vga_post_call+0x6c
 vga_pci_activate(d4016400,3,f5470dbc,d05b2208,d3f29180) at 
 vga_pci_activate+0x9
 8
 config_activate_children(d3f29180,3,f5470dec,d05b019c,0) at 
 config_activate_chi
 ldren+0x45
 config_activate_children(d4016600,3,4,100107,f5470e1c) at 
 config_activate_child
 ren+0x45
 ppbactivate(d4016600,3,f5470e5c,d05b2208,d3f29380) at ppbactivate+0x345
 config_activate_children(d3f29380,3,f5470e8c,d0840914,c0) at 
 config_activate_ch
 ildren+0x45
 config_activate_children(d4015fc0,3,1,f5470ebc,0) at 
 config_activate_children+0
 x45
 acpi_resume(d4013c00,3,0,d409a560,d4013c00) at acpi_resume+0x134
 acpi_sleep_state(d4013c00,3,0,d0840d41,d409a560) at acpi_sleep_state+0x7a
 acpi_sleep_task(d4013c00,3,d9fd0744,1,d4013c00) at acpi_sleep_task+0x1a
 acpi_dotask(d4013c00,20,d09ccf9d,0,d03db6c2) at acpi_dotask+0x42
 acpi_thread(d3f27770) at acpi_thread+0x123
 Bad frame pointer: 0xd0bbbe28
 ddb{0} boot reboot
 cbb0: bad Vcc request. sock_ctrl 0xff88, sock_status 0x
 rebooting...



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-02 Thread Theo de Raadt
 Index: vga_pci.c
 ===
 RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
 retrieving revision 1.68
 diff -u -r1.68 vga_pci.c
 --- vga_pci.c 22 Aug 2012 20:58:30 -  1.68
 +++ vga_pci.c 2 Sep 2012 17:42:09 -
 @@ -186,7 +186,13 @@
   {   0x, 0x, 0x, 0x }, 1, 0
   },
  
 - {   /* All ATI video until further notice */
 + {   
 + {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
 + 0x, 0x },
 + {   0x, 0x, 0x, 0x}, 0, 0
 + },
 +
 + {   /* Other ATI video until further notice */
   {   PCI_VENDOR_ATI, 0x,
   0x, 0x },
   {   0x, 0x, 0x, 0x}, 1, 0


That's a great patch, because it doesn't find or fix the underlying
issues.

Then, when the next person -- who's video did need this -- finds out
it no longer works, they can submit the inverse of your diff -- once
again not undercovering the real issue.  Of course, now it is their
problem, not yours, right?

Your diff can be summarized as make it work for me, me, me, me.
Awesome work.  You know how to fix it just for yourself and submit a
self-serving patch, hoping we'll commit it without looked deeper, so
truly you are now an open source wizard.

/sarcasm



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-02 Thread Eugene Yunak
On 2 September 2012 21:13, Theo de Raadt dera...@cvs.openbsd.org wrote:
 Index: vga_pci.c
 ===
 RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
 retrieving revision 1.68
 diff -u -r1.68 vga_pci.c
 --- vga_pci.c 22 Aug 2012 20:58:30 -  1.68
 +++ vga_pci.c 2 Sep 2012 17:42:09 -
 @@ -186,7 +186,13 @@
   {   0x, 0x, 0x, 0x }, 1, 0
   },

 - {   /* All ATI video until further notice */
 + {
 + {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
 + 0x, 0x },
 + {   0x, 0x, 0x, 0x}, 0, 0
 + },
 +
 + {   /* Other ATI video until further notice */
   {   PCI_VENDOR_ATI, 0x,
   0x, 0x },
   {   0x, 0x, 0x, 0x}, 1, 0


 That's a great patch, because it doesn't find or fix the underlying
 issues.

 Then, when the next person -- who's video did need this -- finds out
 it no longer works, they can submit the inverse of your diff -- once
 again not undercovering the real issue.  Of course, now it is their
 problem, not yours, right?

 Your diff can be summarized as make it work for me, me, me, me.
 Awesome work.  You know how to fix it just for yourself and submit a
 self-serving patch, hoping we'll commit it without looked deeper, so
 truly you are now an open source wizard.

 /sarcasm


This is as unfair as it gets. Did i somehow miss the bit where Alexander
asks for OKs? I think this diff merely shows that he nailed it down to
specific piece of code. He gets a panic so how is his problem even 'fixed'?

Do not assume everyone around you is silly and ill-intended.



Re: ThinkPad T60 x86emu panic on resume (w/patch)

2012-09-02 Thread Theo de Raadt
 On 2 September 2012 21:13, Theo de Raadt dera...@cvs.openbsd.org wrote:
  Index: vga_pci.c
  ===
  RCS file: /cvs/src/sys/dev/pci/vga_pci.c,v
  retrieving revision 1.68
  diff -u -r1.68 vga_pci.c
  --- vga_pci.c 22 Aug 2012 20:58:30 -  1.68
  +++ vga_pci.c 2 Sep 2012 17:42:09 -
  @@ -186,7 +186,13 @@
{   0x, 0x, 0x, 0x }, 1, 0
},
 
  - {   /* All ATI video until further notice */
  + {
  + {   PCI_VENDOR_ATI, PCI_PRODUCT_ATI_RADEON_X1400,
  + 0x, 0x },
  + {   0x, 0x, 0x, 0x}, 0, 0
  + },
  +
  + {   /* Other ATI video until further notice */
{   PCI_VENDOR_ATI, 0x,
0x, 0x },
{   0x, 0x, 0x, 0x}, 1, 0
 
 
  That's a great patch, because it doesn't find or fix the underlying
  issues.
 
  Then, when the next person -- who's video did need this -- finds out
  it no longer works, they can submit the inverse of your diff -- once
  again not undercovering the real issue.  Of course, now it is their
  problem, not yours, right?
 
  Your diff can be summarized as make it work for me, me, me, me.
  Awesome work.  You know how to fix it just for yourself and submit a
  self-serving patch, hoping we'll commit it without looked deeper, so
  truly you are now an open source wizard.
 
  /sarcasm
 
 
 This is as unfair as it gets. Did i somehow miss the bit where Alexander
 asks for OKs?

Here, you are confused.  ok's are given to developers with accounts.
He not a developer with an account.

 I think this diff merely shows that he nailed it down to
 specific piece of code. He gets a panic so how is his problem even 'fixed'?

He nailed it down to the variable (that 1 vs 0) that controls the if
statement that enters the code that explodes.

It has a comment, which he deletes:

/* All ATI video until further notice */

Quote a curious comment.  So curious, one might want to go looking at
the commit logs.  Then the story becomes more clear -- we've had tons
of problems in this area and don't have a solution yet.  Tweaking the
table further doesn't change shit.

 Do not assume everyone around you is silly and ill-intended.

I don't assume he's ill-intended.

That diff says he's only thinking of solutions for himself.

My point is people shouldn't hand out bad advice.  That diff is bad
advice.  It is a diff created with very shallow consideration of what
has been found.

It's like those acpi tz diffs that help a few broken HP machines by
solving a real problem in the wrong way -- and that wrong way hurts
all the other machines.  Yes, apply that diff, and go bravo, it fixes
my machine.

That is not a part of a good cooperative development process.