Re: panics due to buggy ACPI in Dell Latitude E6530?
On Thursday 4 April 2013 18:45:58, Andriy Gapon wrote:
> on 31/03/2013 14:42 kron said the following:
>> I'm sorry I forgot to update the thread - good you're reminding.
>> Andriy did a brilliant job at debugging the issue and I owe him
>> to say in public: Thank you, Andriy!
>
> I also apologize for not sharing the information promptly.
> So here is a summary.
>
> The problem turned out to be with the reference count in ACPICA.
> It doesn't have any internal locking and so it relied on locks
> obtained by the callers. But not all of the callers obtained the
> relevant locks (namespace, interpreter), and they did not really
> need them (except for the reference counting).
> Also, our acpi_battery driver uses ACPICA interfaces that were
> problematic. Additionally, the driver allows parallel queries; not
> sure if that is an intentional choice.
>
> So now the ACPICA developers have fixed the reference counting code
> and no changes in FreeBSD code should be required. We are just
> waiting for the next ACPICA release. That's for head.
>
> Not sure which approach we will take for stable branches, because
> we haven't been doing any MFCs of ACPICA imports. So there are two
> choices:
> - use the below patch to prevent parallel execution in the battery
>   driver
> - manually apply the specific reference count change to ACPICA code
>   in the branches
>
> Finally, many thanks to Olli/kron for doing a lot of testing and
> debugging. And many thanks to Tom Lislegaard, who did a lot of
> testing and debugging too - in our debugging session I reached all
> the same conclusions and came up with a (different) patch, but then
> got distracted and completely forgot about the issue until it
> resurfaced again.

Thanks a lot for the very detailed explanation. For now I'll use the
patch until the ACPICA release is merged into the next FreeBSD
release :-).

Regards,

>> The results are:
>> - hw.acpi.osname=Linux is not relevant
>> - there's some ACPICA issue Andriy took to discuss with other
>>   hackers (and much above my competence to comment)
>> - a temporary workaround:
>>
>> --- sys/dev/acpica/acpi_battery.c (revision 248682)
>> +++ sys/dev/acpica/acpi_battery.c (working copy)
>> @@ -360,6 +360,8 @@
>>      int error, unit;
>>      device_t dev;
>>
>> +    mtx_lock(&Giant);
>> +
>>      /* For commands that use the ioctl_arg struct, validate it first. */
>>      error = ENXIO;
>>      unit = 0;
>> @@ -417,6 +419,7 @@
>>          error = EINVAL;
>>      }
>>
>> +    mtx_unlock(&Giant);
>>      return (error);
>>  }
>>
>> The patch works for me without any problem. I guess it won't hurt
>> your system ;-) but I actually don't know if/how it relates to
>> your PR.

--
David Demelier
___
freebsd-acpi@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org
Re: panics due to buggy ACPI in Dell Latitude E6530?
on 31/03/2013 14:42 kron said the following:
> I'm sorry I forgot to update the thread - good you're reminding.
> Andriy did a brilliant job at debugging the issue and I owe him
> to say in public: Thank you, Andriy!

I also apologize for not sharing the information promptly.
So here is a summary.

The problem turned out to be with the reference count in ACPICA.
It doesn't have any internal locking and so it relied on locks
obtained by the callers. But not all of the callers obtained the
relevant locks (namespace, interpreter), and they did not really
need them (except for the reference counting).
Also, our acpi_battery driver uses ACPICA interfaces that were
problematic. Additionally, the driver allows parallel queries; not
sure if that is an intentional choice.

So now the ACPICA developers have fixed the reference counting code
and no changes in FreeBSD code should be required. We are just
waiting for the next ACPICA release. That's for head.

Not sure which approach we will take for stable branches, because we
haven't been doing any MFCs of ACPICA imports. So there are two
choices:
- use the below patch to prevent parallel execution in the battery
  driver
- manually apply the specific reference count change to ACPICA code
  in the branches

Finally, many thanks to Olli/kron for doing a lot of testing and
debugging. And many thanks to Tom Lislegaard, who did a lot of
testing and debugging too - in our debugging session I reached all
the same conclusions and came up with a (different) patch, but then
got distracted and completely forgot about the issue until it
resurfaced again.

> The results are:
> - hw.acpi.osname=Linux is not relevant
> - there's some ACPICA issue Andriy took to discuss with other
>   hackers (and much above my competence to comment)
> - a temporary workaround:
>
> --- sys/dev/acpica/acpi_battery.c (revision 248682)
> +++ sys/dev/acpica/acpi_battery.c (working copy)
> @@ -360,6 +360,8 @@
>      int error, unit;
>      device_t dev;
>
> +    mtx_lock(&Giant);
> +
>      /* For commands that use the ioctl_arg struct, validate it first. */
>      error = ENXIO;
>      unit = 0;
> @@ -417,6 +419,7 @@
>          error = EINVAL;
>      }
>
> +    mtx_unlock(&Giant);
>      return (error);
>  }
>
> The patch works for me without any problem. I guess it won't hurt
> your system ;-) but I actually don't know if/how it relates to
> your PR.

--
Andriy Gapon
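The class of bug described in the summary above (a plain reference count that is only safe if every caller holds a lock) can be illustrated in userland. The following is a minimal sketch, not ACPICA or FreeBSD code: the names `struct counted`, `counted_adjust` and `run_refcount_demo` are hypothetical, and a POSIX mutex stands in for whatever locking the real fix uses, which this thread does not spell out. Each thread takes and drops one reference per iteration, so the count should end where it started.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object with a plain, non-atomic reference count. */
struct counted {
    long refcount;
    pthread_mutex_t lock;
    int use_lock;            /* 0 = rely on callers, 1 = lock internally */
};

/* Add delta to the count, optionally under the object's own lock. */
static void counted_adjust(struct counted *c, long delta)
{
    if (c->use_lock)
        pthread_mutex_lock(&c->lock);
    c->refcount += delta;    /* read-modify-write: racy when unlocked */
    if (c->use_lock)
        pthread_mutex_unlock(&c->lock);
}

struct worker_arg {
    struct counted *obj;
    long iterations;
};

/* Each worker takes and immediately drops a reference, many times. */
static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (long i = 0; i < a->iterations; i++) {
        counted_adjust(a->obj, +1);
        counted_adjust(a->obj, -1);
    }
    return NULL;
}

#define MAX_THREADS 16

/*
 * Run nthreads workers against one object that starts with refcount 1
 * and return the final count. With use_lock = 1 the result is always 1.
 * With use_lock = 0 the updates are a data race (undefined behaviour in
 * C), so the result may drift; that mode exists only for illustration.
 */
static long run_refcount_demo(int nthreads, long iterations, int use_lock)
{
    struct counted obj = { .refcount = 1, .use_lock = use_lock };
    struct worker_arg arg = { &obj, iterations };
    pthread_t tids[MAX_THREADS];
    int started = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    pthread_mutex_init(&obj.lock, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&obj.lock);
    return obj.refcount;
}
```

Calling `run_refcount_demo(4, 100000, 1)` from a `main()` built with `-pthread` returns 1. The workaround patch takes the other route described in the summary: it leaves the counting code alone and serializes the one caller (the battery ioctl handler) with a single global mutex.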
Re: panics due to buggy ACPI in Dell Latitude E6530?
on 02/04/2013 10:29 David Demelier said the following:
> Thanks for that small patch, I'm currently testing it and will tell
> you how it works for me,

The best way to check is to run several

    while true ; do acpiconf -i N ; done

in parallel. If the system survives, then it will most likely survive
the typical use too.

--
Andriy Gapon
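For a bounded version of the parallel test suggested above, the same idea can be sketched in C. This is an assumption-laden illustration, not something posted in the thread: the helper names `run_once` and `run_parallel` are hypothetical, the loop runs a fixed number of iterations instead of `while true`, and the exit status of the command is deliberately ignored because the point is only to generate concurrent queries.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Run the command once and wait for it; its exit status is ignored. */
static int run_once(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);          /* exec failed */
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

/*
 * Start nworkers processes that each run the command `iterations`
 * times back to back, so that up to nworkers copies run concurrently.
 * Returns the number of workers that completed all their iterations.
 */
static int run_parallel(char *const argv[], int nworkers, int iterations)
{
    int started = 0, finished = 0;

    for (int w = 0; w < nworkers; w++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int i = 0; i < iterations; i++) {
                if (run_once(argv) != 0)
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }
    for (int w = 0; w < started; w++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            finished++;
    }
    return finished;
}
```

A call such as `char *cmd[] = { "acpiconf", "-i", "0", NULL }; run_parallel(cmd, 4, 1000);` would correspond to four of the shell loops; the battery index 0 and the counts are placeholders, since the original message only says N and "several".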
Re: panics due to buggy ACPI in Dell Latitude E6530?
Hello,

Thanks for that small patch, I'm currently testing it and will tell
you how it works for me,

Cheers!

2013/3/31 kron kro...@gmail.com:
> On 2013/03/30 14:22, David Demelier wrote:
>> On Saturday 30 March 2013 14:13:53, David Demelier wrote:
>>> On Wednesday 27 February 2013 18:51:09, Andriy Gapon wrote:
>>>> [...]
>>>
>>> About me, I've currently added the following to my
>>> /boot/loader.conf:
>>>
>>>     debug.acpi.disabled="acad cmbat"
>>>
>>> And it solved my panics but unfortunately I must say bye to the
>>> battery information.
>>>
>>> Regards,
>>
>> By the way, may be this is related? :)
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/173408
>>
>> Cheers,
>
> I'm sorry I forgot to update the thread - good you're reminding.
> Andriy did a brilliant job at debugging the issue and I owe him
> to say in public: Thank you, Andriy!
>
> The results are:
> - hw.acpi.osname=Linux is not relevant
> - there's some ACPICA issue Andriy took to discuss with other
>   hackers (and much above my competence to comment)
> - a temporary workaround:
>
> [...]
>
> The patch works for me without any problem. I guess it won't hurt
> your system ;-) but I actually don't know if/how it relates to
> your PR.
>
> BR
> Oli

--
Demelier David
Re: panics due to buggy ACPI in Dell Latitude E6530?
On 2013/03/30 14:22, David Demelier wrote:
> On Saturday 30 March 2013 14:13:53, David Demelier wrote:
>> On Wednesday 27 February 2013 18:51:09, Andriy Gapon wrote:
>>> [...]
>>
>> About me, I've currently added the following to my
>> /boot/loader.conf:
>>
>>     debug.acpi.disabled="acad cmbat"
>>
>> And it solved my panics but unfortunately I must say bye to the
>> battery information.
>>
>> Regards,
>
> By the way, may be this is related? :)
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/173408
>
> Cheers,

I'm sorry I forgot to update the thread - good you're reminding.
Andriy did a brilliant job at debugging the issue and I owe him
to say in public: Thank you, Andriy!

The results are:
- hw.acpi.osname=Linux is not relevant
- there's some ACPICA issue Andriy took to discuss with other
  hackers (and much above my competence to comment)
- a temporary workaround:

--- sys/dev/acpica/acpi_battery.c (revision 248682)
+++ sys/dev/acpica/acpi_battery.c (working copy)
@@ -360,6 +360,8 @@
     int error, unit;
     device_t dev;

+    mtx_lock(&Giant);
+
     /* For commands that use the ioctl_arg struct, validate it first. */
     error = ENXIO;
     unit = 0;
@@ -417,6 +419,7 @@
         error = EINVAL;
     }

+    mtx_unlock(&Giant);
     return (error);
 }

The patch works for me without any problem. I guess it won't hurt
your system ;-) but I actually don't know if/how it relates to
your PR.

BR
Oli
Re: panics due to buggy ACPI in Dell Latitude E6530?
On Wednesday 27 February 2013 18:51:09, Andriy Gapon wrote:
> on 27/02/2013 17:22 kron said the following:
>> [...]
>> Before the panics, the kernel used to emit messages like:
>> ACPI Error: No object attached to node 0xfe00094a51c0 (20110527/exresnte-138)
>> ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00094a51c0), AE_AML_NO_OPERAND (20110527/uteval-113)
>
> This looks very much like a heisenbug reported several times here.
> E.g.:
> http://lists.freebsd.org/pipermail/freebsd-acpi/2012-December/007962.html
>
>> [...]
>> Just to try, I put hw.acpi.osname=Linux into /boot/loader.conf.
>> Since then I haven't had the panic again (for ~3 weeks).
>> I hope this is not just a coincidence.
>
> It very well could be.
>
>> If any of ACPI developers wants to play with the problem I can
>> provide more info (sorry, no crashdump, was not enabled), do
>> tests, etc.
>
> Please at least enable printing of a stack trace.
> Better do get the crash dump.
>
> P.S. I suspect that the issue we are discussing with hps in this
> mailing list could be related to this problem.

About me, I've currently added the following to my /boot/loader.conf:

    debug.acpi.disabled="acad cmbat"

And it solved my panics but unfortunately I must say bye to the
battery information.

Regards,

--
David Demelier
Re: panics due to buggy ACPI in Dell Latitude E6530?
On 2013/03/11 22:28, Andriy Gapon wrote:
> ...
> In either case, could you please try the following patch (it is
> against recent stable/9)?
> http://people.freebsd.org/~avg/OsdSynch-9.diff
> ...
> The following patch might also be of use with _debugging_ this
> issue:
> http://people.freebsd.org/~avg/acpi-uma-cache.diff

Hi Andriy,

9.1-STABLE #0 r248230M - with the two patches mentioned above

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under
certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
ACPI Error: Object not a Integer, type Reference (20110527/exresnte-209)
ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe0008496938), AE_AML_OPERAND_TYPE (20110527/uteval-113)
...
#0  doadump (textdump=<value optimized out>) at pcpu.h:229
229     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:229
#1  0x80471b64 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:449
#2  0x80471fa4 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:637
#3  0x806c02c5 in trap_fatal (frame=<value optimized out>,
    eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:878
#4  0x806c0663 in trap_pfault (frame=0x0, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:735
#5  0x806bfcae in trap (frame=0xff80003025f0)
    at /usr/src/sys/amd64/amd64/trap.c:463
#6  0x806aa233 in calltrap () at exception.S:228
#7  0x802add43 in AcpiExStore (SourceDesc=<value optimized out>,
    DestDesc=0xfe03eed84090, WalkState=<value optimized out>)
    at /usr/src/sys/contrib/dev/acpica/executer/exstore.c:288
#8  0x802ab468 in AcpiExOpcode_1A_1T_1R (WalkState=0xfe0013abdc00)
    at /usr/src/sys/contrib/dev/acpica/executer/exoparg1.c:502
#9  0x802a2ebd in AcpiDsExecEndOp (WalkState=0xfe0013abdc00)
    at /usr/src/sys/contrib/dev/acpica/dispatcher/dswexec.c:475
#10 0x802b613e in AcpiPsParseLoop (WalkState=0xfe0013abdc00)
    at /usr/src/sys/contrib/dev/acpica/parser/psloop.c:1249
#11 0x802b69cd in AcpiPsParseAml (WalkState=<value optimized out>)
    at /usr/src/sys/contrib/dev/acpica/parser/psparse.c:525
#12 0x802b7517 in AcpiPsExecuteMethod (Info=0xfe027c59f980)
    at /usr/src/sys/contrib/dev/acpica/parser/psxface.c:368
#13 0x802b11e6 in AcpiNsEvaluate (Info=0xfe027c59f980)
    at /usr/src/sys/contrib/dev/acpica/namespace/nseval.c:193
#14 0x802b4318 in AcpiEvaluateObject (Handle=0xfe000848f640,
    Pathname=<value optimized out>,
    ExternalParams=<value optimized out>,
    ReturnBuffer=0xff8000302988)
    at /usr/src/sys/contrib/dev/acpica/namespace/nsxfeval.c:289
#15 0x802cd84a in acpi_cpu_cx_cst (sc=0xfe0008463200)
    at /usr/src/sys/dev/acpica/acpi_cpu.c:735
#16 0x802cdb9b in acpi_cpu_notify (h=<value optimized out>,
    notify=<value optimized out>, context=0xfe0008463200)
    at /usr/src/sys/dev/acpica/acpi_cpu.c:1104
#17 0x802a5fa1 in AcpiEvNotifyDispatch (Context=0xfe0013e0e3c0)
    at /usr/src/sys/contrib/dev/acpica/events/evmisc.c:283
#18 0x802c38b0 in acpi_task_execute (context=0xff80009d4000,
    pending=333443472) at /usr/src/sys/dev/acpica/Osd/OsdSchedule.c:134
#19 0x804b2fa6 in taskqueue_run_locked (queue=0xfe000841ba80)
    at /usr/src/sys/kern/subr_taskqueue.c:312
#20 0x804b3818 in taskqueue_thread_loop (arg=<value optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:501
#21 0x80446045 in fork_exit (
    callout=0x804b3780 <taskqueue_thread_loop>, arg=0x80abcc68,
    frame=0xff8000302b00) at /usr/src/sys/kern/kern_fork.c:988
#22 0x806aa76e in fork_trampoline () at exception.S:602
#23 0x in ?? ()
Current language: auto; currently minimal
(kgdb)

I have the crash dump.

BR
Oli
Re: panics due to buggy ACPI in Dell Latitude E6530?
On 2013/03/05 14:13, Andriy Gapon wrote:
> on 03/03/2013 20:26 kron said the following:
>> #5 0x806abc53 in calltrap () at exception.S:228
>> #6 0x802bc850 in AcpiOsAcquireObject (Cache=0xfe00093a14a0)
>>    at /usr/src/sys/contrib/dev/acpica/utilities/utcache.c:310
>> #7 0x802bf481 in AcpiUtCreateInternalObjectDbg (
>>    ModuleName=0x8071c1a6 "dsutils", LineNumber=703,
>>    ComponentId=64, Type=1)
>>    at /usr/src/sys/contrib/dev/acpica/utilities/utobject.c:437
>
> Did this panic happen even with hw.acpi.osname=Linux or did you
> revert that to reproduce the panic?

Yes, it was with hw.acpi.osname=Linux. It seems to be useless so
I've just removed it.

> In either case, could you please try the following patch (it is
> against recent stable/9)?
> http://people.freebsd.org/~avg/OsdSynch-9.diff

Applied.

BTW, the crashes tend to be signaled beforehand - there can be more
than 1 hour between the first kernel error messages (ACPI Error: No
object attached) and a crash. Is there anything useful I can milk
from the soon-to-be-dead system if I get a chance?

BR
Oli
RE: panics due to buggy ACPI in Dell Latitude E6530?
> ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00094a51c0), AE_AML_NO_OPERAND (20110527/uteval-113)

Sorry, could not reproduce the problem here:

- ex _SB_.BAT0._UID
Evaluating \_SB_.BAT0._UID
Evaluation of \_SB_.BAT0._UID returned object 000342A0, external buffer length 10
[Integer] = 0001

Please send a full list of all such ACPI errors.

> -----Original Message-----
> From: owner-freebsd-a...@freebsd.org
> [mailto:owner-freebsd-a...@freebsd.org] On Behalf Of kron
> Sent: Wednesday, February 27, 2013 7:23 AM
> To: freebsd-acpi@freebsd.org
> Subject: panics due to buggy ACPI in Dell Latitude E6530?
>
> [...]
Re: panics due to buggy ACPI in Dell Latitude E6530?
on 28/02/2013 17:44 Moore, Robert said the following:
>> ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00094a51c0), AE_AML_NO_OPERAND (20110527/uteval-113)
>
> Sorry, could not reproduce the problem here:
>
> - ex _SB_.BAT0._UID
> Evaluating \_SB_.BAT0._UID
> Evaluation of \_SB_.BAT0._UID returned object 000342A0, external buffer length 10
> [Integer] = 0001

To me it is semi-obvious that the reported problem is a consequence
of the FreeBSD heisenbug that I reported before. The one that messes
up the internal state of ACPICA and which I previously blamed either
on the ACPICA object cache or on ACPICA reference counting. But now I
am inclined to think that it is caused by something in the FreeBSD
adaptation layer.

--
Andriy Gapon
Re: panics due to buggy ACPI in Dell Latitude E6530?
On 2013/02/28 17:38, Andriy Gapon wrote:
> on 28/02/2013 17:44 Moore, Robert said the following:
>> [...]
>
> To me it is semi-obvious that the reported problem is a consequence
> of the FreeBSD heisenbug that I reported before. The one that
> messes up the internal state of ACPICA and which I previously
> blamed either on the ACPICA object cache or on ACPICA reference
> counting. But now I am inclined to think that it is caused by
> something in the FreeBSD adaptation layer.

Yes, I looked at David Demelier's report - the ACPI errors are
nearly identical. I'll enable crash dumps, just in case...

BR
Oli
panics due to buggy ACPI in Dell Latitude E6530?
Hi,

I have a Dell notebook (Latitude E6530) on which I track 9-STABLE.
It served excellently until mid-Jan when it started to panic a few
times a week or so:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x10116
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x802bc360
stack pointer           = 0x28:0xff848f6db390
frame pointer           = 0x28:0xff848f6db3c0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2199 (conky)
trap number             = 12
panic: page fault
cpuid = 3

Before the panics, the kernel used to emit messages like:
ACPI Error: No object attached to node 0xfe00094a51c0 (20110527/exresnte-138)
ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00094a51c0), AE_AML_NO_OPERAND (20110527/uteval-113)

I suspected it started with a BIOS update (A07 - A09).
Following the handbook, I took a look at acpidump. Sad to say,
it was all Greek to me; I couldn't even compile it back using
iasl (35 Errors). However, while skimming it I noticed names of
many versions of Windows and, in addition to that, Linux.
Just to try, I put hw.acpi.osname=Linux into /boot/loader.conf.
Since then I haven't had the panic again (for ~3 weeks).
I hope this is not just a coincidence.

Maybe this experience can help somebody else. If any of ACPI
developers wants to play with the problem I can provide more
info (sorry, no crashdump, was not enabled), do tests, etc.

BR
Oli
RE: panics due to buggy ACPI in Dell Latitude E6530?
Please forward the acpidump for the machine, thanks.
Bob

> -----Original Message-----
> From: owner-freebsd-a...@freebsd.org
> [mailto:owner-freebsd-a...@freebsd.org] On Behalf Of kron
> Sent: Wednesday, February 27, 2013 7:23 AM
> To: freebsd-acpi@freebsd.org
> Subject: panics due to buggy ACPI in Dell Latitude E6530?
>
> [...]
Re: panics due to buggy ACPI in Dell Latitude E6530?
on 27/02/2013 17:22 kron said the following:
> Hi,
>
> I have a Dell notebook (Latitude E6530) on which I track 9-STABLE.
> It served excellently until mid-Jan when it started to panic a few
> times a week or so:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 03
> fault virtual address   = 0x10116
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0x802bc360
> stack pointer           = 0x28:0xff848f6db390
> frame pointer           = 0x28:0xff848f6db3c0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 2199 (conky)
> trap number             = 12
> panic: page fault
> cpuid = 3
>
> Before the panics, the kernel used to emit messages like:
> ACPI Error: No object attached to node 0xfe00094a51c0 (20110527/exresnte-138)
> ACPI Error: Method execution failed [\_SB_.BAT0._UID] (Node 0xfe00094a51c0), AE_AML_NO_OPERAND (20110527/uteval-113)

This looks very much like a heisenbug reported several times here.
E.g.:
http://lists.freebsd.org/pipermail/freebsd-acpi/2012-December/007962.html

> I suspected it started with a BIOS update (A07 - A09).
> Following the handbook, I took a look at acpidump. Sad to say,
> it was all Greek to me; I couldn't even compile it back using
> iasl (35 Errors). However, while skimming it I noticed names of
> many versions of Windows and, in addition to that, Linux.
> Just to try, I put hw.acpi.osname=Linux into /boot/loader.conf.
> Since then I haven't had the panic again (for ~3 weeks).
> I hope this is not just a coincidence.

It very well could be.

> Maybe this experience can help somebody else. If any of ACPI
> developers wants to play with the problem I can provide more
> info (sorry, no crashdump, was not enabled), do tests, etc.

Please at least enable printing of a stack trace.
Better do get the crash dump.

P.S. I suspect that the issue we are discussing with hps in this
mailing list could be related to this problem.

--
Andriy Gapon
Re: panics due to buggy ACPI in Dell Latitude E6530?
On 2013/02/27 19:31, Moore, Robert wrote:
> I need the binary tables, or the actual output of acpidump, not the
> disassembled code.
> Thanks,
> Bob

oh, sorry... now acpidump -t:
1. raw dump
   http://www.filedropper.com/delllatitudee6530a09acpidump
2. stdout
   http://www.filedropper.com/delllatitudee6530a09acpidump-t

Thanks for your interest
Oli