Re: iwpriv (Was: OLPC News 2007-12-30)
On Mon, 2007-12-31 at 18:10 +, David Woodhouse wrote:
> > An interesting goal would be cleaning up CONFIG_OLPC so that it could be enabled in stock kernels of standard Linux distros.
>
> I actually see that as a prerequisite for getting the thing upstream. And the first step along that path is to stop making it worse.

Let's see if we can repeat history. If experience with the libertas driver is anything to go by, I predict that by starting to look at the problem, I will provoke others into generating a storm of conflicting patches by attempting to do the same thing themselves¹.

So here's an untested patch to make the reboot fixups slightly more generic, so that we can easily add our own 'fixup' for the XO in a fashion which will actually be mergeable upstream.

Untested-but-otherwise-Signed-Off-By: David Woodhouse [EMAIL PROTECTED]

diff --git a/arch/x86/kernel/reboot_32.c b/arch/x86/kernel/reboot_32.c
index bb1a0f8..dedb1d8 100644
--- a/arch/x86/kernel/reboot_32.c
+++ b/arch/x86/kernel/reboot_32.c
@@ -332,9 +332,7 @@ static void native_machine_shutdown(void)
 #endif
 }
 
-void __attribute__((weak)) mach_reboot_fixups(void)
-{
-}
+void (*mach_reboot_fixup)(void);
 
 static void native_machine_emergency_restart(void)
 {
@@ -347,7 +345,8 @@ static void native_machine_emergency_restart(void)
 	/* rebooting needs to touch the page at absolute addr 0 */
 	*((unsigned short *)__va(0x472)) = reboot_mode;
 	for (;;) {
-		mach_reboot_fixups(); /* for board specific fixups */
+		if (mach_reboot_fixup)
+			mach_reboot_fixup();
 		mach_reboot();
 		/* That didn't work - force a triple fault.. */
 		load_idt(no_idt);
diff --git a/arch/x86/kernel/reboot_fixups_32.c b/arch/x86/kernel/reboot_fixups_32.c
index f452726..d9607a7 100644
--- a/arch/x86/kernel/reboot_fixups_32.c
+++ b/arch/x86/kernel/reboot_fixups_32.c
@@ -14,16 +14,18 @@
 #include <asm/msr.h>
 #include <asm/geode.h>
 
-static void cs5530a_warm_reset(struct pci_dev *dev)
+static struct pci_dev *cs5530a_pci_dev;
+
+static void cs5530a_warm_reset(void)
 {
 	/* writing 1 to the reset control register, 0x44 causes the
 	cs5530a to perform a system warm reset */
-	pci_write_config_byte(dev, 0x44, 0x1);
+	pci_write_config_byte(cs5530a_pci_dev, 0x44, 0x1);
 	udelay(50); /* shouldn't get here but be safe and spin-a-while */
 	return;
 }
 
-static void cs5536_warm_reset(struct pci_dev *dev)
+static void cs5536_warm_reset(void)
 {
 	/* writing 1 to the LSB of this MSR causes a hard reset */
 	wrmsrl(MSR_DIVIL_SOFT_RESET, 1ULL);
@@ -48,24 +50,23 @@ static struct device_fixup fixups_table[] = {
  * do return, we keep looking and then eventually fall back to the
  * standard mach_reboot on return.
  */
-void mach_reboot_fixups(void)
+int mach_reboot_fixup_setup(void)
 {
 	struct device_fixup *cur;
 	struct pci_dev *dev;
 	int i;
 
-	/* we can be called from sysrq-B code. In such a case it is
-	 * prohibited to dig PCI */
-	if (in_interrupt())
-		return;
-
 	for (i=0; i < ARRAY_SIZE(fixups_table); i++) {
 		cur = &(fixups_table[i]);
 		dev = pci_get_device(cur->vendor, cur->device, NULL);
 		if (!dev)
 			continue;
 
-		cur->reboot_fixup(dev);
+		cs5530a_pci_dev = dev;
+		mach_reboot_fixup = cur->reboot_fixup;
 	}
+	return 0;
 }
 
+subsys_initcall(mach_reboot_fixup_setup);
+
diff --git a/include/asm-x86/reboot_fixups.h b/include/asm-x86/reboot_fixups.h
index 0cb7d87..4f79001 100644
--- a/include/asm-x86/reboot_fixups.h
+++ b/include/asm-x86/reboot_fixups.h
@@ -1,6 +1,6 @@
 #ifndef _LINUX_REBOOT_FIXUPS_H
 #define _LINUX_REBOOT_FIXUPS_H
 
-extern void mach_reboot_fixups(void);
+extern void (*mach_reboot_fixup)(void);
 
 #endif /* _LINUX_REBOOT_FIXUPS_H */

-- 
dwmw2

¹ Only this time I don't actually plan to follow through; I'm relying on the interference

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
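[Editor's sketch] The patch above replaces a weak, always-called function with a function pointer that is filled in once at init time and checked for NULL at reboot time. The following userspace C sketch illustrates that pattern only; it is not kernel code. `struct fake_dev`, the ID values, `reboot_fixup_setup()`, `emergency_restart()` and the `fixup_calls` counter are all hypothetical stand-ins, and unlike the patch this sketch stops at the first matching table entry.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI device: only an ID pair is modelled. */
struct fake_dev {
	unsigned vendor, device;
};

struct device_fixup {
	unsigned vendor, device;
	void (*reboot_fixup)(void);
};

/* Hook consulted at reboot time; NULL means "no board-specific fixup". */
static void (*mach_reboot_fixup)(void);

/* Counts fixup invocations so the behaviour can be observed in a test. */
static int fixup_calls;

static void cs5536_warm_reset(void)
{
	fixup_calls++;	/* the real function would write the reset MSR */
}

/* Placeholder IDs, not the real PCI vendor/device numbers. */
static const struct device_fixup fixups_table[] = {
	{ 0x1111, 0x0002, cs5536_warm_reset },
};

/* Setup-time scan: runs once, long before any reboot is requested. */
static void reboot_fixup_setup(const struct fake_dev *present, size_t n)
{
	size_t i, j;

	for (i = 0; i < sizeof fixups_table / sizeof fixups_table[0]; i++) {
		for (j = 0; j < n; j++) {
			if (present[j].vendor == fixups_table[i].vendor &&
			    present[j].device == fixups_table[i].device) {
				mach_reboot_fixup = fixups_table[i].reboot_fixup;
				return;	/* first match wins in this sketch */
			}
		}
	}
}

/* Reboot path: no device lookup here, just a pointer check and a call. */
static void emergency_restart(void)
{
	if (mach_reboot_fixup)
		mach_reboot_fixup();
	/* the real code would go on to mach_reboot() and the triple fault */
}
```

Because the table scan happens at setup time, the reboot path no longer has to search for devices itself, which appears to be why the patch can drop the in_interrupt() check.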
Re: iwpriv (Was: OLPC News 2007-12-30)
David Woodhouse wrote:
> So here's an untested patch to make the reboot fixups slightly more generic, so that we can easily add our own 'fixup' for the XO in a fashion which will actually be mergeable upstream.

It would be slightly nicer and generic if we had

  void (*mach_reboot_fixup)(void *arg);
  void *mach_reboot_fixup_arg;

rather than the cs5530a_pci_dev global. But anyway,

> Untested-but-otherwise-Signed-Off-By: David Woodhouse [EMAIL PROTECTED]

Untested-but-otherwise-Acked-By: Bernardo Innocenti [EMAIL PROTECTED]

-- 
   \___/
  |___|   Bernardo Innocenti - http://www.codewiz.org/
   \___\  One Laptop Per Child - http://www.laptop.org/
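[Editor's sketch] A minimal userspace illustration of the callback-plus-argument shape suggested above, assuming the two declarations are used as a pair. `struct fake_dev`, `register_reboot_fixup()` and `run_reboot_fixup()` are hypothetical names introduced for the example; the real fixup would call pci_write_config_byte() on a struct pci_dev.

```c
#include <stddef.h>

/* Hypothetical device record; the flag stands in for a config-space write. */
struct fake_dev {
	int reset_written;
};

/* The pair of globals proposed in the message. */
static void (*mach_reboot_fixup)(void *arg);
static void *mach_reboot_fixup_arg;

static void cs5530a_warm_reset(void *arg)
{
	struct fake_dev *dev = arg;

	dev->reset_written = 1;	/* stands in for the write to register 0x44 */
}

/* Store the argument before the pointer so the hook never sees a stale arg. */
static void register_reboot_fixup(void (*fn)(void *), void *arg)
{
	mach_reboot_fixup_arg = arg;
	mach_reboot_fixup = fn;
}

static void run_reboot_fixup(void)
{
	if (mach_reboot_fixup)
		mach_reboot_fixup(mach_reboot_fixup_arg);
}
```

With this shape each fixup receives its own context, so no device-specific global such as cs5530a_pci_dev is needed and a fixup that needs no device can simply ignore the argument.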
Re: OLPC News 2007-12-30
On Sun, 2007-12-30 at 12:05 -1000, Mitch Bradley wrote:
> I meant the OLPC kernel. I presume that OLPC changes will be offered to mainline in some batch fashion, rather than piecemeal. This particular one is of no upstream value in isolation, as it is utterly dependent on OLPC-specific EC commands.

As a general rule, that is totally incorrect. Changes should be pushed towards upstream _before_ they're ever committed to our tree, and any change which has been made only in the OLPC tree and not pushed upstream should be considered volatile and likely to disappear... like the private wireless ioctls I removed last week because they weren't upstream for example¹.

However, you're right about this patch not going upstream -- I thought I'd already told you that the naïve patch to cs5536_warm_reset() as shown in ticket #4397 was not acceptable. It doesn't live in that function, and even if it did, it shouldn't be happening unconditionally based on CONFIG_OLPC.

-- 
dwmw2

¹ I have actually put them back now, temporarily. But they will be going away again. Nothing is stable until it's upstream.
iwpriv (Was: OLPC News 2007-12-30)
David Woodhouse wrote:
> As a general rule, that is totally incorrect. Changes should be pushed towards upstream _before_ they're ever committed to our tree, and any change which has been made only in the OLPC tree and not pushed upstream should be considered volatile and likely to disappear... like the private wireless ioctls I removed last week because they weren't upstream for example¹.
>
> ¹ I have actually put them back now, temporarily. But they will be going away again. Nothing is stable until it's upstream.

btw, we still have code in /etc/init.d/olpc-configure that tries to use one of those private ioctls to remap the leds, and outputs errors if they're missing. Is this still needed?

> However, you're right about this patch not going upstream -- I thought I'd already told you that the naïve patch to cs5536_warm_reset() as shown in ticket #4397 was not acceptable. It doesn't live in that function, and even if it did, it shouldn't be happening unconditionally based on CONFIG_OLPC.

An interesting goal would be cleaning up CONFIG_OLPC so that it could be enabled in stock kernels of standard Linux distros.

-- 
   \___/
  |___|   Bernardo Innocenti - http://www.codewiz.org/
   \___\  One Laptop Per Child - http://www.laptop.org/
Re: iwpriv (Was: OLPC News 2007-12-30)
On Mon, 2007-12-31 at 12:56 -0500, Bernardo Innocenti wrote:
> btw, we still have code in /etc/init.d/olpc-configure that tries to use one of those private ioctls to remap the leds, and outputs errors if they're missing. Is this still needed?

Yes, I think so. And I think it probably even justifies a private ioctl. So it'll get proper consideration and it'll get sent upstream. Not just dumped into our kernel and forgotten.

> > However, you're right about this patch not going upstream -- I thought I'd already told you that the naïve patch to cs5536_warm_reset() as shown in ticket #4397 was not acceptable. It doesn't live in that function, and even if it did, it shouldn't be happening unconditionally based on CONFIG_OLPC.
>
> An interesting goal would be cleaning up CONFIG_OLPC so that it could be enabled in stock kernels of standard Linux distros.

I actually see that as a prerequisite for getting the thing upstream. And the first step along that path is to stop making it worse.

-- 
dwmw2
OLPC News 2007-12-30
1. Give One Get One: The G1G1 program ends on December 31. G1G1 has not only made it possible to seed the launch of programs in Haiti, Rwanda, Ethiopia, Cambodia, Mongolia, and Afghanistan, but has also greatly broadened the community of participation in the project. The community has already jumped in to help: the level of activity in our forums, IRC, email lists, wiki, etc. has risen dramatically over the past few weeks. G1G1 participants have asked lots of questions and uncovered some new bugs, but they have also supplied lots of answers and submitted some new patches. The community model seems to be scaling. Many thanks to Hilary Meserole and the tireless efforts from the teams at Pentagram, Nurun, Eleven, Patriot, and Brightstar.

2. Mary Lou Jepsen: Mary Lou's last day at OLPC is December 31. She will be continuing to consult with us on a number of different fronts as she chases after her next miracle in display technology. Mary Lou was OLPC employee Number One, both in terms of when she joined the organization and in terms of the breadth and depth of her contributions. Thank you and best of luck with your adventures in a new role and new year.

3. Embedded controller (EC): Richard Smith has tested a battery EEPROM dumping feature recently added by Andres Salomon: it seems to work great. Richard has written crontab scripts and phone-home scripts for inclusion in joyride builds, with the intent to include them in an upcoming release to build an anonymous database of battery performance. These scripts will sample the power used every five minutes and log it. They only sample when the battery is charging or discharging. The hope is to gather a composite view of battery performance under realistic conditions of use.

Richard noticed that on the community-development list there are at least two reports of the EC going terminal, meaning that on boot they get the error message: "EC problem. Remove all power and restart." We need to get those machines to Cambridge to investigate further.

Another issue raised on the community list is that a few people report their batteries not charging. Richard says this would not surprise him if they were NiMH batteries, but G1G1 machines have the LiFePo batteries. He had one person run logbat and send him the results: the EC reads the battery fine and is attempting to charge the battery, but no current ever goes into the battery. Again, we need to get these machines to Cambridge as we haven't seen this behavior before.

4. Open Firmware: Mitch Bradley continued to provide G1G1 customer support, for example, chasing down some problems with SD cards. He also added the ability to delete JFFS2 files from Open Firmware and fixed Tickets #5717, #5585, and #5727, all improvements to the overall OFW performance and reliability. Preparations continue on OFW for the Intel prototype XO board.

5. Wireless firmware: Marvell released firmware version 5.110.20.p49, which addresses Ticket #5194. With this firmware release, all known major low-level bugs have been addressed. With the wireless driver that's in the current ship builds, we see locking errors under heavy load from which the driver recovers automatically. David Woodhouse is doing a major rewrite of the driver which should eventually address that issue.

6. Software ECOs: From time to time there may be critical bug fixes that must be released between our regularly scheduled releases. These may occur due to security issues, unexpected hardware problems, or the discovery of latent bugs that affect large numbers of users. We've started a page in the wiki to discuss the software engineering change order (ECO) process (See http://wiki.laptop.org/go/Operating_system_release_procedures).

7. Support: The past week has been a busy one for Adam Holt and the OLPC support team. Adam has organized a team of 30 support volunteers to comprehensively answer [EMAIL PROTECTED] tickets. (Each ticket is an ongoing email conversation with a donor/client.) The volunteer team is working hard, but keeping up with the support load. Part of the process includes the compilation of a Support FAQ (See http://wiki.laptop.org/go/Support_FAQ).

Adam is also organizing a virtual call center based on asterisk.org VoIP. Matthew O'Gorman is helping finalize the server. Callers will access a local US number in the 617 area code. It will be informal, but we hope it will provide a critical outreach to those users who need it most. We hope to complete testing and possibly an initial rollout within the coming week.

Please everyone recruit your XO-aware friends as: (1) charming volunteers to answer phones; and (2) perfectionist volunteers to help organize our wiki pages. You can email Adam regarding your talents, motivations, and a phone number at holt AT laptop DOT org. Thanks! There will be an Organizing Sunday meeting among our volunteers on 30 December, 4PM EST. All interested parties can join if they email Adam first. Noah Kantrowitz has helped to
Re: OLPC News 2007-12-30
> Richard noticed that on the community-development list there are at least two reports of the EC going terminal, meaning that on boot they get the error message: "EC problem. Remove all power and restart." We need to get those machines to Cambridge to investigate further.

It is unlikely that getting those specific machines to Cambridge will prove helpful, unless one of those systems exhibits the problem with great regularity. I have seen that problem happen on quite a few machines - but it happens very infrequently, always on a power-up, and it always goes away when you completely reset the EC by removing the battery and AC.

It is quite possible that fixing http://dev.laptop.org/ticket/4397 will make the problem go away. The technique that the kernel currently uses to reboot involves forcing a triple-fault, which results in the main CPU resetting without the EC's knowledge. There is a 2-line patch in the ticket; it makes the kernel reboot using the approved EC interaction.

I have been trying for 2 months to get this fix included in the kernel, but so far I haven't managed to get any traction.
Re: OLPC News 2007-12-30
On Dec 31, 2007 3:23 AM, Mitch Bradley [EMAIL PROTECTED] wrote:
> resetting without the EC's knowledge. There is a 2-line patch in the ticket; it makes the kernel reboot using the approved EC interaction.

Looking at your trac entry, I see:

The change is in arch/i386/kernel/reboot_fixups.c : cs5536_warm_reset(), more or less like this:

+ #ifdef CONFIG_OLPC
+	outb(0xdb, 0x66);
+	udelay (100);
+ #endif
	wrmsrl(0x51400017, 1ULL);
	udelay(50);

> I have been trying for 2 months to get this fix included in the kernel, but so far I haven't managed to get any traction.

I am unsure if you mean the olpc repo or if you mean you haven't been able to get the patch into Linus's mainline tree. If you mean mainline, I didn't see the patch and can't find your posting in [EMAIL PROTECTED] archives. If you can repost your patch after diffing it against mainline (the file may be renamed to arch/x86/kernel/reboot_fixups_32.c after the x86-64 merge) and please CC me, I would be happy to ack it and Andres's previous one as well.

Thanks,
jaya
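[Editor's sketch] To make the ordering in the quoted snippet concrete, here is a userspace mock that records the operations instead of touching hardware. The values 0xdb, 0x66, 0x51400017 and the two delays are taken from the snippet; everything else (`mock_outb`, `mock_udelay`, `mock_wrmsrl`, `op_log`, and the runtime `is_xo` flag used in place of the `#ifdef`) is a hypothetical construction for illustration, not the ticket's patch.

```c
#include <stddef.h>

enum op_kind { OP_OUTB, OP_UDELAY, OP_WRMSRL };

struct op {
	enum op_kind kind;
	unsigned long long a, b;
};

/* Recorded operations, in the order they were issued. */
static struct op op_log[8];
static size_t op_count;

static void log_op(enum op_kind kind, unsigned long long a,
		   unsigned long long b)
{
	if (op_count < sizeof op_log / sizeof op_log[0]) {
		op_log[op_count].kind = kind;
		op_log[op_count].a = a;
		op_log[op_count].b = b;
		op_count++;
	}
}

/* Mocks of the kernel primitives: they only record their arguments. */
static void mock_outb(unsigned char value, unsigned short port)
{
	log_op(OP_OUTB, value, port);
}

static void mock_udelay(unsigned long usecs)
{
	log_op(OP_UDELAY, usecs, 0);
}

static void mock_wrmsrl(unsigned int msr, unsigned long long value)
{
	log_op(OP_WRMSRL, msr, value);
}

/* is_xo is a hypothetical runtime check replacing the compile-time #ifdef. */
static void warm_reset(int is_xo)
{
	if (is_xo) {
		mock_outb(0xdb, 0x66);	/* EC interaction from the ticket */
		mock_udelay(100);
	}
	mock_wrmsrl(0x51400017, 1ULL);	/* the existing reset MSR write */
	mock_udelay(50);
}
```

On a machine where `is_xo` is true the log shows the EC write and its delay ahead of the MSR write; otherwise only the original two steps are recorded.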
Re: OLPC News 2007-12-30
Mitch Bradley wrote:
> > Richard noticed that on the community-development list there are at least two reports of the EC going terminal, meaning that on boot they get the error message: "EC problem. Remove all power and restart." We need to get those machines to Cambridge to investigate further.
>
> It is unlikely that getting those specific machines to Cambridge will prove helpful, unless one of those systems exhibits the problem with

A fact omitted from the summary of my report was that it happens 100% of the time. The laptop won't boot regardless of how long they leave it without power.

-- 
Richard Smith [EMAIL PROTECTED]
One Laptop Per Child
Re: OLPC News 2007-12-30
Jaya Kumar wrote:
> On Dec 31, 2007 3:23 AM, Mitch Bradley [EMAIL PROTECTED] wrote:
> > resetting without the EC's knowledge. There is a 2-line patch in the ticket; it makes the kernel reboot using the approved EC interaction.
>
> Looking at your trac entry, I see:
>
> The change is in arch/i386/kernel/reboot_fixups.c : cs5536_warm_reset(), more or less like this:
>
> + #ifdef CONFIG_OLPC
> +	outb(0xdb, 0x66);
> +	udelay (100);
> + #endif
> 	wrmsrl(0x51400017, 1ULL);
> 	udelay(50);
>
> > I have been trying for 2 months to get this fix included in the kernel, but so far I haven't managed to get any traction.
>
> I am unsure if you mean the olpc repo

I meant the OLPC kernel. I presume that OLPC changes will be offered to mainline in some batch fashion, rather than piecemeal. This particular one is of no upstream value in isolation, as it is utterly dependent on OLPC-specific EC commands.

> or if you mean you haven't been able to get the patch into Linus's mainline tree. If you mean mainline, I didn't see the patch and can't find your posting in [EMAIL PROTECTED] archives. If you can repost your patch after diffing it against mainline (the file may be renamed to arch/x86/kernel/reboot_fixups_32.c after the x86-64 merge) and please CC me, I would be happy to ack it and Andres's previous one as well.
>
> Thanks,
> jaya
Re: OLPC News 2007-12-30
Richard A. Smith wrote:
> Mitch Bradley wrote:
> > > Richard noticed that on the community-development list there are at least two reports of the EC going terminal, meaning that on boot they get the error message: "EC problem. Remove all power and restart." We need to get those machines to Cambridge to investigate further.
> >
> > It is unlikely that getting those specific machines to Cambridge will prove helpful, unless one of those systems exhibits the problem with
>
> A fact omitted from the summary of my report was that it happens 100%. The laptop won't boot regardless of how long they leave it without power.

Ah, those would indeed be worthwhile to analyze. I'm not sure they will shed much light on the sporadic occurrences of the "EC problem" symptom, though. The 100% case is likely to be an EC that is completely broken in some way. We need to get root cause on both, eventually.

The "EC problem" message is not particularly precise as a microscopic diagnostic - it basically means that OFW tried to talk to the EC and the EC didn't answer. That could be caused by any number of EC issues into which OFW has little visibility.

My best guess is that "fails every time" is probably due to a different root cause than "fails once in a blue moon". I would bet on hardware for the former and software/firmware for the latter.