Re: iwpriv (Was: OLPC News 2007-12-30)

2008-01-01 Thread David Woodhouse

On Mon, 2007-12-31 at 18:10 +, David Woodhouse wrote:
> > An interesting goal would be cleaning up CONFIG_OLPC so that
> > it could be enabled in stock kernels of standard Linux distros.
>
> I actually see that as a prerequisite for getting the thing upstream.
> And the first step along that path is to stop making it worse.

Let's see if we can repeat history. If experience with the libertas
driver is anything to go by, I predict that by starting to look at the
problem, I will provoke others into generating a storm of conflicting
patches by attempting to do the same thing themselves¹.

So here's an untested patch to make the reboot fixups slightly more
generic, so that we can easily add our own 'fixup' for the XO in a
fashion which will actually be mergeable upstream.

Untested-but-otherwise-Signed-Off-By: David Woodhouse [EMAIL PROTECTED]

diff --git a/arch/x86/kernel/reboot_32.c b/arch/x86/kernel/reboot_32.c
index bb1a0f8..dedb1d8 100644
--- a/arch/x86/kernel/reboot_32.c
+++ b/arch/x86/kernel/reboot_32.c
@@ -332,9 +332,7 @@ static void native_machine_shutdown(void)
 #endif
 }
 
-void __attribute__((weak)) mach_reboot_fixups(void)
-{
-}
+void (*mach_reboot_fixup)(void);
 
 static void native_machine_emergency_restart(void)
 {
@@ -347,7 +345,8 @@ static void native_machine_emergency_restart(void)
 	/* rebooting needs to touch the page at absolute addr 0 */
 	*((unsigned short *)__va(0x472)) = reboot_mode;
 	for (;;) {
-		mach_reboot_fixups(); /* for board specific fixups */
+		if (mach_reboot_fixup)
+			mach_reboot_fixup();
 		mach_reboot();
 		/* That didn't work - force a triple fault.. */
 		load_idt(no_idt);
diff --git a/arch/x86/kernel/reboot_fixups_32.c b/arch/x86/kernel/reboot_fixups_32.c
index f452726..d9607a7 100644
--- a/arch/x86/kernel/reboot_fixups_32.c
+++ b/arch/x86/kernel/reboot_fixups_32.c
@@ -14,16 +14,18 @@
 #include <asm/msr.h>
 #include <asm/geode.h>
 
-static void cs5530a_warm_reset(struct pci_dev *dev)
+static struct pci_dev *cs5530a_pci_dev;
+
+static void cs5530a_warm_reset(void)
 {
 	/* writing 1 to the reset control register, 0x44 causes the
 	cs5530a to perform a system warm reset */
-	pci_write_config_byte(dev, 0x44, 0x1);
+	pci_write_config_byte(cs5530a_pci_dev, 0x44, 0x1);
 	udelay(50); /* shouldn't get here but be safe and spin-a-while */
 	return;
 }
 
-static void cs5536_warm_reset(struct pci_dev *dev)
+static void cs5536_warm_reset(void)
 {
 	/* writing 1 to the LSB of this MSR causes a hard reset */
 	wrmsrl(MSR_DIVIL_SOFT_RESET, 1ULL);
@@ -48,24 +50,23 @@ static struct device_fixup fixups_table[] = {
  * do return, we keep looking and then eventually fall back to the
  * standard mach_reboot on return.
  */
-void mach_reboot_fixups(void)
+int mach_reboot_fixup_setup(void)
 {
 	struct device_fixup *cur;
 	struct pci_dev *dev;
 	int i;
 
-	/* we can be called from sysrq-B code. In such a case it is
-	 * prohibited to dig PCI */
-	if (in_interrupt())
-		return;
-
 	for (i=0; i < ARRAY_SIZE(fixups_table); i++) {
 		cur = &(fixups_table[i]);
 		dev = pci_get_device(cur->vendor, cur->device, NULL);
 		if (!dev)
 			continue;
 
-		cur->reboot_fixup(dev);
+		cs5530a_pci_dev = dev;
+		mach_reboot_fixup = cur->reboot_fixup;
 	}
+	return 0;
 }
 
+subsys_initcall(mach_reboot_fixup_setup);
+
diff --git a/include/asm-x86/reboot_fixups.h b/include/asm-x86/reboot_fixups.h
index 0cb7d87..4f79001 100644
--- a/include/asm-x86/reboot_fixups.h
+++ b/include/asm-x86/reboot_fixups.h
@@ -1,6 +1,6 @@
 #ifndef _LINUX_REBOOT_FIXUPS_H
 #define _LINUX_REBOOT_FIXUPS_H
 
-extern void mach_reboot_fixups(void);
+extern void (*mach_reboot_fixup)(void);
 
 #endif /* _LINUX_REBOOT_FIXUPS_H */
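
As a rough illustration of the pattern in the patch above (this is not
kernel code), the sketch below models the change in plain userspace C:
the weak mach_reboot_fixups() function is replaced by a function pointer
that is filled in once at init time and only tested and called on the
restart path. The names fake_board_fixup, fixup_setup and
emergency_restart_once are made up for this example, and the counter
stands in for the real hardware reset.

```c
#include <stddef.h>

/* Hook consulted on the restart path; NULL means "no board-specific fixup". */
static void (*mach_reboot_fixup)(void);

/* Hypothetical counter standing in for the hardware reset side effect. */
static int fixup_calls;

/* Hypothetical board fixup: a real one would poke a reset register. */
void fake_board_fixup(void)
{
	fixup_calls++;
}

/* Models the init-time setup: runs once, where device probing is safe. */
int fixup_setup(int board_present)
{
	if (board_present)
		mach_reboot_fixup = fake_board_fixup;
	return 0;
}

/* Models one pass of the restart path: a pointer test and a call, no probing. */
int emergency_restart_once(void)
{
	if (mach_reboot_fixup)
		mach_reboot_fixup();
	return fixup_calls;
}
```

With this split the device lookup happens once during setup, which
appears to be why the patch can drop the in_interrupt() guard from the
restart path.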

-- 
dwmw2

¹ Only this time I don't actually plan to follow through; I'm relying on
the interference 

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: iwpriv (Was: OLPC News 2007-12-30)

2008-01-01 Thread Bernardo Innocenti
David Woodhouse wrote:


> So here's an untested patch to make the reboot fixups slightly more
> generic, so that we can easily add our own 'fixup' for the XO in a
> fashion which will actually be mergeable upstream.

It would be slightly nicer and more generic if we had

 void (*mach_reboot_fixup)(void *arg);
 void *mach_reboot_fixup_arg;

rather than the cs5530a_pci_dev global.

But anyway,

> Untested-but-otherwise-Signed-Off-By: David Woodhouse [EMAIL PROTECTED]
Untested-but-otherwise-Acked-By: Bernardo Innocenti [EMAIL PROTECTED]
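
A minimal sketch of the hook-plus-argument variant suggested above,
written as a userspace model and not as kernel code. Only
mach_reboot_fixup and mach_reboot_fixup_arg come from the suggestion;
struct fake_pci_dev, fake_cs5530a_warm_reset, register_reboot_fixup and
run_reboot_fixup are hypothetical names for illustration, and the byte
array stands in for PCI config space.

```c
#include <stddef.h>

/* Hook and its opaque argument, as proposed in the message above. */
void (*mach_reboot_fixup)(void *arg);
void *mach_reboot_fixup_arg;

/* Hypothetical stand-in for struct pci_dev: just a fake config space. */
struct fake_pci_dev {
	unsigned char cfg[256];
};

/* Hypothetical fixup: "writes" 1 to config register 0x44 of its device. */
void fake_cs5530a_warm_reset(void *arg)
{
	struct fake_pci_dev *dev = arg;

	dev->cfg[0x44] = 0x1;
}

/* Store the argument before the function so the pair is never half set. */
void register_reboot_fixup(void (*fn)(void *), void *arg)
{
	mach_reboot_fixup_arg = arg;
	mach_reboot_fixup = fn;
}

/* Restart path: call the hook with whatever argument was registered. */
void run_reboot_fixup(void)
{
	if (mach_reboot_fixup)
		mach_reboot_fixup(mach_reboot_fixup_arg);
}
```

The fixup no longer needs a file-scope device pointer; whatever state it
needs travels with the hook.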

-- 
 \___/
 |___|   Bernardo Innocenti - http://www.codewiz.org/
  \___\  One Laptop Per Child - http://www.laptop.org/


Re: OLPC News 2007-12-30

2007-12-31 Thread David Woodhouse
On Sun, 2007-12-30 at 12:05 -1000, Mitch Bradley wrote:
> I meant the OLPC kernel.
>
> I presume that OLPC changes will be offered to mainline in some batch
> fashion, rather than piecemeal. This particular one is of no upstream
> value in isolation, as it is utterly dependent on OLPC-specific EC
> commands.

As a general rule, that is totally incorrect. Changes should be pushed
towards upstream _before_ they're ever committed to our tree, and any
change which has been made only in the OLPC tree and not pushed upstream
should be considered volatile and likely to disappear... like the
private wireless ioctls I removed last week because they weren't
upstream for example¹.

However, you're right about this patch not going upstream -- I thought
I'd already told you that the naïve patch to cs5536_warm_reset() as
shown in ticket #4397 was not acceptable. It doesn't live in that
function, and even if it did, it shouldn't be happening unconditionally
based on CONFIG_OLPC.
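
One way to read the objection above, sketched as a userspace model
under stated assumptions: instead of an #ifdef CONFIG_OLPC inside the
generic reset function, the XO-specific step lives in its own fixup and
is selected at run time. Every name here except mach_reboot_fixup is
hypothetical, including the running_on_xo flag, which stands in for
whatever runtime probe the kernel would actually use; the counters
stand in for the hardware accesses.

```c
#include <stdbool.h>

/* Hypothetical result of a runtime "is this an XO?" probe. */
static bool running_on_xo;

/* Counters standing in for the real hardware side effects. */
static int ec_commands_sent;
static int chipset_resets;

/* Hook chosen once at setup time and called on the restart path. */
void (*mach_reboot_fixup)(void);

/* Generic chipset reset: knows nothing about the XO or its EC. */
void generic_warm_reset(void)
{
	chipset_resets++;
}

/* XO-specific fixup: tell the EC first, then do the generic reset. */
void xo_reboot_fixup(void)
{
	ec_commands_sent++;
	generic_warm_reset();
}

/* The same binary picks the right behaviour from what it finds at run time. */
int reboot_fixup_setup(void)
{
	mach_reboot_fixup = running_on_xo ? xo_reboot_fixup : generic_warm_reset;
	return 0;
}
```

A kernel built this way could have OLPC support enabled and still behave
normally on other hardware that uses the same chipset.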

-- 
dwmw2

¹ I have actually put them back now, temporarily. But they will be going
away again. Nothing is stable until it's upstream.




iwpriv (Was: OLPC News 2007-12-30)

2007-12-31 Thread Bernardo Innocenti
David Woodhouse wrote:

> As a general rule, that is totally incorrect. Changes should be pushed
> towards upstream _before_ they're ever committed to our tree, and any
> change which has been made only in the OLPC tree and not pushed upstream
> should be considered volatile and likely to disappear... like the
> private wireless ioctls I removed last week because they weren't
> upstream for example¹.
>
> ¹ I have actually put them back now, temporarily. But they will be going
> away again. Nothing is stable until it's upstream.

btw, we still have code in /etc/init.d/olpc-configure that
tries to use one of those private ioctls to remap the leds,
and outputs errors if they're missing.  Is this still needed?


> However, you're right about this patch not going upstream -- I thought
> I'd already told you that the naïve patch to cs5536_warm_reset() as
> shown in ticket #4397 was not acceptable. It doesn't live in that
> function, and even if it did, it shouldn't be happening unconditionally
> based on CONFIG_OLPC.

An interesting goal would be cleaning up CONFIG_OLPC so that
it could be enabled in stock kernels of standard Linux distros.

-- 
 \___/
 |___|   Bernardo Innocenti - http://www.codewiz.org/
  \___\  One Laptop Per Child - http://www.laptop.org/



Re: iwpriv (Was: OLPC News 2007-12-30)

2007-12-31 Thread David Woodhouse

On Mon, 2007-12-31 at 12:56 -0500, Bernardo Innocenti wrote: 
> btw, we still have code in /etc/init.d/olpc-configure that
> tries to use one of those private ioctls to remap the leds,
> and outputs errors if they're missing.  Is this still needed?

Yes, I think so. And I think it probably even justifies a private ioctl.
So it'll get proper consideration and it'll get sent upstream. Not just
dumped into our kernel and forgotten. 

> > However, you're right about this patch not going upstream -- I thought
> > I'd already told you that the naïve patch to cs5536_warm_reset() as
> > shown in ticket #4397 was not acceptable. It doesn't live in that
> > function, and even if it did, it shouldn't be happening unconditionally
> > based on CONFIG_OLPC.
>
> An interesting goal would be cleaning up CONFIG_OLPC so that
> it could be enabled in stock kernels of standard Linux distros.

I actually see that as a prerequisite for getting the thing upstream.
And the first step along that path is to stop making it worse.

-- 
dwmw2



OLPC News 2007-12-30

2007-12-30 Thread Walter Bender
1. Give One Get One: The G1G1 program ends on December 31. G1G1 has
not only made it possible to seed the launch of programs in Haiti,
Rwanda, Ethiopia, Cambodia, Mongolia, and Afghanistan, but has also
greatly broadened the community of participation in the project.
The community has already jumped in to help: the level of activity in
our forums, IRC, email lists, wiki, etc. has risen dramatically over
the past few weeks. G1G1 participants have asked lots of questions and
uncovered some new bugs, but they also have lots of answers and have
submitted some new patches. The community model seems to be scaling.

Many thanks to Hilary Meserole and the tireless efforts from the teams
at Pentagram, Nurun, Eleven, Patriot, and Brightstar.

2. Mary Lou Jepsen: Mary Lou's last day at OLPC is December 31. She
will be continuing to consult with us on a number of different fronts
as she chases after her next miracle in display technology. Mary Lou
was OLPC employee Number One, both in terms of when she joined the
organization and in terms of the breadth and depth of her
contributions. Thank you and best of luck with your adventures in a
new role and new year.

3. Embedded controller (EC): Richard Smith has tested a battery EEPROM
dumping feature recently added by Andres Salomon: it seems to work
great. Richard has written crontab scripts and phone home scripts
for inclusion in joyride builds, with the intent to include them in an
upcoming release to build an anonymous database of battery
performance. These scripts will sample the power used every five
minutes and log it. They only sample when the battery is charging or
discharging. The hope is to gather a composite view of battery
performance under realistic conditions of use.
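
The sampling rule described above can be sketched as follows. This is
only an illustration of the stated policy (one sample per five-minute
tick, logged only while the battery is charging or discharging); it is
not the actual crontab scripts, and all names in it are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical battery states; the real scripts read these from the EC. */
enum battery_state { BATTERY_IDLE, BATTERY_CHARGING, BATTERY_DISCHARGING };

/* Sampling period from the description above. */
#define SAMPLE_INTERVAL_MINUTES 5

/* A sample is worth logging only when current is flowing either way. */
bool should_log_sample(enum battery_state state)
{
	return state == BATTERY_CHARGING || state == BATTERY_DISCHARGING;
}

/* Given one state per five-minute tick, count how many ticks get logged. */
size_t count_logged_samples(const enum battery_state *states, size_t n)
{
	size_t logged = 0;

	for (size_t i = 0; i < n; i++)
		if (should_log_sample(states[i]))
			logged++;
	return logged;
}
```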

Richard noticed that on the community-development list there are at
least two reports of the EC "going terminal", meaning that on boot
they get the error message: "EC problem. Remove all power and
restart." We need to get those machines to Cambridge to investigate
further.

Another issue reported on the community list is that a few people's
batteries are not charging. Richard says this would not surprise him
if they were NiMH batteries, but G1G1 machines have the LiFePO4
batteries. He did have one person run logbat and send him the
results: the EC reads the battery fine and is attempting to charge the
battery, but no current ever goes into the battery. Again, we need to
get these machines to Cambridge as we haven't seen this behavior
before.

4. Open Firmware: Mitch Bradley continued to provide G1G1 customer
support, for example, chasing down some problems with SD cards. He
also added the ability to delete JFFS2 files from Open Firmware and
fixed Tickets #5717, #5585, and #5727, all improvements to the overall
OFW performance and reliability. Preparations continue on OFW for the
Intel prototype XO board.

5. Wireless firmware: Marvell released firmware version 5.110.20.p49
which addresses Ticket #5194. With this firmware release, all known
major low-level bugs have been addressed. With the wireless driver
that's in the current ship builds, we see locking errors under heavy
load from which the driver recovers automatically. David Woodhouse is
doing a major rewrite of the driver which should eventually address
that issue.

6. Software ECOs: From time to time there may be critical bug fixes
that must be released between our regularly scheduled releases. These
may occur due to security issues, from unexpected hardware problems,
or the discovery of latent bugs that affect large numbers of users.
We've started a page in the wiki to discuss the software engineering
change order (ECO) process (see
http://wiki.laptop.org/go/Operating_system_release_procedures).

7. Support: The past week has been a busy one for Adam Holt and the
OLPC support team. Adam has organized a team of 30 support volunteers
to comprehensively answer [EMAIL PROTECTED] tickets. (Each ticket is an
ongoing email conversation with a donor/client.) The volunteer team is
working hard and is keeping up with the support load. Part of the
process includes the compilation of a Support FAQ (See
http://wiki.laptop.org/go/Support_FAQ). Adam is also organizing a
virtual call center based on asterisk.org VoIP. Matthew O'Gorman is
helping finalize the server. Callers will access a local US number in
the 617 area code. It will be informal, but we hope it will provide a
critical outreach to those users who need it most. We hope to complete
testing and possibly an initial rollout within the coming week.

Please everyone recruit your XO-aware friends as:
(1) charming volunteers to answer phones; and
(2) perfectionist volunteers to help organize our wiki pages.

You can email Adam regarding your talents, motivations, and a phone
number at holt AT laptop DOT org. Thanks!

There will be an Organizing Sunday meeting among our volunteers on
30 December, 4PM EST. All interested parties can join if they email
Adam first.

Noah Kantrowitz has helped to 

Re: OLPC News 2007-12-30

2007-12-30 Thread Mitch Bradley

> Richard noticed that on the community-development list there are at
> least two reports of the EC going terminal, meaning that on boot
> they get the error message: EC problem. Remove all power and
> restart. We need to get those machines to Cambridge to investigate
> further.

It is unlikely that getting those specific machines to Cambridge will 
prove helpful, unless one of those systems exhibits the problem with 
great regularity.  I have seen that problem happen on quite a few 
machines - but it happens very infrequently, always on a power-up, and 
it always goes away when you completely reset the EC by removing the 
battery and AC.

It is quite possible that fixing http://dev.laptop.org/ticket/4397 will 
make the problem go away.  The technique that the kernel currently uses 
to reboot involves forcing a triple-fault, which results in the main CPU 
resetting without the EC's knowledge.  There is a 2-line patch in the 
ticket; it makes the kernel reboot using the approved EC interaction.

I have been trying for 2 months to get this fix included in the kernel, 
but so far I haven't managed to get any traction.
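
To make the ordering concrete, here is a small userspace model of the
reboot sequence described above, using the port and MSR values from the
ticket snippet quoted elsewhere in this thread (0xdb written to port
0x66, then 1 written to MSR 0x51400017). The fake_outb and fake_wrmsrl
helpers are hypothetical stand-ins that record accesses instead of
touching hardware, and xo_warm_reset_model is a made-up name. Treat it
as a sketch of the order of operations, not as the actual fix.

```c
#include <stdint.h>

/* One recorded hardware access: either a port write or an MSR write. */
enum access_kind { ACCESS_OUTB, ACCESS_WRMSR };

struct access {
	enum access_kind kind;
	uint32_t where;   /* I/O port or MSR number */
	uint64_t value;
};

#define TRACE_MAX 8
static struct access trace[TRACE_MAX];
static int trace_len;

/* Append one access to the trace, silently dropping any overflow. */
static void record(enum access_kind kind, uint32_t where, uint64_t value)
{
	if (trace_len < TRACE_MAX) {
		trace[trace_len].kind = kind;
		trace[trace_len].where = where;
		trace[trace_len].value = value;
		trace_len++;
	}
}

/* Fake outb(): same argument order as the kernel helper (value, then port). */
void fake_outb(uint8_t value, uint16_t port)
{
	record(ACCESS_OUTB, port, value);
}

/* Fake wrmsrl(): records the MSR write instead of resetting the machine. */
void fake_wrmsrl(uint32_t msr, uint64_t value)
{
	record(ACCESS_WRMSR, msr, value);
}

/* Order of operations from the ticket snippet: EC command, then reset. */
void xo_warm_reset_model(void)
{
	fake_outb(0xdb, 0x66);
	/* the ticket snippet waits 100 microseconds here with udelay() */
	fake_wrmsrl(0x51400017, 1ULL);
}
```

The EC sees its command before the chipset reset is triggered, so the
CPU no longer resets behind the EC's back.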



Re: OLPC News 2007-12-30

2007-12-30 Thread Jaya Kumar
On Dec 31, 2007 3:23 AM, Mitch Bradley [EMAIL PROTECTED] wrote:
> resetting without the EC's knowledge.  There is a 2-line patch in the
> ticket; it makes the kernel reboot using the approved EC interaction.

Looking at your trac entry, I see:

The change is in arch/i386/kernel/reboot_fixups.c :
cs5536_warm_reset(), more or less like this:
+ #ifdef CONFIG_OLPC
+outb(0xdb, 0x66);
+udelay (100);
+ #endif
   wrmsrl(0x51400017, 1ULL);
   udelay(50);



> I have been trying for 2 months to get this fix included in the kernel,
> but so far I haven't managed to get any traction.


I am unsure if you mean the olpc repo or if you mean you haven't been
able to get the patch into Linus's mainline tree. If you mean
mainline, I didn't see the patch and can't find your posting in
[EMAIL PROTECTED] archives. If you can repost your patch after diffing
it against mainline (the file may be renamed to
arch/x86/kernel/reboot_fixups_32.c after the x86-64 merge) and please
CC me, I would be happy to ack it and Andres's previous one as well.

Thanks,
jaya


Re: OLPC News 2007-12-30

2007-12-30 Thread Richard A. Smith
Mitch Bradley wrote:

>> Richard noticed that on the community-development list there are at
>> least two reports of the EC going terminal, meaning that on boot
>> they get the error message: EC problem. Remove all power and
>> restart. We need to get those machines to Cambridge to investigate
>> further.
>
> It is unlikely that getting those specific machines to Cambridge will
> prove helpful, unless one of those systems exhibits the problem with

A fact omitted from the summary of my report was that it happens 100%
of the time. The laptop won't boot regardless of how long they leave
it without power.

-- 
Richard Smith  [EMAIL PROTECTED]
One Laptop Per Child


Re: OLPC News 2007-12-30

2007-12-30 Thread Mitch Bradley
Jaya Kumar wrote:
> On Dec 31, 2007 3:23 AM, Mitch Bradley [EMAIL PROTECTED] wrote:
>> resetting without the EC's knowledge.  There is a 2-line patch in the
>> ticket; it makes the kernel reboot using the approved EC interaction.
>
> Looking at your trac entry, I see:
>
> The change is in arch/i386/kernel/reboot_fixups.c :
> cs5536_warm_reset(), more or less like this:
> + #ifdef CONFIG_OLPC
> +outb(0xdb, 0x66);
> +udelay (100);
> + #endif
>    wrmsrl(0x51400017, 1ULL);
>    udelay(50);
>
>> I have been trying for 2 months to get this fix included in the kernel,
>> but so far I haven't managed to get any traction.
>
> I am unsure if you mean the olpc repo

I meant the OLPC kernel.

I presume that OLPC changes will be offered to mainline in some batch 
fashion, rather than piecemeal. This particular one is of no upstream 
value in isolation, as it is utterly dependent on OLPC-specific EC commands.

> or if you mean you haven't been
> able to get the patch into Linus's mainline tree. If you mean
> mainline, I didn't see the patch and can't find your posting in
> [EMAIL PROTECTED] archives. If you can repost your patch after diffing
> it against mainline (the file may be renamed to
> arch/x86/kernel/reboot_fixups_32.c after the x86-64 merge) and please
> CC me, I would be happy to ack it and Andres's previous one as well.
>
> Thanks,
> jaya



Re: OLPC News 2007-12-30

2007-12-30 Thread Mitch Bradley
Richard A. Smith wrote:
> Mitch Bradley wrote:
>>> Richard noticed that on the community-development list there are at
>>> least two reports of the EC going terminal, meaning that on boot
>>> they get the error message: EC problem. Remove all power and
>>> restart. We need to get those machines to Cambridge to investigate
>>> further.
>>
>> It is unlikely that getting those specific machines to Cambridge will
>> prove helpful, unless one of those systems exhibits the problem with
>
> A fact omitted from the summary of my report was that it happens 100%
> of the time. The laptop won't boot regardless of how long they leave
> it without power.


Ah, those would indeed be worthwhile to analyze.  I'm not sure they will
shed much light on the sporadic occurrences of the "EC problem" symptom,
though.  The 100% case is likely to be an EC that is completely broken
in some way.  We need to get to the root cause of both, eventually.

The "EC problem" message is not particularly precise as a microscopic
diagnostic: it basically means that OFW tried to talk to the EC and the
EC didn't answer.  That could be caused by any number of EC issues into
which OFW has little visibility.  My best guess is that "fails every
time" is probably due to a different root cause than "fails once in a
blue moon".  I would bet on hardware for the former and
software/firmware for the latter.
