date:20050815

Re: [PATCH 5/6] i386 virtualization - Make generic set wrprotect a macro

2005-08-15 Thread Chris Wright

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> Make the generic version of ptep_set_wrprotect a macro.  This is good for
> code uniformity, and fixes the build for architectures which include pgtable.h
> through headers into assembly code, but do not define a ptep_set_wrprotect
> function.

This one is unrelated to other descriptor related changes.  Why is it
included in this series?

> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
> Index: linux-2.6.13/include/asm-generic/pgtable.h
> ===
> --- linux-2.6.13.orig/include/asm-generic/pgtable.h   2005-08-12 12:12:55.0 -0700
> +++ linux-2.6.13/include/asm-generic/pgtable.h   2005-08-15 13:54:42.0 -0700
> @@ -313,11 +313,12 @@
>  #endif
>  
>  #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
> -static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
> -{
> - pte_t old_pte = *ptep;
> - set_pte_at(mm, address, ptep, pte_wrprotect(old_pte));
> -}
> +#define ptep_set_wrprotect(__mm, __address, __ptep)  \
> +({   \
> + pte_t __old_pte = *(__ptep);\
> + set_pte_at((__mm), (__address), (__ptep),   \
> + pte_wrprotect(__old_pte));  \
> +})
>  #endif

I'm not sure I agree with this approach (although I understand the
motivation).  This should at least be a do {} while(0) type macro,
since it's not returning a value.
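
For illustration, a minimal userspace sketch of the do {} while(0) form
suggested above.  The pte_t type and the set_pte_at() and pte_wrprotect()
helpers here are simplified stand-ins rather than the kernel definitions,
and PTE_WRITE and wrprotect_if() are hypothetical names used only for this
example.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel types and helpers (assumptions). */
typedef struct { unsigned long val; } pte_t;
#define PTE_WRITE 0x2UL	/* hypothetical "writable" bit */

static pte_t pte_wrprotect(pte_t pte)
{
	pte.val &= ~PTE_WRITE;
	return pte;
}

/* mm and address are ignored in this mock; the kernel versions use them. */
#define set_pte_at(mm, address, ptep, pte) (*(ptep) = (pte))

/*
 * do { } while (0) makes the macro a single statement that yields no
 * value, so it composes safely with if/else and a trailing semicolon.
 */
#define ptep_set_wrprotect(__mm, __address, __ptep)			\
do {									\
	pte_t __old_pte = *(__ptep);					\
	set_pte_at((__mm), (__address), (__ptep),			\
		   pte_wrprotect(__old_pte));				\
} while (0)

/* Example caller: the macro sits in an unbraced if/else without trouble. */
static unsigned long wrprotect_if(int cond, pte_t *ptep)
{
	if (cond)
		ptep_set_wrprotect((void *)0, 0UL, ptep);
	else
		ptep->val |= PTE_WRITE;
	return ptep->val;
}
```

Unlike the ({ }) statement-expression form, this version cannot be used
where a value is expected, which matches a helper that returns void.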

thanks,
-chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

On Mon, 2005-08-15 at 22:35 -0700, Chris Wedgwood wrote:
> On Tue, Aug 16, 2005 at 12:19:49AM -0500, Michael E Brown wrote:
> 
> > Hmm... did I mention libsmbios? :-)
> > http://linux.dell.com/libsmbios/main.
> 
> I'm aware of it --- it seems pretty limited right now and I'm still
> irked Dell isn't more forthcoming with documentation.

I cannot give docs, but I can retype the docs into code or xml files.
What, specifically, were you looking for?

I intend to make an XML file of all of the token name, id, and
description mappings within the next couple of weeks. This should pretty
much document all of the token mappings. 

Next thing would be to do something for the SMI calls. What I will do
there is basically just put a big table and make a C-API available for
every SMI call we support.

> Given that why not resubmit the kernel driver when the userspace
> becomes usable for people without them having to use MonsterApp from
> Dell?

Well, there are three different groups involved here. I didn't write the
dcdbas code, Doug did. I just reviewed it and decided it would be nice
to implement in libsmbios. I started work on the libsmbios side of
things this weekend. I didn't know that Doug had reposted the driver to
linux-kernel until about 4pm this afternoon. :-(

Libsmbios isn't the only user of dcdbas. That is the third group.
(MonsterApp, so nicely put...)

When I found out Doug reposted the driver, I went into overdrive trying
to finish out libsmbios. But, basically, libsmbios is a one-person
project, and that person would be me. And I have a "real" job to do
besides just libsmbios. :-)  The best I can guarantee is next week,
although if my manager is understanding, I may have it done sooner. :-)

> 
> > Aside from that, for the most part, the only thing SMI ever does is
> > pass buffers back and forth.
> 
> I meant to ask; does this have horrible latency or nasties like lots
> of laptop SMM stuff?

That really depends, I guess. The hugely horrible laptop SMM stuff
mostly has to do with the battery gauge. The reason that the battery
stuff takes so long is that they basically do an entire current
measurement and computation of the battery each time the SMI is called
and do not (and pretty much cannot) cache anything from call to call.
Compounding things, they have to talk to the battery over a very slow
serial link. (as related to me by a former BIOS engineer)

I haven't done any measurements on servers, but I bet that most of it
isn't anywhere near as bad as the laptop stuff.
--
Michael


Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven

On Tue, 2005-08-16 at 12:30 +0900, Hiro Yoshioka wrote:

> The following example shows the L3 cache miss is reduced from 37410 to 107.

most impressive; it seems the approach to do this selectively is paying
off very well! 

The only comment/question I have is about the use of prefetchnta; that
might have cache-evicting properties as well (eg evict the cache of the
original of the copy, eg the userspace memory). Is that really the right
approach? 
In addition, my measurements show that removing the prefetch from the
main copy loop is a gain because the modern cpus have an autoprefetcher
already in the hardware.

   "1: prefetchnta (%0)\n" /* This set is 28 bytes */
+   "   prefetchnta 64(%0)\n"
+   "   prefetchnta 128(%0)\n"
+   "   prefetchnta 192(%0)\n"
+   "   prefetchnta 256(%0)\n"
+   "2:  \n"
+   ".section .fixup, \"ax\"\n"
+   "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
+   "   jmp 2b\n"
+   ".previous\n"

oh and prefetch(nta) is a non-faulting instruction so no need for the
fixup handling...
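
A rough userspace sketch of that last point, assuming GCC or Clang, whose
__builtin_prefetch() is documented not to fault on an invalid address.  The
name copy_with_prefetch and the 64-byte stride are illustrative choices and
are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREFETCH_STRIDE 64	/* illustrative cache-line size */

/*
 * Copy n bytes, issuing a non-temporal prefetch hint one stride ahead.
 * The hint may point past the end of src; because a prefetch does not
 * fault, no exception fixup is needed around it.
 */
static void copy_with_prefetch(void *dst, const void *src, size_t n)
{
	const unsigned char *s = src;
	unsigned char *d = dst;
	size_t off;

	for (off = 0; off < n; off += PREFETCH_STRIDE) {
		size_t left = n - off;
		size_t chunk = left < PREFETCH_STRIDE ? left : PREFETCH_STRIDE;
		/* Compute the hint address as an integer to avoid forming
		 * an out-of-bounds pointer by pointer arithmetic. */
		uintptr_t ahead = (uintptr_t)s + off + PREFETCH_STRIDE;

		/* rw = 0 (read), locality = 0 (non-temporal hint) */
		__builtin_prefetch((const void *)ahead, 0, 0);
		memcpy(d + off, s + off, chunk);
	}
}
```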


But overall this is starting to look really interesting!

Greetings,
   Arjan van de Ven


Re: [PATCH 3/6] i386 virtualization - Make ldt a desc struct

2005-08-15 Thread Zachary Amsden


Chris Wright wrote:

Thanks for the feedback.  Comments inline.


@@ -30,7 +33,7 @@
static inline unsigned long get_desc_base(struct desc_struct *desc)
{
unsigned long base;
-   base = ((desc->a >> 16)  & 0x0000ffff) |
+   base = (desc->a >> 16) |
   



Seemingly unrelated.
 



Yes, alas my bucket has leaks.  I was hoping for better assembly, but 
never got around to verifying.  So I matched this to shorter C code 
which I had obsoleted.



@@ -28,28 +28,27 @@
}
#endif

-static inline int alloc_ldt(mm_context_t *pc, const int oldsize, int mincount, const int reload)
+static inline int alloc_ldt(mm_context_t *pc, const int old_pages, int new_pages, const int reload)
{
-   void *oldldt;
-   void *newldt;
+   struct desc_struct *oldldt;
+   struct desc_struct *newldt;

   



Not quite related here (since change was introduced in earlier patch),
but old alloc_ldt special cased when room was available.  This is gone,
so am I reading this correctly, each time through it will allocate a
new one, and free the old one (if it existed)?  Just double checking
that change doesn't introduce possible mem leak.
 



Since LDT is now in pages, it is only called when page reservation 
increases.   I chose a slightly bad name here for new_pages.  See 
further down:


   if (page_number >= mm->context.ldt_pages) {
           error = alloc_ldt(&current->mm->context, mm->context.ldt_pages,
                             page_number+1, 1);
           if (error < 0)
                   goto out_unlock;
   }

I actually had to check the code here to verify there is no leak, and I 
don't believe I changed any semantics, but was very happy when I found this:


   if (old_pages) {
   ClearPagesLDT(oldldt, old_pages);
   if (old_pages > 1)
   vfree(oldldt);
   else
   kfree(oldldt);
   }



-   mincount = (mincount+511)&(~511);
-   if (mincount*LDT_ENTRY_SIZE > PAGE_SIZE)
-   newldt = vmalloc(mincount*LDT_ENTRY_SIZE);
+   if (new_pages > 1)
+   newldt = vmalloc(new_pages*PAGE_SIZE);
else
-   newldt = kmalloc(mincount*LDT_ENTRY_SIZE, GFP_KERNEL);
+   newldt = kmalloc(PAGE_SIZE, GFP_KERNEL);
   



If so, then full page is likely to be reusable in common case, no? (If
there's such a thing as LDT common case ;-)
 



Yeah, there is no LDT common case.  This code could be made 100% optimal 
with a lot of likely/unlikely wrappers and additional cleanup, but 
seeing as it is already uncommon, the only worthwhile optimizations for 
this code are ones that reduce code size or make it more readable and 
less error prone.  I had to write a test that emits inline assembler 
onto a crossing page boundary, clones the VM, and tests strict 
conformance to byte/page limits to actually test this.  :)




if (!newldt)
return -ENOMEM;

-   if (oldsize)
-   memcpy(newldt, pc->ldt, oldsize*LDT_ENTRY_SIZE);
+   if (old_pages)
+   memcpy(newldt, pc->ldt, old_pages*PAGE_SIZE);
oldldt = pc->ldt;
if (reload)
-   memset(newldt+oldsize*LDT_ENTRY_SIZE, 0, (mincount-oldsize)*LDT_ENTRY_SIZE);
+   memset(newldt+old_pages*LDT_ENTRIES_PER_PAGE, 0, (new_pages-old_pages)*PAGE_SIZE);
   



In fact, I _think_ this causes a problem.  Who says newldt is bigger
than old one?  This looks like user-triggerable oops to me.
 



Safe -- two call sites.  One has no LDT (cloning), and the other is here:

   if (page_number >= mm->context.ldt_pages) {
           error = alloc_ldt(&current->mm->context, mm->context.ldt_pages,


Note page_number is zero-based, ldt_pages is not.
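
A small sketch of why that call site always asks for more pages than it
has.  The helper names ldt_needs_grow and ldt_pages_wanted are hypothetical
and exist only for this illustration.

```c
#include <assert.h>

/*
 * page_number is a zero-based index; ldt_pages is a count.
 * Index N needs at least N + 1 pages, so growth is required
 * exactly when page_number >= ldt_pages.
 */
static int ldt_needs_grow(int page_number, int ldt_pages)
{
	return page_number >= ldt_pages;
}

/* Number of pages to request when growth is needed. */
static int ldt_pages_wanted(int page_number)
{
	return page_number + 1;
}
```

Whenever ldt_needs_grow() is true, ldt_pages_wanted() is strictly greater
than the current page count, so (new_pages - old_pages) stays positive.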


@@ -113,13 +114,13 @@
unsigned long size;
struct mm_struct * mm = current->mm;

-   if (!mm->context.size)
+   if (!mm->context.ldt_pages)
return 0;
if (bytecount > LDT_ENTRY_SIZE*LDT_ENTRIES)
bytecount = LDT_ENTRY_SIZE*LDT_ENTRIES;

down(&mm->context.sem);
-   size = mm->context.size*LDT_ENTRY_SIZE;
+   size = mm->context.ldt_pages*PAGE_SIZE;
if (size > bytecount)
size = bytecount;
   



This now looks like you can leak data?  Since a full page is unlikely to be
used, but accounting is done in page sizes, asking read_ldt for a
bytecount of PAGE_SIZE could give some uninitialized data back to the user.
Did I miss the spot where this is always zero-filled?
 



You could leak data, but the code already takes care to zero the page.  
This is especially important, since random present segments could allow 
a violation of kernel security ;)


   if (reload)
           memset(newldt+old_pages*LDT_ENTRIES_PER_PAGE, 0,
                  (new_pages-old_pages)*PAGE_SIZE);
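
A userspace sketch of that grow-and-zero step, under stated assumptions:
PAGE_SIZE is taken as 4096, struct desc_struct is a two-word stand-in,
malloc() replaces kmalloc()/vmalloc(), and grow_ldt is a hypothetical name.
Unlike the quoted code, the sketch zeroes the tail unconditionally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE		4096	/* assumed i386 page size */
#define LDT_ENTRY_SIZE		8
#define LDT_ENTRIES_PER_PAGE	(PAGE_SIZE / LDT_ENTRY_SIZE)

struct desc_struct { unsigned int a, b; };	/* stand-in: two 32-bit words */

/*
 * Grow an LDT from old_pages to new_pages (new_pages > old_pages).
 * Returns the new table, or NULL on allocation failure; the caller
 * still owns (and must free) the old table.
 */
static struct desc_struct *grow_ldt(const struct desc_struct *old,
				    int old_pages, int new_pages)
{
	struct desc_struct *newldt = malloc((size_t)new_pages * PAGE_SIZE);

	if (!newldt)
		return NULL;
	if (old_pages)
		memcpy(newldt, old, (size_t)old_pages * PAGE_SIZE);
	/*
	 * newldt is a desc_struct pointer, so the offset is counted in
	 * entries while the length is counted in bytes.
	 */
	memset(newldt + old_pages * LDT_ENTRIES_PER_PAGE, 0,
	       (size_t)(new_pages - old_pages) * PAGE_SIZE);
	return newldt;
}
```

Every byte beyond the copied pages is zero, so a read of a whole page
cannot return stale allocator contents.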




Wow.  Thanks for a completely thorough review.  I have tested this code 
quite intensely, but I very much appreciate having more eyes on it, 
since it is quite

Re: oops in 2.6.13-rc6-git5

2005-08-15 Thread D. Hazelton

On Monday 15 August 2005 08:22, Jesper Juhl wrote:
> Can you reproduce the crash reliably?
> Can you reproduce the crash with a non-tainted kernel?

I've tried several times now to reproduce the oops, but there might have been 
another factor that led to the oops, because just booting the kernel and 
shutting down does not trigger it. I have tried reproducing all conditions up 
to the time that the oops actually occurred and think it might just be my 
hardware going flaky - so I have reloaded the module that taints the kernel 
(have done this already and having the module loaded when the powerdown 
started did not trigger the oops) and am seeing if using it for any period of 
time causes the oops to occur. If it does I will try the other driver 
available for the modem, although that one also contains proprietary code. 
The upside is that it does not make use of any functions marked as deprecated 
in any version of the kernel, where the official driver requires me to 
re-enable a deprecated function that had been made non-exported by the 
kernel.

DRH


Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven

On Tue, 2005-08-16 at 13:17 +0900, Hirokazu Takahashi wrote:
> Hi,
> 
> BTW, what are you going to do with the page-faults which may happen
> during __copy_user_zeroing_nocache()? The current process may be blocked
> in the handler for a while and get FPU registers polluted.
> kernel_fpu_begin() won't help the case. This is another issue, though.


__copy_from_user_inatomic

.. that implies it won't sleep actually ;)




Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven

On Tue, 2005-08-16 at 13:54 +0900, Hiro Yoshioka wrote:
> Takahashi san,
> 
> I appreciate your comments.
> 
> > Hi,
> > 
> > BTW, what are you going to do with the page-faults which may happen
> > during __copy_user_zeroing_nocache()? The current process may be blocked
> > in the handler for a while and get FPU registers polluted.
> > kernel_fpu_begin() won't help the case. This is another issue, though.
> 
> My code does nothing about it.
> 
> I need a volunteer to implement it.

it's actually not too hard; all you need is to use SSE and not MMX; and
then just store sse register you're overwriting on the stack or so...




Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Valdis . Kletnieks

On Mon, 15 Aug 2005 23:58:43 CDT, Michael E Brown said:

> No, this is an _EXCELLENT_ reason why _LESS_ of this should be in the
> kernel. Why should we have to duplicate a _TON_ of code inside the
> kernel to figure out which platform we are on, and then look up in a
> table which method to use for that platform? We have a _MUCH_ nicer
> programming environment available to us in userspace where we can use
> things like libsmbios to look up the platform type, then look in an
> easily-updateable text file which smi type to use. In general, plugging
> the wrong value here is a no-op.

You'll still need to do some *very* basic checking - there's a fairly
scary-looking 'outb' call in callintf_smi() and host_control_smi() that seems
to be totally too trusting that The Right Thing is located at address
CMOS_BASE_PORT:

+   for (index = PE1300_CMOS_CMD_STRUCT_PTR;
+index < (PE1300_CMOS_CMD_STRUCT_PTR + 4);
+index++) {
+   outb(index,
+(CMOS_BASE_PORT + CMOS_PAGE2_INDEX_PORT_PIIX4));
+   outb(*data++,
+(CMOS_BASE_PORT + CMOS_PAGE2_DATA_PORT_PIIX4));
+   }

This Dell C840 has an 845, not a PIIX.  What just got toasted if this driver
gets called?

Can we have a check that the machine is (a) a Dell and (b) has a PIIX and (c)
the PIIX has a functional SMI behind it, before we start doing outb() calls?
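
One possible shape for such a guard, as a sketch only.  struct platform_info
and cmos_smi_allowed() are hypothetical; how each field would actually be
probed (DMI strings, PCI IDs, and so on) is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical summary of what probing found; not a real kernel struct. */
struct platform_info {
	const char *sys_vendor;	/* e.g. taken from DMI */
	int has_piix4;		/* chipset actually present */
	int smi_verified;	/* SMI interface confirmed functional */
};

/* Return nonzero only when all three preconditions hold. */
static int cmos_smi_allowed(const struct platform_info *pi)
{
	if (!pi || !pi->sys_vendor)
		return 0;
	if (strncmp(pi->sys_vendor, "Dell", 4) != 0)
		return 0;
	return pi->has_piix4 && pi->smi_verified;
}
```

With a guard like this in front of the loop, a Dell without a PIIX would
be rejected before any outb() is issued.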





Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Chris Wedgwood

On Tue, Aug 16, 2005 at 12:19:49AM -0500, Michael E Brown wrote:

> Hmm... did I mention libsmbios? :-)
> http://linux.dell.com/libsmbios/main.

I'm aware of it --- it seems pretty limited right now and I'm still
irked Dell isn't more forthcoming with documentation.

> SMI support is not yet implemented inside libsmbios, but I am
> working feverishly on it (in-between emails to linux-kernel, of
> course.) :-) We have a development mailing list, and I will make the
> announcement there when it has been completed.

[...]

> I cannot (at this point, I'm working on it, though), provide our
> internal documentation of our SMI implementation directly. But, I am
> authorized to add all of this to libsmbios, and I intend to very
> clearly document all of the SMI calls in libsmbios.

Given that why not resubmit the kernel driver when the userspace
becomes usable for people without them having to use MonsterApp from
Dell?

> Aside from that, for the most part, the only thing SMI ever does is
> pass buffers back and forth.

I meant to ask; does this have horrible latency or nasties like lots
of laptop SMM stuff?

Re: kernel 2.4.27-10: isofs driver ignore some parameters with mount

2005-08-15 Thread Horms

On Mon, Aug 15, 2005 at 10:11:21PM -0300, Marcelo Tosatti wrote:
> 
> Hi folks,
> 
> On Fri, Aug 12, 2005 at 05:29:36PM +0900, Horms wrote:
> > On Fri, Aug 12, 2005 at 10:44:17AM +0300, Alexander Pytlev wrote:
> > > Hello Debian,
> > > 
> > > Kernel 2.4.27-10
> > > With mount isofs filesystem, any mount parameters after
> > > iocharset=,map=,session= are ignored.
> > > 
> > > Sample:
> > > 
> > > mount -t isofs -o uid=100,iocharset=koi8-r,gid=100 /dev/cdrom /media/cdrom
> > > 
> > > gid=100 - was ignored
> > > 
> > > I looked in the source and found the problem. I made two patches, simple and full
> > > (the full one adds some functionality: it ignores wrong mount parameters)
> > 
> > Thanks,
> > 
> > I will try and get the simple version of this patch into the next
> > Sarge update.
> > 
> > I have also CCed Marcelo and the LKML for their consideration,
> > as this problem still seems to be present in the latest 2.4 tree.
> > 
> > -- 
> > Horms
> > 
> > simple patch:
> > ===
> > --- kernel-source-2.4.27/fs/isofs/inode.c   2005-05-19 13:29:39.0 +0300
> > +++ kernel-source/fs/isofs/inode.c  2005-08-11 11:55:12.0 +0300
> > @@ -340,13 +340,13 @@
> > else if (!strcmp(value,"acorn")) popt->map = 'a';
> > else return 0;
> > }
> > -   if (!strcmp(this_char,"session") && value) {
> > +   else if (!strcmp(this_char,"session") && value) {
> > char * vpnt = value;
> > unsigned int ivalue = simple_strtoul(vpnt, &vpnt, 0);
> > if(ivalue < 0 || ivalue >99) return 0;
> > popt->session=ivalue+1;
> > }
> > -   if (!strcmp(this_char,"sbsector") && value) {
> > +   else if (!strcmp(this_char,"sbsector") && value) {
> > char * vpnt = value;
> > unsigned int ivalue = simple_strtoul(vpnt, &vpnt, 0);
> > if(ivalue < 0 || ivalue >660*512) return 0;
> > ===
> 
> Neither "sbsector" nor "session" parameters are part of the options string
> used in Alexander's example, so how come this patch can make any difference?
>
> Usage of "sbsector" or "session" parameters could explain the above patch
> making a difference because the buggy, always true "(unsigned long) ivalue < 0"
> comparison invokes "return 0", but that is not the case.
> 
> The code after the "popt->iocharset = value;" does not make any sense.
> 
> It seems that the "*value = 0" assignment can screw up the rest of the
> string, isn't that the real issue?
> 
> #ifdef CONFIG_JOLIET
> if (!strcmp(this_char,"iocharset") && value) {
> popt->iocharset = value;
> while (*value && *value != ',')
> value++;
> if (value == popt->iocharset)
> return 0;
> *value = 0;
> } else
> #endif

Sorry about that, while the patch above does seem to be
a valid clean up, on further examination I agree that it
does not address the problem at hand, and that the problem seems
to lie in the *value assignment as you suggest. I wonder
if advancing this_char to the character after value, if
non-NUL, would resolve this problem. I'll do some testing,
but in the meantime, here is what I have in mind:

--- a/fs/isofs/inode.c  2005-08-16 14:22:27.0 +0900
+++ b/fs/isofs/inode.c  2005-08-16 14:27:55.0 +0900
@@ -329,7 +329,10 @@
value++;
if (value == popt->iocharset)
return 0;
-   *value = 0;
+   if (*value) {
+   this_char = value + 1;
+   *value = 0;
+   }
} else
 #endif
if (!strcmp(this_char,"map") && value) {
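
For comparison, a minimal userspace sketch of an option loop in which each
"key=value" item is consumed exactly once and scanning always resumes at
the next comma.  This is not the kernel parser: struct iso_opts and
parse_iso_opts() are hypothetical and handle only three of the options.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct iso_opts {		/* hypothetical, trimmed-down option set */
	char iocharset[32];
	unsigned int uid, gid;
};

/*
 * Parse "key=value,key=value,..." in place. Returns 1 on success and
 * 0 on an unknown key or missing value. Every option goes through one
 * else-if chain, so handling iocharset cannot end the scan early.
 */
static int parse_iso_opts(char *options, struct iso_opts *o)
{
	char *this_char = options;

	while (this_char && *this_char) {
		char *next = strchr(this_char, ',');
		char *value;

		if (next)
			*next++ = '\0';		/* terminate this option */
		value = strchr(this_char, '=');
		if (!value || !value[1])
			return 0;
		*value++ = '\0';

		if (!strcmp(this_char, "iocharset")) {
			strncpy(o->iocharset, value, sizeof(o->iocharset) - 1);
			o->iocharset[sizeof(o->iocharset) - 1] = '\0';
		} else if (!strcmp(this_char, "uid")) {
			o->uid = (unsigned int)strtoul(value, NULL, 0);
		} else if (!strcmp(this_char, "gid")) {
			o->gid = (unsigned int)strtoul(value, NULL, 0);
		} else {
			return 0;
		}
		this_char = next;	/* continue with the following option */
	}
	return 1;
}
```

Fed the option string from the original report, this loop still reaches
gid=100 after it has handled iocharset=koi8-r.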

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

On Tue, 2005-08-16 at 01:17 -0400, [EMAIL PROTECTED] wrote:
> On Mon, 15 Aug 2005 23:09:28 CDT, you said:
> 
> > No, dcdbas has nothing to do with this. I'll have to submit a patch
> > against the docs. The program you need to use already exists and is
> > open source. You can use libsmbios to do this.
> > http://linux.dell.com/libsmbios/main.
> 
> Now I'm confoozled.  Maybe - I suspect we're actually in violent agreement...

nope... :-)

> 
> On Mon, 15 Aug 2005 17:58:56 CDT, [EMAIL PROTECTED] said:
> > Additionally, we are releasing an open source library (GPL/OSL dual 
> > license) that can use these hooks to perform many systems management 
> > functions in userspace. See http://linux.dell.com/libsmbios/main/. We 
> > should have code in libsmbios to do SMI using this driver within about two 
> > weeks.  We are currently writing the SMI hooks in libsmbios using this posted
> > version of the driver. I am the maintainer of this project, and it is my
> > goal to have code in libsmbios for every Dell SMI call.
> 
> So dcdbas *is* intended as the kernel end of the userspace libsmbios, which
> is the suggested way of getting that BIOS updated. OK, I got it now.. ;)

not quite... :-)

Basically, for the exact case of RBU, libsmbios _today_ has what is
necessary to support this, without using dcdbas.

Today, libsmbios can set certain CMOS bits. _Some_ of the BIOS F2 screen
options are represented in CMOS as bits. Also, other features are made
available through CMOS that are not available through F2, and all of
these bits (F2 bits and other bits) can be manipulated by libsmbios. It
just so happens that RBU is implemented using a CMOS bit (represented by
token 0x005C and 0x005D). 
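
As a sketch of what "activating a token" could amount to, assuming a token
maps to one CMOS byte plus an AND mask and an OR value.  The struct layout
and the function names are illustrative assumptions, and the code works on
an in-memory array rather than real CMOS; the actual location and masks
would have to come from the system's own tables.

```c
#include <assert.h>
#include <stdint.h>

#define CMOS_SIZE 256	/* CMOS is small: 256 bytes in total */

/* Hypothetical token description; field names are illustrative. */
struct cmos_token {
	uint16_t id;		/* e.g. 0x005C */
	uint8_t  location;	/* CMOS byte offset */
	uint8_t  and_mask;	/* bits to keep */
	uint8_t  or_value;	/* bits to set */
};

/* Activate a token: clear the masked-out bits, then set the token's bits. */
static void activate_token(uint8_t cmos[CMOS_SIZE], const struct cmos_token *t)
{
	cmos[t->location] =
		(uint8_t)((cmos[t->location] & t->and_mask) | t->or_value);
}

/* A token counts as active when its bits currently read back as or_value. */
static int token_is_active(const uint8_t cmos[CMOS_SIZE],
			   const struct cmos_token *t)
{
	return (cmos[t->location] & (uint8_t)~t->and_mask) == t->or_value;
}
```

Under this model, a pair of tokens such as 0x005C and 0x005D would share
one location and mask and differ only in the value they write.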

The addition of 'dcdbas' driver enables _extra_, _additional_
functionality that libsmbios does not today have. The rest of the BIOS
F2 screen options that are not in CMOS are available through SMI. Also,
lots of other interesting stuff that is not related to BIOS F2 screens
is available through SMI.

To give an example: the Asset tag can be set through CMOS and SMI.
Today, libsmbios can only set asset tag through CMOS. With the addition
of dcdbas, libsmbios can use the SMI method to update asset tag. 

SMI is a more reliable way to set asset tag, as it is dynamic and system
flash is updated right away. Future systems may drop CMOS method
completely as we start to run out of room in CMOS (there are only 256
_bytes_ available in CMOS, remember.) 

Basically, I am positioning libsmbios as an open-source way to take
advantage of all of the features of a Dell system that are available
through the system smbios/dmi table (similar to dmidecode), system cmos,
or through SMI calls.

> 
> (continuing on)
> 
> > The binary you want to use is "activateCmosToken", under bins/output/
> > (after compilation). The command line syntax is like this:
> > activateCmosToken 0x005C
> > 
> > If you want to cancel a BIOS update that has already been activated
> > (per above), use:   
> > activateCmosToken 0x005D
> > 
> > Basically, follow the docs in the RBU docs as far as cat-ing the bios
> > update image to the rbu sysfs files, then use the activateCmosToken
> > program to tell BIOS to do the update on reboot. 
> 
> Ahh... the missing piece I didn't have before. :)

I provided this info to Abhay when he posted RBU, and I thought he had
already updated the rbu docs with this info. I suppose I should have
checked. :-(

--
Michael


Re: [PATCH 3/6] i386 virtualization - Make ldt a desc struct

2005-08-15 Thread Chris Wright

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> Make the LDT a desc_struct pointer, since this is what it actually is.

I like that plan.

> There is code which relies on the fact that LDTs are allocated in page
> chunks, and it is both cleaner and more convenient to keep the rather
> poorly named "size" variable from the LDT in terms of LDT pages.

I noticed it's replaced by context.ldt and context.ldt_pages, which
appear to be decoupling the overloaded use from before.  Looks sane.
More comments below.

> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
> Index: linux-2.6.13/include/asm-i386/mmu.h
> ===
> --- linux-2.6.13.orig/include/asm-i386/mmu.h  2005-08-15 11:16:59.0 -0700
> +++ linux-2.6.13/include/asm-i386/mmu.h   2005-08-15 11:19:49.0 -0700
> @@ -9,9 +9,9 @@
>   * cpu_vm_mask is used to optimize ldt flushing.
>   */
>  typedef struct { 
> - int size;
>   struct semaphore sem;
> - void *ldt;
> + struct desc_struct *ldt;
> + int ldt_pages;
>  } mm_context_t;
>  
>  #endif
> Index: linux-2.6.13/include/asm-i386/desc.h
> ===
> --- linux-2.6.13.orig/include/asm-i386/desc.h 2005-08-15 11:16:59.0 -0700
> +++ linux-2.6.13/include/asm-i386/desc.h  2005-08-15 11:19:49.0 -0700
> @@ -6,6 +6,9 @@
>  
>  #define CPU_16BIT_STACK_SIZE 1024
>  
> +/* The number of LDT entries per page */
> +#define LDT_ENTRIES_PER_PAGE (PAGE_SIZE / LDT_ENTRY_SIZE)
> +
>  #ifndef __ASSEMBLY__
>  
>  #include 
> @@ -30,7 +33,7 @@
>  static inline unsigned long get_desc_base(struct desc_struct *desc)
>  {
>   unsigned long base;
> - base = ((desc->a >> 16)  & 0x0000ffff) |
> + base = (desc->a >> 16) |

Seemingly unrelated.

>   ((desc->b << 16) & 0x00ff0000) |
>   (desc->b & 0xff000000);
>   return base;
> @@ -171,7 +174,7 @@
>  static inline void load_LDT_nolock(mm_context_t *pc, int cpu)
>  {
>   void *segments = pc->ldt;
> - int count = pc->size;
> + int count = pc->ldt_pages * LDT_ENTRIES_PER_PAGE;
>  
>   if (likely(!count)) {
>   segments = &default_ldt[0];
> Index: linux-2.6.13/include/asm-i386/mmu_context.h
> ===
> --- linux-2.6.13.orig/include/asm-i386/mmu_context.h  2005-08-15 11:16:59.0 -0700
> +++ linux-2.6.13/include/asm-i386/mmu_context.h   2005-08-15 11:19:49.0 -0700
> @@ -19,7 +19,7 @@
>   memset(&mm->context, 0, sizeof(mm->context));
>   init_MUTEX(&mm->context.sem);
>   old_mm = current->mm;
> - if (old_mm && unlikely(old_mm->context.size > 0)) {
> + if (old_mm && unlikely(old_mm->context.ldt)) {
>   retval = copy_ldt(&mm->context, &old_mm->context);
>   }
>   if (retval == 0)
> @@ -32,7 +32,7 @@
>   */
>  static inline void destroy_context(struct mm_struct *mm)
>  {
> - if (unlikely(mm->context.size))
> + if (unlikely(mm->context.ldt))
>   destroy_ldt(mm);
>   del_lazy_mm(mm);
>  }
> Index: linux-2.6.13/include/asm-i386/mach-default/mach_desc.h
> ===
> --- linux-2.6.13.orig/include/asm-i386/mach-default/mach_desc.h   2005-08-15 11:16:59.0 -0700
> +++ linux-2.6.13/include/asm-i386/mach-default/mach_desc.h   2005-08-15 11:19:49.0 -0700
> @@ -62,11 +62,10 @@
>   _set_tssldt_desc(_cpu_gdt_table(cpu)[GDT_ENTRY_LDT], (int)addr, 
> ((size << 3)-1), 0x82);
>  }
>  
> -static inline int write_ldt_entry(void *ldt, int entry, __u32 entry_a, __u32 entry_b)
> +static inline int write_ldt_entry(struct desc_struct *ldt, int entry, __u32 entry_a, __u32 entry_b)
>  {
> - __u32 *lp = (__u32 *)((char *)ldt + entry*8);
> - *lp = entry_a;
> - *(lp+1) = entry_b;
> + ldt[entry].a = entry_a;
> + ldt[entry].b = entry_b;
>   return 0;
>  }
>  
> Index: linux-2.6.13/arch/i386/kernel/ptrace.c
> ===
> --- linux-2.6.13.orig/arch/i386/kernel/ptrace.c   2005-08-15 11:16:59.0 -0700
> +++ linux-2.6.13/arch/i386/kernel/ptrace.c   2005-08-15 11:19:49.0 -0700
> @@ -164,18 +164,20 @@
>* and APM bios ones we just ignore here.
>*/
>   if (segment_from_ldt(seg)) {
> - u32 *desc;
> + mm_context_t *context;
> + struct desc_struct *desc;
>   unsigned long base;
>  
> - down(&child->mm->context.sem);
> - desc = child->mm->context.ldt + (seg & ~7);
> - base = (desc[0] >> 16) | ((desc[1] & 0xff) << 16) | (desc[1] & 0xff000000);
> + context = &child->mm->context;
> + down(&context->sem);
> + desc = &context->ldt[segment_index(seg)];
> + base = get_desc_base(desc);
>  
>   /* 16-bit code segment? */
> - if

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

Again, please cc Doug and me on replies...

Kyle Moffett wrote:
>On Aug 16, 2005, at 00:34:51, Chris Wedgwood wrote:
>> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
>>> Why can't you just implement the system management actions in the
>>> kernel driver?
>>
>> Why put things in the kernel unless it's really needed?

Thank you. Our sentiments exactly.

>>
>> I'm not thrilled about the lack of userspace support for this driver
>> but that still doesn't mean we need to shovel wads of crap into the
>> kernel.

Hmm... did I mention libsmbios? :-)
http://linux.dell.com/libsmbios/main.

SMI support is not yet implemented inside libsmbios, but I am working
feverishly on it (in-between emails to linux-kernel, of course.) :-) We
have a development mailing list, and I will make the announcement there
when it has been completed. I will also be submitting patches to the RBU
documentation and dcdbas documentation pointing to libsmbios for folks
that want a 100% open-source method of using these drivers.

I cannot (at this point, I'm working on it, though), provide our
internal documentation of our SMI implementation directly. But, I am
authorized to add all of this to libsmbios, and I intend to very
clearly document all of the SMI calls in libsmbios. 

>
> I'm worried that it might be more of a mess in userspace than it could be
> if done properly in the kernel.  Hardware drivers, especially for something
> as critical as the BIOS, should probably be done in-kernel.  Look at the
> mess that X has become, it mmaps /dev/mem and pokes at the PCI busses
> directly.  I just don't want an SMI driver to become another /dev/mem.

Again, this is a whole different thing from a video card driver. The
SMI driver consists of one instruction: "outb magic_port#,
magic_value;", with the simple addition that EBX contain the
physical address of buffer that holds the requested command code and the
return values.

There isn't really a whole lot more than that. For the Dell SMI, you
have to look up the magic port # and magic value in smbios,
specifically, there is a vendor structure 0xDA with a specific layout
(which will be documented in libsmbios) that specifies the magic port
and value.
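
A sketch of the lookup half of that, assuming the caller already has the
raw SMBIOS structure table in memory.  It relies only on the generic
structure framing (type byte, length byte, then a string area ending in a
double NUL) and makes no assumption about the internal layout of the 0xDA
structure.  find_smbios_struct is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a raw SMBIOS structure table and return a pointer to the first
 * structure of the wanted type, or NULL. Each structure is a formatted
 * area (byte 0 = type, byte 1 = length) followed by strings that end
 * with a double NUL.
 */
static const uint8_t *find_smbios_struct(const uint8_t *table, size_t len,
					 uint8_t wanted)
{
	size_t pos = 0;

	while (pos + 4 <= len) {
		uint8_t type = table[pos];
		uint8_t flen = table[pos + 1];
		size_t next;

		if (flen < 4 || pos + flen > len)
			return NULL;		/* malformed table */
		if (type == wanted)
			return table + pos;

		/* Skip the string area: scan for the terminating double NUL. */
		next = pos + flen;
		while (next + 1 < len && (table[next] || table[next + 1]))
			next++;
		if (next + 1 >= len)
			return NULL;		/* no terminator found */
		pos = next + 2;
	}
	return NULL;
}
```

A caller would pass 0xDA as the wanted type and then interpret the
returned structure according to the vendor's documented layout.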

Aside from that, for the most part, the only thing SMI ever does is
pass buffers back and forth. 

-- 
Michael


Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Valdis . Kletnieks

On Mon, 15 Aug 2005 23:09:28 CDT, you said:

> No, dcdbas has nothing to do with this. I'll have to submit a patch
> against the docs. The program you need to use already exists and is
> open source. You can use libsmbios to do this.
> http://linux.dell.com/libsmbios/main.

Now I'm confoozled.  Maybe - I suspect we're actually in violent agreement...

On Mon, 15 Aug 2005 17:58:56 CDT, [EMAIL PROTECTED] said:
>   Additionally, we are releasing an open source library (GPL/OSL dual 
> license) that can use these hooks to perform many systems management 
> functions in userspace. See http://linux.dell.com/libsmbios/main/. We 
> should have code in libsmbios to do SMI using this driver within about two 
> weeks.  We are currently writing the SMI hooks in libsmbios using this posted
> version of the driver. I am the maintainer of this project, and it is my goal 
> to have code in libsmbios for every Dell SMI call.

So dcdbas *is* intended as the kernel end of the userspace libsmbios, which
is the suggested way of getting that BIOS updated. OK, I got it now.. ;)

(continuing on)

> The binary you want to use is "activateCmosToken", under bins/output/
> (after compilation). The command line syntax is like this:
>   activateCmosToken 0x005C
> 
> If you want to cancel a BIOS update that has already been activated
> (per above), use: 
>   activateCmosToken 0x005D
> 
> Basically, follow the docs in the RBU docs as far as cat-ing the bios
> update image to the rbu sysfs files, then use the activateCmosToken
> program to tell BIOS to do the update on reboot. 

Ahh... the missing piece I didn't have before. :)



Re: obsolete modparam change busted.

2005-08-15 Thread Dave Jones

On Tue, Aug 16, 2005 at 02:39:10PM +1000, Rusty Russell wrote:
 > On Sat, 2005-08-13 at 14:27 -0400, Dave Jones wrote:
 > > We're now munching the end of the boot command line it seems.
 > 
 > Wow, if we had testcases in the kernel source, I would not have to keep
 > rewriting them (badly) every time I touched this code.

Maybe someone should write a userspace test harness for kernel code. :-P

 > Throw away that stupid patch, apply this stupid patch.

I'll throw it into the Fedora kernel tomorrow, and re-test.
Thanks for chasing this so quickly.

Dave


Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Chris Wedgwood

On Tue, Aug 16, 2005 at 12:55:35AM -0400, Kyle Moffett wrote:

> I'm worried that it might be more of a mess in userspace than it
> could be if done properly in the kernel.

If it's going to be a mess, I would rather we put that mess in
userspace than in the kernel.

> Hardware drivers, especially for something as critical as the BIOS,
> should probably be done in-kernel.

For the most part it's not for the BIOS though AFAICT.  It's for some
hacky little microcontroller that controls fans and similar things
that Dell won't give up details for.

> Look at the mess that X has become, it mmaps /dev/mem and pokes at
> the PCI busses directly.  I just don't want an SMI driver to become
> another /dev/mem.

It's for old hardware, I'm not sure it will evolve much other than
just for maintenance.

It's really hard to say, maybe if Dell gave more details about the
userspace we would have a better idea?


Re: [PATCH 2/6] i386 virtualization - Remove some dead debugging code

2005-08-15 Thread Zachary Amsden


Chris Wright wrote:

> * [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
>
>> This code is quite dead.  Release_thread is always guaranteed that the mm has
>> already been released, thus dead_task->mm will always be NULL.
>>
>> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
>> Index: linux-2.6.13/arch/i386/kernel/process.c
>> ===
>> --- linux-2.6.13.orig/arch/i386/kernel/process.c   2005-08-15 10:46:18.0 -0700
>> +++ linux-2.6.13/arch/i386/kernel/process.c   2005-08-15 10:48:51.0 -0700
>> @@ -421,17 +421,7 @@
>>
>>  void release_thread(struct task_struct *dead_task)
>>  {
>> -       if (dead_task->mm) {
>> -               // temporary debugging check
>> -               if (dead_task->mm->context.size) {
>> -                       printk("WARNING: dead process %8s still has LDT? <%p/%d>\n",
>> -                                       dead_task->comm,
>> -                                       dead_task->mm->context.ldt,
>> -                                       dead_task->mm->context.size);
>> -                       BUG();
>> -               }
>> -       }
>> -
>> +       BUG_ON(dead_task->mm);
>
> This BUG_ON() has different semantics than the old dead one.  Is there a
> point?  exit_mm() has already reset this to NULL, no?


Yes, completely.  This BUG() could be eliminated entirely, as trivial 
inspection shows.  I can't fathom a single reason why it should still 
exist, but the presence of it in the first place made me wonder if there 
may be some erudite reason for it.  Thus I raised the BUG to a higher 
power - obviously the LDT is gone if the MM is gone.
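
To make the difference in semantics concrete, here is a minimal userspace sketch. It is not the kernel code: the structures are cut-down stand-ins and the two helper names are hypothetical. It only models when each check would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct mm_context { void *ldt; int size; };
struct mm_struct { struct mm_context context; };
struct task_struct { struct mm_struct *mm; };

/* Old check: fires only if the mm is still attached AND still has an LDT. */
static bool old_check_would_bug(const struct task_struct *t)
{
	assert(t != NULL);
	return t->mm != NULL && t->mm->context.size != 0;
}

/* New check, modelling BUG_ON(dead_task->mm): fires whenever an mm is attached. */
static bool new_check_would_bug(const struct task_struct *t)
{
	assert(t != NULL);
	return t->mm != NULL;
}
```

If exit_mm() has always cleared the pointer by this point, neither check can fire; they differ only for a task that still has an mm but no LDT.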


Zach

Re: [PATCH 1/6] i386 virtualization - Fix uml build

2005-08-15 Thread Zachary Amsden


Chris Wright wrote:

> * [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
>
>> Attempt to fix the UML build by assuming the default i386 subarchitecture
>> (mach-default).
>>
>> I can't fully test this because spinlock breakage is still happening in
>> my tree, but it gets rid of the mach_xxx.h missing file warnings.
>
> I assume this is intended to fix a build error caused by patches in the
> earlier set which added more reliance on mach-default?

Yes, I already sent the fix to Jeff and Andrew, so it may already be
included in anything based off -mm1.  But it seems a good idea in
general for UML.  I got a 100% clean um-i386 build after this patch on
-rc5-mm1.


Zach

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka

Takahashi san,

I appreciate your comments.

> Hi,
> 
> BTW, what are you going to do with the page-faults which may happen
> during __copy_user_zeroing_nocache()? The current process may be blocked
> in the handler for a while and get FPU registers polluted.
> kernel_fpu_begin() won't help the case. This is another issue, though.

My code does nothing about it.

I need a volunteer to implement it.

Regards,
  Hiro

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

I am not subscribed to linux-kernel. Please cc me (and Doug) on all
replies. Sorry if I'm breaking people's threading, but I am cutting and
pasting this from web archives, since this wasn't cc-d to me
originally. 

>On Aug 15, 2005, at 19:38:49, Doug Warzecha wrote:
>> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
>>> Why can't you just implement the system management actions in the
>>> kernel driver?
>>
>> We want to minimize the amount of code in the kernel and avoid having to
>> update the driver each time a new system management command is added.
>
> One of the recent trends in kernel driver development is to make as much
> as possible accessible through standard tools (like with echo and cat
> via sysfs).

Where it makes sense. Everything can be taken too far, and I believe
that you are taking this past the point of sanity. Are you also going to
advocate that X stop mmap()-ing /dev/mem to manipulate the PCI memory
space of the video cards, and that we should instead have a kernel driver
that makes every register of each PCI card available as a file in sysfs so
that we can re-write X as a big bash script? Let me know how that works
out.

>
>> The libsmbios project is being updated to use this code.
>> http://linux.dell.com/libsmbios/main/.  Using the libsmbios code, you
>> will be able to set all of the options in BIOS F2 screen from Linux
>> userspace.  Also, libsmbios is looking at implementing a few other
>> things like fan status.  Libsmbios is 100% open-source (OSL/GPL dual
>> license).
>
> From my point of view, this driver could use sysfs almost entirely and put
> all of the hardware-manipulation code completely in kernel space, along
> with the hardware detection code.  You could have plain-text files in
> /sys/bus/platform/dellbios/ that have all of the BIOS F2 options accessible
> to the admin from the command line, without special tools.  (You could
> always add an extra program that presents a BIOS-like interface)

Conservatively counting, I see just about 350 different BIOS options in
my current list, plus about 60 different (unrelated) SMI calls. We are
talking about several tens of thousands of lines of code in the kernel
to surface each of these in the kernel along with all of the necessary
BIOS-bug-workaround and platform detection code. This is not pretty,
nor easy code. I, personally, do not want to be responsible for the
parsing bug that causes a root hole here. 

In userspace, I can easily stick all of the cross-references into an
XML file, along with the workarounds and bug-fixes, which makes the
code a bit tighter. We have one project here at Dell that implemented
an all C (userspace) equivalent of what you are talking about, and they
ended up writing a code generation script that took XML definitions of
each option and generated the resulting C code. They still ended up
with a huge bucketload of code. We don't have the same conveniences in
kernel-land. All the nice toys are userspace.

>> The method of generating a host control SMI is not exactly the same for
>> each PowerEdge system listed in dcdbas.txt.  host_control_smi_type tells
>> the driver how to generate the host control SMI for the system in use.
>> I'll update dcdbas.txt with the SMI type value associated with the
>> systems listed in that file.
>
> This is an _excellent_ reason why more of this should be in the kernel.
> What happens if the wrong SMI is used?  Shouldn't it be relatively easy
> for the kernel to determine the correct SMI itself?

No, this is an _EXCELLENT_ reason why _LESS_ of this should be in the
kernel. Why should we have to duplicate a _TON_ of code inside the
kernel to figure out which platform we are on, and then look up in a
table which method to use for that platform? We have a _MUCH_ nicer
programming environment available to us in userspace where we can use
things like libsmbios to look up the platform type, then look in an
easily-updateable text file which smi type to use. In general, plugging
the wrong value here is a no-op.

What you are advocating is that we bloat the kernel beyond belief just
so you can use echo and cat. I thought that we were trying to remove
extra stuff from the kernel. I thought this was the reasoning behind
initramfs and things like irqbalanced.

-- 
Michael


Re: [PATCH] removes pci_find_device from i6300esb.c

2005-08-15 Thread Greg KH

On Tue, Aug 16, 2005 at 02:24:57AM +0200, Jiri Slaby wrote:
> This patch changes pci_find_device to pci_get_device (encapsulated in
> for_each_pci_dev) in i6300esb watchdog card with appropriate adding 
> pci_dev_put.
> 
> Generated in 2.6.13-rc5-mm1 kernel version.
> 
> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
> 
> This is a repost; the patch was already posted on:
> 8 Aug 2005

I can't take this as the driver is only in the -mm tree, not mainline.
Andrew will have to pick it up (if it's even correct; I haven't verified
it either way...)

thanks,

greg k-h

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Kyle Moffett


On Aug 16, 2005, at 00:34:51, Chris Wedgwood wrote:

> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
>
>> Why can't you just implement the system management actions in the
>> kernel driver?
>
> Why put things in the kernel unless it's really needed?
>
> I'm not thrilled about the lack of userspace support for this driver
> but that still doesn't mean we need to shovel wads of crap into the
> kernel.

I'm worried that it might be more of a mess in userspace than it could be
if done properly in the kernel.  Hardware drivers, especially for something
as critical as the BIOS, should probably be done in-kernel.  Look at the
mess that X has become, it mmaps /dev/mem and pokes at the PCI busses
directly.  I just don't want an SMI driver to become another /dev/mem.

Cheers,
Kyle Moffett




Re: obsolete modparam change busted.

2005-08-15 Thread Rusty Russell

On Sat, 2005-08-13 at 14:27 -0400, Dave Jones wrote:
> We're now munching the end of the boot command line it seems.

Wow, if we had testcases in the kernel source, I would not have to keep
rewriting them (badly) every time I touched this code.

Throw away that stupid patch, apply this stupid patch.


Name: Ignore trailing whitespace on kernel parameters correctly: Fixed version
Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

Dave Jones says:

... if the modprobe.conf has trailing whitespace, modules fail to load
with the following helpful message..

snd_intel8x0: Unknown parameter `'

Previous version truncated last argument.

Index: linux-2.6.13-rc6-git7-Module/kernel/params.c
===
--- linux-2.6.13-rc6-git7-Module.orig/kernel/params.c   2005-08-10 16:12:45.0 +1000
+++ linux-2.6.13-rc6-git7-Module/kernel/params.c   2005-08-16 14:31:16.0 +1000
@@ -80,8 +80,6 @@
int in_quote = 0, quoted = 0;
char *next;
 
-   /* Chew any extra spaces */
-   while (*args == ' ') args++;
if (*args == '"') {
args++;
in_quote = 1;
@@ -121,6 +119,9 @@
next = args + i + 1;
} else
next = args + i;
+
+   /* Chew up trailing spaces. */
+   while (*next == ' ') next++;
return next;
 }
 
@@ -134,6 +135,9 @@
char *param, *val;
 
DEBUGP("Parsing ARGS: %s\n", args);
+   
+   /* Chew leading spaces */
+   while (*args == ' ') args++;
 
while (*args) {
int ret;
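
The behaviour the patch is aiming for can be sketched in userspace as follows. This is a minimal sketch under stated assumptions, not kernel/params.c: split_args_sketch is a hypothetical name, quoting is ignored, and the arguments are simply split on spaces. Leading spaces are chewed once before the loop and trailing spaces after each argument, so a trailing blank never turns into an empty parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace sketch, not the kernel parser.
 * Splits a space-separated string in place, stores up to max argument
 * pointers in out[] and returns how many were found.  Quotes are ignored.
 */
static size_t split_args_sketch(char *args, char **out, size_t max)
{
	size_t n = 0;

	assert(args != NULL && out != NULL);

	/* Chew leading spaces once, before the loop. */
	args += strspn(args, " ");

	while (*args && n < max) {
		char *end = args + strcspn(args, " ");

		out[n++] = args;
		if (*end)
			*end++ = '\0';	/* terminate this argument */

		/* Chew trailing spaces so they never form an empty argument. */
		end += strspn(end, " ");
		args = end;
	}
	return n;
}
```

Feeding it a line with both leading and trailing blanks should yield only the real arguments, with the last one left intact.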

-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


Re: [PATCH 2/6] i386 virtualization - Remove some dead debugging code

2005-08-15 Thread Chris Wright

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> This code is quite dead.  Release_thread is always guaranteed that the mm has
> already been released, thus dead_task->mm will always be NULL.
> 
> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
> Index: linux-2.6.13/arch/i386/kernel/process.c
> ===
> --- linux-2.6.13.orig/arch/i386/kernel/process.c  2005-08-15 10:46:18.0 -0700
> +++ linux-2.6.13/arch/i386/kernel/process.c   2005-08-15 10:48:51.0 -0700
> @@ -421,17 +421,7 @@
>  
>  void release_thread(struct task_struct *dead_task)
>  {
> - if (dead_task->mm) {
> - // temporary debugging check
> - if (dead_task->mm->context.size) {
> - printk("WARNING: dead process %8s still has LDT? <%p/%d>\n",
> - dead_task->comm,
> - dead_task->mm->context.ldt,
> - dead_task->mm->context.size);
> - BUG();
> - }
> - }
> -
> + BUG_ON(dead_task->mm);

This BUG_ON() has different semantics than the old dead one.  Is there a
point?  exit_mm() has already reset this to NULL, no?

>   release_vm86_irqs(dead_task);
>  }

Re: [PATCH 1/6] i386 virtualization - Fix uml build

2005-08-15 Thread Chris Wright

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> Attempt to fix the UML build by assuming the default i386 subarchitecture
> (mach-default).
> 
> I can't fully test this because spinlock breakage is still happening in
> my tree, but it gets rid of the mach_xxx.h missing file warnings.

I assume this is intended to fix a build error caused by patches in the
earlier set which added more reliance on mach-default?

thanks,
-chris

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Chris Wedgwood

On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:

> Why can't you just implement the system management actions in the
> kernel driver?

Why put things in the kernel unless it's really needed?

I'm not thrilled about the lack of userspace support for this driver
but that still doesn't mean we need to shovel wads of crap into the
kernel.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hirokazu Takahashi

Hi,

BTW, what are you going to do with the page-faults which may happen
during __copy_user_zeroing_nocache()? The current process may be blocked
in the handler for a while and get FPU registers polluted.
kernel_fpu_begin() won't help the case. This is another issue, though.

> > Thanks.
> > 
> > filemap_copy_from_user() calls __copy_from_user_inatomic() calls
> > __copy_from_user_ll().
> > 
> > I'll look at the code.
> 
> The following is a quick hack of a cache-aware implementation
> of __copy_from_user_ll() and __copy_from_user_inatomic():
> 
> __copy_from_user_ll_nocache() and __copy_from_user_inatomic_nocache()
> 
> filemap_copy_from_user() calls __copy_from_user_inatomic_nocache()
> instead of __copy_from_user_inatomic(), which reduces cache misses.
> 
> The first column is the cache references (memory accesses) and the
> third column is the 3rd level cache misses.
> 
> The following example shows the L3 cache misses reduced from 37410 to 107.
> 
> 2.6.12.4 nocache version
> Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) 
> with a unit mask of 0x3f (multiple flags) count 3000
> Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) 
> with a unit mask of 0x200 (read 3rd level cache miss) count 3000
> samples  %        samples  %        app name  symbol name
> 120442   6.4106   107      0.5620   vmlinux   __copy_user_zeroing_nocache
> 80049    4.2606   578      3.0357   vmlinux   journal_add_journal_head
> 69194    3.6829   154      0.8088   vmlinux   journal_dirty_metadata
> 67059    3.5692   78       0.4097   vmlinux   __find_get_block
> 64145    3.4141   32       0.1681   vmlinux   journal_put_journal_head
> pattern9-0-cpu4-0-08161154/summary.out
> 
> The 2.6.12.4 original version is
> Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) 
> with a unit mask of 0x3f (multiple flags) count 3000
> Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) 
> with a unit mask of 0x200 (read 3rd level cache miss) count 3000
> samples  %        samples  %        app name  symbol name
> 120646   7.4680   37410    62.3355  vmlinux   __copy_from_user_ll
> 79508    4.9215   903      1.5046   vmlinux   _spin_lock
> 65526    4.0561   873      1.4547   vmlinux   journal_add_journal_head
> 59296    3.6704   129      0.2149   vmlinux   __find_get_block
> 58647    3.6302   215      0.3582   vmlinux   journal_dirty_metadata
> 
> What do you think?
> 
> Hiro
> 
> diff -ur linux-2.6.12.4.orig/Makefile linux-2.6.12.4.nocache/Makefile
> --- linux-2.6.12.4.orig/Makefile  2005-08-12 14:37:59.0 +0900
> +++ linux-2.6.12.4.nocache/Makefile   2005-08-16 10:22:31.0 +0900
> @@ -1,7 +1,7 @@
>  VERSION = 2
>  PATCHLEVEL = 6
>  SUBLEVEL = 12
> -EXTRAVERSION = .4.orig
> +EXTRAVERSION = .4.nocache
>  NAME=Woozy Numbat
>  
>  # *DOCUMENTATION*
> diff -ur linux-2.6.12.4.orig/arch/i386/lib/usercopy.c linux-2.6.12.4.nocache/arch/i386/lib/usercopy.c
> --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c  2005-08-05 16:04:37.0 +0900
> +++ linux-2.6.12.4.nocache/arch/i386/lib/usercopy.c   2005-08-16 10:49:59.0 +0900
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -511,6 +512,110 @@
>   : "memory");\
>  } while (0)
>  
> +/* Non Temporal Hint version of mmx_memcpy */
> +/* It is cache aware   */
> +/* [EMAIL PROTECTED]   */
> +static unsigned long 
> +__copy_user_zeroing_nocache(void *to, const void *from, size_t len)
> +{
> +/* Note! gcc doesn't seem to align stack variables properly, so we
> + * need to make use of unaligned loads and stores.
> + */
> + void *p;
> + int i;
> +
> + if (unlikely(in_interrupt())){
> + __copy_user_zeroing(to, from, len);
> + return len;
> + }
> +
> + p = to;
> + i = len >> 6; /* len/64 */
> +
> +kernel_fpu_begin();
> +
> + __asm__ __volatile__ (
> + "1: prefetchnta (%0)\n" /* This set is 28 bytes */
> + "   prefetchnta 64(%0)\n"
> + "   prefetchnta 128(%0)\n"
> + "   prefetchnta 192(%0)\n"
> + "   prefetchnta 256(%0)\n"
> + "2:  \n"
> + ".section .fixup, \"ax\"\n"
> + "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
> + "   jmp 2b\n"
> + ".previous\n"
> + ".section __ex_table,\"a\"\n"
> + "   .align 4\n"
> + "   .long 1b, 3b\n"
> + ".previous"
> + : : "r" (from) );
> + 
> + for(; i>5; i--)
> + {
> + __asm__ __volatile__ (
> + "1:  prefetchnta 320(%0)\n"
> + "2:  movq (%0), %%mm0\n"
> + "  movq 8(%0), %%mm1\n"
> + "  movq 16(%0), %%mm2\n"
> + "  movq 24(%0),

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

>On Mon, 15 Aug 2005 18:38:49 CDT, Doug Warzecha said:
>
>> > If this is supposed to be used with the RBU code to trigger a BIOS  
>> > update, ...
>> 
>> This driver is not needed by the RBU code.
>
> Documentation/dell_rbu.txt says:
>
>> The rbu driver needs to have an application which will inform the BIOS to
>> enable the update in the next system reboot.
>
> Can the dcdbas code be used to implement that application?

No, dcdbas has nothing to do with this. I'll have to submit a patch
against the docs. The program you need to use already exists and is
open source. You can use libsmbios to do this.
http://linux.dell.com/libsmbios/main.

The binary you want to use is "activateCmosToken", under bins/output/
(after compilation). The command line syntax is like this:
activateCmosToken 0x005C

If you want to cancel a BIOS update that has already been activated
(per above), use:   
activateCmosToken 0x005D

Basically, follow the docs in the RBU docs as far as cat-ing the bios
update image to the rbu sysfs files, then use the activateCmosToken
program to tell BIOS to do the update on reboot. 

-- 
Michael





Re: [patch] Real-Time Preemption, -RT-2.6.13-rc6-V0.7.53-11

2005-08-15 Thread Ingo Molnar

* Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> --- linux-2.6.13-rc6-git7-RT-V0.7.53-11/drivers/usb/core/hcd.c~ 2005-08-15 21:23:45.0 +0200
> +++ linux-2.6.13-rc6-git7-RT-V0.7.53-11/drivers/usb/core/hcd.c  2005-08-15 22:03:33.0 +0200
> @@ -506,13 +506,11 @@ error:
> }
> 
> /* any errors get returned through the urb completion */
> -   local_irq_save (flags);
> +   local_irq_save_nort (flags);
> spin_lock (&urb->lock);
> if (urb->status == -EINPROGRESS)
> urb->status = status;
> spin_unlock (&urb->lock);
> usb_hcd_giveback_urb (hcd, urb, NULL);
> -   local_irq_restore (flags);
> +   local_irq_restore_nort (flags);
> return 0;
>  }

i'm wondering whether we could/should also fix this upstream - and 
whether this [making the IRQ flags disabling a NOP on -RT] is the right 
fix. Why does the USB hcd.c code do this in the first place? Disabling 
interrupts during usb_hcd_giveback_urb() [but not holding the urb->lock] 
might serialize on UP, but it has no serialization effect on SMP and is 
hence potentially buggy. Is there something i'm missing about this code?

the normal way of using urb->lock would be spin_lock_irqsave() and 
spin_unlock_irqrestore(), not the 'detached' method seen above.
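
to illustrate the difference between the two patterns, here is a small userspace sketch. It is only a model: the stub_* functions are hypothetical stand-ins that record state, not the kernel primitives, and it says nothing about whether the giveback is actually safe with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stubs: they only record state, they are NOT kernel APIs. */
static bool irqs_off, lock_held;
static bool giveback_saw_irqs_off, giveback_saw_lock_held;

static void stub_irq_save(void)    { irqs_off = true; }
static void stub_irq_restore(void) { irqs_off = false; }
static void stub_lock(void)        { lock_held = true; }
static void stub_unlock(void)      { lock_held = false; }

/* Stands in for usb_hcd_giveback_urb(): remembers the state it ran in. */
static void stub_giveback(void)
{
	giveback_saw_irqs_off = irqs_off;
	giveback_saw_lock_held = lock_held;
}

/* "Detached" pattern, as in the quoted hunk. */
static void detached_pattern(void)
{
	assert(!irqs_off && !lock_held);
	stub_irq_save();
	stub_lock();
	/* ... update urb->status under the lock ... */
	stub_unlock();
	stub_giveback();	/* irqs still off, lock already dropped */
	stub_irq_restore();
}

/* Paired pattern, modelling spin_lock_irqsave()/spin_unlock_irqrestore(). */
static void paired_pattern(void)
{
	assert(!irqs_off && !lock_held);
	stub_irq_save();
	stub_lock();
	/* ... update urb->status under the lock ... */
	stub_unlock();
	stub_irq_restore();
	stub_giveback();	/* neither irqs off nor lock held */
}
```

in the detached pattern the giveback runs with interrupts off but without the lock, which is the case that gives no serialization against another CPU.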

> similar fix, completions need not have irqs disabled on PREEMPT_RT 
> right?

correct, PREEMPT_RT is very strict about the use of the interrupt flags.  
A fair portion of the now-illegal API uses are also SMP bugs on 
upstream, so these details are worth pursuing.

> --- linux-2.6.13-rc6-git7-RT-V0.7.53-11/drivers/usb/core/hcd.c~ 2005-08-15 22:03:33.0 +0200
> +++ linux-2.6.13-rc6-git7-RT-V0.7.53-11/drivers/usb/core/hcd.c  2005-08-15 22:32:54.0 +0200
> @@ -538,7 +538,7 @@ void usb_hcd_poll_rh_status(struct usb_h
> if (length > 0) {
> 
> /* try to complete the status urb */
> -   local_irq_save (flags);
> +   local_irq_save_nort (flags);
> spin_lock(&hcd_root_hub_lock);
> urb = hcd->status_urb;
> if (urb) {
> @@ -562,7 +562,7 @@ void usb_hcd_poll_rh_status(struct usb_h
> usb_hcd_giveback_urb (hcd, urb, NULL);
> else
> hcd->poll_pending = 1;
> -   local_irq_restore (flags);
> +   local_irq_restore_nort (flags);

same question: why are interrupts being kept disabled longer, and why is 
usb_hcd_giveback_urb() called with interrupts disabled? (I tried to use 
spin_lock_irqsave/irqrestore() in earlier -RT versions, but people 
reported hangs and USB misbehavior, which might be related. I'm worried 
that your _nort patch could cause similar misbehavior.)

how about (naively) extending the urb->lock to cover 
usb_hcd_giveback_urb() calls too - does that cause a deadlock or is it 
unsafe in some other way?

Ingo

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka

From: Hiro Yoshioka <[EMAIL PROTECTED]>
Date: Tue, 16 Aug 2005 08:33:59 +0900

> Thanks.
> 
> filemap_copy_from_user() calls __copy_from_user_inatomic() calls
> __copy_from_user_ll().
> 
> I'll look at the code.

The following is a quick hack of a cache-aware implementation
of __copy_from_user_ll() and __copy_from_user_inatomic():

__copy_from_user_ll_nocache() and __copy_from_user_inatomic_nocache()

filemap_copy_from_user() calls __copy_from_user_inatomic_nocache()
instead of __copy_from_user_inatomic(), which reduces cache misses.

The first column is the cache references (memory accesses) and the
third column is the 3rd level cache misses.

The following example shows the L3 cache misses reduced from 37410 to 107.

2.6.12.4 nocache version
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with 
a unit mask of 0x3f (multiple flags) count 3000
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with 
a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        samples  %        app name  symbol name
120442   6.4106   107      0.5620   vmlinux   __copy_user_zeroing_nocache
80049    4.2606   578      3.0357   vmlinux   journal_add_journal_head
69194    3.6829   154      0.8088   vmlinux   journal_dirty_metadata
67059    3.5692   78       0.4097   vmlinux   __find_get_block
64145    3.4141   32       0.1681   vmlinux   journal_put_journal_head
pattern9-0-cpu4-0-08161154/summary.out

The 2.6.12.4 original version is
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with 
a unit mask of 0x3f (multiple flags) count 3000
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with 
a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        samples  %        app name  symbol name
120646   7.4680   37410    62.3355  vmlinux   __copy_from_user_ll
79508    4.9215   903      1.5046   vmlinux   _spin_lock
65526    4.0561   873      1.4547   vmlinux   journal_add_journal_head
59296    3.6704   129      0.2149   vmlinux   __find_get_block
58647    3.6302   215      0.3582   vmlinux   journal_dirty_metadata

What do you think?

Hiro

diff -ur linux-2.6.12.4.orig/Makefile linux-2.6.12.4.nocache/Makefile
--- linux-2.6.12.4.orig/Makefile2005-08-12 14:37:59.0 +0900
+++ linux-2.6.12.4.nocache/Makefile 2005-08-16 10:22:31.0 +0900
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 6
 SUBLEVEL = 12
-EXTRAVERSION = .4.orig
+EXTRAVERSION = .4.nocache
 NAME=Woozy Numbat
 
 # *DOCUMENTATION*
diff -ur linux-2.6.12.4.orig/arch/i386/lib/usercopy.c linux-2.6.12.4.nocache/arch/i386/lib/usercopy.c
--- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c   2005-08-05 16:04:37.0 +0900
+++ linux-2.6.12.4.nocache/arch/i386/lib/usercopy.c   2005-08-16 10:49:59.0 +0900
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -511,6 +512,110 @@
: "memory");\
 } while (0)
 
+/* Non Temporal Hint version of mmx_memcpy */
+/* It is cache aware   */
+/* [EMAIL PROTECTED]   */
+static unsigned long 
+__copy_user_zeroing_nocache(void *to, const void *from, size_t len)
+{
+/* Note! gcc doesn't seem to align stack variables properly, so we
+ * need to make use of unaligned loads and stores.
+ */
+   void *p;
+   int i;
+
+   if (unlikely(in_interrupt())){
+   __copy_user_zeroing(to, from, len);
+   return len;
+   }
+
+   p = to;
+   i = len >> 6; /* len/64 */
+
+kernel_fpu_begin();
+
+   __asm__ __volatile__ (
+   "1: prefetchnta (%0)\n" /* This set is 28 bytes */
+   "   prefetchnta 64(%0)\n"
+   "   prefetchnta 128(%0)\n"
+   "   prefetchnta 192(%0)\n"
+   "   prefetchnta 256(%0)\n"
+   "2:  \n"
+   ".section .fixup, \"ax\"\n"
+   "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
+   "   jmp 2b\n"
+   ".previous\n"
+   ".section __ex_table,\"a\"\n"
+   "   .align 4\n"
+   "   .long 1b, 3b\n"
+   ".previous"
+   : : "r" (from) );
+   
+   for(; i>5; i--)
+   {
+   __asm__ __volatile__ (
+   "1:  prefetchnta 320(%0)\n"
+   "2:  movq (%0), %%mm0\n"
+   "  movq 8(%0), %%mm1\n"
+   "  movq 16(%0), %%mm2\n"
+   "  movq 24(%0), %%mm3\n"
+   "  movntq %%mm0, (%1)\n"
+   "  movntq %%mm1, 8(%1)\n"
+   "  movntq %%mm2, 16(%1)\n"
+   "  movntq %%mm3, 24(%1)\n"
+   "  movq 32(%0), %%mm0\n"
+   "  movq 40(%0), %%mm1\n"
+   "  movq 48(%0), %%mm2\n"
+   "  movq 56(%0), %%mm3\n"
+   "  movntq %%mm0,
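
The patch is cut off at this point in the archive. As a rough sketch of the block arithmetic that is visible above (i = len >> 6, then a loop that runs while i > 5 and prefetches 320 bytes, i.e. five 64-byte blocks, ahead), the following userspace helper may help. The struct and function names are hypothetical, and how the remaining blocks and tail bytes are handled is not shown in the truncated text, so the sketch does not model them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of how a length is divided by the visible code. */
struct nocache_split {
	size_t blocks;		/* number of whole 64-byte blocks: len >> 6 */
	size_t prefetch_iters;	/* iterations of the "for (; i > 5; i--)" loop */
};

static struct nocache_split split_len_sketch(size_t len)
{
	struct nocache_split s;

	s.blocks = len >> 6;	/* len / 64 */
	/* The loop stops five blocks early because it prefetches five ahead. */
	s.prefetch_iters = s.blocks > 5 ? s.blocks - 5 : 0;
	assert(s.prefetch_iters <= s.blocks);
	return s;
}
```

For a 4096-byte copy this gives 64 blocks, 59 of which go through the prefetching loop; copies of 320 bytes or less never enter it.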

Re: [PATCH 4/5] add i2c_probe_device and i2c_remove_device

2005-08-15 Thread Nathan Lutchansky

On Tue, Aug 16, 2005 at 12:14:13AM +0200, Jean Delvare wrote:
> > These functions can be used for special-purpose adapters, such as
> > those on TV tuner cards, where we generally know in advance what
> > devices are attached.  This is important in cases where the adapter
> > does not support probing or when probing is potentially dangerous to
> > the connected devices.
> 
> Do you know of any adapter actually not supporting the SMBus Quick
> command (which we use for probing)?

I do, in fact, which is the reason I submitted these patches in the
first place.  :-)

The WIS GO7007, which is the MPEG1/2/4 video encoder used in the Plextor
ConvertX, has an on-board i2c interface that supports nothing but 8-bit
and 16-bit register reads and writes.  Worse, it does not correctly
report i2c errors.  Even if it did support probing, though, I wouldn't
want to use it because the i2c adapter generally lives on the other end
of a USB and requires a minimum of 15 USB round trips per i2c
transaction, so probing 124 i2c addresses would take a while.
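
As a rough illustration of the cost described above, the sketch below multiplies the number of probed addresses by the round trips per transaction. It assumes exactly one i2c transaction per probed address, which is an assumption for this example and not something stated in the thread; the function name is made up.

```c
#include <assert.h>

/*
 * Lower bound on the USB round trips needed to probe a set of i2c
 * addresses, assuming one i2c transaction per probed address (an
 * assumption of this sketch; a real probe may need more).
 */
static unsigned long probe_round_trips(unsigned int addresses,
				       unsigned int trips_per_transaction)
{
	assert(trips_per_transaction > 0);
	return (unsigned long)addresses * trips_per_transaction;
}
```

With the figures quoted above, probe_round_trips(124, 15) gives 1860 round trips as a minimum.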

> > +   if (kind < 0 && !i2c_check_functionality(adapter,I2C_FUNC_SMBUS_QUICK))
> > +   return -EINVAL;
> 
> Coding style please: one space after the comma. 
> 
> > +
> > +   down(_lists);
> > +   list_for_each(item,) {
> 
> Ditto.

Yeah, I copied those lines from other parts of i2c-core.c.  But I'll fix
them.

> > +   driver = list_entry(item, struct i2c_driver, list);
> > +   if (driver->id == driver_id)
> > +   break;
> > +   }
> > +   up(_lists);
> > +   if (!item)
> > +   return -ENOENT;
> > +
> > +   /* Already in use? */
> > +   if (i2c_check_addr(adapter, addr))
> > +   return -EBUSY;
> > +
> > +   /* Make sure there is something at this address, unless forced */
> > +   if (kind < 0) {
> > +   if (i2c_smbus_xfer(adapter, addr, 0, 0, 0,
> > +  I2C_SMBUS_QUICK, NULL) < 0)
> > +   return -ENODEV;
> > +
> > +   /* prevent 24RF08 corruption */
> > +   if ((addr & ~0x0f) == 0x50)
> > +   i2c_smbus_xfer(adapter, addr, 0, 0, 0,
> > +  I2C_SMBUS_QUICK, NULL);
> > +   }
> > +
> > +   return driver->detect_client(adapter, addr, kind);
> > +}
> 
> You are duplicating a part of i2c_probe_address() here. Why don't you
> simply call it?

I could, I guess.  Is there a reason i2c_probe_address is limited to
7-bit addresses?  Clients with 10-bit addresses (I've never seen one,
but I guess they exist?) should be able to be instantiated through the
known-device list.

I think the way to go about this would be to rework and resubmit the
first 3 patches once we decide how to handle the no-probe adapter flag,
and later I will send a separate patch set for review with the
known-device list implementation.

Thank you for all your help!  -Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread Zachary Amsden


Brian Gerst wrote:

> If you really want to test the math emu code, you can hack check_x87
> in head.S to always leave the fpu disabled.  Then you can test it on
> any cpu, not just a 386.


That is a good idea, and while a valid point, it actually still requires 
writing the code to actually test the FPU, specifically, using weird 
prefixes, LDT based segments, and other oddities that don't get 
generated from "normal" C code.  I'm pretty sure the existing code works 
for the 99% cases or else it wouldn't have gotten there in the first 
place.  But testing the corner cases here is even nastier than testing 
the LDT corner cases - you would basically need to write a lot of hand 
coded i387 assembler.  Perhaps such a test might exist, but in all 
honesty, without a comprehensive test, it is simply far too easy to 
introduce a bug here, and far too likely it will either not be noticed 
until it has caused someone a possibly undetected numerical error, or 
I'm just wasting my time because no one is using this code anyway.
Fortunately, the Hubble telescope has been upgraded to a 486 ;)


Zach

Re: [RFC][2.6.12.3] IRQ compression/sharing patch

2005-08-15 Thread James Cleverdon

On Monday 15 August 2005 10:44 am, Andi Kleen wrote:
> On Sun, Aug 14, 2005 at 07:57:53PM -0700, James Cleverdon wrote:
> > On Thursday 04 August 2005 02:22 am, Andi Kleen wrote:
> > > On Thu, Aug 04, 2005 at 12:05:50AM -0700, James Cleverdon wrote:
> > > > diff -pruN 2.6.12.3/arch/i386/kernel/acpi/boot.c n12.3/arch/i386/kernel/acpi/boot.c
> > > > --- 2.6.12.3/arch/i386/kernel/acpi/boot.c   2005-07-15 14:18:57.0 -0700
> > > > +++ n12.3/arch/i386/kernel/acpi/boot.c   2005-08-04 00:01:10.199710211 -0700
> > > > @@ -42,6 +42,7 @@
> > > >  static inline void  acpi_madt_oem_check(char *oem_id, char *oem_table_id) { }
> > > >  extern void __init clustered_apic_check(void);
> > > >  static inline int ioapic_setup_disabled(void) { return 0; }
> > > > +extern int gsi_irq_sharing(int gsi);
> > > >  #include 
> > > >
> > > >  #else  /* X86 */
> > > > @@ -51,6 +52,9 @@ static inline int ioapic_setup_disabled(
> > > >  #include 
> > > >  #endif /* CONFIG_X86_LOCAL_APIC */
> > > >
> > > > +static inline int gsi_irq_sharing(int gsi) { return gsi; }
> > >
> > > Why is this different for i386/x86-64? It shouldn't.
> > 
> > True.  Have added code for i386.  Unfortunately, I didn't see one file 
> > that is shared by both architectures and which is included when 
> > building with I/O APIC support.  So, I duplicated the function into 
> > io_apic.c
> 
> That needs to be cleaned up before merge. This code is already ugly and I
> don't want the cruft accumulating here.

OK, I moved the function into a separate file that can be used by
both architectures.

> > > As an unrelated note we really need to get rid of this whole ifdef
> > > block.
> > >
> > > > +++ n12.3/arch/x86_64/Kconfig   2005-08-03 21:31:07.487451167 -0700
> > > > @@ -280,13 +280,13 @@ config HAVE_DEC_LOCK
> > > > default y
> > > >
> > > >  config NR_CPUS
> > > > -   int "Maximum number of CPUs (2-256)"
> > > > -   range 2 256
> > > > +   int "Maximum number of CPUs (2-255)"
> > > > +   range 2 255
> > > > depends on SMP
> > > > -   default "8"
> > > > +   default "16"
> > >
> > > Don't change the default please.
> > >
> > > > +static int next_irq = 16;
> > >
> > > Won't this need a lock for hotplug later?
> > 
> > That's what I thought originally, but maybe not.  We initialize all RTEs 
> > and assign IRQs+vectors fairly early in boot, plus store the results in 
> > arrays.  Thereafter the functions just return the preallocated values.
> 
> I was thinking of IO-APIC hotplug here. IIRC the ia64 folks
> have it already and I'm sure someone will turn up with a patch
> for i386/x86-64 soon. For devices it should be ok, you're right.
> 
> Ok I guess they can change it in that patch then. Perhaps
> just add a comment.

I've already got a spin lock there, so may as well keep it.

> > > > have a different trigger mode
> > > > +  * than PCI.
> > > > +*/
> > >
> > > Can we perhaps force such sharing early temporarily even when the
> > > table is not filled up?  This way we would get better test coverage
> > > of all of  this.
> > >
> > > That would be later disabled of course.
> > 
> > Suppose I added a static counter and pretended that every third 
> > non-legacy IRQ needed to be shared?
> 
> Can you drop into the sharing path unconditionally?
> 
> -Andi

If no vectors/IRQs are ever allocated, there is nothing to share.
Added some simple minded forced sharing to gsi_irq_sharing.  It
forces 1 in 3 IRQs to be shared.  That should exercise some of the
code paths.
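
The attached patch is cut off in this archive, so the following is only a hypothetical userspace sketch of the "force 1 in 3 IRQs to be shared" idea, not the code from gsi2irq.c. The table size, the helper name and the choice to share with the most recently allocated IRQ are assumptions; the starting value of 16 follows the "next_irq = 16" line quoted earlier, and the spin lock mentioned above is left out.

```c
#include <assert.h>

#define NR_LEGACY_IRQS	16	/* GSIs below this are returned unchanged */
#define SKETCH_MAX_GSI	64	/* hypothetical table size for this sketch */

static int gsi_to_irq[SKETCH_MAX_GSI];	/* 0 means "not assigned yet" */
static int next_irq = NR_LEGACY_IRQS;
static int last_irq = -1;		/* most recently allocated IRQ */
static unsigned int nonlegacy_seen;	/* non-legacy GSIs assigned so far */

static int gsi_irq_sharing_sketch(int gsi)
{
	assert(gsi >= 0 && gsi < SKETCH_MAX_GSI);

	if (gsi < NR_LEGACY_IRQS)
		return gsi;			/* legacy IRQs keep their number */
	if (gsi_to_irq[gsi])
		return gsi_to_irq[gsi];		/* return the preallocated value */

	nonlegacy_seen++;
	if (nonlegacy_seen % 3 == 0 && last_irq >= 0)
		gsi_to_irq[gsi] = last_irq;	/* every third one is forced to share */
	else
		gsi_to_irq[gsi] = last_irq = next_irq++;
	return gsi_to_irq[gsi];
}
```

Calling it for GSIs 16, 17, 18 and 19 in that order yields IRQs 16, 17, 17 and 18 in this sketch, so the third non-legacy GSI ends up sharing.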

Patch attached.  (Sorry.)

-- 
James Cleverdon
IBM LTC (xSeries Linux Solutions)
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot comm
diff -pruN 2.6.12.3/arch/i386/kernel/Makefile z12.3/arch/i386/kernel/Makefile
--- 2.6.12.3/arch/i386/kernel/Makefile	2005-07-15 14:18:57.0 -0700
+++ z12.3/arch/i386/kernel/Makefile	2005-08-15 15:57:45.0 -0700
@@ -7,7 +7,7 @@ extra-y := head.o init_task.o vmlinux.ld
 obj-y	:= process.o semaphore.o signal.o entry.o traps.o irq.o vm86.o \
 		ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
 		pci-dma.o i386_ksyms.o i387.o dmi_scan.o bootflag.o \
-		doublefault.o quirks.o
+		doublefault.o quirks.o gsi2irq.o
 
 obj-y+= cpu/
 obj-y+= timers/
diff -pruN 2.6.12.3/arch/i386/kernel/acpi/boot.c z12.3/arch/i386/kernel/acpi/boot.c
--- 2.6.12.3/arch/i386/kernel/acpi/boot.c	2005-07-15 14:18:57.0 -0700
+++ z12.3/arch/i386/kernel/acpi/boot.c	2005-08-14 15:40:36.0 -0700
@@ -453,7 +453,7 @@ int acpi_gsi_to_irq(u32 gsi, unsigned in
  		*irq = IO_APIC_VECTOR(gsi);
 	else
 #endif
-		*irq = gsi;
+		*irq = gsi_irq_sharing(gsi);
 	return 0;
 }
 
diff -pruN 2.6.12.3/arch/i386/kernel/gsi2irq.c z12.3/arch/i386/kernel/gsi2irq.c
--- 2.6.12.3/arch/i386/kernel/gsi2irq.c	1969-12-31 16:00:00.0 -0800
+++ z12.3/arch/i386/kernel/gsi2irq.c	2005-08-15 18:18:24.0 -0700
@@ -0,0 +1,134 @@
+/*
+ * Copyright 2005 James Cleverdon, IBM.
+ * Subject to the GNU Public License, v.2
+ *
+ *

Re: [PATCH 1/5] call i2c_probe from i2c core

2005-08-15 Thread Nathan Lutchansky

On Mon, Aug 15, 2005 at 11:55:31PM +0200, Jean Delvare wrote:
> You should probably be using the --no-index option of quilt 0.42 (if you
> are using quilt as I presumed), as I heard Linus doesn't like these
> index lines in the patches he receives.

Ooh, thanks!

> > if (driver->flags & I2C_DF_NOTIFY) {
> > list_for_each(item,) {
> > adapter = list_entry(item, struct i2c_adapter, list);
> > -   driver->attach_adapter(adapter);
> > +   if (driver->attach_adapter)
> > +   driver->attach_adapter(adapter);
> > +   if (driver->detect_client && driver->address_data &&
> > +   ((driver->class & adapter->class) ||
> > +   driver->class == 0))
> > +   i2c_probe(adapter, driver->address_data,
> > +   driver->detect_client);
> > }
> > }
> 
> Couldn't we check for the return value of driver->attach_adapter()? That
> way this function could conditionally prevent i2c_probe() from being
> run. This is just a random proposal, I don't know if some drivers would
> have an interest in doing that.

Yeah, I was thinking about that too, but I can't think of a reasonable
return code to use.  -1 for "don't probe"?  Or <0 for "fatal error,
don't touch this bus any more"?  Anyway, client drivers will probably
only use one of the two detection methods because if they need to
implement attach_adapter they can just call i2c_probe from there.

> Also, maybe we can put this new code in a separate function to be called
> from both i2c_add_adapter and i2c_add_driver, so as to not duplicate
> code? Or would this be too much overhead? It could be made inline then.
> Again, just a random thought.

Sure, I can do that.
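
One possible shape for such a shared helper is sketched below as a userspace mock. The struct layouts, the mock_* names and the call counters are all made up for illustration; only the condition guarding the probe is taken from the quoted patch. Whether the return value of attach_adapter should gate the probe is left open here, as it is in the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Mock adapter: class bits plus counters so the calls can be observed. */
struct mock_adapter {
	unsigned int class_mask;
	int attach_calls;
	int probe_calls;
};

/* Mock driver carrying the fields the quoted patch looks at. */
struct mock_driver {
	unsigned int class_mask;
	const unsigned short *address_data;
	int (*attach_adapter)(struct mock_adapter *adap);
	int (*detect_client)(struct mock_adapter *adap, int addr, int kind);
};

static int mock_attach(struct mock_adapter *adap)
{
	adap->attach_calls++;
	return 0;
}

static int mock_detect(struct mock_adapter *adap, int addr, int kind)
{
	(void)adap;
	(void)addr;
	(void)kind;
	return 0;
}

/* Stand-in for i2c_probe(); it only records that probing was requested. */
static void mock_probe(struct mock_adapter *adap)
{
	adap->probe_calls++;
}

/* Hypothetical helper that both registration paths could call. */
static void mock_attach_and_probe(const struct mock_driver *drv,
				  struct mock_adapter *adap)
{
	assert(drv != NULL && adap != NULL);

	if (drv->attach_adapter)
		drv->attach_adapter(adap);
	if (drv->detect_client && drv->address_data &&
	    ((drv->class_mask & adap->class_mask) || drv->class_mask == 0))
		mock_probe(adap);
}
```

Keeping the condition in one place means a later change to the class rule only has to be made once.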

> > --- linux-2.6.13-rc6+gregkh.orig/Documentation/i2c/writing-clients
> > +++ linux-2.6.13-rc6+gregkh/Documentation/i2c/writing-clients
> 
> Thanks for the documentation update. However, you didn't update
> i2c/porting-clients accordingly. Could you please?

I didn't think anybody was porting client drivers any more, but if
you're still updating that doc, I can too.  :-)  -Nathan

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown

On Mon, 2005-08-15 at 21:29 -0400, Kyle Moffett wrote:
> On Aug 15, 2005, at 18:58:56, [EMAIL PROTECTED] wrote:
> >> Why can't you just implement the system management actions in
> >> the kernel driver?  This is tantamount to a binary SMI hook to
> >> userspace.  What functionality does this provide on a dell
> >> system from an administrator's point of view?
> 
> > The second alternative is not entirely feasible. We have over 60
> > SMI functions, and we would have to write a kernel-mode wrapper for
> > each and every one. I hope you agree that code that doesn't exist is
> > less buggy than code that is, and that code that is in userspace is a
> > whole lot less likely to cause a kernel crash than code that is in
> > the kernel.
> 
> I think the second alternative is actually feasible and preferable. The
> point of the kernel is to provide safe and secure access to two things:
>1) Hardware through an abstraction layer
>2) Software services (like IP stack) that are not feasible to do in
>   userspace.
> 
> A system that just provides a hunk of DMA RAM and the ability to generate

We are only using the DMA allocation API, we are not actually doing DMA
to those addresses. We use the DMA API to easily restrict allocations to
4GB, as has been previously requested, instead of rolling our own
allocation functions.

> interrupts is definitely not 2, and does not really follow the ideal
> behind 1 either.  I gave the firmware example earlier.  There are several
> devices that provide access to update firmware by reading and writing a
> firmware file directly in sysfs, then updating it on reboot if necessary.

But... the firmware loading example is bogus. We already have the Dell
RBU driver for system BIOS updates, and it has been accepted into the
kernel. This driver (dcdbas) has absolutely nothing to do with firmware
loading. I'm confused as to why you have brought up this example again
after Doug just finished saying that dcdbas has nothing to do with
firmware updates.

So, in a sense, we are _ALREADY_ following your advice, having already
split out the firmware driver into its own driver that sits under the
firmware/ class.

Sorry, but I think you misunderstand the whole point behind this
driver. This _is_ an abstraction.

For instance, if you have 16 journaling file systems in your kernel, it
would make a lot of sense to pull out the common journaling code and
create a separate journaling subsystem in the kernel, much like jbd. It
would then make sense to make people justify adding new journaling
methods to the kernel for a new file system, since there already exists
one journaling abstraction.

But, it only should go so far. Just because it makes sense to
standardize on one journaling layer in the kernel, doesn't mean that it
also makes sense to pull in all of mysql into the kernel.

In our case, we have a whole bunch of unrelated SMI calls to the BIOS
that have absolutely nothing in common except that they use the SMI
calling method. We have abstracted down to the lowest common denominator
of what we can put into the kernel to enable our whole management stack.
Rather than re-invent the SMI stack for each of these functions, we have
provided an abstraction.

To take a concrete example, I suggested to Doug to mention fan status. I
get the feeling that you possibly think that this would be better
integrated into lmsensors, or something like that. That really isn't the
case, as lmsensors is really geared towards bit-banging lm81 (for
example) chips to get fan status. In our case, we have a standardized
BIOS interface to get this info, and that standardized method involves
SMI and not bit-banging interfaces. Once this driver is accepted into
the kernel, we can go and add support in the _userspace_ lmsensors libs
to poll fan and temp using this driver.

For example, we already have at least one buggy implementation of this
exact stack in the kernel as the i8k driver. The i8k driver was reverse-
engineered and works, but it does not follow the spec at all, and so is
subject to major breakage if the BIOS changes. With dcdbas + libsmbios,
we can write this _correctly_, and in such a way that it follows the
spec and will not break on BIOS updates.

What you are asking us to do is just not feasible on many levels. First,
just from the number of functions we would have to implement. We would
have to implement about 60 new sysfs files, with at least 120 separate
functions for read/write. Each function would have to take into account
the specific calling requirements of that specific function. Then, we
would have to implement all of the bugfixes and platform-specific
workarounds in the kernel for each of those functions for each Dell
platform. Each time another function is added to BIOS, we would have to
go out and patch everybody's kernel to support the new function.

Besides the fact that this is just not a good design, there just isn't
the manpower to maintain all of these in the kernel

Re: git-net-selinux-build-fix.patch added to -mm tree (fwd)

2005-08-15 Thread James Morris

Forgot to cc lkml.


-- Forwarded message --
Date: Mon, 15 Aug 2005 23:00:10 -0400 (EDT)
From: James Morris <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: David S. Miller <[EMAIL PROTECTED]>, Stephen Smalley <[EMAIL PROTECTED]>
Subject: Re: git-net-selinux-build-fix.patch added to -mm tree

Please use this patch instead so that we catch the new DCCP message type.

Signed-off-by: James Morris <[EMAIL PROTECTED]>

---

 security/selinux/hooks.c|2 +-
 security/selinux/nlmsgtab.c |3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff -purN -X dontdiff net-2.6.14.o/security/selinux/hooks.c net-2.6.14.w/security/selinux/hooks.c
--- net-2.6.14.o/security/selinux/hooks.c   2005-08-15 15:47:44.0 -0400
+++ net-2.6.14.w/security/selinux/hooks.c   2005-08-15 16:01:29.0 -0400
@@ -659,7 +659,7 @@ static inline u16 socket_type_to_securit
return SECCLASS_NETLINK_ROUTE_SOCKET;
case NETLINK_FIREWALL:
return SECCLASS_NETLINK_FIREWALL_SOCKET;
-   case NETLINK_TCPDIAG:
+   case NETLINK_INET_DIAG:
return SECCLASS_NETLINK_TCPDIAG_SOCKET;
case NETLINK_NFLOG:
return SECCLASS_NETLINK_NFLOG_SOCKET;
diff -purN -X dontdiff net-2.6.14.o/security/selinux/nlmsgtab.c net-2.6.14.w/security/selinux/nlmsgtab.c
--- net-2.6.14.o/security/selinux/nlmsgtab.c   2005-08-15 15:47:44.0 -0400
+++ net-2.6.14.w/security/selinux/nlmsgtab.c   2005-08-15 16:06:44.0 -0400
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -76,6 +76,7 @@ static struct nlmsg_perm nlmsg_firewall_
 static struct nlmsg_perm nlmsg_tcpdiag_perms[] =
 {
{ TCPDIAG_GETSOCK,  NETLINK_TCPDIAG_SOCKET__NLMSG_READ },
+   { DCCPDIAG_GETSOCK, NETLINK_TCPDIAG_SOCKET__NLMSG_READ },
 };
 
 static struct nlmsg_perm nlmsg_xfrm_perms[] =

Re: [PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread Brian Gerst


[EMAIL PROTECTED] wrote:

> This round attempts to conclude all of the LDT related cleanup with some
> finally nice looking LDT code, fixes for the UML build, a bugfix for
> really rather nasty kprobes problems, and the basic framework for an LDT
> test suite.  It is really rather unfortunate that this code is so
> difficult to test, even with DOSemu and Wine, there are still very nasty
> corner cases here - anyone want an iret to 16-bit stack test?.
>
> I was going to attempt to clean up the math-emu code to make it use the
> nice new segment and descriptor table accessors, but it quickly became
> apparent that this would be a long, tedious, error prone process that
> would eventually result in the death of a large section of my brain.
> In addition, it is not very fun to test this on the actual hardware it
> is designed to run on (although I did manage to track down a 386 with
> detachable i387 coprocessor, the owner is not sure it still boots).
> Someday it would be nice to have an audit of this code; it appears to
> be riddled with bugs relating to segmentation, for example it assumes
> LDT segments on overrides, does not use the mm->context semaphore to
> protect LDT access, and generally looks scarily out of date in both
> function and appearance.


If you really want to test the math emu code, you can hack check_x87 in 
head.S to always leave the fpu disabled.  Then you can test it on any 
cpu, not just a 386.


--
Brian Gerst

Re: [PATCH 0/5] improve i2c probing

2005-08-15 Thread Nathan Lutchansky

On Mon, Aug 15, 2005 at 11:39:58PM +0200, Jean Delvare wrote:
> > The second improvement (which is really the point of this patch set)
> > is to add the functions i2c_probe_device and i2c_remove_device for
> > directly creating and destroying i2c clients on a particular adapter:
> > 
> > int i2c_probe_device(struct i2c_adapter *adapter, int driver_id,
> >  int addr, int kind);
> > int i2c_remove_device(struct i2c_adapter *adapter, int driver_id,
> >   int addr);
> > 
> 
> I think I understand the point of i2c_probe_device(). However, it would
> help if you could additionally show how this is going to help the
> media/video drivers. Currently, all these drivers use the traditional
> probing mechanism, and have to jam "foreign" probes, right? I would hope
> that these two patches will make it possible to improve this. Can you
> provide a few examples of use? We need to figure out how good this new
> interface/mechanism is, and this can only be done with concrete
> examples.

OK, so I realized a few hours ago that the i2c_probe_device and
i2c_remove_device interface is probably the wrong way to go about
things, and it's broken anyway because the instantiated devices don't
survive if the client driver module is unloaded and reloaded.

Here's what I'm after.  Devices like video capture cards often have
on-board i2c buses for controlling chips like the video decoder and TV
tuner.  The devices connected to these buses and their addresses can
almost always be determined by the PCI/USB ID of the card or by reading
an on-board EEPROM.  With these special-purpose i2c buses, there's
really no need to do any i2c probing, but we've always been forced to
use probing anyway because that's the only way to instantiate new i2c
clients.  With the i2c_probe_device function I was attempting to provide
a means for video capture card drivers to directly instantiate the i2c
clients it already knows exist without having to probe for them.  (The
i2c_remove_device function was only present for symmetry...)

My new (well, old) idea for explicitly instantiating i2c clients,
instead of i2c_probe_device, is to put a new field into the i2c_adapter
with a list of (driver id, address, kind) tuples that should be
force-detected by the i2c core.

 1. Video capture card driver discovers new capture card
 2. Driver creates new (unprobed) i2c adapter with this device list:
 {
 { I2C_DRIVERID_EEPROM, 0x50, 0 }
 }
 3. i2c core does no probing but force-detects an EEPROM at 0x50
 4. Driver reads EEPROM and updates the device list:
 {
 { I2C_DRIVERID_EEPROM, 0x50, 0 }
 { I2C_DRIVERID_SAA7115, 0x20, 0 }
 { I2C_DRIVERID_TUNER, 0x61, TUNER_PHILIPS_FM1236_MK3 }
 }
 5. Driver somehow notifies the i2c core that the device list changed
 6. i2c core force-detects the remaining two clients
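
The steps above can be sketched in plain C as follows. This is a userspace mock with made-up names (the SKETCH_* IDs, struct known_device, sketch_instantiate), not a proposed kernel interface, and it assumes the device list only ever grows by appending entries, so the core just needs to remember how many entries it has already handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's driver and tuner IDs. */
enum { SKETCH_DRIVERID_EEPROM = 1, SKETCH_DRIVERID_SAA7115, SKETCH_DRIVERID_TUNER };
enum { SKETCH_TUNER_PHILIPS_FM1236_MK3 = 1 };

/* One (driver id, address, kind) tuple from the known-device list. */
struct known_device {
	int driver_id;
	int addr;
	int kind;
};

/* Hypothetical adapter state: the list plus how much of it has been handled. */
struct sketch_adapter {
	const struct known_device *devices;
	size_t ndevices;
	size_t ndone;
};

/* The device list as it looks after step 4. */
static const struct known_device capture_card_devices[] = {
	{ SKETCH_DRIVERID_EEPROM,  0x50, 0 },
	{ SKETCH_DRIVERID_SAA7115, 0x20, 0 },
	{ SKETCH_DRIVERID_TUNER,   0x61, SKETCH_TUNER_PHILIPS_FM1236_MK3 },
};

/* Stand-in for a driver's detect routine; succeeds for any valid address. */
static int sketch_detect(const struct known_device *dev)
{
	return dev->addr >= 0 ? 0 : -1;
}

/*
 * Force-detect every list entry not handled yet (steps 3 and 6).
 * Returns the number of clients created by this call.
 */
static size_t sketch_instantiate(struct sketch_adapter *adap)
{
	size_t created = 0;

	assert(adap != NULL && adap->ndone <= adap->ndevices);
	for (; adap->ndone < adap->ndevices; adap->ndone++)
		if (sketch_detect(&adap->devices[adap->ndone]) == 0)
			created++;
	return created;
}
```

In this sketch the "notification" in step 5 is simply a second call to sketch_instantiate() after ndevices has been raised from 1 to 3.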

I'd really like to model the known-device list after the PCI subsystem
and other buses that track what devices "should" be connected based on
configuration information from the bus's host adapter, but I haven't had
time to research it yet.

> I am not totally convinced by the reintroduction of the i2c_adapter
> flags. I hope we can do without it.
> 
> One possibility would be to have an additional class of client, say
> I2C_CLASS_MISC. This would cover all the chip drivers which do not have
> a well-defined class, so that every client would *have to* define a
> class (we could enforce that at core level - I think this was the
> planned ultimate goal of .class when it was first introduced.)

I'm not sure I like lumping all the "unclassified" clients together.
The class mechanism is used to limit probing of an adapter to client
drivers with similar functions, so putting a bunch of client drivers that
would never be on the same bus into the same class is kind of silly.
(You'd never expect to find a keyboard controller on the same bus
as an IR motion sensor, for example.)

> If I2C_CLASS_MISC doesn't please you, then we can simply keep the idea
> that i2c_adapters that do not want to be probed do not define a class.

I like this better.  :-)  This kind of implies, though, that i2c
adapters that define *any* class would be probed by drivers that had no
class defined.  This might be the correct way to go about it.  -Nathan
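
Written out as a predicate, the rule being discussed might look like the sketch below. The function name and the class bit values are hypothetical, and the rule itself is only the one floated in this thread, not settled behaviour.

```c
/* Class bits are bitmasks; 0 means "no class defined". Values are made up. */
#define SKETCH_CLASS_HWMON	(1u << 0)
#define SKETCH_CLASS_TV		(1u << 1)

/* Returns nonzero if a driver would be allowed to probe an adapter. */
static int sketch_should_probe(unsigned int driver_class,
			       unsigned int adapter_class)
{
	if (adapter_class == 0)
		return 0;	/* adapter opted out of probing entirely */
	if (driver_class == 0)
		return 1;	/* classless driver probes any classed adapter */
	return (driver_class & adapter_class) != 0;
}
```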

Re: git-net-selinux-build-fix-2.patch added to -mm tree

2005-08-15 Thread James Morris

On Mon, 15 Aug 2005, [EMAIL PROTECTED] wrote:

> I sell copies of grep at reasonable prices...

We were working on fixing this separately and I was waiting for an ack 
from Stephen on the patch.

Anyway, please use the patch I just sent.


- James
-- 
James Morris
<[EMAIL PROTECTED]>

Re: [ck] [PATCH] dynamic-tick patch modified for SMP

2005-08-15 Thread Srivatsa Vaddagiri

On Mon, Aug 15, 2005 at 09:39:22AM -0700, john stultz wrote:
> The timer_opts interface is the existing interface, my work replaces it
> and separates timekeeping from the timer interrupt.
> 
> You can find a cumulative version of my patch here:
> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0982.html

Oops ..Thanks for pointing it out! Will try this patch and let you
know how stable time is with dyn-tick.

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Valdis . Kletnieks

On Mon, 15 Aug 2005 18:38:49 CDT, Doug Warzecha said:

> > If this is supposed to be used with the RBU code to trigger a BIOS  
> > update, ...
> 
> This driver is not needed by the RBU code.

Documentation/dell_rbu.txt says:

> The rbu driver needs to have an application which will inform the BIOS to
> enable the update in the next system reboot.

Can the dcdbas code be used to implement that application?





Re: [RFC][PATCH - 4/13] NTP cleanup: Breakup ntp_adjtimex()

2005-08-15 Thread john stultz

On Wed, 2005-08-10 at 18:27 -0700, john stultz wrote:
> All,
>   This patch breaks up the complex nesting of code in ntp_adjtimex() by
> creating a ntp_hardupdate() function and simplifying some of the logic.
> This also mimics the documented NTP spec somewhat better.
> 
> Any comments or feedback would be greatly appreciated.

Ugh. I just caught a bug where I misplaced the parens. 

> - } /* STA_PLL */
> + else if (ntp_hardupdate(txc->offset, xtime))
> + result = TIME_ERROR;
> + }
>   } /* txc->modes & ADJ_OFFSET */
 
That's wrong. The following patch fixes it. 

thanks
-john


diff --git a/kernel/ntp.c b/kernel/ntp.c
--- a/kernel/ntp.c
+++ b/kernel/ntp.c
@@ -388,9 +388,8 @@ int ntp_adjtimex(struct timex *txc)
/* adjtime() is independent from ntp_adjtime() */
if ((time_next_adjust = txc->offset) == 0)
time_adjust = 0;
-   else if (ntp_hardupdate(txc->offset, xtime))
-   result = TIME_ERROR;
-   }
+   } else if (ntp_hardupdate(txc->offset, xtime))
+   result = TIME_ERROR;
} /* txc->modes & ADJ_OFFSET */
 
if (txc->modes & ADJ_TICK) {
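
The effect of the misplaced brace can be reproduced in a small standalone C sketch. The function names, the counter and the failure rule are made up; the assumption that the enclosing test is the single-shot adjtime() check is inferred from the comment in the diff context and is not shown in the hunk itself.

```c
/* Counts calls so the two variants can be compared. */
static int hardupdate_calls;
static long time_adjust = 1;

static int sketch_hardupdate(long offset)
{
	hardupdate_calls++;
	return offset < 0;	/* arbitrary failure rule for this sketch */
}

/* Misplaced brace: the else-if pairs with the inner "offset == 0" test. */
static int offset_buggy(int singleshot, long offset)
{
	int error = 0;

	if (singleshot) {
		if (offset == 0)
			time_adjust = 0;
		else if (sketch_hardupdate(offset))
			error = 1;
	}
	return error;
}

/* Fixed: the else-if pairs with the single-shot test. */
static int offset_fixed(int singleshot, long offset)
{
	int error = 0;

	if (singleshot) {
		if (offset == 0)
			time_adjust = 0;
	} else if (sketch_hardupdate(offset))
		error = 1;
	return error;
}
```

In the buggy variant the hard update runs only in single-shot mode with a nonzero offset; in the fixed variant it runs only when single-shot mode is not selected.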



Re: [RFC][patch 0/2] mm: remove PageReserved

2005-08-15 Thread Daniel Phillips

On Monday 15 August 2005 23:15, David Howells wrote:
> I want to know when a page is going to be modified so that I
> can predict the state of the cache as much as possible. I don't want
> userspace processes corrupting the cache in unrecorded ways.

There are two cases:

  1) Metadata.  If anybody is doing racy writes to metadata pages, it is
 your filesystem, and you have a bug.

  2) Data.  In Linux practice and Posix, racy writes to files have
 undefined semantics, including the possibility that data may end up
 interleaved on a disk block.

You seem to be trying to define (2) as "corruption" and setting out to prevent 
it.  But it is not the responsibility of a filesystem to prevent this, it is 
the responsibility of the application.

Could you please explain why it is not ok to end up with a half-written page 
in your cache, if the client was in fact halfway through writing it when it 
crashed?

Regards,

Daniel

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Kyle Moffett


On Aug 15, 2005, at 19:38:49, Doug Warzecha wrote:
> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
> > Why can't you just implement the system management actions in the
> > kernel driver?
>
> We want to minimize the amount of code in the kernel and avoid having
> to update the driver each time a new system management command is
> added.

One of the recent trends in kernel driver development is to make as much
as possible accessible through standard tools (like with echo and cat
via sysfs).

> The libsmbios project is being updated to use this code.
> http://linux.dell.com/libsmbios/main/.  Using the libsmbios code, you
> will be able to set all of the options in BIOS F2 screen from Linux
> userspace.  Also, libsmbios is looking at implementing a few other
> things like fan status.  Libsmbios is 100% open-source (OSL/GPL dual
> license).

From my point of view, this driver could use sysfs almost entirely and
put all of the hardware-manipulation code completely in kernel space,
along with the hardware detection code.  You could have plain-text files
in /sys/bus/platform/dellbios/ that have all of the BIOS F2 options
accessible to the admin from the command line, without special tools.
(You could always add an extra program that presents a BIOS-like
interface.)

> The power cycle feature of the system powers off the system for a few
> seconds and then powers the system back on without user intervention.
> shutdown() and reboot() don't provide that feature.

Please ensure that the code is only run on reboot (and maybe halt), but
definitely not in the poweroff code.

> > What exactly is smi_type used for?  Please provide better
> > documentation on how to use this and what it does.
>
> The method of generating a host control SMI is not exactly the same
> for each PowerEdge system listed in dcdbas.txt.  host_control_smi_type
> tells the driver how to generate the host control SMI for the system in
> use.  I'll update dcdbas.txt with the SMI type value associated with
> the systems listed in that file.

This is an _excellent_ reason why more of this should be in the kernel.
What happens if the wrong SMI is used?  Shouldn't it be relatively easy
for the kernel to determine the correct SMI itself?

Thanks for your hard work!

Cheers,
Kyle Moffett

--
Unix was not designed to stop people from doing stupid things, because
that would also stop them from doing clever things.
  -- Doug Gwyn



Re: oops in 2.6.13-rc6-git5

2005-08-15 Thread D. ShadowWolf

On Monday 15 August 2005 08:22, you wrote:
> On 8/15/05, D. ShadowWolf <[EMAIL PROTECTED]> wrote:
> > Decided to take the latest git kernel for a run and ran into the
> > following oops when shutting the system down to try it from a cold-boot
> > situation. I wasn't able to capture the oops as it happened, but
> > thankfully syslog was still running and I managed to trap it there.
>
> serial console, netconsole, console on line-printer  are all useful
> for capturing Oops data. There are detailed guides in Documentation/
>
> > When the oops occurred the system was almost shut down, but the command
> > that was executing at the time was to sync the hardware clock to
> > Linux (I think)... the trap from my kernel logs (I have each class of
> > kernel event redirected to a different file. This leads to some huge
> > files in a short span, but is useful for debugging a new kernel) The
> > kernel is tainted by the lt modem drivers (lt_modem & lt_serial) however
> > the problem does not appear to be in either of those, and they function
> > properly under 2.6.12.3
> >
> > I'm running a basic Slackware 10 distribution (other than the ltmodem
> > drivers (gone inside the next month))
>
> I've tried to reproduce the Oops here with your config, but my
> hardware is too different to match your config, so I had to make some
> changes to get the kernel running. In the end I was not able to
> reproduce it.
>
> Can you reproduce the crash reliably?
> Can you reproduce the crash with a non-tainted kernel?

Haven't attempted to reproduce - I actually tried the kernel against my better 
judgement -- I've had hard-drives killed by faulty kernels in the past. But 
since you've requested, yes, I'll give it a shot. Reproducing it with an 
untainted kernel should be as simple as just booting the system, but I'm 
going to have to go monkey my init scripts to disable the automagic loading 
of the modules that taint the kernel.

As for doing it using a network or serial console - I don't have the equipment 
anymore. I used to have a decent WYSE serial console, but that died in the 
same storm that caught my old system... If you know where I can pick up an 
old line printer or a cheap serial terminal I'll buy it and get it setup 
ASAP.

DRH


Re: [PATCH] Fix mmap_kmem (was: [question] What's the difference between /dev/kmem and /dev/mem)

2005-08-15 Thread Linus Torvalds



On Mon, 15 Aug 2005, Steven Rostedt wrote:

> On Mon, 2005-08-15 at 21:16 -0400, Steven Rostedt wrote:
> 
> > Sorry for the late reply, my wife's Grandmother just passed away a few
> > days ago (at 98 years old) and if I went within 6 feet of the computer
> > she would have killed me!
> 
> Just to clarify, "she" as in my wife would have killed me. Not her late
> grandmother.

Thanks for the clarification. We were starting to worry about your family.

Linus

Re: BSD jail

2005-08-15 Thread Joshua Hudson

> 
> To build a virtual network device requires code for the device, code
> for routing the device
> in the kernel, some way to tell the router that this machine is hosted
> through the host
> machine's ethernet card, and control of which processes use which
> network devices.
> 
I've bombed out. I don't understand how the network devices work well
enough to do any of this.

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Kyle Moffett


On Aug 15, 2005, at 18:58:56, [EMAIL PROTECTED] wrote:

Why can't you just implement the system management actions in
the kernel driver?  This is tantamount to a binary SMI hook to
userspace.  What functionality does this provide on a dell
system from an administrator's point of view?


Kyle,
I'm sure that not everybody agrees with the whole concept of SMI
calls. Nevertheless, these calls exist, and in order to have a complete
systems-management solution, we have to provide a way to do SMI calls.
Now, we have developed a way to do these SMI calls from userspace
without kernel support, but we are trying to be community-friendly and
show our hooks in the open, rather than trying to sneak them in under
the covers.




You might not like the concept of a generic hook for SMI calls
in the kernel, but the alternatives are hardly better. One alternative
is the already-mentioned method that we do things under the covers in
userspace. Another alternative is that we write separate kernel code
for each and every SMI call that exists in the Dell BIOS.



The second alternative is not entirely feasible. We have over 60
SMI functions, and we would have to write a kernel-mode wrapper for
each and every one. I hope you agree that code that doesn't exist is
less buggy than code that is, and that code that is in userspace is a
whole lot less likely to cause a kernel crash than code that is in
the kernel.


I think the second alternative is actually feasible and preferable. The
point of the kernel is to provide safe and secure access to two things:
  1) Hardware through an abstraction layer
  2) Software services (like IP stack) that are not feasible to do in
 userspace.

A system that just provides a hunk of DMA RAM and the ability to generate
interrupts is definitely not 2, and does not really follow the ideal
behind 1 either.  I gave the firmware example earlier.  There are several
devices that provide access to update firmware by reading and writing a
firmware file directly in sysfs, then updating it on reboot if necessary.


We are trying to keep our kernel bloat down. We don't really think that
customers of IBM or HP really want their Red Hat kernels loaded down with
a bunch of Dell-only code.


That's what kconfig is for.  My G4 Powerbook doesn't have support for
hardware found in my G4 desktop any more than an IBM box should be forced
to have support for Dell hardware, yet all platforms work fine from the
same kernel tree.


Additionally, we are releasing an open source library (GPL/OSL
dual license) that can use these hooks to perform many systems
management functions in userspace. See
http://linux.dell.com/libsmbios/main/. We should have code in libsmbios to
do SMI using this driver within about two weeks.  We are currently writing the
SMI hooks in libsmbios using this posted version of the driver. I am the
maintainer of this project, and it is my goal to have code in libsmbios
for every Dell SMI call.


That's a nice project.  I applaud Dell for its openness, but that's not the
only issue here; the kernel needs good engineering too.

I would suggest that you try to implement as much as is possible in a kernel
driver.  Firmware loading support, for example, or hardware sensors, should
integrate well into sysfs and be accessible through existing tools if
possible.  Doug also mentions fan status and control in his mail.  Could you
provide such access through existing fan status/control interfaces so that
existing tools work as well?


We would welcome feedback on a better way to implement this
driver in the kernel, but the fact remains that we have to have a way to do
this, and we are open-sourcing all of the code necessary to get this done.


Thank you for your effort.  You guys have made significant progress, but IMHO,
you've still got a ways to go.  Keep up the good work, though!

Cheers,
Kyle Moffett

--
Unix was not designed to stop people from doing stupid things, because that
would also stop them from doing clever things.
  -- Doug Gwyn



Re: [PATCH] Fix mmap_kmem (was: [question] What's the difference between /dev/kmem and /dev/mem)

2005-08-15 Thread Steven Rostedt

On Mon, 2005-08-15 at 21:16 -0400, Steven Rostedt wrote:

> Sorry for the late reply, my wife's Grandmother just passed away a few
> days ago (at 98 years old) and if I went within 6 feet of the computer
> she would have killed me!

Just to clarify, "she" as in my wife would have killed me. Not her late
grandmother.

-- Steve



Re: [PATCH] Fix mmap_kmem (was: [question] What's the difference between /dev/kmem and /dev/mem)

2005-08-15 Thread Steven Rostedt

On Sat, 2005-08-13 at 10:37 -0700, Linus Torvalds wrote:
> 
> On Sat, 13 Aug 2005, Arjan van de Ven wrote:
> > 
> > attached is the same patch but now with Steven's change made as well
> 
> Actually, the more I looked at that mmap_kmem() function, the less I liked 
> it.  Let's get that sucker fixed better first. It's still not wonderful, 
> but at least now it tries to verify the whole _range_ of the mapping.
> 
> Steven, does this "alternate mmap_kmem fix" patch work for you?
> 

Sorry for the late reply, my wife's Grandmother just passed away a few
days ago (at 98 years old) and if I went within 6 feet of the computer
she would have killed me!

I just tried out the patch, and it worked fine for me.

Thanks,

-- Steve



Re: kernel 2.4.27-10: isofs driver ignore some parameters with mount

2005-08-15 Thread Marcelo Tosatti


Hi folks,

On Fri, Aug 12, 2005 at 05:29:36PM +0900, Horms wrote:
> On Fri, Aug 12, 2005 at 10:44:17AM +0300, Alexander Pytlev wrote:
> > Hello Debian,
> > 
> > Kernel 2.4.27-10
> > With mount isofs filesystem, any mount parameters after
> > iocharset=,map=,session= are ignored.
> > 
> > Sample:
> > 
> > mount -t isofs -o uid=100,iocharset=koi8-r,gid=100 /dev/cdrom /media/cdrom
> > 
> > gid=100 - was ignored
> > 
> > I looked in the source and found the problem. I made two patches, a simple one
> > and a full one (which adds some functionality: ignoring wrong mount parameters)
> 
> Thanks,
> 
> I will try and get the simple version of this patch into the next
> Sarge update.
> 
> I have also CCed Marcelo and the LKML for their consideration,
> as this problem still seems to be present in the latest 2.4 tree.
> 
> -- 
> Horms
> 
> simple patch:
> ===
> --- kernel-source-2.4.27/fs/isofs/inode.c   2005-05-19 13:29:39.0 +0300
> +++ kernel-source/fs/isofs/inode.c  2005-08-11 11:55:12.0 +0300
> @@ -340,13 +340,13 @@
> else if (!strcmp(value,"acorn")) popt->map = 'a';
> else return 0;
> }
> -   if (!strcmp(this_char,"session") && value) {
> +   else if (!strcmp(this_char,"session") && value) {
> char * vpnt = value;
> unsigned int ivalue = simple_strtoul(vpnt, &vpnt, 0);
> if(ivalue < 0 || ivalue >99) return 0;
> popt->session=ivalue+1;
> }
> -   if (!strcmp(this_char,"sbsector") && value) {
> +   else if (!strcmp(this_char,"sbsector") && value) {
> char * vpnt = value;
> unsigned int ivalue = simple_strtoul(vpnt, &vpnt, 0);
> if(ivalue < 0 || ivalue >660*512) return 0;
> ===

Neither the "sbsector" nor the "session" parameter is part of the options string
used in Alexander's example, so how can this patch make any difference?

Usage of "sbsector" or "session" parameters could explain the above patch
making a difference because the buggy, always true "(unsigned long) ivalue < 0"
comparison invokes "return 0", but that is not the case.

The code after the "popt->iocharset = value;" does not make any sense.

It seems that the "*value = 0" assignment can screw up the rest of the
string; isn't that the real issue?

#ifdef CONFIG_JOLIET
if (!strcmp(this_char,"iocharset") && value) {
popt->iocharset = value;
while (*value && *value != ',')
value++;
if (value == popt->iocharset)
return 0;
*value = 0;
} else
#endif
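
What the scan in that branch does to the string can be tried outside the
kernel. The following is a minimal user-space sketch, not the kernel code:
terminate_charset() is a hypothetical name, and whether the option string
has already been split at commas before this branch runs is an assumption,
since the excerpt does not show the enclosing loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * User-space copy of the scan from the quoted iocharset branch.
 * Returns the length of the charset name, or 0 if the name is empty
 * (the kernel code does "return 0" in that case).
 */
static size_t terminate_charset(char *value)
{
	char *start = value;

	assert(value != NULL);
	while (*value && *value != ',')
		value++;
	if (value == start)
		return 0;
	*value = 0;	/* overwrites either a ',' or an existing NUL */
	return (size_t)(value - start);
}
```

If each option has already been cut out as its own NUL-terminated token,
the loop stops at the existing terminator and the final store rewrites a
NUL with a NUL, leaving the rest of the string alone. Only when the value
still has a comma after it does the store replace that comma.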






RE: [PATCH] add transport class symlink to device object

2005-08-15 Thread James Bottomley

On Mon, 2005-08-15 at 20:52 -0400, [EMAIL PROTECTED] wrote:
> What is ":00:04:0" in this case ? The "device" is not a serial
> port, which is what the ttyXX back link would lead you to believe.
> Thus, it's a serial port multiplexer that supports up to N ports,
> right ? and wouldn't the more correct representation have been to
> enumerate a device for each serial port ? (e.g. :00:04.0/line0,
> :00:04.0/line1, or similar)

It's PCI segment 0, bus 0, slot 4, function 0, which is apparently a 3
port serial card (probably the GSP function of a pa8800?)

> Think if SCSI used this same style of representation. For example,
> if there was no scsi target device entity, but class entities did
> exist and they just pointed back to the scsi host device entry.

Yes, it's theoretically possible to have had SCSI do this.  We didn't do
it at the time because class_devices didn't exist when the SCSI tree was
first put together.  It would, however, have rather put the mockers on
doing transport classes since class devices can't point at other class
devices.

> My vote is to make the multiplexor instantiate each serial line
> as a separate device.

That's a choice that's up to the maintainer of the serial driver ...

James



RE: [PATCH] add transport class symlink to device object

2005-08-15 Thread James . Smart

Actually, I view this as being a little odd...

What is ":00:04:0" in this case ? The "device" is not a serial
port, which is what the ttyXX back link would lead you to believe.
Thus, it's a serial port multiplexer that supports up to N ports,
right ? and wouldn't the more correct representation have been to
enumerate a device for each serial port ? (e.g. :00:04.0/line0,
:00:04.0/line1, or similar)

Think if SCSI used this same style of representation. For example,
if there was no scsi target device entity, but class entities did
exist and they just pointed back to the scsi host device entry.

My vote is to make the multiplexor instantiate each serial line
as a separate device.

-- james s

> On Sun, 2005-08-14 at 16:02 +0100, Matthew Wilcox wrote:
> > /sys/class/tty/ttyS0/device -> 
> ../../../devices/parisc/0/0:0/pci:00/:00:04.0
> > /sys/class/tty/ttyS1/device -> 
> ../../../devices/parisc/0/0:0/pci:00/:00:04.0
> > /sys/class/tty/ttyS2/device -> 
> ../../../devices/parisc/0/0:0/pci:00/:00:04.0
> > /sys/class/tty/ttyS3/device -> 
> ../../../devices/parisc/0/0:0/pci:00/:00:05.0
> > /sys/class/tty/ttyS4/device -> 
> ../../../devices/parisc/0/0:0/pci:00/:00:05.0
> 
> Actually, isn't the fix to all of this to combine Greg and James'
> patches?
> 
> The Greg one fails in SCSI because we don't have unique class device
> names (by convention we use the same name as the device bus_id) and
> James' one fails for ttys because the class name isn't 
> unique.  However,
> if the link were derived from something like
> 
> :
> 
> Then is would be unique in both cases.
> 
> Unless anyone can think of any more failing cases?
> 
> James
> 
> 
> 

[PATCH] removes pci_find_device from i6300esb.c

2005-08-15 Thread Jiri Slaby

This patch changes pci_find_device to pci_get_device (encapsulated in
for_each_pci_dev) in the i6300esb watchdog driver and adds the matching
pci_dev_put calls.
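
The reason the pci_dev_put calls come along with the conversion can be
modelled in user space. This is a toy sketch, not the PCI core: toy_dev,
toy_get_next(), toy_put() and toy_find() are hypothetical names, and the
only behaviour assumed from the real API is that a "get"-style iterator
drops the reference on the device passed in and returns the next one with
a reference held.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev {
	int id;
	int refcount;
};

/*
 * Hypothetical stand-in for a "get"-style iterator: drops the reference
 * on 'from' (if any) and returns the next device with a reference taken,
 * or NULL at the end of the table.
 */
static struct toy_dev *toy_get_next(struct toy_dev *devs, size_t n,
				    struct toy_dev *from)
{
	size_t next = from ? (size_t)(from - devs) + 1 : 0;

	assert(devs != NULL);
	if (from)
		from->refcount--;
	if (next >= n)
		return NULL;
	devs[next].refcount++;
	return &devs[next];
}

/* Release a reference obtained from toy_get_next(). */
static void toy_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* Find the device with the wanted id; the caller owns one reference. */
static struct toy_dev *toy_find(struct toy_dev *devs, size_t n, int wanted)
{
	struct toy_dev *dev = NULL;

	while ((dev = toy_get_next(devs, n, dev)) != NULL) {
		if (dev->id == wanted)
			break;	/* leaves the loop still holding a reference */
	}
	return dev;
}
```

Running the loop to the end balances every reference by itself; breaking
out early does not, so each exit path that is done with the device needs
its own put.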

Generated against kernel version 2.6.13-rc5-mm1.

Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>

This is a repost; the patch was already posted on:
8 Aug 2005

diff --git a/drivers/char/watchdog/i6300esb.c b/drivers/char/watchdog/i6300esb.c
--- a/drivers/char/watchdog/i6300esb.c
+++ b/drivers/char/watchdog/i6300esb.c
@@ -368,12 +368,11 @@ static unsigned char __init esb_getdevic
  *  Find the PCI device
  */
 
-while ((dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+for_each_pci_dev(dev)
 if (pci_match_id(esb_pci_tbl, dev)) {
 esb_pci = dev;
 break;
 }
-}
 
 if (esb_pci) {
if (pci_enable_device(esb_pci)) {
@@ -430,6 +429,7 @@ err_release:
pci_release_region(esb_pci, 0);
 err_disable:
pci_disable_device(esb_pci);
+   pci_dev_put(esb_pci);
}
 out:
return 0;
@@ -481,6 +481,7 @@ err_unmap:
pci_release_region(esb_pci, 0);
 /* err_disable: */
pci_disable_device(esb_pci);
+   pci_dev_put(esb_pci);
 /* out: */
 return ret;
 }
@@ -497,6 +498,7 @@ static void __exit watchdog_cleanup (voi
iounmap(BASEADDR);
pci_release_region(esb_pci, 0);
pci_disable_device(esb_pci);
+   pci_dev_put(esb_pci);
 }
 
 module_init(watchdog_init);

Re: [-mm PATCH] remove use of pci_find_device in watchdog driver for Intel 6300ESB chipset

2005-08-15 Thread Jiri Slaby


David Härdeman wrote:


On Mon, Aug 15, 2005 at 02:30:15PM -0700, Naveen Gupta wrote:
[...]

-while ((dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
-if (pci_match_id(esb_pci_tbl, dev)) {
-esb_pci = dev;
-break;
-}
-}
+while (ids->vendor && ids->device) {
+if ((dev = pci_get_device(ids->vendor, ids->device, dev)) != NULL) {
+esb_pci = dev;
+break;
+}
+ids++;
+}



I'm certainly not sure about this, but the proposed while loop looks a 
bit unconventional, wouldn't something like:


for_each_pci_dev(dev) {
        if (pci_match_id(esb_pci_tbl, dev)) {
                esb_pci = dev;
                break;
        }
}

be better?


I did that here: http://lkml.org/lkml/2005/8/9/305, but it hasn't been
acked yet. I should repost.


--
Jiri Slaby www.fi.muni.cz/~xslaby
~\-/~  [EMAIL PROTECTED]  ~\-/~
241B347EC88228DE51EE A49C4A73A25004CB2A10


Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)

2005-08-15 Thread john stultz

On Tue, 2005-08-16 at 00:14 +0200, Roman Zippel wrote:
> On Wed, 10 Aug 2005, john stultz wrote:
> 
> > Here's the next rev in my rework of the current timekeeping subsystem.
> > No major changes, only some cleanups and further splitting the larger
> > patches into smaller ones.
> > 
> > The goal of this patch set is to provide a simplified and streamlined
> > common timekeeping infrastructure that architectures can optionally use
> > to avoid duplicating code with other architectures.
> 
> It's still the same old abstraction. Let's try it in OO terms why it's the 
> wrong one. What basically is needed is something like this:
> 
>   base clock  -> continuous clock -> clock implementation
>   -> tick clock   -> 
> 
> The base clock class is an abstract class which provides the basic time 
> services like get time, set time...
> The continuous clock class and tick clock class are also abstract classes,
> but provide basic template functions, which can be used by the actual 
> implementations to do most of the work.
> 
> What you do with your patches is to just provide an abstract class for 
> continuous clocks, and tick based clocks have to emulate a continuous clock.

Sorry. It was subtle, but after thinking more about your arguments, I've
stepped back from my earlier goals of replacing the timekeeping code for
all arches and instead I've decided to just focus on allowing
architectures that would otherwise duplicate code for a continuous
timesource to use a common code base.

Think of it more as a replacement for the time_interpolator code (which
thanks to Christoph Lameter, it is quite influenced by).

So in that way the "abstract class" will just be the current interface
of:

1. do_gettimeofday()
2. do_settimeofday()
3. getnstimeofday()
4. periodic hook (update_wall_time)
5. init code

To that I'd like to add:
6. do_monotonic_clock(), for which I've just added an implementation for
tick based systems.

Then in the tick based class, nothing changes (except for the new
do_monotonic_clock implementation). And in the continuous timesource
class, it uses my generic-tod code.

> Please provide the right abstractions, e.g. leave the gettimeofday 
> implementation to the timesource and use some helper (template) functions 
> to do the actual work. Basically as long as you have a cycle_t in the 
> common code something is really wrong, this code belongs in the continuous
> clock template.

I'm not sure I agree. By pushing all the gettimeofday logic behind the
timesource or clock class you describe above, you end up with lots of
duplicated and error prone code. That's the issue I'm trying to avoid
between the different arches. Additionally, the current i386 timer_opts
code (which I'm to blame for) does almost exactly this at the timesource
level, and while it did allow for alternate timesources to be easily
used, it caused a large amount of almost duplicate code with slightly
differing behavior, and has made changes like dynamic ticks difficult to
do correctly.

It was this reason (along with Christoph's proddings - due to the
fsyscall requirements) that the timesource structure only provides an
abstraction to a free running counter instead of a stateful structure
with function pointers that return the timeofday.

Now, this does not limit any arch from doing their own thing and
implementing their own "timeofday abstract class". I'm just trying to
provide a correct and clean infrastructure for the arches that could use
a continuous timesource. 

> This also allows better implementations, e.g. gettimeofday can be done in 
> a single step instead of two using a single lock instead of two.

This is a mischaracterization. In most cases the continuous
gettimeofday is done in a single step with a single lock. Although it
does have the flexibility to allow for more complex setups, as well as
the ability to opt out and use the existing tick based code. 

thanks
-john


Re: [PATCH 5/6] i386 virtualization - Make generic set wrprotect a macro

2005-08-15 Thread Zachary Amsden


Adrian Bunk wrote:


On Mon, Aug 15, 2005 at 04:00:39PM -0700, [EMAIL PROTECTED] wrote:

Make the generic version of ptep_set_wrprotect a macro.  This is good for
code uniformity, and fixes the build for architectures which include pgtable.h
through headers into assembly code, but do not define a ptep_set_wrprotect
function.

This is against the kernel coding style.
In fact, we are usually doing exactly the opposite.

What exactly is the technical problem this patch is trying to solve, IOW 
which architectures are breaking for you?

The generic pgtable.h include is special and apparently deliberately 
against kernel coding style.  Look at the rest of the file.  All 
"functions" here are purely macros, or encapsulated with:


#ifndef __ASSEMBLY__
static inline void foo()
#endif

This is because asm-generic/pgtable.h can get included in assembler 
files via a number of ways.


Now, if you have a header file that gets conditionally excluded based on 
#ifndef __ASSEMBLY__,  as asm-i386/pgtable.h does to 
pgtable-{2|3}level.h you must do one of the following:


1) move all __HAVE_ARCH_PTEP_XXX definitions out of the !__ASSEMBLY__
   clause
2) protect all inline assembler functions in asm-generic/pgtable.h with
   !__ASSEMBLY__
3) use macros instead of inline functions in asm-generic

Having the ability to redefine page table accessors at the sub-arch 
level is necessary to have a paravirtualized sub-arch of i386.  My third 
attempt at this (the first was a horror unthinkable to even publish) is 
trying to make the code as clean and consistent as possible.  #1 above
makes maintaining compile time PAE for i386 with a paravirtualized
sub-arch cumbersome, since one must either isolate the __HAVE_ARCH_XXX
defines from the XXX function definition itself, surround each
individual function with !__ASSEMBLY__, or switch to macros instead of 
inline functions for include/asm-i386/pgtable-{2|3}level.h.  Ugly and 
difficult to maintain.
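
Option 3 can be sketched in plain C as follows. Everything prefixed toy_
or TOY_ is hypothetical and only mimics the shape of the generic header;
the macro is written with do { } while (0) so that the sketch stays in
standard C.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_RW 0x2UL	/* hypothetical write-permission bit */

#ifndef __ASSEMBLY__
/* C-only declarations; an assembler pass must never see these. */
typedef struct { unsigned long val; } toy_pte_t;

static inline toy_pte_t toy_pte_wrprotect(toy_pte_t pte)
{
	pte.val &= ~TOY_PAGE_RW;
	return pte;
}
#endif /* !__ASSEMBLY__ */

#ifndef TOY_HAVE_ARCH_PTEP_SET_WRPROTECT
/*
 * Generic fallback. A #define is only expanded where C code uses it,
 * so it needs no __ASSEMBLY__ guard of its own.
 */
#define toy_ptep_set_wrprotect(ptep)			\
do {							\
	toy_pte_t *toy_p = (ptep);			\
	assert(toy_p != NULL);				\
	*toy_p = toy_pte_wrprotect(*toy_p);		\
} while (0)
#endif
```

An architecture that wants its own version would define
TOY_HAVE_ARCH_PTEP_SET_WRPROTECT and the accessor before this point, and
the fallback drops out without touching any guarded region.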


Thus, I chose the default convention of following the surrounding code.  
There are 6 C inline functions in the generic pgtable.h and 37 macros.  
Converting to and from macros and inline functions here is rather 
tedious and error prone, all of these functions are conditionally 
defined based on the architecture, and I don't want to risk introducing 
yet another regression for an architecture that I don't have a 
cross-compile set up for.


If you have a better approach, I'd be interested in hearing it.

Zach

usbmon in 2.6.13: peeking into DMA areas

2005-08-15 Thread Pete Zaitcev

I am not completely confident in this patch, so while the fix is being
requested by users, I would like to have it postponed until 2.6.13.

This code looks at urb->transfer_dma, maps the page and takes the data.
I am looking for volunteers to contribute architectures other than i386
or to develop an architecture-neutral API for it (or to point out to me
that this has been done already).
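
The address arithmetic behind "maps the page and takes the data" can be
tried in user space. This is a sketch with hypothetical names
(toy_dma_loc, toy_dma_locate) and an assumed 4 KiB page; it relies on the
same assumption the patch states for i386, that the DMA handle equals the
physical address. The in_page field is an addition of this sketch: the
posted code copies len bytes from the mapped page without such a check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

struct toy_dma_loc {
	uint64_t pfn;	/* page frame number: which page to map */
	size_t offset;	/* byte offset of the data inside that page */
	size_t in_page;	/* how many of 'len' bytes fit before the page ends */
};

/* Split a bus address into the page to map and the offset within it. */
static struct toy_dma_loc toy_dma_locate(uint64_t dma_addr, size_t len)
{
	struct toy_dma_loc loc;
	size_t room;

	loc.pfn = dma_addr >> TOY_PAGE_SHIFT;
	loc.offset = (size_t)(dma_addr & (TOY_PAGE_SIZE - 1));
	assert(loc.offset < TOY_PAGE_SIZE);

	room = TOY_PAGE_SIZE - loc.offset;
	loc.in_page = len < room ? len : room;
	return loc;
}
```

For a buffer that starts 16 bytes before the end of a page, a 32-byte
copy would run past the single mapped page, which is what in_page makes
visible.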

Signed-off-by: Pete Zaitcev <[EMAIL PROTECTED]>

diff -urpN -X dontdiff linux-2.6.13-rc6/drivers/usb/mon/Makefile linux-2.6.13-rc6-lem/drivers/usb/mon/Makefile
--- linux-2.6.13-rc6/drivers/usb/mon/Makefile   2005-08-14 20:57:43.0 -0700
+++ linux-2.6.13-rc6-lem/drivers/usb/mon/Makefile   2005-08-15 11:25:32.0 -0700
@@ -2,7 +2,7 @@
 # Makefile for USB Core files and filesystem
 #
 
-usbmon-objs:= mon_main.o mon_stat.o mon_text.o
+usbmon-objs:= mon_main.o mon_stat.o mon_text.o mon_dma.o
 
 # This does not use CONFIG_USB_MON because we want this to use a tristate.
 obj-$(CONFIG_USB)  += usbmon.o
diff -urpN -X dontdiff linux-2.6.13-rc6/drivers/usb/mon/mon_dma.c linux-2.6.13-rc6-lem/drivers/usb/mon/mon_dma.c
--- linux-2.6.13-rc6/drivers/usb/mon/mon_dma.c  1969-12-31 16:00:00.0 -0800
+++ linux-2.6.13-rc6-lem/drivers/usb/mon/mon_dma.c  2005-08-15 16:11:51.0 -0700
@@ -0,0 +1,55 @@
+/*
+ * The USB Monitor, inspired by Dave Harding's USBMon.
+ * 
+ * mon_dma.c: Library which snoops on DMA areas.
+ *
+ * Copyright (C) 2005 Pete Zaitcev ([EMAIL PROTECTED])
+ */
+#include 
+#include 
+#include 
+#include 
+
+#include  /* Only needed for declarations in usb_mon.h */
+#include "usb_mon.h"
+
+#ifdef __i386__ /* CONFIG_ARCH_I386 does not exist */
+#define MON_HAS_UNMAP 1
+
+#define phys_to_page(phys) pfn_to_page((phys) >> PAGE_SHIFT)
+
+char mon_dmapeek(unsigned char *dst, dma_addr_t dma_addr, int len)
+{
+   struct page *pg;
+   unsigned long flags;
+   unsigned char *map;
+   unsigned char *ptr;
+
+   /*
+* On i386, a DMA handle is the "physical" address of a page.
+* In other words, the bus address is equal to physical address.
+* There is no IOMMU.
+*/
+   pg = phys_to_page(dma_addr);
+
+   /*
+* We are called from hardware IRQs in case of callbacks.
+* But we can be called from softirq or process context in case
+* of submissions. In such case, we need to protect KM_IRQ0.
+*/
+   local_irq_save(flags);
+   map = kmap_atomic(pg, KM_IRQ0);
+   ptr = map + (dma_addr & (PAGE_SIZE-1));
+   memcpy(dst, ptr, len);
+   kunmap_atomic(map, KM_IRQ0);
+   local_irq_restore(flags);
+   return 0;
+}
+#endif /* __i386__ */
+
+#ifndef MON_HAS_UNMAP
+char mon_dmapeek(unsigned char *dst, dma_addr_t dma_addr, int len)
+{
+   return 'D';
+}
+#endif
diff -urpN -X dontdiff linux-2.6.13-rc6/drivers/usb/mon/mon_text.c linux-2.6.13-rc6-lem/drivers/usb/mon/mon_text.c
--- linux-2.6.13-rc6/drivers/usb/mon/mon_text.c 2005-08-14 20:57:43.0 -0700
+++ linux-2.6.13-rc6-lem/drivers/usb/mon/mon_text.c 2005-08-15 11:44:13.0 -0700
@@ -91,25 +91,11 @@ static inline char mon_text_get_data(str
 int len, char ev_type)
 {
int pipe = urb->pipe;
-   unsigned char *data;
-
-   /*
-* The check to see if it's safe to poke at data has an enormous
-* number of corner cases, but it seems that the following is
-* more or less safe.
-*
-* We do not even try to look transfer_buffer, because it can
-* contain non-NULL garbage in case the upper level promised to
-* set DMA for the HCD.
-*/
-   if (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)
-   return 'D';
 
if (len <= 0)
return 'L';
-
-   if ((data = urb->transfer_buffer) == NULL)
-   return 'Z'; /* '0' would be not as pretty. */
+   if (len >= DATA_MAX)
+   len = DATA_MAX;
 
/*
 * Bulk is easy to shortcut reliably. 
@@ -126,8 +112,21 @@ static inline char mon_text_get_data(str
}
}
 
-   if (len >= DATA_MAX)
-   len = DATA_MAX;
+   /*
+* The check to see if it's safe to poke at data has an enormous
+* number of corner cases, but it seems that the following is
+* more or less safe.
+*
+* We do not even try to look transfer_buffer, because it can
+* contain non-NULL garbage in case the upper level promised to
+* set DMA for the HCD.
+*/
+   if (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)
+   return mon_dmapeek(ep->data, urb->transfer_dma, len);
+
+   if (urb->transfer_buffer == NULL)
+   return 'Z'; /* '0' would be not as pretty. */
+
memcpy(ep->data, urb->transfer_buffer, len);
return 0;
 }
diff -urpN -X dontdiff linux-2.6.13-rc6/drivers/usb/mon/usb_mon.h linux-2.6.13-rc6-lem/drivers/usb/mon/usb_mon.h

Re: [patch] Real-Time Preemption, -RT-2.6.13-rc4-V0.7.53-01, High Resolution Timers & RCU-tasklist features

2005-08-15 Thread George Anzinger


Ingo Molnar wrote:

* Ingo Molnar <[EMAIL PROTECTED]> wrote:



* George Anzinger  wrote:



Ingo, all

I, silly person that I am, configured an RT, SMP, PREEMPT_DEBUG system. 
Someone put code in the NMI path to modify the preempt count which, 
as often as not, will generate a PREEMPT_DEBUG message, as there is no telling
what state the preempt count is in on an NMI interrupt.  I have sent 
the attached patch to Andrew on this, but meanwhile, if you want RT, 
SMP, PREEMPT_DEBUG you will be much better off with this.


ah - thanks, applied. Might explain some of the recent SMP weirdnesses 
i'm seeing. Attributed them to the HRT patch ;-)



i'm still seeing weird crashes under SMP, which go away if i disable 
CONFIG_HIGH_RES_TIMERS. (this after i fixed a couple of other SMP bugs 
in the HRT code) It happens sometime during the bootup, after enabling 
the network but before users can log in. There's no good debug info, 
just a hang that comes from all CPUs trying to get some debug info out 
but crashing deeply.


I haven't looked at this new code all that closely as yet.  One thing I 
did notice is that there is an assumption that the "timer being 
delivered flag" can be shared between LR timers and HR timers.  I 
suspect this is wrong as the delivery code is in separate threads (I
assume).  This could lead to del_timer_async missing a timer.


In the prior patch we just ignored the del_timer_async issue for HR 
timers (code I plan to do soon).  This WAS taken care of in earlier 
kernels by a reuse of one of the list link fields, but Andrew convince 
me that this was _not_ good.


So, my guess, a nanosleep for an RT task (I think you said these are 
promoted to HR) is completing and over writing the deliver in progress 
flag for a LR timer which just happens to have a del_timer_sync going on 
at the same time.

--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/4] UML -

2005-08-15 Thread Jeff Dike

From Al Viro:

Fix a macro typo which could break if the macro is passed arguments with
side-effects.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc6/arch/um/include/sysdep-x86_64/ptrace.h
===
--- linux-2.6.13-rc6.orig/arch/um/include/sysdep-x86_64/ptrace.h
2005-06-17 15:48:29.0 -0400
+++ linux-2.6.13-rc6/arch/um/include/sysdep-x86_64/ptrace.h 2005-08-15 
13:32:57.0 -0400
@@ -227,7 +227,7 @@
 panic("Bad register in UPT_SET : %d\n", reg);  \
break; \
 } \
-val; \
+__upt_val; \
 })
 
 #define UPT_SET_SYSCALL_RETURN(r, res) \
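
To illustrate the class of bug this patch fixes, here is a minimal sketch. It is not the actual UPT_SET code: STORE_BAD, STORE_GOOD and next_value are hypothetical names, and the example assumes a compiler with GCC-style statement expressions. The point is that a `({ ... })` macro should evaluate each argument once into a local and yield that local, never the raw argument.

```c
/* Hypothetical illustration; requires GCC/Clang statement expressions. */
static int calls;

/* An argument with a side effect: counts how often it is evaluated. */
static int next_value(void)
{
	return ++calls;
}

/* Buggy: `val` is expanded twice, so its side effects happen twice. */
#define STORE_BAD(slot, val)		\
({					\
	(slot) = (val);			\
	val;				\
})

/* Fixed: evaluate `val` once into a local and yield the local. */
#define STORE_GOOD(slot, val)		\
({					\
	int __store_val = (val);	\
	(slot) = __store_val;		\
	__store_val;			\
})
```

With `STORE_BAD(slot, next_value())` the helper runs twice and the macro yields a different value from the one it stored; with STORE_GOOD it runs once and both agree.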


[PATCH 1/4] UML - Fix signal frame copy_user

2005-08-15 Thread Jeff Dike

From Al Viro:

The copy_user stuff in the signal frame code was broken.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc6/arch/um/sys-i386/signal.c
===
--- linux-2.6.13-rc6.orig/arch/um/sys-i386/signal.c 2005-08-15 
12:03:10.0 -0400
+++ linux-2.6.13-rc6/arch/um/sys-i386/signal.c  2005-08-15 12:04:08.0 
-0400
@@ -122,9 +122,9 @@
int err;
 
to_fp = to->fpstate;
-   from_fp = from->fpstate;
sigs = to->oldmask;
err = copy_from_user(to, from, sizeof(*to));
+   from_fp = to->fpstate;
to->oldmask = sigs;
to->fpstate = to_fp;
if(to_fp != NULL)
Index: linux-2.6.13-rc6/arch/um/sys-x86_64/signal.c
===
--- linux-2.6.13-rc6.orig/arch/um/sys-x86_64/signal.c   2005-08-15 
12:03:10.0 -0400
+++ linux-2.6.13-rc6/arch/um/sys-x86_64/signal.c2005-08-15 
12:04:08.0 -0400
@@ -104,28 +104,35 @@
 int copy_sc_from_user_tt(struct sigcontext *to, struct sigcontext *from,
 int fpsize)
 {
-   struct _fpstate *to_fp, *from_fp;
-   unsigned long sigs;
-   int err;
-
-   to_fp = to->fpstate;
-   from_fp = from->fpstate;
-   sigs = to->oldmask;
-   err = copy_from_user(to, from, sizeof(*to));
-   to->oldmask = sigs;
-   return(err);
+   struct _fpstate *to_fp, *from_fp;
+   unsigned long sigs;
+   int err;
+
+   to_fp = to->fpstate;
+   sigs = to->oldmask;
+   err = copy_from_user(to, from, sizeof(*to));
+   from_fp = to->fpstate;
+   to->fpstate = to_fp;
+   to->oldmask = sigs;
+   if(to_fp != NULL)
+   err |= copy_from_user(to_fp, from_fp, fpsize);
+   return(err);
 }
 
 int copy_sc_to_user_tt(struct sigcontext *to, struct _fpstate *fp,
   struct sigcontext *from, int fpsize)
 {
-   struct _fpstate *to_fp, *from_fp;
-   int err;
+   struct _fpstate *to_fp, *from_fp;
+   int err;
 
-   to_fp = (fp ? fp : (struct _fpstate *) (to + 1));
-   from_fp = from->fpstate;
-   err = copy_to_user(to, from, sizeof(*to));
-   return(err);
+   to_fp = (fp ? fp : (struct _fpstate *) (to + 1));
+   from_fp = from->fpstate;
+   err = copy_to_user(to, from, sizeof(*to));
+   if(from_fp != NULL){
+   err |= copy_to_user(&to->fpstate, &to_fp, sizeof(to->fpstate));
+   err |= copy_to_user(to_fp, from_fp, fpsize);
+   }
+   return(err);
 }
 
 #endif
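
The ordering the patch establishes for the copy-in path can be sketched in plain userspace C. This is an illustration under stated assumptions, not the UML code: the toy_* types and toy_copy_from_user() are hypothetical stand-ins (the latter is just memcpy), and only the field names follow the diff. The key step is that the user's fpstate pointer is read from the copied-in structure rather than through the `from` pointer.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the real structures; field names follow the patch. */
struct toy_fpstate { unsigned char regs[16]; };
struct toy_sigcontext {
	unsigned long oldmask;
	struct toy_fpstate *fpstate;
};

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long toy_copy_from_user(void *to, const void *from, size_t n)
{
	memcpy(to, from, n);
	return 0;
}

static int toy_copy_sc_from_user(struct toy_sigcontext *to,
				 const struct toy_sigcontext *from)
{
	struct toy_fpstate *to_fp = to->fpstate;	/* kernel-side buffer */
	unsigned long sigs = to->oldmask;		/* kernel-side mask */
	struct toy_fpstate *from_fp;
	int err;

	err = toy_copy_from_user(to, from, sizeof(*to)) != 0;
	/* Read the user's fpstate pointer from the copy, not via `from`. */
	from_fp = to->fpstate;
	/* Restore the fields that must keep their kernel-side values. */
	to->oldmask = sigs;
	to->fpstate = to_fp;
	if (to_fp != NULL)
		err |= toy_copy_from_user(to_fp, from_fp, sizeof(*to_fp)) != 0;
	return err;
}
```

After the call, the destination keeps its own oldmask and fpstate pointer, while the buffer behind that pointer holds the floating-point state copied from the source.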


Re: [PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread Andi Kleen

> you are forgetting about the embedded market, there 386 cpu (or things 
> that look like 386 cpu's) are still available.

They cannot use it much, though, because the code is obviously in such 
bad shape. Perhaps they all have FPUs? OK, given that LDT usage 
is rare, but still, there are probably lots of other bugs 
in that unmaintained code too.

-Andi

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka

On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote:
> > copy_from_user_nocache() is fine.
> >
> > But I don't know where I can use it. (I'm not so
> >  familiar with the linux kernel file system yet.)
> 
> I suspect the few cases where it will make the most difference will be
> in the VFS for the write() system call, and the AIO variants thereof.
> 
> generic_file_buffered_write() will be a good candidate to try first...

Thanks.

filemap_copy_from_user() calls __copy_from_user_inatomic(), which in turn
calls __copy_from_user_ll().

I'll look at the code.

Hiro
--
Hiro Yoshioka
mailto:hyoshiok at miraclelinux.com

[PATCH 3/4] UML - Fix the x86_64 build

2005-08-15 Thread Jeff Dike

From Al Viro:

asm/elf.h breaks the x86_64 build.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc6/arch/um/os-Linux/elf_aux.c
===
--- linux-2.6.13-rc6.orig/arch/um/os-Linux/elf_aux.c2005-08-08 
12:11:18.0 -0400
+++ linux-2.6.13-rc6/arch/um/os-Linux/elf_aux.c 2005-08-15 13:53:50.0 
-0400
@@ -9,7 +9,6 @@
  */
 #include 
 #include 
-#include 
 #include "init.h"
 #include "elf_user.h"
 #include "mem_user.h"


[PATCH 4/4] UML - Fix a crash under screen

2005-08-15 Thread Jeff Dike

Running UML inside a detached screen delivers SIGWINCH when UML is not 
expecting it.  This patch ignores them.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc6/arch/um/kernel/skas/process.c
===
--- linux-2.6.13-rc6.orig/arch/um/kernel/skas/process.c 2005-08-13 
09:02:43.0 -0400
+++ linux-2.6.13-rc6/arch/um/kernel/skas/process.c  2005-08-13 
09:03:15.0 -0400
@@ -61,7 +61,11 @@
 
 CATCH_EINTR(n = waitpid(pid, &status, WUNTRACED));
 } while((n >= 0) && WIFSTOPPED(status) &&
-(WSTOPSIG(status) == SIGVTALRM));
+((WSTOPSIG(status) == SIGVTALRM) || 
+/* running UML inside a detached screen can cause 
+ * SIGWINCHes 
+ */
+(WSTOPSIG(status) == SIGWINCH)));
 
 if((n < 0) || !WIFSTOPPED(status) ||
(WSTOPSIG(status) != SIGUSR1 && WSTOPSIG(status) != SIGTRAP)){
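
The loop condition in the hunk above can be read as a small predicate on the wait status. The sketch below is only an illustration: stop_is_ignorable() is a hypothetical helper, not part of the patch, and it uses nothing beyond the standard <sys/wait.h> macros.

```c
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns nonzero when the child stopped on a
 * signal the wait loop should skip over (SIGVTALRM, or the SIGWINCH
 * that a detached screen session can deliver).
 */
static int stop_is_ignorable(int status)
{
	if (!WIFSTOPPED(status))
		return 0;
	return WSTOPSIG(status) == SIGVTALRM || WSTOPSIG(status) == SIGWINCH;
}
```

A wait loop would then keep calling waitpid() while the call succeeds and this helper returns nonzero.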


2.6.12.5 ATAPI Iomega Zip100 problem, more info, solved?

2005-08-15 Thread Grant Coady

Greetings,

Some more info on removable media oddness.  I use both vfat- and 
ext2-formatted zip disks, with two mountpoints:

/dev/hdc4   /mnt/zip    vfat    noauto,user 0 0
/dev/hdc1   /mnt/zip2   ext2    noauto,user 0 0

Odd behaviour:

$ mount /mnt/zip
mount: /dev/hdc4 is not a valid block device

Yet at this point fdisk can access the zip disk.  On the other hand an ext2 
formatted zip disk works as expected with "mount /mnt/zip2"


Making ide_floppy a module avoids this odd behaviour.

Cheers,
Grant.


Re: [PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread David Lang


On Tue, 16 Aug 2005, Andi Kleen wrote:

> On Mon, Aug 15, 2005 at 03:58:09PM -0700, [EMAIL PROTECTED] wrote:
> > I was going to attempt to clean up the math-emu code to make it use the
> > nice new segment and descriptor table accessors, but it quickly became
> > apparent that this would be a long, tedious, error prone process that
> > would eventually result in the death of a large section of my brain.
> > In addition, it is not very fun to test this on the actual hardware it
> > is designed to run on (although I did manage to track down a 386 with
> > detachable i387 coprocessor, the owner is not sure it still boots).
> > Someday it would be nice to have an audit of this code; it appears to
> > be riddled with bugs relating to segmentation, for example it assumes
> > LDT segments on overrides, does not use the mm->context semaphore to
> > protect LDT access, and generally looks scarily out of date in both
> > function and appearance.
>
> Perhaps the best would be to just remove it. Nearly all 386s should be far
> beyond their MTBF by now. Mark it CONFIG_BROKEN and if nobody complains for
> one or two releases remove it completely.

You are forgetting about the embedded market; there, 386 CPUs (or things 
that look like 386 CPUs) are still available.

David Lang

> The ugly verify_area 386 bugfix workaround code could go at the same
> time.
>
> -Andi



--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare

Re: [PATCH 5/6] i386 virtualization - Make generic set wrprotect a macro

2005-08-15 Thread Adrian Bunk

On Mon, Aug 15, 2005 at 04:00:39PM -0700, [EMAIL PROTECTED] wrote:

> Make the generic version of ptep_set_wrprotect a macro.  This is good for
> code uniformity, and fixes the build for architectures which include pgtable.h
> through headers into assembly code, but do not define a ptep_set_wrprotect
> function.


This is against the kernel coding style.
In fact, we are usually doing exactly the opposite. 

What exactly is the technical problem this patch is trying to solve, IOW 
which architectures are breaking for you?


> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
> Index: linux-2.6.13/include/asm-generic/pgtable.h
> ===
> --- linux-2.6.13.orig/include/asm-generic/pgtable.h   2005-08-12 
> 12:12:55.0 -0700
> +++ linux-2.6.13/include/asm-generic/pgtable.h2005-08-15 
> 13:54:42.0 -0700
> @@ -313,11 +313,12 @@
>  #endif
>  
>  #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
> -static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
> address, pte_t *ptep)
> -{
> - pte_t old_pte = *ptep;
> - set_pte_at(mm, address, ptep, pte_wrprotect(old_pte));
> -}
> +#define ptep_set_wrprotect(__mm, __address, __ptep)  \
> +({   \
> + pte_t __old_pte = *(__ptep);\
> + set_pte_at((__mm), (__address), (__ptep),   \
> + pte_wrprotect(__old_pte));  \
> +})
>  #endif
>  
>  #ifndef __HAVE_ARCH_PTE_SAME


cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread Andi Kleen

On Mon, Aug 15, 2005 at 03:58:09PM -0700, [EMAIL PROTECTED] wrote:
> I was going to attempt to clean up the math-emu code to make it use the
> nice new segment and descriptor table accessors, but it quickly became
> apparent that this would be a long, tedious, error prone process that
> would eventually result in the death of a large section of my brain.
> In addition, it is not very fun to test this on the actual hardware it
> is designed to run on (although I did manage to track down a 386 with
> detachable i387 coprocessor, the owner is not sure it still boots).
> Someday it would be nice to have an audit of this code; it appears to
> be riddled with bugs relating to segmentation, for example it assumes
> LDT segments on overrides, does not use the mm->context semaphore to
> protect LDT access, and generally looks scarily out of date in both
> function and appearance.

Perhaps the best would be to just remove it. Nearly all 386s should be far
beyond their MTBF by now. Mark it CONFIG_BROKEN and, if nobody complains for 
one or two releases, remove it completely.

The ugly verify_area 386 bugfix workaround code could go at the same
time.

-Andi

Re: rc6 keeps hanging and blanking displays - bisection complete

2005-08-15 Thread Dave Airlie

> > I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> > x86_64? But perhaps they share something?
> 
> My guess is that it is maybe the DRM changes that have done it... the
> 32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
> on a number of configurations (none of them by me... I can't test what
> I don't have...)
>

Actually, after looking back, 2.6.13-rc4-mm1 (which you say works) doesn't
contain any of the later 32/64-bit changes, so maybe you can try just
applying the git-drm.patch from that tree to see if it makes a
difference...

I'm getting less and less sure this is caused by the DRM. (Have you
built with DRM disabled completely?)

Do you have any fb support in-kernel (I know you might have answered
this already but I'm getting a bit lost on this thread...)

Dave.

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Doug Warzecha

On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
> On Aug 15, 2005, at 16:05:22, Doug Warzecha wrote:
> >This patch adds the Dell Systems Management Base Driver with sysfs  
> >support.
> 
> >+On some Dell systems, systems management software must access certain
> >+management information via a system management interrupt (SMI).   
> >The SMI data
> >+buffer must reside in 32-bit address space, and the physical  
> >address of the
> >+buffer is required for the SMI.  The driver maintains the memory  
> >required for
> >+the SMI and provides a way for the application to generate the SMI.
> >+The driver creates the following sysfs entries for systems management
> >+software to perform these system management interrupts:
> 
> Why can't you just implement the system management actions in the kernel
> driver?

We want to minimize the amount of code in the kernel and avoid having to update 
the driver each time a new system management command is added.

> This is tantamount to a binary SMI hook to userspace.  What
> functionality does this provide on a dell system from an administrator's
> point of view?

The libsmbios project is being updated to use this code.  
http://linux.dell.com/libsmbios/main/.  Using the libsmbios code, you will be 
able to set all of the options in BIOS F2 screen from Linux userspace.  Also, 
libsmbios is looking at implementing a few other things like fan status.  
Libsmbios is 100% open-source (OSL/GPL dual license).

> >+Host Control Action
> >+
> >+Dell OpenManage supports a host control feature that allows the  
> >administrator
> >+to perform a power cycle or power off of the system after the OS  
> >has finished
> >+shutting down.  On some Dell systems, this host control feature  
> >requires that
> >+a driver perform a SMI after the OS has finished shutting down.
> >+
> >+The driver creates the following sysfs entries for systems  
> >management software
> >+to schedule the driver to perform a power cycle or power off host  
> >control
> >+action after the system has finished shutting down:
> >+
> >+/sys/devices/platform/dcdbas/host_control_action
> >+/sys/devices/platform/dcdbas/host_control_smi_type
> >+/sys/devices/platform/dcdbas/host_control_on_shutdown
> 
> How is this different from shutdown() or reboot()?

The power cycle feature of the system powers off the system for a few seconds 
and then powers the system back on without user intervention.  shutdown() and 
reboot() don't provide that feature.

> What exactly is smi_type used for?  Please provide better documentation
> on how to use this and what it does.

The method of generating a host control SMI is not exactly the same for each 
PowerEdge system listed in dcdbas.txt.  host_control_smi_type tells the driver 
how to generate the host control SMI for the system in use.  I'll update 
dcdbas.txt with the SMI type value associated with the systems listed in that 
file.

> If this is supposed to be used with the RBU code to trigger a BIOS  
> update, ...

This driver is not needed by the RBU code.

Doug

Re: rc6 keeps hanging and blanking displays - bisection complete

2005-08-15 Thread Dave Airlie

> 
> I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> x86_64? But perhaps they share something?

My guess is that it is maybe the DRM changes that have done it... the
32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
on a number of configurations (none of them by me... I can't test what
I don't have...)

Can you do me a favour and check 2.6.13-rc6 with the git-drm.patch from -mm?

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc5/2.6.13-rc5-mm1/broken-out/git-drm.patch

If this is a 32/64-bit issue I think that patch might help. I'm not
convinced, though: I can't see how the DRM would ever start blanking the
screen, as it doesn't have any code in that area at all. But stranger
things have surprised me...

Is there any difference in your Xorg.0.log files before/after this...

There is also an issue at:
http://bugme.osdl.org/show_bug.cgi?id=4965

which was caused by the pci assign resources patch on x86... I'm not
sure if this is similar..

Dave.

Re: [-mm PATCH] remove use of pci_find_device in watchdog driver for Intel 6300ESB chipset

2005-08-15 Thread David Härdeman


On Mon, Aug 15, 2005 at 02:30:15PM -0700, Naveen Gupta wrote:
[...]

-while ((dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
-if (pci_match_id(esb_pci_tbl, dev)) {
-esb_pci = dev;
-break;
-}
-}
+   while (ids->vendor && ids->device) {
+   if ((dev = pci_get_device(ids->vendor, ids->device, dev)) != 
NULL) {
+   esb_pci = dev;
+   break;
+   }
+   ids++;
+   }


I'm certainly not sure about this, but the proposed while loop looks a 
bit unconventional, wouldn't something like:


for_each_pci_dev(dev) {
	if (pci_match_id(esb_pci_tbl, dev)) {
		esb_pci = dev;
		break;
	}
}

be better?
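
The shape of that suggestion (walk every device once, ask a table-matching helper whether the driver cares about it) can be sketched outside the kernel. Everything below is a toy model: struct toy_id, struct toy_dev, toy_match_id() and toy_find_first() are hypothetical names, not the PCI core API.

```c
#include <stddef.h>

struct toy_id { unsigned short vendor, device; };
struct toy_dev { unsigned short vendor, device; };

/* Table terminated by an all-zero entry, as PCI ID tables are. */
static const struct toy_id *toy_match_id(const struct toy_id *ids,
					 const struct toy_dev *dev)
{
	for (; ids->vendor || ids->device; ids++)
		if (ids->vendor == dev->vendor && ids->device == dev->device)
			return ids;
	return NULL;
}

/* Walk every device once and stop at the first one the table matches. */
static const struct toy_dev *toy_find_first(const struct toy_dev *devs,
					    size_t ndevs,
					    const struct toy_id *ids)
{
	for (size_t i = 0; i < ndevs; i++)
		if (toy_match_id(ids, &devs[i]))
			return &devs[i];
	return NULL;
}
```

Compared with looping over the ID table and searching the bus once per entry, this keeps a single pass over the devices and leaves the table handling in one helper.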

//David

[PATCH 2/6] i386 virtualization - Remove some dead debugging code

2005-08-15 Thread zach

This code is quite dead.  release_thread() is always guaranteed that the mm
has already been released, thus dead_task->mm will always be NULL.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/arch/i386/kernel/process.c
===
--- linux-2.6.13.orig/arch/i386/kernel/process.c2005-08-15 
10:46:18.0 -0700
+++ linux-2.6.13/arch/i386/kernel/process.c 2005-08-15 10:48:51.0 
-0700
@@ -421,17 +421,7 @@
 
 void release_thread(struct task_struct *dead_task)
 {
-   if (dead_task->mm) {
-   // temporary debugging check
-   if (dead_task->mm->context.size) {
-   printk("WARNING: dead process %8s still has LDT? 
<%p/%d>\n",
-   dead_task->comm,
-   dead_task->mm->context.ldt,
-   dead_task->mm->context.size);
-   BUG();
-   }
-   }
-
+   BUG_ON(dead_task->mm);
release_vm86_irqs(dead_task);
 }
 

[PATCH 1/6] i386 virtualization - Fix uml build

2005-08-15 Thread zach

Attempt to fix the UML build by assuming the default i386 subarchitecture
(mach-default).

I can't fully test this because spinlock breakage is still happening in
my tree, but it gets rid of the mach_xxx.h missing file warnings.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/arch/um/Makefile-i386
===
--- linux-2.6.13.orig/arch/um/Makefile-i386 2005-08-12 11:57:45.0 
-0700
+++ linux-2.6.13/arch/um/Makefile-i386  2005-08-12 12:28:09.0 -0700
@@ -27,7 +27,9 @@
 endif
 endif
 
-CFLAGS += -U__$(SUBARCH)__ -U$(SUBARCH) $(STUB_CFLAGS)
+CFLAGS += -U__$(SUBARCH)__ -U$(SUBARCH) $(STUB_CFLAGS) \
+ -Iinclude/asm-i386/mach-default \
+  $(if $(KBUILD_SRC),-Iinclude2/asm-i386/mach-default 
-I$(srctree)/include/asm-i386/mach-default)
 
 ifneq ($(CONFIG_GPROF),y)
 ARCH_CFLAGS += -DUM_FASTCALL

[PATCH 5/6] i386 virtualization - Make generic set wrprotect a macro

2005-08-15 Thread zach

Make the generic version of ptep_set_wrprotect a macro.  This is good for
code uniformity, and fixes the build for architectures which include pgtable.h
through headers into assembly code, but do not define a ptep_set_wrprotect
function.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/include/asm-generic/pgtable.h
===
--- linux-2.6.13.orig/include/asm-generic/pgtable.h 2005-08-12 
12:12:55.0 -0700
+++ linux-2.6.13/include/asm-generic/pgtable.h  2005-08-15 13:54:42.0 
-0700
@@ -313,11 +313,12 @@
 #endif
 
 #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
address, pte_t *ptep)
-{
-   pte_t old_pte = *ptep;
-   set_pte_at(mm, address, ptep, pte_wrprotect(old_pte));
-}
+#define ptep_set_wrprotect(__mm, __address, __ptep)\
+({ \
+   pte_t __old_pte = *(__ptep);\
+   set_pte_at((__mm), (__address), (__ptep),   \
+   pte_wrprotect(__old_pte));  \
+})
 #endif
 
 #ifndef __HAVE_ARCH_PTE_SAME
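
For comparison with the `({ ... })` form used in this patch, a macro that yields no value is conventionally written as a do { } while (0) block. The sketch below shows that form on a toy type; toy_pte_t, TOY_PTE_RW and the toy_* helpers are hypothetical and stand in for the architecture-specific definitions.

```c
/* Toy page-table entry; the real pte_t is architecture-specific. */
typedef struct { unsigned long val; } toy_pte_t;

#define TOY_PTE_RW 0x2UL	/* hypothetical "writable" bit */

static inline toy_pte_t toy_pte_wrprotect(toy_pte_t pte)
{
	pte.val &= ~TOY_PTE_RW;
	return pte;
}

/* Statement-like macro: no value, safe inside an unbraced if/else. */
#define toy_ptep_set_wrprotect(__ptep)				\
do {								\
	toy_pte_t __old_pte = *(__ptep);			\
	*(__ptep) = toy_pte_wrprotect(__old_pte);		\
} while (0)
```

Because the expansion is a single statement that still needs its trailing semicolon, `if (x) toy_ptep_set_wrprotect(p); else ...` parses as intended.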

[PATCH 3/6] i386 virtualization - Make ldt a desc struct

2005-08-15 Thread zach

Make the LDT a desc_struct pointer, since this is what it actually is.
There is code which relies on the fact that LDTs are allocated in page
chunks, and it is both cleaner and more convenient to keep the rather
poorly named "size" variable from the LDT in terms of LDT pages.
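
As a rough illustration of why a typed descriptor is more convenient than a void pointer, the sketch below decodes an x86 segment descriptor held as two 32-bit words. The toy_* names are hypothetical; only the bit layout (base split across both words, D/B flag in bit 22 of the high word, 8-byte entries) comes from the architecture.

```c
#include <stdint.h>

/* Two 32-bit words of an x86 segment descriptor, low word first. */
struct toy_desc { uint32_t a, b; };

/*
 * Base address: bits 15:0 come from a[31:16], bits 23:16 from b[7:0],
 * bits 31:24 from b[31:24].
 */
static uint32_t toy_desc_base(const struct toy_desc *d)
{
	return (d->a >> 16) | ((d->b << 16) & 0x00ff0000u) | (d->b & 0xff000000u);
}

/* D/B flag (bit 22 of the high word): 1 for a 32-bit segment. */
static int toy_desc_32bit(const struct toy_desc *d)
{
	return (d->b >> 22) & 1;
}

/* Entries per 4 KiB page, given 8-byte descriptors. */
enum { TOY_LDT_ENTRIES_PER_PAGE = 4096 / sizeof(struct toy_desc) };
```

With 4 KiB pages this gives 512 entries per page, so an LDT size kept in pages converts to an entry count by a single multiplication.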

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/include/asm-i386/mmu.h
===
--- linux-2.6.13.orig/include/asm-i386/mmu.h2005-08-15 11:16:59.0 
-0700
+++ linux-2.6.13/include/asm-i386/mmu.h 2005-08-15 11:19:49.0 -0700
@@ -9,9 +9,9 @@
  * cpu_vm_mask is used to optimize ldt flushing.
  */
 typedef struct { 
-   int size;
struct semaphore sem;
-   void *ldt;
+   struct desc_struct *ldt;
+   int ldt_pages;
 } mm_context_t;
 
 #endif
Index: linux-2.6.13/include/asm-i386/desc.h
===
--- linux-2.6.13.orig/include/asm-i386/desc.h   2005-08-15 11:16:59.0 
-0700
+++ linux-2.6.13/include/asm-i386/desc.h2005-08-15 11:19:49.0 
-0700
@@ -6,6 +6,9 @@
 
 #define CPU_16BIT_STACK_SIZE 1024
 
+/* The number of LDT entries per page */
+#define LDT_ENTRIES_PER_PAGE (PAGE_SIZE / LDT_ENTRY_SIZE)
+
 #ifndef __ASSEMBLY__
 
 #include 
@@ -30,7 +33,7 @@
 static inline unsigned long get_desc_base(struct desc_struct *desc)
 {
unsigned long base;
-   base = ((desc->a >> 16)  & 0x0000ffff) |
+   base = (desc->a >> 16) |
((desc->b << 16) & 0x00ff0000) |
(desc->b & 0xff000000);
return base;
@@ -171,7 +174,7 @@
 static inline void load_LDT_nolock(mm_context_t *pc, int cpu)
 {
void *segments = pc->ldt;
-   int count = pc->size;
+   int count = pc->ldt_pages * LDT_ENTRIES_PER_PAGE;
 
if (likely(!count)) {
segments = &default_ldt[0];
Index: linux-2.6.13/include/asm-i386/mmu_context.h
===
--- linux-2.6.13.orig/include/asm-i386/mmu_context.h2005-08-15 
11:16:59.0 -0700
+++ linux-2.6.13/include/asm-i386/mmu_context.h 2005-08-15 11:19:49.0 
-0700
@@ -19,7 +19,7 @@
memset(&mm->context, 0, sizeof(mm->context));
init_MUTEX(&mm->context.sem);
old_mm = current->mm;
-   if (old_mm && unlikely(old_mm->context.size > 0)) {
+   if (old_mm && unlikely(old_mm->context.ldt)) {
retval = copy_ldt(&mm->context, &old_mm->context);
}
if (retval == 0)
@@ -32,7 +32,7 @@
  */
 static inline void destroy_context(struct mm_struct *mm)
 {
-   if (unlikely(mm->context.size))
+   if (unlikely(mm->context.ldt))
destroy_ldt(mm);
del_lazy_mm(mm);
 }
Index: linux-2.6.13/include/asm-i386/mach-default/mach_desc.h
===
--- linux-2.6.13.orig/include/asm-i386/mach-default/mach_desc.h 2005-08-15 
11:16:59.0 -0700
+++ linux-2.6.13/include/asm-i386/mach-default/mach_desc.h  2005-08-15 
11:19:49.0 -0700
@@ -62,11 +62,10 @@
_set_tssldt_desc(_cpu_gdt_table(cpu)[GDT_ENTRY_LDT], (int)addr, 
((size << 3)-1), 0x82);
 }
 
-static inline int write_ldt_entry(void *ldt, int entry, __u32 entry_a, __u32 
entry_b)
+static inline int write_ldt_entry(struct desc_struct *ldt, int entry, __u32 
entry_a, __u32 entry_b)
 {
-   __u32 *lp = (__u32 *)((char *)ldt + entry*8);
-   *lp = entry_a;
-   *(lp+1) = entry_b;
+   ldt[entry].a = entry_a;
+   ldt[entry].b = entry_b;
return 0;
 }
 
Index: linux-2.6.13/arch/i386/kernel/ptrace.c
===
--- linux-2.6.13.orig/arch/i386/kernel/ptrace.c 2005-08-15 11:16:59.0 
-0700
+++ linux-2.6.13/arch/i386/kernel/ptrace.c  2005-08-15 11:19:49.0 
-0700
@@ -164,18 +164,20 @@
 * and APM bios ones we just ignore here.
 */
if (segment_from_ldt(seg)) {
-   u32 *desc;
+   mm_context_t *context;
+   struct desc_struct *desc;
unsigned long base;
 
-   down(&child->mm->context.sem);
-   desc = child->mm->context.ldt + (seg & ~7);
-   base = (desc[0] >> 16) | ((desc[1] & 0xff) << 16) | (desc[1] & 0xff000000);
+   context = &child->mm->context;
+   down(&context->sem);
+   desc = &context->ldt[segment_index(seg)];
+   base = get_desc_base(desc);
 
/* 16-bit code segment? */
-   if (!((desc[1] >> 22) & 1))
+   if (!get_desc_32bit(desc))
addr &= 0xffff;
addr += base;
-   up(&child->mm->context.sem);
+   up(&context->sem);
}
return addr;
 }
Index: linux-2.6.13/arch/i386/kernel/ldt.c
===
--- linux-2.6.13.orig/arch/i386/kernel/ldt.c2005-08-15 11:16:59.0

Re: rc6 keeps hanging and blanking displays - bisection complete

2005-08-15 Thread Linus Torvalds

On Tue, 16 Aug 2005, Helge Hafting wrote:
>
> This was interesting.  At first, lots of kernels just kept working,
> I almost suspected I was doing something wrong. Then the second last kernel
> recompiled a lot of DRM stuff - and the crash came back!
> The kernel after that worked again, and so the final message was:
> 
> 561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit

Ok, that definitely looks bogus. 

That commit should not matter at _all_, it only changes ppc64 specific 
things. 

If the bug is sometimes hard to trigger, maybe one of the "good" kernels 
wasn't good after all. That would definitely throw a wrench in the 
bisection.

Anyway, with something like this, where there may be false positives 
(false "good" kernels), the only thing you can _really_ trust is a kernel 
that got marked bad, because that one definitely has the problem. So make 
sure that you remember all known-bad kernels.

Btw, we haven't had a lot of testing of the termination condition for "git 
bisect", so it's possible it's off by a commit or two. However, the commit 
you actually ended up on is literally just two commits before 2.6.13-rc5, 
which makes me suspect that it's not the termination condition, as much as 
the fact that it really was an earlier kernel that had the problem, but 
you bisected it as "good" because the problem just didn't trigger quickly 
enough..
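
The effect of a false "good" can be seen in a toy model of bisection. This is a simplified sketch over a linear history of 16 hypothetical commits, not how git bisect walks a real commit graph; all toy_* names and the commit numbers are made up for illustration.

```c
/* Hypothetical history: commits 0..15, the bug really appears at commit 5. */
enum { TOY_FIRST_BAD = 5 };

/* Reliable test: every commit at or after the first bad one fails. */
static int toy_test_reliable(int commit)
{
	return commit >= TOY_FIRST_BAD;
}

/* Flaky test: commit 7 is broken, but the bug fails to trigger there. */
static int toy_test_flaky(int commit)
{
	return commit != 7 && commit >= TOY_FIRST_BAD;
}

/*
 * Plain bisection: `good` is known good, `bad` is known bad.
 * Returns the commit the search blames as "first bad".
 */
static int toy_bisect(int good, int bad, int (*is_bad)(int))
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}
```

With the reliable test the search blames commit 5. With the flaky test the first probe (commit 7) is wrongly marked good, the search never looks below it again, and it ends up blaming commit 8, which is later than the real culprit. Every commit marked bad along the way really is bad, which is why those are the results worth remembering.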

Linus

[PATCH 4/6] i386 virtualization - Ldt kprobes bugfix

2005-08-15 Thread zach

While cleaning up the LDT code, I noticed that kprobes code was very bogus
with respect to segment handling.  Three bugs are fixed here.

1) Taking an int3 from v8086 mode could cause the kprobes code to read
   a non-existent LDT.
2) The CS value is not truncated to 16 bit, which could cause an access
   beyond the bounds of the LDT.
3) The LDT was being read without taking the mm->context semaphore, which
   means bogus and/or non-existent vmalloc()ed pages could be read.
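
Bugs 1 and 2 both come down to how a saved CS value maps onto an LDT slot. The sketch below shows the selector arithmetic on its own, as an illustration rather than the kprobes code: the toy_* helpers are hypothetical, and only the selector layout (index in bits 15:3, table indicator in bit 2, RPL in bits 1:0) comes from the architecture.

```c
#include <stdint.h>

/* Build a selector for LDT entry `index` at privilege level `rpl`. */
static uint16_t toy_ldt_selector(unsigned index, unsigned rpl)
{
	return (uint16_t)((index << 3) | 0x4u | (rpl & 3u));
}

/* Does the selector refer to the LDT (table-indicator bit set)? */
static int toy_selector_is_ldt(uint32_t sel)
{
	return (sel & 0x4u) != 0;
}

/* Descriptor index, after truncating the saved value to 16 bits. */
static unsigned toy_selector_index(uint32_t saved_cs)
{
	return (saved_cs & 0xffffu) >> 3;
}
```

Without the 16-bit truncation, stray upper bits in the saved value would yield an index beyond the 8192 entries an LDT can hold. Entries 0, 512 and 8191 at RPL 3 map to selectors 0x7, 0x1007 and 0xffff.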

I've also included my latest version of the LDT test.

/*
 * Copyright (c) 2005, Zachary Amsden ([EMAIL PROTECTED])
 * This is licensed under the GPL.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#define __KERNEL__
#include 

/*
 * Spin modifying LDT entry 1 to get contention on the mm->context
 * semaphore.
 */
void evil_child(void *addr)
{
struct user_desc desc;

while (1) {
desc.entry_number = 1;
desc.base_addr = addr;
desc.limit = 1;
desc.seg_32bit = 1;
desc.contents = MODIFY_LDT_CONTENTS_CODE;
desc.read_exec_only = 0;
desc.limit_in_pages = 1;
desc.seg_not_present = 0;
desc.useable = 1;
if (modify_ldt(1, &desc, sizeof(desc)) != 0) {
perror("modify_ldt");
abort();
}
}
exit(0);
}

void catch_sig(int signo, struct sigcontext ctx)
{
return;
}

void main(void)
{
struct user_desc desc;
char *code;
unsigned long long tsc;
char *stack;
pid_t child; 
int i;
unsigned long long lasttsc = 0;

code = (char *)mmap(0, 8192, PROT_EXEC|PROT_READ|PROT_WRITE,
 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

/* Test 1 - CODE, 32-BIT, 2 page limit */
desc.entry_number = 0;
desc.base_addr = code;
desc.limit = 1;
desc.seg_32bit = 1;
desc.contents = MODIFY_LDT_CONTENTS_CODE;
desc.read_exec_only = 0;
desc.limit_in_pages = 1;
desc.seg_not_present = 0;
desc.useable = 1;
if (modify_ldt(1, &desc, sizeof(desc)) != 0) {
perror("modify_ldt");
abort();
}
printf("INFO: code base is 0x%08x\n", (unsigned)code);
code[0x0ffe] = 0x0f;  /* rdtsc */
code[0x0fff] = 0x31;
code[0x1000] = 0xcb;  /* lret */
__asm__ __volatile("lcall $7,$0xffe" : "=A" (tsc));
printf("INFO: TSC is 0x%016llx\n", tsc);

/*
 * Fork an evil child that shares the same MM context
 */
stack = malloc(8192);
child = clone(evil_child, stack, CLONE_VM, 0xb0b0);
if (child == -1) {
perror("clone");
abort();
}

/* Test 2 - CODE, 32-BIT, 4097 byte limit */
desc.entry_number = 512;
desc.base_addr = code;
desc.limit = 4096;
desc.seg_32bit = 1;
desc.contents = MODIFY_LDT_CONTENTS_CODE;
desc.read_exec_only = 0;
desc.limit_in_pages = 0;
desc.seg_not_present = 0;
desc.useable = 1;
if (modify_ldt(1, , sizeof(desc)) != 0) {
perror("modify_ldt");
abort();
}
code[0x0ffe] = 0x0f;  /* rdtsc */
code[0x0fff] = 0x31;
code[0x1000] = 0xcb;  /* lret */
__asm__ __volatile("lcall $0x1007,$0xffe" : "=A" (tsc));

/*
 * Test 3 - CODE, 32-BIT, maximal LDT.  Race against evil
 * child while taking debug traps on LDT CS.
 */
for (i = 0; i < 1000; i++) {
signal(SIGTRAP, catch_sig);
desc.entry_number = 8191;
desc.base_addr = code;
desc.limit = 4097;
desc.seg_32bit = 1;
desc.contents = MODIFY_LDT_CONTENTS_CODE;
desc.read_exec_only = 0;
desc.limit_in_pages = 0;
desc.seg_not_present = 0;
desc.useable = 1;
if (modify_ldt(1, , sizeof(desc)) != 0) {
perror("modify_ldt");
abort();
}
code[0x0ffe] = 0x0f;  /* rdtsc */
code[0x0fff] = 0x31;
code[0x1000] = 0xcc;  /* int3 */
code[0x1001] = 0xcb;  /* lret */
__asm__ __volatile("lcall $0x,$0xffe" : "=A" (tsc));
if (tsc < lasttsc) {
printf("WARNING: TSC went backwards\n");
}
lasttsc = tsc;
}
if (kill(child, SIGTERM) != 0) {
perror("kill");
abort();
}
printf("PASS: LDT code segment\n");
}
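
The far-call selectors in the test program follow from the LDT entry
numbers.  A small sketch of that arithmetic, using a hypothetical helper
ldt_selector() that is not part of the test itself: the entry number is
shifted left by three, the table-indicator bit selects the LDT, and the
low two bits request privilege level 3.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a user-mode (RPL 3) selector that names
 * entry 'entry' of the LDT.
 */
static uint16_t ldt_selector(unsigned int entry)
{
    return (uint16_t)((entry << 3) | 0x4 /* TI = LDT */ | 0x3 /* RPL 3 */);
}
```

Entries 0 and 512 give 0x7 and 0x1007, matching the first two lcall
instructions; entry 8191, the last possible LDT slot, gives 0xffff.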

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/arch/i386/kernel/kprobes.c

[PATCH 6/6] i386 virtualization - Attempt to clean up pgtable code motion

2005-08-15 Thread zach

Virtualization aware Linux kernels may need to redefine functions which write
to hardware page tables at the sub-architecture layer.  Previously, this was
done by encapsulation in a split mach-xxx/pgtable-{2|3}level-ops.h file, but
having 8 pgtable header files is simply unacceptable.  This goes some ways
towards cleaning that up by deprecating the 2/3 level subarch functions.
This is accomplished by using __HAVE_ARCH_FUNC macros, and allowing
one sub-arch file, pgtable-ops.h, which gets included before any functions
which write to hardware page tables, allowing the sub-architecture to override
any or all definitions it needs.
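
The override scheme described above can be sketched in plain C.  This is
a simplified single-file illustration, not the actual headers: the
"sub-arch" part, its hooked_writes counter and the pte_t layout are
assumptions made up for the example.

```c
typedef struct { unsigned long pte_low; } pte_t;

/* --- stand-in for a sub-arch pgtable-ops.h, seen first --- */
static unsigned long hooked_writes;     /* hypothetical bookkeeping */

#define __HAVE_ARCH_SET_PTE
static inline void set_pte(pte_t *ptep, pte_t pteval)
{
    hooked_writes++;    /* a hypervisor-aware kernel would notify here */
    *ptep = pteval;
}

/* --- stand-in for the generic header, seen afterwards --- */
#ifndef __HAVE_ARCH_SET_PTE
#define __HAVE_ARCH_SET_PTE
#define set_pte(pteptr, pteval) (*(pteptr) = (pteval))
#endif

/* Callers use set_pte() without knowing which definition won. */
static inline void map_one(pte_t *ptep, unsigned long val)
{
    pte_t pte = { val };
    set_pte(ptep, pte);
}
```

Because the sub-arch definition comes first and defines the guard macro,
the generic fallback is skipped; a sub-arch that defines nothing gets the
default for free.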

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/include/asm-i386/pgtable-2level.h
===================================================================
--- linux-2.6.13.orig/include/asm-i386/pgtable-2level.h 2005-08-15 14:23:06.000000000 -0700
+++ linux-2.6.13/include/asm-i386/pgtable-2level.h  2005-08-15 14:24:11.000000000 -0700
@@ -55,4 +55,25 @@
 #define __pte_to_swp_entry(pte)        ((swp_entry_t) { (pte).pte_low })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+/*
+ * Certain architectures need to do special things when PTEs
+ * within a page table are directly modified.  Thus, the following
+ * hook is made available.
+ */
+#ifndef __HAVE_ARCH_SET_PTE
+#define __HAVE_ARCH_SET_PTE
+#define set_pte(pteptr, pteval) (*(pteptr) = pteval)
+#endif
+#define set_pte_atomic(pteptr, pteval) set_pte(pteptr, pteval)
+
+#ifndef __HAVE_ARCH_SET_PMD
+#define __HAVE_ARCH_SET_PMD
+#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))
+#endif
+
+#ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define ptep_get_and_clear(mm,addr,xp) __pte(xchg(&(xp)->pte_low, 0))
+#endif
+
 #endif /* _I386_PGTABLE_2LEVEL_H */
Index: linux-2.6.13/include/asm-i386/pgtable-3level.h
===================================================================
--- linux-2.6.13.orig/include/asm-i386/pgtable-3level.h 2005-08-15 14:23:06.000000000 -0700
+++ linux-2.6.13/include/asm-i386/pgtable-3level.h  2005-08-15 14:24:11.000000000 -0700
@@ -123,4 +123,58 @@
 
 #define __pmd_free_tlb(tlb, x) do { } while (0)
 
+/*
+ * Sub-arch is allowed to override these, so check for definition first.
+ * New functions which write to hardware page table entries should go here.
+ */
+
+/* Rules for using set_pte: the pte being assigned *must* be
+ * either not present or in a state where the hardware will
+ * not attempt to update the pte.  In places where this is
+ * not possible, use pte_get_and_clear to obtain the old pte
+ * value and then use set_pte to update it.  -ben
+ */
+#ifndef __HAVE_ARCH_SET_PTE
+#define __HAVE_ARCH_SET_PTE
+static inline void set_pte(pte_t *ptep, pte_t pte)
+{
+   ptep->pte_high = pte.pte_high;
+   smp_wmb();
+   ptep->pte_low = pte.pte_low;
+}
+#endif
+
+#ifndef __HAVE_ARCH_SET_PTE_ATOMIC
+#define __HAVE_ARCH_SET_PTE_ATOMIC
+#define set_pte_atomic(pteptr,pteval) \
+   set_64bit((unsigned long long *)(pteptr),pte_val(pteval))
+#endif
+
+#ifndef __HAVE_ARCH_SET_PMD
+#define __HAVE_ARCH_SET_PMD
+#define set_pmd(pmdptr,pmdval) \
+   set_64bit((unsigned long long *)(pmdptr),pmd_val(pmdval))
+#endif
+
+#ifndef __HAVE_ARCH_SET_PUD
+#define __HAVE_ARCH_SET_PUD
+#define set_pud(pudptr,pudval) \
+   (*(pudptr) = (pudval))
+#endif
+
+#ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+{
+   pte_t res;
+
+   /* xchg acts as a barrier before the setting of the high bits */
+   res.pte_low = xchg(&ptep->pte_low, 0);
+   res.pte_high = ptep->pte_high;
+   ptep->pte_high = 0;
+
+   return res;
+}
+#endif
+
 #endif /* _I386_PGTABLE_3LEVEL_H */
Index: linux-2.6.13/include/asm-i386/pgtable.h
===================================================================
--- linux-2.6.13.orig/include/asm-i386/pgtable.h    2005-08-15 14:23:06.000000000 -0700
+++ linux-2.6.13/include/asm-i386/pgtable.h 2005-08-15 14:24:11.000000000 -0700
@@ -236,12 +236,55 @@
 static inline pte_t pte_mkwrite(pte_t pte) { (pte).pte_low |= _PAGE_RW; return pte; }
 static inline pte_t pte_mkhuge(pte_t pte)  { (pte).pte_low |= _PAGE_PRESENT | _PAGE_PSE; return pte; }
 
+#include 
 #ifdef CONFIG_X86_PAE
 # include <asm/pgtable-3level.h>
 #else
 # include <asm/pgtable-2level.h>
 #endif
-#include 
+
+/*
+ * We give sub-architectures a chance to override functions which write to page
+ * tables, thus we check for existing definitions first.
+ */
+#ifndef __HAVE_ARCH_PTEP_TEST_AND_CLEAR_DIRTY
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_DIRTY
+static inline int ptep_test_and_clear_dirty(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+   if (!pte_dirty(*ptep))
+       return 0;
+   return test_and_clear_bit(_PAGE_BIT_DIRTY, &ptep->pte_low);
+}
+#endif
+
+#ifndef

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael_E_Brown

>> +On some Dell systems, systems management software must access certain
>> +management information via a system management interrupt (SMI).  The SMI data
>> +buffer must reside in 32-bit address space, and the physical address of the
>> +buffer is required for the SMI.  The driver maintains the memory required for
>> +the SMI and provides a way for the application to generate the SMI.
>> +The driver creates the following sysfs entries for systems management
>> +software to perform these system management interrupts:
>
> Why can't you just implement the system management actions in the kernel
> driver?  This is tantamount to a binary SMI hook to userspace.  What
> functionality does this provide on a dell system from an administrator's
> point of view?

Kyle,

I'm sure that not everybody agrees with the whole concept of SMI calls.
Nevertheless, these calls exist, and in order to have a complete
systems-management solution, we have to provide a way to do SMI calls.
Now, we have developed a way to do these SMI calls from userspace without
kernel support, but we are trying to be community-friendly and show our
hooks in the open, rather than trying to sneak them in under the covers.

You might not like the concept of a generic hook for SMI calls in the
kernel, but the alternatives are hardly better. One alternative is the
already-mentioned method that we do things under the covers in userspace.
Another alternative is that we write separate kernel code for each and
every SMI call that exists in the Dell BIOS. The second alternative is not
entirely feasible. We have over 60 SMI functions, and we would have to
write a kernel-mode wrapper for each and every one. I hope you agree that
code that doesn't exist is less buggy than code that does, and that code
that is in userspace is a whole lot less likely to cause a kernel crash
than code that is in the kernel. We are trying to keep our kernel bloat
down. We don't really think that customers of IBM or HP really want their
Red Hat kernels loaded down with a bunch of Dell-only code.

Additionally, we are releasing an open source library (GPL/OSL dual
license) that can use these hooks to perform many systems management
functions in userspace. See http://linux.dell.com/libsmbios/main/. We
should have code in libsmbios to do SMI using this driver within about two
weeks.  We are currently writing the SMI hooks in libsmbios using this
posted version of the driver. I am the maintainer of this project, and it
is my goal to have code in libsmbios for every Dell SMI call.

Dell is not trying to use this driver as a method of inserting binary
blobs into the kernel, nor are we trying to subvert kernel security by
implementing this driver. We are simply trying to get all of our systems
management software into the kernel as open source drivers. This
represents a fundamental shift in philosophy from the Dell
systems-management team from our previous binary-only driver approach.

We would welcome feedback on a better way to implement this driver in the
kernel, but the fact remains that we have to have a way to do this, and we
are open-sourcing all of the code necessary to get this done.
--
Michael Brown

[PATCH 0/6] i386 virtualization patches, Set 3

2005-08-15 Thread zach

This round attempts to conclude all of the LDT related cleanup with some
finally nice looking LDT code, fixes for the UML build, a bugfix for
really rather nasty kprobes problems, and the basic framework for an LDT
test suite.  It is really rather unfortunate that this code is so
difficult to test; even with DOSemu and Wine, there are still very nasty
corner cases here - anyone want an iret to 16-bit stack test?

I was going to attempt to clean up the math-emu code to make it use the
nice new segment and descriptor table accessors, but it quickly became
apparent that this would be a long, tedious, error prone process that
would eventually result in the death of a large section of my brain.
In addition, it is not very fun to test this on the actual hardware it
is designed to run on (although I did manage to track down a 386 with
detachable i387 coprocessor, the owner is not sure it still boots).
Someday it would be nice to have an audit of this code; it appears to
be riddled with bugs relating to segmentation, for example it assumes
LDT segments on overrides, does not use the mm->context semaphore to
protect LDT access, and generally looks scarily out of date in both
function and appearance.

I also have a makeover for the pgtable.h code.  Splitting operations that
write hardware page tables into the sub-arch layer was very ugly,
hopefully this is a much cleaner approach.  There really must be a way
for a paravirtualized hypervisor to hook the page table updates, and this
appears to be the cleanest solution so far.

This patch set is based on 2.6.13-rc6 -mm1 broken out series.  It applies
and builds i386, x86_64, and um-i386 on 2.6.13-rc5.  I've tested PAE and
non-PAE SMP kernels and am working on an LDT test suite.  Depends on
the i386 cleanups, sub-arch movement, and LDT cleanups I've already sent
out.

--
Zachary Amsden <[EMAIL PROTECTED]>
Whee!  Actually deliver the signal.

Re: [RFC - 0/13] NTP cleanup work (v. B5)

2005-08-15 Thread john stultz

On Tue, 2005-08-16 at 00:12 +0200, Roman Zippel wrote:
> On Wed, 10 Aug 2005, john stultz wrote:
> 
> > The goal of this patch set is to isolate the in kernel NTP state
> > machine in the hope of simplifying the current timekeeping code and
> > allowing for optional future changes in the timekeeping subsystem.
> > 
> > I've tried to address some of the complexity concerns for systems that
> > do not have a continuous timesource, preserving the existing behavior
> > while still providing a ppm interface allowing smooth adjustments to
> > continuous timesources. 
> 
> I think most of this is premature cleanup. As it also changes the logic in 
> small ways, I'm not even sure it qualifies as a cleanup.

Please, Roman, I'm spending quite a bit of time breaking this up into
small chunks specifically to help this discussion. Rather than just
stating that the logic is changed in small ways, could you please be
specific and point to where that logic has changed, so we can fix or
discuss it?

> The only obvious patch is the PPS code removal, which is fine.

Ok, one down, 12 to go ;)

> For the rest I can't agree to moving everything that aggressively into 
> the ntp namespace. The kernel clock is controlled via NTP, but how it 
> actually works has little to do with "network time". 

Eh? The adjtimex() interface causes a small adjustment to be added or
removed from the system time each tick. Why should this code not be put
behind a clear interface?
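
A toy model of the per-tick adjustment being discussed, as a hedged
sketch only: the struct, the function name clock_tick() and the slew
limit are invented for illustration and do not correspond to the
kernel's timer.c or to the proposed ntp.c interface.

```c
#include <stdint.h>

/* Hypothetical software clock carrying an outstanding correction. */
struct soft_clock {
    int64_t now_ns;             /* current time */
    int64_t pending_adj_ns;     /* correction still to be applied */
};

/*
 * Advance the clock by one tick, folding in at most max_slew_ns of the
 * pending correction so that time is slewed gradually, never stepped.
 */
static void clock_tick(struct soft_clock *c, int64_t tick_ns,
                       int64_t max_slew_ns)
{
    int64_t step = c->pending_adj_ns;

    if (step > max_slew_ns)
        step = max_slew_ns;
    else if (step < -max_slew_ns)
        step = -max_slew_ns;

    c->pending_adj_ns -= step;
    c->now_ns += tick_ns + step;
}
```

With a 1200 ns correction and a 500 ns limit, three ticks of 1 ms each
absorb the whole adjustment (500 + 500 + 200).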

> Some of the 
> parameters are even private clock variables (e.g. time adjustment, phase), 
> which don't belong in any common code. (I'll expand on that in the next 
> mail.)

Again, I'm not understanding your objection. It's exactly because the
time_adjust and phase values are NTP specific variables that I'm trying
to move them from the timer.c code into ntp.c.

I'm trying to address your concerns, I just need you to be more
explicit.

thanks
-john


Re: [PATCH] Fix mmap_kmem (was: [question] What's the difference between /dev/kmem and /dev/mem)

2005-08-15 Thread Linus Torvalds



On Mon, 15 Aug 2005, Olaf Hering wrote:
> 
> ARCH=um doesnt like your version, but mine.

Yours is broken. As is arch-um.

The fix would _seem_ to be something like the appended. Can you verify?

Linus

diff --git a/include/asm-um/page.h b/include/asm-um/page.h
--- a/include/asm-um/page.h
+++ b/include/asm-um/page.h
@@ -104,8 +104,8 @@ extern void *to_virt(unsigned long phys)
  * casting is the right thing, but 32-bit UML can't have 64-bit virtual
  * addresses
  */
-#define __pa(virt) to_phys((void *) (unsigned long) virt)
-#define __va(phys) to_virt((unsigned long) phys)
+#define __pa(virt) to_phys((void *) (unsigned long) (virt))
+#define __va(phys) to_virt((unsigned long) (phys))
 
 #define page_to_pfn(page) ((page) - mem_map)
 #define pfn_to_page(pfn) (mem_map + (pfn))
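
Why the extra parentheses matter can be shown with a small standalone
example.  The macro names below are hypothetical stand-ins, not the UML
__pa/__va definitions: a cast binds tighter than '+', so an unparenthesized
argument such as "ptr + 1" is cast first and only then incremented.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two macro styles in the diff. */
#define PA_UNPARENTHESIZED(virt) ((uintptr_t) virt)
#define PA_PARENTHESIZED(virt)   ((uintptr_t) (virt))

static int words[4];

/* Expands to ((uintptr_t) words + 1): integer add of 1 byte. */
static uintptr_t bad_pa(void)
{
    return PA_UNPARENTHESIZED(words + 1);
}

/* Expands to ((uintptr_t) (words + 1)): pointer add of one int. */
static uintptr_t good_pa(void)
{
    return PA_PARENTHESIZED(words + 1);
}
```

The two results differ by sizeof(int) - 1 bytes, which is the kind of
silent miscalculation the parenthesized form avoids.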

Re: [PATCH] add transport class symlink to device object

2005-08-15 Thread James Bottomley

On Sun, 2005-08-14 at 16:02 +0100, Matthew Wilcox wrote:
> /sys/class/tty/ttyS0/device -> ../../../devices/parisc/0/0:0/pci0000:00/0000:00:04.0
> /sys/class/tty/ttyS1/device -> ../../../devices/parisc/0/0:0/pci0000:00/0000:00:04.0
> /sys/class/tty/ttyS2/device -> ../../../devices/parisc/0/0:0/pci0000:00/0000:00:04.0
> /sys/class/tty/ttyS3/device -> ../../../devices/parisc/0/0:0/pci0000:00/0000:00:05.0
> /sys/class/tty/ttyS4/device -> ../../../devices/parisc/0/0:0/pci0000:00/0000:00:05.0

Actually, isn't the fix to all of this to combine Greg and James'
patches?

The Greg one fails in SCSI because we don't have unique class device
names (by convention we use the same name as the device bus_id) and
James' one fails for ttys because the class name isn't unique.  However,
if the link were derived from something like

:

Then it would be unique in both cases.

Unless anyone can think of any more failing cases?

James


Re: libata PATA todo list

2005-08-15 Thread Bill Davidsen


Jeff Garzik wrote:

> Since there's been some recent interest in the subject, I thought I
> would post the PATA todo list for libata.  Some of these items are from
> my memory, and some are from a list Alan was kind enough to create.  The
> items verbatim from Alan are prefixed "Alan: ".

> 2) Simplex DMA
>
> PCI IDE specification has a 'simplex' DMA bit, which should be tested.
> Simplex means that only one command can be outstanding, for BOTH port0
> and port1, at any given time.
>
> Possibly some hosts also need Simplex DMA, but may not assert the
> standard PCI IDE Simplex DMA capability bit.  I don't know.

I remember using devices which require this. Not recently.

> 4) Alan:  Command filter
>
> Alan -- explanation?
>
> I know one line item here, at least:  Promise controllers snoop SET
> FEATURES - XFER MODE command.  We must stop command processing on ALL
> ports when this command is issued, to avoid corruption.

The last time I tried, cdrecord was allowed to burn the first session of
a multi-session CD as a user (with correct device permissions) but not
to read the multisession info (current ISO size) to burn another
session. I haven't tried it in the last few months, I changed my script
to do something else. However, it really should work.

I will test this if you like, but I'm on 7x24 coverage this week and
7x24 vacation after that, so not soon.

> 10) ATAPI DMA alignment (discussed elsewhere)
>
> Needed even for PATA, AFAICT.

Thanks for keeping the list!

--
   -bill davidsen ([EMAIL PROTECTED])
"The secret to procrastination is to put things off until the
 last possible moment - but no longer"  -me

Re: Trouble shooting a ten minute boot delay (SiI3112)

2005-08-15 Thread Shaun Jackman

2005/8/12, Steven Rostedt <[EMAIL PROTECTED]>:
> Is the keyboard ever set up then? This is all happening before
> console_init (since that's when the prints start) and the early printk
> won't show anything before it parses the options.  For other
> architectures, I use to write out to the serial really early, just an
> 'x'. If you know how to do that, you could give it a try. Start at
> start_kernel in main hopefully you see the 'x'. If you do, keep moving
> it until you find where it's delaying.  Of course, this could be before
> start_kernel, then you're really screwed, unless you're good at doing
> the same in assembly (which I've done for MIPS, PPC and ARM, but never
> for x86).

Since each reboot takes ten minutes, this would be a tedious process.
Thanks for the suggestion though.

> > I compiled a vanilla 2.6.12.4 kernel, enabled EARLY_PRINTK and
> > rebooted with earlyprintk=vga. The kernel didn't display any extra
> > information before the delay.
> 
> Do you see grub saying "uncompressing kernel..." or whatever that says?

Grub says...

root (hd2,2)
  Filesystem type is ext2fs, partition type 0x83
kernel /boot/vmlinuz-2.6.12.4 root=/dev/md0 ro nodma
   [Linux-bzImage, setup=0x1e00, size=0x1302ff]

I suspect this second message is where grub decompresses the kernel.
The last message grub displays is simply...

boot

Cheers,
Shaun

Re: [PATCH,RFC] quirks for VIA VT8237 southbridge

2005-08-15 Thread Bjorn Helgaas

On Saturday 13 August 2005 9:10 am, Karsten Wiese wrote:
> this fixes the 'doubled ioapic level interrupt rate' issue I've been
> seeing on a K8T800/AMD64 mainboard.
> It also switches off quirk_via_irq() for the VT8237 southbridge.

These patches seem unrelated except that they both contain the
text "via", so I think you should at least split them.

> + * Devices part of the VIA VT8237 don't need quirk_via_irq().
> + * They also don't get confused by it, but dmesg gets quiter
> + * with this 'anti'-quirk.
> + * Here we are overly paranoic:
> + * we assume there might also exist via devices not part of the VT8237
> + * needing quirk_via_irq().
> + * This might never be the case in reality, when there is a VT8237.

This quirk_via_irq() change seems like an awful lot of work to
get rid of a few log messages.  In my opinion, it's just not
worth it, because it's so hard to debug problems in that area
already.

> +static unsigned int vt8237_devfn[] = {
> +   PCI_DEVFN(15, 0),
> +   PCI_DEVFN(15, 1),
> +   PCI_DEVFN(16, 0),
> +   PCI_DEVFN(16, 1),
> +   PCI_DEVFN(16, 2),
> +   PCI_DEVFN(16, 3),
> +   PCI_DEVFN(16, 4),
> +   PCI_DEVFN(16, 5),
> +   PCI_DEVFN(17, 5),
> +   PCI_DEVFN(17, 6),
> +   PCI_DEVFN(18, 0)
> +};
> +static struct pci_dev *quirk_via_irq_not[ARRAY_SIZE(vt8237_devfn)];
> +static void quirk_via_irq_not_for_8237(struct pci_dev *dev)
> +{
> +   // Make sure we do this only once
> +   if (quirk_via_irq_not[0] != NULL)
> +   return;
> +
> +   if (dev->devfn == PCI_DEVFN(0x11, 0)) {
> +   int i, j;
> +   for (i = 0, j = 0; i < ARRAY_SIZE(vt8237_devfn); i++) {
> +   struct pci_dev * d;
> +   d = pci_find_slot(dev->bus->number, vt8237_devfn[i]);
> +   if (d != NULL)
> +   quirk_via_irq_not[j++] = d;
> +   }
> +   }
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, quirk_via_irq_not_for_8237);
> +
> +/*
>   * Via 686A/B:  The PCI_INTERRUPT_LINE register for the on-chip
>   * devices, USB0/1, AC97, MC97, and ACPI, has an unusual feature:
>   * when written, it makes an internal connection to the PIC.
> @@ -499,8 +559,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_V
>   */
>  static void quirk_via_irq(struct pci_dev *dev)
>  {
> +   int i;
> u8 irq, new_irq;
> 
> +   for (i = 0; i < ARRAY_SIZE(vt8237_devfn); i++)
> +   if (quirk_via_irq_not[i] == dev)
> +   return;
> +
> +
> new_irq = dev->irq & 0xf;
> pci_read_config_byte(dev, PCI_INTERRUPT_LINE, &irq);
> if (new_irq != irq) {

Re: [-mm PATCH] set correct bit in reload register of Watchdog Timer for Intel 6300 chipset

2005-08-15 Thread David Härdeman


On Mon, Aug 15, 2005 at 02:02:19PM -0700, Naveen Gupta wrote:
> This patch writes into bit 8 of the reload register to perform the
> correct 'Reload Sequence' instead of writing into bit 4 of Watchdog for
> Intel 6300ESB chipset.
>
> Signed-off-by: Naveen Gupta <[EMAIL PROTECTED]>

Acked-by: David Härdeman <[EMAIL PROTECTED]>

Thanks a lot Naveen.

//David

Re: [-mm PATCH] set enable bit instead of lock bit of Watchdog Timer in Intel 6300 chipset

2005-08-15 Thread David Härdeman


On Mon, Aug 15, 2005 at 02:21:05PM -0700, Naveen Gupta wrote:
> This patch sets the WDT_ENABLE bit of the Lock Register to enable the
> watchdog and WDT_LOCK bit only if nowayout is set. The old code always
> sets the WDT_LOCK bit of watchdog timer for Intel 6300ESB chipset. So, we
> end up locking the watchdog instead of enabling it.
>
> Signed-off-by: Naveen Gupta <[EMAIL PROTECTED]>

Acked-by: David Härdeman <[EMAIL PROTECTED]>

Thanks Naveen.

//David
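
The behaviour the patch describes can be summarised in a short sketch.
The bit positions and the helper lock_reg_value() are assumptions chosen
for illustration; the real values should be taken from the driver source
or the 6300ESB datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define WDT_LOCK   (1u << 0)
#define WDT_ENABLE (1u << 1)

/*
 * Hypothetical helper: value to write to the Lock Register when
 * starting the watchdog.  Always enable; lock only when the timer
 * must not be stoppable afterwards (nowayout).
 */
static uint8_t lock_reg_value(bool nowayout)
{
    uint8_t val = WDT_ENABLE;

    if (nowayout)
        val |= WDT_LOCK;
    return val;
}
```

The bug being fixed corresponds to writing WDT_LOCK unconditionally and
never setting WDT_ENABLE.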

Re: [PATCH] pmtmr and PRINTK_TIME timings display

2005-08-15 Thread john stultz

On Thu, 2005-08-04 at 17:23 +0200, Borislav Petkov wrote:
> I get it. Actually, I wasn't very sure whether this is the right solution 
> since my desktop machine uses tsc timer as default while the laptop the 
> pmtmr. I also remember that there was a patch a while ago on lkml which 
> enabled a modifiable behavior for PRINTK_TIME through a /proc interface and 
> kernel boot option but it somehow didn't get accepted. Ok, then, since we 
> keep the jiffies solution across arch's, how can I force the kernel to use 
> tsc for printk timings so that i can see the deltas between the different 
> printk's instead of the jiffies_64 ns value? The Pentium-M Centrino on the 
> laptop evidently supports rdtsc as a msr instruction.

The issue is that there are a number of laptops that do not properly
support cpufreq and additionally newer laptop chips halt their TSC's
when they go into C3 mode. This keeps the TSC from working as a proper
timesource on these systems, and causes the need for alternative
timesources like the ACPI PM timer. 

thanks
-john



Re: [PATCH 2.6.12.4] ACPI oops during ipmi_si driver init

2005-08-15 Thread Bjorn Helgaas

On Friday 12 August 2005 1:44 pm, Peter Martuccelli wrote:
> Stumbled into this problem working on the ipmi_si driver.  When the
> ipmi_si driver initialization fails the acpi_tb_get_table 
> call, after rsdt_info has been allocated, acpi_get_firmware_table()
> will oops trying to reference off rsdt_info->pointer in the cleanup
> code.

I don't know whether the ACPI patch is correct or desirable, but
I think the ipmi_si ACPI discovery is bogus (it was probably
written before the current ACPI and PNPACPI driver registration
interfaces were stable).

Currently, ipmi_si uses the static SPMI table to locate the
device.  But the static table should only be used if we need
the device very early, before the ACPI namespace is available.

I don't think we use the device early, so we should use
pnp_register_driver() to claim the appropriate PNP IDs.
Or we might have to use acpi_bus_register_driver() since
it looks like it uses ACPI-specific features like GPEs.

Re: [-mm PATCH] set enable bit instead of lock bit of Watchdog Timer in Intel 6300 chipset

2005-08-15 Thread Naveen Gupta

David,

Yes, I have tested these patches. In fact I found these bugs while trying
to make the driver work on our machines.

-Naveen

On Tue, 16 Aug 2005, David Härdeman wrote:

> On Mon, Aug 15, 2005 at 02:21:05PM -0700, Naveen Gupta wrote:
> >
> >This patch sets the WDT_ENABLE bit of the Lock Register to enable the
> >watchdog and WDT_LOCK bit only if nowayout is set. The old code always
> >sets the WDT_LOCK bit of watchdog timer for Intel 6300ESB chipset. So, we
> >end up locking the watchdog instead of enabling it.
> 
> Naveen,
> 
> thanks a lot for testing the driver further and finding these bugs. I've 
> not been able to do so myself as the only computers available to me with 
> this watchdog are production-servers meaning that I've only been able to 
> test during scheduled downtimes.
> 
> Have you tested and verified that the driver works after these patches 
> have been applied?
> 
> Re,
> David
> 

Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)

2005-08-15 Thread Roman Zippel

Hi,

On Wed, 10 Aug 2005, john stultz wrote:

>   Here's the next rev in my rework of the current timekeeping subsystem.
> No major changes, only some cleanups and further splitting the larger
> patches into smaller ones.
> 
> The goal of this patch set is to provide a simplified and streamlined
> common timekeeping infrastructure that architectures can optionally use
> to avoid duplicating code with other architectures.

It's still the same old abstraction. Let me try to explain in OO terms why
it's the wrong one. What basically is needed is something like this:

base clock  -> continuous clock -> clock implementation
            -> tick clock       ->

The base clock class is an abstract class which provides the basic time
services like get time, set time...
The continuous clock class and tick clock class are also abstract classes,
but provide basic template functions, which can be used by the actual
implementations to do most of the work.

What you do with your patches is to just provide an abstract class for
continuous clocks, and tick based clocks have to emulate a continuous clock.
Please provide the right abstractions, e.g. leave the gettimeofday
implementation to the timesource and use some helper (template) functions
to do the actual work. Basically as long as you have a cycle_t in the
common code something is really wrong; this code belongs in the continuous
clock template.

This also allows better implementations, e.g. gettimeofday can be done in 
a single step instead of two using a single lock instead of two.
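
One possible reading of this hierarchy, sketched in C with function
pointers.  All type and function names here are hypothetical and the
cycle-to-nanosecond conversion is a generic mult/shift, so this is an
illustration of the layering only, not code from either patch set.

```c
#include <stdint.h>

/* Base clock: the only interface common code sees. */
struct clock {
    uint64_t (*get_ns)(struct clock *clk);
};

/* Continuous clock template: cycle counts stay inside this layer. */
struct continuous_clock {
    struct clock base;          /* must be the first member */
    uint64_t (*read_cycles)(struct continuous_clock *cc);
    uint32_t mult, shift;       /* ns = (cycles * mult) >> shift */
};

static uint64_t continuous_get_ns(struct clock *clk)
{
    struct continuous_clock *cc = (struct continuous_clock *)clk;

    return (cc->read_cycles(cc) * cc->mult) >> cc->shift;
}

/* Tick clock template: no cycle counter at all. */
struct tick_clock {
    struct clock base;          /* must be the first member */
    uint64_t ticks;
    uint64_t ns_per_tick;
};

static uint64_t tick_get_ns(struct clock *clk)
{
    struct tick_clock *tc = (struct tick_clock *)clk;

    return tc->ticks * tc->ns_per_tick;
}

/* Two made-up implementations built from the templates. */
static uint64_t fake_counter_read(struct continuous_clock *cc)
{
    (void)cc;
    return 3000;                /* pretend hardware counter value */
}

static struct continuous_clock fake_counter = {
    .base = { .get_ns = continuous_get_ns },
    .read_cycles = fake_counter_read,
    .mult = 512, .shift = 10,   /* 0.5 ns per cycle */
};

static struct tick_clock fake_ticker = {
    .base = { .get_ns = tick_get_ns },
    .ticks = 5, .ns_per_tick = 4000000,
};

/* Common code: no cycle type appears here. */
static uint64_t clock_get_ns(struct clock *clk)
{
    return clk->get_ns(clk);
}
```

In this arrangement the tick-based clock never has to pretend to be a
continuous one, which is the objection raised above.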

bye, Roman

Re: [RFC - 0/13] NTP cleanup work (v. B5)

2005-08-15 Thread Roman Zippel

Hi,

On Wed, 10 Aug 2005, john stultz wrote:

>   The goal of this patch set is to isolate the in kernel NTP state
> machine in the hope of simplifying the current timekeeping code and
> allowing for optional future changes in the timekeeping subsystem.
> 
> I've tried to address some of the complexity concerns for systems that
> do not have a continuous timesource, preserving the existing behavior
> while still providing a ppm interface allowing smooth adjustments to
> continuous timesources. 

I think most of this is premature cleanup. As it also changes the logic in 
small ways, I'm not even sure it qualifies as a cleanup.
The only obvious patch is the PPS code removal, which is fine.
For the rest I can't agree to moving everything that aggressively into 
the ntp namespace. The kernel clock is controlled via NTP, but how it 
actually works has little to do with "network time". Some of the 
parameters are even private clock variables (e.g. time adjustment, phase), 
which don't belong in any common code. (I'll expand on that in the next 
mail.)

bye, Roman

Re: [PATCH 4/5] add i2c_probe_device and i2c_remove_device

2005-08-15 Thread Jean Delvare

Hi Nathan,

> These functions can be used for special-purpose adapters, such as
> those on TV tuner cards, where we generally know in advance what
> devices are attached.  This is important in cases where the adapter
> does not support probing or when probing is potentially dangerous to
> the connected devices.

Do you know of any adapter actually not supporting the SMBus Quick
command (which we use for probing)?

> --- linux-2.6.13-rc6+gregkh.orig/drivers/i2c/i2c-core.c
> +++ linux-2.6.13-rc6+gregkh/drivers/i2c/i2c-core.c
> @@ -671,6 +671,75 @@ int i2c_control(struct i2c_client *clien
>  }
>  
>  /* 
> + * direct add/remove functions to avoid probing
> + * 
> + */
> +
> +int i2c_probe_device(struct i2c_adapter *adapter, int driver_id,
> +  int addr, int kind)
> +{
> + struct list_head   *item;
> + struct i2c_driver  *driver = NULL;
> +
> + /* There's no way to probe addresses on this adapter... */
> + if (kind < 0 && !i2c_check_functionality(adapter,I2C_FUNC_SMBUS_QUICK))
> + return -EINVAL;

Coding style please: one space after the comma. 

> +
> + down(&core_lists);
> + list_for_each(item,&drivers) {

Ditto.

> + driver = list_entry(item, struct i2c_driver, list);
> + if (driver->id == driver_id)
> + break;
> + }
> + up(&core_lists);
> + if (!item)
> + return -ENOENT;
> +
> + /* Already in use? */
> + if (i2c_check_addr(adapter, addr))
> + return -EBUSY;
> +
> + /* Make sure there is something at this address, unless forced */
> + if (kind < 0) {
> + if (i2c_smbus_xfer(adapter, addr, 0, 0, 0,
> +I2C_SMBUS_QUICK, NULL) < 0)
> + return -ENODEV;
> +
> + /* prevent 24RF08 corruption */
> + if ((addr & ~0x0f) == 0x50)
> + i2c_smbus_xfer(adapter, addr, 0, 0, 0,
> +I2C_SMBUS_QUICK, NULL);
> + }
> +
> + return driver->detect_client(adapter, addr, kind);
> +}

You are duplicating a part of i2c_probe_address() here. Why don't you
simply call it?

This part of the code is very sensitive because of the 24RF08 corruption
issue. I have plans to change the probing method, e.g. by using SMBus
Receive Byte instead of SMBus Quick for the 0x50-0x5F address range.
Thus I would really appreciate it if this code were not duplicated.
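
To make the address-range point concrete, here is a minimal standalone sketch of picking a probe method per address. The enum and both function names are hypothetical and not part of i2c-core; only the `(addr & ~0x0f) == 0x50` test is taken from the patch quoted above.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of i2c-core. */
enum probe_method {
	PROBE_QUICK,		/* SMBus Quick command */
	PROBE_RECEIVE_BYTE	/* SMBus Receive Byte */
};

/* True for addresses 0x50-0x5F, the range where a 24RF08 may respond. */
static bool addr_in_24rf08_range(int addr)
{
	return (addr & ~0x0f) == 0x50;
}

/* Choose the probe method for an address, as proposed in the text. */
static enum probe_method pick_probe_method(int addr)
{
	return addr_in_24rf08_range(addr) ? PROBE_RECEIVE_BYTE : PROBE_QUICK;
}
```

Keeping this decision in a single place is the reason for not duplicating the probing code: a change of method for one address range then only has to be made once.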

Thanks,
-- 
Jean Delvare

Re: [PATCH] pmtmr and PRINTK_TIME timings display

2005-08-15 Thread john stultz

On Thu, 2005-08-04 at 14:59 +0200, Borislav Petkov wrote:
> Hi,
> 
> on my laptop ASUS M6B00N PRINTK_TIME is enabled in order to show timing 
> information in all the boottime printk's. However, all output looks like this
> 
> 
> [4294667.997000] CPU: After generic identify, caps: a7e9fbbf  
>  
>  0180  
> [4294667.997000] CPU: After vendor identify, caps: a7e9fbbf   
>  0180  
> [4294667.997000] CPU: L1 I cache: 32K, L1 D cache: 32K
> [4294667.997000] CPU: L2 cache: 1024K
> [4294667.997000] CPU: After all inits, caps: a7e9fbbf   
> 0040 0180  
> [4294667.997000] CPU: Intel(R) Pentium(R) M processor 1500MHz stepping 05
> [4294667.997000] Enabling fast FPU save and restore... done.
> [4294667.997000] Enabling unmasked SIMD FPU exception support... done.
> [4294667.997000] Checking 'hlt' instruction... OK.
> [4294668.041000] ACPI: setting ELCR to 0200 (from 0c30)
> 
> 
> If I'm not wrong, the time value that gets printed is actually the jiffies_64 
> value set to INITIAL_JIFFIES, which in turn is set to wrap 5 minutes after 
> boot so that "jiffies wrap bugs show up earlier." This is because 
> sched_clock() in arch/i386/kernel/timers/timer_tsc.c returns the jiffies_64 
> value converted to nanoseconds after checking use_tsc. This, in turn, is 0 
> because my machine selects the power management timer as the high-res 
> timesource before reading the timestamp counter for printk timing.
> 
[snip]
> After applying it, printk timing looks like this:
> 
> 
> [0.00] Detected 1500.132 MHz processor.
> [0.00] Using pmtmr for high-res timesource
> [0.00] Console: colour VGA+ 80x25
> [1.89] Dentry cache hash table entries: 131072 (order: 7, 524288 
> bytes)
> [1.891000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> [1.906000] Memory: 513756k/523520k available (2839k kernel code, 9276k 
> reserved, 1148k data, 152k init, 0k highmem)
> [1.906000] Checking if this processor honours the WP bit even in 
> supervisor mode... Ok.
> [1.906000] Calibrating delay loop... 2973.69 BogoMIPS (lpj=1486848)
> [1.928000] Security Framework v1.0.0 initialized
> 
> 
> 
> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> 
> --- arch/i386/kernel/timers/timer_tsc.c.orig  2005-08-04 12:57:37.0 
> +0200
> +++ arch/i386/kernel/timers/timer_tsc.c   2005-08-04 14:19:48.0 
> +0200
> @@ -146,7 +146,7 @@ unsigned long long sched_clock(void)
>   if (!use_tsc)
>  #endif
>   /* no locking but a rare wrong value is not a big deal */
> - return jiffies_64 * (1000000000 / HZ);
> + return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
>  
>   /* Read the Time Stamp Counter */
>   rdtscll(this_offset);
> 
> 

This patch looks fine to me.
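
For reference, the arithmetic behind the odd timestamps can be sketched in standalone C. This is only an illustration under stated assumptions: HZ is taken to be 1000, the function names are made up, and the nanosecond factor is assumed to be 1000000000 / HZ; the INITIAL_JIFFIES expression mirrors the "wrap 5 minutes after boot" definition described above.

```c
#include <stdint.h>

#define HZ 1000	/* assumption: 1000 ticks per second */

/* The 32-bit jiffies counter starts here so that it wraps 5 minutes after boot. */
#define INITIAL_JIFFIES ((uint64_t)(uint32_t)(-300 * HZ))

/* Hypothetical name: old fallback, raw jiffies_64 converted to nanoseconds. */
static uint64_t sched_clock_old(uint64_t jiffies_64)
{
	return jiffies_64 * (1000000000ULL / HZ);
}

/* Hypothetical name: patched fallback, boot offset subtracted first. */
static uint64_t sched_clock_new(uint64_t jiffies_64)
{
	return (jiffies_64 - INITIAL_JIFFIES) * (1000000000ULL / HZ);
}
```

With these assumptions, sched_clock_old(INITIAL_JIFFIES) is 4294667296000000 ns, i.e. about 4294667.296 seconds, which is the magnitude seen in the unpatched log, while sched_clock_new(INITIAL_JIFFIES) is 0.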
-john



Re: [patch] libata passthrough - return register data from HDIO_* commands

2005-08-15 Thread Jon Escombe



Here is a first attempt at a patch to return register data from the 
libata passthrough HDIO ioctl handlers. I needed this as the ATA 
'unload immediate' command returns the success in the lbal register. 
This patch applies on top of 2.6.12 and Jeff's 
2.6.12-git4-passthru1.patch. (Apologies, but Thunderbird appears to 
have replaced the tabs with spaces).


One oddity is that the sr_result field is correctly being set in 
ata_gen_ata_desc_sense(), however the high word is different when 
we're back in the ioctl handler. I've coded around this for now by only 
checking the low word, but this needs more investigation.


Jeff, could this functionality be incorporated into the passthrough 
patch when complete?



I'd failed to realise that scsi_finish_command() sets the high byte of 
the result field to DRIVER_SENSE when there is sense data. Patch updated 
to reflect this...
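
As a rough standalone illustration of the result word being tested (not the patch itself): the two constants carry the values I believe the kernel SCSI headers define, while the macro RESULT_SENSE_CC and the helper result_has_sense() are hypothetical names used only for this sketch.

```c
#include <stdint.h>

/* Values assumed to match the kernel's SCSI headers. */
#define DRIVER_SENSE			0x08
#define SAM_STAT_CHECK_CONDITION	0x02

/* Result word expected when sense data accompanies a check condition:
 * driver byte in bits 31..24, SCSI status in the low byte. */
#define RESULT_SENSE_CC \
	(((uint32_t)DRIVER_SENSE << 24) + SAM_STAT_CHECK_CONDITION)

/* Hypothetical helper: nonzero if the result carries the expected sense data. */
static int result_has_sense(uint32_t sr_result)
{
	return sr_result == RESULT_SENSE_CC;
}
```

This is why checking only the low word appeared to work earlier: the status byte was correct all along, and the difference was confined to the driver byte in the top of the word.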


I haven't had any feedback on the patch itself, but this now does what I 
wanted it to do.  (I can't find a way to make Thunderbird retain tabs in 
the message body, so I am sending it as an attachment).


Regards,
Jon.



--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -89,6 +89,7 @@
 	u8 args[4], *argbuf = NULL;
 	int argsize = 0;
 	struct scsi_request *sreq;
+	unsigned char *sb, *desc;
 
 	if (NULL == (void *)arg)
 		return -EINVAL;
@@ -100,6 +101,9 @@
 	if (!sreq)
 		return -EINTR;
 
+	sb = sreq->sr_sense_buffer;
+	desc = sb + 8;
+
 	memset(scsi_cmd, 0, sizeof(scsi_cmd));
 
 	if (args[3]) {
@@ -109,12 +113,12 @@
 			return -ENOMEM;
 
 		scsi_cmd[1]  = (4 << 1); /* PIO Data-in */
-		scsi_cmd[2]  = 0x0e; /* no off.line or cc, read from dev,
+		scsi_cmd[2]  = 0x2e; /* no off.line, read from dev, request cc
 		block count in sector count field */
 		sreq->sr_data_direction = DMA_FROM_DEVICE;
 	} else {
 		scsi_cmd[1]  = (3 << 1); /* Non-data */
-		/* scsi_cmd[2] is already 0 -- no off.line, cc, or data xfer */
+		scsi_cmd[2]  = 0x20; /* no off.line or data xfer, request check condition */
 		sreq->sr_data_direction = DMA_NONE;
 	}
 
@@ -135,16 +139,24 @@
 	   from scsi_ioctl_send_command() for default case... */
 	scsi_wait_req(sreq, scsi_cmd, argbuf, argsize, (10*HZ), 5);
 
-	if (sreq->sr_result) {
+	if (sreq->sr_result != ((DRIVER_SENSE << 24) + SAM_STAT_CHECK_CONDITION)) {
 		rc = -EIO;
 		goto error;
 	}
 
-	/* Need code to retrieve data from check condition? */
+	/* Retrieve data from check condition */
+	args[1] = desc[3];
+	args[2] = desc[5];
+	args[3] = desc[7];
+	args[4] = desc[9];
+	args[5] = desc[11];
+	args[0] = desc[13];
 
 	if ((argbuf)
 	 && copy_to_user((void *)(arg + sizeof(args)), argbuf, argsize))
 		rc = -EFAULT;
+	if (copy_to_user(arg, args, sizeof(args)))
+		rc = -EFAULT;
 error:
 	scsi_release_request(sreq);
 
@@ -171,6 +183,7 @@
 	u8 scsi_cmd[MAX_COMMAND_SIZE];
 	u8 args[7];
 	struct scsi_request *sreq;
+	unsigned char *sb, *desc;
 
 	if (NULL == (void *)arg)
 		return -EINVAL;
@@ -181,7 +194,7 @@
 	memset(scsi_cmd, 0, sizeof(scsi_cmd));
 	scsi_cmd[0]  = ATA_16;
 	scsi_cmd[1]  = (3 << 1); /* Non-data */
-	/* scsi_cmd[2] is already 0 -- no off.line, cc, or data xfer */
+	scsi_cmd[2]  = 0x20; /* no off.line, or data xfer, request cc */
 	scsi_cmd[4]  = args[1];
 	scsi_cmd[6]  = args[2];
 	scsi_cmd[8]  = args[3];
@@ -195,18 +208,29 @@
 		goto error;
 	}
 
+	sb = sreq->sr_sense_buffer;
+	desc = sb + 8;
+
 	sreq->sr_data_direction = DMA_NONE;
 	/* Good values for timeout and retries?  Values below
 	   from scsi_ioctl_send_command() for default case... */
 	scsi_wait_req(sreq, scsi_cmd, NULL, 0, (10*HZ), 5);
 
-	if (sreq->sr_result) {
+	if (sreq->sr_result != ((DRIVER_SENSE << 24) + SAM_STAT_CHECK_CONDITION)) {
 		rc = -EIO;
 		goto error;
 	}
 
-	/* Need code to retrieve data from check condition? */
+	/* Retrieve data from check condition */
+	args[1] = desc[3];
+	args[2] = desc[5];
+	args[3] = desc[7];
+	args[4] = desc[9];
+	args[5] = desc[11];
+	args[0] = desc[13];
 
+	if (copy_to_user(arg, args, sizeof(args)))
+		rc = -EFAULT;
 error:
 	scsi_release_request(sreq);
 	return rc;

Re: [-mm PATCH] set enable bit instead of lock bit of Watchdog Timer in Intel 6300 chipset

2005-08-15 Thread David Härdeman


On Mon, Aug 15, 2005 at 02:21:05PM -0700, Naveen Gupta wrote:


This patch sets the WDT_ENABLE bit of the Lock Register to enable the
watchdog and WDT_LOCK bit only if nowayout is set. The old code always
sets the WDT_LOCK bit of watchdog timer for Intel 6300ESB chipset. So, we
end up locking the watchdog instead of enabling it.
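
The intended behaviour can be sketched as follows. This is a minimal illustration, not the driver code: the bit positions are assumptions that should be checked against the driver and the chipset documentation, and wdt_start_value() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed bit positions in the Lock Register, for illustration only. */
#define WDT_LOCK	(1u << 0)
#define WDT_ENABLE	(1u << 1)

/* Hypothetical helper: value to write to the Lock Register on start. */
static uint8_t wdt_start_value(int nowayout)
{
	uint8_t val = WDT_ENABLE;	/* always enable the timer */

	if (nowayout)
		val |= WDT_LOCK;	/* lock only if it must never be stopped */
	return val;
}
```

With nowayout unset, only the enable bit is written, so the watchdog can still be stopped later.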


Naveen,

thanks a lot for testing the driver further and finding these bugs. I've 
not been able to do so myself as the only computers available to me with 
this watchdog are production-servers meaning that I've only been able to 
test during scheduled downtimes.


Have you tested and verified that the driver works after these patches 
have been applied?


Re,
David

Re: rc6 keeps hanging and blanking displays - bisection complete

2005-08-15 Thread Helge Hafting

On Mon, Aug 15, 2005 at 08:50:12AM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 15 Aug 2005, Helge Hafting wrote:
> >
> > Ok, I have downloaded git and started the first compile.
> > Git will tell when the correct point is found (assuming I
> > do the "git bisect bad/good" right), by itself?
> 
> Yes. You should see 
> 
>   Bisecting: xxx revisions left to test after this
> 
> and the "xxx" should hopefully decrease by half during each round. And at 
> the end of it, you should get
> 
>is first bad commit
> 
> followed by the actual patch that caused the problem.
> 
This was interesting.  At first, lots of kernels just kept working,
I almost suspected I was doing something wrong. Then the second last kernel
recompiled a lot of DRM stuff - and the crash came back!
The kernel after that worked again, and so the final message was:

561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit
diff-tree 561fb765b97f287211a2c73a844c5edb12f44f1d (from 6ade43fbbcc3c12f0ddba112351d14d6c82ae476)
Author: Anton Blanchard <[EMAIL PROTECTED]>
Date:   Mon Aug 1 21:11:46 2005 -0700

[PATCH] ppc64: topology API fix

Dont include asm-generic/topology.h unconditionally, we end up overriding
all the ppc64 specific functions when NUMA is on.

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
Acked-by: Paul Mackerras <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:040000 040000 a760521110f862aecbee74cffa674993b6dca4a3 66b9cb2db119ab029ca7b8f71bd06507fca63921 M  include

I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for 
x86_64? But perhaps they share something?

I hope this is of help,
Helge Hafting
