Re: r8169: NFG in 2.6.24-rc2

2007-11-07 Thread Hans-Jürgen Koch
On Wed, 07 Nov 2007 11:07:07 -0500,
Mark Lord <[EMAIL PROTECTED]> wrote:

> My ASUS board has one of these:
> 
> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
>     Subsystem: ASUSTeK Computer Inc. Unknown device 81aa
>     Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-  >IRQ 16
>     Region 0: I/O ports at 9800 [size=256]
>     Region 2: Memory at ff3ff000 (64-bit, non-prefetchable) [size=4K]
>     Expansion ROM at ff3c [disabled] [size=128K]
>     Capabilities: [40] Power Management version 2
>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
>         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>     Capabilities: [48] Vital Product Data
>     Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
>         Address:   Data: 
>     Capabilities: [60] Express Endpoint IRQ 0
>         Device: Supported: MaxPayload 1024 bytes, PhantFunc 0, ExtTag+
>         Device: Latency L0s <1us, L1 unlimited
>         Device: AtnBtn+ AtnInd+ PwrInd+
>         Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
>         Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>         Device: MaxPayload 128 bytes, MaxReadReq 4096 bytes
>         Link: Supported Speed 2.5Gb/s, Width x4, ASPM L0s, Port 0
>         Link: Latency L0s unlimited, L1 unlimited
>         Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch-
>         Link: Speed 2.5Gb/s, Width x1
>     Capabilities: [84] Vendor Specific Information
> 
> It works perfectly in 2.6.23.
> It does not work in 2.6.24-rc2.  Dunno about -rc1 or earlier -git*.
> 
> Without CONFIG_PCI_MSI, it works slightly, enough to ping it a couple
> of times, but it then dies when used for anything real:
> 
>   r8169 Gigabit Ethernet driver 2.2LK loaded
>   r8169 :01:00.0: no MSI. Back to INTx.
>   ...
>   eth0: RTL8168b/8111b at 0xf884a000, 00:17:31:64:e0:bc, XID 3000 IRQ 16 ...
>   r8169: eth0: link up
>   ...
>   kernel: NETDEV WATCHDOG: eth0: transmit timed out
>   r8169: eth0: link up
>   ...
> Not usable from this point on.

Same problem here with an MSI K9AGM2 board. The problem appeared in
2.6.24-rc1 (http://bugzilla.kernel.org/show_bug.cgi?id=9257).

It seems to be better in -rc2: at least the chip is detected again. I
can assign an IP address to the interface and bring it up, but no data
traffic is possible. After some tests in both directions, ifconfig reports
157.6 KiB of RX bytes but 0 TX bytes, so sent packets seem to disappear
quite early.

Thanks,
Hans
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Al Boldi
Andrew Morton wrote:
> > > I would suggest getting a 'tcpdump -s0' trace and seeing (with
> > > wireshark) what is different between the various cases.
> >
> > Thanks Neil for looking into this.  Your suggestion has already been
> > answered in a previous post, where the difference has been attributed to
> > "ls -l" inducing lookup for the first try, which is fast, and getattr
> > for later tries, which is super-slow.
> >
> > Now it's easy to blame the userland rpc.nfs.V2 server for this, but
> > what's not clear is how come 2.4.31 handles getattr faster than 2.6.23?
>
> We broke 2.6?  It'd be interesting to run the ls in an infinite loop on
> the client then start poking at the server.  Is the 2.6 server doing
> physical IO?  Is the 2.6 server consuming more system time?  etc.  A basic
> `vmstat 1' trace for both 2.4 and 2.6 would be a starting point.
>
> Could be that there's some additional latency caused by networking
> changes, too.  I expect the tcpdump/wireshark/etc traces would have
> sufficient resolution for us to be able to see that.

The problem turns out to be "tune2fs -O dir_index".
Removing that feature resolves the big slowdown.

Does 2.4.31 support this feature?

Neil Brown wrote:
> Maybe an "strace -tt" of the nfs server might show some significant
> difference.

###
# ls -l <3K dir entry> (first try after mount inducing lookup) in ~3sec
# strace -tt rpc.nfsd

08:28:14.668557 time([1194499694])  = 1194499694
08:28:14.669420 alarm(5)= 2
08:28:14.669667 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:14.670142 recvfrom(4, 
"\275\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 116
08:28:14.670554 time(NULL)  = 1194499694
08:28:14.670711 time([1194499694])  = 1194499694
08:28:14.670875 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
08:28:14.671134 time([1194499694])  = 1194499694
08:28:14.671302 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.671530 time([1194499694])  = 1194499694
08:28:14.671701 alarm(2)= 5
08:28:14.671903 time([1194499694])  = 1194499694
08:28:14.672060 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.672305 time([1194499694])  = 1194499694
08:28:14.672508 sendto(4, 
"\275\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) 
= 128
08:28:14.672909 time([1194499694])  = 1194499694
08:28:14.673869 alarm(5)= 2
08:28:14.674145 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:14.674589 recvfrom(4, 
"\276\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 116
08:28:14.675003 time(NULL)  = 1194499694
08:28:14.675160 time([1194499694])  = 1194499694
08:28:14.675321 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
08:28:14.675581 time([1194499694])  = 1194499694
08:28:14.675749 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.675979 time([1194499694])  = 1194499694
08:28:14.676150 alarm(2)= 5
08:28:14.676348 time([1194499694])  = 1194499694
08:28:14.676505 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.676746 time([1194499694])  = 1194499694
08:28:14.676952 sendto(4, 
"\276\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) 
= 128

##
# ls -l <3K dir entry> (second try after mount inducing getattr) in ~11sec
# strace -tt rpc.nfsd

08:28:40.963668 time([1194499720])  = 1194499720
08:28:40.964525 alarm(5)= 2
08:28:40.964772 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:40.965215 recvfrom(4, 
",\3747{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\1\0\0"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 108
08:28:40.965609 time(NULL)  = 1194499720
08:28:40.965763 time([1194499720])  = 1194499720
08:28:40.965941 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966176 setfsuid(0) = 0
08:28:40.966329 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966539 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966748 open("/", O_RDONLY|O_NONBLOCK) = 0
08:28:40.966919 fcntl(0, F_SETFD, FD_CLOEXEC) = 0
08:28:40.967084 lseek(0, 0, SEEK_CUR)   = 0
08:28:40.967240 getdents(0, /* 71 entries */, 3933) = 1220
08:28:40.968195 close(0)= 0
08:28:40.968351 stat("/a/", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
08:28:40.968583 stat("/a/", 

Re: LTP ustat01 test fails on NFSROOT

2007-11-07 Thread Kumar Gala


On Nov 2, 2007, at 9:28 AM, Kumar Gala wrote:

> On Thu, 25 Oct 2007, Trond Myklebust wrote:
>
>> Could you please try the following patch?
>>
>> Cheers
>>   Trond
>
> It's a new month so I'll ping again about sending this fix upstream to
> Linus for 2.6.24 :) ?
>
> - k


Trond,

any update on sending this to Linus for 2.6.24?

- k

----- CUT HERE -----

From: Trond Myklebust <[EMAIL PROTECTED]>
Date: Thu, 25 Oct 2007 13:56:10 -0400
NFS: Fix the ustat() regression

Since 2.6.18, the superblock sb->s_root has been a dummy dentry with a
dummy inode. This breaks ustat(), which actually uses sb->s_root in a
vfstat() call.

Fix this by making the s_root a dummy alias to the directory inode that was
used when creating the superblock.

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/getroot.c |   81 ++
 1 files changed, 27 insertions(+), 54 deletions(-)

diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c
index 522e5ad..0ee4384 100644
--- a/fs/nfs/getroot.c
+++ b/fs/nfs/getroot.c
@@ -43,6 +43,25 @@
 #define NFSDBG_FACILITY		NFSDBG_CLIENT

 /*
+ * Set the superblock root dentry.
+ * Note that this function frees the inode in case of error.
+ */
+static int nfs_superblock_set_dummy_root(struct super_block *sb, struct inode *inode)
+{
+	/* The mntroot acts as the dummy root dentry for this superblock */
+	if (sb->s_root == NULL) {
+		sb->s_root = d_alloc_root(inode);
+		if (sb->s_root == NULL) {
+			iput(inode);
+			return -ENOMEM;
+		}
+		/* Circumvent igrab(): we know the inode is not being freed */
+		atomic_inc(&inode->i_count);
+	}
+	return 0;
+}
+
+/*
  * get an NFS2/NFS3 root dentry from the root filehandle
  */
 struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
@@ -54,33 +73,6 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 	struct inode *inode;
 	int error;

-	/* create a dummy root dentry with dummy inode for this superblock */
-	if (!sb->s_root) {
-		struct nfs_fh dummyfh;
-		struct dentry *root;
-		struct inode *iroot;
-
-		memset(&dummyfh, 0, sizeof(dummyfh));
-		memset(&fattr, 0, sizeof(fattr));
-		nfs_fattr_init(&fattr);
-		fattr.valid = NFS_ATTR_FATTR;
-		fattr.type = NFDIR;
-		fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR;
-		fattr.nlink = 2;
-
-		iroot = nfs_fhget(sb, &dummyfh, &fattr);
-		if (IS_ERR(iroot))
-			return ERR_PTR(PTR_ERR(iroot));
-
-		root = d_alloc_root(iroot);
-		if (!root) {
-			iput(iroot);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		sb->s_root = root;
-	}
-
 	/* get the actual root for this mount */
 	fsinfo.fattr = &fattr;

@@ -96,6 +88,10 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 		return ERR_PTR(PTR_ERR(inode));
 	}

+	error = nfs_superblock_set_dummy_root(sb, inode);
+	if (error != 0)
+		return ERR_PTR(error);
+
 	/* root dentries normally start off anonymous and get spliced in later
 	 * if the dentry tree reaches them; however if the dentry already
 	 * exists, we'll pick it up at this point and use it as the root
@@ -241,33 +237,6 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh)

 	dprintk("--> nfs4_get_root()\n");

-	/* create a dummy root dentry with dummy inode for this superblock */
-	if (!sb->s_root) {
-		struct nfs_fh dummyfh;
-		struct dentry *root;
-		struct inode *iroot;
-
-		memset(&dummyfh, 0, sizeof(dummyfh));
-		memset(&fattr, 0, sizeof(fattr));
-		nfs_fattr_init(&fattr);
-		fattr.valid = NFS_ATTR_FATTR;
-		fattr.type = NFDIR;
-		fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR;
-		fattr.nlink = 2;
-
-		iroot = nfs_fhget(sb, &dummyfh, &fattr);
-		if (IS_ERR(iroot))
-			return ERR_PTR(PTR_ERR(iroot));
-
-		root = d_alloc_root(iroot);
-		if (!root) {
-			iput(iroot);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		sb->s_root = root;
-	}
-
 	/* get the info about the server and filesystem */
 	error = nfs4_server_capabilities(server, mntfh);
 	if (error < 0) {
@@ -289,6 +258,10 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 		return ERR_PTR(PTR_ERR(inode));
 	}

+	error = nfs_superblock_set_dummy_root(sb, inode);
+	if (error != 0)
+		return 

Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Denys Fedoryshchenko
Does it work as a kernel parameter?

I tried libata_dma_mask=0x4 and also setting it to 0xf or 0xff; it doesn't
help. How can DMA be disabled in libata if it is compiled into the kernel?

On Thu, 8 Nov 2007 01:30:53 +0100, Bartlomiej Zolnierkiewicz wrote
> On Thursday 08 November 2007, Denys Fedoryshchenko wrote:
> > 2.6.24-rc2 not working very well
> > 
> > 
> > dmesg
> > [   12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> > [   12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> > [   12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at  PCI slot :00:12.2
> > [   12.454070] SC1200: not 100% native mode: will probe irqs later
> > [   12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
> > [   12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, hdd:pio
> > [   12.515810] Probing IDE interface ide0...
> > [   12.528810] Clocksource tsc unstable (delta = -497423729 ns)
> > [   12.545888] Time: pit clocksource has been installed.
> > [   12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive
> > [   12.578340] hda: applying conservative PIO "downgrade"
> > [   12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1
> > [   12.594006] hda: MW DMA 2 mode selected
> > [   12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> > [   12.608778] Probing IDE interface ide1...
> > [   12.623192] hda: max request size: 128KiB
> > [   12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63, DMA
> > [   12.657134]  hda:<4>hda: dma_timer_expiry: dma status == 0x21
> > [   12.865846] hda: DMA timeout error
> > [   12.876092]  ide_dma_end dma_stat=21 err=1 newerr=0
> > [   12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> > [   12.914977] ide: failed opcode was: unknown
> > [   12.927743] hda: DMA disabled
> > [   12.937035] ide0: reset: success
> > [   12.948324]  hda1
> > 
> > Mounting takes a long time on a 1GB card because of DMA issues. I am not
> > sure about the dmesg timestamps, which show only a few seconds; in real
> > life it took about 2 minutes.
> 
> Please try booting with "hda=nodma".
> 
> It could be a hardware problem (CF adapter without DMA lines).
> 
> Thanks,
> Bart


--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 23:09:16 -0800

> I don't think that's a big problem?  This syscall can (oddly) return any
> 32-bit (64-bit) number and a smart application developer (after saying wtf)
> would realise that he just can't check for errors and have correctly
> working code.
> 
> Then again, if he was smart he just wouldn't use times(2)'s return value
> for anything.  But what is the alternative?  I don't think there is one,
> apart from much saner things like gettimeofday().

You and I would say "wtf", but the manual states what it does:

    On error, (clock_t) -1 is returned, and errno is set appropriately.

And I think this (obviously bogus) convention is something we
are really stuck with.

Another awful aspect of this is that glibc is going to overwrite
'errno' for this return value range.  That will likely cause more
application misbehavior than some of the other side effects we've been
discussing.

In short we have two problems:

1) glibc thinks -4096 < x < 0 is an error, and will write this
   value into errno and return -1 to the application

2) the manual states that -1 means error
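
A minimal user-space sketch of the two problems listed above. It assumes only the convention described in this message: a raw syscall return value in the range -4096 < x < 0 is treated by the libc wrapper as an error, errno is set from it (negated), and -1 is handed to the application. The names raw_is_error() and wrap_return() are hypothetical and merely mimic that convention; this is not glibc's code.

```c
#include <errno.h>

/* Hypothetical helper: would the libc wrapper treat this raw kernel
 * return value as an error?  Mirrors the "-4096 < x < 0" range quoted
 * above; not taken from glibc itself. */
static int raw_is_error(long raw)
{
	return raw < 0 && raw > -4096;
}

/* Hypothetical wrapper: what the application would see for a given raw
 * return value under that convention. */
static long wrap_return(long raw)
{
	if (raw_is_error(raw)) {
		errno = (int)-raw;	/* the negated value lands in errno */
		return -1;
	}
	return raw;
}
```

Under this convention a legitimate tick count of, say, -100 reaches the application as -1 with errno overwritten (problem 1), and a legitimate tick count of exactly -1 cannot be told apart from the documented error return (problem 2).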



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 07 Nov 2007 22:25:30 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 21:20:05 -0800
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> On x86 only.  We could use force_successful_syscall_return()
> to make sure the condition codes get set correctly on
> other platforms.
> 
> But even in that case we'd still be broken when the return
> value is exactly -1 and that's what the application is going
> to compare against to test for errors.

I don't think that's a big problem?  This syscall can (oddly) return any
32-bit (64-bit) number and a smart application developer (after saying wtf)
would realise that he just can't check for errors and have correctly
working code.

Then again, if he was smart he just wouldn't use times(2)'s return value
for anything.  But what is the alternative?  I don't think there is one,
apart from much saner things like gettimeofday().
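
As a rough illustration of the alternative mentioned above, elapsed time can be computed from two gettimeofday() samples instead of from the return value of times(2). The helper name elapsed_usec() is made up for this sketch. Note that gettimeofday() reports wall-clock time, so the result can jump if the clock is adjusted between the two samples.

```c
#include <sys/time.h>

/* Hypothetical helper: microseconds elapsed between two gettimeofday()
 * samples.  A negative tv_usec difference is absorbed by the seconds
 * term, so no manual borrow is needed. */
static long long elapsed_usec(const struct timeval *start,
			      const struct timeval *end)
{
	return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
	       + (end->tv_usec - start->tv_usec);
}
```

Typical use would be to fill one struct timeval with gettimeofday() before the work and another after it, then pass both to elapsed_usec(). Unlike the times(2) return value, neither sample has a range that collides with an error code.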



Re: Fwd: same problem with 2.6.24-rc2

2007-11-07 Thread Randy Dunlap


[adding linux-kernel again]

werner wrote:

> The compilation is finished. For some reason the log you suggested wasn't
> generated. However, the 3 compile/link logs that my kernel-build-script
> normally generates were; they are attached here. The result is the same:
> after booting, the kernel crashes immediately with an EIP error, and the
> build process complains about a missing Makefile.o in //arch/x86.



OK, first show us (that is, the mailing list "linux-kernel@vger.kernel.org", not
just me) what your "kernel-build-script" looks like.

The beginning of the log files that you sent to me (at end of this email)
is very suspicious looking.  It looks like you are not using the expected kernel
build procedures.

The crash problem (snippet below) is a fault in xor_sse_2() in the function
that tries to choose the best (fastest) xor method.  I would expect other
people to be having a similar problem.  I don't suspect that it's related to
the build problem (Makefile.o), but we need to have you building kernels
correctly before we try to find out why they break when you boot them.



=
On 7/Nov/2007 22:06 Randy Dunlap wrote ..

On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote:


On 7/Nov/2007 20:10 werner wrote ..

With 2.6.23-rc2 it is the same problem:  it crashed at the beginning:  EIP 060 c03fdea4
EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
Again during the compilation there was a complaint that /arch/x86/Makefile.o
cannot be found, and certain dependencies on it were not made. Such a file isn't
present in the source code (present are, e.g., Makefile_32 , Makefile_64 ), nor
was it generated automatically during compilation. I think this is incorrect and
the reason for the problems.

Hi,

Please provide the complete build log (with V=1 if possible) for the
missing Makefile.o problem.

E.g.:

make V=1 all >build.log 2>&1

Make sure that build.log contains the error message and then send
the complete build.log file to us at linux-kernel@vger.kernel.org .



wl
[EMAIL PROTECTED]
=
On 7/Nov/2007 16:14 Andrew Morton wrote ..

On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> wrote:

I really don't know what's happening.  I don't understand anything about the
kernel error reporting system.  Because of this, whenever there is a problem,
I report it via e-mail to  linux-kernel@vger.kernel.org .  I don't know what
people there do with my messages.



It went like this:

1: you sent an email to linux-kernel

2: I sent a reply to you and linux-kernel

3: you sent a reply to me, but NOT linux-kernel!

In other words, you did "reply", not "reply to all", thus you removed three
thousand people from the discussion.  One of those people is the person who
created the bug which you're hitting, and that person no longer knows
what's happening.


So please go back and resend all those emails, and retain ALL Cc:'s.  Don't
just send them only to me.  Keep all individuals and all mailing lists on
the email Cc: list.





gcc -m32 -m elf_i386  /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o   -o /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile
gcc: /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o: No such file or directory
gcc: no input files
make: [/usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile] Error 1 (ignored)




--
~Randy


Re: 2.6.34-rc1 eat my photo SD card :-(

2007-11-07 Thread Jens Axboe
On Wed, Nov 07 2007, Roland Dreier wrote:
>  > Well, I spent the last 36 hours (more or less) trying to bisect the SD
>  > problem. The method I used was to insert the card, umount it, and make 8 dd
>  > in a row; the kernel is "bad" if they differ, "good" if they are the same.
>  > 
>  > I could not finish the bisect. The last pair good/bad were:
>  > 
>  > bad:   [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9] 
>  >[BLOCK] blk_rq_map_sg: force clear termination bit
>  > good:  [e38f981758118d829cd40cfe9c09e3fa81e422aa] 
>  >exportfs: update documentation
> 
> Thanks, that helps.  I read over the mmc changes in between those two
> commits, and I think I found the problem... could you please try the
> patch below (on top of the latest kernel) and report back how it
> works?  Unfortunately I am traveling and I don't have an SD card with
> me to test on my laptop...
> 
> Pierre, assuming Romano tests this patch successfully, please apply!
> 
> Thanks,
>   Roland
> 
> <-- patch below -->
> 
> mmc: Fix sg helper copy-and-paste error
> 
> Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
> following bogus change in drivers/mmc/card/queue.c:
> 
> > -   src_buf = page_address(src->page) + src->offset;
> > +   src_buf = sg_virt(dst);
> 
> (Notice that "src" is converted to "dst").  Turn this "dst" back into
> the intended "src".
> 
> Cc: Jens Axboe <[EMAIL PROTECTED]>
> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index 9203a0b..1b9c9b6 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -310,7 +310,7 @@ static void copy_sg(struct scatterlist *dst, unsigned int dst_len,
>   }
>  
>   if (src_size == 0) {
> - src_buf = sg_virt(dst);
> + src_buf = sg_virt(src);
>   src_size = src->length;
>   }
>  

How embarrassing, sorry about that! Pierre, shall I shove this upstream
or will you?
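
To make the effect of the one-word fix concrete, here is a simplified user-space analogue of a scatter-gather copy. It is only a sketch: struct seg and copy_segs() are hypothetical stand-ins for struct scatterlist and copy_sg(), and reading seg->buf plays the role of sg_virt(). The line to look at is the one that reloads the source cursor from the source segment; loading it from the destination segment instead, as in the bogus change quoted above, would copy destination memory onto itself and never read the source data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a scatterlist entry. */
struct seg {
	char *buf;
	size_t length;
};

/* Copy up to src_len bytes from the src segments into the dst segments.
 * Returns the number of bytes actually copied. */
static size_t copy_segs(struct seg *dst, size_t dst_n,
			struct seg *src, size_t src_n, size_t src_len)
{
	char *dst_buf = NULL, *src_buf = NULL;
	size_t dst_size = 0, src_size = 0, copied = 0;

	while (src_len) {
		if (dst_size == 0) {
			if (dst_n == 0)
				break;		/* out of destination space */
			dst_buf = dst->buf;
			dst_size = dst->length;
			dst++, dst_n--;
		}
		if (src_size == 0) {
			if (src_n == 0)
				break;		/* out of source data */
			src_buf = src->buf;	/* must be src, not dst */
			src_size = src->length;
			src++, src_n--;
		}

		/* Copy as much as both current segments allow. */
		size_t chunk = dst_size < src_size ? dst_size : src_size;
		if (chunk > src_len)
			chunk = src_len;

		memcpy(dst_buf, src_buf, chunk);
		dst_buf += chunk;
		dst_size -= chunk;
		src_buf += chunk;
		src_size -= chunk;
		src_len -= chunk;
		copied += chunk;
	}
	return copied;
}
```

Because the segment boundaries of source and destination need not line up, each side keeps its own cursor and remaining size, which is exactly why mixing up the two lists in a single assignment corrupts the data silently.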

-- 
Jens Axboe



Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Denys Fedoryshchenko
You are right, it seems there are no DMA lines in the adapter. hda=nodma helped,
no more errors. I will now also try libata_dma_mask and will mail the result.
Btw, there are no notes about it in Documentation/kernel-parameters.txt.

In any case it is a complete board, a WRAP.2C made by PCEngines in 2003. It is
fairly popular and mass produced, and was previously widely used by StarOS,
probably a known GPL violator, who didn't bother to supply patches but at the
same time used it in his projects.

If this holds for all boards of this revision, maybe it is better to put it
in some kind of fixup/quirk/blacklist, or whatever it is called?

On Wed, 07 Nov 2007 19:41:15 -0600, Robert Hancock wrote
> Denys wrote:
> > Finally i got full DMESG with 1GB card till end. Seems not readable too.
> >
> 
> ...
> 
> > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
> >  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> > ata1.00: status: { DRDY }
> > ata1: soft resetting link
> > ata1.00: configured for MWDMA1
> > sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> > sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
> > Descriptor sense data with sense descriptors (in hex):
> > 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> > 00 00 00 00
> > sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
> > end_request: I/O error, dev sda, sector 0
> > Buffer I/O error on device sda, logical block 0
> > ata1: EH complete
> 
> I'm guessing that your CF-to-IDE adapter doesn't have the correct 
> lines wired up for DMA to work properly, and the card indicates DMA 
> support, which libata tries to use but which doesn't work. It looks 
> like it never tried falling back to PIO after DMA failed. Seems like 
> a deficiency in the speed-down logic?
> 
> -- 
> Robert Hancock  Saskatoon, SK, Canada
> To email, remove "nospam" from [EMAIL PROTECTED]
> Home Page: http://www.roberthancock.com/


--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.



Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change

2007-11-07 Thread Jens Axboe
On Thu, Nov 08 2007, Tejun Heo wrote:
> Greg KH wrote:
> > On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> >> pkt_setup_dev() expects module reference to be held on invocation.
> >> This used to be true for sysfs callbacks but not anymore.  Test and
> >> grab module reference around pkt_setup_dev() in
> >> class_pktcdvd_store_add().
> >>
> >> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> >> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
> >> ---
> >> Greg, can you please push this patch through your tree? 
> >> Thanks a lot.
> >>
> >>  drivers/block/pktcdvd.c |9 +
> >>  1 file changed, 9 insertions(+)
> > 
> > Why through my tree?  I don't do block devices :)
> 
> Because it's a regression introduced by changes in sysfs?
> 
> > Shouldn't Jens or at least Andrew take it?
> 
> That's fine too.  Jens?

Sure, I'm pushing some stuff off today anyway.

-- 
Jens Axboe



Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change

2007-11-07 Thread Tejun Heo
Greg KH wrote:
> On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
>> pkt_setup_dev() expects module reference to be held on invocation.
>> This used to be true for sysfs callbacks but not anymore.  Test and
>> grab module reference around pkt_setup_dev() in
>> class_pktcdvd_store_add().
>>
>> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
>> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
>> ---
>> Greg, can you please push this patch through your tree? 
>> Thanks a lot.
>>
>>  drivers/block/pktcdvd.c |9 +
>>  1 file changed, 9 insertions(+)
> 
> Why through my tree?  I don't do block devices :)

Because it's a regression introduced by changes in sysfs?

> Shouldn't Jens or at least Andrew take it?

That's fine too.  Jens?
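
For readers unfamiliar with the "test and grab a reference around the call" pattern the patch description refers to, here is a single-threaded user-space analogue. Everything in it is hypothetical: struct owner stands in for the module, owner_try_get() and owner_put() for taking and dropping a module reference, setup_dev() for pkt_setup_dev(), and store_add() for the sysfs store callback. It is a sketch of the idea, not the patch itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a module with a reference count. */
struct owner {
	int refs;
	bool unloading;
};

/* "Test and grab": refuse to take a reference once unloading has begun. */
static bool owner_try_get(struct owner *o)
{
	if (o->unloading)
		return false;
	o->refs++;
	return true;
}

static void owner_put(struct owner *o)
{
	o->refs--;
}

/* Hypothetical stand-in for the setup routine; it expects the caller to
 * already hold a reference. */
static int setup_dev(const struct owner *o)
{
	return o->refs > 0 ? 0 : -1;
}

/* Analogue of the store callback: hold the reference only around the call. */
static int store_add(struct owner *o)
{
	int ret;

	if (!owner_try_get(o))
		return -1;
	ret = setup_dev(o);
	owner_put(o);
	return ret;
}
```

The point of the pattern is that the callee's assumption (a reference is held) is made true by the caller itself rather than by whatever layer happens to invoke the callback.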

-- 
tejun


Re: 2.6.24-rc2 breaks nVidia MCP51 High Definition Audio

2007-11-07 Thread Takashi Iwai
At Wed, 7 Nov 2007 19:07:07 -0500 (EST),
Gerhard Mack wrote:
> 
> On Wed, 7 Nov 2007, Andrew Morton wrote:
> 
> > Date: Wed, 7 Nov 2007 15:21:27 -0800
> > From: Andrew Morton <[EMAIL PROTECTED]>
> > To: Gerhard Mack <[EMAIL PROTECTED]>
> > Cc: linux-kernel@vger.kernel.org, Jaroslav Kysela <[EMAIL PROTECTED]>,
> > Takashi Iwai <[EMAIL PROTECTED]>, Rafael J. Wysocki <[EMAIL PROTECTED]>
> > Subject: Re: 2.6.24-rc2 breaks nVidia MCP51 High Definition Audio
> > 
> > > On Wed, 7 Nov 2007 17:39:41 -0500 (EST) Gerhard Mack <[EMAIL PROTECTED]> wrote:
> > > hello,
> > > 
> > > This worked fine in 2.6.23 but now the kernel no longer sees my audio 
> > > controller.
> > > 
> > > 00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev a2)
> > > 00:10.1 0403: 10de:026c (rev a2)
> > > 
> > > Let me know if I can provide more info or test patches.
> > > 
> > 
> > Please provide the output of `dmesg -s 100' for both 2.6.23
> > and 2.6.24-rc3, thanks.
> > 
> > Are you sure that the driver is suitably configured?  Sometimes
> > we like to fiddle config options so that a `make oldconfig' will go and
> > unconfigure drivers which you need.
> 
> Found an option for generic HD audio and enabled that with only marginally 
> better results.  Now instead of not detecting my card it's showing a 
> single volume control in the mixer and not providing any sound at all.
> 
> 2.6.23:
> Advanced Linux Sound Architecture Driver Version 1.0.14 (Fri Jul 20 09:12:58 2007 UTC).
> ACPI: PCI Interrupt Link [AAZA] enabled at IRQ 22
> ACPI: PCI Interrupt :00:10.1[B] -> Link [AAZA] -> GSI 22 (level, low) -> IRQ 22
> PCI: Setting latency timer of device :00:10.1 to 64
> ALSA device list:
>   #0: HDA NVidia at 0xfe024000 irq 22
> GACT probability on
> 
> 2.6.24-rc2:
> Advanced Linux Sound Architecture Driver Version 1.0.15 (Tue Oct 23 06:09:18 2007 UTC).
> ACPI: PCI Interrupt Link [AAZA] enabled at IRQ 22
> ACPI: PCI Interrupt :00:10.1[B] -> Link [AAZA] -> GSI 22 (level, low) -> IRQ 22
> PCI: Setting latency timer of device :00:10.1 to 64
> ieee1394: Host added: ID:BUS[0-00:1023]  GUID[0011d8000101f761]
> ALSA device list:
>   #0: HDA NVidia at 0xfe024000 irq 22
> GACT probability on

Both look OK.

Please show your kernel config and /proc/asound/card0/codec#*
contents.  Did you choose CONFIG_SND_HDA_CODEC_* properly?

Also, please be more specific about your hardware.  The implementation
of HD-audio stuff differs greatly among products.  It's very
important to know what kind of machine it is (h/w vendor, product name,
model, etc) to identify whether the configuration is known or not
(i.e. whether it was really supported or just happened to work).


Takashi


Re: [kvm-devel] [PATCH 2/3] Put the virtio under the virtualization menu

2007-11-07 Thread Avi Kivity
Anthony Liguori wrote:
> This patch moves virtio under the virtualization menu and changes virtio
> devices to not claim to only be for lguest.
>   

Perhaps the virt menu needs to be split into a host-side support menu
and guest-side support menu.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: Use of virtio device IDs

2007-11-07 Thread Avi Kivity
Gregory Haskins wrote:
>
>> PCI means that you can reuse all of the platform's infrastructure for
>> irq allocation, discovery, device hotplug, and management.
>> 
>
> It's tempting to use, yes.  However, most of that infrastructure is
> completely inappropriate for a PV implementation, IMHO.  

Why?

> You are
> probably better off designing something that is PV specific instead of
> shoehorning it in to fit a different model (at least for the things I
> have in mind).  

Well, if we design our pv devices to look like hardware, they will fit
quite well.  Both to the guest OS and to user's expectations.

> It's not a heck of a lot of code to write a pv-centric
> version of these facilities.
>
>   

It is.  Especially if you consider Windows and a gazillion versions of
deployed, non-pv-capable Linux systems.  For pv-friendly newer Linux,
it's probably doable, but why?

Look at the mess Xen finds itself in.

>> You can write it for new guests but backporting it to older guests will be a
>> huge task.
>>
>> We will support non-pci for s390, but in order to support Windows and
>> older Linux PCI is necessary.
>> 
>
> I don't know if I would agree with "necessary".  "Easier" perhaps. ;) By
> definition once you are PV you are hypervisor aware.  Now it's just a
> matter of plugging in the appropriate plumbing to bridge the hypervisor
> to the guest-os.  Some might be easier than others, sure.  But all
> should be extensible to a degree.
>
>   

It's "necessary" in a pragmatic sense: we want to deliver drivers that
provide features for a wide variety of guests in a reasonable
timeframe.  And that means no rewriting guest OS infrastructure.


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 21:20:05 -0800

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

On x86 only.  We could use force_successful_syscall_return()
to make sure the condition codes get set correctly on
other platforms.

But even in that case we'd still be broken when the return
value is exactly -1 and that's what the application is going
to compare against to test for errors.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 16:15:51 +1100

> David Miller writes:
> 
> > I can't see where x86 is doing this though, so perhaps for x86
> > glibc does make the negative value check.  But I doubt it is
> > checking the range 0x80000000-0xffffffff, otherwise mmap() would
> > be busted.
> 
> At least for the INTERNAL_SYSCALL macro in glibc, the error check is:
> 
> #define INTERNAL_SYSCALL_ERROR_P(val, err) \
> >   ((unsigned int) (val) >= 0xfffff001u)
> 
> in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
> in that file does a cmpl $-4095,%eax to test for error.  (There is also
> a PSEUDO_NOERRNO which doesn't test for error.)
> 
> So the convention on (32-bit) x86 is that -4095 .. -1 are error
> values, and other values are successful return values.

Thanks for figuring that out.

Really there is no way to fix sys_times() return values
universally.  Each proposed solution either doesn't fix
the problem, or adds a new failure mode.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 15:59:12 +1100

> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

It still won't fix the problem.

When the return value is (clock_t) -1, all the
force_successful_syscall_return() calls and glibc condition
codes checks in the world are not going to fix the application
code which checks for error using -1.
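
To make the point above concrete, here is a minimal sketch of the application-side check. It assumes a 32-bit clock_t (modelled by the hypothetical `tick_t` typedef); the function names are made up for illustration and are not glibc or kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit clock_t, as on 32-bit x86. */
typedef int32_t tick_t;

/* The error test most applications use: compare against (clock_t) -1. */
int app_thinks_it_failed(tick_t ret)
{
    return ret == (tick_t)-1;
}

/*
 * Reinterpret a raw 32-bit tick counter as the value the caller sees.
 * The unsigned-to-signed conversion is implementation-defined in C;
 * common compilers (gcc, clang) wrap it to the two's complement value.
 */
tick_t ticks_to_return_value(uint32_t ticks)
{
    return (tick_t)ticks;
}

/* One tick before the ambiguous value is fine; the value itself is not. */
void check_minus_one_ambiguity(void)
{
    assert(!app_thinks_it_failed(ticks_to_return_value(0xfffffffeu)));
    assert(app_thinks_it_failed(ticks_to_return_value(0xffffffffu)));
}
```

Whatever the syscall return path or the libc wrapper does with condition codes, a legitimate tick count of 0xffffffff reaches this comparison as -1 and is taken for a failure.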


Re: [poll] Is the megafreeze development model broken?

2007-11-07 Thread Stephen Hemminger
On Wed, 07 Nov 2007 23:56:57 +0100
ciol <[EMAIL PROTECTED]> wrote:

> Hi, I'd like to ask you a few questions:
> 
> * Do you like the way linux distributions integrate the kernel?
> 
> * Wouldn't you prefer they ship with the stable and still maintained 
> 2.6.16.X, while providing optionally the latest kernel for those who 
> want or just have a new hardware?
> 
> * Do you think the megafreeze development model [1] and the "I don't 
> trust in upstream" development model are broken? (And why)
> 
> 
> 
> [1] http://www.modeemi.fi/~tuomov/b/archives/2007/03/03/T19_15_26/
> 
> 
> (I'm going to ask this for several projects, not only the kernel)
> 

It's a free world, do what you want.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>



Re: [kvm-devel] [PATCH 3/3] virtio PCI device

2007-11-07 Thread Avi Kivity
Anthony Liguori wrote:
> This is a PCI device that implements a transport for virtio.  It allows virtio
> devices to be used by QEMU based VMMs like KVM or Xen.
>
>   

Didn't see support for dma. I think that with Amit's pvdma patches you
can support dma-capable devices as well without too much fuss.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 16:36:08 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> Not nearly as much of a fit.  The effect on x86 is that values between
> -4095 and -1 are reported as -1, so the end-start difference will be
> out by less than 41 seconds.  That's not nearly as dramatic as a
> difference of 21 million seconds (over 16 years). :)
> 
> I really think that wrapping at 0x7fffffff makes the situation worse,
> not better.
> 

Sure.

So we need to do what you say: never return an error from sys_times() and
change glibc to not perform error-interpretation on sys_times() return
values and recommend that people bypass libc and go direct to the syscall
so they'll work correctly on older glibc.   Lovely.

I wonder what happens with things like F_GETOWN, shmat() and lseek(/dev/mem)
on x86 (things which use force_successful_syscall_return()).  According
to the comment in include/linux/ptrace.h, glibc should be special-casing
these.


Re: [PATCH] fix incorrect test in trident_ac97_set(); sound/oss/trident.c

2007-11-07 Thread Muli Ben-Yehuda
On Wed, Nov 07, 2007 at 11:04:41AM -0800, Ray Lee wrote:

> On Nov 7, 2007 10:50 AM, Roel Kluin <[EMAIL PROTECTED]> wrote:

> > If count reaches zero, the loop ends, but the postfix decrement
> > still subtracts: testing for 'count == 0' will not work.
> >
> > Signed-off-by: Roel Kluin <[EMAIL PROTECTED]>
> > ---
> > diff --git a/sound/oss/trident.c b/sound/oss/trident.c
> > index 96adc47..6959ee1 100644
> > --- a/sound/oss/trident.c
> > +++ b/sound/oss/trident.c
> > @@ -2935,7 +2935,7 @@ trident_ac97_set(struct ac97_codec *codec, u8 reg, u16 val)
> > do {
> > if ((inw(TRID_REG(card, address)) & busy) == 0)
> > break;
> > -   } while (count--);
> > +   } while (--count);
> >
> > data |= (mask | (reg & AC97_REG_ADDR));
> >
> > @@ -2996,7 +2996,7 @@ trident_ac97_get(struct ac97_codec *codec, u8 reg)
> > data = inl(TRID_REG(card, address));
> > if ((data & busy) == 0)
> > break;
> > -   } while (count--);
> > +   } while (--count);
> >
> > spin_unlock_irqrestore(&card->lock, flags);
> >
> > if (count == 0) {
> >
> 
> Thanks, much better. In the future, please also CC: the appropriate
> maintainers, or Andrew Morton if you're at a loss...

Indeed.

> Reviewed-by: Ray Lee <[EMAIL PROTECTED]>

Acked-by: Muli Ben-Yehuda <[EMAIL PROTECTED]>
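
For reference, a small stand-alone sketch of why the postfix form defeats the later `count == 0` test. The helpers below are hypothetical and only model a timeout loop in which the busy bit never clears; they use a signed counter for simplicity, which is an assumption about the illustration and not a statement about the driver's variable type.

```c
#include <assert.h>

/*
 * Poll with the original `while (count--)` form.  The condition is
 * evaluated first and the decrement happens afterwards, so on timeout
 * the counter ends up one below zero.
 */
int poll_postfix(int count)
{
    do {
        /* device still busy: nothing to do but retry */
    } while (count--);
    return count;   /* what a later `count == 0` test would see */
}

/* Poll with the patched `while (--count)` form (count must be > 0). */
int poll_prefix(int count)
{
    do {
        /* device still busy: nothing to do but retry */
    } while (--count);
    return count;
}

void check_timeout_values(void)
{
    assert(poll_postfix(3) == -1);  /* timeout goes undetected */
    assert(poll_prefix(3) == 0);    /* timeout is detected */
}
```

With an unsigned counter the postfix form would wrap to the maximum value instead of -1, which fails the `count == 0` test just the same.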

Andrew, can you please push to Linus?

Thanks,
Muli


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Brown

On Wed, Nov 07, 2007 at 03:28:33PM -0800, Andrew Morton wrote:

On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:



will return '-1' to user space and set the negated clock_t value to errno.

At minimum, perhaps it should return a sane errno value.


RETURN VALUE
  times()  returns  the  number of clock ticks that have elapsed since an
  arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
  the  moment  the  system  was  booted.   Since Linux 2.6, this point is
  (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
  time.   The  return  value  may  overflow  the  possible  range of type
  clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
  priately.


The strange -1 behavior is enshrined in history.  I think a better answer
is to tell people to use getrusage() if they want a return result without
this problem.

Adding INITIAL_JIFFIES will fix the case where an embedded system is booted
up to run a test and then shut down, and the mask, although it causes
discontinuities periodically at least moves them away from the early boot.

INITIAL_JIFFIES was a good idea, but it is probably best to keep it inside
of the kernel.

David Brown


Re: 2.6.34-rc1 eat my photo SD card :-(

2007-11-07 Thread Pierre Ossman
On Wed, 07 Nov 2007 15:37:46 -0800
Roland Dreier <[EMAIL PROTECTED]> wrote:

> 
> mmc: Fix sg helper copy-and-paste error
> 
> Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
> following bogus change in drivers/mmc/card/queue.c:
> 
> > -   src_buf = page_address(src->page) + src->offset;
> > +   src_buf = sg_virt(dst);
> 
> (Notice that "src" is converted to "dst").  Turn this "dst" back into
> the intended "src".
> 
> Cc: Jens Axboe <[EMAIL PROTECTED]>
> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>

Ouch! Well that was obviously a bug. I wonder how the hell it only explodes for 
Romano. I've been shuffling loads of data using -rc1 without an incident.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

Not nearly as much of a fit.  The effect on x86 is that values between
-4095 and -1 are reported as -1, so the end-start difference will be
out by less than 41 seconds.  That's not nearly as dramatic as a
difference of 21 million seconds (over 16 years). :)

I really think that wrapping at 0x7fffffff makes the situation worse,
not better.
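
The two magnitudes compared above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 100 clock ticks per second (the hypothetical `ASSUMED_HZ` below) and takes the wrap point to be 0x7fffffff, the largest positive 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 100 clock ticks per second. */
#define ASSUMED_HZ 100u

/* Whole seconds represented by a tick count. */
uint32_t ticks_to_seconds(uint32_t ticks)
{
    return ticks / ASSUMED_HZ;
}

/* Whole days represented by a number of seconds. */
uint32_t seconds_to_days(uint32_t seconds)
{
    return seconds / 86400u;
}

void check_magnitudes(void)
{
    /* Error-range case: at most 4095 ticks, i.e. under 41 seconds. */
    assert(ticks_to_seconds(4095u) == 40u);
    /* Wrap case: 0x7fffffff ticks, i.e. about 21 million seconds. */
    assert(ticks_to_seconds(0x7fffffffu) == 21474836u);
}
```

Under these assumptions the wrap corresponds to 21,474,836 seconds, which `seconds_to_days()` puts at 248 whole days.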

Paul.


Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change

2007-11-07 Thread Greg KH
On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> pkt_setup_dev() expects module reference to be held on invocation.
> This used to be true for sysfs callbacks but not anymore.  Test and
> grab module reference around pkt_setup_dev() in
> class_pktcdvd_store_add().
> 
> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
> ---
> Greg, can you please push this patch through your tree? 
> Thanks a lot.
> 
>  drivers/block/pktcdvd.c |9 +
>  1 file changed, 9 insertions(+)

Why through my tree?  I don't do block devices :)

Shouldn't Jens or at least Andrew take it?

thanks,

greg k-h


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 15:59:12 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > "the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> > It appears that glibc treats all of 0x80000000-0xffffffff as an error.
> 
> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

OK

> Wrapping at 0x7fffffff will cause programs to see large negative
> deltas between successive calls when the wrap occurs.  I can see that
> giving userspace fits. :)
> 

Yup.  But userspace will already have a fit if either the start or end time
advanced into the glibc-thought-that-was-an-error range.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
David Miller writes:

> I can't see where x86 is doing this though, so perhaps for x86
> glibc does make the negative value check.  But I doubt it is
> checking the range 0x80000000-0xffffffff, otherwise mmap() would
> be busted.

At least for the INTERNAL_SYSCALL macro in glibc, the error check is:

#define INTERNAL_SYSCALL_ERROR_P(val, err) \
  ((unsigned int) (val) >= 0xfffff001u)

in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
in that file does a cmpl $-4095,%eax to test for error.  (There is also
a PSEUDO_NOERRNO which doesn't test for error.)

So the convention on (32-bit) x86 is that -4095 .. -1 are error
values, and other values are successful return values.
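
The convention described above can be written out as a tiny stand-alone check. This is a sketch of the idea only, not the glibc macro itself; `syscall_result_is_error()` is a hypothetical name, and the threshold is derived from -4095 as stated in the text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * On 32-bit x86 the raw return value is an error exactly when it lies
 * in -4095 .. -1.  Viewed as unsigned, that is the top 4095 values of
 * the 32-bit range, so a single unsigned comparison covers it.
 */
int syscall_result_is_error(int32_t val)
{
    return (uint32_t)val >= (uint32_t)-4095;
}

void check_error_range(void)
{
    assert((uint32_t)-4095 == 0xfffff001u);
    assert(syscall_result_is_error(-1));
    assert(syscall_result_is_error(-4095));
    assert(!syscall_result_is_error(-4096));
}
```

Large negative values such as addresses in the upper half of the address space fall outside this window, which is why mmap() results there are not mistaken for errors.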

Paul.


Re: [PATCH] mm/memory.c: remove warning from an uninitialized spinlock. was: Re: 2.6.21-rc7-mm2

2007-11-07 Thread Borislav Petkov
On Wed, Nov 07, 2007 at 02:20:03PM -0500, Steven Rostedt wrote:
> > 
> > Introduce a macro for suppressing gcc from generating a warning about a
> > probable uninitialized state of a variable.
> > 
> > Example:
> > 
> > -   spinlock_t *ptl;
> > +   spinlock_t *uninitialized_var(ptl);
> > 
> > Not a happy solution, but those warnings are obnoxious.
> > 
> > - Using the usual pointlessly-set-it-to-zero approach wastes several
> >   bytes of text.
> > 
> > - Using a macro means we can (hopefully) do something else if gcc changes
> >   cause the `x = x' hack to stop working
> > 
> > - Using a macro means that people who are worried about hiding true bugs
> >   can easily turn it off.
> > 
> > Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> 
> I just stumbled across this being in the kernel. Well, I'm finally glad
> it made it in, even though it was suggested one year earlier ;-)
> 
>   http://lkml.org/lkml/2006/5/11/50

yeah, this was Andrew's idea. The version in the kernel, in
contrast to yours, doesn't have a config option so you still
have to make really sure you're not aiding any bugs with it.

-- 
Regards/Gruß,
Boris.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> "the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

Not on powerpc.  On powerpc the error indication is carried separately
in a condition register bit.  So a force_successful_syscall_return()
call will make glibc automatically do the right thing without any
glibc changes on powerpc.

Wrapping at 0x7fffffff will cause programs to see large negative
deltas between successive calls when the wrap occurs.  I can see that
giving userspace fits. :)

Paul.



Re: [PATCH] sched: avoid large irq-latencies in smp-balancing

2007-11-07 Thread Gregory Haskins
Peter Zijlstra wrote:
> Bah, missed a hunk
> 
> ---
> Subject: sched: avoid large irq-latencies in smp-balancing
> 
> SMP balancing is done with IRQs disabled and can iterate the full rq. When rqs
> are large this can cause large irq-latencies. Limit the nr of iterations on
> each run.
> 
> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
> CC: Peter Williams <[EMAIL PROTECTED]>

Tested-by: Gregory Haskins <[EMAIL PROTECTED]> (as part of 23.1-rt11)
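
As a rough illustration of the bounded walk that the quoted patch introduces, here is a user-space sketch. The `struct task` list and both helpers are hypothetical stand-ins for the runqueue iterator; only the loop condition mirrors the `!p || loops++ > sysctl_sched_nr_migrate` test in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued task. */
struct task {
    struct task *next;
};

/* Chain an array of n tasks into a NULL-terminated list. */
void link_tasks(struct task *t, size_t n)
{
    for (size_t i = 0; i < n; i++)
        t[i].next = (i + 1 < n) ? &t[i + 1] : NULL;
}

/*
 * Walk the list, stopping at the end or once the loop counter exceeds
 * nr_migrate.  Because the counter is compared before it is
 * incremented, up to nr_migrate + 1 tasks are examined.
 */
unsigned int walk_bounded(const struct task *p, unsigned int nr_migrate)
{
    unsigned int loops = 0, visited = 0;

    while (p && !(loops++ > nr_migrate)) {
        visited++;
        p = p->next;
    }
    return visited;
}

void check_bounded_walk(void)
{
    struct task t[10];

    link_tasks(t, 10);
    assert(walk_bounded(t, 4) == 5);    /* cut short by the limit */
    assert(walk_bounded(t, 32) == 10);  /* list ends first */
}
```

The cap trades completeness of a single balance pass for a bounded amount of work while interrupts are off, which is the latency concern the changelog describes.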

> ---
>  include/linux/sched.h |1 +
>  kernel/sched.c|   15 ++-
>  kernel/sysctl.c   |8 
>  3 files changed, 19 insertions(+), 5 deletions(-)
> 
> Index: linux-2.6-2/kernel/sched.c
> ===
> --- linux-2.6-2.orig/kernel/sched.c
> +++ linux-2.6-2/kernel/sched.c
> @@ -474,6 +474,12 @@ const_debug unsigned int sysctl_sched_fe
>  #define sched_feat(x) (sysctl_sched_features & SCHED_FEAT_##x)
>  
>  /*
> + * Number of tasks to iterate in a single balance run.
> + * Limited because this is done with IRQs disabled.
> + */
> +const_debug unsigned int sysctl_sched_nr_migrate = 32;
> +
> +/*
>   * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
>   * clock constructed from sched_clock():
>   */
> @@ -2237,7 +2243,7 @@ balance_tasks(struct rq *this_rq, int th
> enum cpu_idle_type idle, int *all_pinned,
> int *this_best_prio, struct rq_iterator *iterator)
>  {
> - int pulled = 0, pinned = 0, skip_for_load;
> + int loops = 0, pulled = 0, pinned = 0, skip_for_load;
>   struct task_struct *p;
>   long rem_load_move = max_load_move;
>  
> @@ -2251,10 +2257,10 @@ balance_tasks(struct rq *this_rq, int th
>*/
>   p = iterator->start(iterator->arg);
>  next:
> - if (!p)
> + if (!p || loops++ > sysctl_sched_nr_migrate)
>   goto out;
>   /*
> -  * To help distribute high priority tasks accross CPUs we don't
> +  * To help distribute high priority tasks across CPUs we don't
>* skip a task if it will be the highest priority task (i.e. smallest
>* prio value) on its new queue regardless of its load weight
>*/
> @@ -2271,8 +2277,7 @@ next:
>   rem_load_move -= p->se.load.weight;
>  
>   /*
> -  * We only want to steal up to the prescribed number of tasks
> -  * and the prescribed amount of weighted load.
> +  * We only want to steal up to the prescribed amount of weighted load.
>*/
>   if (rem_load_move > 0) {
>   if (p->prio < *this_best_prio)
> Index: linux-2.6-2/kernel/sysctl.c
> ===
> --- linux-2.6-2.orig/kernel/sysctl.c
> +++ linux-2.6-2/kernel/sysctl.c
> @@ -298,6 +298,14 @@ static struct ctl_table kern_table[] = {
>   .mode   = 0644,
>   .proc_handler   = &proc_dointvec,
>   },
> + {
> + .ctl_name   = CTL_UNNUMBERED,
> + .procname   = "sched_nr_migrate",
> + .data   = &sysctl_sched_nr_migrate,
> + .maxlen = sizeof(unsigned int),
> + .mode   = 644,
> + .proc_handler   = &proc_dointvec,
> + },
>  #endif
>   {
>   .ctl_name   = CTL_UNNUMBERED,
> Index: linux-2.6-2/include/linux/sched.h
> ===
> --- linux-2.6-2.orig/include/linux/sched.h
> +++ linux-2.6-2/include/linux/sched.h
> @@ -1466,6 +1466,7 @@ extern unsigned int sysctl_sched_batch_w
>  extern unsigned int sysctl_sched_child_runs_first;
>  extern unsigned int sysctl_sched_features;
>  extern unsigned int sysctl_sched_migration_cost;
> +extern unsigned int sysctl_sched_nr_migrate;
>  #endif
>  
>  extern unsigned int sysctl_sched_compat_yield;
> 
> 


Re: [PATCH 0/2] MN10300: Add the MN10300 architecture to Linux kernel [try #3]

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 05:43:23PM +, David Howells wrote:
> 
> 
> These patches add the MEI/Panasonic MN10300/AM33 architecture to the Linux
> kernel.
> 
> The first patch suppresses AOUT support in the kernel if CONFIG_BINFMT_AOUT=n
> and CONFIG_IA32_AOUT=n.  MN10300 does not support the AOUT binfmt, so the ELF
> binfmt should not be permitted to go looking for AOUT libraries to load, nor
> should random bits of the kernel depend on asm/a.out.h.
> 
> The second patch adds the architecture itself, to be selected by ARCH=mn10300
> on the make command line.
> 
> The patches can also be downloaded from:
> 
>   http://people.redhat.com/~dhowells/mn10300/mn10300-arch.tar.bz2


The patch to include/asm-generic/Kbuild.asm doesn't seem to be required.


+#elif defined(__mn10300__)

Please use a CONFIG_ variable in such cases.


The parts outside arch/mn10300/ and include/asm-mn10300/ (except for the 
trivial "&& {,!}MN10300" Kconfig changes) should go separately through 
the maintainers or get ACKs from the maintainers, even more since they 
also contain cleanups like

-   .regions = {ERASEINFO(0x01000,64),
+   .regions= {
+   ERASEINFO(0x01000,64),
}


--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
...
+extern void __kprobes arch_remove_kprobe(struct kprobe *p);

This looks as if it will break compilation on avr32 and sparc64.


> A suitable toolchain can be downloaded from:
> 
>   ftp://ftp.redhat.com/pub/redhat/gnupro/AM33/
>...

What is the status of support in upstream GNU binutils and GNU gcc?

> David

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 11:52:32PM +0100, Adrian Bunk wrote:
> On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > But on the other hand, it seems that only the ASIX code will work
> > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > > 
> > > I'm not seeing the problem.
> > > 
> > > Which configuration will be handled wrongly?
> > 
> > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > since MII depends on NET_ETHERNET, and (last I knew) the
> > reverse dependencies didn't capture the complete dependency
> > tree, selecting only MII would leave out some stuff.
> 
> Except for one s390 net driver (I'll check why it's doing this) the 
> NET_ETHERNET option does not influence what code is being generated - 
> it's just a Kconfig-internal option allowing to disable a huge bunch
> of drivers at once.

Damn, I shouldn't have only grep'ed under drivers/.

@davem:

Please look at net/ipv4/arp.c:arp_process()

Am I right that CONFIG_NET_ETHERNET=n and CONFIG_NETDEV_1000=y or 
CONFIG_NETDEV_10000=y will not be handled correctly there?

And the best solution is to nuke all #ifdef's in this function and make 
the code unconditionally available?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 06:53:48PM -0800, David Brownell wrote:
> On Wednesday 07 November 2007, Adrian Bunk wrote:
> > On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > > But on the other hand, it seems that only the ASIX code will work
> > > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > > > 
> > > > I'm not seeing the problem.
> > > > 
> > > > Which configuration will be handled wrongly?
> > > 
> > > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > > since MII depends on NET_ETHERNET, and (last I knew) the
> > > reverse dependencies didn't capture the complete dependency
> > > tree, selecting only MII would leave out some stuff.
> > 
> > Except for one s390 net driver (I'll check why it's doing this) the 
> > NET_ETHERNET option does not influence what code is being generated - 
> > it's just a Kconfig-internal option allowing to disable a huge bunch
> > of drivers at once.
> 
> Drivers like ... AX88xxx, DM9601, and MCS7830!!  Except as
> it turns out, only the first one behaves as intended.
> 
> You can tell it's a problem by the way it's inconsistent,
> regardless of the details of the problem.  :)

I'm all for cleanups that make things consistent.  :)

As long as we can agree that there's a difference between a problem like 
a compile or runtime error and an opportunity for making things 
consistent.

> - Dave

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH] sysctl: Check length at deprecated_sysctl_warning.

2007-11-07 Thread Andrew Morton
> On Thu, 08 Nov 2007 11:57:26 +0900 Tetsuo Handa <[EMAIL PROTECTED]> wrote:
> Original patch assumed args->nlen < CTL_MAXNAME, but it can be false.
> 
> Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>
> 
> 
> --- linux-2.6.22-rc2.orig/kernel/sysctl.c 2007-11-08 10:38:17.0 +0900
> +++ linux-2.6.22-rc2/kernel/sysctl.c  2007-11-08 11:24:27.0 +0900
> @@ -2609,6 +2609,10 @@ static int deprecated_sysctl_warning(str
>   int name[CTL_MAXNAME];
>   int i;
>  
> + /* Check args->nlen. */
> + if (args->nlen > CTL_MAXNAME)
> + return -EFAULT;
> +
>   /* Read in the sysctl name for better debug message logging */
>   for (i = 0; i < args->nlen; i++)
>   if (get_user(name[i], args->name + i))

Well that would have been a nice roothole for someone.  Thanks.
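
The pattern being fixed is a length check ahead of a copy into a fixed-size array. Below is a user-space sketch of the same shape; `MAXNAME` and `copy_name()` are hypothetical stand-ins (for CTL_MAXNAME and the get_user() loop) and a plain array read replaces the user-space access.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for CTL_MAXNAME; the value is chosen for illustration. */
#define MAXNAME 10

/*
 * Copy nlen entries from src into dst, which holds exactly MAXNAME
 * entries.  Without the length check, any nlen above MAXNAME would
 * write past the end of dst.  A negative nlen simply copies nothing.
 */
int copy_name(int *dst, const int *src, int nlen)
{
    if (nlen > MAXNAME)
        return -EFAULT;

    for (int i = 0; i < nlen; i++)
        dst[i] = src[i];
    return 0;
}

void check_copy_name(void)
{
    int src[MAXNAME + 2] = { 1, 2, 3 };
    int dst[MAXNAME] = { 0 };

    assert(copy_name(dst, src, 3) == 0 && dst[2] == 3);
    assert(copy_name(dst, src, MAXNAME + 1) == -EFAULT);
}
```

A length equal to MAXNAME is still accepted, since it fills the array exactly; only larger values are rejected.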


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 19:07:14 -0800

> It appears that glibc treats all of 0x80000000-0xffffffff as an
> error.

glibc treats it as an error if the system call returns with
the carry condition code set.  At least that's how I've
understood it to work and at a minimum this is how it works
on sparc, ppc, ia64, mips, etc.

The error indication is being created by the system call return path
in the kernel.  It tests for values between -512 and 0, and marks
those as errors unless force_successful_syscall_return() has been called.

I can't see where x86 is doing this though, so perhaps for x86
glibc does make the negative value check.  But I doubt it is
checking the range 0x80000000-0xffffffff, otherwise mmap() would
be busted.
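
A toy model of the condition-code convention described above may help. Everything here is hypothetical: `struct syscall_ret` stands in for the register value plus the carry bit, the 512 limit is taken from the description in this message (read here as -512 up to, but not including, 0), and the `force_success` argument models the effect of force_successful_syscall_return().

```c
#include <assert.h>

/* Hypothetical model of what the return path hands to userspace. */
struct syscall_ret {
    long value;      /* raw return value */
    int error_flag;  /* models the carry / condition-code bit */
};

/*
 * Model of the kernel return path: small negative values are flagged
 * as errors unless the syscall has forced a successful return.
 */
struct syscall_ret finish_syscall(long value, int force_success)
{
    struct syscall_ret r = { value, 0 };

    if (!force_success && value < 0 && value >= -512)
        r.error_flag = 1;
    return r;
}

/* Model of the libc side: it looks only at the flag, not the value. */
int libc_sees_error(struct syscall_ret r)
{
    return r.error_flag;
}

void check_condition_code_model(void)
{
    assert(libc_sees_error(finish_syscall(-1, 0)));
    assert(!libc_sees_error(finish_syscall(-1, 1)));
}
```

In this model a forced-success return of -1 passes through the libc layer unflagged, leaving only the caller's own comparison against -1 to misread it.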



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
> 
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.

Changing glibc might be hard ;)

> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
> 
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

"the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
It appears that glibc treats all of 0x80000000-0xffffffff as an error.


[PATCH] sysctl: Check length at deprecated_sysctl_warning.

2007-11-07 Thread Tetsuo Handa
Original patch assumed args->nlen < CTL_MAXNAME, but it can be false.

Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>


--- linux-2.6.22-rc2.orig/kernel/sysctl.c   2007-11-08 10:38:17.0 +0900
+++ linux-2.6.22-rc2/kernel/sysctl.c   2007-11-08 11:24:27.0 +0900
@@ -2609,6 +2609,10 @@ static int deprecated_sysctl_warning(str
    int name[CTL_MAXNAME];
    int i;
 
+   /* Check args->nlen. */
+   if (args->nlen > CTL_MAXNAME)
+       return -EFAULT;
+
    /* Read in the sysctl name for better debug message logging */
    for (i = 0; i < args->nlen; i++)
        if (get_user(name[i], args->name + i))
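
The pattern the patch applies, validating a caller-supplied length before it is used to index a fixed-size array, can be sketched in plain user-space C. This is only an illustration: MAX_NAME and copy_name are hypothetical stand-ins, and the extra check for a negative length is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_NAME 10   /* stand-in for CTL_MAXNAME; the value is illustrative */

/* Copy nlen ints from src into a fixed-size buffer, refusing lengths
 * that would overflow it. Returns 0 on success, -EFAULT otherwise. */
static int copy_name(int dst[MAX_NAME], const int *src, int nlen)
{
    int i;

    assert(dst != NULL && src != NULL);

    /* The upper bound mirrors the patch; the lower bound is extra. */
    if (nlen < 0 || nlen > MAX_NAME)
        return -EFAULT;

    for (i = 0; i < nlen; i++)
        dst[i] = src[i];
    return 0;
}
```

Without the bound, a length larger than the array would let the loop write past the end of the on-stack buffer.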


Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-07 Thread David Brownell
On Wednesday 07 November 2007, Adrian Bunk wrote:
> On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > But on the other hand, it seems that only the ASIX code will work
> > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > > 
> > > I'm not seeing the problem.
> > > 
> > > Which configuration will be handled wrongly?
> > 
> > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > since MII depends on NET_ETHERNET, and (last I knew) the
> > reverse dependencies didn't capture the complete dependency
> > tree, selecting only MII would leave out some stuff.
> 
> Except for one s390 net driver (I'll check why it's doing this) the 
> NET_ETHERNET option does not influence what code is being generated - 
> it's just a Kconfig-internal option allowing to disable a huge bunch
> of drivers at once.

Drivers like ... AX88xxx, DM9601, and MCS7830!!  Except as
it turns out, only the first one behaves as intended.

You can tell it's a problem by the way it's inconsistent,
regardless of the details of the problem.  :)

- Dave


Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?

2007-11-07 Thread Matt Mackall
On Thu, Nov 08, 2007 at 02:20:58AM +0100, Andi Kleen wrote:
> 
> > But I think we'd be best off stashing a single bit somewhere and
> > checking it at migrate time (relatively infrequent) rather than
> > copying and zeroing out a potentially enormous affinity mask every
> > time we disable migration (often, and in fast paths). Perhaps adding
> > TASK_PINNED to the task state flags would do it?
> 
> It would need to be a count to be able to nest it.

Ahh, right. Suppose that means fattening the task struct until someone
comes up with something more clever.
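
A rough sketch of why a count nests where a single bit does not. The struct and function names here are hypothetical; the real task_struct has no pin_count field, and this is user-space C rather than scheduler code.

```c
#include <assert.h>

/* Hypothetical per-task state; not a real task_struct member. */
struct task {
    unsigned int pin_count;   /* > 0 means "do not migrate" */
};

/* Cheap in the fast path: one increment, no affinity mask copy. */
static void pin_task(struct task *t)
{
    t->pin_count++;
}

static void unpin_task(struct task *t)
{
    assert(t->pin_count > 0);   /* an unbalanced unpin is a bug */
    t->pin_count--;
}

/* Checked at migrate time, which is comparatively infrequent. */
static int task_may_migrate(const struct task *t)
{
    return t->pin_count == 0;
}
```

With a single flag, the inner unpin of a nested pair would clear the bit while the outer section still expects to stay on its CPU; the counter only reaches zero when the outermost section ends.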

-- 
Mathematics is the supreme nostalgia of our time.


[PATCH 3/3] virtio PCI device

2007-11-07 Thread Anthony Liguori
This is a PCI device that implements a transport for virtio.  It allows virtio
devices to be used by QEMU based VMMs like KVM or Xen.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 9e33fc4..c81e0f3 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -6,3 +6,20 @@ config VIRTIO
 config VIRTIO_RING
bool
depends on VIRTIO
+
+config VIRTIO_PCI
+   tristate "PCI driver for virtio devices (EXPERIMENTAL)"
+   depends on PCI && EXPERIMENTAL
+   select VIRTIO
+   select VIRTIO_RING
+   ---help---
+ This driver provides support for virtio based paravirtual device
+ drivers over PCI.  This requires that your VMM has appropriate PCI
+ virtio backends.  Most QEMU based VMMs should support these devices
+ (like KVM or Xen).
+
+ Currently, the ABI is not considered stable so there is no guarantee
+ that this version of the driver will work with your VMM.
+
+ If unsure, say M.
+  
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index f70e409..cc84999 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_VIRTIO) += virtio.o
 obj-$(CONFIG_VIRTIO_RING) += virtio_ring.o
+obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
new file mode 100644
index 000..85ae096
--- /dev/null
+++ b/drivers/virtio/virtio_pci.c
@@ -0,0 +1,469 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_AUTHOR("Anthony Liguori <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("virtio-pci");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1");
+
+/* Our device structure */
+struct virtio_pci_device
+{
+   /* the virtio device */
+   struct virtio_device vdev;
+   /* the PCI device */
+   struct pci_dev *pci_dev;
+   /* the IO mapping for the PCI config space */
+   void *ioaddr;
+
+   spinlock_t lock;
+   struct list_head virtqueues;
+};
+
+struct virtio_pci_vq_info
+{
+   /* the number of entries in the queue */
+   int num;
+   /* the number of pages the device needs for the ring queue */
+   int n_pages;
+   /* the index of the queue */
+   int queue_index;
+   /* the struct page of the ring queue */
+   struct page *pages;
+   /* the virtual address of the ring queue */
+   void *queue;
+   /* a pointer to the virtqueue */
+   struct virtqueue *vq;
+   /* the node pointer */
+   struct list_head node;
+};
+
+/* We have to enumerate here all virtio PCI devices. */
+static struct pci_device_id virtio_pci_id_table[] = {
+   { 0x5002, 0x2258, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dummy entry */
+   { 0 },
+};
+
+MODULE_DEVICE_TABLE(pci, virtio_pci_id_table);
+
+/* A PCI device has its own struct device and so does a virtio device so
+ * we create a place for the virtio devices to show up in sysfs.  I think it
+ * would make more sense for virtio to not insist on having its own device. */
+static struct device virtio_pci_root = {
+   .parent = NULL,
+   .bus_id = "virtio-pci",
+};
+
+/* Unique numbering for devices under the kvm root */
+static unsigned int dev_index;
+
+/* Convert a generic virtio device to our structure */
+static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
+{
+   return container_of(vdev, struct virtio_pci_device, vdev);
+}
+
+/* virtio config->feature() implementation */
+static bool vp_feature(struct virtio_device *vdev, unsigned bit)
+{
+   struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+   u32 mask;
+
+   /* Since this function is supposed to have the side effect of
+* enabling a queried feature, we simulate that by doing a read
+* from the host feature bitmask and then writing to the guest
+* feature bitmask */
+   mask = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+   if (mask & (1 << bit)) {
+   mask |= (1 << bit);
+   iowrite32(mask, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+   }
+
+   return !!(mask & (1 << bit));
+}
+
+/* virtio config->get() implementation */
+static void vp_get(struct virtio_device *vdev, unsigned offset,
+  void *buf, unsigned len)
+{
+   struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+   void *ioaddr = vp_dev->ioaddr + VIRTIO_PCI_CONFIG + offset;
+
+   /* We translate appropriately sized get requests into more natural
+* IO operations.  These functions also take care of endianness
+* conversion. */
+   switch (len) {
+   case 1: {
+   u8 val;
+   val = ioread8(ioaddr);
+   memcpy(buf, &val, sizeof(val));
+   break;
+   }
+   case 2: {
+   u16 val;
+   val = 

[PATCH 2/3] Put the virtio under the virtualization menu

2007-11-07 Thread Anthony Liguori
This patch moves virtio under the virtualization menu and changes virtio
devices to not claim to only be for lguest.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/Kconfig b/drivers/Kconfig
index f4076d9..d945ffc 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -93,6 +93,4 @@ source "drivers/auxdisplay/Kconfig"
 source "drivers/kvm/Kconfig"
 
 source "drivers/uio/Kconfig"
-
-source "drivers/virtio/Kconfig"
 endmenu
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 4d0119e..be4b224 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -429,6 +429,7 @@ config VIRTIO_BLK
tristate "Virtio block driver (EXPERIMENTAL)"
depends on EXPERIMENTAL && VIRTIO
---help---
- This is the virtual block driver for lguest.  Say Y or M.
+ This is the virtual block driver for virtio.  It can be used with
+  lguest or QEMU based VMMs (like KVM or Xen).  Say Y or M.
 
 endif # BLK_DEV
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 6569206..ac4bcdf 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -50,5 +50,6 @@ config KVM_AMD
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/lguest/Kconfig
+source drivers/virtio/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 86b8641..e66aec4 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -3107,6 +3107,7 @@ config VIRTIO_NET
tristate "Virtio network driver (EXPERIMENTAL)"
depends on EXPERIMENTAL && VIRTIO
---help---
- This is the virtual network driver for lguest.  Say Y or M.
+ This is the virtual network driver for virtio.  It can be used with
+  lguest or QEMU based VMMs (like KVM or Xen).  Say Y or M.
 
 endif # NETDEVICES


[PATCH 1/3] Export vring functions for modules to use

2007-11-07 Thread Anthony Liguori
This is needed for the virtio PCI device to be compiled as a module.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 0e1bf05..3f28b47 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -260,6 +260,8 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
return IRQ_HANDLED;
 }
 
+EXPORT_SYMBOL_GPL(vring_interrupt);
+
 static struct virtqueue_ops vring_vq_ops = {
.add_buf = vring_add_buf,
.get_buf = vring_get_buf,
@@ -306,8 +308,12 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
return &vq->vq;
 }
 
+EXPORT_SYMBOL_GPL(vring_new_virtqueue);
+
 void vring_del_virtqueue(struct virtqueue *vq)
 {
kfree(to_vvq(vq));
 }
 
+EXPORT_SYMBOL_GPL(vring_del_virtqueue);
+


[PATCH 0/3] virtio PCI driver

2007-11-07 Thread Anthony Liguori
This patch series implements a PCI driver for virtio.  This allows virtio
devices (like block and network) to be used in QEMU/KVM.  I'll post a very
early KVM userspace backend in kvm-devel for those that are interested.

This series depends on the two virtio fixes I've posted and Rusty's config_ops
refactoring.  I've tested with these patches on Rusty's experimental virtio
tree.


Re: [PATCH] virtio config_ops refactoring

2007-11-07 Thread Anthony Liguori

Rusty Russell wrote:
> On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
>> I would prefer that the virtio API not expose a little endian standard.
>> I'm currently converting config->get() ops to ioreadXX depending on the
>> size which already does the endianness conversion for me so this just
>> messes things up.  I think it's better to let the backend deal with
>> endianness since it's trivial to handle for both the PCI backend and the
>> lguest backend (lguest doesn't need to do any endianness conversion).
>
> -ETOOMUCHMAGIC.  We should either expose all the XX interfaces (but this
> isn't a high-speed interface, so let's not) or not "sometimes" convert
> endianness.  Getting surprises because a field happens to be packed into
> 4 bytes is counter-intuitive.

Then I think it's necessary to expose the XX interfaces.  Otherwise, the
backend has to deal with doing all register operations at a per-byte
granularity which adds a whole lot of complexity on a per-device basis
(as opposed to a little complexity once in the transport layer).

You really want to be able to rely on multi-byte atomic operations too
when setting values.  Otherwise, you need another register just to
signal when it's okay for the device to examine any given register.

Regards,

Anthony Liguori

> Since your most trivial implementation is to do a byte at a time, I don't
> think you have a good argument on that basis either.
>
> Cheers,
> Rusty.
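
For readers following the argument, here is a minimal sketch of the "byte at a time" option, assuming config space is modelled as a plain byte array holding little-endian fields. read_le is a hypothetical helper, not part of the virtio API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a little-endian value of len bytes (1..8) one byte at a time.
 * The result does not depend on the host's byte order. */
static uint64_t read_le(const uint8_t *cfg, size_t len)
{
    uint64_t v = 0;
    size_t i;

    assert(len >= 1 && len <= 8);
    for (i = 0; i < len; i++)
        v |= (uint64_t)cfg[i] << (8 * i);
    return v;
}
```

This always yields the same value regardless of field size, which is the consistency argued for above, but each byte is fetched separately, so nothing makes the multi-byte read atomic with respect to the device. That is the concern raised in the reply.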




[PATCH RFC 4/7] x86: unify pgtable*.h

2007-11-07 Thread Jeremy Fitzhardinge
All x86 modes and architectures have very similar pagetable
structures: the page flags, the accessors for testing/setting them,
and the combinations of page flags used for kernel and usermode
mappings are all the same.  The main difference is between 32 and
64-bit pagetable entries, with the latter supporting the NX bit.

The most significant difference between the modes/architectures is the
number of levels in the pagetable (4 for 64-bit, 3 for 32-bit/PAE, 2
for non-PAE 32-bit).  This accounts for the remaining code in the
various mode-specific headers.

I've tried to avoid changing formatting as much as possible, so that
the code motion is more obvious.  A subsequent patch will clean things
up in place.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/pgtable-2level.h |   21 --
 include/asm-x86/pgtable-3level.h |   40 
 include/asm-x86/pgtable.h|  318 ++
 include/asm-x86/pgtable_32.h |  204 
 include/asm-x86/pgtable_64.h |  225 --
 5 files changed, 331 insertions(+), 477 deletions(-)

===
--- a/include/asm-x86/pgtable-2level.h
+++ b/include/asm-x86/pgtable-2level.h
@@ -24,16 +24,13 @@ static inline void native_set_pmd(pmd_t 
 {
*pmdp = pmd;
 }
-#ifndef CONFIG_PARAVIRT
-#define set_pte(pteptr, pteval) native_set_pte(pteptr, pteval)
-#define set_pte_at(mm,addr,ptep,pteval) native_set_pte_at(mm, addr, ptep, pteval)
-#define set_pmd(pmdptr, pmdval) native_set_pmd(pmdptr, pmdval)
-#endif
 
+#undef set_pte_atomic
 #define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
 #define set_pte_present(mm,addr,ptep,pteval) set_pte_at(mm,addr,ptep,pteval)
 
 #define pte_clear(mm,addr,xp)  do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
+#undef pmd_clear
 #define pmd_clear(xp)  do { set_pmd(xp, __pmd(0)); } while (0)
 
 static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
@@ -50,12 +47,6 @@ static inline pte_t native_ptep_get_and_
 #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
 #endif
 
-#define pte_page(x) pfn_to_page(pte_pfn(x))
-#define pte_none(x) (!(x).pte_low)
-#define pte_pfn(x) (pte_val(x) >> PAGE_SHIFT)
-#define pfn_pte(pfn, prot) __pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
-#define pfn_pmd(pfn, prot) __pmd(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
-
 /*
  * All present pages are kernel-executable:
  */
@@ -64,17 +55,13 @@ static inline int pte_exec_kernel(pte_t 
return 1;
 }
 
+#define __supported_pte_mask   (~0ul)
+
 /*
  * Bits 0, 6 and 7 are taken, split up the 29 bits of offset
  * into this range:
  */
 #define PTE_FILE_MAX_BITS  29
-
-#define pte_to_pgoff(pte) \
-   ((((pte).pte_low >> 1) & 0x1f ) + (((pte).pte_low >> 8) << 5 ))
-
-#define pgoff_to_pte(off) \
-   ((pte_t) { (((off) & 0x1f) << 1) + (((off) >> 5) << 8) + _PAGE_FILE })
 
 /* Encode and de-code a swap entry */
 #define __swp_type(x)  (((x).val >> 1) & 0x1f)
===
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -94,17 +94,6 @@ static inline void native_pmd_clear(pmd_
*(tmp + 1) = 0;
 }
 
-#ifndef CONFIG_PARAVIRT
-#define set_pte(ptep, pte) native_set_pte(ptep, pte)
-#define set_pte_at(mm, addr, ptep, pte) native_set_pte_at(mm, addr, ptep, pte)
-#define set_pte_present(mm, addr, ptep, pte) native_set_pte_present(mm, addr, ptep, pte)
-#define set_pte_atomic(ptep, pte)  native_set_pte_atomic(ptep, pte)
-#define set_pmd(pmdp, pmd) native_set_pmd(pmdp, pmd)
-#define set_pud(pudp, pud) native_set_pud(pudp, pud)
-#define pte_clear(mm, addr, ptep)  native_pte_clear(mm, addr, ptep)
-#define pmd_clear(pmd) native_pmd_clear(pmd)
-#endif
-
 /*
  * Pentium-II erratum A13: in PAE mode we explicitly have to flush
  * the TLB via cr3 if the top-level pgd is changed...
@@ -119,10 +108,6 @@ static inline void pud_clear (pud_t * pu
 #define pud_page_vaddr(pud) \
 ((unsigned long) __va(pud_val(pud) & PAGE_MASK))
 
-
-/* Find an entry in the second-level page table.. */
-#define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
-   pmd_index(address))
 
 #ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
@@ -146,38 +131,13 @@ static inline int pte_same(pte_t a, pte_
return a.pte_low == b.pte_low && a.pte_high == b.pte_high;
 }
 
-#define pte_page(x) pfn_to_page(pte_pfn(x))
-
-static inline int pte_none(pte_t pte)
-{
-   return !pte.pte_low && !pte.pte_high;
-}
-
-static inline unsigned long pte_pfn(pte_t pte)
-{
-   return pte_val(pte) >> PAGE_SHIFT;
-}
-
 extern unsigned long 

[PATCH RFC 2/7] x86: clean up mm/init_32.c

2007-11-07 Thread Jeremy Fitzhardinge
Some code reformatting in init_32.c.  No functional change.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 arch/x86/mm/init_32.c |   31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

===
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -165,16 +165,25 @@ static void __init kernel_physical_mappi
pmd = one_md_table_init(pgd);
if (pfn >= max_low_pfn)
continue;
-   for (pmd_idx = 0; pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn; pmd++, pmd_idx++) {
+   for (pmd_idx = 0;
+pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn;
+pmd++, pmd_idx++) {
unsigned int address = pfn * PAGE_SIZE + PAGE_OFFSET;
 
-   /* Map with big pages if possible, otherwise create normal page tables. */
+   /* Map with big pages if possible, otherwise
+  create normal page tables. */
if (cpu_has_pse) {
-   unsigned int address2 = (pfn + PTRS_PER_PTE - 1) * PAGE_SIZE + PAGE_OFFSET + PAGE_SIZE-1;
-   if (is_kernel_text(address) || is_kernel_text(address2))
-   set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE_EXEC));
-   else
-   set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE));
+   unsigned int address2;
+   pgprot_t prot = PAGE_KERNEL_LARGE;
+
+   address2 = (pfn + PTRS_PER_PTE - 1) * PAGE_SIZE +
+   PAGE_OFFSET + PAGE_SIZE-1;
+
+   if (is_kernel_text(address) ||
+   is_kernel_text(address2))
+   prot = PAGE_KERNEL_LARGE_EXEC;
+
+   set_pmd(pmd, pfn_pmd(pfn, prot));
 
pfn += PTRS_PER_PTE;
} else {
@@ -183,10 +192,12 @@ static void __init kernel_physical_mappi
for (pte_ofs = 0;
 pte_ofs < PTRS_PER_PTE && pfn < max_low_pfn;
 pte++, pfn++, pte_ofs++, address += PAGE_SIZE) {
+   pgprot_t prot = PAGE_KERNEL;
+
if (is_kernel_text(address))
-   set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
-   else
-   set_pte(pte, pfn_pte(pfn, PAGE_KERNEL));
+   prot = PAGE_KERNEL_EXEC;
+
+   set_pte(pte, pfn_pte(pfn, prot));
}
}
}



[PATCH RFC 0/7] Unify asm-x86/pgtable.h and page.h

2007-11-07 Thread Jeremy Fitzhardinge
NB: RFC ONLY.  DO NOT APPLY.

This series unifies many definitions in asm-x86/pgtable.h and page.h.
Later in the series, I take advantage of some of the earlier
infrastructure to simplify paravirt.h and bits of the Xen code.

This patch applies on top of Glauber's 64-bit pvops unification, so it
won't apply directly to the current tree.

I've tested all the 32-bit combinations
(paravirt/non-paravirt/PAE/non-PAE), but haven't set up a 64-bit test
box yet.

The diffstat of the pure unification bits is nice:
 arch/x86/mm/init_32.c|   31 ++-
 arch/x86/mm/init_64.c|3 
 arch/x86/xen/enlighten.c |8 
 arch/x86/xen/mmu.c   |   67 +++-
 arch/x86/xen/mmu.h   |   26 ---
 include/asm-x86/page.h   |   49 -
 include/asm-x86/page_32.h|   77 +
 include/asm-x86/page_64.h|   37 +---
 include/asm-x86/paravirt.h   |  324 ---
 include/asm-x86/pgtable-2level.h |   21 --
 include/asm-x86/pgtable-3level.h |   40 
 include/asm-x86/pgtable.h|  318 ++
 include/asm-x86/pgtable_32.h |  204 
 include/asm-x86/pgtable_64.h |  234 
 14 files changed, 630 insertions(+), 809 deletions(-)

(The code formatting patch adds a pile of lines because it splits long
single lines into multiline code.)

J


[PATCH RFC 7/7] x86: fix up formatting in pgtable*.h

2007-11-07 Thread Jeremy Fitzhardinge
Fix up various pieces of unconventional formatting in
asm-x86/pgtable*.h.  In some cases, the old formatting was arguably
clearer with a wide enough terminal, but this patch gives the option
of using a more standard form.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
---
 include/asm-x86/pgtable-2level.h |   24 +++---
 include/asm-x86/pgtable-3level.h |   17 ---
 include/asm-x86/pgtable.h|   91 --
 include/asm-x86/pgtable_64.h |   20 +---
 4 files changed, 118 insertions(+), 34 deletions(-)

===
--- a/include/asm-x86/pgtable-2level.h
+++ b/include/asm-x86/pgtable-2level.h
@@ -15,25 +15,36 @@ static inline void native_set_pte(pte_t 
 {
*ptep = pte;
 }
+
 static inline void native_set_pte_at(struct mm_struct *mm, unsigned long addr,
 pte_t *ptep , pte_t pte)
 {
native_set_pte(ptep, pte);
 }
+
 static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
*pmdp = pmd;
 }
 
 #undef set_pte_atomic
-#define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
-#define set_pte_present(mm,addr,ptep,pteval) set_pte_at(mm,addr,ptep,pteval)
+#define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
 
-#define pte_clear(mm,addr,xp)  do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
+#define set_pte_present(mm,addr,ptep,pteval)   set_pte_at(mm,addr,ptep,pteval)
+
+#define pte_clear(mm,addr,xp)  \
+   do {\
+   set_pte_at(mm, addr, xp, __pte(0)); \
+   } while (0)
+
 #undef pmd_clear
-#define pmd_clear(xp)  do { set_pmd(xp, __pmd(0)); } while (0)
+#define pmd_clear(xp)  \
+   do {\
+   set_pmd(xp, __pmd(0));  \
+   } while (0)
 
-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
+static inline void native_pte_clear(struct mm_struct *mm,
+   unsigned long addr, pte_t *xp)
 {
*xp = __pte(0);
 }
@@ -66,7 +77,8 @@ static inline int pte_exec_kernel(pte_t 
 /* Encode and de-code a swap entry */
 #define __swp_type(x)  (((x).val >> 1) & 0x1f)
 #define __swp_offset(x) ((x).val >> 8)
-#define __swp_entry(type, offset)  ((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
+#define __swp_entry(type, offset)  \
+   ((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { (pte).pte_low })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
===
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -9,7 +9,8 @@
  */
 
 #define pte_ERROR(e) \
-   printk("%s:%d: bad pte %p(%08lx%08lx).\n", __FILE__, __LINE__, &(e), (e).pte_high, (e).pte_low)
+   printk("%s:%d: bad pte %p(%08lx%08lx).\n", __FILE__, __LINE__,  \
+  &(e), (e).pte_high, (e).pte_low)
 #define pmd_ERROR(e) \
 printk("%s:%d: bad pmd %p(%016Lx).\n", __FILE__, __LINE__, &(e), pmd_val(e))
 #define pgd_ERROR(e) \
@@ -39,6 +40,7 @@ static inline void native_set_pte(pte_t 
smp_wmb();
ptep->pte_low = pte.pte_low;
 }
+
 static inline void native_set_pte_at(struct mm_struct *mm, unsigned long addr,
 pte_t *ptep , pte_t pte)
 {
@@ -65,10 +67,12 @@ static inline void native_set_pte_atomic
 {
set_64bit((unsigned long long *)(ptep),native_pte_val(pte));
 }
+
 static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
set_64bit((unsigned long long *)(pmdp),native_pmd_val(pmd));
 }
+
 static inline void native_set_pud(pud_t *pudp, pud_t pud)
 {
*pudp = pud;
@@ -79,7 +83,8 @@ static inline void native_set_pud(pud_t 
  * entry, so clear the bottom half first and enforce ordering with a compiler
  * barrier.
  */
-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void native_pte_clear(struct mm_struct *mm,
+   unsigned long addr, pte_t *ptep)
 {
ptep->pte_low = 0;
smp_wmb();
@@ -102,11 +107,11 @@ static inline void native_pmd_clear(pmd_
  */
 static inline void pud_clear (pud_t * pud) { }
 
-#define pud_page(pud) \
-((struct page *) __va(pud_val(pud) & PAGE_MASK))
+#define pud_page(pud)  \
+   ((struct page *) __va(pud_val(pud) & PAGE_MASK))
 
-#define pud_page_vaddr(pud) \
-((unsigned long) __va(pud_val(pud) & PAGE_MASK))
+#define pud_page_vaddr(pud)\
+   ((unsigned long) __va(pud_val(pud) & PAGE_MASK))
 
 
 #ifdef CONFIG_SMP
===
--- a/include/asm-x86/pgtable.h
+++ 

[PATCH RFC 5/7] x86: simplify pagetable-related operations in paravirt.h

2007-11-07 Thread Jeremy Fitzhardinge
Simplify paravirt.h using the unified page/pgtable.h infrastructure.
This removes a fair amount of duplication of the ops function pointers
themselves, but also of PVOP_*CALL* wrappers.

The wrappers are complicated by the fact that on a 32-bit PAE system,
literal 64-bit values are passed in two arguments, and so a different
form of the call must be used compared to 64-bit or 32-bit non-PAE,
where all the arguments are less than or equal to the native register
size.

The code chooses the appropriate form to use by using the compile-time
comparison of sizeof(pteval_t) and sizeof(unsigned long).  This does
not need to be done for calls which are either PAE or 64-bit specific.
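
The compile-time selection described above can be sketched in isolation. This is a user-space illustration only: call1 and call2 are hypothetical stand-ins for the PVOP_CALL1/PVOP_CALL2 forms, and pteval_t is fixed at 64 bits here to model the PAE and 64-bit cases.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;   /* assumption: models the PAE/64-bit case */

/* Hypothetical stand-in for a call that takes one native word. */
static pteval_t call1(unsigned long v)
{
    return v;
}

/* Hypothetical stand-in for a call that takes two 32-bit halves. */
static pteval_t call2(uint32_t lo, uint32_t hi)
{
    return ((pteval_t)hi << 32) | lo;
}

static pteval_t pass_pteval(pteval_t val)
{
    pteval_t ret;

    /* sizeof is a compile-time constant, so the compiler can discard
     * the branch that does not apply to the current configuration. */
    if (sizeof(pteval_t) > sizeof(unsigned long))
        ret = call2((uint32_t)val, (uint32_t)(val >> 32));
    else
        ret = call1((unsigned long)val);

    assert(ret == val);   /* either form must carry the full value */
    return ret;
}
```

On a build where unsigned long is 64 bits wide the single-argument form is chosen; where it is 32 bits wide the value is split, matching the three cases listed in the description.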

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/paravirt.h |  324 +++-
 1 file changed, 141 insertions(+), 183 deletions(-)

===
--- a/include/asm-x86/paravirt.h
+++ b/include/asm-x86/paravirt.h
@@ -240,40 +240,38 @@ struct pv_mmu_ops {
void (*pte_update_defer)(struct mm_struct *mm,
 unsigned long addr, pte_t *ptep);
 
+   pteval_t (*pte_val)(pte_t);
+   pgdval_t (*pgd_val)(pgd_t);
+
+   pte_t (*make_pte)(pteval_t pte);
+   pgd_t (*make_pgd)(pgdval_t pgd);
+
+#if PAGETABLE_LEVELS >= 3
 #ifdef CONFIG_X86_PAE
void (*set_pte_atomic)(pte_t *ptep, pte_t pteval);
void (*set_pte_present)(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte);
 #endif
-#if defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
-   void (*set_pud)(pud_t *pudp, pud_t pudval);
+
 void (*pte_clear)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
void (*pmd_clear)(pmd_t *pmdp);
 
-   unsigned long long (*pte_val)(pte_t);
-   unsigned long long (*pmd_val)(pmd_t);
-   unsigned long long (*pgd_val)(pgd_t);
+   pmdval_t (*pmd_val)(pmd_t);
+   pmd_t (*make_pmd)(pmdval_t pmd);
 
-   pte_t (*make_pte)(unsigned long long pte);
-   pmd_t (*make_pmd)(unsigned long long pmd);
-   pgd_t (*make_pgd)(unsigned long long pgd);
-  #ifdef CONFIG_X86_64
+   void (*set_pud)(pud_t *pudp, pud_t pudval);
+
+#if PAGETABLE_LEVELS == 4
void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval);
 
void (*pud_clear)(pud_t *pudp);
void (*pgd_clear)(pgd_t *pgdp);
 
-   unsigned long long (*pud_val)(pud_t);
+   pudval_t (*pud_val)(pud_t);
 
-   pud_t (*make_pud)(unsigned long long pud);
-  #endif
-#else
-   unsigned long (*pte_val)(pte_t);
-   unsigned long (*pgd_val)(pgd_t);
-
-   pte_t (*make_pte)(unsigned long pte);
-   pgd_t (*make_pgd)(unsigned long pgd);
-#endif
+   pud_t (*make_pud)(pudval_t pud);
+#endif /* PAGETABLE_LEVELS == 4 */
+#endif /* PAGETABLE_LEVELS >= 3 */
 
 #ifdef CONFIG_HIGHPTE
void *(*kmap_atomic_pte)(struct page *page, enum km_type type);
@@ -958,85 +956,137 @@ static inline void pte_update_defer(stru
PVOP_VCALL3(pv_mmu_ops.pte_update_defer, mm, addr, ptep);
 }
 
-#ifdef CONFIG_X86_PAE
-static inline pte_t __pte(unsigned long long val)
+/*
+ * Pagetable manipulators
+ *
+ * There are three cases to deal with:
+ * 32-bit processor, non-PAE:  2-level pagetable with 32-bit entries
+ * 32-bit processor, PAE:  3-level pagetable with 64-bit entries
+ * 64-bit processor:   4-level pagetable with 64-bit entries
+ *
+ * In 32-bit mode, passing 64-bit parameters must be done in two
+ * 32-bit chunks, so we need to use a separate PVOP_CALLx macro from
+ * either 64-bit mode or 32-bit/non-PAE.
+ *
+ * We rely on the predefined native_make_X/native_X_val to do
+ * packing/unpacking of the current pagetable type.
+ */
+static inline pte_t __pte(pteval_t val)
 {
-   unsigned long long ret = PVOP_CALL2(unsigned long long,
-   pv_mmu_ops.make_pte,
-   val, val >> 32);
-   return (pte_t) { ret, ret >> 32 };
+   pteval_t ret;
+
+   if (sizeof(val) > sizeof(unsigned long))
+   ret = PVOP_CALL2(pteval_t, pv_mmu_ops.make_pte,
+val, (u64)val>>32);
+   else
+   ret = PVOP_CALL1(pteval_t, pv_mmu_ops.make_pte, val);
+
+   return native_make_pte(ret);
 }
 
-static inline pmd_t __pmd(unsigned long long val)
+static inline pteval_t pte_val(pte_t x)
 {
-   return (pmd_t) { PVOP_CALL2(unsigned long long, pv_mmu_ops.make_pmd,
-   val, val >> 32) };
+   pteval_t val = native_pte_val(x);
+   if (sizeof(pteval_t) > sizeof(unsigned long))
+   return PVOP_CALL2(pteval_t, pv_mmu_ops.pte_val,
+ val, (u64)val>>32);
+   else
+   return PVOP_CALL1(pteval_t, pv_mmu_ops.pte_val, val);
 }
 
-static inline pgd_t __pgd(unsigned long long val)
+static inline pgd_t __pgd(pgdval_t val)
 {
-

[PATCH RFC 3/7] x86: clean up asm-x86/page*.h

2007-11-07 Thread Jeremy Fitzhardinge
Unify common definitions in page*.h.  To simplify other code, I added
typedefs for the value of pte/pmd/pud/pgd values, so they can be used
symbolically elsewhere without needing to have lots of 32/64/PAE
tests.

Also, add PAGETABLE_LEVELS define so that other definitions can test
for it directly rather than using indirect 32/64/PAE tests.
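
As an illustration of the idea (not the patch itself), the sketch below picks the value typedef and PAGETABLE_LEVELS once and then writes a helper that needs no 32/64/PAE test of its own. DEMO_PAE and pfn_to_pteval are hypothetical names; DEMO_PAE stands in for CONFIG_X86_PAE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical configuration switch standing in for CONFIG_X86_PAE. */
#define DEMO_PAE 1

#if DEMO_PAE
#define PAGETABLE_LEVELS 3
typedef uint64_t pteval_t;
#else
#define PAGETABLE_LEVELS 2
typedef uint32_t pteval_t;
#endif

#define PAGE_SHIFT 12

/* Build an entry value from a frame number and low flag bits. The same
 * source works for either configuration because the width lives in the
 * typedef. This sketch only models flags below PAGE_SHIFT. */
static pteval_t pfn_to_pteval(unsigned long pfn, pteval_t flags)
{
    assert((flags >> PAGE_SHIFT) == 0);
    return ((pteval_t)pfn << PAGE_SHIFT) | flags;
}
```

Code that genuinely differs by depth can then test PAGETABLE_LEVELS directly, for example with #if PAGETABLE_LEVELS >= 3, instead of combining architecture and PAE checks.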

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/page.h|   49 ++--
 include/asm-x86/page_32.h |   77 +
 include/asm-x86/page_64.h |   37 +++--
 3 files changed, 95 insertions(+), 68 deletions(-)

===
--- a/include/asm-x86/page.h
+++ b/include/asm-x86/page.h
@@ -1,13 +1,42 @@
+#ifndef _ASM_X86_PAGE_H
+#define _ASM_X86_PAGE_H
+
+#include 
+
+/* PAGE_SHIFT determines the page size */
+#define PAGE_SHIFT 12
+#define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
+#define PAGE_MASK  (~(PAGE_SIZE-1))
+#define PHYSICAL_PAGE_MASK (~(PAGE_SIZE-1) & __PHYSICAL_MASK)
+
+#define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
+#define LARGE_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT)
+
 #ifdef __KERNEL__
-# ifdef CONFIG_X86_32
-#  include "page_32.h"
-# else
-#  include "page_64.h"
-# endif
+
+#ifdef CONFIG_X86_32
+# include "page_32.h"
 #else
-# ifdef __i386__
-#  include "page_32.h"
-# else
-#  include "page_64.h"
-# endif
+# include "page_64.h"
 #endif
+
+#ifndef CONFIG_PARAVIRT
+#define pgd_val(x) native_pgd_val(x)
+#define __pgd(x)   native_make_pgd(x)
+
+#ifndef __PAGETABLE_PUD_FOLDED
+#define pud_val(x) native_pud_val(x)
+#define __pud(x)   native_make_pud(x)
+#endif
+
+#ifndef __PAGETABLE_PMD_FOLDED
+#define pmd_val(x) native_pmd_val(x)
+#define __pmd(x)   native_make_pmd(x)
+#endif
+
+#define pte_val(x) native_pte_val(x)
+#define __pte(x)   native_make_pte(x)
+#endif /* CONFIG_PARAVIRT */
+
+#endif /* __KERNEL__ */
+#endif /* _ASM_X86_PAGE_H */
===
--- a/include/asm-x86/page_32.h
+++ b/include/asm-x86/page_32.h
@@ -1,16 +1,13 @@
 #ifndef _I386_PAGE_H
 #define _I386_PAGE_H
 
-/* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 12
-#define PAGE_SIZE  (1UL << PAGE_SHIFT)
-#define PAGE_MASK  (~(PAGE_SIZE-1))
+#ifndef _ASM_X86_PAGE_H
+#error Include asm/page.h
+#endif
 
-#define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
-#define LARGE_PAGE_SIZE (1UL << PMD_SHIFT)
+#ifndef __ASSEMBLY__
 
-#ifdef __KERNEL__
-#ifndef __ASSEMBLY__
+#include 
 
 #ifdef CONFIG_X86_USE_3DNOW
 
@@ -43,71 +40,86 @@
  */
 extern int nx_enabled;
 
+/* macro to avoid #include hell */
+#define native_pud_val(pud) native_pgd_val((pud).pgd)
+
 #ifdef CONFIG_X86_PAE
+#define PAGETABLE_LEVELS   3
+
+typedef u64 pteval_t;
+typedef u64 pmdval_t;
+typedef u64 pudval_t;
+typedef u64 pgdval_t;
+
 typedef struct { unsigned long pte_low, pte_high; } pte_t;
-typedef struct { unsigned long long pmd; } pmd_t;
-typedef struct { unsigned long long pgd; } pgd_t;
+typedef struct { pmdval_t pmd; } pmd_t;
+typedef struct { pgdval_t pgd; } pgd_t;
 typedef struct { unsigned long long pgprot; } pgprot_t;
 
-static inline unsigned long long native_pgd_val(pgd_t pgd)
+static inline pgdval_t native_pgd_val(pgd_t pgd)
 {
return pgd.pgd;
 }
 
-static inline unsigned long long native_pmd_val(pmd_t pmd)
+static inline pmdval_t native_pmd_val(pmd_t pmd)
 {
return pmd.pmd;
 }
 
-static inline unsigned long long native_pte_val(pte_t pte)
+static inline pteval_t native_pte_val(pte_t pte)
 {
return pte.pte_low | ((unsigned long long)pte.pte_high << 32);
 }
 
-static inline pgd_t native_make_pgd(unsigned long long val)
+static inline pgd_t native_make_pgd(pgdval_t val)
 {
return (pgd_t) { val };
 }
 
-static inline pmd_t native_make_pmd(unsigned long long val)
+static inline pmd_t native_make_pmd(pmdval_t val)
 {
return (pmd_t) { val };
 }
 
-static inline pte_t native_make_pte(unsigned long long val)
+static inline pte_t native_make_pte(pteval_t val)
 {
return (pte_t) { .pte_low = val, .pte_high = (val >> 32) } ;
 }
 
-#ifndef CONFIG_PARAVIRT
-#define pmd_val(x) native_pmd_val(x)
-#define __pmd(x)   native_make_pmd(x)
-#endif
-
 #define HPAGE_SHIFT 21
 #include 
 #else  /* !CONFIG_X86_PAE */
+
+#define PAGETABLE_LEVELS   2
+
+typedef u32 pteval_t;
+typedef u32 pmdval_t;
+typedef u32 pgdval_t;
+
 typedef struct { unsigned long pte_low; } pte_t;
 typedef struct { unsigned long pgd; } pgd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
 #define boot_pte_t pte_t /* or would you rather have a typedef */
 
-static inline unsigned long native_pgd_val(pgd_t pgd)
+static inline pgdval_t native_pgd_val(pgd_t pgd)
 {
return pgd.pgd;
 }
 
-static inline unsigned long native_pte_val(pte_t pte)
+static inline pteval_t native_pte_val(pte_t pte)
 {
return 

[PATCH RFC 6/7] x86/xen: simplify Xen mmu operations

2007-11-07 Thread Jeremy Fitzhardinge
Take advantage of the unified page/pgtable.h definitions to reduce the
number of duplicate definitions of the various Xen mmu_ops functions.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
---
 arch/x86/xen/enlighten.c |8 -
 arch/x86/xen/mmu.c   |   67 +-
 arch/x86/xen/mmu.h   |   26 +
 3 files changed, 41 insertions(+), 60 deletions(-)

===
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1038,16 +1038,18 @@ static const struct pv_mmu_ops xen_mmu_o
.make_pte = xen_make_pte,
.make_pgd = xen_make_pgd,
 
+#if PAGETABLE_LEVELS >= 3
 #ifdef CONFIG_X86_PAE
.set_pte_atomic = xen_set_pte_atomic,
.set_pte_present = xen_set_pte_at,
+#endif /* PAE */
.set_pud = xen_set_pud,
.pte_clear = xen_pte_clear,
.pmd_clear = xen_pmd_clear,
 
.make_pmd = xen_make_pmd,
.pmd_val = xen_pmd_val,
-#endif /* PAE */
+#endif /* PAGETABLE_LEVELS >= 3 */
 
.activate_mm = xen_activate_mm,
.dup_mmap = xen_dup_mmap,
@@ -1175,6 +1177,10 @@ asmlinkage void __init xen_start_kernel(
xen_setup_vcpu_info_placement();
 #endif
 
+#ifdef CONFIG_X86_PAE
+   __supported_pte_mask &= ~_PAGE_PCD;
+#endif
+
pv_info.kernel_rpl = 1;
if (xen_feature(XENFEAT_supervisor_mode_kernel))
pv_info.kernel_rpl = 0;
===
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -211,7 +211,7 @@ void xen_pmd_clear(pmd_t *pmdp)
xen_set_pmd(pmdp, __pmd(0));
 }
 
-unsigned long long xen_pte_val(pte_t pte)
+pteval_t xen_pte_val(pte_t pte)
 {
unsigned long long ret = 0;
 
@@ -223,23 +223,7 @@ unsigned long long xen_pte_val(pte_t pte
return ret;
 }
 
-unsigned long long xen_pmd_val(pmd_t pmd)
-{
-   unsigned long long ret = pmd.pmd;
-   if (ret)
-   ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-   return ret;
-}
-
-unsigned long long xen_pgd_val(pgd_t pgd)
-{
-   unsigned long long ret = pgd.pgd;
-   if (ret)
-   ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-   return ret;
-}
-
-pte_t xen_make_pte(unsigned long long pte)
+pte_t xen_make_pte(pteval_t pte)
 {
if (pte & 1)
pte = phys_to_machine(XPADDR(pte)).maddr;
@@ -247,20 +231,13 @@ pte_t xen_make_pte(unsigned long long pt
return (pte_t){ pte, pte >> 32 };
 }
 
-pmd_t xen_make_pmd(unsigned long long pmd)
+
+pmd_t xen_make_pmd(pmdval_t pmd)
 {
if (pmd & 1)
pmd = phys_to_machine(XPADDR(pmd)).maddr;
 
-   return (pmd_t){ pmd };
-}
-
-pgd_t xen_make_pgd(unsigned long long pgd)
-{
-   if (pgd & _PAGE_PRESENT)
-   pgd = phys_to_machine(XPADDR(pgd)).maddr;
-
-   return (pgd_t){ pgd };
+   return native_make_pmd(pmd);
 }
 #else  /* !PAE */
 void xen_set_pte(pte_t *ptep, pte_t pte)
@@ -268,7 +245,7 @@ void xen_set_pte(pte_t *ptep, pte_t pte)
*ptep = pte;
 }
 
-unsigned long xen_pte_val(pte_t pte)
+pteval_t xen_pte_val(pte_t pte)
 {
unsigned long ret = pte.pte_low;
 
@@ -278,30 +255,38 @@ unsigned long xen_pte_val(pte_t pte)
return ret;
 }
 
-unsigned long xen_pgd_val(pgd_t pgd)
-{
-   unsigned long ret = pgd.pgd;
-   if (ret)
-   ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-   return ret;
-}
-
-pte_t xen_make_pte(unsigned long pte)
+pte_t xen_make_pte(pteval_t pte)
 {
if (pte & _PAGE_PRESENT)
pte = phys_to_machine(XPADDR(pte)).maddr;
 
return (pte_t){ pte };
 }
+#endif /* CONFIG_X86_PAE */
 
-pgd_t xen_make_pgd(unsigned long pgd)
+pmdval_t xen_pmd_val(pmd_t pmd)
+{
+   pmdval_t ret = native_pmd_val(pmd);
+   if (ret)
+   ret = machine_to_phys(XMADDR(ret)).paddr | 1;
+   return ret;
+}
+
+pgdval_t xen_pgd_val(pgd_t pgd)
+{
+   pgdval_t ret = native_pgd_val(pgd);
+   if (ret)
+   ret = machine_to_phys(XMADDR(ret)).paddr | 1;
+   return ret;
+}
+
+pgd_t xen_make_pgd(pgdval_t pgd)
 {
if (pgd & _PAGE_PRESENT)
pgd = phys_to_machine(XPADDR(pgd)).maddr;
 
-   return (pgd_t){ pgd };
+   return native_make_pgd(pgd);
 }
-#endif /* CONFIG_X86_PAE */
 
 enum pt_level {
PT_PGD,
===
--- a/arch/x86/xen/mmu.h
+++ b/arch/x86/xen/mmu.h
@@ -30,31 +30,21 @@ void xen_pgd_pin(pgd_t *pgd);
 void xen_pgd_pin(pgd_t *pgd);
 //void xen_pgd_unpin(pgd_t *pgd);
 
+pteval_t xen_pte_val(pte_t);
+pmdval_t xen_pmd_val(pmd_t);
+pgdval_t xen_pgd_val(pgd_t);
+
+pte_t xen_make_pte(pteval_t);
+pmd_t xen_make_pmd(pmdval_t);
+pgd_t xen_make_pgd(pgdval_t);
+
 #ifdef CONFIG_X86_PAE
-unsigned long long xen_pte_val(pte_t);
-unsigned long long xen_pmd_val(pmd_t);
-unsigned long long xen_pgd_val(pgd_t);
-
-pte_t xen_make_pte(unsigned long long);

[PATCH RFC 1/7] x86: kill mk_pte_huge

2007-11-07 Thread Jeremy Fitzhardinge
It only has a single use, which can be trivially replaced.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 arch/x86/mm/init_64.c|3 +--
 include/asm-x86/pgtable_64.h |9 -
 2 files changed, 1 insertion(+), 11 deletions(-)

===
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -768,8 +768,7 @@ int __meminit vmemmap_populate(struct pa
if (!p)
return -ENOMEM;
 
-   entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
-   mk_pte_huge(entry);
+   entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL_LARGE);
set_pmd(pmd, __pmd(pte_val(entry)));
 
printk(KERN_DEBUG " [%lx-%lx] PMD ->%p on node %d\n",
===
--- a/include/asm-x86/pgtable_64.h
+++ b/include/asm-x86/pgtable_64.h
@@ -378,15 +378,6 @@ static inline pte_t pte_clrhuge(pte_t pt
 /* page, protection -> pte */
 #define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
 
-static inline pte_t __mk_pte_huge(pte_t entry)
-{
-   unsigned long pte;
-   pte = pte_val(entry);
-   pte |= _PAGE_PRESENT | _PAGE_PSE;
-   return  __pte(pte);
-}
-#define mk_pte_huge(entry) ((entry) = __mk_pte_huge(entry))
-
 #include 
 static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)

-- 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mm snapshot broken-out-2007-11-06-02-32 build failure - !CONFIG_PPC_ISERIES

2007-11-07 Thread Tony Breeds
On Thu, Nov 08, 2007 at 02:27:07AM +0530, Kamalesh Babulal wrote:
> Hi Andrew,
> 
> The kernel build fails with randconfig, with following error
> 
>   CC  arch/powerpc/platforms/celleb/setup.o
> arch/powerpc/platforms/celleb/setup.c:151: error: ‘generic_calibrate_decr’ 
> undeclared here (not in a function)
> make[2]: *** [arch/powerpc/platforms/celleb/setup.o] Error 1
> make[1]: *** [arch/powerpc/platforms/celleb] Error 2
> make: *** [arch/powerpc/platforms] Error 2


I think you need this patch:
http://patchwork.ozlabs.org/linuxppc/patch?q=Tony%20Breeds=14462

Yours Tony

  linux.conf.au    http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!



[PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change

2007-11-07 Thread Tejun Heo
pkt_setup_dev() expects module reference to be held on invocation.
This used to be true for sysfs callbacks but not anymore.  Test and
grab module reference around pkt_setup_dev() in
class_pktcdvd_store_add().
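
The get/put bracket described above can be sketched in userspace as follows. All the mock_* names are hypothetical stand-ins (the real calls are try_module_get() and module_put() on THIS_MODULE), and the mock refcount is a plain int rather than the kernel's module accounting:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Mock module state; hypothetical, for illustration only. */
struct mock_module {
	int refcount;
	bool going_away;	/* set while the module is being unloaded */
};

/* Fails if the module is already on its way out. */
static bool mock_try_module_get(struct mock_module *m)
{
	if (m->going_away)
		return false;
	m->refcount++;
	return true;
}

static void mock_module_put(struct mock_module *m)
{
	m->refcount--;
}

/* Stand-in for pkt_setup_dev(): records the refcount it observed. */
static int refs_seen_in_setup;

static void mock_setup_dev(struct mock_module *m, unsigned int major,
			   unsigned int minor)
{
	(void)major;
	(void)minor;
	refs_seen_in_setup = m->refcount;
}

/* Same shape as the patched store callback: get, call, put. */
static long mock_store_add(struct mock_module *m, const char *buf, long count)
{
	unsigned int major, minor;

	if (sscanf(buf, "%u:%u", &major, &minor) == 2) {
		if (!mock_try_module_get(m))
			return -ENODEV;
		mock_setup_dev(m, major, minor);
		mock_module_put(m);
		return count;
	}
	return -EINVAL;
}
```

The point of the bracket is that the setup routine always runs with at least one reference held, and that a write racing with module unload fails with -ENODEV instead of running setup code in a dying module.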

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
---
Greg, can you please push this patch through your tree? 
Thanks a lot.

 drivers/block/pktcdvd.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a8130a4..a5ee213 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -358,10 +358,19 @@ static ssize_t class_pktcdvd_store_add(struct class *c, const char *buf,
size_t count)
 {
unsigned int major, minor;
+
if (sscanf(buf, "%u:%u", &major, &minor) == 2) {
+   /* pkt_setup_dev() expects caller to hold reference to self */
+   if (!try_module_get(THIS_MODULE))
+   return -ENODEV;
+
pkt_setup_dev(MKDEV(major, minor), NULL);
+
+   module_put(THIS_MODULE);
+
return count;
}
+
return -EINVAL;
 }
 


Re: [PATCH][VIRTIO] Fix vring_init() ring computations

2007-11-07 Thread Rusty Russell
On Thursday 08 November 2007 12:06:07 Anthony Liguori wrote:
> Rusty Russell wrote:
> > On Wednesday 07 November 2007 13:52:29 Anthony Liguori wrote:
> >> This patch fixes a typo in vring_init().
> >
> > Thanks, applied.
> >
> > I've put it in the new, experimental virtio git tree on git.kernel.org.
>
> Hrm, perhaps you forgot to push?  I don't see it in the tree although I
> see the config ops refactoring.

It should be in the patches/1 branch.  I've pushed again...

Thanks,
Rusty.


Re: [PATCH] virtio config_ops refactoring

2007-11-07 Thread Rusty Russell
On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
> I would prefer that the virtio API not expose a little endian standard.
> I'm currently converting config->get() ops to ioreadXX depending on the
> size which already does the endianness conversion for me so this just
> messes things up.  I think it's better to let the backend deal with
> endianness since it's trivial to handle for both the PCI backend and the
> lguest backend (lguest doesn't need to do any endianness conversion).

-ETOOMUCHMAGIC.  We should either expose all the XX interfaces (but this isn't 
a high-speed interface, so let's not) or not "sometimes" convert endianness.  
Getting surprises because a field happens to be packed into 4 bytes is 
counter-intuitive.

Since your most trivial implementation is to do a byte at a time, I don't 
think you have a good argument on that basis either.
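
For reference, a byte-at-a-time read of a fixed little-endian field might look like the sketch below. The helper name config_get_le32() and the flat byte-array view of the config space are assumptions made for illustration, not part of the virtio API under discussion:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a little-endian 32-bit field from a
 * config blob one byte at a time. Because it never loads a whole
 * word, the result is the same on big- and little-endian hosts.
 */
static uint32_t config_get_le32(const uint8_t *cfg, size_t offset)
{
	return (uint32_t)cfg[offset]
	     | (uint32_t)cfg[offset + 1] << 8
	     | (uint32_t)cfg[offset + 2] << 16
	     | (uint32_t)cfg[offset + 3] << 24;
}
```

With this shape the conversion happens in one well-defined place, and a field does not behave differently just because it happens to be 4 bytes wide.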

Cheers,
Rusty.



[PATCH 2/2 take #2] libata: pata_platform: Support polling-mode configuration.

2007-11-07 Thread Paul Mundt
Some SH boards (old R2D-1 boards) have generally not had working CF
under libata, due to both buswidth issues (handled by Aoi Shinkai
in 43f4b8c7578b928892b6f01d374346ae14e5eb70), and buggy interrupt
controllers. For these sorts of boards simply disabling the IRQ and
polling ends up working fine.

This conditionalizes the IRQ resource for pata_platform and lets
platforms that want to use polling mode simply omit the resource
entirely.
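
The probe-side decision can be sketched in userspace like this. The mock_* names and MOCK_FLAG_PIO_POLLING are hypothetical stand-ins for the driver structures and ATA_FLAG_PIO_POLLING; irq_lookup models the result of an IRQ resource lookup, which is negative when no IRQ resource exists:

```c
#include <errno.h>

/* Stand-in for ATA_FLAG_PIO_POLLING; the value is arbitrary here. */
#define MOCK_FLAG_PIO_POLLING (1u << 0)

struct mock_port {
	unsigned int flags;
	int irq;
};

/*
 * num_resources: 2 (I/O base + CTL base) or 3 (plus an IRQ).
 * irq_lookup:    IRQ number, or a negative value if none is present.
 */
static int mock_probe(struct mock_port *ap, int num_resources, int irq_lookup)
{
	if (num_resources != 3 && num_resources != 2)
		return -EINVAL;

	/* A missing IRQ is normalised to 0, meaning "no irq". */
	ap->irq = irq_lookup < 0 ? 0 : irq_lookup;

	ap->flags = 0;
	if (!ap->irq)
		ap->flags |= MOCK_FLAG_PIO_POLLING;

	return 0;
}
```

A board with a buggy interrupt controller would then register only the two base resources and end up in polling mode without any driver-specific quirk.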

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

---

 drivers/ata/pata_platform.c |   35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/ata/pata_platform.c b/drivers/ata/pata_platform.c
index fc72a96..ac03a90 100644
--- a/drivers/ata/pata_platform.c
+++ b/drivers/ata/pata_platform.c
@@ -1,7 +1,7 @@
 /*
  * Generic platform device PATA driver
  *
- * Copyright (C) 2006  Paul Mundt
+ * Copyright (C) 2006 - 2007  Paul Mundt
  *
  * Based on pata_pcmcia:
  *
@@ -22,7 +22,7 @@
 #include 
 
 #define DRV_NAME "pata_platform"
-#define DRV_VERSION "1.1"
+#define DRV_VERSION "1.2"
 
 static int pio_mask = 1;
 
@@ -120,15 +120,20 @@ static void pata_platform_setup_port(struct ata_ioports *ioaddr,
  * Register a platform bus IDE interface. Such interfaces are PIO and we
  * assume do not support IRQ sharing.
  *
- * Platform devices are expected to contain 3 resources per port:
+ * Platform devices are expected to contain at least 2 resources per port:
  *
  * - I/O Base (IORESOURCE_IO or IORESOURCE_MEM)
  * - CTL Base (IORESOURCE_IO or IORESOURCE_MEM)
+ *
+ * and optionally:
+ *
  * - IRQ  (IORESOURCE_IRQ)
  *
  * If the base resources are both mem types, the ioremap() is handled
  * here. For IORESOURCE_IO, it's assumed that there's no remapping
  * necessary.
+ *
+ * If no IRQ resource is present, PIO polling mode is used instead.
  */
 static int __devinit pata_platform_probe(struct platform_device *pdev)
 {
@@ -137,11 +142,12 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
struct ata_port *ap;
struct pata_platform_info *pp_info;
unsigned int mmio;
+   int irq;
 
/*
 * Simple resource validation ..
 */
-   if (unlikely(pdev->num_resources != 3)) {
+   if ((pdev->num_resources != 3) && (pdev->num_resources != 2)) {
dev_err(&pdev->dev, "invalid number of resources\n");
return -EINVAL;
}
@@ -173,6 +179,13 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
(ctl_res->flags == IORESOURCE_MEM));
 
/*
+* And the IRQ
+*/
+   irq = platform_get_irq(pdev, 0);
+   if (irq < 0)
+   irq = 0;/* no irq */
+
+   /*
 * Now that that's out of the way, wire up the port..
 */
host = ata_host_alloc(&pdev->dev, 1);
@@ -185,6 +198,14 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
ap->flags |= ATA_FLAG_SLAVE_POSS;
 
/*
+* Use polling mode if there's no IRQ
+*/
+   if (!irq) {
+   ap->flags |= ATA_FLAG_PIO_POLLING;
+   ata_port_desc(ap, "no IRQ, using PIO polling");
+   }
+
+   /*
 * Handle the MMIO case
 */
if (mmio) {
@@ -213,9 +234,9 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
  (unsigned long long)ctl_res->start);
 
/* activate */
-   return ata_host_activate(host, platform_get_irq(pdev, 0),
-                            ata_interrupt, pp_info ? pp_info->irq_flags
-                            : 0, &pata_platform_sht);
+   return ata_host_activate(host, irq, irq ? ata_interrupt : NULL,
+                            pp_info ? pp_info->irq_flags : 0,
+                            &pata_platform_sht);
 }
 
 /**


[PATCH 1/2 take #2] libata: Support PIO polling-only hosts.

2007-11-07 Thread Paul Mundt
By default ata_host_activate() expects a valid IRQ in order to
successfully register the host. This patch enables a special case
for registering polling-only hosts that either don't have IRQs
or have buggy IRQ generation (either in terms of handling or
sensing), which otherwise work fine.

Hosts that want to use polling mode can simply set ATA_FLAG_PIO_POLLING
and pass in an invalid IRQ.
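
As a rough model of the control flow, the sketch below mocks the activate path in userspace. Every mock_* name is hypothetical; the real function is ata_host_activate(), which also starts the host first and warns (WARN_ON) if a handler is passed together with an invalid IRQ:

```c
#include <stddef.h>

typedef int (*mock_irq_handler_t)(int irq, void *dev);

struct mock_host {
	int registered;		/* set once the host is registered */
	int irq_requested;	/* IRQ line that was requested, 0 if none */
};

/* Example handler so callers have something to pass in. */
static int mock_handler(int irq, void *dev)
{
	(void)irq;
	(void)dev;
	return 1;
}

static int mock_host_register(struct mock_host *h)
{
	h->registered = 1;
	return 0;
}

static int mock_request_irq(struct mock_host *h, int irq,
			    mock_irq_handler_t fn)
{
	if (!fn)
		return -1;	/* an IRQ without a handler makes no sense */
	h->irq_requested = irq;
	return 0;
}

/* irq == 0 means "polling only": skip the IRQ request entirely. */
static int mock_host_activate(struct mock_host *h, int irq,
			      mock_irq_handler_t fn)
{
	int rc;

	if (!irq)
		return mock_host_register(h);

	rc = mock_request_irq(h, irq, fn);
	if (rc)
		return rc;

	return mock_host_register(h);
}
```

Existing callers that pass a valid IRQ take the same path as before; only the irq == 0 case is new.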

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

---

 drivers/ata/libata-core.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index ec3ce12..89fd0e9 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -7178,6 +7178,10 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht)
  * request IRQ and register it.  This helper takes necessasry
  * arguments and performs the three steps in one go.
  *
+ * An invalid IRQ skips the IRQ registration and expects the host to
+ * have set polling mode on the port. In this case, @irq_handler
+ * should be NULL.
+ *
  * LOCKING:
  * Inherited from calling layer (may sleep).
  *
@@ -7194,6 +7198,12 @@ int ata_host_activate(struct ata_host *host, int irq,
if (rc)
return rc;
 
+   /* Special case for polling mode */
+   if (!irq) {
+   WARN_ON(irq_handler);
+   return ata_host_register(host, sht);
+   }
+
rc = devm_request_irq(host->dev, irq, irq_handler, irq_flags,
  dev_driver_string(host->dev), host);
if (rc)


Re: Module init call vs symbols exporting race?

2007-11-07 Thread Rusty Russell
On Wednesday 07 November 2007 21:01:30 Jan Glauber wrote:
> Hi Rusty,
>
> I've seen a symbol-resolving race on s390. The qeth module uses symbols
> from qdio and although the loading order seems correct and the qdio
> symbols should be available the following error appears:
>
> qdio: loading QDIO base support version 2
> qeth: Unknown symbol qdio_synchronize

Looks like qdio does something which triggers qeth to load, but of course qdio 
isn't finished initializing yet so its symbols aren't available.

It's not obvious what's triggering the load, but you could probably find it by 
using printk's through qdio.c's init_QDIO().

Cheers,
Rusty.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 12:53:57 +1100

> Andrew Morton writes:
> 
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
> 
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.
> 
> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
> 
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

I agree with this analysis.

The Linux man page for times() explicitly lists (clock_t) -1 as a
return value meaning error.

So even if we did make some effort to return errors "properly" (via
force_successful_syscall_return() et al.) userspace would still be
screwed because (clock_t) -1 would be interpreted as an error.

Actually I think this basically proves we cannot return (clock_t) -1
ever because all existing userland (I'm not talking about inside
glibc, I'm talking about inside of applications) will see this as an
error.

User applications have no other way to check for error.
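
To make the collision concrete, here is a small sketch assuming a 32-bit signed clock_t (as a compat task would see it). The type clock32_t and both helpers are made up for the illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: clock_t is a 32-bit signed type for the calling task. */
typedef int32_t clock32_t;

/* The only error check available to an application. */
static bool looks_like_error(clock32_t ret)
{
	return ret == (clock32_t)-1;
}

/*
 * What the application sees for a given 32-bit tick count. The
 * conversion of out-of-range values is implementation-defined in C,
 * but wraps modulo 2^32 on the usual two's-complement compilers.
 */
static clock32_t ticks_to_ret(uint32_t ticks)
{
	return (clock32_t)ticks;
}
```

A perfectly valid tick count of 0xffffffff is indistinguishable from failure, and every count from 0x80000000 upwards is negative, which is where the syscall return path starts treating the value as an errno.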

This API is definitely very poorly designed, no matter which way we
"fix" this some case will remain broken.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> Given all this stuff, the return value from sys_times() doesn't seem a
> particularly useful or reliable kernel interface.

I think the best thing would be to ignore any error from copy_to_user
and always return the number of clock ticks.  We should call
force_successful_syscall_return, and glibc on x86 should be taught not
to interpret negative values as an error.

POSIX doesn't require us to return an EFAULT error if the buf argument
is bogus.  If userspace does supply a bogus buf pointer, then either
it will dereference it itself and get a segfault, or it won't
dereference it, in which case it obviously didn't care about the
values we tried to put there.

If we try to return an error under some circumstances, then there is
at least one 32-bit value for the number of ticks that will cause
confusion.  We can either change that value (or values) to some other
value, which seems pretty bogus, or we can just decide not to return
any errors.  The latter seems to me to have no significant downside
and to be the simplest solution to the problem.

Paul.


Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Robert Hancock

Denys wrote:

Finally i got full DMESG with 1GB card till end. Seems not readable too.



..



ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: soft resetting link
ata1.00: configured for MWDMA1
sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
00 00 00 00
sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sda, sector 0
Buffer I/O error on device sda, logical block 0
ata1: EH complete


I'm guessing that your CF-to-IDE adapter doesn't have the correct lines 
wired up for DMA to work properly, and the card indicates DMA support, 
which libata tries to use but which doesn't work. It looks like it never 
tried falling back to PIO after DMA failed. Seems like a deficiency in 
the speed-down logic?


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: [poll] Is the megafreeze development model broken?

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 11:56:57PM +0100, ciol wrote:
> Hi, I'd like to ask you a few questions:
>
> * Do you like the way linux distributions integrate the kernel?
>
> * Wouldn't you prefer they ship with the stable and still maintained 
> 2.6.16.X, while providing optionally the latest kernel for those who want 
> or just have a new hardware?

No.

With 2.6.16, "new hardware" roughly means "sold during the
last 2-3 years", so most users would be forced to use this "option".

"providing optionally the latest kernel" would be a horror to support 
for a distribution.

From all I hear, all big distributions spend 3-6 months of QA work 
between pushing a kernel into the development branch of their 
distribution and putting it into a release.

They can't do this work for 4-6 different upstream kernels each year.

And if they omitted it, their customers would both blame them for 
shipping such a buggy distribution and swamp their support with bug
reports.

> * Do you think the megafreeze development model [1] and the "I don't trust 
> in upstream" development model are broken? (And why)
>...

Definitely not.

If your "stable base system" contains the kernel you lose the hardware 
support for recent hardware.

What should be more important for users than having their hardware 
supported?

And although it's off-topic for linux-kernel, your suggested 
"well-maintained additional package collections" also sound horrific:

As an example, consider the following:
- a new version of GNOME might require a new version of GTK+
- recently GTK+ 2.12 entered Debian testing, and this new version
  exposed a serious bug in the xfwm4 package that was at that time
  in testing

There are at least two obvious problems with what you propose:
- for avoiding breakages for users a huge amount of coordination
  work between the "additional package collections" would be required
- most users want their software to work correctly, not crash, etc.
  when a distribution has a 2-3 months freeze before a release that's 
  not lost time, that's time where _all_ software that will be shipped 
  gets tested and bugs fixed

There's one important thing you must have in mind:
Geeks (like you and me) can get the latest software versions from the 
development versions of their distribution, but for most users - for 
whom a computer is a tool that should simply work (no matter whether 
it's a server or a desktop) and not a toy - the QA work done during a 
freeze has a _huge_ value.

Fedora, openSUSE and Ubuntu all offer new releases every 6 months, which 
results in the software in the latest release always being less than
1 year old plus the user getting the QA work and the resulting stability 
of a freeze. This seems to be a good solution for desktop users.

cu
Adrian (2.6.16 maintainer)

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?

2007-11-07 Thread Andi Kleen

> But I think we'd be best off stashing a single bit somewhere and
> checking it at migrate time (relatively infrequent) rather than
> copying and zeroing out a potentially enormous affinity mask every
> time we disable migration (often, and in fast paths). Perhaps adding
> TASK_PINNED to the task state flags would do it?

It would need to be a count to be able to nest it.
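
A nesting counter could look roughly like the sketch below. The struct and the mock_* functions are hypothetical; nothing here is an existing kernel interface:

```c
/* Hypothetical per-task state: depth of nested "no migration" sections. */
struct mock_task {
	int migrate_disable_depth;
};

static void mock_migrate_disable(struct mock_task *t)
{
	t->migrate_disable_depth++;
}

static void mock_migrate_enable(struct mock_task *t)
{
	t->migrate_disable_depth--;
}

/* The migration path would only need this one cheap check. */
static int mock_can_migrate(const struct mock_task *t)
{
	return t->migrate_disable_depth == 0;
}
```

With a single bit, the inner enable in a nested pair would re-allow migration while the outer section still expects to stay on its CPU; a count avoids that.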

> > get_cpu() etc. could be changed to use this then too.
>
> Some users of get_cpu might be relying on it to avoid actual
> preemption. In other words, we should have introduced a
> migrate_disable() when we first discovered the preempt/per_cpu
> conflict.

OK, perhaps it would make sense to migrate it step by step:
define a replacement for get_cpu, move users over as they get
audited, and eventually deprecate the old one.

-Andi


Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > But on the other hand, it seems that only the ASIX code will work
> > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > 
> > I'm not seeing the problem.
> > 
> > Which configuration will be handled wrongly?
> 
> Notice how only the ASIX kconfig depended on NET_ETHERNET...
> since MII depends on NET_ETHERNET, and (last I knew) the
> reverse dependencies didn't capture the complete dependency
> tree, selecting only MII would leave out some stuff.

Except for one s390 net driver (I'll check why it's doing this) the 
NET_ETHERNET option does not influence what code is being generated - 
it's just a Kconfig-internal option allowing to disable a huge bunch
of drivers at once.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 08 Nov 2007 01:54:40 +0100 Andreas Schwab <[EMAIL PROTECTED]> wrote:
> Andrew Morton <[EMAIL PROTECTED]> writes:
> 
> > diff -puN kernel/compat.c~a kernel/compat.c
> > --- a/kernel/compat.c~a
> > +++ a/kernel/compat.c
> > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
> > if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
> > return -EFAULT;
> > }
> > -   return compat_jiffies_to_clock_t(jiffies);
> > +   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> > +   LONG_MAX);
> 
> Are you sure you want LONG_MAX here, not 0x7fff?
> 

I'm not sure of anything - I'm just trolling ;)

That's 0x7fff for architectures which implement this function. 
I think that lines up correctly with jiffies and the return value from
compat_sys_times().

Perhaps formally it should be USERSPACE_CLOCK_T_MAX, but we don't have that.
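
The effect of masking can be sketched as follows, assuming a 32-bit signed clock_t in userspace so that the largest positive value is INT32_MAX. MOCK_CLOCK_T_MAX and masked_ticks() are invented names, and the jiffies-to-clock_t scaling step is left out to keep the sketch short:

```c
#include <stdint.h>

/* Assumption: userspace clock_t is 32-bit signed, so 31 usable bits. */
#define MOCK_CLOCK_T_MAX INT32_MAX

/*
 * Add the initial offset, then clear every bit that could make the
 * value negative when userspace reads it as a 32-bit signed number.
 */
static int32_t masked_ticks(uint64_t jiffies, uint64_t initial_jiffies)
{
	return (int32_t)((jiffies + initial_jiffies) & MOCK_CLOCK_T_MAX);
}
```

The masked value can never be negative, so it can never be mistaken for -1 or an errno; the cost is that the counter wraps to zero twice as often.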


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 07 Nov 2007 16:50:22 -0800 (PST) David Miller <[EMAIL PROTECTED]> 
> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 15:28:33 -0800
> 
> > Perhaps this is a bug in glibc: it is interpreting the times() return value
> > in the same way as other syscalls.
> 
> The problem is more likely that we are failing to
> invoke force_successful_syscall_return() here.
> 
> Otherwise the syscall return path interprets negative
> values as errors, and sets the cpu condition codes.
> 
> And that is what userspace is actually checking for
> to determine if there is an error or not.

hm, I'd forgotten about that.

It seems to be a no-op on lots of architectures?


Re: [PATCH 1/2] Suppress A.OUT library support in ELF binfmt if !CONFIG_BINFMT_AOUT [try #3]

2007-11-07 Thread Adrian Bunk
On Wed, Nov 07, 2007 at 05:43:28PM +, David Howells wrote:

> Suppress A.OUT library support in ELF binfmt if CONFIG_BINFMT_AOUT is not set.
> 
> Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
> be permitted to go looking for A.OUT libraries to load in such a case.
>...

The a.out interpreter support for ELF executables is already scheduled
for being completely removed in 2.6.25.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [Bugme-new] [Bug 9319] New: National characters are not displayed under console.

2007-11-07 Thread H. Peter Anvin

This isn't a regression.  It's an intentional default change.

The default console mode changed from 8-bit legacy to UTF-8 in 2.6.24.

Apparently this user is using a legacy character set (note that it's a 
Slackware machine), and isn't explicitly setting the character set via 
the appropriate escape sequence.


The new default can be overridden via 
/sys/module/vt/parameters/default_utf8 or something like that...
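
For the "appropriate escape sequence" mentioned above, a small sketch in C.
The two sequences are the ones listed in console_codes(4): ESC % G selects
UTF-8 and ESC % @ selects the default (8-bit) character set. The helper name
console_set_utf8() is invented for this example, and the caller is assumed
to pass a stream that is actually attached to a virtual console.

```c
#include <stdio.h>

/* Escape sequences as documented in console_codes(4). */
static const char *const CONSOLE_SELECT_UTF8 = "\033%G";
static const char *const CONSOLE_SELECT_DEFAULT = "\033%@";

/*
 * Hypothetical helper: write the chosen sequence to an already opened
 * console stream. Returns 0 on success, -1 on a write error.
 */
static int console_set_utf8(FILE *console, int enable)
{
	const char *seq = enable ? CONSOLE_SELECT_UTF8 : CONSOLE_SELECT_DEFAULT;

	if (fputs(seq, console) == EOF)
		return -1;
	return fflush(console) == EOF ? -1 : 0;
}
```

A legacy setup that loads an 8-bit font with setfont would call
console_set_utf8(stdout, 0) from a program running on the console in
question.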


-hpa

Andrew Morton wrote:

On Wed,  7 Nov 2007 13:19:16 -0800 (PST) [EMAIL PROTECTED] wrote:
http://bugzilla.kernel.org/show_bug.cgi?id=9319

   Summary: National characters are not displayed under console.
   Product: Drivers
   Version: 2.5
 KernelVersion: 2.6.24-rcX
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Console/Framebuffers
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: 2.6.23

Distribution: Slackware

Hardware Environment:
Toshiba Tecra M1
Pentium M 1.6 512MB RAM, ICH4-M chipset, Trident CyberBlade XP4 video card

Software Environment: Slackware-current (kbd-1.12, glibc 2.5)

Problem Description:
The national characters like "ą", "ł" or "ż" are not displayed correctly
under console (no matter whether vesa framebuffer or standard vga). Instead of
them, "?" on a strange background is displayed. The problem begins on 2.6.24-rc1
and continues on 2.6.24-rc2. On 2.6.23 everything is OK.

Steps to reproduce:
Run 2.6.24-rcX kernel and set national console font by setfont.



Another post-2.6.23 regression.  Possible culprits cc'ed?


2.6.24-rc2: Reported regressions from 2.6.23

2007-11-07 Thread Rafael J. Wysocki
This message contains a list of some regressions from 2.6.23 which have been
reported since 2.6.24-rc1 was released and for which there are no fixes in the
mainline that I know of.  If any of them have been fixed already, please let me
know.

If you know of any other unresolved regressions from 2.6.23, please let me know
too and I'll add them to the list.


Subject : On 2.6.24-rc1-gc9927c2b BUG: unable to handle kernel paging 
request at virtual address 3d15b925
Submitter   : Giacomo Catenazzi <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/24/487
  http://bugzilla.kernel.org/show_bug.cgi?id=9246
Handled-By  : 
Patch   : 


Subject : Potential regression in -git15: can't resume stopped root 
shell?
Submitter   : Theodore Tso <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/20/114
  http://bugzilla.kernel.org/show_bug.cgi?id=9247
Handled-By  : Serge Hallyn <[EMAIL PROTECTED]>
Patch   : http://bugzilla.kernel.org/attachment.cgi?id=13361&action=view
  http://bugzilla.kernel.org/attachment.cgi?id=13375&action=view


Subject : irq 21: nobody cared 2.6.24-rc1
Submitter   : Bongani Hlope <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/25/90
  http://bugzilla.kernel.org/show_bug.cgi?id=9249
Handled-By  : 
Patch   : 


Subject : [BUG] panic after umount (biscted)
Submitter   : Sebastian Siewior <[EMAIL PROTECTED]>
References  : http://marc.info/?l=linux-kernel&m=119338387030335&w=2
  http://bugzilla.kernel.org/show_bug.cgi?id=9250
Handled-By  : Jens Axboe <[EMAIL PROTECTED]>
Patch   : http://marc.info/?l=linux-kernel&m=119348520210349&w=2


Subject : 2.6.24-rc1 sysctl table check failed on PowerMac
Submitter   : Mikael Pettersson <[EMAIL PROTECTED]>
References  : http://marc.info/?l=linux-kernel&m=119350802331857&w=2
  http://bugzilla.kernel.org/show_bug.cgi?id=9251
Handled-By  : Alexey Dobriyan <[EMAIL PROTECTED]>
Patch   : http://marc.info/?l=linux-kernel&m=119351015801660&w=2


Subject : 2.6.24-rc1: pata_acpi fails to activate DMA for DVD-ROM on 
ALi M5229 secondary channel
Submitter   : Andrey Borzenkov <[EMAIL PROTECTED]>
References  : http://marc.info/?l=linux-kernel&m=119342005216716&w=2
  http://bugzilla.kernel.org/show_bug.cgi?id=9252
Handled-By  : Alan Cox <[EMAIL PROTECTED]>
Patch   : 
Note: pata_acpi was not present in 2.6.23


Subject : 2.6.24-rc1 freezes on powerbook at first boot stage
Submitter   : Elimar Riesebieter <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/24/205
  http://bugzilla.kernel.org/show_bug.cgi?id=9254
Handled-By  : 
Patch   : 


Subject : build #286 failed for 2.6.24-rc1-gea45d15 in 
linux/arch/x86/kernel/setup_32.c
Submitter   : Toralf Förster <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/28/110
  http://bugzilla.kernel.org/show_bug.cgi?id=9256
Handled-By  : "H. Peter Anvin" <[EMAIL PROTECTED]>
Patch   : http://marc.info/[EMAIL PROTECTED]


Subject : 2.6.24-rc1 kills onboard r8169 (rtl8111b) NIC
Submitter   : "Sergey S. Kostyliov" <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/28/144
  http://bugzilla.kernel.org/show_bug.cgi?id=9257
Handled-By  : Francois Romieu <[EMAIL PROTECTED]>
Patch   : http://bugzilla.kernel.org/attachment.cgi?id=13441&action=view


Subject : Commit "Hibernation: Enter platform hibernation state in a 
consistent way)" makes my system to resume instantly from S4
Submitter   : Maxim Levitsky <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/27/66
  http://bugzilla.kernel.org/show_bug.cgi?id=9258
Handled-By  : "Rafael J. Wysocki" <[EMAIL PROTECTED]>
Patch   : 
Note: $subject commit apparently exposes a problem that existed 
previously


Subject : leds: ledtrig-timer calls sleeping function from invalid 
context
Submitter   : Márton Németh <[EMAIL PROTECTED]>
References  : http://bugzilla.kernel.org/show_bug.cgi?id=9264
Handled-By  : 
Patch   : 


Subject : Device mapper regression 2.6.23 vs. v2.6.23-6597-gcfa76f0
Submitter   : Thomas Meyer <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/10/21/153
  http://bugzilla.kernel.org/show_bug.cgi?id=9280
Handled-By  : 
Patch   : 


Subject : [2.6.24-rc1][BUG] Oops on battery removal
Submitter   : Rolf Eike Beer <[EMAIL PROTECTED]>
References  : http://lkml.org/lkml/2007/11/2/23
  http://bugzilla.kernel.org/show_bug.cgi?id=9283
Handled-By  : Alexey Starikovskiy <[EMAIL PROTECTED]>
Patch   : http://lkml.org/lkml/2007/11/2/71


Subject : [2.6.24-rc1 

Re: Fwd: same problem with 2.6.24-rc2

2007-11-07 Thread Randy Dunlap
On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote:

> On 7/Nov/2007 20:10 werner wrote ..
> > With 2.6.23-rc2 the problem is the same: it crashed at the beginning:
> > EIP 060 c03fdea4 EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
> > Again, during compilation the build complained that /arch/x86/Makefile.o
> > cannot be found and that certain dependencies on it were not made. No such
> > file is present in the source code (present are, e.g., Makefile_32 and
> > Makefile_64), nor was it generated automatically during compilation. I
> > think this is incorrect and the reason for the problems.

Hi,

Please provide the complete build log (with V=1 if possible) for the
missing Makefile.o problem.

E.g.:

make V=1 all >build.log 2>&1

Make sure that build.log contains the error message and then send
the complete build.log file to us at linux-kernel@vger.kernel.org .


> > wl
> > [EMAIL PROTECTED]
> > =
> > On 7/Nov/2007 16:14 Andrew Morton wrote ..
> > > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> wrote:
> > > > I really don't know what's happening.  I don't understand anything about
> > > > the kernel error reporting system.  Because of this, whenever there is a
> > > > problem, I report it via e-mail to linux-kernel@vger.kernel.org .  I don't
> > > > know what people there do with my messages.
> > > 
> > > 
> > > It went like this:
> > > 
> > > 1: you sent an email to linux-kernel
> > > 
> > > 2: I sent a reply to you and linux-kernel
> > > 
> > > 3: you sent a reply to me, but NOT linux-kernel!
> > > 
> > > In other words, you did "reply", not "reply to all", thus you removed 
> > > three
> > > thousand people from the discussion.  One of those people is the person 
> > > who
> > > created the bug which you're hitting, and that person no longer knows
> > > what's happening.
> > > 
> > > 
> > > So please go back and resend all those emails, and retain ALL Cc:'s.
> > > Don't just send them only to me.  Keep all individuals and all mailing
> > > lists on the email Cc: list.



---
~Randy


Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?

2007-11-07 Thread Matt Mackall
On Thu, Nov 08, 2007 at 01:31:00AM +0100, Andi Kleen wrote:
> On Thursday 08 November 2007 01:20, Matt Mackall wrote:
> > On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote:
> > > Ow.  Yes, from my reading delay_tsc() can return early (or after
> > > heat-death-of-the-universe) if the TSCs are offset and if preemption
> > > migrates the calling task between CPUs.
> > >
> > > I suppose a lameo fix would be to disable preemption in delay_tsc().
> >
> > preempt_disable is lousy documentation here. This and other cases
> > (lots of per_cpu users, IIRC) actually want a migrate_disable() which
> > is a proper subset. We can simply implement migrate_disable() as
> > preempt_disable() for now and come back later and implement a proper
> > migrate_disable() that still allows preemption (and thus avoids the
> > latency).
> 
> We could actually do this right now. migrate_disable() can be just changing
> the cpu affinity of the current thread to current cpu and then restoring it 
> afterwards. That should even work from interrupt context.

Yes, that's one way. But we need somewhere to stash the old flags.
Expanding the task struct sucks. Jamming another bit in the preempt
count sucks.

But I think we'd be best off stashing a single bit somewhere and
checking it at migrate time (relatively infrequent) rather than
copying and zeroing out a potentially enormous affinity mask every
time we disable migration (often, and in fast paths). Perhaps adding
TASK_PINNED to the task state flags would do it?

> get_cpu() etc. could be changed to use this then too.

Some users of get_cpu might be relying on it to avoid actual
preemption. In other words, we should have introduced a
migrate_disable() when we first discovered the preempt/per_cpu
conflict.
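
The single-bit idea above can be modelled in a few lines of plain C. This is
a toy model only, not scheduler code: struct sketch_task, SKETCH_TASK_PINNED
and the sketch_* helpers are invented for illustration, and the "affinity
mask" is just one unsigned long. The point it shows is that disabling
migration costs one bit flip, while the check happens on the (rarer)
migration path.

```c
#include <stdbool.h>

#define SKETCH_TASK_PINNED 0x1u	/* hypothetical flag bit */

struct sketch_task {
	unsigned int flags;
	int cpu;			/* CPU the task currently runs on */
	unsigned long allowed_mask;	/* toy affinity mask, one bit per CPU */
};

/* Cheap fast path: set or clear a single bit, the mask is left untouched. */
static void sketch_migrate_disable(struct sketch_task *t)
{
	t->flags |= SKETCH_TASK_PINNED;
}

static void sketch_migrate_enable(struct sketch_task *t)
{
	t->flags &= ~SKETCH_TASK_PINNED;
}

/* Slow path, run by the toy "scheduler"; returns true if the task moved. */
static bool sketch_try_migrate(struct sketch_task *t, int dest_cpu)
{
	if (t->flags & SKETCH_TASK_PINNED)
		return false;
	if (dest_cpu < 0 || dest_cpu >= (int)(8 * sizeof(t->allowed_mask)))
		return false;
	if (!(t->allowed_mask & (1UL << dest_cpu)))
		return false;
	t->cpu = dest_cpu;
	return true;
}
```

In this model preemption is never touched, which is the distinction being
argued for: a pinned task can still be scheduled out, it just comes back on
the same CPU.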

-- 
Mathematics is the supreme nostalgia of our time.


Re: [PATCH][VIRTIO] Fix vring_init() ring computations

2007-11-07 Thread Anthony Liguori

Rusty Russell wrote:
> On Wednesday 07 November 2007 13:52:29 Anthony Liguori wrote:
> > This patch fixes a typo in vring_init().
>
> Thanks, applied.
>
> I've put it in the new, experimental virtio git tree on git.kernel.org.

Hrm, perhaps you forgot to push?  I don't see it in the tree although I 
see the config ops refactoring.

Regards,

Anthony Liguori

> Cheers,
> Rusty.




[Patch] Allocate sparse vmemmap block above 4G

2007-11-07 Thread Zou Nan hai
Resend the patch for more people to review

On some single-node x64 systems with a huge amount of physical memory,
e.g. > 64G, the memmap size may be very big.

If the memmap is allocated from low pages, it may occupy too much
memory below 4G. swiotlb could then fail to reserve its bounce buffer
under 4G, which leads to boot failure.

This patch first tries to allocate memmap memory above 4G in the sparse
vmemmap code. If that fails, it allocates the memmap above MAX_DMA_ADDRESS.
This patch is against 2.6.24-rc1-git14.

Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]>
Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>

diff -Nraup a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c 2007-11-06 15:16:12.0 +0800
+++ b/arch/x86/mm/init_64.c 2007-11-06 15:55:50.0 +0800
@@ -448,6 +448,13 @@ void online_page(struct page *page)
num_physpages++;
 }
 
+void * __meminit alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size,
+unsigned long align)
+{
+return __alloc_bootmem_core(pgdat->bdata, size,
+align, (4UL*1024*1024*1024), 0, 1);
+}
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 /*
  * Memory is added always to NORMAL zone. This means you will never get
diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h
--- a/include/linux/bootmem.h   2007-11-06 16:06:31.0 +0800
+++ b/include/linux/bootmem.h   2007-11-06 15:50:36.0 +0800
@@ -61,6 +61,10 @@ extern void *__alloc_bootmem_core(struct
  unsigned long limit,
  int strict_goal);
 
+extern void *alloc_bootmem_high_node(pg_data_t *pgdat,
+unsigned long size,
+unsigned long align);
+
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 extern void reserve_bootmem(unsigned long addr, unsigned long size);
 #define alloc_bootmem(x) \
diff -Nraup a/mm/bootmem.c b/mm/bootmem.c
--- a/mm/bootmem.c  2007-11-06 16:06:31.0 +0800
+++ b/mm/bootmem.c  2007-11-06 15:49:20.0 +0800
@@ -492,3 +492,11 @@ void * __init __alloc_bootmem_low_node(p
return __alloc_bootmem_core(pgdat->bdata, size, align, goal,
ARCH_LOW_ADDRESS_LIMIT, 0);
 }
+
+__attribute__((weak)) __meminit
+void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size,
+unsigned long align)
+{
+return NULL;
+}
+
diff -Nraup a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
--- a/mm/sparse-vmemmap.c   2007-11-06 15:16:12.0 +0800
+++ b/mm/sparse-vmemmap.c   2007-11-06 16:08:52.0 +0800
@@ -43,9 +43,13 @@ void * __meminit vmemmap_alloc_block(uns
if (page)
return page_address(page);
return NULL;
-   } else
+   } else {
+   void *p = alloc_bootmem_high_node(NODE_DATA(node), size, size);
+   if (p)
+   return p;
return __alloc_bootmem_node(NODE_DATA(node), size, size,
__pa(MAX_DMA_ADDRESS));
+   }
 }
 
 void __meminit vmemmap_verify(pte_t *pte, int node,






[Patch] Add strict_goal parameter to __alloc_bootmem_core

2007-11-07 Thread Zou Nan hai
Resend the patch for more people to review.

If __alloc_bootmem_core is given a goal, it first tries to allocate
memory above that goal. If that fails, it tries from the low pages.

Sometimes we don't want this behavior; we want the goal to be strict.

This patch introduces a strict_goal parameter to __alloc_bootmem_core.

If strict_goal is set, __alloc_bootmem_core returns NULL to indicate
that it can't allocate memory above that goal.

Note that we do not scan from last_success if strict_goal is set; the scan
starts from the beginning of the goal instead.
We skip this optimization to keep the code simple, because strict_goal is
not supposed to be used in a hot path.
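
The intended semantics can be sketched with a toy page bitmap in userspace C.
This is an illustration under simplifying assumptions (16 pages, no
alignment, no limit, no last_success); struct sketch_bootmem, sketch_scan()
and sketch_alloc() are hypothetical names and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGES 16

/* Toy page bitmap: true means the page is already reserved. */
struct sketch_bootmem {
	bool used[SKETCH_PAGES];
};

/* Reserve `count` contiguous free pages at or above `from`, or return -1. */
static int sketch_scan(struct sketch_bootmem *b, size_t from, size_t count)
{
	for (size_t start = from; start + count <= SKETCH_PAGES; start++) {
		size_t i;

		for (i = 0; i < count && !b->used[start + i]; i++)
			;
		if (i == count) {
			for (i = 0; i < count; i++)
				b->used[start + i] = true;
			return (int)start;
		}
	}
	return -1;
}

/*
 * Returns the first page index of the allocation, or -1 on failure.
 * With strict_goal set there is no fallback below the goal.
 */
static int sketch_alloc(struct sketch_bootmem *b, size_t count,
			size_t goal, bool strict_goal)
{
	int page;

	if (count == 0 || count > SKETCH_PAGES)
		return -1;
	/* A goal outside the managed range cannot be honoured strictly. */
	if (goal >= SKETCH_PAGES)
		return strict_goal ? -1 : sketch_scan(b, 0, count);

	page = sketch_scan(b, goal, count);
	if (page < 0 && !strict_goal && goal > 0)
		page = sketch_scan(b, 0, count);	/* relaxed: retry from the bottom */
	return page;
}
```

A caller that prefers high memory but can live with low memory would try a
strict allocation first and, if it returns -1, repeat the call with a lower
goal or with strict_goal cleared.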

Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]>
Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>

diff -Nraup a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
--- a/arch/x86/mm/numa_64.c 2007-10-24 11:50:57.0 +0800
+++ b/arch/x86/mm/numa_64.c 2007-11-07 13:06:50.0 +0800
@@ -247,7 +247,7 @@ void __init setup_node_zones(int nodeid)
__alloc_bootmem_core(NODE_DATA(nodeid)->bdata, 
memmapsize, SMP_CACHE_BYTES, 
round_down(limit - memmapsize, PAGE_SIZE), 
-   limit);
+   limit, 1);
 #endif
 } 
 
diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h
--- a/include/linux/bootmem.h   2007-11-07 13:06:35.0 +0800
+++ b/include/linux/bootmem.h   2007-11-07 13:06:04.0 +0800
@@ -58,7 +58,8 @@ extern void *__alloc_bootmem_core(struct
  unsigned long size,
  unsigned long align,
  unsigned long goal,
- unsigned long limit);
+ unsigned long limit,
+ int strict_goal);
 
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 extern void reserve_bootmem(unsigned long addr, unsigned long size);
diff -Nraup a/mm/bootmem.c b/mm/bootmem.c
--- a/mm/bootmem.c  2007-11-07 13:06:35.0 +0800
+++ b/mm/bootmem.c  2007-11-07 13:06:18.0 +0800
@@ -179,7 +179,7 @@ static void __init free_bootmem_core(boo
  */
 void * __init
 __alloc_bootmem_core(struct bootmem_data *bdata, unsigned long size,
- unsigned long align, unsigned long goal, unsigned long limit)
+ unsigned long align, unsigned long goal, unsigned long limit, int 
strict_goal)
 {
unsigned long offset, remaining_size, areasize, preferred;
unsigned long i, start = 0, incr, eidx, end_pfn;
@@ -212,15 +212,20 @@ __alloc_bootmem_core(struct bootmem_data
/*
 * We try to allocate bootmem pages above 'goal'
 * first, then we try to allocate lower pages.
-*/
-   if (goal && goal >= bdata->node_boot_start && PFN_DOWN(goal) < end_pfn) 
{
-   preferred = goal - bdata->node_boot_start;
+* if the goal is not strict.
+ */
+
+   preferred = 0;
+   if (goal) {
+   if (goal >= bdata->node_boot_start && PFN_DOWN(goal) < end_pfn) 
{
+   preferred = goal - bdata->node_boot_start;
 
if (bdata->last_success >= preferred)
-   if (!limit || (limit && limit > bdata->last_success))
+   if (!strict_goal && (!limit || (limit && limit > 
bdata->last_success)))
preferred = bdata->last_success;
-   } else
-   preferred = 0;
+   } else if (strict_goal)
+return NULL;
+   }
 
preferred = PFN_DOWN(ALIGN(preferred, align)) + offset;
areasize = (size + PAGE_SIZE-1) / PAGE_SIZE;
@@ -247,7 +252,7 @@ restart_scan:
i = ALIGN(j, incr);
}
 
-   if (preferred > offset) {
+   if (preferred > offset && !strict_goal) {
preferred = offset;
goto restart_scan;
}
@@ -421,7 +426,7 @@ void * __init __alloc_bootmem_nopanic(un
void *ptr;
 
list_for_each_entry(bdata, &bdata_list, list) {
-   ptr = __alloc_bootmem_core(bdata, size, align, goal, 0);
+   ptr = __alloc_bootmem_core(bdata, size, align, goal, 0, 0);
if (ptr)
return ptr;
}
@@ -449,7 +454,7 @@ void * __init __alloc_bootmem_node(pg_da
 {
void *ptr;
 
-   ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal, 0);
+   ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal, 0, 0);
if (ptr)
return ptr;
 
@@ -468,7 +473,7 @@ void * __init __alloc_bootmem_low(unsign
 
list_for_each_entry(bdata, &bdata_list, list) {
ptr = __alloc_bootmem_core(bdata, size, align, goal,
-   ARCH_LOW_ADDRESS_LIMIT);
+   ARCH_LOW_ADDRESS_LIMIT, 0);
if (ptr)
   

Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andreas Schwab
Andrew Morton <[EMAIL PROTECTED]> writes:

> diff -puN kernel/compat.c~a kernel/compat.c
> --- a/kernel/compat.c~a
> +++ a/kernel/compat.c
> @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
>   if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
>   return -EFAULT;
>   }
> - return compat_jiffies_to_clock_t(jiffies);
> + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> + LONG_MAX);

Are you sure you want LONG_MAX here, not 0x7fff?

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 15:28:33 -0800

> Perhaps this is a bug in glibc: it is interpreting the times() return value
> in the same way as other syscalls.

The problem is more likely that we are failing to
invoke force_successful_syscall_return() here.

Otherwise the syscall return path interprets negative
values as errors, and sets the cpu condition codes.

And that is what userspace is actually checking for
to determine if there is an error or not.
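
A small model of the behaviour described above, in userspace C. It is only a
sketch: the exact range treated as an error and the way the error is
signalled differ between architectures, so the 4095 bound, struct
sketch_regs and sketch_syscall_exit() are assumptions made for illustration.
The force_success argument plays the role of having called
force_successful_syscall_return().

```c
#include <stdbool.h>

/* Assumed bound for this sketch: small negative values are read as -errno. */
#define SKETCH_MAX_ERRNO 4095L

struct sketch_regs {
	long retval;
	bool error_flag;	/* stands in for the CPU condition code */
};

/* Model of a syscall exit path that flags small negative values as errors. */
static struct sketch_regs sketch_syscall_exit(long ret, bool force_success)
{
	struct sketch_regs regs;
	bool in_errno_range = ret < 0 && ret >= -SKETCH_MAX_ERRNO;

	regs.error_flag = in_errno_range && !force_success;
	/* On error, userspace sees the positive errno next to the flag. */
	regs.retval = regs.error_flag ? -ret : ret;
	return regs;
}
```

In this model a legitimate times() result of -100 comes back as "error 100"
unless the syscall forces success, which is the failure mode under
discussion.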


Re: writeout stalls in current -git

2007-11-07 Thread David Chinner
On Wed, Nov 07, 2007 at 08:15:06AM +0100, Torsten Kaiser wrote:
> On 11/7/07, David Chinner <[EMAIL PROTECTED]> wrote:
> > Ok, so it's not synchronous writes that we are doing - we're just
> > submitting bio's tagged as WRITE_SYNC to get the I/O issued quickly.
> > The "synchronous" nature appears to be coming from higher level
> > locking when reclaiming inodes (on the flush lock). It appears that
> > inode write clustering is failing completely so we are writing the
> > same block multiple times i.e. once for each inode in the cluster we
> > have to write.
> 
> Works for me. The only remaining stalls are sub second and look
> completely valid, considering the amount of files being removed.

> Tested-by: Torsten Kaiser <[EMAIL PROTECTED]>

Great - thanks for reporting the problem and testing the fix.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: 2.6.34-rc1 eat my photo SD card :-(

2007-11-07 Thread Rafael J. Wysocki
On Wednesday, 7 of November 2007, Romano Giannetti wrote:
> 
> On Tue, 2007-11-06 at 23:17 +0100, Romano Giannetti wrote:
> > Well, I started bisecting it. It will be a long shot, I suspect...
> 
> Well, I spent the last 36 hours (more or less) trying to bisect the SD
> problem. The method I used was to insert the card, umount it, and make 8 dd
> in a row; the kernel is "bad" if they differ, "good" if they are the same.
> 
> I could not finish the bisect. The last pair good/bad were:
> 
> bad:   [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9] 
>[BLOCK] blk_rq_map_sg: force clear termination bit
> good:  [e38f981758118d829cd40cfe9c09e3fa81e422aa] 
>exportfs: update documentation
> 
> The problem in concluding the bisect is that there is a whole series of
> commits, named [SG] something, that seem to matter; but my three tries of a
> commit between the previous two ended with the MMC layer not working, with
> this oops:

Can you please update the Bugzilla entry at
http://bugzilla.kernel.org/show_bug.cgi?id=9286 with this information?

 
> [   81.738991] BUG: unable to handle kernel NULL pointer dereference at 
> virtual address 
> [   81.739003] printing eip: c01db437 *pde =  
> [   81.739010] Oops:  [#1] SMP 
> [   81.739016] Modules linked in: mmc_block binfmt_misc rfcomm l2cap 
> bluetooth ppdev i915 drm acpi_cpufreq cpufreq_conservative cpufreq_stats 
> cpufreq_ondemand freq_table cpufreq_userspace cpufreq_powersave dock 
> container sbs sbshc af_packet nls_iso8859_1 nls_cp437 vfat fat nls_utf8 ntfs 
> dm_crypt dm_mod sbp2 parport_pc lp parport fuse snd_hda_intel snd_pcm_oss 
> snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss iTCO_wdt iTCO_vendor_support 
> serio_raw sdhci snd_seq_midi snd_rawmidi snd_seq_midi_event psmouse pcspkr 
> mmc_core snd_seq snd_timer snd_seq_device snd soundcore video output battery 
> snd_page_alloc ac button intel_agp agpgart evdev ext3 jbd mbcache sg sr_mod 
> cdrom sd_mod ata_piix ehci_hcd ata_generic ohci1394 uhci_hcd ieee1394 libata 
> scsi_mod generic usbcore r8169 thermal processor fan
> [   81.739122] 
> [   81.739127] Pid: 6075, comm: mmcqd Not tainted (2.6.23-bisect #19)
> [   81.739132] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
> [   81.739141] EIP is at blk_rq_map_sg+0xd7/0x190
> [   81.739145] EAX: 03619000 EBX:  ECX: c3464198 EDX: c3464698
> [   81.739150] ESI: 0361a000 EDI: 1000 EBP: cb82fe24 ESP: cb82fdec
> [   81.739154]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> [   81.739159] Process mmcqd (pid: 6075, ti=cb82e000 task=cb2a5550 
> task.ti=cb82e000)
> [   81.739163] Stack: 0292 c366c530 cb839a70 2000 0361b000 c3464698 
> 0001 0001 
> [   81.739176] c34e0848 01ae4698 c33ef2b0 c33ef2b0 cb2ec870 
> cb82fe3c f8e81e6c 
> [   81.739188]00200200 c3342580 c33ef2b0 cb2ec870 cb82ffb8 f8e816f9 
> 7898775f 5f6f5965 
> [   81.739200] Call Trace:
> [   81.739204]  [] show_trace_log_lvl+0x1a/0x30
> [   81.739213]  [] show_stack_log_lvl+0xb1/0xe0
> [   81.739220]  [] show_registers+0xc1/0x1d0
> [   81.739226]  [] die+0x11a/0x230
> [   81.739232]  [] do_page_fault+0x269/0x5f0
> [   81.739239]  [] error_code+0x72/0x78
> [   81.739247]  [] mmc_queue_map_sg+0x2c/0xe0 [mmc_block]
> [   81.739258]  [] mmc_blk_issue_rq+0x199/0x750 [mmc_block]
> [   81.739267]  [] mmc_queue_thread+0x80/0xf0 [mmc_block]
> [   81.739275]  [] kthread+0x42/0x70
> [   81.739282]  [] kernel_thread_helper+0x7/0x10
> [   81.739289]  ===
> [   81.739292] Code: f0 89 45 d8 8b 01 2b 05 80 aa 67 c0 c1 f8 02 69 c0 c5 4e 
> ec c4 c1 e0 0c 03 41 08 39 45 d8 0f 84 8e 00 00 00 f6 03 02 74 52 31 db <8b> 
> 03 c7 43 0c 00 00 00 00 c7 43 08 00 00 00 00 83 e0 03 0b 01 
> [   81.739358] EIP: [] blk_rq_map_sg+0xd7/0x190 SS:ESP 0068:cb82fdec
> 
> It seems to me that the two commits:
> 
> [BLOCK] blk_rq_map_sg: force clear termination bit
> [BLOCK] Don't clear sg_dma_len/addr() in blk_rq_map_sg()
> 
> have the potential to fix the aforementioned oops, but in a way that creates
> the reported problem for the mmc layer. It's just a gut feeling, as I don't
> have the knowledge of the kernel needed to debug this, but this comment:
> 
> +  * If the driver previously mapped a shorter
> +  * list, we could see a termination bit
> +  * prematurely unless it fully inits the sg
> +  * table on each mapping. We KNOW that there
> +  * must be more entries here or the driver
> +  * would be buggy, so force clear the
> +  * termination bit to avoid doing a full
> +  * sg_init_table() in drivers for each command.
> +  */
> 
> rang a bell. When the bug occurs, it seems that some random page is mapped
> into the device, so that... maybe the list was not supposed to continue in
> this case? 
> 
> Well, I hope this helps someone find the bug. I am available to
> test/try whatever patches you send me.
> 
>Romano 
> 
> Complete git bisect log:
> 
> git-bisect start
> # bad: 

Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?

2007-11-07 Thread Andi Kleen
On Thursday 08 November 2007 01:20, Matt Mackall wrote:
> On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote:
> > Ow.  Yes, from my reading delay_tsc() can return early (or after
> > heat-death-of-the-universe) if the TSCs are offset and if preemption
> > migrates the calling task between CPUs.
> >
> > I suppose a lameo fix would be to disable preemption in delay_tsc().
>
> preempt_disable is lousy documentation here. This and other cases
> (lots of per_cpu users, IIRC) actually want a migrate_disable() which
> is a proper subset. We can simply implement migrate_disable() as
> preempt_disable() for now and come back later and implement a proper
> migrate_disable() that still allows preemption (and thus avoids the
> latency).

We could actually do this right now. migrate_disable() can be just changing
the cpu affinity of the current thread to current cpu and then restoring it 
afterwards. That should even work from interrupt context.

get_cpu() etc. could be changed to use this then too.

-Andi


Fwd: same problem with 2.6.24-rc2

2007-11-07 Thread werner
On 7/Nov/2007 20:10 werner wrote ..
> With 2.6.23-rc2 the problem is the same: it crashed at the beginning:
> EIP 060 c03fdea4 EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200
> Again, during compilation the build complained that /arch/x86/Makefile.o
> cannot be found and that certain dependencies on it were not made. No such
> file is present in the source code (present are, e.g., Makefile_32 and
> Makefile_64), nor was it generated automatically during compilation. I
> think this is incorrect and the reason for the problems.
>
> wl
> [EMAIL PROTECTED]
> =
> On 7/Nov/2007 16:14 Andrew Morton wrote ..
> > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> wrote:
> > > I really don't know what's happening.  I don't understand anything about
> > > the kernel error reporting system.  Because of this, whenever there is a
> > > problem, I report it via e-mail to linux-kernel@vger.kernel.org .  I don't
> > > know what people there do with my messages.
> >
> >
> > It went like this:
> >
> > 1: you sent an email to linux-kernel
> >
> > 2: I sent a reply to you and linux-kernel
> >
> > 3: you sent a reply to me, but NOT linux-kernel!
> >
> > In other words, you did "reply", not "reply to all", thus you removed three
> > thousand people from the discussion.  One of those people is the person who
> > created the bug which you're hitting, and that person no longer knows
> > what's happening.
> >
> >
> > So please go back and resend all those emails, and retain ALL Cc:'s.  Don't
> > just send them only to me.  Keep all individuals and all mailing lists on
> > the email Cc: list.
==
*** www.copaya.yi.org / www.monkey.is-a-geek.net ***
The only community server in French Guiana.  Located on site, fast,
immune to wars / disasters in Europe.   Non-commercial and free service for:
http (forum, web page), irc (chat), ftp (download), name (subdomain) .



Re: [PATCH] create /sys/.../power when CONFIG_PM is set

2007-11-07 Thread Greg KH
On Wed, Nov 07, 2007 at 11:24:55PM +0100, Rafael J. Wysocki wrote:
> On Wednesday, 7 of November 2007, Daniel Drake wrote:
> > The CONFIG_SUSPEND changes in 2.6.23 caused a regression under certain
> > configuration conditions (SUSPEND=n, USB_AUTOSUSPEND=y) where all USB device
> > attributes in sysfs (idVendor, idProduct, ...) silently disappeared, causing
> > udev breakage and more.
> > 
> > The cause of this is that the /sys/.../power subdirectory is now only
> > created when CONFIG_PM_SLEEP is set, however, it should be created whenever
> > CONFIG_PM is set to handle the above situation. The following patch fixes the
> > regression.
> > 
> > Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>
> 
> Acked-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> Greg, I think this patch should go through your tree?

Yes, I'll take it.  I'm at a conference until Friday, but will take it
then and then get it to Linus before 2.6.24 is out.

thanks,

greg k-h


Re: [PATCH] sched: avoid large irq-latencies in smp-balancing

2007-11-07 Thread Eric St-Laurent

On Wed, 2007-11-07 at 17:10 -0500, Steven Rostedt wrote:
> > 
> > It would be nice if sched_nr_migrate didn't exist, really.  It's hard to
> > imagine anyone wanting to tweak it, apart from developers.
> 
> I'm not so sure about that. It is a tunable for RT. That is we can tweak
> this value to be smaller if we don't like the latencies it gives us.
> 
> This is one of those things that sacrifices performance for latency.
> The higher the number, the better it can spread tasks around, but it
> also causes large latencies.
> 
> I've just included this patch into 2.6.23.1-rt11 and it brought down an
> unbounded latency to just 42us. (previously we got into the
> milliseconds!).
> 
> Perhaps when this feature matures, we can come to a good defined value
> that would be good for all. But until then, I recommend keeping this a
> tunable.


Why not use the latency-expectation infrastructure?

Iterate under lock until (or before...) the system global latency is
respected.
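
The trade-off Steven describes can be pictured with a small sketch. It
assumes a hypothetical balancer that moves tasks in batches and drops its
lock between batches; balance_in_batches() and its parameters are
illustrative names, not code from the patch or from the scheduler.

```c
/*
 * Illustrative only: move `total` tasks, holding an (imaginary) lock for at
 * most `nr_migrate` tasks at a time.  Returns how many times the lock was
 * taken.  A smaller nr_migrate means shorter lock holds (lower latency) but
 * more lock round trips (lower throughput).
 */
static unsigned int balance_in_batches(unsigned int total,
                                       unsigned int nr_migrate)
{
    unsigned int lock_holds = 0;

    if (nr_migrate == 0)
        return 0;   /* nothing may be moved per hold */

    while (total > 0) {
        unsigned int batch = total < nr_migrate ? total : nr_migrate;

        /* lock(); move `batch` tasks; unlock(); */
        total -= batch;
        lock_holds++;
    }
    return lock_holds;
}
```

A latency-driven variant, as suggested above, would presumably replace the
fixed batch size with a time budget checked inside the loop.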


- Eric



Re: SATA eating my disk, port reset, destroying unrelated data

2007-11-07 Thread Robert Hancock

Norbert Preining wrote:

> Dear all!
>
> (please Cc me for answers)
>
> Since about 5 days I am having serious problems with my SATA drive:
>
> kernel 2.6.22 (from Debian/sid)
> hardware nv
>
> Sometimes at boot time, often/always at disk io intense stuff:
>
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40 action 0x2

Serror 0x40 means a handshake error. Usually Serror indications are
due to a hardware problem (bad SATA cable, power or drive problem).

> ata1.00: (BMDMA stat 0x25)
> ata1.00: cmd 35/00:00:2a:6f:c0/00:04:0c:00:00/e0 tag 0 cdb 0x0 data 524288 out
>  res 51/84:10:1a:72:c0/84:01:0c:00:00/e0 Emask 0x10 (ATA bus error)
> ata1: soft resetting port
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/133
> ata1: EH complete
>
> Even worse, sometimes the reset does not work ...
>
> ata1: device not ready (errno=-16), forcing hardreset
> ata1: hard resetting port
> ata1 SRST failed (errno=-19)
> ata1: reset failed (errno=-19), retrying in 10 secs
> ..
>
> (typed from a digital photo, nothing remains in the logs)
>
> After this I need to do a cold boot otherwise the drive is really in a
> bad state and not even the bios gets it right.

If even the BIOS cannot reset properly then that also really points to a
hardware problem..

> Interestingly the whole stuff DID work for a long time until I did too
> many things at the same time: 2 x svn up, copying 40G from the SATA
> drive to an USB drive, aptitude upgrade. Before I did regularly the same
> stuff (like svn up etc), but this time it was too much, it seems.
>
> Apropos data hosing: After the first incident some data on my windows
> partitions (/dev/sda1) was hosed, programs missing, chkdisk necessary
> etc.
>
> I attach dmesg (from the current boot with a succeeding soft reset, I
> interrupted the svn process before the SATA drives goes into hard reset
> failures), .config, lspci -v output.
>
> Are there any chances that using 2.6.23 will improve/fix this? Any other
> suggestions?
>
> I would consider it an hardware problem, but since it started at one big
> io thingy and is persistent since then I am a bit sceptic.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Bartlomiej Zolnierkiewicz
On Thursday 08 November 2007, Denys Fedoryshchenko wrote:
> 2.6.24-rc2 not working very well
> 
> 
> dmesg
> [   12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> [   12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override 
> with idebus=xx
> [   12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at  PCI slot 
> :00:12.2
> [   12.454070] SC1200: not 100% native mode: will probe irqs later
> [   12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, 
> hdb:pio
> [   12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, 
> hdd:pio
> [   12.515810] Probing IDE interface ide0...
> [   12.528810] Clocksource tsc unstable (delta = -497423729 ns)
> [   12.545888] Time: pit clocksource has been installed.
> [   12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive
> [   12.578340] hda: applying conservative PIO "downgrade"
> [   12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1
> [   12.594006] hda: MW DMA 2 mode selected
> [   12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> [   12.608778] Probing IDE interface ide1...
> [   12.623192] hda: max request size: 128KiB
> [   12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63, 
> DMA
> [   12.657134]  hda:<4>hda: dma_timer_expiry: dma status == 0x21
> [   12.865846] hda: DMA timeout error
> [   12.876092]  ide_dma_end dma_stat=21 err=1 newerr=0
> [   12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete 
> DataRequest }
> [   12.914977] ide: failed opcode was: unknown
> [   12.927743] hda: DMA disabled
> [   12.937035] ide0: reset: success
> [   12.948324]  hda1
> 
> Mounting takes a long time on the 1GB card because of the DMA issues. I am not
> sure about the dmesg timestamps showing only a few seconds; in real life it
> took about 2 minutes.

Please try booting with "hda=nodma".

It could be a hardware problem (CF adapter without DMA lines).

Thanks,
Bart


Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?

2007-11-07 Thread Matt Mackall
On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote:
> Ow.  Yes, from my reading delay_tsc() can return early (or after
> heat-death-of-the-universe) if the TSCs are offset and if preemption
> migrates the calling task between CPUs.
> 
> I suppose a lameo fix would be to disable preemption in delay_tsc().

preempt_disable is lousy documentation here. This and other cases
(lots of per_cpu users, IIRC) actually want a migrate_disable() which
is a proper subset. We can simply implement migrate_disable() as
preempt_disable() for now and come back later and implement a proper
migrate_disable() that still allows preemption (and thus avoids the
latency).
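
A minimal sketch of the interim approach described above, written so that it
compiles in userspace. The preempt_disable()/preempt_enable() stand-ins and
the pinned_delay() caller are assumptions made for illustration; only the
idea of defining migrate_disable() in terms of preempt_disable() comes from
this mail.

```c
/*
 * Userspace stand-ins so the sketch compiles outside the kernel; in the
 * kernel these would be the real preempt_disable()/preempt_enable().
 */
static int preempt_count;
static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

/*
 * Interim definition: "do not migrate me" is expressed as "do not preempt
 * me".  A later implementation could pin the task to its CPU while still
 * allowing preemption, without touching any caller.
 */
static inline void migrate_disable(void) { preempt_disable(); }
static inline void migrate_enable(void)  { preempt_enable(); }

/*
 * Hypothetical caller: a per-CPU-counter based delay loop that must stay on
 * one CPU from start to finish.
 */
static unsigned long pinned_delay(unsigned long loops)
{
    unsigned long spun = 0;

    migrate_disable();
    while (spun < loops)
        spun++;
    migrate_enable();
    return spun;
}
```

The point of the separate name is documentation: callers state that they
need to stay on one CPU, not that they need preemption off.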

-- 
Mathematics is the supreme nostalgia of our time.


Re: [RFC] [PATCH 3/3] Recursive mtime for ext3

2007-11-07 Thread Theodore Tso
On Wed, Nov 07, 2007 at 03:36:05PM +0100, Jan Kara wrote:
> > What if more than one application wants to use this facility?
>
>   That should be fine - let's see: Each application keeps somewhere a time
> when it started a scan of a subtree (or it can actually remember a time when it
> set the flag for each directory), during the scan, it sets the flag on
> each directory. When it wakes up to recheck the subtree it just compares
> the rtime against the stored time - if rtime is greater, subtree has been
> modified since the last scan and we recurse in it and when we are finished
> with it we set the flag. Now notice that we don't care about the flag when
> we check for changes - we care only for rtime - so if there are several
> applications interested in the same subtree, the flag just gets set more
> often and thus the update of rtime happens more often but the same scheme
> still works fine.

OK, so in this case you don't need to set rtime on every single
file inode, but only on the directory inode, right?  Because you're only
checking the rtime at the directory level, and not the flag.
And it's just as easy for you to check the rtime flag for the file's
containing directory (modulo magic vis-a-vis hard links) as the file's
inode.

I'm just really wishing that rtime and the rtime flag didn't have to live
on disk, but could rather be in memory.  If you only needed to save
the directory flags and rtimes, that might actually be doable.

Note by the way that since you need to own the file/directory to set
flags, this means that only programs that are running as root or
running as the uid who owns the entire subtree will be able to use
this scheme.  One advantage of doing in kernel memory is that you
might be able to support watching a tree that is not owned by the
watcher.

>   I don't get it here - you need to scan the whole subtree and set the flag
> only during the initial scan. Later, you need to scan and set the flag only
> for directories in whose subtree something changed. Similarly rtime needs
> to be updated for each inode at most once after the scan. 

OK, so in the worst case every single file in a kernel source tree
might change after doing an extreme git checkout.  That means around
36k of files get updated.  So if you have to set/clear the rtime flag
during the checkout process 36k file inodes would have to have their
rtime flag cleared, plus 2k worth of directory inodes; but those would
probably be folded into other changes made to the inodes anyway.  But
then when trackerd goes back and scans the subtree, if you are
actually setting rtime flags for every single file inode, then that's
38k of inodes that need updating.  If you only need to set the rtime
flags for directories, that's only 2k worth of extra gratuitous inode
updates.
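
The directory-only rescan discussed above can be sketched as follows. This
is a userspace model under stated assumptions: struct dir and
dirs_to_rescan() are hypothetical names, and rtime is modelled as a plain
time_t field rather than anything stored by ext3.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical in-memory model of a directory; `rtime` stands for the
 * recursive modification time discussed in this thread.
 */
struct dir {
    time_t rtime;
    struct dir **subdirs;
    size_t nsubdirs;
};

/*
 * Count the directories a watcher has to rescan: a subtree is entered only
 * when its rtime is newer than the time of the previous scan.
 */
static size_t dirs_to_rescan(const struct dir *d, time_t last_scan)
{
    size_t n, i;

    if (d->rtime <= last_scan)
        return 0;   /* nothing below this directory changed */

    n = 1;
    for (i = 0; i < d->nsubdirs; i++)
        n += dirs_to_rescan(d->subdirs[i], last_scan);
    return n;
}
```

Only directory nodes carry rtime here, which is the case where the extra
updates stay at the 2k directory inodes instead of all 38k inodes.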

- Ted


Re: [PATCH] r8169 fix regression on ASUS motherboards (updated)

2007-11-07 Thread Francois Romieu
Mark Lord <[EMAIL PROTECTED]> :
[...]
> I've now received a couple of private emails from people reporting
> full success with this patch.

Ok, I have pushed the patch below for Jeff to pull at korg.

From 1dd7681bc2ff171341ea5cae957f8ecb5c0c102e Mon Sep 17 00:00:00 2001
From: Mark Lord <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 01:03:04 +0100
Subject: [PATCH] r8169: revert 7da97ec96a0934319c7fbedd3d38baf533e20640 (partly)

Various symptoms depending on the .config options:
- the card stops working after some (short) time
- the card does not work at all
- the card disappears (nothing in lspci/dmesg)

A real power-off is needed to recover the card.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
---
 drivers/net/r8169.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 9dbab3f..a37cf82 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1328,6 +1328,7 @@ static void rtl_hw_phy_config(struct net_device *dev)
break;
case RTL_GIGA_MAC_VER_11:
case RTL_GIGA_MAC_VER_12:
+   break;
case RTL_GIGA_MAC_VER_17:
rtl8168b_hw_phy_config(ioaddr);
break;
-- 
1.5.3.3

-- 
Ueimor


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 7 Nov 2007 15:28:33 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> > compat_sys_times() has bogus return until jiffies is >= 0.  I discovered
> > this running LTP within 5 minutes of booting.
> > 
> > The return result
> > 
> > return compat_jiffies_to_clock_t(jiffies);
> > 
> > will return '-1' to user space and set the negated clock_t value to errno.
> > 
> > I'm not sure what the correct fix for this is.  I can come up with a patch
> > if anyone has ideas on how to fix it.
> > 
> > At minimum, perhaps it should return a sane errno value.
> 
> RETURN VALUE
>times()  returns  the  number of clock ticks that have elapsed since an
>arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
>the  moment  the  system  was  booted.   Since Linux 2.6, this point is
>(2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
>time.   The  return  value  may  overflow  the  possible  range of type
>clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
>priately.
> 
> 
> Perhaps this is a bug in glibc: it is interpreting the times() return value
> in the same way as other syscalls.
> 
> It would have been sensible for us to add INITIAL_JIFFIES to the value
> instead of exposing this kernel-only detail to the world, although the
> problem will of course reoccur once jiffies hits 0x80000000.  Unfortunately
> we've even gone and enshrined this bogon in the manpage.
> 
> Proposed fix:
> 
> -return compat_jiffies_to_clock_t(jiffies);
> +return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> + 0x7fffffff);
> 
> ?

Like this?

It gets messy.


From: Andrew Morton <[EMAIL PROTECTED]>

David Brown points out that compat_sys_times() (and sys_times()) can return
arbitrary 32-bit (or 64-bit) values.  If these happen to be negative (jiffy
wrap, or before INITIAL_JIFFIES) then libc will interpret this as an error and
will return -1 to the libc user and will set errno.

The manpage for times(2) says:

   times()  returns  the  number of clock ticks that have elapsed since an
   arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
   the  moment  the  system  was  booted.   Since Linux 2.6, this point is
   (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
   time.   The  return  value  may  overflow  the  possible  range of type
   clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
   priately.

We can fix this by masking the return value down to a 31-bit (63-bit) value.

Also, let's correct for INITIAL_JIFFIES - this isn't a detail which should be
exposed to userspace.

Unfortunately this change can break userspace.  If a program was (correctly)
doing:

unsigned long start = times(...);
...
unsigned long end = times(...);
unsigned long delta = end - start;

then `delta' can be grossly wrong if we wrapped in the interval.  Instead
userspace will need to mask `delta' by 0x7fffffff (0x7fffffffffffffff) to get
the correct number.

But userspace was already busted in the presence of wraparound, due to glibc's
convert-to-negative-one behaviour.

Given all this stuff, the return value from sys_times() doesn't seem a
particularly useful or reliable kernel interface.

Cc: David Brown <[EMAIL PROTECTED]>
Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Michael Kerrisk <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/compat.c |3 ++-
 kernel/sys.c|3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff -puN kernel/sys.c~a kernel/sys.c
--- a/kernel/sys.c~a
+++ a/kernel/sys.c
@@ -897,7 +897,8 @@ asmlinkage long sys_times(struct tms __u
if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
return -EFAULT;
}
-   return (long) jiffies_64_to_clock_t(get_jiffies_64());
+   return jiffies_64_to_clock_t((get_jiffies_64() + INITIAL_JIFFIES) &
+   LONG_MAX);
 }
 
 /*
diff -puN kernel/compat.c~a kernel/compat.c
--- a/kernel/compat.c~a
+++ a/kernel/compat.c
@@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
return -EFAULT;
}
-   return compat_jiffies_to_clock_t(jiffies);
+   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+   LONG_MAX);
 }
 
 /*
_
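
The wraparound arithmetic from the changelog above can be checked with a
small userspace sketch. It assumes the kernel masks the times() return value
with LONG_MAX as in the patch; ticks_elapsed() is a hypothetical helper, not
a libc or kernel function.

```c
#include <limits.h>

/*
 * Difference between two masked tick counts.  Both inputs are assumed to be
 * in the range 0..LONG_MAX, as they would be after the kernel-side masking.
 * Masking the unsigned difference with LONG_MAX undoes a single wrap from
 * LONG_MAX back to 0 during the interval.
 */
static unsigned long ticks_elapsed(unsigned long start, unsigned long end)
{
    return (end - start) & (unsigned long)LONG_MAX;
}
```

Without the mask, an interval that starts at LONG_MAX - 5 and ends at 4
comes out as a huge value with the top bit set; with the mask it comes out
as the 10 ticks that actually elapsed.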



Re: Problem with accessing namespace_sem from LSM.

2007-11-07 Thread Tetsuo Handa
Hello.

Christoph Hellwig wrote:
> Same argument as with the AA folks: it does not have any business looking
> at the vfsmount.  If you create a file it can and in many setups will
> show up in multiple vfsmounts, so making decisions based on the particular
> one this creat happens through is wrong and actually dangerous.
Thus TOMOYO 1.x doesn't use LSM hooks, and AppArmor for OpenSuSE 10.3
added "struct vfsmount" parameter for VFS helper functions and LSM hooks.

Not all systems use bind mounts.
There is likely only one vfsmount which corresponds with a given dentry.

What does "dangerous" mean? Does it cause a crash?

Regards.


Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg

2007-11-07 Thread Chuck Ebbert
On 11/07/2007 04:12 PM, Roland McGrath wrote:
> Sure has my ACK.  
> I never really understood why my old patch was not taken 2.5 years ago.
> 
> 

I forget the details, but I had to make some kind of trivial change
to make it work in some corner cases.


Re: sata NCQ blacklist entry

2007-11-07 Thread Robert Hancock

Tejun Heo wrote:

> Florian La Roche wrote:
> > Hello all,
> >
> > I've taken email addresses from the last NCQ blacklist changes going
> > into the kernel.
> > This Fujitsu drive also gives me spurious command completions. Detailed
> > output also available at https://bugzilla.redhat.com/show_bug.cgi?id=366181.
> >
> > Let me know if you need more info or anything else.
> >
> > --- drivers/ata/libata-core.c
> > +++ drivers/ata/libata-core.c
> > @@ -4222,6 +4222,7 @@
> > { "WDC WD740ADFD-00NLR1", NULL,   ATA_HORKAGE_NONCQ, },
> > { "WDC WD3200AAJS-00RYA0", "12.01B01",  ATA_HORKAGE_NONCQ, },
> > { "FUJITSU MHV2080BH","00840028",   ATA_HORKAGE_NONCQ, },
> > +   { "FUJITSU MHW2160BJ G2",   NULL,   ATA_HORKAGE_NONCQ },
> > { "ST9120822AS",  "3.CLF",  ATA_HORKAGE_NONCQ, },
> > { "ST9160821AS",  "3.CLF",  ATA_HORKAGE_NONCQ, },
> > { "ST9160821AS",  "3.ALD",  ATA_HORKAGE_NONCQ, },
>
> Thanks.  We're currently trying to find out what's actually going on
> with all these drives.  At first, drives which got blacklisted weren't
> many and made sense (had other problems with NCQ, etc..) but with new
> generation drives from many vendors showing the same symptom, we aren't
> too sure now.
>
> I'll keep your email in my todo list and add the drive to the blacklist
> once the problem is verified.

I agree that something seems fishy with this. It seems unlikely that
this many drives from multiple vendors would have the exact same,
relatively obscure problem..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Denys Fedoryshchenko
2.6.24-rc2 not working very well


dmesg
[   12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[   12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override 
with idebus=xx
[   12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at  PCI slot 
:00:12.2
[   12.454070] SC1200: not 100% native mode: will probe irqs later
[   12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, 
hdb:pio
[   12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, 
hdd:pio
[   12.515810] Probing IDE interface ide0...
[   12.528810] Clocksource tsc unstable (delta = -497423729 ns)
[   12.545888] Time: pit clocksource has been installed.
[   12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive
[   12.578340] hda: applying conservative PIO "downgrade"
[   12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1
[   12.594006] hda: MW DMA 2 mode selected
[   12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[   12.608778] Probing IDE interface ide1...
[   12.623192] hda: max request size: 128KiB
[   12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63, 
DMA
[   12.657134]  hda:<4>hda: dma_timer_expiry: dma status == 0x21
[   12.865846] hda: DMA timeout error
[   12.876092]  ide_dma_end dma_stat=21 err=1 newerr=0
[   12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete 
DataRequest }
[   12.914977] ide: failed opcode was: unknown
[   12.927743] hda: DMA disabled
[   12.937035] ide0: reset: success
[   12.948324]  hda1

Mounting takes a long time on the 1GB card because of the DMA issues. I am not
sure about the dmesg timestamps showing only a few seconds; in real life it
took about 2 minutes.

after that in dmesg
[   14.965070] hda: dma_timer_expiry: dma status == 0x21
[   15.107909] hda: DMA timeout error
[   15.118149]  ide_dma_end dma_stat=21 err=1 newerr=0
[   15.132809] hda: dma timeout error: status=0x58 { DriveReady SeekComplete 
DataRequest }
[   15.157035] ide: failed opcode was: unknown
[   15.169799] hda: DMA disabled
[   15.178797] ide0: reset: success
[   15.312698] hda: dma_timer_expiry: dma status == 0x21
[   15.650705] hda: DMA timeout error
[   15.660952]  ide_dma_end dma_stat=21 err=1 newerr=0
[   15.675614] hda: dma timeout error: status=0x58 { DriveReady SeekComplete 
DataRequest }
[   15.699836] ide: failed opcode was: unknown
[   15.712601] hda: DMA disabled
[   15.721603] ide0: reset: success
[   16.325999] hda: dma_timer_expiry: dma status == 0x21
[   16.565756] hda: DMA timeout error
[   16.576001]  ide_dma_end dma_stat=21 err=1 newerr=0
[   16.590661] hda: dma timeout error: status=0x58 { DriveReady SeekComplete 
DataRequest }
[   16.614886] ide: failed opcode was: unknown
[   16.627651] hda: DMA disabled
[   16.636659] ide0: reset: success
[   16.650061] EXT2-fs warning: mounting unchecked fs, running e2fsck is 
recommended


On Wed, 7 Nov 2007 18:20:45 -0500, Jeff Garzik wrote
> On Wed, Nov 07, 2007 at 02:12:55PM -0500, Mark Lord wrote:
> > That cannot be correct (??).  Is this with hdparm-7.7 (latest sourceforge) ??
> > Can you show us the "hdparm --Istdout" output as well, please.
> 
> If this is applicable...  FWIW hdparm was only recently (in past <72
> hours) updated from 6.9 to 7.7 in Fedora...
> 
>   Jeff


--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.



Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg

2007-11-07 Thread Roland McGrath
FYI, http://sourceware.org/systemtap/wiki/utrace/tests has details on the
ptrace-tests suite we're collecting.  A test I added there is how I noticed
the PTRACE_GET_THREAD_AREA regression.  A regression test for the ebp bug
should be easy to add too.


Thanks,
Roland


Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-07 Thread Denys Fedoryshchenko
I am using Gentoo (and it is a custom build of linux, actually only busybox +
kernel + uclibc and a few other tools), hdparm is vanilla 7.7

I will try to compile -rc2 now to see if there are any changes.

With 16MB 2.6.24-rc1 works fine; 1GB also works, with some errors in dmesg.

And that is IF all of this is important at all, because it is relatively old
hardware; if this is only a hardware-specific bug, a workaround that makes it
usable is probably enough. I don't think many people use them now, but IMHO
things must work in the kernel if they are there.

On Wed, 7 Nov 2007 18:20:45 -0500, Jeff Garzik wrote
> On Wed, Nov 07, 2007 at 02:12:55PM -0500, Mark Lord wrote:
> > That cannot be correct (??).  Is this with hdparm-7.7 (latest sourceforge) ??
> > Can you show us the "hdparm --Istdout" output as well, please.
> 
> If this is applicable...  FWIW hdparm was only recently (in past <72
> hours) updated from 6.9 to 7.7 in Fedora...
> 
>   Jeff


--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.



Re: 2.6.24-rc1-gb4f5550 oops

2007-11-07 Thread Rafael J. Wysocki
On Monday, 5 of November 2007, Grant Wilson wrote:
> Hi,
> I got this oops on 2.6.24-rc1-641-gb4f5550:

(1) Is this reproducible?
(2) Did it happen previously on your system?


> [18073.371126] Unable to handle kernel NULL pointer dereference at 
> 0120 RIP: 
> [18073.371134]  [] check_preempt_wakeup+0x6e/0x110
> [18073.371144] PGD 81f9067 PUD 81c8067 PMD 0 
> [18073.371151] Oops:  [1] PREEMPT SMP 
> [18073.371157] CPU 2 
> [18073.371161] Modules linked in: vfat fat
> [18073.371168] Pid: 4639, comm: kwin Not tainted 2.6.24-rc1 #1
> [18073.371171] RIP: 0010:[]  [] 
> check_preempt_wakeup+0x6e/0x110
> [18073.371177] RSP: 0018:810008531a78  EFLAGS: 00010006
> [18073.371179] RAX:  RBX:  RCX: 
> 
> [18073.371183] RDX: 810004441bf0 RSI: 81000801e860 RDI: 
> 81000444ab80
> [18073.371186] RBP: 810008531aa8 R08: 00d0d47a4a90 R09: 
> 
> [18073.371188] R10: 810004441bf0 R11: 0001 R12: 
> 810006520400
> [18073.371190] R13: 81000801e860 R14: 81000a63a000 R15: 
> 81000443d8e0
> [18073.371193] FS:  2b7d646a86f0() GS:810004c11780() 
> knlGS:
> [18073.371196] CS:  0010 DS:  ES:  CR0: 8005003b
> [18073.371199] CR2: 0120 CR3: 08495000 CR4: 
> 06e0
> [18073.371202] DR0:  DR1:  DR2: 
> 
> [18073.371211] DR3:  DR6: 0ff0 DR7: 
> 0400
> [18073.371214] Process kwin (pid: 4639, threadinfo 81000853, task 
> 81000840a860)
> [18073.371216] Stack:  81000444ab80 0001 81000801e860 
> 81000444ab80
> [18073.371231]  0002 81000443d8e0 810008531b38 
> 8023061e
> [18073.371238]   810004441b80 0002 
> 0001
> [18073.371245] Call Trace:
> [18073.371250]  [] try_to_wake_up+0x2fe/0x3a0
> [18073.371253]  [] default_wake_function+0xd/0x10
> [18073.371257]  [] __wake_up_common+0x5a/0x90
> [18073.371260]  [] __wake_up_sync+0x4a/0x70
> [18073.371264]  [] unix_write_space+0x8f/0xa0
> [18073.371269]  [] sock_wfree+0x49/0x50
> [18073.371272]  [] __kfree_skb+0x69/0xe0
> [18073.371275]  [] kfree_skb+0x17/0x30
> [18073.371278]  [] unix_stream_recvmsg+0x267/0x610
> [18073.371283]  [] sock_aio_read+0x107/0x110
> [18073.371287]  [] do_sync_read+0xf1/0x130
> [18073.371291]  [] sock_ioctl+0x0/0x260
> [18073.371295]  [] autoremove_wake_function+0x0/0x40
> [18073.371299]  [] unix_ioctl+0xb2/0xf0
> [18073.371302]  [] sock_ioctl+0xd1/0x260
> [18073.371305]  [] do_ioctl+0x31/0x90
> [18073.371308]  [] vfs_read+0x156/0x160
> [18073.371311]  [] sys_read+0x50/0x90
> [18073.371315]  [] system_call+0x7e/0x83
> [18073.371317] 
> [18073.371319] 
> [18073.371319] Code: 48 8b 90 20 01 00 00 48 39 93 20 01 00 00 75 e2 48 81 3b 
> 00 
> [18073.371346] RIP  [] check_preempt_wakeup+0x6e/0x110
> [18073.371351]  RSP 
> [18073.371354] CR2: 0120
> [18073.371358] note: kwin[4639] exited with preempt_count 3


Re: 2.6.34-rc1 eat my photo SD card :-(

2007-11-07 Thread Roland Dreier
 > Well, I spent the last 36 hours (more or less) trying to bisect the SD
 > problem. The method I used was to insert the card, umount it, and make 8 dd
 > in a row; the kernel is "bad" if they differs, "good" if they are the same. 
 > 
 > I could not finish the bisect. The last pair good/bad were:
 > 
 > bad:   [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9] 
 >[BLOCK] blk_rq_map_sg: force clear termination bit
 > good:  [e38f981758118d829cd40cfe9c09e3fa81e422aa] 
 >exportfs: update documentation

Thanks, that helps.  I read over the mmc changes in between those two
commits, and I think I found the problem... could you please try the
patch below (on top of the latest kernel) and report back how it
works?  Unfortunately I am traveling and I don't have an SD card with
me to test on my laptop...

Pierre, assuming Romano tests this patch successfully, please apply!

Thanks,
  Roland

<-- patch below -->

mmc: Fix sg helper copy-and-paste error

Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
following bogus change in drivers/mmc/card/queue.c:

> - src_buf = page_address(src->page) + src->offset;
> + src_buf = sg_virt(dst);

(Notice that "src" is converted to "dst").  Turn this "dst" back into
the intended "src".

Cc: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 9203a0b..1b9c9b6 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -310,7 +310,7 @@ static void copy_sg(struct scatterlist *dst, unsigned int dst_len,
}
 
if (src_size == 0) {
-   src_buf = sg_virt(dst);
+   src_buf = sg_virt(src);
src_size = src->length;
}
 
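As an illustration of why the one-word slip in the patch above matters, here is a minimal user-space sketch of a segment-to-segment copy loop. It is not the kernel code: struct seg, copy_segs() and demo_copy_ok() are hypothetical names, the loop body outside the hunk shown in the patch is an assumption about the general shape of such a copy, and the buggy flag only exists to reproduce the "dst instead of src" mistake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct scatterlist. */
struct seg {
	char *buf;      /* plays the role of sg_virt(sg) */
	size_t length;  /* plays the role of sg->length */
};

/*
 * Copy src_len source segments into the destination segments.
 * With buggy == true the source pointer is taken from the current
 * destination segment, mimicking the bad conversion.
 */
static void copy_segs(struct seg *dst, size_t dst_len,
		      struct seg *src, size_t src_len, bool buggy)
{
	char *dst_buf = NULL, *src_buf = NULL;
	size_t dst_size = 0, src_size = 0;

	while (src_len) {
		assert(dst_len != 0);	/* stands in for a BUG_ON() */

		if (dst_size == 0) {
			dst_buf = dst->buf;
			dst_size = dst->length;
		}
		if (src_size == 0) {
			src_buf = buggy ? dst->buf : src->buf;
			src_size = src->length;
		}

		size_t chunk = dst_size < src_size ? dst_size : src_size;

		/* memmove: in the buggy case the two ranges can overlap */
		memmove(dst_buf, src_buf, chunk);

		dst_buf += chunk;
		src_buf += chunk;
		dst_size -= chunk;
		src_size -= chunk;

		if (dst_size == 0) {
			dst++;
			dst_len--;
		}
		if (src_size == 0) {
			src++;
			src_len--;
		}
	}
}

/* Copy "abcd" + "efgh" into one 8-byte segment; true if it arrived intact. */
static bool demo_copy_ok(bool buggy)
{
	char a[] = "abcd", b[] = "efgh";
	char out[9] = "........";
	struct seg src[] = { { a, 4 }, { b, 4 } };
	struct seg dst[] = { { out, 8 } };

	copy_segs(dst, 1, src, 2, buggy);
	return memcmp(out, "abcdefgh", 8) == 0;
}
```

In this model demo_copy_ok(false) returns true, while demo_copy_ok(true) returns false: with the slip in place the destination is only ever copied onto itself, so the source data never reaches it.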


Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg

2007-11-07 Thread Jeff Dike
On Wed, Nov 07, 2007 at 01:12:22PM -0800, Roland McGrath wrote:
> Sure has my ACK.  
> I never really understood why my old patch was not taken 2.5 years ago.

Nor I.  It's needed.

As is your PTRACE_SET_THREAD_INFO patch from yesterday - with these
two fixes, I can boot a 32-bit UML on a 64-bit host.

Jeff

-- 
Work email - jdike at linux dot intel dot com


Re: [PATCH 1/2] Suppress A.OUT library support in ELF binfmt if !CONFIG_BINFMT_AOUT [try #3]

2007-11-07 Thread David Howells
David Woodhouse <[EMAIL PROTECTED]> wrote:

> Ew, no. This is horridly broken. You should never use CONFIG_xxx_MODULE
> in the static kernel at all -- and you should _especially_ not be using
> it in header files which are exported to userspace.

AOUT support can be mostly built into a module, but a small part of it that is
arch-specific still gets built into the main kernel.  *That* is the main thing
that is wrong.

I suppose it might be possible to move those bits of the main kernel into
inline functions in asm/a.out.h and thus include them directly in
binfmt_aout.ko.

> This abomination certainly doesn't seem to have any direct relation to
> mn10300 support -- I think all you really need there is not to attempt
> to export {asm,linux}/a.out.h if asm/a.out.h doesn't exist, which is
> something you haven't attempted here anyway.

No, it's not that simple.  If asm/a.out.h doesn't exist, then various bits of
the kernel break that shouldn't.  fs/binfmt_elf.c for example.  fs/exec.c for
another.  They *expect* bits of the asm/a.out.h and linux/a.out.h to exist -
which they shouldn't.

Not exporting them isn't by itself sufficient.  The required constants
themselves are not defined for an arch that doesn't have the support, and so
the core code must not depend on them.  This patch fixes that.

Furthermore, STACK_TOP and STACK_TOP_MAX don't belong in asm/a.out.h as far as
I can tell.  They should probably be wherever TASK_SIZE resides (ie:
asm/processor.h).

David


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> compat_sys_times() has bogus return until jiffies is >= 0.  I discovered
> this running LTP within 5 minutes of booting.
> 
> The return result
> 
>   return compat_jiffies_to_clock_t(jiffies);
> 
> will return '-1' to user space and set the negated clock_t value to errno.
> 
> I'm not sure what the correct fix for this is.  I can come up with a patch
> if anyone has ideas on how to fix it.
> 
> At minimum, perhaps it should return a sane errno value.

RETURN VALUE
   times()  returns  the  number of clock ticks that have elapsed since an
   arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
   the  moment  the  system  was  booted.   Since Linux 2.6, this point is
   (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
   time.   The  return  value  may  overflow  the  possible  range of type
   clock_t.  On error, (clock_t) -1 is returned, and errno is  set
   appropriately.


Perhaps this is a bug in glibc: it is interpreting the times() return value
in the same way as other syscalls.

It would have been sensible for us to add INITIAL_JIFFIES to the value
instead of exposing this kernel-only detail to the world, although the
problem will of course reoccur once jiffies hits 0x80000000.  Unfortunately
we've even gone and enshrined this bogon in the manpage.

Proposed fix:

-	return compat_jiffies_to_clock_t(jiffies);
+	return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+					0x7fffffff);

?
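
To make the failure mode concrete, here is a minimal user-space sketch. It assumes a 32-bit compat return value, the common Linux syscall convention that results in the range [-4095, -1] are treated as -errno by the C library wrapper, and HZ equal to USER_HZ so that the jiffies-to-clock_t conversion is the identity. MODEL_HZ, MODEL_INITIAL_JIFFIES and all function names are made up for illustration. The masked variant mirrors the proposed fix as written, on the assumption that the mask is meant to clear the sign bit of a 32-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_HZ 100	/* assumed tick rate, illustration only */

/* Model: the 32-bit counter starts 300 seconds before it wraps to zero. */
#define MODEL_INITIAL_JIFFIES ((uint32_t)(-300 * MODEL_HZ))

/* Assumed wrapper convention: results in [-4095, -1] mean -errno. */
static bool looks_like_errno(int32_t ret)
{
	return ret >= -4095 && ret <= -1;
}

/*
 * Unpatched behaviour in this model: the 32-bit counter is handed back
 * as a signed value. The conversion of an out-of-range value to int32_t
 * is implementation-defined in C; mainstream compilers wrap.
 */
static int32_t times_ret_raw(uint32_t jiffies)
{
	return (int32_t)jiffies;
}

/* Shape of the proposed fix: offset the counter, then clear the sign bit. */
static int32_t times_ret_masked(uint32_t jiffies)
{
	return (int32_t)((jiffies + MODEL_INITIAL_JIFFIES) & 0x7fffffffu);
}
```

In this model times_ret_raw(MODEL_INITIAL_JIFFIES) is negative, times_ret_raw(0xffffffffu) is -1 and is classified as an errno by looks_like_errno(), and times_ret_masked() never returns a negative value because the top bit is always cleared.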

